coffeemaker

command module
v0.0.5 Latest
Published: Jun 6, 2024 License: MIT Imports: 13 Imported by: 0

README

REDDITOR

Import

go get github.com/soumitsalman/go-reddit

Sample Code

import (
	"encoding/json"
	"fmt"
	"os"
	"time"

	ds "github.com/soumitsalman/beansack/sdk"
	"github.com/soumitsalman/go-reddit/collector"
)

func collectAndStore() {
	config := collector.NewCollectorConfig(localFileStore)
	collector.NewCollector(config).Collect()
}

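func localFileStore(contents []ds.Bean) {
	// Timestamped name, e.g. outputs_REDDIT_2024-06-06-15-04-05.json
	filename := fmt.Sprintf("outputs_REDDIT_%s.json", time.Now().Format("2006-01-02-15-04-05"))
	file, err := os.Create(filename)
	if err != nil {
		fmt.Println("failed to create output file:", err)
		return
	}
	defer file.Close()
	// Encode the collected items as JSON
	if err := json.NewEncoder(file).Encode(contents); err != nil {
		fmt.Println("failed to write output file:", err)
	}
}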
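
A store callback like localFileStore can also be written so that failures are returned to the caller rather than dropped. The following is a minimal sketch of that idea using only the standard library. The post type is a hypothetical stand-in for ds.Bean, and saveJSON and loadJSON are illustrative names that are not part of go-reddit or beansack.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// post is a hypothetical stand-in for ds.Bean; the real type has its own fields.
type post struct {
	Title string `json:"title"`
	URL   string `json:"url"`
}

// saveJSON writes items to a timestamped JSON file in dir and returns its path.
// An empty dir falls back to the system temp directory.
func saveJSON(dir string, items []post) (string, error) {
	if dir == "" {
		dir = os.TempDir()
	}
	name := fmt.Sprintf("outputs_REDDIT_%s.json", time.Now().Format("2006-01-02-15-04-05"))
	path := filepath.Join(dir, name)

	file, err := os.Create(path)
	if err != nil {
		return "", fmt.Errorf("create %s: %w", path, err)
	}
	defer file.Close()

	if err := json.NewEncoder(file).Encode(items); err != nil {
		return "", fmt.Errorf("encode %s: %w", path, err)
	}
	return path, nil
}

// loadJSON reads back a file written by saveJSON.
func loadJSON(path string) ([]post, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var items []post
	if err := json.Unmarshal(data, &items); err != nil {
		return nil, err
	}
	return items, nil
}

func main() {
	path, err := saveJSON("", []post{{Title: "example", URL: "https://example.com"}})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("wrote", path)
}
```

Returning the path and the error lets the caller decide whether a failed write should stop the collection run or only be logged.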

This package provides the most common read functions for the Reddit API.

NEWS COLLECTOR

A simple utility for scraping blogs, news articles, and sitemaps with higher fidelity than some of the default libraries. It is a wrapper on top of existing libraries such as:

  • github.com/go-shiori/go-readability
  • github.com/gocolly/colly/v2

Usage

Get Package:

go get github.com/soumitsalman/newscollector

Import:

import (
	"fmt"

	"github.com/soumitsalman/newscollector/loaders"
)

Collecting One-off URLs:

func main() {
	urls := []string{
		"https://kennybrast.medium.com/planning-a-successful-devops-strategy-for-a-fortune-200-enterprise-56304f1e28a8",
		"https://medium.com/@bohane.michael/navigating-risk-in-investment-fbbec34acd5f",
		"https://mymoneychronicles.medium.com/5-underrated-michael-jackson-songs-dfb6f8b08bb9",
		"https://thehackernews.com/2024/02/new-idat-loader-attacks-using.html",
		"https://thehackernews.com/2024/02/microsoft-releases-pyrit-red-teaming.html",
		"https://blogs.scientificamerican.com/at-scientific-american/systems-analysis-look-back-1966-scientific-american-article/",
		"https://www.scientificamerican.com/article/even-twilight-zone-coral-reefs-arent-safe-from-bleaching/",
		"https://www.scientificamerican.com/blog/at-scientific-american/reception-on-capitol-hill-will-celebrate-scientific-americans-cities-issue/",
		"https://blogs.scientificamerican.com/at-scientific-american/reception-on-capitol-hill-will-celebrate-scientific-americans-cities-issue/",
	}

	collector := loaders.NewDefaultWebTextLoader(&loaders.WebLoaderConfig{})()
	for _, url := range urls {
		collector.LoadDocument(url)
	}

	for _, article := range collector.ListAll() {
		fmt.Println(article.ToString())
	}
}

Scraping From Sitemaps:

func main() {
	// built-in sitemap scraper for thehackernews.com
	// collector := loaders.NewTheHackersNewsSiteLoader(7)
	// built-in scraper for Medium's sitemap
	// collector := loaders.NewMediumSiteLoader(2)
	// built-in scraper for YC's Hacker News topstories.json
	collector := loaders.NewYCHackerNewsSiteLoader(2)
	// the integer argument N tells the collector to collect posts from the last N days
	collector.LoadSite()

	for _, article := range collector.ListAll() {
		fmt.Println(article.ToString())
	}
}
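
The "last N days" behavior of the site loaders can be pictured as a cutoff on each post's timestamp. The sketch below is an assumption about one way to express such a filter, not the loaders' actual implementation, which this README does not show. The story type and the recentStories function are hypothetical names. Timestamps are Unix seconds, which is how the Hacker News API reports an item's "time" field.

```go
package main

import (
	"fmt"
	"time"
)

// story is a hypothetical record; Time holds Unix seconds,
// matching the "time" field of a Hacker News API item.
type story struct {
	Title string
	Time  int64
}

const secondsPerDay = 24 * 60 * 60

// recentStories returns the stories posted within the last `days` days
// before nowUnix, preserving their original order.
func recentStories(stories []story, nowUnix int64, days int) []story {
	cutoff := nowUnix - int64(days)*secondsPerDay
	var recent []story
	for _, s := range stories {
		if s.Time >= cutoff && s.Time <= nowUnix {
			recent = append(recent, s)
		}
	}
	return recent
}

func main() {
	now := time.Now().Unix()
	stories := []story{
		{Title: "posted an hour ago", Time: now - 3600},
		{Title: "posted five days ago", Time: now - 5*secondsPerDay},
	}
	// With days = 2, only the first story passes the cutoff.
	for _, s := range recentStories(stories, now, 2) {
		fmt.Println(s.Title)
	}
}
```

Passing the current time in as a parameter, rather than calling time.Now inside the filter, keeps the function deterministic and easy to test.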

Documentation

There is no documentation for this package.

Directories

Path Synopsis
sdk
