Documentation ¶
Overview ¶
Package press provides tools to crawl and archive articles from press websites.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
Types ¶
type Article ¶ added in v0.8.0
Article holds information about an article and its content.
func (*Article) MarshalBinary ¶ added in v0.14.0
func (*Article) UnmarshalBinary ¶ added in v0.14.0
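The two method names above match the standard library's encoding.BinaryMarshaler and encoding.BinaryUnmarshaler interfaces. The fields of Article and its wire format are not shown on this page, so the following is only a minimal sketch of such a round trip: the `article` and `wire` types and their fields are hypothetical, and encoding/gob is an assumed choice of encoding, not necessarily what the package uses.

```go
package main

import (
	"bytes"
	"encoding"
	"encoding/gob"
	"fmt"
)

// article is a hypothetical stand-in for press.Article; its fields are assumptions.
type article struct {
	Title string
	URL   string
	Body  []byte
}

// wire has the same fields as article but no methods, so gob encodes the
// fields directly instead of calling MarshalBinary again (which would recurse).
type wire struct {
	Title string
	URL   string
	Body  []byte
}

// MarshalBinary implements encoding.BinaryMarshaler.
func (a *article) MarshalBinary() ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(wire(*a)); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// UnmarshalBinary implements encoding.BinaryUnmarshaler.
func (a *article) UnmarshalBinary(p []byte) error {
	var w wire
	if err := gob.NewDecoder(bytes.NewReader(p)).Decode(&w); err != nil {
		return err
	}
	*a = article(w)
	return nil
}

// Compile-time checks that *article satisfies both interfaces.
var (
	_ encoding.BinaryMarshaler   = (*article)(nil)
	_ encoding.BinaryUnmarshaler = (*article)(nil)
)

func main() {
	in := &article{Title: "Hello", URL: "https://example.com/a", Body: []byte("text")}
	raw, err := in.MarshalBinary()
	if err != nil {
		panic(err)
	}
	var out article
	if err := out.UnmarshalBinary(raw); err != nil {
		panic(err)
	}
	fmt.Println(out.Title, out.URL, string(out.Body))
}
```

Implementing these interfaces lets a value be stored as an opaque byte slice, for example as a value in a key-value database.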
type Auth ¶
type Auth struct {
// contains filtered or unexported fields
}
Auth stores credentials.
func (*Auth) Credential ¶ added in v0.10.0
func (auth *Auth) Credential(name string) *Credential
type Crawler ¶
type Crawler struct {
// contains filtered or unexported fields
}
Crawler fetches press articles and archives them.
func NewCrawler ¶
NewCrawler creates a new press article crawler with the provided options.
type Credential ¶ added in v0.10.0
type Credential struct {
	Domain     string
	Cookies    []*http.Cookie // cookies associated with the (last) authentication
	UseCookies bool           // whether to re-use cookies for authentication.
	// contains filtered or unexported fields
}
func (Credential) Expires ¶ added in v0.11.0
func (cred Credential) Expires() (expires time.Time)
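This page does not say how Expires derives its result. One plausible reading, given that Credential carries a slice of *http.Cookie, is that it reports the earliest expiry among those cookies. The sketch below illustrates that reading only; `earliestExpiry` and `exampleCookies` are hypothetical helpers, not part of the package.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// earliestExpiry returns the earliest non-zero Expires among cookies.
// It returns the zero time.Time when no cookie carries an expiry.
// This is an assumed behaviour, not the package's documented one.
func earliestExpiry(cookies []*http.Cookie) time.Time {
	var earliest time.Time
	for _, c := range cookies {
		if c == nil || c.Expires.IsZero() {
			continue // nil entry or cookie without an explicit expiry
		}
		if earliest.IsZero() || c.Expires.Before(earliest) {
			earliest = c.Expires
		}
	}
	return earliest
}

// exampleCookies builds illustrative cookies with expiries relative to base.
func exampleCookies(base time.Time) []*http.Cookie {
	return []*http.Cookie{
		{Name: "session", Expires: base.Add(48 * time.Hour)},
		{Name: "token", Expires: base.Add(2 * time.Hour)},
		{Name: "prefs"}, // no expiry set
	}
}

func main() {
	base := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	exp := earliestExpiry(exampleCookies(base))
	fmt.Println(exp.Format(time.RFC3339))
}
```

A caller could compare such a value against time.Now() to decide whether cookies need refreshing before a crawl.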
type Option ¶
type Option func(c *config) error
Option customizes configuration.
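The signature `func(c *config) error` follows the functional-options pattern: each option is a closure that mutates an unexported configuration and may reject invalid input. The sketch below shows the pattern in isolation. The `config` fields, the defaults, and the `withTimeout`, `withHeadless`, and `newConfig` helpers are all hypothetical, since the package's own definitions are not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// config is a hypothetical stand-in; the real fields are unexported and unknown.
type config struct {
	timeout  time.Duration
	headless bool
}

// option mirrors the documented signature: func(c *config) error.
type option func(c *config) error

// withTimeout is an illustrative option that validates its argument.
func withTimeout(d time.Duration) option {
	return func(c *config) error {
		if d < 0 {
			return errors.New("negative timeout")
		}
		c.timeout = d
		return nil
	}
}

// withHeadless is an illustrative option that cannot fail.
func withHeadless(v bool) option {
	return func(c *config) error {
		c.headless = v
		return nil
	}
}

// newConfig starts from assumed defaults and applies options in order,
// stopping at the first error.
func newConfig(opts ...option) (*config, error) {
	c := &config{timeout: 30 * time.Second, headless: true}
	for _, opt := range opts {
		if err := opt(c); err != nil {
			return nil, fmt.Errorf("apply option: %w", err)
		}
	}
	return c, nil
}

func main() {
	c, err := newConfig(withTimeout(5*time.Second), withHeadless(false))
	if err != nil {
		panic(err)
	}
	fmt.Println(c.timeout, c.headless)
}
```

Returning an error from each option lets the constructor report bad arguments at construction time rather than during a crawl.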
func WithCookies ¶ added in v0.8.0
WithCookies configures the crawler's cookies used by the web browser.
func WithHeadless ¶
WithHeadless configures the crawler's underlying web browser.
func WithNumCPUs ¶ added in v0.9.0
WithNumCPUs limits the number of active goroutines during web browsing. A negative value means no limit. A zero value uses the number of available CPUs.
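The three cases described for WithNumCPUs can be sketched as a small resolver plus a semaphore-bounded runner. Both `resolveLimit` and `runAll` are hypothetical helpers written for illustration; how the crawler actually enforces the limit is not shown on this page.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// resolveLimit maps the documented argument to a worker count:
// negative means no limit (returned as 0), zero means runtime.NumCPU(),
// and a positive value is used as given.
func resolveLimit(n int) int {
	switch {
	case n < 0:
		return 0
	case n == 0:
		return runtime.NumCPU()
	default:
		return n
	}
}

// runAll runs every task and waits for all of them to finish.
// At most limit tasks run concurrently; limit == 0 means unlimited.
func runAll(limit int, tasks []func()) {
	var wg sync.WaitGroup
	var sem chan struct{}
	if limit > 0 {
		sem = make(chan struct{}, limit)
	}
	for _, task := range tasks {
		wg.Add(1)
		if sem != nil {
			sem <- struct{}{} // blocks while limit tasks are in flight
		}
		go func(task func()) {
			defer wg.Done()
			if sem != nil {
				defer func() { <-sem }() // free the slot when the task ends
			}
			task()
		}(task)
	}
	wg.Wait()
}

func main() {
	done := make(chan int, 8)
	tasks := make([]func(), 8)
	for i := range tasks {
		i := i
		tasks[i] = func() { done <- i }
	}
	runAll(resolveLimit(2), tasks)
	fmt.Println("completed tasks:", len(done))
}
```

A buffered channel is used as a counting semaphore here because it needs only the standard library.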
func WithStdout ¶ added in v0.15.0
WithStdout specifies the output sink to use for logging during web browsing.
func WithTimeout ¶
WithTimeout configures the crawler to use a global fetch-timeout.
Source Files ¶
Directories ¶
Path | Synopsis |
---|---|
cmd/press-archiver | Command press-archiver archives press articles. |
cmd/press-archiver-cookies | Command press-archiver-cookies refreshes cookies for press articles. |
cmd/press-archiver-ls | Command press-archiver-ls displays the contents of the archived press articles database. |
cmd/press-archiver-rm | Command press-archiver-rm removes a set of archives from the database of press articles. |
cmd/press-archiver-srv | Command press-archiver-srv serves PDFs from archived press articles. |
padb | Package padb provides an interface to handle press articles. |
pabolt | Package pabolt implements the pressdb interface with boltdb. |