parser

package
v0.0.0-...-2b05f1f Latest
Warning

This package is not in the latest version of its module.

Published: Aug 23, 2025 License: MPL-2.0 Imports: 24 Imported by: 0

Documentation

Overview

Package parser implements a validating parser for PSL files.

Constants

This section is empty.

Variables

This section is empty.

Functions

func BlocksOfType

func BlocksOfType[T Block](tree Block) []T

BlocksOfType recursively collects and returns all blocks of concrete type T in the given parse tree.

For example, BlocksOfType[*parser.Comment](ast) returns all comment nodes in ast.
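The traversal can be illustrated with a minimal, self-contained sketch. The `Node` interface and the `Comment`, `Suffix`, and `Group` types below are hypothetical stand-ins, not the package's real types, and including the root node in the result is an assumption of this sketch rather than documented behavior.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for parser.Block, reduced to the
// one method the traversal needs.
type Node interface {
	Children() []Node
}

// Stand-in concrete node types (not the package's real types).
type Comment struct{ Text string }
type Suffix struct{ Domain string }
type Group struct{ Kids []Node }

func (*Comment) Children() []Node { return nil }
func (*Suffix) Children() []Node  { return nil }
func (g *Group) Children() []Node { return g.Kids }

// nodesOfType walks the tree depth-first and collects every node whose
// concrete type is T. Assumption: the root itself is also a candidate.
func nodesOfType[T Node](tree Node) []T {
	var out []T
	var walk func(Node)
	walk = func(n Node) {
		if v, ok := n.(T); ok {
			out = append(out, v)
		}
		for _, c := range n.Children() {
			walk(c)
		}
	}
	walk(tree)
	return out
}

func main() {
	tree := &Group{Kids: []Node{
		&Comment{Text: "header"},
		&Group{Kids: []Node{
			&Suffix{Domain: "example.com"},
			&Comment{Text: "inline"},
		}},
	}}
	for _, c := range nodesOfType[*Comment](tree) {
		fmt.Println(c.Text)
	}
}
```

Selecting by type parameter keeps call sites free of type switches: the caller names the concrete type once and receives a slice of that type.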

func ValidateOffline

func ValidateOffline(l *List) []error

ValidateOffline runs offline validations on a parsed PSL.

func ValidateOnline

func ValidateOnline(ctx context.Context, l *List, client *github.Repo, prHistory *githistory.History) (errs []error)

ValidateOnline runs online validations on a parsed PSL. Online validations are slower than offline validation, especially when checking the entire PSL. All online validations respect cancellation on the given context.

Types

type Block

type Block interface {
	// SrcRange returns the block's SourceRange.
	SrcRange() SourceRange
	// Children returns the block's direct children, if any.
	Children() []Block
	// Changed reports whether the tree rooted at block has changed
	// since the base of comparison (see List.SetBaseVersion).
	Changed() bool
	// contains filtered or unexported methods
}

A Block is a parsed chunk of a PSL file. Each block is one of the concrete types Comment, Section, Suffixes, Suffix, or Wildcard.

type Comment

type Comment struct {

	// Text is the unprocessed content of the comment lines, with the
	// leading comment syntax removed.
	Text []string
	// contains filtered or unexported fields
}

Comment is a comment block, consisting of one or more contiguous lines of commented text.

func (Comment) Changed

func (b Comment) Changed() bool

func (*Comment) Children

func (c *Comment) Children() []Block

func (Comment) SrcRange

func (b Comment) SrcRange() SourceRange

type ErrCommentPreventsSectionSort

type ErrCommentPreventsSectionSort struct {
	SourceRange
}

func (ErrCommentPreventsSectionSort) Error

type ErrCommentPreventsSuffixSort

type ErrCommentPreventsSuffixSort struct {
	SourceRange
}

func (ErrCommentPreventsSuffixSort) Error

type ErrConflictingSuffixAndException

type ErrConflictingSuffixAndException struct {
	*Suffix
	Wildcard *Wildcard
}

func (ErrConflictingSuffixAndException) Changed

func (b ErrConflictingSuffixAndException) Changed() bool

func (ErrConflictingSuffixAndException) Error

func (ErrConflictingSuffixAndException) SrcRange

func (b ErrConflictingSuffixAndException) SrcRange() SourceRange

type ErrDuplicateSection

type ErrDuplicateSection struct {
	*Section
	FirstDefinition *Section
}

func (ErrDuplicateSection) Changed

func (b ErrDuplicateSection) Changed() bool

func (ErrDuplicateSection) Error

func (e ErrDuplicateSection) Error() string

func (ErrDuplicateSection) SrcRange

func (b ErrDuplicateSection) SrcRange() SourceRange

type ErrDuplicateSuffix

type ErrDuplicateSuffix struct {
	Name            string
	Block                 // Suffix or Wildcard
	FirstDefinition Block // Suffix or Wildcard
}

func (ErrDuplicateSuffix) Error

func (e ErrDuplicateSuffix) Error() string

type ErrInvalidEncoding

type ErrInvalidEncoding struct {
	Encoding string
}

ErrInvalidEncoding reports that the input is encoded with something other than UTF-8.

func (ErrInvalidEncoding) Error

func (e ErrInvalidEncoding) Error() string

type ErrInvalidSuffix

type ErrInvalidSuffix struct {
	SourceRange
	Suffix string
	Err    error
}

ErrInvalidSuffix reports that a suffix is not a valid PSL entry.

func (ErrInvalidSuffix) Error

func (e ErrInvalidSuffix) Error() string

type ErrInvalidUnicode

type ErrInvalidUnicode struct {
	SourceRange
}

ErrInvalidUnicode reports that a line contains characters that are not valid Unicode.

func (ErrInvalidUnicode) Error

func (e ErrInvalidUnicode) Error() string

type ErrMismatchedSection

type ErrMismatchedSection struct {
	SourceRange
	EndName string
	Section *Section
}

ErrMismatchedSection reports that a file section was started under one name but ended under another.

func (ErrMismatchedSection) Error

func (e ErrMismatchedSection) Error() string

type ErrMissingEntityEmail

type ErrMissingEntityEmail struct {
	Suffixes *Suffixes
}

ErrMissingEntityEmail reports that a block of suffixes does not have a parseable contact email address in its header comment.

func (ErrMissingEntityEmail) Error

func (e ErrMissingEntityEmail) Error() string

type ErrMissingEntityName

type ErrMissingEntityName struct {
	Suffixes *Suffixes
}

ErrMissingEntityName reports that a block of suffixes does not have a parseable owner name in its header comment.

func (ErrMissingEntityName) Error

func (e ErrMissingEntityName) Error() string

type ErrMissingSection

type ErrMissingSection struct {
	Name string
}

func (ErrMissingSection) Error

func (e ErrMissingSection) Error() string

type ErrMissingTXTRecord

type ErrMissingTXTRecord struct {
	Block
}

func (ErrMissingTXTRecord) Error

func (e ErrMissingTXTRecord) Error() string

type ErrNestedSection

type ErrNestedSection struct {
	SourceRange
	Name    string
	Section *Section
}

ErrNestedSection reports that a file section is being started while already within a section.

func (ErrNestedSection) Error

func (e ErrNestedSection) Error() string

type ErrSectionInSuffixBlock

type ErrSectionInSuffixBlock struct {
	SourceRange
}

ErrSectionInSuffixBlock reports that a comment within a suffix block contains a section delimiter.

func (ErrSectionInSuffixBlock) Error

func (e ErrSectionInSuffixBlock) Error() string

type ErrTXTCheckFailure

type ErrTXTCheckFailure struct {
	Block
	Err error
}

func (ErrTXTCheckFailure) Error

func (e ErrTXTCheckFailure) Error() string

type ErrTXTRecordMismatch

type ErrTXTRecordMismatch struct {
	Block
	PR int
}

func (ErrTXTRecordMismatch) Error

func (e ErrTXTRecordMismatch) Error() string

type ErrUnclosedSection

type ErrUnclosedSection struct {
	Section *Section
}

ErrUnclosedSection reports that a file section was not closed properly before EOF.

func (ErrUnclosedSection) Error

func (e ErrUnclosedSection) Error() string

type ErrUnknownSection

type ErrUnknownSection struct {
	*Section
}

func (ErrUnknownSection) Changed

func (b ErrUnknownSection) Changed() bool

func (ErrUnknownSection) Error

func (e ErrUnknownSection) Error() string

func (ErrUnknownSection) SrcRange

func (b ErrUnknownSection) SrcRange() SourceRange

type ErrUnknownSectionMarker

type ErrUnknownSectionMarker struct {
	SourceRange
}

ErrUnknownSectionMarker reports that a line looks like a file section marker (e.g. "===BEGIN ICANN DOMAINS==="), but is not one of the recognized kinds of marker.

func (ErrUnknownSectionMarker) Error

func (e ErrUnknownSectionMarker) Error() string

type ErrUnstartedSection

type ErrUnstartedSection struct {
	SourceRange
	Name string
}

ErrUnstartedSection reports that a section end marker was found without a corresponding start.

func (ErrUnstartedSection) Error

func (e ErrUnstartedSection) Error() string

type List

type List struct {

	// Blocks are the top-level elements of the list, in the order
	// they appear.
	Blocks []Block
	// contains filtered or unexported fields
}

List is a parsed public suffix list.

func Parse

func Parse(bs []byte) (*List, []error)

Parse parses bs as a PSL file and returns the parse result.

The parser tries to keep going when it encounters errors. Parse and validation errors are accumulated in the returned error slice.

If the returned error slice is non-empty, the parsed file does not comply with the PSL format (documented at https://github.com/publicsuffix/list/wiki/Format), or with PSL submission guidelines (https://github.com/publicsuffix/list/wiki/Guidelines). A List with errors should not be used to calculate public suffixes for FQDNs.
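Because errors come back as a slice of typed values, callers can pick out the kinds they care about with errors.As. The sketch below is self-contained and uses hypothetical stand-in error types (`errInvalidSuffix`, `errMissingSection`) with simplified fields; whether the real package returns these errors by value or by pointer is not stated here, so the value form is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalidSuffix is a simplified stand-in for an error type that
// carries a source location and the offending suffix text.
type errInvalidSuffix struct {
	Line   int
	Suffix string
}

func (e errInvalidSuffix) Error() string {
	return fmt.Sprintf("line %d: invalid suffix %q", e.Line, e.Suffix)
}

// errMissingSection is a stand-in for an error without a location.
type errMissingSection struct{ Name string }

func (e errMissingSection) Error() string {
	return fmt.Sprintf("missing section %q", e.Name)
}

// invalidSuffixes extracts the suffix text from every errInvalidSuffix
// in errs and ignores all other error kinds.
func invalidSuffixes(errs []error) []string {
	var out []string
	for _, err := range errs {
		var bad errInvalidSuffix
		if errors.As(err, &bad) {
			out = append(out, bad.Suffix)
		}
	}
	return out
}

func main() {
	// In real use this slice would come from the parser.
	errs := []error{
		errInvalidSuffix{Line: 3, Suffix: "bad..name"},
		errMissingSection{Name: "ICANN DOMAINS"},
	}
	for _, err := range errs {
		fmt.Println(err)
	}
	fmt.Println(invalidSuffixes(errs))
}
```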

func (List) Changed

func (b List) Changed() bool

func (*List) Children

func (l *List) Children() []Block

func (*List) Clean

func (l *List) Clean() []error

Clean cleans the list, editing its contents as necessary to conform to PSL style rules such as ordering of suffixes.

Clean does not make any semantic changes. The list after cleaning represents exactly the same set of public suffixes and attached metadata.

If the list's structure prevents Clean from applying some necessary changes, Clean applies as many changes as possible then returns errors describing the cleanups that could not take place.

func (*List) MarshalDebug

func (l *List) MarshalDebug() []byte

MarshalDebug returns the list serialized to a verbose debugging format. This format is private to this package and for development use only. The format may change drastically without notice.

func (*List) MarshalPSL

func (l *List) MarshalPSL() []byte

MarshalPSL returns the list serialized to standard PSL text format.

func (*List) PublicSuffix

func (l *List) PublicSuffix(d domain.Name) domain.Name

PublicSuffix returns the public suffix of d.

This follows the PSL algorithm to the letter. Notably: a rule "*.foo.com" does not implicitly create a "foo.com" rule, and there is a hardcoded implicit "*" rule so that unknown TLDs are all public suffixes.

func (*List) RegisteredDomain

func (l *List) RegisteredDomain(d domain.Name) (domain.Name, bool)

RegisteredDomain returns the registered/registerable domain of d. It returns (domain, true) when the input is a child of a public suffix, and (zero, false) when the input is itself a public suffix.

RegisteredDomain follows the PSL algorithm to the letter. Notably: a rule "*.foo.com" does not implicitly create a "foo.com" rule, and there is a hardcoded implicit "*" rule so that unknown TLDs are all public suffixes.
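The two points called out above (no implicit base rule for wildcards, and the implicit "*" rule) can be seen in a small string-based sketch. The `rules` type and its methods are hypothetical and work on plain lowercase strings rather than domain.Name; this is an illustration of the published PSL matching rules, not the package's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// rules is a hypothetical, string-based stand-in for a parsed list.
type rules struct {
	exact     map[string]bool            // plain rules, e.g. "com"
	wildcards map[string]map[string]bool // wildcard base -> exception labels
}

// publicSuffix returns the public suffix of name, which is assumed to be
// lowercase and without a trailing dot.
func (r rules) publicSuffix(name string) string {
	labels := strings.Split(name, ".")
	// Implicit "*" rule: the rightmost label is always a public suffix.
	best := 1
	for i := range labels {
		candidate := strings.Join(labels[i:], ".")
		n := len(labels) - i
		if r.exact[candidate] && n > best {
			best = n
		}
		// A wildcard on candidate only matches names with one more label
		// to the left; candidate itself is not implied by the wildcard.
		if exceptions, ok := r.wildcards[candidate]; ok && i > 0 {
			if exceptions[labels[i-1]] {
				// An exception wins: the suffix is the wildcard base.
				return candidate
			}
			if n+1 > best {
				best = n + 1
			}
		}
	}
	return strings.Join(labels[len(labels)-best:], ".")
}

// registeredDomain returns the public suffix plus one more label, or
// ("", false) when name is itself a public suffix.
func (r rules) registeredDomain(name string) (string, bool) {
	suffix := r.publicSuffix(name)
	if name == suffix {
		return "", false
	}
	rest := strings.Split(strings.TrimSuffix(name, "."+suffix), ".")
	return rest[len(rest)-1] + "." + suffix, true
}

func main() {
	r := rules{
		exact:     map[string]bool{"com": true},
		wildcards: map[string]map[string]bool{"foo.com": {"bar": true}},
	}
	for _, name := range []string{"foo.com", "zot.foo.com", "a.bar.foo.com", "host.unknowntld"} {
		reg, ok := r.registeredDomain(name)
		fmt.Println(name, r.publicSuffix(name), reg, ok)
	}
}
```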

func (*List) SetBaseVersion

func (l *List) SetBaseVersion(old *List, wholeSuffixBlocks bool)

SetBaseVersion sets the list's base of comparison to old, and updates the changed/unchanged annotations on all Blocks to match.

If wholeSuffixBlocks is true, any changed Suffix or Wildcard within a Suffixes block marks all suffixes and wildcards in that block as changed.

Precise marking (wholeSuffixBlocks=false) is intended for maintainer and machine edits, where change-aware validators should examine only the specific changed items.

Expansive marking (wholeSuffixBlocks=true) is intended for external PRs from suffix block owners, to opportunistically point out more issues that they have the knowledge and authority to fix.
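The difference between the two marking modes can be sketched on a single suffix block. The `entry` type and `markChanged` function are hypothetical, and treating an entry as changed when its name is absent from the old version is a simplifying assumption; the real comparison logic is not described here.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for a Suffix or Wildcard in one block.
type entry struct {
	Name    string
	Changed bool
}

// markChanged flags each entry whose name is absent from old. When
// wholeBlock is true, one changed entry marks the entire block as changed.
func markChanged(old map[string]bool, block []entry, wholeBlock bool) {
	anyChanged := false
	for i := range block {
		block[i].Changed = !old[block[i].Name]
		anyChanged = anyChanged || block[i].Changed
	}
	if wholeBlock && anyChanged {
		for i := range block {
			block[i].Changed = true
		}
	}
}

func main() {
	old := map[string]bool{"a.example": true, "b.example": true}

	precise := []entry{{Name: "a.example"}, {Name: "c.example"}}
	markChanged(old, precise, false)
	fmt.Println(precise)

	expansive := []entry{{Name: "a.example"}, {Name: "c.example"}}
	markChanged(old, expansive, true)
	fmt.Println(expansive)
}
```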

func (List) SrcRange

func (b List) SrcRange() SourceRange

type MaintainerInfo

type MaintainerInfo struct {
	// Name is the name of the entity responsible for maintaining a
	// set of suffixes.
	//
	// For ICANN suffixes, this is typically the TLD name, or the name
	// of the NIC that controls the TLD.
	//
	// For private domains this is the name of the legal entity
	// (usually a company, sometimes an individual) that owns all
	// domains in the block.
	//
	// In a well-formed PSL file, Name is non-empty for all suffix
	// blocks.
	Name string

	// URLs are links to further information about the suffix block's
	// domains and its maintainer.
	//
	// For ICANN domains this is typically the NIC's information page
	// for the TLD, or failing that a general information page such as
	// a Wikipedia entry.
	//
	// For private domains this is usually the website for the owner
	// of the domains.
	//
	// May be empty when the block header doesn't have
	// machine-readable URLs.
	URLs []*url.URL

	// Maintainers are the contact names and email addresses of the
	// person or persons responsible for maintaining a block.
	//
	// This field may be empty if there is no machine-readable contact
	// information.
	Maintainers []*mail.Address

	// Other is some unstructured additional notes. They may contain
	// anything, including some of the above information that wasn't
	// in a known parseable form.
	Other []string

	// MachineEditable is whether this information can be
	// machine-edited and written back out without loss of
	// information. The exact formatting of the information may
	// change, but no information will be lost.
	MachineEditable bool
}

func (*MaintainerInfo) Compare

func (m *MaintainerInfo) Compare(n *MaintainerInfo) int

func (MaintainerInfo) HasInfo

func (m MaintainerInfo) HasInfo() bool

HasInfo reports whether m has any maintainer information at all.

type Section

type Section struct {

	// Name is the section name. In a normal well-formed PSL file, the
	// names are "ICANN DOMAINS" and "PRIVATE DOMAINS".
	Name string
	// Blocks are the child blocks contained within the section.
	Blocks []Block
	// contains filtered or unexported fields
}

Section is a named part of a PSL file, containing suffixes which behave similarly.

func (Section) Changed

func (b Section) Changed() bool

func (*Section) Children

func (s *Section) Children() []Block

func (Section) SrcRange

func (b Section) SrcRange() SourceRange

type SourceRange

type SourceRange struct {
	FirstLine int
	LastLine  int
}

SourceRange describes a slice of lines from an unparsed source file. FirstLine and LastLine behave like normal slice offsets, i.e. they represent the half-open range [FirstLine:LastLine).
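A short sketch of what the half-open convention implies. The `lineRange` type mirrors the two documented fields but is a stand-in, and zero-based offsets are an assumption drawn from the "normal slice offsets" wording.

```go
package main

import "fmt"

// lineRange is a stand-in mirroring the documented fields of SourceRange.
type lineRange struct {
	FirstLine int
	LastLine  int
}

// numLines follows from the half-open range: LastLine is excluded.
func (r lineRange) numLines() int { return r.LastLine - r.FirstLine }

// slice returns the lines covered by r, exactly as a slice expression would.
func (r lineRange) slice(lines []string) []string {
	return lines[r.FirstLine:r.LastLine]
}

func main() {
	lines := []string{"// header", "com", "net", "org"}
	r := lineRange{FirstLine: 1, LastLine: 3}
	fmt.Println(r.numLines(), r.slice(lines))
}
```

With this convention an empty range has FirstLine equal to LastLine, and adjacent ranges share a boundary value without overlapping.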

func (SourceRange) LocationString

func (s SourceRange) LocationString() string

LocationString prints a human-readable description of the SourceRange.

func (SourceRange) NumLines

func (s SourceRange) NumLines() int

NumLines returns the number of source lines described by SourceRange.

type Suffix

type Suffix struct {

	// Domain is the public suffix's domain name.
	Domain domain.Name
	// contains filtered or unexported fields
}

Suffix is one public suffix, represented in the standard domain name format.

func (Suffix) Changed

func (b Suffix) Changed() bool

func (*Suffix) Children

func (s *Suffix) Children() []Block

func (*Suffix) PublicSuffix

func (s *Suffix) PublicSuffix(n domain.Name) (suffix domain.Name, ok bool)

PublicSuffix returns the public suffix of n according to this Suffix rule taken in isolation. If n is not a child domain of s, PublicSuffix returns (zeroValue, false).

func (*Suffix) RegisteredDomain

func (s *Suffix) RegisteredDomain(n domain.Name) (regDomain domain.Name, ok bool)

RegisteredDomain returns the registered/registerable domain of n according to this Suffix rule taken in isolation. The registered domain is defined as n's public suffix plus one more child label. If n is not a child domain of s, RegisteredDomain returns (zeroValue, false).

func (Suffix) SrcRange

func (b Suffix) SrcRange() SourceRange

type Suffixes

type Suffixes struct {

	// Info is information about the authoritative maintainers for
	// this set of suffixes.
	Info MaintainerInfo

	// Blocks are the child blocks contained within the suffix block.
	Blocks []Block
	// contains filtered or unexported fields
}

Suffixes is a list of PSL domain suffixes with optional additional metadata.

Suffix blocks consist of a header comment that contains a mix of structured and unstructured information, followed by a list of domain suffixes. The suffix list may contain additional unstructured inline comments.

func (Suffixes) Changed

func (b Suffixes) Changed() bool

func (*Suffixes) Children

func (s *Suffixes) Children() []Block

func (Suffixes) SrcRange

func (b Suffixes) SrcRange() SourceRange

type Wildcard

type Wildcard struct {

	// Domain is the base of the wildcard public suffix, without the
	// leading "*" label.
	Domain domain.Name
	// Exceptions are the domain.Labels that, when they appear in the
	// wildcard position of Domain, cause a FQDN to _not_ match this
	// wildcard. For example, if Domain="foo.com" and Exceptions=[bar,
	// qux], zot.foo.com is a public suffix, but bar.foo.com and
	// qux.foo.com are not.
	Exceptions []domain.Label
	// contains filtered or unexported fields
}

Wildcard is a wildcard public suffix, along with any exceptions to that wildcard.
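The foo.com example from the field comment can be worked through for a single rule in isolation. The `wildcard` type below is a string-based stand-in, and returning the base domain with isException set for an excepted label is an assumption of this sketch, not documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// wildcard is a string-based stand-in for the documented Wildcard fields.
type wildcard struct {
	Domain     string   // base of the wildcard, without the leading "*"
	Exceptions []string // labels that do not match in the wildcard position
}

// publicSuffix evaluates name against this one rule only. ok is false
// when name is not below the wildcard base.
func (w wildcard) publicSuffix(name string) (suffix string, isException, ok bool) {
	if !strings.HasSuffix(name, "."+w.Domain) {
		return "", false, false
	}
	// The label in the wildcard position is the one directly left of Domain.
	prefix := strings.Split(strings.TrimSuffix(name, "."+w.Domain), ".")
	label := prefix[len(prefix)-1]
	for _, e := range w.Exceptions {
		if e == label {
			// Assumption: an excepted name falls back to the base domain.
			return w.Domain, true, true
		}
	}
	return label + "." + w.Domain, false, true
}

func main() {
	w := wildcard{Domain: "foo.com", Exceptions: []string{"bar", "qux"}}
	for _, name := range []string{"zot.foo.com", "www.bar.foo.com", "foo.com", "example.org"} {
		suffix, isException, ok := w.publicSuffix(name)
		fmt.Printf("%s -> %q exception=%v ok=%v\n", name, suffix, isException, ok)
	}
}
```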

func (Wildcard) Changed

func (b Wildcard) Changed() bool

func (*Wildcard) Children

func (w *Wildcard) Children() []Block

func (*Wildcard) PublicSuffix

func (w *Wildcard) PublicSuffix(n domain.Name) (suffix domain.Name, isException, ok bool)

PublicSuffix returns the public suffix of n according to this Wildcard rule taken in isolation. If n is not a child domain of w, PublicSuffix returns (zeroValue, false, false).

func (*Wildcard) RegisteredDomain

func (w *Wildcard) RegisteredDomain(n domain.Name) (regDomain domain.Name, isException, ok bool)

RegisteredDomain returns the registered/registerable domain of n according to this Wildcard rule taken in isolation. The registered domain is defined as n's public suffix plus one more child label. If n is not a child domain of w, RegisteredDomain returns (zeroValue, false, false).

func (Wildcard) SrcRange

func (b Wildcard) SrcRange() SourceRange
