lr

package

v0.11.0 Latest Latest Go to latest Published: Feb 22, 2025 License: ISC Imports: 18 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/moorara/algo

Links

Open Source Insights

Documentation ¶

Overview ¶

Package lr provides common data structures and algorithms for building LR parsers. LR parsers are bottom-up parsers that analyse deterministic context-free languages in linear time.

Bottom-up parsing constructs a parse tree for an input string starting at the leaves (bottom) and working towards the root (top). This process involves reducing a string w to the start symbol of the grammar. At each reduction step, a specific substring matching the body of a production is replaced by the non-terminal at the head of that production.

Bottom-up parsing during a left-to-right scan of the input constructs a rightmost derivation in reverse:

S = γ₀ ⇒ᵣₘ γ₁ ⇒ᵣₘ γ₂ ⇒ᵣₘ ... ⇒ᵣₘ γₙ₋₁ ⇒ᵣₘ γₙ = w

At each step, the handle βₙ in γₙ is replaced by the head of the production Aₙ → βₙ to obtain the previous right-sentential form γₙ₋₁. If the process produces the start symbol S as the only sentential form, parsing is complete. If a grammar is unambiguous, then every right-sentential form of the grammar has exactly one handle.

The most common type of bottom-up parser is LR(k) parsing. The L is for left-to-right scanning of the input, the R for constructing a rightmost derivation in reverse, and the k for the number of input symbols of lookahead that are used in making parsing decisions.

Advantages of LR parsing:

Can recognize nearly all programming language constructs defined by context-free grammars.
Detects syntax errors at the earliest possible point during a left-to-right scan.
The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive or LL methods. For a grammar to be LR(k), we must be able to recognize the occurrence of the right side of a production in a right-sentential form, with k input symbols of lookahead. This requirement is far less stringent than that for LL(k) grammars where we must be able to recognize the use of a production seeing only the first k symbols of what its right side derives.

In LR(k) parsing, the cases k = 0 or k = 1 are most commonly used in practical applications. LR parsing methods use pushdown automata (PDA) to parse an input string. A pushdown automaton is a type of automaton used for Type 2 languages (context-free languages) in the Chomsky hierarchy. A PDA uses a state machine with a stack. The next state is determined by the current state, the next input, and the top of the stack. LR(0) parsers do not rely on any lookahead to make parsing decisions. An LR(0) parser bases its decisions entirely on the current state and the parsing stack. LR(1) parsers determine the next state based on the current state, one lookahead symbol, and the top of the stack.

Shift-reduce parsing is a bottom-up parsing technique that uses a stack for grammar symbols and an input buffer for the remaining string. The parser alternates between shifting symbols from the input to the stack and reducing the top of the stack based on grammar rules. This process continues until the stack contains only the start symbol and the input is empty, or an error occurs.

Certain context-free grammars cannot be parsed using shift-reduce parsers because they may encounter shift/reduce conflicts (indecision between shifting or reducing) or reduce/reduce conflicts (indecision between multiple reductions). Technically speaking, these grammars are not in the LR(k) class.

For more details on parsing theory, refer to "Compilers: Principles, Techniques, and Tools (2nd Edition)".

Index ¶

Constants
Variables
type Action
- func (a *Action) Equal(rhs *Action) bool
- func (a *Action) String() string
type ActionHandlePair
- func (p *ActionHandlePair) Equal(rhs *ActionHandlePair) bool
- func (p *ActionHandlePair) String() string
type ActionType
type AggregatedConflictError
- func (e AggregatedConflictError) Error() string
- func (e AggregatedConflictError) ErrorOrNil() error
- func (e AggregatedConflictError) Unwrap() []error
type Associativity
- func (a Associativity) String() string
type Automaton
type ConflictError
- func (e *ConflictError) Error() string
- func (e *ConflictError) IsReduceReduce() bool
- func (e *ConflictError) IsShiftReduce() bool
type EvaluateFunc
type Grammar
- func NewGrammarWithLR0(G *grammar.CFG) *Grammar
- func NewGrammarWithLR0Kernel(G *grammar.CFG) *Grammar
- func NewGrammarWithLR1(G *grammar.CFG) *Grammar
- func NewGrammarWithLR1Kernel(G *grammar.CFG) *Grammar
type Item
type Item0
- func (i *Item0) Compare(rhs Item) int
- func (i *Item0) DotSymbol() (grammar.Symbol, bool)
- func (i *Item0) Equal(rhs Item) bool
- func (i *Item0) IsComplete() bool
- func (i *Item0) IsFinal() bool
- func (i *Item0) IsInitial() bool
- func (i *Item0) IsKernel() bool
- func (i *Item0) Item1(lookahead grammar.Terminal) *Item1
- func (i *Item0) Next() (Item, bool)
- func (i *Item0) String() string
type Item1
- func (i *Item1) Compare(rhs Item) int
- func (i *Item1) DotSymbol() (grammar.Symbol, bool)
- func (i *Item1) Equal(rhs Item) bool
- func (i *Item1) GetPrefix() grammar.String[grammar.Symbol]
- func (i *Item1) GetSuffix() grammar.String[grammar.Symbol]
- func (i *Item1) IsComplete() bool
- func (i *Item1) IsFinal() bool
- func (i *Item1) IsInitial() bool
- func (i *Item1) IsKernel() bool
- func (i *Item1) Item0() *Item0
- func (i *Item1) Next() (Item, bool)
- func (i *Item1) String() string
type ItemSet
- func NewItemSet(items ...Item) ItemSet
type ItemSetCollection
- func NewItemSetCollection(sets ...ItemSet) ItemSetCollection
type Parser
- func (p *Parser) Parse(tokenF parser.TokenFunc, prodF parser.ProductionFunc) error
- func (p *Parser) ParseAndBuildAST() (parser.Node, error)
- func (p *Parser) ParseAndEvaluate(eval EvaluateFunc) (*Value, error)
type ParsingTable
- func NewParsingTable(states []State, terminals []grammar.Terminal, ...) *ParsingTable
- func (t *ParsingTable) ACTION(s State, a grammar.Terminal) (*Action, error)
- func (t *ParsingTable) AddACTION(s State, a grammar.Terminal, action *Action) bool
- func (t *ParsingTable) Equal(rhs *ParsingTable) bool
- func (t *ParsingTable) GOTO(s State, A grammar.NonTerminal) (State, error)
- func (t *ParsingTable) ResolveConflicts() error
- func (t *ParsingTable) SetGOTO(s State, A grammar.NonTerminal, next State)
- func (t *ParsingTable) String() string
type Precedence
- func (p *Precedence) Equal(rhs *Precedence) bool
- func (p *Precedence) String() string
type PrecedenceHandle
- func PrecedenceHandleForProduction(p *grammar.Production) *PrecedenceHandle
- func PrecedenceHandleForTerminal(t grammar.Terminal) *PrecedenceHandle
- func (h *PrecedenceHandle) Equal(rhs *PrecedenceHandle) bool
- func (h *PrecedenceHandle) IsProduction() bool
- func (h *PrecedenceHandle) IsTerminal() bool
- func (h *PrecedenceHandle) String() string
type PrecedenceHandles
- func NewPrecedenceHandles(handles ...*PrecedenceHandle) PrecedenceHandles
type PrecedenceLevel
- func (p *PrecedenceLevel) Equal(rhs *PrecedenceLevel) bool
- func (p *PrecedenceLevel) String() string
type PrecedenceLevels
- func (p PrecedenceLevels) Compare(lhs, rhs *ActionHandlePair) (int, error)
- func (p PrecedenceLevels) Equal(rhs PrecedenceLevels) bool
- func (p PrecedenceLevels) Precedence(h *PrecedenceHandle) (*Precedence, bool)
- func (p PrecedenceLevels) String() string
- func (p PrecedenceLevels) Verify() error
type State
type StateMap
- func BuildStateMap(C ItemSetCollection) StateMap
- func (m StateMap) All() iter.Seq2[State, ItemSet]
- func (m StateMap) FindItem(s State, item Item) int
- func (m StateMap) FindItemSet(I ItemSet) State
- func (m StateMap) Item(s State, i int) Item
- func (m StateMap) ItemSet(s State) ItemSet
- func (m StateMap) States() []State
- func (m StateMap) String() string
type Value
- func (v *Value) String() string

Constants ¶

View Source

const ErrState = State(-1)

Variables ¶

View Source

var (
	EqItem = func(lhs, rhs Item) bool {
		return lhs.Equal(rhs)
	}

	CmpItem = func(lhs, rhs Item) int {
		return lhs.Compare(rhs)
	}

	EqItemSet = func(lhs, rhs ItemSet) bool {
		return lhs.Equal(rhs)
	}

	CmpItemSet = cmpItemSet
)

View Source

var (
	EqState   = generic.NewEqualFunc[State]()
	HashState = hash.HashFuncForInt[State](nil)

	CmpState = func(lhs, rhs State) int {
		return int(lhs) - int(rhs)
	}
)

Functions ¶

This section is empty.

Types ¶

type Action ¶

type Action struct {
	Type       ActionType
	State      State               // Only set for SHIFT actions
	Production *grammar.Production // Only set for REDUCE actions
}

Action represents an action in the LR parsing table or automaton.

func (*Action) Equal ¶ added in v0.10.3

func (a *Action) Equal(rhs *Action) bool

Equal determines whether or not two actions are the same.

func (*Action) String ¶

func (a *Action) String() string

String returns a string representation of an action.

type ActionHandlePair ¶ added in v0.10.6

type ActionHandlePair struct {
	Action *Action
	Handle *PrecedenceHandle
}

ActionHandlePair associates an action with a precedence handle. It is used to resolve precedence conflicts when multiple handles have the same precedence order. The action is selected based on the associativity of the handles and the types of actions involved.

func (*ActionHandlePair) Equal ¶ added in v0.10.6

func (p *ActionHandlePair) Equal(rhs *ActionHandlePair) bool

Equal determines whether or not two pairs of action-handle are the same.

func (*ActionHandlePair) String ¶ added in v0.10.6

func (p *ActionHandlePair) String() string

String returns a string representation of the pair of action-handle.

type ActionType ¶

type ActionType int

ActionType enumerates the possible types of actions in an LR parser.

const (
	SHIFT  ActionType = 1 + iota // Advance to the next state by consuming input.
	REDUCE                       // Apply a production to reduce symbols on the stack.
	ACCEPT                       // Accept the input as successfully parsed.
	ERROR                        // Signal an error in parsing.
)

type AggregatedConflictError ¶ added in v0.10.6

type AggregatedConflictError []*ConflictError

AggregatedConflictError represents a collection of conflict errors. It is used to accumulate multiple conflict errors and generate a consolidated, more concise error message that is more useful for resolving all conflicts at once.

func (AggregatedConflictError) Error ¶ added in v0.10.6

func (e AggregatedConflictError) Error() string

Error builds and returns a consolidated string representation of the collection of conflict errors. The resulting message is more concise and provides a clearer summary of all the conflicts.

func (AggregatedConflictError) ErrorOrNil ¶ added in v0.10.6

func (e AggregatedConflictError) ErrorOrNil() error

ErrorOrNil returns an error if the AggregatedConflictError contains any errors, or nil if it has none. This method is useful for ensuring that a valid error value is returned after accumulating errors, indicating whether errors are present or not.

func (AggregatedConflictError) Unwrap ¶ added in v0.10.6

func (e AggregatedConflictError) Unwrap() []error

Unwrap implements the unwrap interface for AggregatedConflictError. It returns the slice of accumulated errors wrapped in the AggregatedConflictError instance. If there are no errors, it returns nil, indicating that e does not wrap any error.

type Associativity ¶ added in v0.10.6

type Associativity int

Associativity represents the associativity property of a terminal or a production rule.

const (
	NONE  Associativity = iota // Not associative
	LEFT                       // Left-associative
	RIGHT                      // Right-associative
)

func (Associativity) String ¶ added in v0.10.6

func (a Associativity) String() string

String returns a string representation of the associativity.

type Automaton ¶ added in v0.10.3

type Automaton interface {
	// Initial returns the initial item of an augmented grammar.
	// For LR(0), the initial item is "S′ → •S",
	// and for LR(1), the initial item is "S′ → •S, $".
	Initial() Item

	// Closure computes the closure of a given item set.
	// It may compute the closure for either an LR(0) item set or an LR(1) item set.
	CLOSURE(ItemSet) ItemSet

	// GOTO(I, X) computes the closure of the set of all items "A → αX•β",
	// where "A → α•Xβ" is in the set of items I and X is a grammar symbol.
	//
	// The GOTO function defines transitions in the automaton for the grammar.
	// Each state of the automaton corresponds to a set of items, and
	// GOTO(I, X) specifies the transition from the state I on grammar symbol X.
	GOTO(ItemSet, grammar.Symbol) ItemSet

	// Canonical constructs the canonical collection of item sets for the augmented grammar G′.
	//
	// The canonical collection forms the basis for constructing an automaton, used for making parsing decisions.
	// Each state of the automaton corresponds to an item set in the canonical collection.
	Canonical() ItemSetCollection
}

Automaton represents the core of an LR automaton used in LR parsing. It provides essential functions for constructing LR parsing tables, including state transitions (GOTO) and generating item sets (Canonical).

An LR automaton has two orthogonal dimensions:

The item dimension: it can be based on either LR(0) or LR(1) items.
The item set dimension: it can operate on either complete item sets or only kernel item sets.

type ConflictError ¶ added in v0.10.6

type ConflictError struct {
	State    State
	Terminal grammar.Terminal
	Actions  set.Set[*Action]
}

ConflictError represents a conflict in an LR parsing table. A conflict occurs when the grammar is ambiguous, resulting in multiple actions being associated with a specific state s and terminal a in the ACTION table. A conflict is either a shift/reduce conflict or a reduce/reduce conflict.

func (*ConflictError) Error ¶ added in v0.10.6

func (e *ConflictError) Error() string

Error returns a detailed string representation of the conflict error.

func (*ConflictError) IsReduceReduce ¶ added in v0.10.6

func (e *ConflictError) IsReduceReduce() bool

IsReduceReduce returns true if the conflict is a reduce/reduce conflict. A reduce/reduce conflict occurs when all actions associated with a state and terminal are REDUCE actions, and there is more than one such action.

func (*ConflictError) IsShiftReduce ¶ added in v0.10.6

func (e *ConflictError) IsShiftReduce() bool

IsShiftReduce returns true if the conflict is a shift/reduce conflict. A shift/reduce conflict arises when there is at least one SHIFT action and one REDUCE action associated with the same state and terminal.

type EvaluateFunc ¶ added in v0.10.5

type EvaluateFunc func(*grammar.Production, []*Value) (any, error)

EvaluateFunc is a function invoked every time a production rule is matched or applied during the parsing of an input string. It receives a list of values corresponding to the right-hand side of the matched production and expects a value to be returned representing the left-hand side of the production.

The returned value will be subsequently used as an input in the evaluation of other production rules. Both the input and output values are of the generic type any.

The caller is responsible for ensuring that each value is converted to the appropriate type based on the production rule and the position of the symbol corresponding to the value in the production's right-hand side. The input values must retain the same type they were originally evaluated as when returned.

The function may return an error if there are issues with the input values, such as mismatched types or unexpected inputs.

type Grammar ¶ added in v0.10.6

type Grammar struct {
	*grammar.CFG
	Automaton
}

Grammar represents a context-free grammar with additional features tailored for LR parsing and constructing an LR parser. The grammar is augmented with a new start symbol S′ and a new production "S′ → S" to facilitate the construction of the LR parser. It extends a regular context-free grammar by incorporating precedence and associativity information to handle ambiguities in the grammar and resolve conflicts. It also provides essential functionalities for building an LR automaton and the corresponding LR parsing table.

Always use one of the provided functions to create a new instance of this type. Direct instantiation of the struct is discouraged.

func NewGrammarWithLR0 ¶ added in v0.10.6

func NewGrammarWithLR0(G *grammar.CFG) *Grammar

NewGrammarWithLR0 creates a new augmented grammar based on the complete sets of LR(0) items.

func NewGrammarWithLR0Kernel ¶ added in v0.10.6

func NewGrammarWithLR0Kernel(G *grammar.CFG) *Grammar

NewGrammarWithLR0Kernel creates a new augmented grammar based on the kernel sets of LR(0) items.

func NewGrammarWithLR1 ¶ added in v0.10.6

func NewGrammarWithLR1(G *grammar.CFG) *Grammar

NewGrammarWithLR1 creates a new augmented grammar based on the complete sets of LR(1) items.

func NewGrammarWithLR1Kernel ¶ added in v0.10.6

func NewGrammarWithLR1Kernel(G *grammar.CFG) *Grammar

NewGrammarWithLR1Kernel creates a new augmented grammar based on the kernel sets of LR(1) items.

type Item ¶

type Item interface {
	fmt.Stringer
	generic.Equaler[Item]
	generic.Comparer[Item]

	// IsInitial checks if an item is the initial item in the augmented grammar.
	// For LR(0), the initial item is "S′ → •S",
	// and for LR(1), the initial item is "S′ → •S, $".
	IsInitial() bool

	// IsKernel determines whether an item is a kernel item.
	//
	// Kernel items include:
	//
	//   - The initial item.
	//   - All items where the dot is not at the beginning (left end) of the item's body.
	//
	// Non-kernel items are those where the dot is at the beginning, except for the initial item.
	IsKernel() bool

	// IsComplete checks if the dot has reached the end of the item's body.
	IsComplete() bool

	// IsFinal checks if an item is the final item in the augmented grammar.
	// For LR(0), the final item is "S′ → S•",
	// and for LR(1), the initial item is "S′ → S•, $".
	IsFinal() bool

	// Dot returns the grammar symbol at the dot position in the item's body.
	// If the dot is at the end of the body, it returns nil and false.
	DotSymbol() (grammar.Symbol, bool)

	// NextItem generates a new item by advancing the dot one position forward in the item's body.
	// If the dot is at the end of the body, it returns an empty item and false.
	Next() (Item, bool)
}

Item represents an LR item for a context-free grammar.

This interface defines the methods required by an LR parser.

type Item0 ¶ added in v0.10.3

type Item0 struct {
	*grammar.Production
	Start grammar.NonTerminal // The start symbol S′ in the augmented grammar.
	Dot   int                 // Position of the dot in the production body.
}

Item0 represents an LR(0) item for a context-free grammar. It implements the Item interface.

An LR(0) item is a production with a dot at some position of the body. For example, production A → XYZ yields the four LR(0) items:

A → •XYZ
A → X•YZ
A → XY•Z
A → XYZ•

The production A → ε generates only one LR(0) item, A → •.

Intuitively, an LR(0) item tracks the progress of a production during parsing. For instance, A → X•YZ indicates that a string derivable from X has been seen, and the parser expects a string derivable from YZ next.

func (*Item0) Compare ¶ added in v0.10.3

func (i *Item0) Compare(rhs Item) int

Compare compares two LR(0) items to establish a consistent and deterministic order. The comparison follows these rules:

Initial items precede non-initial items.
Kernel items precede non-kernel items.
If both items are kernel or non-kernel, items with production heads equal to the augmented start symbol S′ come first.
Next, items with further dot positions come first.
If dot positions are identical, the order is determined by comparing productions.

This function is useful for sorting LR(0) items, ensuring stability and predictability in their ordering.

func (*Item0) DotSymbol ¶ added in v0.10.3

func (i *Item0) DotSymbol() (grammar.Symbol, bool)

Dot returns the grammar symbol at the dot position in the item's body. If the dot is at the end of the body, it returns nil and false. For example, in the LR(0) item A → α•Bβ, it returns B.

func (*Item0) Equal ¶ added in v0.10.3

func (i *Item0) Equal(rhs Item) bool

Equal determines whether or not two LR(0) items are the same.

func (*Item0) IsComplete ¶ added in v0.10.3

func (i *Item0) IsComplete() bool

IsComplete checks if the dot has reached the end of the item's body.

func (*Item0) IsFinal ¶ added in v0.10.3

func (i *Item0) IsFinal() bool

IsInitial checks if an LR(0) item is the final item "S′ → S•" in the augmented grammar.

func (*Item0) IsInitial ¶ added in v0.10.3

func (i *Item0) IsInitial() bool

IsInitial checks if an LR(0) item is the initial item "S′ → •S" in the augmented grammar.

func (*Item0) IsKernel ¶ added in v0.10.3

func (i *Item0) IsKernel() bool

IsKernel determines whether an LR(0) item is a kernel item.

Kernel items include:

The initial item, "S′ → •S".
All items where the dot is not at the beginning (left end) of the item's body.

Non-kernel items are those where the dot is at the beginning, except for the initial item.

func (*Item0) Item1 ¶ added in v0.10.3

func (i *Item0) Item1(lookahead grammar.Terminal) *Item1

Item1 converts an LR(0) item into its equivalent LR(1) item by adding a lookahead.

func (*Item0) Next ¶ added in v0.10.3

func (i *Item0) Next() (Item, bool)

NextItem generates a new LR(0) item by advancing the dot one position forward in the item's body. If the dot is at the end of the body, it returns an empty LR(0) item and false. For example, for the LR(0) item A → α•Bβ, it returns A → αB•β.

func (*Item0) String ¶ added in v0.10.3

func (i *Item0) String() string

String returns a string representation of an LR(0) item.

type Item1 ¶ added in v0.10.3

type Item1 struct {
	*grammar.Production
	Start     grammar.NonTerminal // The start symbol S′ in the augmented grammar.
	Dot       int                 // Position of the dot in the production body.
	Lookahead grammar.Terminal
}

Item1 represents an LR(1) item for a context-free grammar. The "1" indicates the length of the lookahead of the item. Item1 implements the Item interace.

An LR(1) item is a production with a dot at some position of the body, followed by either a terminal symbol or the right endmarker $.

A → α.β, a

Informally, for an item of the form "A → α., a", a reduction by A → α is performed only if the next input symbol matches a. The lookahead does not impact an item of the form "A → α.β, a" when β is not ε.

func (*Item1) Compare ¶ added in v0.10.3

func (i *Item1) Compare(rhs Item) int

Compare compares two LR(1) items to establish a consistent and deterministic order. The comparison follows these rules:

Initial items precede non-initial items.
Kernel items precede non-kernel items.
If both items are kernel or non-kernel, items with production heads equal to the augmented start symbol S′ come first.
Next, items with further dot positions come first.
If dot positions are identical, the order is determined by comparing productions.
If both dot positions and productions are identical, the order is determined by comparing lookahead symbols.

This function is useful for sorting LR(1) items, ensuring stability and predictability in their ordering.

func (*Item1) DotSymbol ¶ added in v0.10.3

func (i *Item1) DotSymbol() (grammar.Symbol, bool)

Dot returns the grammar symbol at the dot position in the item's body. If the dot is at the end of the body, it returns nil and false. For example, in the LR(1) item A → α•Bβ, it returns B.

func (*Item1) Equal ¶ added in v0.10.3

func (i *Item1) Equal(rhs Item) bool

Equal determines whether or not two LR(1) items are the same.

func (*Item1) GetPrefix ¶ added in v0.10.3

func (i *Item1) GetPrefix() grammar.String[grammar.Symbol]

GetPrefix returns the prefix α of an LR(1) item of the form "A → α•β, a. α represents the portion of the production that has already been parsed. α can be the empty string ε if nothing has been parsed yet.

func (*Item1) GetSuffix ¶ added in v0.10.3

func (i *Item1) GetSuffix() grammar.String[grammar.Symbol]

GetSuffix returns the suffix β of an LR(1) item of the form "A → α•β, a". β represents the portion of the production that has not yet been parsed. β can be the empty string ε if there is no remaining unparsed portion.

func (*Item1) IsComplete ¶ added in v0.10.3

func (i *Item1) IsComplete() bool

IsComplete checks if the dot has reached the end of the item's body.

func (*Item1) IsFinal ¶ added in v0.10.3

func (i *Item1) IsFinal() bool

IsInitial checks if an LR(1) item is the final item "S′ → S•, $" in the augmented grammar.

func (*Item1) IsInitial ¶ added in v0.10.3

func (i *Item1) IsInitial() bool

IsInitial checks if an LR(1) item is the initial item "S′ → •S, $" in the augmented grammar.

func (*Item1) IsKernel ¶ added in v0.10.3

func (i *Item1) IsKernel() bool

IsKernel determines whether an LR(1) item is a kernel item.

Kernel items include:

The initial item, "S′ → •S, $".
All items where the dot is not at the beginning (left end) of the item's body.

Non-kernel items are those where the dot is at the beginning, except for the initial item.

func (*Item1) Item0 ¶ added in v0.10.3

func (i *Item1) Item0() *Item0

Item0 converts an LR(1) item into its equivalent LR(0) item by dropping the lookahead.

func (*Item1) Next ¶ added in v0.10.3

func (i *Item1) Next() (Item, bool)

NextItem generates a new LR(1) item by advancing the dot one position forward in the item's body. If the dot is at the end of the body, it returns an empty LR(1) item and false. For example, for the LR(1) item A → α•Bβ, it returns A → αB•β.

func (*Item1) String ¶ added in v0.10.3

func (i *Item1) String() string

String returns a string representation of an LR(1) item.

type ItemSet ¶

type ItemSet set.Set[Item]

ItemSet represents a set of items for a context-free grammar.

func NewItemSet ¶

func NewItemSet(items ...Item) ItemSet

NewItemSet creates a new set of items.

type ItemSetCollection ¶

type ItemSetCollection set.Set[ItemSet]

ItemSet represents a collection of item sets for a context-free grammar.

func NewItemSetCollection ¶

func NewItemSetCollection(sets ...ItemSet) ItemSetCollection

NewItemSetCollection creates a new collection of item sets.

type Parser ¶ added in v0.10.3

type Parser struct {
	L lexer.Lexer
	T *ParsingTable
}

Parser is a general LR parser for LR(1) grammars. It implements the parser.Parser interface.

func (*Parser) Parse ¶ added in v0.10.3

func (p *Parser) Parse(tokenF parser.TokenFunc, prodF parser.ProductionFunc) error

Parse implements the LR parsing algorithm. It analyzes a sequence of input tokens (terminal symbols) provided by a lexical analyzer. It attempts to parse the input according to the production rules of a context-free grammar, determining whether the input string belongs to the language defined by the grammar.

This method requires a parsing table, which must be generated from a grammar by an LR parser (e.g., Simple LR, Canonical LR, or LALR).

The Parse method invokes the provided functions each time a token or a production rule is matched. This allows the caller to process or react to each step of the parsing process.

An error is returned if the input fails to conform to the grammar rules, indicating a syntax issue, or if any of the provided functions return an error, indicating a semantic issue.

func (*Parser) ParseAndBuildAST ¶ added in v0.10.5

func (p *Parser) ParseAndBuildAST() (parser.Node, error)

ParseAndBuildAST implements the LR parsing algorithm. It analyzes a sequence of input tokens (terminal symbols) provided by a lexical analyzer. It attempts to parse the input according to the production rules of a context-free grammar, constructing an abstract syntax tree (AST) that reflects the structure of the input.

If the input string is valid, the root node of the AST is returned, representing the syntactic structure of the input string.

An error is returned if the input fails to conform to the grammar rules, indicating a syntax issue.

func (*Parser) ParseAndEvaluate ¶ added in v0.10.5

func (p *Parser) ParseAndEvaluate(eval EvaluateFunc) (*Value, error)

ParseAndEvaluate implements the LR parsing algorithm. It analyzes a sequence of input tokens (terminal symbols) provided by a lexical analyzer. It attempts to parse the input according to the production rules of a context-free grammar, evaluating the input in a bottom-up fashion using a rightmost derivation.

During the parsing process, the provided EvaluateFunc is invoked each time a production rule is matched. The function is called with values corresponding to the symbols in the body of the production, enabling the caller to process and evaluate the input incrementally.

An error is returned if the input fails to conform to the grammar rules, indicating a syntax issue, or if the evaluation function returns an error, indicating a semantic issue.

type ParsingTable ¶

type ParsingTable struct {
	States       []State
	Terminals    []grammar.Terminal
	NonTerminals []grammar.NonTerminal
	// contains filtered or unexported fields
}

ParsingTable represents an LR parsing table.

func NewParsingTable ¶

func NewParsingTable(states []State, terminals []grammar.Terminal, nonTerminals []grammar.NonTerminal, precedences PrecedenceLevels) *ParsingTable

NewParsingTable creates an empty parsing table for an LR parser.

func (*ParsingTable) ACTION ¶

func (t *ParsingTable) ACTION(s State, a grammar.Terminal) (*Action, error)

ACTION looks up and returns the action for state s and terminal a. If the ACTION[s,a] contains more than one action, it returns an erroneous ACTION and an error, indicating a conflict.

func (*ParsingTable) AddACTION ¶

func (t *ParsingTable) AddACTION(s State, a grammar.Terminal, action *Action) bool

AddACTION adds a new action for state s and terminal a to the parsing table. Multiple actions can be added for the same state s and terminal a. It returns false if the ACTION[s,a] contains more than one action, indicating a conflict.

func (*ParsingTable) Equal ¶ added in v0.10.5

func (t *ParsingTable) Equal(rhs *ParsingTable) bool

Equal determines whether or not two parsing tables are the same. The equality check is merely based on the ACTION and GOTO tables.

func (*ParsingTable) GOTO ¶

func (t *ParsingTable) GOTO(s State, A grammar.NonTerminal) (State, error)

GOTO looks up and returns the next state for state s and non-terminal A. If the GOTO[s,A] contains more than one state, it returns an error.

func (*ParsingTable) ResolveConflicts ¶ added in v0.10.6

func (t *ParsingTable) ResolveConflicts() error

ResolveConflicts checks the parsing table for any conflicts between actions. A conflict occurs when multiple actions are assigned to the same state and terminal symbol. Conflicts arise when the grammar is ambiguous.

If conflicts are found, this method first attempts to resolve them using the specified precedence levels. The action with the highest precedence replaces all conflicting actions in the table entry.

If a conflict cannot be resolved, a detailed error describing the conflict is created and accumulated.

func (*ParsingTable) SetGOTO ¶

func (t *ParsingTable) SetGOTO(s State, A grammar.NonTerminal, next State)

SetGOTO updates the next state for state s and non-terminal A in the parsing table. If the next state is ErrState, it will not be added to the table.

func (*ParsingTable) String ¶

func (t *ParsingTable) String() string

String returns a human-readable string representation of the parsing table.

type Precedence ¶ added in v0.10.6

type Precedence struct {
	Order int
	Associativity
}

Precedence represents the precedence of a terminal or a production rule.

func (*Precedence) Equal ¶ added in v0.10.6

func (p *Precedence) Equal(rhs *Precedence) bool

Equal determines whether or not two predencces are the same.

func (*Precedence) String ¶ added in v0.10.6

func (p *Precedence) String() string

String returns a string representation of the predencces.

type PrecedenceHandle ¶ added in v0.10.6

type PrecedenceHandle struct {
	*grammar.Terminal
	*grammar.Production
}

PrecedenceHandle represents either a terminal symbol or a production rule in the context of determining precedence for conflict resolution.

func PrecedenceHandleForProduction ¶ added in v0.10.6

func PrecedenceHandleForProduction(p *grammar.Production) *PrecedenceHandle

PrecedenceHandleForProduction creates a new precedence handle for a production rule. If the production contains terminal symbols, the handle is represented by the first terminal in the right-hand side (body) of the production. Otherwise, the handle is represented by the production itself.

func PrecedenceHandleForTerminal ¶ added in v0.10.6

func PrecedenceHandleForTerminal(t grammar.Terminal) *PrecedenceHandle

PrecedenceHandleForTerminal creates a new precedence handle represented by a terminal symbol.

func (*PrecedenceHandle) Equal ¶ added in v0.10.6

func (h *PrecedenceHandle) Equal(rhs *PrecedenceHandle) bool

Equal determines whether or not two handles are the same.

func (*PrecedenceHandle) IsProduction ¶ added in v0.10.6

func (h *PrecedenceHandle) IsProduction() bool

IsProduction returns true if the handle represents a production rule.

func (*PrecedenceHandle) IsTerminal ¶ added in v0.10.6

func (h *PrecedenceHandle) IsTerminal() bool

IsTerminal returns true if the handle represents a terminal symbol.

func (*PrecedenceHandle) String ¶ added in v0.10.6

func (h *PrecedenceHandle) String() string

String returns a string representation of the handle.

type PrecedenceHandles ¶ added in v0.10.6

type PrecedenceHandles set.Set[*PrecedenceHandle]

PrecedenceHandles represents a set of terminals and/or production rules (referred to as handles).

func NewPrecedenceHandles ¶ added in v0.10.6

func NewPrecedenceHandles(handles ...*PrecedenceHandle) PrecedenceHandles

NewPrecedenceHandles creates a new set of terminals and/or production rules (referred to as handles).

type PrecedenceLevel ¶ added in v0.10.6

type PrecedenceLevel struct {
	Associativity Associativity
	Handles       PrecedenceHandles
}

PrecedenceLevel defines a set of terminals and/or production rules (referred to as handles), each of which shares the same precedence level and associativity.

func (*PrecedenceLevel) Equal ¶ added in v0.11.0

func (p *PrecedenceLevel) Equal(rhs *PrecedenceLevel) bool

Equal determines whether or not two precedence levels are the same.

func (*PrecedenceLevel) String ¶ added in v0.11.0

func (p *PrecedenceLevel) String() string

String returns a string representation of the precedence level.

type PrecedenceLevels ¶ added in v0.10.6

type PrecedenceLevels []*PrecedenceLevel

PrecedenceLevels represents an ordered list of precedence levels defined for specific terminals or production rules of a context-free grammar. The order of these levels is crucial for resolving conflicts.

func (PrecedenceLevels) Compare ¶ added in v0.10.6

func (p PrecedenceLevels) Compare(lhs, rhs *ActionHandlePair) (int, error)

Compare compares two action-handle pairs to determine their precedence. A handle that appears earlier in the list has higher precedence and is considered larger. If both handles have the same precedence level, they are compared based on the associativity of their level. In this case, the associativity determines which actions types precede.

If either handle is not found in the list, the function returns -1 and false.

func (PrecedenceLevels) Equal ¶ added in v0.11.0

func (p PrecedenceLevels) Equal(rhs PrecedenceLevels) bool

Equal determines whether or not two ordered list of precedence levels are the same.

func (PrecedenceLevels) Precedence ¶ added in v0.10.6

func (p PrecedenceLevels) Precedence(h *PrecedenceHandle) (*Precedence, bool)

Precedence searches for a precedence handle in the list of precedence levels. If found, it returns the handle's associativity and its order in the list. If not found, it returns nil and false.

func (PrecedenceLevels) String ¶ added in v0.11.0

func (p PrecedenceLevels) String() string

String returns a string representation of the list of precedence levels.

func (PrecedenceLevels) Verify ¶ added in v0.11.0

func (p PrecedenceLevels) Verify() error

Verify checks whether a list of precedence levels is valid. A precedence handle must not appear in multiple levels.

type State ¶

type State int

State represents a state in the LR parsing table or automaton.

type StateMap ¶

type StateMap [][]Item

StateMap represents a two-leveled indexed map where the first index (state) maps to an item set, and within each item set, the second index maps to individual items.

func BuildStateMap ¶

func BuildStateMap(C ItemSetCollection) StateMap

BuildStateMap constructs a deterministic mapping of item sets and individual items. It creates a StateMap where the first index (state) corresponds to an item set, and the second index maps to individual items within each set, sorted in a consistent order.

func (StateMap) All ¶

func (m StateMap) All() iter.Seq2[State, ItemSet]

All returns an iterator sequence that contains all state-ItemSet pairs in the StateMap.

func (StateMap) FindItem ¶ added in v0.10.3

func (m StateMap) FindItem(s State, item Item) int

FindItemSet finds the state corresponding to a given item set. It returns the state if found, or ErrState if no match exists.

func (StateMap) FindItemSet ¶ added in v0.10.3

func (m StateMap) FindItemSet(I ItemSet) State

FindItemSet finds the state corresponding to a given item set. It returns the state if found, or ErrState if no match exists.

func (StateMap) Item ¶ added in v0.10.3

func (m StateMap) Item(s State, i int) Item

Item returns the item at the specified index i within the item set indexed by state s.

func (StateMap) ItemSet ¶ added in v0.10.3

func (m StateMap) ItemSet(s State) ItemSet

ItemSet returns the item set associated with the specified state s. It creates a new ItemSet from the items corresponding to the given state.

func (StateMap) States ¶ added in v0.10.3

func (m StateMap) States() []State

States returns all states in the map.

func (StateMap) String ¶

func (m StateMap) String() string

String returns a string representation of all states in the map.

type Value ¶ added in v0.10.5

type Value struct {
	Val any
	Pos *lexer.Position
}

Value represents a value used during the evaluation process, along with its corresponding positional information in the input.

func (*Value) String ¶ added in v0.10.5

func (v *Value) String() string

String returns a string representation of a value.

Source Files ¶

View all Source files

Directories ¶

Path	Synopsis
canonical Package canonical provides data structures and algorithms for building Canonical LR parsers.	Package canonical provides data structures and algorithms for building Canonical LR parsers.
lookahead Package lookahead provides data structures and algorithms for building Look-Ahead LR (LALR) parsers.	Package lookahead provides data structures and algorithms for building Look-Ahead LR (LALR) parsers.
simple Package simple provides data structures and algorithms for building Simple LR (SLR) parsers.	Package simple provides data structures and algorithms for building Simple LR (SLR) parsers.

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL