bytecode

package
v0.1.3
Published: Jan 21, 2023 License: MIT Imports: 26 Imported by: 0

README

bytecode

The bytecode package implements a simple bytecode interpreter. Operations (especially those that may be repeated) are compiled once into a bytecode representation of their semantics, so the source text does not have to be tokenized and parsed each time it runs.

Bytecode can be generated explicitly (as in the first example below) or by using the compiler package, which accepts text in a Go-like language called Ego and generates bytecode. Once the bytecode is generated, a runtime Context object is created to manage the execution of a bytecode stream. The Context holds the active symbol table, program counter, stack, etc. A Context is separate from the bytecode because the same bytecode could be executed on multiple threads, each with its own Context.

The bytecode also supports a symbol table. This can be used to store named values and retrieve them as part of the execution of the bytecode. The symbol table also contains function pointers for each of the built-in functions and function packages. Function calls are managed by the bytecode, which can also call a function provided by the caller as native Go code.

Example

Here is a trivial example of generating bytecode and executing it.

// Create a ByteCode object and write some instructions into it.
b := bytecode.New("sample program")
b.Emit(bytecode.Load, "strings")
b.Emit(bytecode.Member, "left")
b.Emit(bytecode.Push, "fruitcake")
b.Emit(bytecode.Push, 5)
b.Emit(bytecode.Call, 2)
b.Emit(bytecode.Stop)

// Make a symbol table, so we can call the function library.
s := symbols.NewSymbolTable("sample program")
functions.AddBuiltins(s)

// Make a runtime context for this bytecode, and then run it.
// The context has the symbol table and bytecode attached to it.
c := bytecode.NewContext(s, b)
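if err := c.Run(); err != nil {
	fmt.Printf("Run returned an error: %v\n", err)
}

// Retrieve the last value from the context's stack and extract a string.
v, err := c.Pop()
if err != nil {
	fmt.Printf("Pop returned an error: %v\n", err)
}
fmt.Printf("The result is %s\n", data.GetString(v))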

This creates a new bytecode stream, and then adds instructions to it. These instructions would normally be added by a parser. The Emit() method emits a single instruction: the opcode is required, and it may be followed by an optional operand value.

The stream puts arguments to a function on a stack, and then calls the function. The result is left on the stack, and can be popped off after execution completes. The result (which is always an abstract interface{}) is then converted to a string and printed.
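
The push, call, and pop sequence described above can be sketched with a toy stack machine. This is a minimal illustration and not the package's implementation: the `opcode`, `instruction`, `nativeFunc`, `left`, and `run` names are all hypothetical, and the function value is pushed directly rather than being loaded from a symbol table.

```go
package main

import "fmt"

// opcode and instruction are hypothetical stand-ins for the package's types.
type opcode int

const (
	opPush opcode = iota
	opCall
	opStop
)

type instruction struct {
	op      opcode
	operand interface{}
}

// nativeFunc models a function supplied by the caller as native Go code.
type nativeFunc func(args []interface{}) interface{}

// left returns the first n bytes of a string (hypothetical helper).
func left(args []interface{}) interface{} {
	s := args[0].(string)
	n := args[1].(int)
	if n < 0 {
		n = 0
	}
	if n > len(s) {
		n = len(s)
	}
	return s[:n]
}

// run executes the program and returns the stack as it was when execution stopped.
func run(program []instruction) []interface{} {
	stack := []interface{}{}
	for _, i := range program {
		switch i.op {
		case opPush:
			stack = append(stack, i.operand)
		case opCall:
			// The operand is the argument count. The function value sits
			// just below the arguments on the stack.
			argc := i.operand.(int)
			base := len(stack) - argc - 1
			fn := stack[base].(nativeFunc)
			result := fn(stack[base+1:])
			// Replace the function and its arguments with the result.
			stack = append(stack[:base], result)
		case opStop:
			return stack
		}
	}
	return stack
}

func main() {
	program := []instruction{
		{opPush, nativeFunc(left)},
		{opPush, "fruitcake"},
		{opPush, 5},
		{opCall, 2},
		{opStop, nil},
	}
	stack := run(program)
	fmt.Println("The result is", stack[len(stack)-1])
}
```

The result stays on the stack after the call, which is why the caller pops it once execution completes.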

ByteCodes

This table enumerates the bytecode values in the bytecode package, and what they do.

Opcode Description
Stop Stop execution of the current bytecode stream
AtLine <int> Record the current line number from the source file. This is used for forming error messages and debugging.
Push <any> Push a scalar (int, float64, string, or bool) value directly onto the stack.
Drop <int> Remove the specified number of items from the top of the stack and discard them.
Add Remove the top two items from the stack, add s[0] and s[1], and push the result back on the stack.
Sub Remove the top two items from the stack, subtract s[0] from s[1], and push the result back on the stack.
Div Remove the top two items from the stack, divide s[0] by s[1], and push the result back on the stack.
Mul Remove the top two items from the stack, multiply s[0] by s[1], and push the result back on the stack.
And Remove the top two items from the stack and Boolean AND them together, and push the result back on the stack.
Or Remove the top two items from the stack and Boolean OR them together and push the result back on the stack.
Negate Remove the top item from the stack and push the negative (or Boolean NOT) of the value back on the stack.
Equal Remove the top two items and push the boolean result of s[0] == s[1].
NotEqual Remove the top two items and push the boolean result of s[0] != s[1].
GreaterThan Remove the top two items and push the boolean result of s[0] > s[1].
LessThan Remove the top two items and push the boolean result of s[0] < s[1].
GreaterThanOrEqual Remove the top two items and push the boolean result of s[0] >= s[1].
LessThanOrEqual Remove the top two items and push the boolean result of s[0] <= s[1].
Load <string> Load the named value from the symbol table and push it on the stack.
Store <string> Remove the top item from the stack and store it in the symbol table using the given name.
Array <int> Remove the specified number of items from the stack and create an array with those values, and push it back on the stack.
MakeArray
LoadIndex Remove the top item and use it as an index into the second item which must be an array, then push the array element back on the stack.
StoreIndex Remove the top item and use it as an index into the second item which must be an array, and store the third item into the array.
Struct <int> Remove the given number of pairs of items. The first item must be a string, and becomes the field with the second item as its value. The resulting struct is pushed back on the stack.
Member Remove the top item and use it as a field name into the second item, which must be a struct, then push the field's value back on the stack.
Print Remove the top item from the stack and print it to the console.
Newline Print a newline character to the console.
Branch <addr> Transfer control to the instruction at the given location in the bytecode array.
BranchTrue <addr> Remove the top item. If it is true, transfer control to the instruction at the given location in the bytecode array.
BranchFalse <addr> Remove the top item. If it is false, transfer control to the instruction at the given location in the bytecode array.
Call <int> Remove the given number of items from the stack to form a parameter list. Then remove the pointer to the function. This can be a pointer to a native function or a pointer to a bytecode structure containing a function written in the Ego language.
Return <bool> Return from a function. If the boolean value is true, then a return code is also popped from the stack and passed to the caller's context.
SymbolCreate <string> Create a new symbol with the given name in the most-local table.
SymbolDelete <string> Delete the symbol from the nearest scope in which it exists.
Template <string> Compile the template on top of the stack, and store in the persisted template store under the <string> name.

Documentation

Constants

const (
	// Discards the catch set, which means all errors are caught.
	AllErrorsCatchSet = 0

	// Set of errors that an ?optional is permitted to ignore.
	OptionalCatchSet = 1
)
const GrowStackBy = 50

GrowStackBy indicates the number of elements to add to the stack when it runs out of space.

Variables

var InstructionsExecuted atomic.Int64

InstructionsExecuted counts the number of byte code instructions executed.

var MaxStackSize atomic.Int32

MaxStackSize records the largest stack size encountered during a stack push operation. This can be used to determine if the initial stack size is adequate.
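
As a rough illustration of how a stack might grow in fixed increments while recording a high-water mark, here is a minimal sketch. The `stack` type, the `growBy` constant, and the `maxDepth` variable are hypothetical and only mirror the ideas behind GrowStackBy and MaxStackSize; the package's own stack code may differ.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// growBy mirrors the idea of GrowStackBy (illustrative value).
const growBy = 50

// maxDepth is a hypothetical high-water mark shared by all stacks.
var maxDepth atomic.Int32

type stack struct {
	items []interface{}
	top   int // index of the next free slot
}

// push stores v, growing the backing slice by growBy slots when it is full.
func (s *stack) push(v interface{}) {
	if s.top >= len(s.items) {
		s.items = append(s.items, make([]interface{}, growBy)...)
	}
	s.items[s.top] = v
	s.top++

	// Record the largest depth seen so far. The compare-and-swap loop
	// keeps the update safe when several goroutines push concurrently
	// on their own stacks.
	for {
		old := maxDepth.Load()
		if int32(s.top) <= old || maxDepth.CompareAndSwap(old, int32(s.top)) {
			break
		}
	}
}

func main() {
	s := &stack{}
	for i := 0; i < 120; i++ {
		s.push(i)
	}
	fmt.Println("slots allocated:", len(s.items))
	fmt.Println("high-water mark:", maxDepth.Load())
}
```

If the recorded high-water mark regularly exceeds the initial allocation, a larger initial stack size would avoid the repeated growth steps.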

var Optimizations = []optimization{
	{
		Description: "Load followed by SetThis",
		Pattern: []instruction{
			{
				Operation: Load,
				Operand:   placeholder{Name: "name"},
			},
			{
				Operation: SetThis,
				Operand:   nil,
			},
		},
		Replacement: []instruction{
			{
				Operation: LoadThis,
				Operand:   placeholder{Name: "name"},
			},
		},
	},
	{
		Description: "Collapse constant push and createandstore",
		Pattern: []instruction{
			{
				Operation: Push,
				Operand:   placeholder{Name: "value"},
			},
			{
				Operation: CreateAndStore,
				Operand:   placeholder{Name: "name"},
			},
		},
		Replacement: []instruction{
			{
				Operation: CreateAndStore,
				Operand: []interface{}{
					placeholder{Name: "name"},
					placeholder{Name: "value"},
				},
			},
		},
	},
	{
		Description: "Unnecessary stack marker for constant store",
		Pattern: []instruction{
			{
				Operation: Push,
				Operand:   NewStackMarker("let"),
			},
			{
				Operation: Push,
				Operand:   placeholder{Name: "constant"},
			},
			{
				Operation: CreateAndStore,
				Operand:   placeholder{Name: "name"},
			},
			{
				Operation: DropToMarker,
				Operand:   NewStackMarker("let"),
			},
		},
		Replacement: []instruction{
			{
				Operation: Push,
				Operand:   placeholder{Name: "constant"},
			},
			{
				Operation: CreateAndStore,
				Operand:   placeholder{Name: "name"},
			},
		},
	},
	{
		Description: "Sequential PopScope",
		Pattern: []instruction{
			{
				Operation: PopScope,
				Operand:   placeholder{Name: "count1", Operation: OptCount, Register: 1},
			},
			{
				Operation: PopScope,
				Operand:   placeholder{Name: "count2", Operation: OptCount, Register: 1},
			},
		},
		Replacement: []instruction{
			{
				Operation: PopScope,
				Operand:   placeholder{Name: "count", Operation: OptRead, Register: 1},
			},
		},
	},
	{
		Description: "Create and store",
		Pattern: []instruction{
			{
				Operation: SymbolCreate,
				Operand:   placeholder{Name: "symbolName"},
			},
			{
				Operation: Store,
				Operand:   placeholder{Name: "symbolName"},
			},
		},
		Replacement: []instruction{
			{
				Operation: CreateAndStore,
				Operand:   placeholder{Name: "symbolName"},
			},
		},
	},
	{
		Description: "Push and Storeindex",
		Pattern: []instruction{
			{
				Operation: Push,
				Operand:   placeholder{Name: "value"},
			},
			{
				Operation: StoreIndex,
			},
		},
		Replacement: []instruction{
			{
				Operation: StoreIndex,
				Operand:   placeholder{Name: "value"},
			},
		},
	},
	{
		Description: "Constant storeAlways",
		Pattern: []instruction{
			{
				Operation: Push,
				Operand:   placeholder{Name: "value"},
			},
			{
				Operation: StoreAlways,
				Operand:   placeholder{Name: "name"},
			},
		},
		Replacement: []instruction{
			{
				Operation: StoreAlways,
				Operand: []interface{}{
					placeholder{Name: "name"},
					placeholder{Name: "value"},
				},
			},
		},
	},
	{
		Description: "Constant addition fold",
		Pattern: []instruction{
			{
				Operation: Push,
				Operand:   placeholder{Name: "v1"},
			},
			{
				Operation: Push,
				Operand:   placeholder{Name: "v2"},
			},
			{
				Operation: Add,
			},
		},
		Replacement: []instruction{
			{
				Operation: Push,
				Operand:   placeholder{Name: "sum", Operation: OptRunConstantFragment},
			},
		},
	},
	{
		Description: "Constant subtraction fold",
		Pattern: []instruction{
			{
				Operation: Push,
				Operand:   placeholder{Name: "v1"},
			},
			{
				Operation: Push,
				Operand:   placeholder{Name: "v2"},
			},
			{
				Operation: Sub,
			},
		},
		Replacement: []instruction{
			{
				Operation: Push,
				Operand:   placeholder{Name: "difference", Operation: OptRunConstantFragment},
			},
		},
	},
	{
		Description: "Constant multiplication fold",
		Pattern: []instruction{
			{
				Operation: Push,
				Operand:   placeholder{Name: "v1"},
			},
			{
				Operation: Push,
				Operand:   placeholder{Name: "v2"},
			},
			{
				Operation: Mul,
			},
		},
		Replacement: []instruction{
			{
				Operation: Push,
				Operand:   placeholder{Name: "product", Operation: OptRunConstantFragment},
			},
		},
	},
}
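
The constant-fold entries in the table above replace a Push, Push, arithmetic sequence with a single Push. The following is a minimal peephole-optimizer sketch of that idea, not the package's optimizer: the `opcode`, `instruction`, and `foldConstants` names are hypothetical, operands are assumed to be ints, and the folded value is computed directly rather than through a mechanism like OptRunConstantFragment.

```go
package main

import "fmt"

type opcode int

const (
	opPush opcode = iota
	opAdd
	opMul
	opPrint
)

type instruction struct {
	op      opcode
	operand interface{}
}

// foldConstants repeatedly replaces "Push a, Push b, Add" (or Mul) with a
// single "Push result" until no more patterns match.
func foldConstants(in []instruction) []instruction {
	// Work on a copy so the caller's slice is left untouched.
	code := append([]instruction(nil), in...)

	for changed := true; changed; {
		changed = false
		for i := 0; i+2 < len(code); i++ {
			a, aok := code[i].operand.(int)
			b, bok := code[i+1].operand.(int)
			if code[i].op != opPush || code[i+1].op != opPush || !aok || !bok {
				continue
			}
			var v int
			switch code[i+2].op {
			case opAdd:
				v = a + b
			case opMul:
				v = a * b
			default:
				continue
			}
			// Splice the single replacement instruction over the three
			// matched instructions.
			tail := append([]instruction{{opPush, v}}, code[i+3:]...)
			code = append(code[:i], tail...)
			changed = true
		}
	}
	return code
}

func main() {
	// Represents 2 + (3 * 4) followed by a print.
	code := []instruction{
		{opPush, 2}, {opPush, 3}, {opPush, 4},
		{opMul, nil}, {opAdd, nil}, {opPrint, nil},
	}
	folded := foldConstants(code)
	fmt.Println(len(folded), folded[0].operand)
}
```

Running the pass to a fixed point matters here: folding the multiplication first exposes a new Push, Push, Add pattern that a single pass would miss.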

Functions

func CopyPackagesToSymbols

func CopyPackagesToSymbols(s *symbols.SymbolTable)

func Format

func Format(opcodes []instruction) string

Format formats an array of bytecodes.

func FormatInstruction

func FormatInstruction(i instruction) string

FormatInstruction formats a single instruction as a string.

func GetPackage

func GetPackage(name string) (*data.Package, bool)

func GoRoutine

func GoRoutine(fName string, parentCtx *Context, args []interface{})

GoRoutine allows calling a named function as a go routine, using arguments. The invocation of GoRoutine should be in a "go" statement to run the code.

func IsPackage

func IsPackage(name string) bool

Types

type ByteCode

type ByteCode struct {
	// contains filtered or unexported fields
}

ByteCode contains the context of the execution of a bytecode stream. Note that there is a dependency in format.go on the name of the "Declaration" variable. PLEASE NOTE that Name must be exported because reflection is used to format opaque pointers to bytecodes in the low-level formatter.

func New

func New(name string) *ByteCode

New generates and initializes a new bytecode.

func (*ByteCode) Append

func (b *ByteCode) Append(a *ByteCode)

Append appends another bytecode set to the current bytecode, and updates all the branch references within that code to reflect the new base location for the code segment.
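
A sketch of the relocation step may help. It assumes, as the Opcode constants in this package suggest, that every opcode after a sentinel value carries an address operand. The `appendCode` function and the opcode names below are hypothetical, not the package's code.

```go
package main

import "fmt"

type opcode int

const (
	opPush opcode = iota
	opPrint

	// Sentinel: every opcode after this one has an address operand.
	opBranchInstructions
	opBranch
	opBranchFalse
)

type instruction struct {
	op      opcode
	operand interface{}
}

// appendCode appends src to dst. Branch targets in src are relative to the
// start of src, so each one is shifted by the length of dst.
func appendCode(dst, src []instruction) []instruction {
	base := len(dst)
	for _, i := range src {
		// i is a copy, so src itself is not modified.
		if i.op > opBranchInstructions {
			i.operand = i.operand.(int) + base
		}
		dst = append(dst, i)
	}
	return dst
}

func main() {
	first := []instruction{
		{opPush, "a"}, {opPrint, nil}, {opPush, "b"},
	}
	second := []instruction{
		{opBranch, 2}, // jumps to address 2 of the second segment
		{opPush, "skipped"},
		{opPrint, nil},
	}
	combined := appendCode(first, second)
	fmt.Println(combined[3].operand) // the branch target after relocation
}
```

Because the first segment holds three instructions, the branch that targeted address 2 in its own segment targets address 5 in the combined code.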

func (*ByteCode) Call

func (b *ByteCode) Call(s *symbols.SymbolTable) (interface{}, error)

Call generates a one-time context for executing this bytecode, and returns a value as well as an error condition if there was one from executing the code.

func (*ByteCode) ClearLineNumbers

func (b *ByteCode) ClearLineNumbers()

ClearLineNumbers scans the bytecode and removes the AtLine numbers in the code so far. This is done when the @line directive resets the line number; all previous line numbers are no longer valid and are set to zero.

func (*ByteCode) Declaration

func (b *ByteCode) Declaration() *data.FunctionDeclaration

Return the declaration object from the bytecode. This is primarily used in routines that format information about the bytecode. If you change the name of this function, you will also need to update the MethodByName() calls for this same function name.

func (*ByteCode) Disasm

func (b *ByteCode) Disasm(ranges ...int)

Disasm prints out a representation of the bytecode for debugging purposes.

func (*ByteCode) Emit

func (b *ByteCode) Emit(opcode Opcode, operands ...interface{})

Emit emits a single instruction. The opcode is required, and can optionally be followed by an instruction operand (depending on which instruction is issued). The instruction is emitted at the current "next address" of the bytecode object, which is then incremented.

func (*ByteCode) EmitAt

func (b *ByteCode) EmitAt(address int, opcode Opcode, operands ...interface{})

EmitAt emits a single instruction. The opcode is required, and can optionally be followed by an instruction operand (depending on which instruction is issued). This stores the instruction at the given location in the bytecode array, but does not affect the emit position unless this operation requires expanding the bytecode storage.

func (*ByteCode) Instruction

func (b *ByteCode) Instruction(address int) *instruction

Instruction retrieves the instruction at the given address.

func (*ByteCode) Mark

func (b *ByteCode) Mark() int

Mark returns the address of the next instruction to be emitted. Use this BEFORE a call to Emit() if using it for branch address fixups later.

func (*ByteCode) Name

func (b *ByteCode) Name() string

func (ByteCode) NeedsCoerce

func (b ByteCode) NeedsCoerce(kind *data.Type) bool

func (*ByteCode) Opcodes

func (b *ByteCode) Opcodes() []instruction

Opcodes returns the opcode list for this bytecode array.

func (*ByteCode) Patch

func (b *ByteCode) Patch(start, deleteSize int, insert []instruction)

func (*ByteCode) Remove

func (b *ByteCode) Remove(address int)

Remove removes an instruction from the bytecode. If the address is >= 0, it is the absolute address of the instruction to remove. Otherwise, it is the offset from the end of the bytecode to remove.

func (*ByteCode) Run

func (b *ByteCode) Run(s *symbols.SymbolTable) error

Run generates a one-time context for executing this bytecode, and then executes the code.

func (*ByteCode) Seal

func (b *ByteCode) Seal() *ByteCode

Truncate the output array to the current bytecode size. This is also where we will optionally run an optimizer.

func (*ByteCode) SetAddress

func (b *ByteCode) SetAddress(mark int, address int) error

SetAddress sets the given value as the target of the marked instruction. This is often used when an address has been saved and we need to update a branch destination, usually for a backwards branch operation.

func (*ByteCode) SetAddressHere

func (b *ByteCode) SetAddressHere(mark int) error

SetAddressHere sets the current address as the destination of the instruction at the marked location. This is used for address fixups, typically for forward branches.
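
The forward-branch fixup pattern can be sketched with a toy code builder. This is an illustration under stated assumptions: the `builder` type and its `emit`, `mark`, and `setAddressHere` methods are hypothetical and only imitate the roles of Emit, Mark, and SetAddressHere.

```go
package main

import (
	"errors"
	"fmt"
)

type opcode int

const (
	opPush opcode = iota
	opPrint
	opBranchFalse
	opStop
)

type instruction struct {
	op      opcode
	operand interface{}
}

type builder struct{ code []instruction }

func (b *builder) emit(op opcode, operand interface{}) {
	b.code = append(b.code, instruction{op, operand})
}

// mark returns the address the next emit will use.
func (b *builder) mark() int { return len(b.code) }

// setAddressHere patches the instruction at mark so that it targets the
// address of the next instruction to be emitted.
func (b *builder) setAddressHere(mark int) error {
	if mark < 0 || mark >= len(b.code) {
		return errors.New("invalid mark")
	}
	b.code[mark].operand = len(b.code)
	return nil
}

// buildSample generates code for: if false { print "skipped" }; print "done".
func buildSample() *builder {
	b := &builder{}
	b.emit(opPush, false)
	m := b.mark()            // remember where the branch will live
	b.emit(opBranchFalse, 0) // target is not known yet
	b.emit(opPush, "skipped")
	b.emit(opPrint, nil)
	_ = b.setAddressHere(m) // the branch now lands just past the body
	b.emit(opPush, "done")
	b.emit(opPrint, nil)
	b.emit(opStop, nil)
	return b
}

// run executes the code and returns each printed value as a string.
func run(code []instruction) []string {
	var out []string
	var stack []interface{}
	for pc := 0; pc < len(code); {
		i := code[pc]
		pc++
		switch i.op {
		case opPush:
			stack = append(stack, i.operand)
		case opPrint:
			out = append(out, fmt.Sprint(stack[len(stack)-1]))
			stack = stack[:len(stack)-1]
		case opBranchFalse:
			cond := stack[len(stack)-1].(bool)
			stack = stack[:len(stack)-1]
			if !cond {
				pc = i.operand.(int)
			}
		case opStop:
			return out
		}
	}
	return out
}

func main() {
	b := buildSample()
	fmt.Println(b.code[1].operand, run(b.code))
}
```

The mark is taken before the branch is emitted, so it names the branch instruction itself; that is the instruction whose operand is patched once the end of the body is known.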

func (*ByteCode) SetDeclaration

func (b *ByteCode) SetDeclaration(fd *data.FunctionDeclaration) *ByteCode

func (*ByteCode) SetName

func (b *ByteCode) SetName(name string) *ByteCode

func (*ByteCode) String

func (b *ByteCode) String() string

String formats a bytecode as a function declaration string.

type CallFrame

type CallFrame struct {
	Module  string
	Line    int
	Package string
	// contains filtered or unexported fields
}

CallFrame is an object used to store state of the bytecode runtime environment just before making a call to a bytecode subroutine. This preserves the state of the stack, PC, and other data at the time of the call. When a bytecode subroutine returns, this object is removed from the stack and used to reset the bytecode runtime state.

Note that this is exported (as are Module and Line within it) to support formatting of trace data using reflection.

func (CallFrame) String

func (f CallFrame) String() string

type ConstantWrapper

type ConstantWrapper struct {
	Value interface{}
}

Note there are reflection dependencies on the name of the field; it must be named "Value".

func (ConstantWrapper) String

func (w ConstantWrapper) String() string

String generates a human-readable string describing the value in the constant wrapper.

type Context

type Context struct {
	// contains filtered or unexported fields
}

Context holds the runtime information about an instance of bytecode being executed.

func NewContext

func NewContext(s *symbols.SymbolTable, b *ByteCode) *Context

NewContext generates a new context. It must be passed a symbol table and a bytecode array. A context holds the runtime state of a given execution unit (program counter, runtime stack, symbol table) and is used to actually run bytecode. The bytecode can continue to be modified after it is associated with a context.

func (*Context) AppendSymbols

func (c *Context) AppendSymbols(s *symbols.SymbolTable) *Context

AppendSymbols appends a symbol table to the current context. This is used to add in compiler maps, for example.

func (*Context) EnableConsoleOutput

func (c *Context) EnableConsoleOutput(flag bool) *Context

EnableConsoleOutput tells the context to begin capturing all output normally generated from Print and Newline into a buffer instead of going to stdout.

func (*Context) FormatFrames

func (c *Context) FormatFrames(maxDepth int) string

FormatFrames is called from the runtime debugger to print out the current call frames stored on the stack. It chases the stack using the frame pointer (FP) in the current context which points to the saved frame. Its FP points to the previous saved frame, and so on.
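
The frame-pointer walk described above might look like the following sketch. The `frame` and `context` types are hypothetical simplifications; the real CallFrame holds additional unexported state.

```go
package main

import (
	"fmt"
	"strings"
)

// frame is a simplified saved call frame stored on the runtime stack.
type frame struct {
	module string
	line   int
	fp     int // stack index of the previous saved frame, or -1 if none
}

type context struct {
	stack []interface{}
	fp    int // stack index of the most recent saved frame, or -1 if none
}

// call saves the caller's location on the stack and links it to the
// previously saved frame.
func (c *context) call(module string, line int) {
	c.stack = append(c.stack, frame{module: module, line: line, fp: c.fp})
	c.fp = len(c.stack) - 1
}

// formatFrames follows the chain of frame pointers from newest to oldest,
// stopping after maxDepth frames.
func (c *context) formatFrames(maxDepth int) string {
	var b strings.Builder
	for fp, depth := c.fp, 0; fp >= 0 && depth < maxDepth; depth++ {
		f := c.stack[fp].(frame)
		fmt.Fprintf(&b, "%s:%d\n", f.module, f.line)
		fp = f.fp
	}
	return b.String()
}

func main() {
	c := &context{fp: -1}
	c.call("main", 10)
	c.stack = append(c.stack, "a local value") // ordinary data between frames
	c.call("helper", 42)
	fmt.Print(c.formatFrames(10))
}
```

Ordinary values can sit between saved frames on the stack; the walk skips them because it follows the stored indexes rather than scanning.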

func (*Context) GetLine

func (c *Context) GetLine() int

GetLine retrieves the current line number from the original source being executed. This is stored in the context every time an AtLine instruction is executed.

func (*Context) GetModuleName

func (c *Context) GetModuleName() string

GetModuleName returns the name of the current module (typically the function name or program name).

func (*Context) GetName

func (c *Context) GetName() string

func (*Context) GetOutput

func (c *Context) GetOutput() string

GetOutput retrieves the output buffer. This is the buffer that contains all Print and related bytecode instruction output. This is used when output capture is enabled, which typically happens when a program is running as a Web service.
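
Output capture of this kind can be sketched as follows. The `context` type and its `enableCapture`, `print`, and `output` methods are hypothetical; in particular, how the package's EnableConsoleOutput flag maps onto capture mode is not shown here.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// context is a hypothetical runtime context with optional output capture.
type context struct {
	capture *strings.Builder // nil means output goes to stdout
}

// enableCapture switches capture mode on or off. Turning it on starts a
// fresh buffer.
func (c *context) enableCapture(on bool) {
	if on {
		c.capture = &strings.Builder{}
	} else {
		c.capture = nil
	}
}

// print writes v to the capture buffer when capturing, else to stdout.
func (c *context) print(v interface{}) {
	if c.capture != nil {
		fmt.Fprint(c.capture, v)
		return
	}
	fmt.Fprint(os.Stdout, v)
}

// output returns everything captured so far.
func (c *context) output() string {
	if c.capture == nil {
		return ""
	}
	return c.capture.String()
}

func main() {
	c := &context{}
	c.enableCapture(true)
	c.print("hello")
	c.print("\n")
	c.print(42)
	// A web service handler could send this text as the response body.
	fmt.Printf("%q\n", c.output())
}
```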

func (*Context) GetSymbols

func (c *Context) GetSymbols() *symbols.SymbolTable

func (*Context) GetTokenizer

func (c *Context) GetTokenizer() *tokenizer.Tokenizer

GetTokenizer gets the tokenizer in the current context for tracing and debugging.

func (*Context) IsRunning

func (c *Context) IsRunning() bool

func (*Context) Pop

func (c *Context) Pop() (interface{}, error)

Pop removes the top-most item from the stack.

func (*Context) PrintThisStack

func (c *Context) PrintThisStack(operation string)

Add a line to the trace output that shows the "this" stack of saved function receivers.

func (*Context) Result

func (c *Context) Result() interface{}

func (*Context) Resume

func (c *Context) Resume() error

Used to resume execution after an event like the debugger being invoked.

func (*Context) Run

func (c *Context) Run() error

Run executes a bytecode context.

func (*Context) RunFromAddress

func (c *Context) RunFromAddress(addr int) error

RunFromAddress executes a bytecode context from a given starting address.

func (*Context) SetBreakOnReturn

func (c *Context) SetBreakOnReturn()

func (*Context) SetByteCode

func (c *Context) SetByteCode(b *ByteCode) *Context

SetByteCode attaches a new bytecode object to the current run context.

func (*Context) SetDebug

func (c *Context) SetDebug(b bool) *Context

SetDebug turns debugging mode on or off for the current context.

func (*Context) SetFullSymbolScope

func (c *Context) SetFullSymbolScope(b bool) *Context

SetFullSymbolScope sets the flag that indicates if a symbol table read can "see" a symbol outside the current function. The default is off, which means symbols are not visible outside the function unless they are in the global symbol table. If true, then a symbol can be read from any level of the symbol table parentage chain.

func (*Context) SetGlobal

func (c *Context) SetGlobal(name string, value interface{}) error

SetGlobal stores a value in the global symbol table that is at the top of the symbol table chain.

func (*Context) SetPC

func (c *Context) SetPC(pc int) *Context

SetPC sets the program counter (PC) which indicates the next instruction number to execute.

func (*Context) SetSingleStep

func (c *Context) SetSingleStep(b bool) *Context

SetSingleStep enables or disables single-step mode. This has no effect if debugging is not active.

func (*Context) SetStepOver

func (c *Context) SetStepOver(b bool) *Context

SetStepOver determines if single step operations step over a function call, or step into it.

func (*Context) SetTokenizer

func (c *Context) SetTokenizer(t *tokenizer.Tokenizer) *Context

SetTokenizer sets a tokenizer in the current context for use by tracing and debugging operations. This gives those functions access to the token stream used to compile the bytecode in this context.

func (*Context) SingleStep

func (c *Context) SingleStep() bool

SingleStep retrieves the current single-step setting for this context. This is used in the debugger to know how to handle break operations.

func (*Context) StepOver

func (c *Context) StepOver(b bool)

func (*Context) Tracing

func (c *Context) Tracing() bool

Tracing returns the trace status of the current context. When tracing is on, each time an instruction is executed, the current instruction and the top few items on the stack are printed to the console.

type DispatchMap

type DispatchMap map[Opcode]OpcodeHandler

DispatchMap is a map that is used to locate the function for an opcode.
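
A map-based dispatch loop could be sketched like this. The handler shape imitates OpcodeHandler, but the `context`, `dispatch`, `run`, and `errStop` names are hypothetical, and the way the real interpreter signals a Stop instruction is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

type opcode int

const (
	opStop opcode = iota
	opPush
	opAdd
)

type instruction struct {
	op      opcode
	operand interface{}
}

type context struct{ stack []int }

// handler has the same shape as an opcode handler: it receives the
// running context and the instruction operand.
type handler func(c *context, operand interface{}) error

// errStop is a hypothetical signal used to end execution normally.
var errStop = errors.New("stop")

var dispatch = map[opcode]handler{
	opStop: func(c *context, _ interface{}) error { return errStop },
	opPush: func(c *context, operand interface{}) error {
		v, ok := operand.(int)
		if !ok {
			return errors.New("push: operand is not an int")
		}
		c.stack = append(c.stack, v)
		return nil
	},
	opAdd: func(c *context, _ interface{}) error {
		n := len(c.stack)
		if n < 2 {
			return errors.New("add: stack underflow")
		}
		sum := c.stack[n-2] + c.stack[n-1]
		c.stack = append(c.stack[:n-2], sum)
		return nil
	},
}

// run looks up the handler for each instruction and invokes it.
func run(c *context, code []instruction) error {
	for _, i := range code {
		h, found := dispatch[i.op]
		if !found {
			return fmt.Errorf("unimplemented opcode %d", i.op)
		}
		if err := h(c, i.operand); err != nil {
			if errors.Is(err, errStop) {
				return nil
			}
			return err
		}
	}
	return nil
}

func main() {
	c := &context{}
	err := run(c, []instruction{
		{opPush, 2}, {opPush, 3}, {opAdd, nil}, {opStop, nil},
	})
	fmt.Println(c.stack, err)
}
```

Keeping handlers in a map means a new opcode only needs a new entry, and an opcode with no entry is reported as an error instead of being silently ignored.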

type Opcode

type Opcode int

Constants describing instruction opcodes.

const (
	Stop Opcode = iota // Stop must be the zero-th item.
	AtLine
	Add
	AddressOf
	And
	ArgCheck
	Array
	Auth
	BitAnd
	BitOr
	BitShift
	Call
	Coerce
	Constant
	Copy
	CreateAndStore
	DeRef
	Div
	Drop
	DropToMarker
	Dup
	EntryPoint
	Equal
	Exp
	Explode
	Flatten
	FromFile
	GetThis
	GetVarArgs
	Go
	GreaterThan
	GreaterThanOrEqual
	Import
	InFile
	InPackage
	LessThan
	LessThanOrEqual
	Load
	LoadIndex
	LoadSlice
	LoadThis
	Log
	MakeArray
	MakeMap
	Member
	ModeCheck
	Modulo
	Mul
	Negate
	Newline
	NoOperation
	NotEqual
	Or
	Panic
	PopPackage
	PopScope
	Print
	Push
	PushPackage
	PushScope
	RangeInit
	ReadStack
	RequiredType
	Response
	Return
	Say
	SetThis
	StackCheck
	StaticTyping
	Store
	StoreAlways
	StoreBytecode
	StoreChan
	StoreGlobal
	StoreIndex
	StoreInto
	StoreViaPointer
	Struct
	Sub
	Swap
	SymbolCreate
	SymbolDelete
	SymbolOptCreate
	Template
	Timer
	TryPop
	Wait
	WillCatch

	// Everything from here on is a branch instruction, whose
	// operand must be present and is an integer instruction
	// address in the bytecode array. These instructions are
	// patched with offsets when code is appended.
	//
	// The first one in this list MUST be BranchInstructions,
	// as it marks the start of the branch instructions, which
	// are instructions that can reference a bytecode address
	// as the operand.
	BranchInstructions
	Branch
	BranchTrue
	BranchFalse
	LocalCall
	RangeNext
	Try
)

type OpcodeHandler

type OpcodeHandler func(b *Context, i interface{}) error

OpcodeHandler defines a function that implements an opcode.

type OptimizerOperation

type OptimizerOperation int

const (
	OptNothing OptimizerOperation = iota
	OptStore
	OptRead
	OptCount
	OptRunConstantFragment
)

type StackMarker

type StackMarker struct {
	// contains filtered or unexported fields
}

StackMarker is a special object used to mark a location on the stack. It is used, for example, to mark locations to where a stack should be flushed. The marker contains a text description, and optionally any additional desired data.
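
Flushing a stack back to a marker can be sketched as shown below. The `marker` type and `dropToMarker` function are hypothetical, and the sketch assumes the marker itself is discarded along with everything above it.

```go
package main

import "fmt"

// marker is a simplified stack marker that carries only a label.
type marker struct{ label string }

// dropToMarker discards everything above the nearest marker with the given
// label, and the marker itself. If no such marker exists, the stack is
// returned unchanged and the second result is false.
func dropToMarker(stack []interface{}, label string) ([]interface{}, bool) {
	for i := len(stack) - 1; i >= 0; i-- {
		if m, ok := stack[i].(marker); ok && m.label == label {
			return stack[:i], true
		}
	}
	return stack, false
}

func main() {
	stack := []interface{}{1, marker{"let"}, "a", "b"}
	stack, found := dropToMarker(stack, "let")
	fmt.Println(stack, found)
}
```

Using a distinct marker type, rather than a reserved value such as nil, means ordinary data on the stack can never be mistaken for a marker.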

func NewStackMarker

func NewStackMarker(label string, values ...interface{}) StackMarker

NewStackMarker generates a new stack marker object, using the supplied label and optional list of data.

func (StackMarker) String

func (sm StackMarker) String() string

Produce a string representation of a stack marker.
