Documentation
¶
Overview ¶
Package trie implements prefix tree data structures.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Trie ¶
type Trie[V any] interface { fmt.Stringer Equaler[Trie[V]] Collection2[string, V] Tree2[string, V] Height() int Min() (string, V, bool) Max() (string, V, bool) Floor(string) (string, V, bool) Ceiling(string) (string, V, bool) DeleteMin() (string, V, bool) DeleteMax() (string, V, bool) Select(int) (string, V, bool) Rank(string) int Range(string, string) []KeyValue[string, V] RangeSize(string, string) int Match(string) []KeyValue[string, V] WithPrefix(string) []KeyValue[string, V] LongestPrefixOf(string) (string, V, bool) // contains filtered or unexported methods }
Trie represents a trie (prefix tree) abstract data type.
func NewBinary ¶
NewBinaryTrie creates a new binary trie tree.
A trie, or prefix tree, is an ordered tree uses the digits in the keys where the keys are usually strings. We can use any radix to decompose the keys into digits. We shall choose radixes so that the digits are decimal digits, English letters, ASCII characters, etc.
In a trie, keys are stored on a path from the root node to any arbitrary node. In a sense, keys are stored on edges and not in nodes and the nodes themselves do not store the keys. The root node is always associated with the empty string. All the descendants of a node have the same common prefix of the string associated with that node.
A binary trie is a binary tree in which the left child represents the next character in a string and the right child represents an alternative character in the same position as the current character. The root node is always a sentinel node with a nil right child.
_ | b-------------d | | a---------o a | | | b--d*--n x* d*--n | | | y* k* c | e*
Includes words baby, bad, bank, box, dad, and dance.
The second parameter (eqVal) is needed only if you want to use the Equal method.
func NewPatricia ¶
NewPatricia creates a new Patricia trie.
A Patricia trie is a space-optimized variation of tries (prefix trees). Patricia trie is a special case of Radix trie with a radix of two (r = 2^x, x = 1). As a result, each node has only two children, like a binary tree.
The root node's bit position is always zero and it always only has a left child. Keys are a sequence of bits stored in nodes (the number of nodes equals the number of keys). Patricia is a threaded tree in which nil links are also utilized.
A Patricia trie is derived from a digital search tree in few steps.
A digital search tree is a binary tree in which keys are a sequence of bits stored in nodes. The i'th level in a DST corresponds to the i'th bit of the keys. At any given node at the i'th level, the left sub-tree includes all the keys with i'th bit set to zero and the right sub-tree includes all the keys with i'th bit set to one.
┌──<1000>──┐ │ │ ┌──<0010> <1001>──┐ │ │ ┌──<0011>──┐ <1100> │ │ <0000> <0011>
For searching (or inserting) a key in a DST, we start from the root node. At a given node at the i'th level, we compare the search key with the key in the node. If keys are not equal, we look at the i'th bit of the search key; If zero we continue the search with the left sub-tree, and if one we continue the search with the right sub-tree.
To reduce the number of compares required, we introduce the binary trie. A binary trie, in this context, is a binary tree with two kinds of nodes: internal nodes and leaf nodes. Internals nodes are only for guiding the search and leaf nodes contain the keys. Similar to digital search trees, the i'th level corresponds to the i'th bit of the keys.
┌───────────[ ]───────────┐ │ │ ┌──[ ] ┌──[ ]──┐ │ │ │ ┌──[ ]──┐ ┌──[ ] <1100> │ │ │ ┌──[ ]──┐ <0010> ┌──[ ]──┐ │ │ │ │ <0000> <0001> <1000> <1001>
We can further optimize a binary trie and build a compressed binary trie. To do so, we merge the internal nodes with only one child and add a bit position field to each internal node. The bit position at a given internal node determines which i'th bit of the keys should be tested at that node.
┌────────[ 1 ]────────┐ │ │ ┌──[ 3 ]──┐ ┌──[ 2 ]──┐ │ │ │ │ ┌──[ 4 ]──┐ <0010> ┌──[ 4 ]──┐ <1100> │ │ │ │ <0000> <0001> <1000> <1001>
We can finally derive a Patricia trie from a compressed binary trie. To this end, we substitute every internal node with a Patricia node. Since the number of internal nodes is one less than the number of leaf nodes, we add an extra Patricia node. This extra Patricia node is the root of the tree, its bit position is always zero, its left child points to the rest of the tree, and its right child is always nil. We move keys from leaf nodes to Patricia nodes in such a way that the bit number in Patricia nodes is equal to or less than the bit number in the parent node of the leaf node. Pointers from internal nodes to leaf nodes become thread pointers in the Patricia trie.
.......................... ┌────────(0|1100)... : : │ : : ┌────────(1|0000)────────┐ : : │ │ : : ┌──(3|0010)... ┌──(2|1001)..: : │ :.....: │ : :..(4|0001)... ...(4|1000)..: :.....: :.....:
Decent implementations of Patricia trie can often outperform balanced binary trees, and even hash tables. Patricia trie performs admirably when its bit-testing loops are well tuned.
The second parameter (eqVal) is needed only if you want to use the Equal method.