Documentation
Index ¶
Constants ¶
const (
// HashSize is the simhash length in bits; choose 32 or 64.
HashSize = 32
)
Variables ¶
This section is empty.
Functions ¶
Types ¶
type Tokenizer ¶
type Tokenizer struct {
// contains filtered or unexported fields
}
Tokenizer is a word tokenizer.
func NewTokenizer ¶
NewTokenizer creates a new tokenizer. Suggested value for chunkSize: 4. Suggested value for overlapSize: 1.
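The page does not show how chunkSize and overlapSize interact, so the following is only a minimal sketch of one common reading: fixed-size windows over the input, where each window shares overlapSize runes with the previous one. The helper chunkRunes is hypothetical and is not part of this package's API; the package may chunk on words or bytes instead.

```go
package main

import "fmt"

// chunkRunes is a hypothetical helper (not part of this package).
// It splits input into windows of chunkSize runes; consecutive
// windows share overlapSize runes. It returns nil for invalid sizes.
func chunkRunes(input string, chunkSize, overlapSize int) []string {
	if chunkSize <= 0 || overlapSize < 0 || overlapSize >= chunkSize {
		return nil
	}
	runes := []rune(input)
	step := chunkSize - overlapSize // how far each window advances
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + chunkSize
		if end > len(runes) {
			end = len(runes) // last window may be shorter
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // input exhausted; avoid a trailing overlap-only chunk
		}
	}
	return chunks
}

func main() {
	// With the suggested values (chunkSize 4, overlapSize 1)
	// this prints "abcd", "defg", "ghij", one per line.
	for _, c := range chunkRunes("abcdefghij", 4, 1) {
		fmt.Println(c)
	}
}
```

With the suggested values the window advances three runes at a time, so the last rune of each chunk reappears as the first rune of the next.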
func (*Tokenizer) Tokenize ¶
func (t *Tokenizer) Tokenize(input string) []TokenizerChunk
Tokenize tokenizes the input. For simplicity, it sets the weight of every word to 1.
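To illustrate why each word carries a weight, here is a rough sketch of how weighted tokens are typically folded into a 32-bit simhash fingerprint. It is not this package's implementation: the weightedToken type, the simhash32 function, and the choice of FNV-1a as the per-token hash are all assumptions made for the example, and the fields of TokenizerChunk are not shown on this page.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashSize mirrors the HashSize = 32 constant documented above.
const hashSize = 32

// weightedToken is a hypothetical stand-in for a tokenizer chunk.
type weightedToken struct {
	text   string
	weight int
}

// simhash32 is a hypothetical sketch of the usual simhash scheme:
// every token votes on every bit, with its weight as the vote size.
func simhash32(tokens []weightedToken) uint32 {
	var counts [hashSize]int
	for _, tok := range tokens {
		h := fnv.New32a() // assumed per-token hash, FNV-1a
		h.Write([]byte(tok.text))
		sum := h.Sum32()
		for bit := 0; bit < hashSize; bit++ {
			if sum&(1<<uint(bit)) != 0 {
				counts[bit] += tok.weight // token votes for a 1 bit
			} else {
				counts[bit] -= tok.weight // token votes for a 0 bit
			}
		}
	}
	var fp uint32
	for bit := 0; bit < hashSize; bit++ {
		if counts[bit] > 0 {
			fp |= 1 << uint(bit) // positive total means the bit is set
		}
	}
	return fp
}

func main() {
	tokens := []weightedToken{{"abcd", 1}, {"defg", 1}, {"ghij", 1}}
	fmt.Printf("%032b\n", simhash32(tokens))
}
```

When every weight is 1, as Tokenize is described as doing, each token has an equal vote on every bit of the fingerprint.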
type TokenizerChunk ¶
TokenizerChunk is a word chunk.