word2vec

package
v0.0.0-...-f97de62 Latest
Warning

This package is not in the latest version of its module.

Published: Aug 15, 2025 License: GPL-2.0, GPL-3.0 Imports: 10 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func ConvertToMap

func ConvertToMap(wv WordVectors, vocab map[string]int) map[string][]float64

ConvertToMap converts WordVectors to a map[string][]float64.
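The package source is not shown here, so the following is only a minimal sketch of what such a conversion could look like. It assumes vocab maps each word to its index in wv and that words without a stored vector are skipped; convertToMapSketch is a hypothetical name, not the package's function.

```go
package main

import "fmt"

// WordVectors mirrors the package's type: embeddings keyed by vocabulary index.
type WordVectors map[int][]float64

// convertToMapSketch is a hypothetical stand-in for ConvertToMap.
// Assumption: vocab maps word -> index, and indices missing from wv are skipped.
func convertToMapSketch(wv WordVectors, vocab map[string]int) map[string][]float64 {
	out := make(map[string][]float64, len(vocab))
	for word, idx := range vocab {
		vec, ok := wv[idx]
		if !ok {
			continue // no embedding stored for this index
		}
		// Copy so callers cannot mutate the original vectors through the result.
		out[word] = append([]float64(nil), vec...)
	}
	return out
}

func main() {
	wv := WordVectors{0: {0.1, 0.2}, 1: {0.3, 0.4}}
	vocab := map[string]int{"cat": 0, "dog": 1, "bird": 2} // "bird" has no vector
	byWord := convertToMapSketch(wv, vocab)
	fmt.Println(len(byWord), byWord["dog"])
}
```

Keying the result by word rather than by index makes lookups convenient for callers that never see the vocabulary indices.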

func ReadTrainingData

func ReadTrainingData(filePath string) ([]string, error)

ReadTrainingData reads the training data from a JSON file.

Types

type SimpleWord2Vec

type SimpleWord2Vec struct {
	// Core word representation
	Vocabulary   map[string]int
	WordVectors  map[int][]float64
	VectorSize   int
	NgramVectors map[string][]float64
	VocabSize    int

	// Context and semantic information
	ContextEmbeddings map[string][]float64
	ContextLabels     map[string]string
	SentenceTags      map[string][]string
	UNKToken          string

	// Hyperparameters and network configuration
	LearningRate        float64
	MaxGrad             float64
	HiddenSize          int
	SimilarityThreshold float64
	Window              int
	Epochs              int
	NegativeSamples     int
	UseCBOW             bool
	NgramSize           int

	// Weights and biases (neural network parameters)
	Weights [][]float64
	Biases  [][]float64
	Ann     *g.ANN

	// Minimum word frequency threshold
	MinWordFrequency int
}

func LoadModel

func LoadModel(filename string) (*SimpleWord2Vec, error)

LoadModel loads a trained model from a GOB file.

func TrainWord2VecModel

func TrainWord2VecModel(trainingDataPath, modelSavePath string, vectorSize, epochs, window, negativeSamples, minWordFrequency int, useCBOW bool) (*SimpleWord2Vec, error)

TrainWord2VecModel initializes, trains, and saves a SimpleWord2Vec model.

func (*SimpleWord2Vec) Backpropagate

func (sw2v *SimpleWord2Vec) Backpropagate(output, context []float64)

Backpropagate performs backpropagation and updates weights and biases.

func (*SimpleWord2Vec) CalculateLoss

func (sw2v *SimpleWord2Vec) CalculateLoss(output, context []float64) float64

CalculateLoss calculates the Mean Squared Error (MSE) loss.
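For reference, MSE is the mean of the squared element-wise differences between two vectors. The sketch below is a standalone illustration of that formula, not the package's code; meanSquaredError is a hypothetical name, and comparing only the overlapping prefix when lengths differ is an assumption.

```go
package main

import "fmt"

// meanSquaredError returns the mean of squared differences between output and
// target. Assumption: if the slices differ in length, only the shared prefix
// is compared, and an empty comparison yields 0.
func meanSquaredError(output, target []float64) float64 {
	n := len(output)
	if len(target) < n {
		n = len(target)
	}
	if n == 0 {
		return 0
	}
	var sum float64
	for i := 0; i < n; i++ {
		d := output[i] - target[i]
		sum += d * d
	}
	return sum / float64(n)
}

func main() {
	// Differences are 0 and -2, so the loss is (0 + 4) / 2.
	fmt.Println(meanSquaredError([]float64{1, 2}, []float64{1, 4}))
}
```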

func (*SimpleWord2Vec) ForwardPass

func (sw2v *SimpleWord2Vec) ForwardPass(words []string) []float64

ForwardPass performs the forward pass of the RNN.

func (*SimpleWord2Vec) InitializeWeights

func (sw2v *SimpleWord2Vec) InitializeWeights()

InitializeWeights initializes the weights and biases of the RNN.
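The page does not say how the weights are initialized. A common choice is small uniform random values, sketched below under that assumption; initWeights, its parameters, and the matrix shape are all hypothetical and are not taken from the package.

```go
package main

import (
	"fmt"
	"math/rand"
)

// initWeights builds a rows x cols matrix with values drawn uniformly from
// [-scale, scale). The seed makes the result reproducible.
// Assumption: the real InitializeWeights may use a different scheme or shape.
func initWeights(rows, cols int, scale float64, seed int64) [][]float64 {
	rng := rand.New(rand.NewSource(seed))
	w := make([][]float64, rows)
	for i := range w {
		w[i] = make([]float64, cols)
		for j := range w[i] {
			w[i][j] = (rng.Float64()*2 - 1) * scale
		}
	}
	return w
}

func main() {
	w := initWeights(3, 4, 0.5, 42)
	fmt.Println(len(w), len(w[0]))
}
```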

func (*SimpleWord2Vec) SaveModel

func (sw2v *SimpleWord2Vec) SaveModel(filename string) error

SaveModel saves the trained model to a GOB file.
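SaveModel and LoadModel are documented as using GOB files. The sketch below shows a GOB round trip with the standard encoding/gob package on a reduced, hypothetical struct (tinyModel) that borrows three field names from SimpleWord2Vec. It encodes to an in-memory buffer to stay side-effect free; a file opened with os.Create or os.Open could be passed to the encoder or decoder in the same way. How the package handles the Ann field is not shown on this page and is not covered here.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// tinyModel is a hypothetical, reduced stand-in for SimpleWord2Vec.
// gob only encodes exported fields, so all fields are exported.
type tinyModel struct {
	Vocabulary  map[string]int
	WordVectors map[int][]float64
	VectorSize  int
}

// encodeModel serializes the model to GOB bytes.
func encodeModel(m *tinyModel) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(m); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeModel rebuilds a model from GOB bytes.
func decodeModel(data []byte) (*tinyModel, error) {
	var m tinyModel
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&m); err != nil {
		return nil, err
	}
	return &m, nil
}

func main() {
	orig := &tinyModel{
		Vocabulary:  map[string]int{"cat": 1},
		WordVectors: map[int][]float64{1: {0.25, 0.75}},
		VectorSize:  2,
	}
	data, err := encodeModel(orig)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	loaded, err := decodeModel(data)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(loaded.VectorSize, loaded.WordVectors[loaded.Vocabulary["cat"]])
}
```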

func (*SimpleWord2Vec) Train

func (sw2v *SimpleWord2Vec) Train(trainingData []string)

Train implements a simplified Skip-gram model for training word embeddings.
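In Skip-gram, each word is paired with the words that fall within a fixed window around it, and those (center, context) pairs become the training examples. The sketch below illustrates only that pair-generation step; skipGramPairs and the pair type are hypothetical names, and splitting sentences on whitespace is an assumption about how input is tokenized.

```go
package main

import (
	"fmt"
	"strings"
)

// pair is one Skip-gram training example: a center word and one neighbor.
type pair struct{ Center, Context string }

// skipGramPairs pairs every token with each neighbor at most window
// positions away, excluding the token itself.
func skipGramPairs(tokens []string, window int) []pair {
	var pairs []pair
	for i, center := range tokens {
		lo := i - window
		if lo < 0 {
			lo = 0
		}
		hi := i + window
		if hi > len(tokens)-1 {
			hi = len(tokens) - 1
		}
		for j := lo; j <= hi; j++ {
			if j != i {
				pairs = append(pairs, pair{Center: center, Context: tokens[j]})
			}
		}
	}
	return pairs
}

func main() {
	// Assumption: sentences are tokenized by whitespace.
	tokens := strings.Fields("the quick brown fox")
	for _, p := range skipGramPairs(tokens, 1) {
		fmt.Println(p.Center, "->", p.Context)
	}
}
```

A larger window yields more pairs per word and tends to capture broader topical context, at the cost of more training work per sentence.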

func (*SimpleWord2Vec) TrainSentenceContext

func (sw2v *SimpleWord2Vec) TrainSentenceContext(sentences []string) map[string][]float64

type TrainingData

type TrainingData struct {
	Sentences []struct {
		Tokens []string `json:"tokens"`
	} `json:"sentences"`
}

TrainingData represents the structure of the training data JSON.
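Based on the struct tags above, a matching JSON document has a top-level "sentences" array whose elements each carry a "tokens" array. The sketch below decodes such a document with encoding/json. parseSentences is a hypothetical helper, and joining each sentence's tokens with spaces to produce a []string is an assumption about how ReadTrainingData shapes its result.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// TrainingData mirrors the struct shown in this documentation.
type TrainingData struct {
	Sentences []struct {
		Tokens []string `json:"tokens"`
	} `json:"sentences"`
}

// parseSentences decodes raw JSON and returns one string per sentence.
// Assumption: tokens are joined with single spaces.
func parseSentences(raw []byte) ([]string, error) {
	var td TrainingData
	if err := json.Unmarshal(raw, &td); err != nil {
		return nil, err
	}
	sentences := make([]string, 0, len(td.Sentences))
	for _, s := range td.Sentences {
		sentences = append(sentences, strings.Join(s.Tokens, " "))
	}
	return sentences, nil
}

func main() {
	raw := []byte(`{"sentences": [{"tokens": ["the", "cat", "sat"]}, {"tokens": ["dogs", "bark"]}]}`)
	sentences, err := parseSentences(raw)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	for _, s := range sentences {
		fmt.Println(s)
	}
}
```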

type Vector

type Vector []float64

Vector represents a word vector.

type WordVectors

type WordVectors map[int][]float64

SimpleWord2Vec is a basic Word2Vec implementation in Go. It does NOT include negative sampling or other important optimizations. WordVectors represents the word embeddings.
