Documentation ¶
Index ¶
- func GetVersion() string
- type Metadata
- type MetadataItem
- type Model
- func (m *Model) Close() error
- func (m *Model) DisableExternalScorer()
- func (m *Model) EnableExternalScorer(scorerPath string)
- func (m *Model) GetModelBeamWidth() uint
- func (m *Model) GetModelSampleRate() int
- func (m *Model) SetModelBeamWidth(beamWidth uint)
- func (m *Model) SpeechToText(buffer []int16, bufferSize uint) string
- func (m *Model) SpeechToTextWithMetadata(buffer []int16, bufferSize uint) *Metadata
- type Stream
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func GetVersion ¶ added in v0.7.0
func GetVersion() string
GetVersion returns the version of this library and of the linked TensorFlow library.
Types ¶
type Metadata ¶
type Metadata C.struct_Metadata
Metadata represents a DeepSpeech metadata output
func (*Metadata) Confidence ¶
func (*Metadata) Items ¶
func (m *Metadata) Items() []MetadataItem
type MetadataItem ¶
type MetadataItem C.struct_MetadataItem
func (*MetadataItem) Character ¶
func (mi *MetadataItem) Character() string
func (*MetadataItem) StartTime ¶
func (mi *MetadataItem) StartTime() float32
func (*MetadataItem) Timestep ¶
func (mi *MetadataItem) Timestep() int
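The per-character items can be folded into words with start times. The following is a minimal sketch, not part of this package: the `item` interface mirrors only the `Character` and `StartTime` signatures listed above, `stubItem`, `word`, and `groupWords` are hypothetical names, and it assumes that words are separated by an item whose character is a single space. The real methods have pointer receivers, so elements of `Items()` would need to be passed by address.

```go
package main

import (
	"fmt"
	"strings"
)

// item mirrors the Character and StartTime methods documented above.
type item interface {
	Character() string
	StartTime() float32
}

// stubItem is a hypothetical stand-in for *MetadataItem so the sketch
// runs without the native DeepSpeech library.
type stubItem struct {
	ch    string
	start float32
}

func (s stubItem) Character() string  { return s.ch }
func (s stubItem) StartTime() float32 { return s.start }

// word is a hypothetical helper type: the text of a word and the start
// time of its first character.
type word struct {
	Text  string
	Start float32
}

// groupWords joins consecutive non-space characters into words.
// Assumption: a word boundary is an item whose Character is " ".
func groupWords(items []item) []word {
	var words []word
	var sb strings.Builder
	var start float32
	flush := func() {
		if sb.Len() > 0 {
			words = append(words, word{Text: sb.String(), Start: start})
			sb.Reset()
		}
	}
	for _, it := range items {
		c := it.Character()
		if c == " " {
			flush()
			continue
		}
		if sb.Len() == 0 {
			start = it.StartTime() // first character of a new word
		}
		sb.WriteString(c)
	}
	flush()
	return words
}

func main() {
	items := []item{
		stubItem{"h", 0.25}, stubItem{"i", 0.5},
		stubItem{" ", 0.75},
		stubItem{"y", 1.0}, stubItem{"o", 1.25},
	}
	for _, w := range groupWords(items) {
		fmt.Printf("%s starts at %v\n", w.Text, w.Start)
	}
}
```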
type Model ¶
type Model struct {
// contains filtered or unexported fields
}
Model represents a DeepSpeech model
func New ¶
New creates a new Model
modelPath The path to the frozen model graph.
beamWidth The beam width used by the decoder. A larger beam width generates better results at the cost of decoding time.
func (*Model) DisableExternalScorer ¶ added in v0.7.0
func (m *Model) DisableExternalScorer()
func (*Model) EnableExternalScorer ¶ added in v0.7.0
EnableExternalScorer enables decoding using beam scoring with a KenLM language model.
scorerPath The path to the external scorer (language model binary) file.
func (*Model) GetModelBeamWidth ¶ added in v0.7.0
func (*Model) GetModelSampleRate ¶
GetModelSampleRate reads the sample rate that was used to produce the model file.
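Because audio must match the rate the model was trained with, it can help to check it before running inference. This is a minimal sketch under stated assumptions: the `sampleRater` interface mirrors only the `GetModelSampleRate` signature from the index, while `stubModel`, `checkSampleRate`, `audioDuration`, and the 16000 Hz value are hypothetical and exist only so the example runs without the native library.

```go
package main

import (
	"fmt"
	"time"
)

// sampleRater mirrors the GetModelSampleRate signature listed in the index.
type sampleRater interface {
	GetModelSampleRate() int
}

// stubModel is a hypothetical stand-in for *Model.
type stubModel struct{ rate int }

func (s stubModel) GetModelSampleRate() int { return s.rate }

// checkSampleRate reports an error when the audio rate differs from the
// rate the model expects.
func checkSampleRate(m sampleRater, audioRate int) error {
	if want := m.GetModelSampleRate(); audioRate != want {
		return fmt.Errorf("audio is %d Hz but the model expects %d Hz", audioRate, want)
	}
	return nil
}

// audioDuration converts a sample count into a duration at the model's
// sample rate. It returns 0 if the reported rate is not positive.
func audioDuration(m sampleRater, numSamples int) time.Duration {
	rate := m.GetModelSampleRate()
	if rate <= 0 {
		return 0
	}
	return time.Duration(numSamples) * time.Second / time.Duration(rate)
}

func main() {
	m := stubModel{rate: 16000} // illustrative value, not read from a real model
	fmt.Println(checkSampleRate(m, 44100))
	fmt.Println(audioDuration(m, 32000))
}
```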
func (*Model) SetModelBeamWidth ¶ added in v0.7.0
func (*Model) SpeechToText ¶
SpeechToText uses the DeepSpeech model to perform Speech-To-Text.
buffer A 16-bit, mono raw audio signal at the appropriate sample rate.
bufferSize The number of samples in the audio signal.
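Raw PCM audio usually arrives as bytes, while `SpeechToText` takes 16-bit samples and a sample count. The sketch below shows one way to bridge the two. It assumes little-endian input; the `speechToTexter` interface mirrors the signature in the index, and `stubModel`, `pcmToSamples`, and `transcribe` are hypothetical names used so the example runs without the native library.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// speechToTexter mirrors the SpeechToText signature listed in the index.
type speechToTexter interface {
	SpeechToText(buffer []int16, bufferSize uint) string
}

// stubModel is a hypothetical stand-in for *Model; it only reports how
// many samples it was given.
type stubModel struct{}

func (stubModel) SpeechToText(buffer []int16, bufferSize uint) string {
	return fmt.Sprintf("<%d samples>", bufferSize)
}

// pcmToSamples decodes little-endian 16-bit PCM bytes into samples.
// A trailing odd byte is ignored.
func pcmToSamples(raw []byte) []int16 {
	samples := make([]int16, len(raw)/2)
	for i := range samples {
		samples[i] = int16(binary.LittleEndian.Uint16(raw[2*i:]))
	}
	return samples
}

// transcribe passes the number of samples, not the number of bytes,
// as bufferSize.
func transcribe(m speechToTexter, raw []byte) string {
	samples := pcmToSamples(raw)
	return m.SpeechToText(samples, uint(len(samples)))
}

func main() {
	raw := []byte{0x01, 0x00, 0xff, 0xff, 0x00, 0x80}
	fmt.Println(pcmToSamples(raw))
	fmt.Println(transcribe(stubModel{}, raw))
}
```

The point of the helper is the unit of `bufferSize`: it counts samples, so it is half the byte length of 16-bit audio.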
func (*Model) SpeechToTextWithMetadata ¶
SpeechToTextWithMetadata uses the DeepSpeech model to perform Speech-To-Text and returns the result as extended metadata.
buffer A 16-bit, mono raw audio signal at the appropriate sample rate.
bufferSize The number of samples in the audio signal.
type Stream ¶
type Stream struct {
// contains filtered or unexported fields
}
Stream represents a streaming state
func (*Stream) FeedAudioContent ¶
FeedAudioContent feeds audio samples to an ongoing streaming inference.
aBuffer An array of 16-bit, mono raw audio samples at the appropriate sample rate.
aBufferSize The number of samples in aBuffer.
func (*Stream) FinishStream ¶
FinishStream signals the end of an audio signal to an ongoing streaming inference and returns the STT result over the whole audio signal.
func (*Stream) FinishStreamWithMetadata ¶
FinishStreamWithMetadata signals the end of an audio signal to an ongoing streaming inference and returns extended metadata.
func (*Stream) FreeStream ¶
func (s *Stream) FreeStream()
FreeStream destroys a streaming state without decoding the computed logits. This can be used if you no longer need the result of an ongoing streaming inference and don't want to perform a costly decode operation.
func (*Stream) IntermediateDecode ¶
IntermediateDecode computes the intermediate decoding of an ongoing streaming inference. This is an expensive operation because the decoder implementation cannot currently stream, so it always starts from the beginning of the audio.
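A streaming session typically feeds audio in fixed-size chunks and then finishes the stream. The sketch below illustrates that loop under loud assumptions: this page does not show the `Stream` method signatures, so `FeedAudioContent(buffer []int16, bufferSize uint)` and `FinishStream() string` are guesses based on the parameter descriptions above, and `streamer`, `stubStream`, and `feedInChunks` are hypothetical names.

```go
package main

import "fmt"

// streamer uses ASSUMED signatures; the page above documents the
// parameters of these methods but not their exact Go signatures.
type streamer interface {
	FeedAudioContent(buffer []int16, bufferSize uint)
	FinishStream() string
}

// stubStream is a hypothetical stand-in for *Stream that counts what
// it receives instead of running inference.
type stubStream struct {
	fed   int
	calls int
}

func (s *stubStream) FeedAudioContent(buffer []int16, bufferSize uint) {
	s.fed += int(bufferSize)
	s.calls++
}

func (s *stubStream) FinishStream() string {
	return fmt.Sprintf("<%d samples in %d chunks>", s.fed, s.calls)
}

// feedInChunks feeds samples in slices of at most chunkSize samples and
// then finishes the stream. A non-positive chunkSize feeds everything
// in a single call.
func feedInChunks(s streamer, samples []int16, chunkSize int) string {
	if chunkSize <= 0 {
		chunkSize = len(samples)
	}
	for start := 0; start < len(samples); start += chunkSize {
		end := start + chunkSize
		if end > len(samples) {
			end = len(samples)
		}
		chunk := samples[start:end]
		s.FeedAudioContent(chunk, uint(len(chunk)))
	}
	return s.FinishStream()
}

func main() {
	samples := make([]int16, 10)
	fmt.Println(feedInChunks(&stubStream{}, samples, 4))
}
```

The loop deliberately does not call IntermediateDecode per chunk: as noted above, each call decodes from the beginning of the audio, so calling it on every chunk would repeat that work each time.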