llama.go

Version: v0.0.0-...-2e2e86e. Published: Jan 11, 2026. License: MIT.

Repository

github.com/Qitmeer/llama.go

Go bindings to llama.cpp

Installation

Make sure git, Go, CMake, gcc, and make are installed on the system before building.

Build from source

~ git clone https://github.com/Qitmeer/llama.go.git
~ cd llama.go
~ make

Get model

Download the model manually: Hugging Face Qwen3-8B-GGUF
First set the directory where model files are stored, using either the LLAMAGO_MODEL_DIR environment variable or the --model-dir command-line flag.
The default model directory is ./data/models

~ ./llama --model-dir=<your_model_files_directory>
or
~ export LLAMAGO_MODEL_DIR=<your_model_files_directory>
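
The lookup described above (flag, environment variable, default directory) can be sketched in Go as follows. This is only an illustration: the function name resolveModelDir is hypothetical, and the precedence shown (flag first, then LLAMAGO_MODEL_DIR, then ./data/models) is an assumption, since this README does not say which setting wins when both are present.

```go
package main

import (
	"fmt"
	"os"
)

// defaultModelDir is the default location stated in this README.
const defaultModelDir = "./data/models"

// resolveModelDir picks the model directory.
// ASSUMPTION: the flag wins over the environment variable, which wins over
// the default. The README does not state the actual precedence.
func resolveModelDir(flagValue string, getenv func(string) string) string {
	if flagValue != "" {
		return flagValue
	}
	if v := getenv("LLAMAGO_MODEL_DIR"); v != "" {
		return v
	}
	return defaultModelDir
}

func main() {
	// An empty flag value stands in for "--model-dir was not passed".
	fmt.Println(resolveModelDir("", os.Getenv))
}
```

Passing the environment lookup as a function keeps the sketch easy to exercise without changing the real process environment.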

Start the server:

~ ./llama --model=qwen2.5-0.5b-q8_0.gguf serve
or
~ ./llama --model=gpt-oss-20b-mxfp4.gguf --jinja serve

Run the client:

~ ./llama run 天空为什么是蓝的

Or run in interactive mode:

~ ./llama run

Download a model with the CLI:

~ ./llama pull gte-small-Q8_0-GGUF

or

~ ./llama pull gte-small-Q8_0-GGUF:gte-small-q8_0.gguf

or

~ ./llama pull llamago/gte-small-Q8_0-GGUF:gte-small-q8_0.gguf
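
The three pull commands above suggest an argument of the shape [namespace/]repo[:file]. The Go sketch below splits an argument along those lines. It is an assumption drawn only from these examples: the type modelRef and the function parseModelRef are hypothetical names, and the real parsing rules (including what llama.go does when the namespace or file is omitted) are not documented here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// modelRef is a hypothetical breakdown of a pull argument.
type modelRef struct {
	Namespace string // e.g. "llamago"; empty when omitted
	Repo      string // e.g. "gte-small-Q8_0-GGUF"
	File      string // e.g. "gte-small-q8_0.gguf"; empty when omitted
}

// parseModelRef splits "[namespace/]repo[:file]".
// ASSUMPTION: this layout is inferred from the README examples only.
func parseModelRef(ref string) (modelRef, error) {
	path, file, _ := strings.Cut(ref, ":")
	var r modelRef
	r.File = file
	if ns, repo, ok := strings.Cut(path, "/"); ok {
		r.Namespace, r.Repo = ns, repo
	} else {
		r.Repo = path
	}
	if r.Repo == "" {
		return modelRef{}, errors.New("model reference has no repository name")
	}
	return r, nil
}

func main() {
	for _, ref := range []string{
		"gte-small-Q8_0-GGUF",
		"gte-small-Q8_0-GGUF:gte-small-q8_0.gguf",
		"llamago/gte-small-Q8_0-GGUF:gte-small-q8_0.gguf",
	} {
		r, err := parseModelRef(ref)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", r)
	}
}
```

The sketch leaves omitted parts empty instead of filling in defaults, because the README does not say which defaults apply.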

REST API:

~ curl -s -k -X POST -H 'Content-Type: application/json' --data '{"prompt":"天空为什么是蓝的"}' http://127.0.0.1:8081/api/generate
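
The same request can be sent from Go. The sketch below posts the JSON body from the curl command to /api/generate and returns the raw response text. The response schema is not documented in this README, so the body is not decoded; the helper names, the 60-second timeout, and the check for HTTP 200 are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// generateRequest mirrors the JSON body used in the curl command above.
type generateRequest struct {
	Prompt string `json:"prompt"`
}

// generatePayload encodes the prompt as the request body.
func generatePayload(prompt string) ([]byte, error) {
	return json.Marshal(generateRequest{Prompt: prompt})
}

// generate posts the prompt to <baseURL>/api/generate and returns the raw
// response body. ASSUMPTION: success is signalled by HTTP 200.
func generate(client *http.Client, baseURL, prompt string) (string, error) {
	body, err := generatePayload(prompt)
	if err != nil {
		return "", err
	}
	resp, err := client.Post(baseURL+"/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s: %s", resp.Status, raw)
	}
	return string(raw), nil
}

func main() {
	client := &http.Client{Timeout: 60 * time.Second} // timeout value is arbitrary
	out, err := generate(client, "http://127.0.0.1:8081", "天空为什么是蓝的")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(out)
}
```

The program expects the server from the previous section to be listening on 127.0.0.1:8081; without it, the request fails and the error is printed.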

WebUI

Open http://127.0.0.1:8081 in a browser.

Embedding

Local mode:

~ ./llama --model=qwen2.5-0.5b-q8_0.gguf embedding 天空为什么是蓝的 --output-file=./embs.json

Server mode:

~ curl -s -k -X POST -H 'Content-Type: application/json' --data '{"input":["天空","蓝色"]}' http://127.0.0.1:8081/api/embed
~ curl -s -k -X POST -H 'Content-Type: application/json' --data '{"prompt":"天空为什么是蓝的"}' http://127.0.0.1:8081/api/embeddings
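
The two endpoints take differently shaped bodies: /api/embed receives a list under "input", while /api/embeddings receives a single "prompt". The Go sketch below builds both payloads so the difference is easy to see. The struct and helper names are hypothetical, and only the request side is shown because the response format is not described in this README.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embedRequest matches the body sent to /api/embed above: a list of inputs.
type embedRequest struct {
	Input []string `json:"input"`
}

// embeddingsRequest matches the body sent to /api/embeddings above: one prompt.
type embeddingsRequest struct {
	Prompt string `json:"prompt"`
}

// mustJSON encodes v and panics on failure; acceptable for fixed example values.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Body for POST /api/embed
	fmt.Println(mustJSON(embedRequest{Input: []string{"天空", "蓝色"}}))
	// Body for POST /api/embeddings
	fmt.Println(mustJSON(embeddingsRequest{Prompt: "天空为什么是蓝的"}))
}
```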

Whisper

First, download a model from https://huggingface.co/ggerganov/whisper.cpp and place it in the directory set by LLAMAGO_MODEL_DIR or --model-dir.

~ ./llama --model=ggml-base.en.bin whisper --input="./your-voice.wav"

Directories

Path	Synopsis
api
app
embedding
embedding/config
pull
run
cmd
llama	command
common
auth
progress
readline
config
format
model
fs
fs/ggml
fs/gguf
fs/util/bufioutil
harmony
parser
parsers
template
thinking
runner
server
middleware
routes
system
limits
version
wrapper
