package bench v3.5.0

README

LogQL Benchmark Suite

This directory contains a comprehensive benchmark suite for LogQL, Loki's query language. The suite is designed to generate realistic log data and benchmark various LogQL queries against different storage implementations.

Overview

The LogQL benchmark suite provides tools to:

  1. Generate realistic log data with configurable cardinality, patterns, and time distributions
  2. Store the generated data in different storage formats (currently supports "chunk" and "dataobj" formats)
  3. Run benchmarks against a variety of LogQL queries, including log filtering and metric queries
  4. Compare performance between different storage implementations

Getting Started

Prerequisites
  • Go 1.21 or later
  • At least 2GB of free disk space (for default dataset size)

Generating Test Data

Before running benchmarks, you need to generate test data:

# Generate default dataset (2GB)
make generate

# Generate a custom-sized dataset (e.g., 500MB)
make generate SIZE=524288000

# Generate for a specific tenant
make generate TENANT=my-tenant

The data generation process:

  1. Creates synthetic log data with realistic patterns
  2. Stores the data in multiple storage formats for comparison
  3. Saves a configuration file that describes the generated dataset

The generated dataset is fully reproducible as it uses a fixed random seed. This ensures that benchmark results are comparable across different runs and environments, making it ideal for performance regression testing.
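
For example, two generators built from the same options produce identical batches (a minimal sketch; the seed value is arbitrary):

opt := DefaultOpt().WithSeed(42) // the same seed always produces the same dataset
g1 := NewGenerator(opt)
g2 := NewGenerator(opt)
// Ranging over g1.Batches() and g2.Batches() yields the same streams.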

Running Benchmarks

Once data is generated, you can run benchmarks:

# Run all benchmarks
make bench

# List available benchmark queries
make list

# Run benchmarks with interactive UI
make run

# Run with debug output; tail the logs to see it: `tail -f debug.log`
make run-debug

# Stream sample logs from the dataset
make stream

Architecture

Data Generation

The benchmark suite generates synthetic log data with:

  • Configurable label cardinality (clusters, namespaces, services, pods, containers)
  • Realistic log patterns for different applications (nginx, postgres, java, etc.)
  • Time-based patterns including dense intervals with higher log volume
  • Structured metadata and trace context

Storage Implementations

The suite supports multiple storage implementations:

  1. DataObj Store: Stores logs as serialized protocol buffer objects
  2. Chunk Store: Stores logs in compressed chunks similar to Loki's chunk format
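
Both stores are constructed the same way; a minimal sketch (the directory and tenant values are arbitrary):

chunkStore, err := NewChunkStore("./data", "test-tenant")
if err != nil {
	log.Fatal(err)
}
dataObjStore, err := NewDataObjStore("./data", "test-tenant")
if err != nil {
	log.Fatal(err)
}
// Both satisfy the Store interface and can be passed to NewBuilder.
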
Query Types

The benchmark includes various query types:

  • Log filtering queries (e.g., {app="nginx"} |= "error")
  • Metric queries (e.g., rate({app="nginx"}[5m]))
  • Aggregations (e.g., sum by (status_code) (rate({app="nginx"} | json | status_code != "" [5m])))

Extending the Suite

Adding New Application Types

The benchmark suite supports multiple application types, each generating different log formats. Currently, it includes applications like web servers, databases, caches, Nginx, Kubernetes, Prometheus, Grafana, Loki, and more.

To add a new application type:

  1. Open faker.go and add any new data variables needed for your application:

    var myAppComponents = []string{
        "component1",
        "component2",
        // ...
    }
    
  2. Add helper methods to the Faker struct if needed:

    // MyAppComponent returns a random component for my application
    func (f *Faker) MyAppComponent() string {
        return myAppComponents[f.rnd.Intn(len(myAppComponents))]
    }
    
  3. Add a new entry to the defaultApplications slice:

    {
        Name: "my-application",
        LogGenerator: func(level string, ts time.Time, f *Faker) string {
            // Generate log line in your desired format (JSON, logfmt, etc.)
            return fmt.Sprintf(
                `level=%s ts=%s component=%s msg="My application log"`,
                level, ts.Format(time.RFC3339), f.MyAppComponent(),
            )
        },
        OTELResource: map[string]string{
            "service_name":    "my-application",
            "service_version": "1.0.0",
            // Add any OpenTelemetry resource attributes
        },
    }
    
  4. The LogGenerator function receives:

    • level: The log level (info, warn, error, etc.)
    • ts: The timestamp for the log entry
    • f: The faker instance for generating random data
  5. The OTELResource map defines OpenTelemetry resource attributes that will be attached to logs as structured metadata.

After adding your application type, it will be automatically included in the generated dataset with the appropriate distribution based on the generator configuration.

Adding New Storage Implementations

To add a new storage implementation:

  1. Implement the Store interface in store.go
  2. Add the new store to the cmd/generate/main.go file
  3. Update the benchmark test to include the new store type
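
A minimal skeleton for step 1 (a sketch; the MyStore name is hypothetical, and note that the existing stores also expose a Querier() method used by the benchmarks):

// MyStore is a hypothetical Store implementation.
type MyStore struct {
	dir      string
	tenantID string
}

// Write persists a batch of streams in the new format.
func (s *MyStore) Write(ctx context.Context, streams []logproto.Stream) error {
	// Encode and write streams under s.dir here.
	return nil
}

// Name identifies this store in benchmark results.
func (s *MyStore) Name() string { return "mystore" }

// Close flushes any buffered data and releases resources.
func (s *MyStore) Close() error { return nil }
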
Adding New Query Types

To add new query types:

  1. Modify the GenerateTestCases method in generator_query.go
  2. Add new query patterns that test different aspects of LogQL
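
For example, a new metric-query case could be appended inside GenerateTestCases (a hypothetical sketch; the cases variable and query are illustrative, using the config's StartTime and TimeSpread fields):

cases = append(cases, TestCase{
	Query: `sum by (level) (count_over_time({app="nginx"}[5m]))`,
	Start: c.StartTime,
	End:   c.StartTime.Add(c.TimeSpread),
	Step:  time.Minute, // step size for metric queries
})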

Troubleshooting

  • If you see "Data directory is empty" errors, run make generate first
  • For memory issues, try generating a smaller dataset with make generate SIZE=524288000
  • For detailed logs, run benchmarks with make run-debug

Performance Profiles

The benchmark suite can generate CPU and memory profiles:

  • CPU profile: cpu.prof
  • Memory profile: mem.prof

These can be analyzed with:

go tool pprof -http=:8080 cpu.prof
go tool pprof -http=:8080 mem.prof

Documentation

Constants

const (
	DefaultDataDir = "./data"
)

Variables

This section is empty.

Functions

func SaveConfig

func SaveConfig(dataDir string, config *GeneratorConfig) error

SaveConfig saves the generator configuration to a file in the data directory

Types

type Application

type Application struct {
	Name         string
	LogGenerator LogGenerator
	OTELResource map[string]string // OTEL resource attributes
}

Application represents a type of application that generates logs

type Batch

type Batch struct {
	Streams []logproto.Stream
}

Batch represents a collection of log streams

func (Batch) Size

func (b Batch) Size() int

Size returns the size of the batch in bytes, including all entries, labels, and structured metadata.

type Builder

type Builder struct {
	// contains filtered or unexported fields
}

Builder helps construct test datasets using multiple stores

func NewBuilder

func NewBuilder(dir string, opt Opt, stores ...Store) *Builder

NewBuilder creates a new Builder

func (*Builder) Generate

func (b *Builder) Generate(ctx context.Context, targetSize int64) error

Generate generates and stores the specified amount of data across all stores
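
A usage sketch (assuming stores created with NewChunkStore and NewDataObjStore; the target size is arbitrary):

b := NewBuilder("./data", DefaultOpt(), chunkStore, dataObjStore)
if err := b.Generate(context.Background(), 2<<30); err != nil { // ~2 GiB target
	log.Fatal(err)
}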

type ChunkStore

type ChunkStore struct {
	// contains filtered or unexported fields
}

func NewChunkStore

func NewChunkStore(dir, tenantID string) (*ChunkStore, error)

func (*ChunkStore) Close

func (s *ChunkStore) Close() error

Close flushes any remaining data and closes resources

func (*ChunkStore) Name

func (s *ChunkStore) Name() string

Name implements Store

func (*ChunkStore) Querier

func (s *ChunkStore) Querier() (logql.Querier, error)

Querier implements Store

func (*ChunkStore) Write

func (s *ChunkStore) Write(ctx context.Context, streams []logproto.Stream) error

type DataObjStore

type DataObjStore struct {
	// contains filtered or unexported fields
}

DataObjStore implements Store using the dataobj format

func NewDataObjStore

func NewDataObjStore(dir, tenantID string) (*DataObjStore, error)

NewDataObjStore creates a new DataObjStore

func (*DataObjStore) Close

func (s *DataObjStore) Close() error

Close flushes any remaining data and closes resources

func (*DataObjStore) Name

func (s *DataObjStore) Name() string

Name implements Store

func (*DataObjStore) Querier

func (s *DataObjStore) Querier() (logql.Querier, error)

func (*DataObjStore) Write

func (s *DataObjStore) Write(_ context.Context, streams []logproto.Stream) error

Write implements Store

type DenseInterval

type DenseInterval struct {
	Start    time.Time
	Duration time.Duration
}

DenseInterval represents a period of high log volume

type Faker

type Faker struct {
	// contains filtered or unexported fields
}

Faker provides methods to generate fake data consistently

func NewFaker

func NewFaker(rnd *rand.Rand) *Faker

NewFaker creates a new faker with the given random source
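
A small sketch (the seed is arbitrary; a fixed seed makes the generated values reproducible):

f := NewFaker(rand.New(rand.NewSource(1)))
ip := f.IP()         // random IP address
status := f.Status() // random HTTP status code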

func (*Faker) AuthAction

func (f *Faker) AuthAction() string

AuthAction returns a random authentication action

func (*Faker) AuthError

func (f *Faker) AuthError() string

AuthError returns a random authentication error

func (*Faker) AuthSuccess

func (f *Faker) AuthSuccess() bool

AuthSuccess returns a random authentication success status

func (*Faker) CacheError

func (f *Faker) CacheError() string

CacheError returns a random cache error

func (*Faker) CacheKey

func (f *Faker) CacheKey() string

CacheKey returns a random cache key

func (*Faker) CacheOp

func (f *Faker) CacheOp() string

CacheOp returns a random cache operation

func (*Faker) CacheSize

func (f *Faker) CacheSize() int

CacheSize returns a random cache size in bytes

func (*Faker) CacheTTL

func (f *Faker) CacheTTL() int

CacheTTL returns a random cache TTL in seconds

func (*Faker) DBError

func (f *Faker) DBError() string

DBError returns a random database error

func (*Faker) Duration

func (f *Faker) Duration() time.Duration

Duration returns a random duration

func (*Faker) Error

func (f *Faker) Error() string

Error returns a random general error message

func (*Faker) ErrorMessage

func (f *Faker) ErrorMessage() string

ErrorMessage returns a random error message

func (*Faker) GRPCMethod

func (f *Faker) GRPCMethod() string

GRPCMethod returns a random gRPC method

func (*Faker) GrafanaComponent

func (f *Faker) GrafanaComponent() string

GrafanaComponent returns a random Grafana component

func (*Faker) GrafanaLogger

func (f *Faker) GrafanaLogger() string

GrafanaLogger returns a random Grafana logger

func (*Faker) GrafanaMessage

func (f *Faker) GrafanaMessage() string

GrafanaMessage returns a random Grafana message

func (*Faker) Hostname

func (f *Faker) Hostname() string

Hostname returns a random hostname

func (*Faker) IP

func (f *Faker) IP() string

IP returns a random IP address

func (*Faker) K8sComponent

func (f *Faker) K8sComponent() string

K8sComponent returns a random Kubernetes component

func (*Faker) K8sLogPrefix

func (f *Faker) K8sLogPrefix() string

K8sLogPrefix returns a random Kubernetes log prefix

func (*Faker) K8sMessage

func (f *Faker) K8sMessage() string

K8sMessage returns a random Kubernetes message

func (*Faker) KafkaError

func (f *Faker) KafkaError() string

KafkaError returns a random Kafka error

func (*Faker) KafkaEvent

func (f *Faker) KafkaEvent() string

KafkaEvent returns a random Kafka event

func (*Faker) KafkaOffset

func (f *Faker) KafkaOffset() int

KafkaOffset returns a random Kafka offset

func (*Faker) KafkaPartition

func (f *Faker) KafkaPartition() int

KafkaPartition returns a random Kafka partition

func (*Faker) KafkaTopic

func (f *Faker) KafkaTopic() string

KafkaTopic returns a random Kafka topic

func (*Faker) Method

func (f *Faker) Method() string

Method returns a random HTTP method

func (*Faker) NginxErrorType

func (f *Faker) NginxErrorType() string

NginxErrorType returns a random nginx error type

func (*Faker) NginxPath

func (f *Faker) NginxPath() string

NginxPath returns a random nginx path

func (*Faker) OrgID

func (f *Faker) OrgID() string

OrgID returns a random organization ID

func (*Faker) PID

func (f *Faker) PID() int

PID returns a random process ID

func (*Faker) Path

func (f *Faker) Path() string

Path returns a random API path

func (*Faker) PrometheusComponent

func (f *Faker) PrometheusComponent() string

PrometheusComponent returns a random Prometheus component

func (*Faker) PrometheusMessage

func (f *Faker) PrometheusMessage() string

PrometheusMessage returns a random Prometheus message

func (*Faker) PrometheusSubcomponent

func (f *Faker) PrometheusSubcomponent() string

PrometheusSubcomponent returns a random Prometheus subcomponent

func (*Faker) QueryType

func (f *Faker) QueryType() string

QueryType returns a random database query type

func (*Faker) Referer

func (f *Faker) Referer() string

Referer returns a random referer URL

func (*Faker) RowsAffected

func (f *Faker) RowsAffected() int

RowsAffected returns a random number of rows affected

func (*Faker) SpanID

func (f *Faker) SpanID() string

SpanID returns a random span ID

func (*Faker) Status

func (f *Faker) Status() int

Status returns a random HTTP status code

func (*Faker) SyslogPriority

func (f *Faker) SyslogPriority(isError bool) int

SyslogPriority returns a random syslog priority

func (*Faker) Table

func (f *Faker) Table() string

Table returns a random database table name

func (*Faker) TempoComponent

func (f *Faker) TempoComponent() string

TempoComponent returns a random Tempo component

func (*Faker) TempoMessage

func (f *Faker) TempoMessage() string

TempoMessage returns a random Tempo message

func (*Faker) TraceID

func (f *Faker) TraceID() string

TraceID returns a random trace ID

func (*Faker) User

func (f *Faker) User() string

User returns a random username

func (*Faker) UserAgent

func (f *Faker) UserAgent() string

UserAgent returns a random user agent string

type Generator

type Generator struct {
	// contains filtered or unexported fields
}

Generator represents a log generator with configuration

func NewGenerator

func NewGenerator(opt Opt) *Generator

NewGenerator creates a new generator with the given options

func (*Generator) Batches

func (g *Generator) Batches() iter.Seq[*Batch]

Batches returns an iterator that produces log batches. Each batch contains the configured number of streams with generated log entries.
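
A usage sketch (the size cutoff is arbitrary):

g := NewGenerator(DefaultOpt())
var total int
for batch := range g.Batches() {
	total += batch.Size()
	if total >= 512<<20 { // stop after roughly 512 MiB of generated data
		break
	}
}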

func (*Generator) GenerateDataset

func (g *Generator) GenerateDataset(targetSize int64, outputFile string) error

GenerateDataset generates a dataset of approximately the specified size

type GeneratorConfig

type GeneratorConfig struct {
	StartTime  time.Time
	TimeSpread time.Duration
	// DenseIntervals defines periods of high log density
	// Each interval will have 10x more logs than normal periods
	DenseIntervals []DenseInterval
	LabelConfig    LabelConfig
	NumStreams     int   // Number of streams to generate per batch
	Seed           int64 // Source of randomness
}

GeneratorConfig contains all configuration for the log generator

func LoadConfig

func LoadConfig(dataDir string) (*GeneratorConfig, error)

LoadConfig loads the generator configuration from the data directory

func (*GeneratorConfig) GenerateTestCases

func (c *GeneratorConfig) GenerateTestCases() []TestCase

GenerateTestCases creates a sorted list of test cases using the configuration
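
A sketch combining LoadConfig and GenerateTestCases:

cfg, err := LoadConfig(DefaultDataDir)
if err != nil {
	log.Fatal(err)
}
for _, tc := range cfg.GenerateTestCases() {
	fmt.Println(tc.Name())
}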

func (*GeneratorConfig) NewRand

func (c *GeneratorConfig) NewRand() *rand.Rand

NewRand creates a new random source using the configured seed

type LabelConfig

type LabelConfig struct {
	Clusters    int      // 1-10 clusters
	Namespaces  int      // 10-100 namespaces
	Services    int      // 100-1000 services
	Pods        int      // 1000-10000 pods
	Containers  int      // 1-5 containers per pod
	LogLevels   []string // Log levels to use
	EnvTypes    []string // Environment types
	Regions     []string // Regions
	Datacenters []string // Datacenters
}

LabelConfig configures the cardinality of generated labels

type LogGenerator

type LogGenerator func(level string, timestamp time.Time, faker *Faker) string

LogGenerator is a function that generates a log line

type OTELAttributes

type OTELAttributes struct {
	Resource map[string]string // Resource attributes constant for the service
	Trace    *OTELTraceContext // Optional trace context
}

OTELAttributes represents OpenTelemetry attributes for logs

type OTELTraceContext

type OTELTraceContext struct {
	TraceID string
	SpanID  string
}

OTELTraceContext represents OpenTelemetry trace context

type Opt

type Opt struct {
	// contains filtered or unexported fields
}

Opt represents configuration options for the generator

func DefaultOpt

func DefaultOpt() Opt

DefaultOpt returns the default options

func (Opt) WithDenseInterval

func (o Opt) WithDenseInterval(start time.Time, duration time.Duration) Opt

WithDenseInterval adds a dense interval to the configuration

func (Opt) WithLabelCardinality

func (o Opt) WithLabelCardinality(clusters, namespaces, services, pods, containers int) Opt

WithLabelCardinality configures the cardinality of different labels

func (Opt) WithLabelConfig

func (o Opt) WithLabelConfig(cfg LabelConfig) Opt

WithLabelConfig sets the entire label configuration

func (Opt) WithNumStreams

func (o Opt) WithNumStreams(n int) Opt

WithNumStreams sets the number of streams to generate per batch

func (Opt) WithSeed

func (o Opt) WithSeed(seed int64) Opt

WithSeed sets the seed for random number generation

func (Opt) WithStartTime

func (o Opt) WithStartTime(t time.Time) Opt

WithStartTime sets the start time for log generation

func (Opt) WithTimeSpread

func (o Opt) WithTimeSpread(d time.Duration) Opt

WithTimeSpread sets the time spread for log generation
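
The With* methods chain, since each returns a new Opt; a sketch with arbitrary values:

opt := DefaultOpt().
	WithSeed(42).
	WithNumStreams(500).
	WithStartTime(time.Now().Add(-24 * time.Hour)).
	WithTimeSpread(24 * time.Hour).
	WithLabelCardinality(2, 20, 100, 1000, 3)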

type Store

type Store interface {
	// Write writes a batch of streams to the store
	Write(ctx context.Context, streams []logproto.Stream) error
	// Name returns the name of the store implementation
	Name() string
	// Close flushes any remaining data and closes resources
	Close() error
}

Store represents a storage backend for log data

type StreamMetadata

type StreamMetadata struct {
	Labels string
	App    Application
}

StreamMetadata holds the consistent properties of a stream

type TestCase

type TestCase struct {
	Query     string
	Start     time.Time
	End       time.Time
	Direction logproto.Direction
	Step      time.Duration // Step size for metric queries
}

TestCase represents a LogQL test case for benchmarking and testing

func (TestCase) Description

func (c TestCase) Description() string

Description returns a detailed description of the test case including time range

func (TestCase) Name

func (c TestCase) Name() string

Name returns a descriptive name for the test case. For log queries, it includes the direction. For metric queries (rate, sum), it returns the query with step size.
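
A sketch of a log-filter test case (values are arbitrary; logproto.BACKWARD is assumed from Loki's logproto package):

start := time.Now().Add(-time.Hour)
tc := TestCase{
	Query:     `{app="nginx"} |= "error"`,
	Start:     start,
	End:       start.Add(time.Hour),
	Direction: logproto.BACKWARD,
}
fmt.Println(tc.Name())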

Directories

Path Synopsis
cmd
