sampler

package
v0.69.2
Warning

This package is not in the latest version of its module.

Published: Aug 19, 2025 License: Apache-2.0 Imports: 17 Imported by: 11

Documentation

Overview

Package sampler contains all the logic of the agent-side trace sampling.

The current implementation is based on scoring the "signature" of each trace. Based on the score, we get a sample rate to apply to the given trace.

The current score implementation is very simple: it is a counter with exponential decay per signature. We increment it for each incoming trace, then divide the score by two every X seconds. Right after the division, the score is an approximation of the number of received signatures over X seconds. It is different from the scoring in the Agent.
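The decaying counter described above can be sketched in a few lines. This is a minimal illustration, not the package's code: the type `decayingCounter`, its methods, and the 0.01 cleanup threshold are all hypothetical names and values chosen for the example.

```go
package main

// decayingCounter is a hypothetical per-signature score: it is incremented
// for every trace seen and halved at the end of every decay period.
type decayingCounter struct {
	scores map[uint64]float64
}

func newDecayingCounter() *decayingCounter {
	return &decayingCounter{scores: make(map[uint64]float64)}
}

// Count records one incoming trace for the given signature.
func (c *decayingCounter) Count(sig uint64) {
	c.scores[sig]++
}

// Decay halves every score; it is meant to be called once per period
// (every X seconds). Scores below an assumed threshold are dropped so
// that signatures which stopped arriving do not stay in memory forever.
func (c *decayingCounter) Decay() {
	for sig, s := range c.scores {
		s /= 2
		if s < 0.01 {
			delete(c.scores, sig)
			continue
		}
		c.scores[sig] = s
	}
}

// Score returns the current score of a signature (0 if unknown).
func (c *decayingCounter) Score(sig uint64) float64 {
	return c.scores[sig]
}
```

Under steady traffic of n traces per period, the score right after a division follows s' = (s + n) / 2, whose fixed point is n. That is why the halved score approximates the number of traces received over one period.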

Since the sampling can happen at different levels (client, agent, server) or depending on different rules, we have to track the sample rate applied at previous steps. This way, sampling twice at 50% can result in an effective 25% sampling. The rate is stored as a metric in the trace root.
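As an illustration of how rates compound across sampling steps, the sketch below multiplies the rate stored on a root span by each new step's rate. The `span` type and both helpers are hypothetical stand-ins; the only value taken from this package is the metric key `_sample_rate` (KeySamplingRateGlobal). Defaulting to 1 when no rate is stored is an assumption of the sketch.

```go
package main

// span is a hypothetical stand-in for a root span; only its metrics map matters here.
type span struct {
	Metrics map[string]float64
}

// keySamplingRateGlobal has the same value as KeySamplingRateGlobal.
const keySamplingRateGlobal = "_sample_rate"

// globalRate returns the cumulative rate stored on the span.
// Assumption: a span without a stored rate counts as unsampled so far (rate 1).
func globalRate(s *span) float64 {
	if r, ok := s.Metrics[keySamplingRateGlobal]; ok {
		return r
	}
	return 1
}

// applyRate records one more sampling step by multiplying the stored
// cumulative rate by the rate of that step.
func applyRate(s *span, rate float64) {
	if s.Metrics == nil {
		s.Metrics = make(map[string]float64)
	}
	s.Metrics[keySamplingRateGlobal] = globalRate(s) * rate
}
```

Calling `applyRate(root, 0.5)` twice on a fresh span leaves a stored rate of 0.25, which matches the 50% then 50% example above.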

Constants

const (
	// MetricSamplerSeen is the metric name for the number of traces seen by the sampler.
	MetricSamplerSeen = "datadog.trace_agent.sampler.seen"
	// MetricSamplerKept is the metric name for the number of traces kept by the sampler.
	MetricSamplerKept = "datadog.trace_agent.sampler.kept"
	// MetricSamplerSize is the current number of unique trace signatures tracked for stats calculation.
	MetricSamplerSize = "datadog.trace_agent.sampler.size"
)
const (

	// MetricsRareHits is the metric name for the number of traces kept by the rare sampler.
	MetricsRareHits = "datadog.trace_agent.sampler.rare.hits"
	// MetricsRareMisses is the metric name for the number of traces missed by the rare sampler.
	MetricsRareMisses = "datadog.trace_agent.sampler.rare.misses"
	// MetricsRareShrinks is the metric name for the number of times the rare sampler has shrunk.
	MetricsRareShrinks = "datadog.trace_agent.sampler.rare.shrinks"
)
const (
	// KeySamplingRateGlobal is a metric key holding the global sampling rate.
	KeySamplingRateGlobal = "_sample_rate"

	// KeySamplingRateClient is a metric key holding the client-set sampling rate for APM events.
	KeySamplingRateClient = "_dd1.sr.rcusr"

	// KeySamplingRatePreSampler is a metric key holding the API rate limiter's rate for APM events.
	KeySamplingRatePreSampler = "_dd1.sr.rapre"

	// KeySamplingRateEventExtraction is the key of the metric storing the event extraction rate on an APM event.
	KeySamplingRateEventExtraction = "_dd1.sr.eausr"

	// KeySamplingRateMaxEPSSampler is the key of the metric storing the max eps sampler rate on an APM event.
	KeySamplingRateMaxEPSSampler = "_dd1.sr.eamax"

	// KeyErrorType is the key of the error type in the meta map
	KeyErrorType = "error.type"

	// KeyAnalyzedSpans is the metric key which specifies if a span is analyzed.
	KeyAnalyzedSpans = "_dd.analyzed"

	// KeyHTTPStatusCode is the key of the http status code in the meta map
	KeyHTTPStatusCode = "http.status_code"

	// KeySpanSamplingMechanism is the metric key holding a span sampling rule that a span was kept on.
	KeySpanSamplingMechanism = "_dd.span_sampling.mechanism"
)

Variables

This section is empty.

Functions

func GetClientRate

func GetClientRate(s *pb.Span) float64

GetClientRate gets the rate at which the trace this span belongs to was sampled by the tracer. NOTE: This defaults to 1 if no rate is stored.

func GetEventExtractionRate

func GetEventExtractionRate(s *pb.Span) float64

GetEventExtractionRate gets the rate at which the trace from which we extracted this event was sampled at the tracer. This defaults to 1 if no rate is stored.

func GetGlobalRate

func GetGlobalRate(s *pb.Span) float64

GetGlobalRate gets the cumulative sample rate of the trace to which this span belongs.

func GetMaxEPSRate

func GetMaxEPSRate(s *pb.Span) float64

GetMaxEPSRate gets the rate at which this event was sampled by the max eps event sampler.

func GetPreSampleRate

func GetPreSampleRate(s *pb.Span) float64

GetPreSampleRate returns the rate at which the trace this span belongs to was sampled by the agent's presampler. NOTE: This defaults to 1 if no rate is stored.

func IsAnalyzedSpan

func IsAnalyzedSpan(s *pb.Span) bool

IsAnalyzedSpan checks if a span is analyzed

func SampleByRate

func SampleByRate(traceID uint64, rate float64) bool

SampleByRate returns whether to keep a trace, based on its ID and a sampling rate. This assumes that trace IDs are nearly uniformly distributed.
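One common way to make such a decision deterministic is multiplicative hashing: scramble the trace ID with a large odd constant and keep the trace when the result falls below rate × 2^64. The sketch below shows that technique under stated assumptions; the function name `sampleByRate` and the constant `knuthFactor` are illustrative and are not claimed to match the package's implementation.

```go
package main

import "math"

// knuthFactor is a large odd multiplier used to scramble trace IDs.
// The specific value is an assumption made for this sketch.
const knuthFactor = uint64(1111111111111111111)

// sampleByRate keeps a trace when its scrambled ID falls below rate * 2^64.
// The same (traceID, rate) pair always yields the same decision, so every
// component that applies the same rate agrees on which traces to keep.
func sampleByRate(traceID uint64, rate float64) bool {
	if rate >= 1 {
		return true
	}
	if rate <= 0 {
		return false
	}
	// The multiplication wraps around modulo 2^64, which is intended.
	return traceID*knuthFactor < uint64(rate*math.MaxUint64)
}
```

Because the decision depends only on the trace ID, the kept fraction approaches the requested rate only if trace IDs are close to uniformly distributed, which is the assumption stated above.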

func SetAnalyzedSpan

func SetAnalyzedSpan(s *pb.Span)

SetAnalyzedSpan marks a span analyzed

func SetClientRate

func SetClientRate(s *pb.Span, rate float64)

SetClientRate sets the rate at which the trace this span belongs to was sampled by the tracer.

func SetEventExtractionRate

func SetEventExtractionRate(s *pb.Span, rate float64)

SetEventExtractionRate sets the rate at which the trace from which we extracted this event was sampled at the tracer.

func SetMaxEPSRate

func SetMaxEPSRate(s *pb.Span, rate float64)

SetMaxEPSRate sets the rate at which this event was sampled by the max eps event sampler.

func SetPreSampleRate

func SetPreSampleRate(s *pb.Span, rate float64)

SetPreSampleRate sets the rate at which the trace this span belongs to was sampled by the agent's presampler.

func SingleSpanSampling added in v0.49.0

func SingleSpanSampling(pt *traceutil.ProcessedTrace) bool

SingleSpanSampling does single span sampling on the trace, returning true if the trace was modified

Types

type AdditionalMetricsReporter added in v0.65.0

type AdditionalMetricsReporter interface {
	// contains filtered or unexported methods
}

AdditionalMetricsReporter reports additional sampler metrics. Metrics reported through this interface are reported at each Metrics tick.

type DynamicConfig

type DynamicConfig struct {
	// RateByService contains the rate for each service/env tuple,
	// used in priority sampling by client libs.
	RateByService RateByService
}

DynamicConfig contains configuration items which may change dynamically over time.

func NewDynamicConfig

func NewDynamicConfig() *DynamicConfig

NewDynamicConfig creates a new dynamic config object which maps service signatures to their corresponding sampling rates. Each service will have a default assigned matching the service rate of the specified env.

type ErrorsSampler

type ErrorsSampler struct{ ScoreSampler }

ErrorsSampler is dedicated to catching traces containing spans with errors.

func NewErrorsSampler

func NewErrorsSampler(conf *config.AgentConfig) *ErrorsSampler

NewErrorsSampler returns an initialized Sampler dedicated to errors. It behaves just like the normal ScoreEngine except for its GetType method (useful for reporting).

type Metrics added in v0.65.0

type Metrics struct {
	// contains filtered or unexported fields
}

Metrics is a structure to record metrics for the different samplers.

func NewMetrics added in v0.65.0

func NewMetrics(statsd statsd.ClientInterface) *Metrics

NewMetrics creates a new Metrics.

func (*Metrics) Add added in v0.65.0

func (m *Metrics) Add(mr ...AdditionalMetricsReporter)

Add sampler metrics reporter.

func (*Metrics) RecordMetricsKey added in v0.65.0

func (m *Metrics) RecordMetricsKey(sampled bool, metricsKey MetricsKey)

RecordMetricsKey records if metricsKey has been seen before and whether it was kept or not.

func (*Metrics) Report added in v0.65.0

func (m *Metrics) Report()

Report reports the metrics and additional sampler metrics.

func (*Metrics) Start added in v0.65.0

func (m *Metrics) Start()

Start the metrics reporting loop.

func (*Metrics) Stop added in v0.65.0

func (m *Metrics) Stop()

Stop the metrics reporting loop.

type MetricsKey added in v0.65.0

type MetricsKey struct {
	// contains filtered or unexported fields
}

MetricsKey represents the key for the metrics.

func NewMetricsKey added in v0.65.0

func NewMetricsKey(service, env string, sampler Name, samplingPriority SamplingPriority) MetricsKey

NewMetricsKey creates a new MetricsKey.

type Name added in v0.65.0

type Name uint8

Name represents the name of the sampler.

const (
	// NameUnknown is the default value. It should not be used.
	NameUnknown Name = iota
	// NamePriority is the name of the priority sampler.
	NamePriority
	// NameNoPriority is the name of the no priority sampler.
	NameNoPriority
	// NameError is the name of the error sampler.
	NameError
	// NameRare is the name of the rare sampler.
	NameRare
	// NameProbabilistic is the name of the probabilistic sampler.
	NameProbabilistic
)

func (Name) String added in v0.65.0

func (n Name) String() string

String returns the string representation of the Name.

type NoPrioritySampler

type NoPrioritySampler struct{ ScoreSampler }

NoPrioritySampler is dedicated to catching traces with no priority set.

func NewNoPrioritySampler

func NewNoPrioritySampler(conf *config.AgentConfig) *NoPrioritySampler

NewNoPrioritySampler returns an initialized Sampler dedicated to traces with no priority set.

type PrioritySampler

type PrioritySampler struct {
	// contains filtered or unexported fields
}

PrioritySampler computes priority rates per (tracerEnv, service) to apply in a feedback loop with trace-agent clients. Computed rates are sent in HTTP responses to trace-agent. The rates are continuously adjusted as a function of the received traffic to match a targetTPS (target traces per second).

func NewPrioritySampler

func NewPrioritySampler(conf *config.AgentConfig, dynConf *DynamicConfig) *PrioritySampler

NewPrioritySampler returns an initialized Sampler

func (*PrioritySampler) GetTargetTPS added in v0.42.0

func (s *PrioritySampler) GetTargetTPS() float64

GetTargetTPS returns the target tps

func (*PrioritySampler) Sample

func (s *PrioritySampler) Sample(now time.Time, trace *pb.TraceChunk, root *pb.Span, tracerEnv string, clientDroppedP0sWeight float64) bool

Sample counts an incoming trace and returns the trace sampling decision.

func (*PrioritySampler) UpdateTargetTPS added in v0.42.0

func (s *PrioritySampler) UpdateTargetTPS(targetTPS float64)

UpdateTargetTPS updates the target tps

type ProbabilisticSampler added in v0.54.0

type ProbabilisticSampler struct {
	// contains filtered or unexported fields
}

ProbabilisticSampler is a sampler that overrides all other samplers. It deterministically samples incoming traces by a hash of their trace ID.

func NewProbabilisticSampler added in v0.54.0

func NewProbabilisticSampler(conf *config.AgentConfig) *ProbabilisticSampler

NewProbabilisticSampler returns a new ProbabilisticSampler that deterministically samples a given percentage of incoming spans based on their trace ID

func (*ProbabilisticSampler) Sample added in v0.54.0

func (ps *ProbabilisticSampler) Sample(root *trace.Span) bool

Sample a trace given the chunk's root span, returns true if the trace should be kept

type RareSampler

type RareSampler struct {
	// contains filtered or unexported fields
}

RareSampler samples traces that are not caught by the Priority sampler. It ensures that we sample traces for each combination of (env, service, name, resource, error type, http status) seen on a top-level or measured span for which we did not see any span with a priority > 0 (sampled by Priority). The resulting sampled traces will likely be incomplete and will be flagged with an exceptioKey metric set to 1.
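The idea of "keep a trace only if nothing with the same signature was kept recently" can be sketched with a map of expiry times. Everything below is hypothetical: the `rareTracker` type, the TTL expressed in seconds, and the rule that a span already kept by priority refreshes the expiry. It ignores details such as rate limiting and memory bounds.

```go
package main

// rareTracker is a hypothetical structure remembering, per span signature,
// until when that signature is considered "recently sampled".
type rareTracker struct {
	ttlSeconds int64
	expires    map[uint64]int64 // signature -> Unix time at which coverage ends
}

func newRareTracker(ttlSeconds int64) *rareTracker {
	return &rareTracker{ttlSeconds: ttlSeconds, expires: make(map[uint64]int64)}
}

// sample reports whether a span with the given signature should be kept as rare.
// keptByPriority tells whether the span was already kept with a priority > 0;
// such spans are never rare, but they do mark the signature as covered.
func (r *rareTracker) sample(nowUnix int64, sig uint64, keptByPriority bool) bool {
	exp, seen := r.expires[sig]
	covered := seen && nowUnix < exp
	if keptByPriority || !covered {
		// Either priority sampling just covered this signature, or we are
		// about to keep it as rare: start a new coverage window.
		r.expires[sig] = nowUnix + r.ttlSeconds
	}
	return !keptByPriority && !covered
}
```

With a 60-second TTL, the first span of a signature is kept, repeats within the next minute are not, and the signature becomes eligible again once the window has expired.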

func NewRareSampler

func NewRareSampler(conf *config.AgentConfig) *RareSampler

NewRareSampler returns a RareSampler that ensures that we sample combinations of env, service, name, resource, http-status, error type for each top-level or measured span.

func (*RareSampler) IsEnabled added in v0.42.0

func (e *RareSampler) IsEnabled() bool

IsEnabled returns whether the sampler is enabled

func (*RareSampler) Sample

func (e *RareSampler) Sample(now time.Time, t *pb.TraceChunk, env string) bool

Sample samples a trace and returns true if the trace was sampled (should be kept).

func (*RareSampler) SetEnabled added in v0.42.0

func (e *RareSampler) SetEnabled(enabled bool)

SetEnabled marks the sampler as enabled or disabled

type RateByService

type RateByService struct {
	// contains filtered or unexported fields
}

RateByService stores the sampling rate per service. It is thread-safe, so one can read/write on it concurrently, using getters and setters.

func (*RateByService) GetNewState

func (rbs *RateByService) GetNewState(version string) State

GetNewState returns the current state if the given version is different from the local version.

func (*RateByService) SetAll

func (rbs *RateByService) SetAll(rates map[ServiceSignature]float64)

SetAll sets the sampling rate for all services. If a service/env is not in the map, then the entry is removed.
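The two RateByService methods suggest a versioned, lock-protected map. The sketch below is one way such a structure could work, not the package's implementation: the lowercase types mirror the exported ones, the `service:<name>,env:<env>` key format and the counter-based version string are assumptions, and returning a state without rates when the caller is up to date is also assumed.

```go
package main

import (
	"strconv"
	"sync"
)

type serviceSignature struct{ Name, Env string }

// key builds the map key for a service; the format is an assumption.
func (s serviceSignature) key() string {
	return "service:" + s.Name + ",env:" + s.Env
}

type state struct {
	Rates   map[string]float64
	Version string
}

type rateByService struct {
	mu      sync.RWMutex
	rates   map[string]float64
	gen     int
	version string
}

// setAll replaces every rate at once, so services absent from the new
// map disappear, and bumps the version so readers can detect the change.
func (r *rateByService) setAll(rates map[serviceSignature]float64) {
	fresh := make(map[string]float64, len(rates))
	for sig, rate := range rates {
		fresh[sig.key()] = rate
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.rates = fresh
	r.gen++
	r.version = strconv.Itoa(r.gen)
}

// getNewState returns a copy of the rates only when the caller's version
// differs from the local one; otherwise it returns the version alone.
func (r *rateByService) getNewState(version string) state {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if version == r.version {
		return state{Version: r.version}
	}
	cp := make(map[string]float64, len(r.rates))
	for k, v := range r.rates {
		cp[k] = v
	}
	return state{Rates: cp, Version: r.version}
}
```

Returning a copy keeps callers from reading the internal map while a later `setAll` swaps it, and the version check lets a client that is already up to date skip the payload.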

type Sampler

type Sampler struct {
	// contains filtered or unexported fields
}

Sampler is the main component of the sampling logic. Seen traces are counted per signature in a circular buffer of numBuckets. The sampler distributes a targetTPS uniformly across all signatures. The bucket with the maximum count over the period of the buffer is used to compute the sampling rates.
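A simplified sketch of this scheme follows. It assumes 6 buckets of 5 seconds, non-decreasing timestamps, and an equal share of the target for every signature with no redistribution of unused budget; these values and the names `window`, `count`, `maxTPS` and `rates` are hypothetical and chosen only for illustration.

```go
package main

const (
	numBuckets    = 6 // assumed length of the circular buffer
	bucketSeconds = 5 // assumed duration of one bucket, in seconds
)

// window counts the traces of one signature in a circular buffer of buckets.
type window struct {
	counts [numBuckets]float64
	last   int64 // ID of the most recent bucket written
}

// count records one trace seen at the given Unix time (assumed non-decreasing).
func (w *window) count(unixSec int64) {
	id := unixSec / bucketSeconds
	// Clear the slots of every bucket entered since the last write,
	// so counts from a previous lap of the buffer do not linger.
	for b := w.last + 1; b <= id && b <= w.last+numBuckets; b++ {
		w.counts[b%numBuckets] = 0
	}
	if id > w.last {
		w.last = id
	}
	w.counts[id%numBuckets]++
}

// maxTPS converts the fullest bucket into traces per second.
func (w *window) maxTPS() float64 {
	peak := 0.0
	for _, c := range w.counts {
		if c > peak {
			peak = c
		}
	}
	return peak / bucketSeconds
}

// rates gives each signature an equal share of targetTPS and derives the
// sampling rate that would bring its peak traffic down to that share.
func rates(targetTPS float64, windows map[uint64]*window) map[uint64]float64 {
	out := make(map[uint64]float64, len(windows))
	if len(windows) == 0 {
		return out
	}
	share := targetTPS / float64(len(windows))
	for sig, w := range windows {
		if tps := w.maxTPS(); tps > share {
			out[sig] = share / tps
		} else {
			out[sig] = 1
		}
	}
	return out
}
```

Using the fullest bucket rather than the average makes the rate react to bursts: a signature that peaked at 20 traces per second with a share of 5 gets a rate of 0.25 even if the rest of the window was quiet.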

type SamplingPriority

type SamplingPriority int8

SamplingPriority is the type encoding a priority sampling decision.

const (
	// PriorityNone is the value for SamplingPriority when no priority sampling decision could be found.
	PriorityNone SamplingPriority = math.MinInt8

	// PriorityUserDrop is the value set by a user to explicitly drop a trace.
	PriorityUserDrop SamplingPriority = -1

	// PriorityAutoDrop is the value set by a tracer to suggest dropping a trace.
	PriorityAutoDrop SamplingPriority = 0

	// PriorityAutoKeep is the value set by a tracer to suggest keeping a trace.
	PriorityAutoKeep SamplingPriority = 1

	// PriorityUserKeep is the value set by a user to explicitly keep a trace.
	PriorityUserKeep SamplingPriority = 2
)

func GetSamplingPriority

func GetSamplingPriority(t *pb.TraceChunk) (SamplingPriority, bool)

GetSamplingPriority returns the value of the sampling priority metric set on this span and a boolean indicating if such a metric was actually found or not.

func (SamplingPriority) IsKeep added in v0.64.0

func (s SamplingPriority) IsKeep() bool

IsKeep returns whether the priority is "keep".
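For illustration, a "keep" check over these priority values could look like the sketch below. The lowercase type and constants mirror the exported ones listed above; treating exactly the two keep values as "keep" is an assumption about the behavior, not a copy of the package's code.

```go
package main

import "math"

// samplingPriority mirrors the exported SamplingPriority type for this sketch.
type samplingPriority int8

const (
	priorityNone     samplingPriority = math.MinInt8
	priorityUserDrop samplingPriority = -1
	priorityAutoDrop samplingPriority = 0
	priorityAutoKeep samplingPriority = 1
	priorityUserKeep samplingPriority = 2
)

// isKeep reports whether the priority asks for the trace to be kept,
// whether the decision came from the tracer or from the user.
func isKeep(p samplingPriority) bool {
	return p == priorityAutoKeep || p == priorityUserKeep
}
```

Note that a missing priority (`priorityNone`) is neither keep nor drop here, which is why the lookup function above also returns a boolean saying whether a priority was found.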

type ScoreSampler

type ScoreSampler struct {
	*Sampler
	// contains filtered or unexported fields
}

ScoreSampler samples pieces of traces by computing a signature based on spans (service, name, rsc, http.status, error.type), scoring it, and applying a rate. The rates are applied on the TraceID to maximize the number of chunks with errors caught for the same traceID. For a given traceID: P(chunk1 kept and chunk2 kept) = min(P(chunk1 kept), P(chunk2 kept)).
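The min property follows from deciding on the trace ID rather than on an independent random draw. The sketch below, with hypothetical helpers `unitPoint` and `keptAt` and an assumed scrambling constant, maps each trace ID to a fixed point in [0, 1) and keeps a chunk when that point is below the chunk's rate. Two chunks of the same trace share the point, so both are kept exactly when the point is below the smaller rate.

```go
package main

// unitPoint maps a trace ID to a fixed pseudo-random point in [0, 1).
// The multiplier is an assumed scrambling constant for this sketch.
func unitPoint(traceID uint64) float64 {
	const scramble = 1111111111111111111
	// Keep the top 53 bits so the value converts to float64 without rounding.
	return float64((traceID*scramble)>>11) / (1 << 53)
}

// keptAt reports whether a chunk of the given trace is kept at the given rate.
func keptAt(traceID uint64, rate float64) bool {
	return unitPoint(traceID) < rate
}
```

With independent random draws the probability of keeping both chunks would be the product of the two rates (0.12 for rates 0.2 and 0.6); with the shared point it is the minimum (0.2), so more traces keep all of their chunks.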

func (*ScoreSampler) GetTargetTPS added in v0.42.0

func (s *ScoreSampler) GetTargetTPS() float64

GetTargetTPS returns the target tps

func (*ScoreSampler) Sample

func (s *ScoreSampler) Sample(now time.Time, trace pb.Trace, root *pb.Span, env string) bool

Sample counts an incoming trace and tells if it is a sample which has to be kept

func (*ScoreSampler) UpdateTargetTPS added in v0.42.0

func (s *ScoreSampler) UpdateTargetTPS(targetTPS float64)

UpdateTargetTPS updates the target tps

type ServiceSignature

type ServiceSignature struct{ Name, Env string }

ServiceSignature represents a unique way to identify a service.

func (ServiceSignature) Hash

func (s ServiceSignature) Hash() Signature

Hash generates the signature of a trace from minimal information such as service and env. It is typically used by distributed sampling based on priority, and as a key to store the desired rate for a given (service, env) tuple.
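As a sketch of how a (service, env) pair could be reduced to a 64-bit signature, the example below uses FNV-1a from the standard library. The choice of FNV-1a and of a comma separator are assumptions made for illustration; the package may compute its hash differently.

```go
package main

import "hash/fnv"

// serviceSignature mirrors the exported ServiceSignature type for this sketch.
type serviceSignature struct{ Name, Env string }

// hash folds the service name and env into a 64-bit value.
// The separator keeps ("a", "bc") and ("ab", "c") from hashing the same bytes.
func (s serviceSignature) hash() uint64 {
	h := fnv.New64a()
	h.Write([]byte(s.Name)) // Write on a hash.Hash never returns an error
	h.Write([]byte{','})
	h.Write([]byte(s.Env))
	return h.Sum64()
}
```

A fixed-size integer key is cheaper to store and compare than the two strings, at the cost of a very small collision probability.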

func (ServiceSignature) String

func (s ServiceSignature) String() string

type Signature

type Signature uint64

Signature is a hash representation of a trace or a service, used to identify similar signatures.

type State

type State struct {
	Rates   map[string]float64
	Version string
}

State specifies the current state of DynamicConfig
