mirror of
https://github.com/hyperdxio/hyperdx
synced 2026-04-21 13:37:15 +00:00
## Problem

High-throughput services can produce millions of spans per second. Storing every span is expensive, so we run the OpenTelemetry Collector's tail-sampling processor to keep only 1-in-N spans. Each kept span carries a `SampleRate` attribute recording N.

Once data is sampled, naive aggregations are wrong: `count()` returns N times fewer events than actually occurred, `sum()`/`avg()` are biased, and percentiles shift. Dashboards show misleadingly low request counts, throughput, and error rates, making capacity planning and alerting unreliable.

### Why Materialized Views Cannot Solve This Alone

A materialized view that pre-aggregates sampled spans is a useful performance optimization for known dashboard queries, but it cannot replace a sampling-aware query engine.

**Fixed dimensions.** A materialized view pre-aggregates by a fixed set of GROUP BY keys (e.g. `ServiceName`, `SpanName`, `StatusCode`, `TimestampBucket`). Trace exploration requires slicing by arbitrary span attributes (`http.target`, `k8s.pod.name`, custom business tags) in combinations that cannot be predicted at view creation time. Grouping by a different dimension requires either going back to the raw table or maintaining a separate materialized view for every possible dimension combination.

If you try to work around the fixed-dimensions problem by adding high-cardinality span attributes to the GROUP BY, the materialized table approaches a 1:1 row ratio with the raw table. You end up doubling storage without meaningful compression.

**Fixed aggregation fields.** A typical MV aggregates only a single numeric column such as `Duration`. Users want weighted aggregations over any numeric attribute: request body sizes, queue depths, retry counts, custom metrics attached to spans. Each new field requires adding more `AggregateFunction` columns and recreating the view.
**Industry precedent.** Platforms that rely solely on pre-aggregation (Datadog, Splunk, New Relic, Elastic) get accurate RED dashboards but cannot correct ad-hoc queries over sampled span data. Only query-engine weighting (Honeycomb) produces correct results for arbitrary ad-hoc queries, including weighted percentiles and heatmaps.

A better solution is to make the query engine itself sampling-aware, so that all queries, whether from dashboards, alerts, or ad-hoc searches, automatically weight by `SampleRate` regardless of which dimensions or fields the user picks. Materialized views remain a useful complement for accelerating known, fixed-dimension dashboard panels, but they are not a substitute for correct query-time weighting.

## Summary

`TraceSourceSchema` gets a new optional field, `sampleRateExpression`: the ClickHouse expression that evaluates to the per-span sample rate (e.g. `SpanAttributes['SampleRate']`). When it is not configured, all queries are unchanged. When it is set, the query builder rewrites SQL aggregations to weight each span by its sample rate:

| aggFn | Before | After (sample-corrected) | Overhead |
| :------------- | :----------------- | :-------------------------------------------------- | :------- |
| count | `count()` | `sum(weight)` | ~1x |
| count + cond | `countIf(cond)` | `sumIf(weight, cond)` | ~1x |
| avg | `avg(col)` | `sum(col * weight) / sum(weight)` | ~2x |
| sum | `sum(col)` | `sum(col * weight)` | ~1x |
| quantile(p) | `quantile(p)(col)` | `quantileTDigestWeighted(p)(col, toUInt32(weight))` | ~1.5x |
| min/max | unchanged | unchanged | 1x |
| count_distinct | unchanged | unchanged (cannot correct) | 1x |

**Types**:
- Add `sampleRateExpression` to `TraceSourceSchema` + Mongoose model
- Add `sampleWeightExpression` to `ChartConfig` schema

**Query builder:**
- `sampleWeightExpression` is wrapped as `greatest(toUInt64OrZero(toString(expr)), 1)` so spans without a `SampleRate` attribute default to weight 1 (unsampled data produces results identical to the original queries).
- Rewrite `aggFnExpr` in `renderChartConfig.ts` when `sampleWeightExpression` is set, with safe default-to-1 wrapping

**Integration** (propagate `sampleWeightExpression` to all chart configs):
- ChartEditor/utils.ts, DBSearchPage, ServicesDashboardPage, sessions
- DBDashboardPage (raw SQL + builder branches)
- AlertPreviewChart
- SessionSubpanel
- ServiceDashboardEndpointPerformanceChart
- ServiceDashboardSlowestEventsTile (p95 query + events table)
- ServiceDashboardEndpointSidePanel (error rate + throughput)
- ServiceDashboardDbQuerySidePanel (total query time + throughput)
- External API v2 charts, AI controller, alerts (index + template)

**UI**:
- Add Sample Rate Expression field to trace source admin form

### Screenshots or video

| Before | After |
| :----- | :---- |
|        |       |

### How to test locally or on Vercel

1.
2.
3.

### References

- Linear Issue:
- Related PRs:
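The rewrite rules in the Summary table can be sketched as a small string-building function. This is an illustrative sketch under stated assumptions, not the code in `renderChartConfig.ts`: the names `AggSpec`, `wrapWeight`, and `renderAgg` are hypothetical, and only the SQL shapes listed in the table are taken from the description above.

```typescript
// Illustrative sketch only: AggSpec, wrapWeight and renderAgg are hypothetical
// names, not the actual HyperDX query-builder API.
type AggFn = 'count' | 'sum' | 'avg' | 'quantile' | 'min' | 'max';

interface AggSpec {
  aggFn: AggFn;
  column?: string; // value expression; not used by count
  condition?: string; // optional filter, only applied to count in this sketch
  level?: number; // quantile level, e.g. 0.95
}

// Spans without a usable sample rate fall back to weight 1.
function wrapWeight(sampleRateExpression: string): string {
  return `greatest(toUInt64OrZero(toString(${sampleRateExpression})), 1)`;
}

function renderAgg(spec: AggSpec, sampleRateExpression?: string): string {
  const { aggFn, column = '', condition, level = 0.5 } = spec;

  if (!sampleRateExpression) {
    // No sample rate configured: emit the original, unweighted SQL.
    switch (aggFn) {
      case 'count':
        return condition ? `countIf(${condition})` : 'count()';
      case 'quantile':
        return `quantile(${level})(${column})`;
      default:
        return `${aggFn}(${column})`;
    }
  }

  const w = wrapWeight(sampleRateExpression);
  switch (aggFn) {
    case 'count':
      return condition ? `sumIf(${w}, ${condition})` : `sum(${w})`;
    case 'sum':
      return `sum(${column} * ${w})`;
    case 'avg':
      return `sum(${column} * ${w}) / sum(${w})`;
    case 'quantile':
      return `quantileTDigestWeighted(${level})(${column}, toUInt32(${w}))`;
    default:
      // min and max do not depend on how many spans each row represents.
      return `${aggFn}(${column})`;
  }
}

// Example: a weighted average over Duration.
console.log(
  renderAgg({ aggFn: 'avg', column: 'Duration' }, "SpanAttributes['SampleRate']"),
);
```

Calling `renderAgg` without a second argument returns the original aggregation, which mirrors the "when not configured, all queries are unchanged" behaviour described in the Summary.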
210 lines
12 KiB
Bash
Executable file
#!/bin/bash
set -e

# E2E-specific database initialization script
# Creates tables with e2e_ prefix to avoid collision with local dev data

# We don't have a JSON schema yet, so let's let the collector create the tables
if [ "$BETA_CH_OTEL_JSON_SCHEMA_ENABLED" = "true" ]; then
  exit 0
fi

DATABASE=${HYPERDX_OTEL_EXPORTER_CLICKHOUSE_DATABASE:-default}

clickhouse client -n <<EOFSQL
CREATE DATABASE IF NOT EXISTS ${DATABASE};

CREATE TABLE IF NOT EXISTS ${DATABASE}.e2e_otel_logs
(
  \`Timestamp\` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
  \`TimestampTime\` DateTime DEFAULT toDateTime(Timestamp),
  \`TraceId\` String CODEC(ZSTD(1)),
  \`SpanId\` String CODEC(ZSTD(1)),
  \`TraceFlags\` UInt8,
  \`SeverityText\` LowCardinality(String) CODEC(ZSTD(1)),
  \`SeverityNumber\` UInt8,
  \`ServiceName\` LowCardinality(String) CODEC(ZSTD(1)),
  \`Body\` String CODEC(ZSTD(1)),
  \`ResourceSchemaUrl\` LowCardinality(String) CODEC(ZSTD(1)),
  \`ResourceAttributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`ScopeSchemaUrl\` LowCardinality(String) CODEC(ZSTD(1)),
  \`ScopeName\` String CODEC(ZSTD(1)),
  \`ScopeVersion\` LowCardinality(String) CODEC(ZSTD(1)),
  \`ScopeAttributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`LogAttributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`__hdx_materialized_k8s.cluster.name\` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.cluster.name'] CODEC(ZSTD(1)),
  \`__hdx_materialized_k8s.container.name\` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.container.name'] CODEC(ZSTD(1)),
  \`__hdx_materialized_k8s.deployment.name\` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.deployment.name'] CODEC(ZSTD(1)),
  \`__hdx_materialized_k8s.namespace.name\` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.namespace.name'] CODEC(ZSTD(1)),
  \`__hdx_materialized_k8s.node.name\` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.node.name'] CODEC(ZSTD(1)),
  \`__hdx_materialized_k8s.pod.name\` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.pod.name'] CODEC(ZSTD(1)),
  \`__hdx_materialized_k8s.pod.uid\` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.pod.uid'] CODEC(ZSTD(1)),
  \`__hdx_materialized_deployment.environment.name\` LowCardinality(String) MATERIALIZED ResourceAttributes['deployment.environment.name'] CODEC(ZSTD(1)),
  INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1,
  INDEX idx_res_attr_key mapKeys(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_res_attr_value mapValues(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_scope_attr_key mapKeys(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_scope_attr_value mapValues(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_log_attr_key mapKeys(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_log_attr_value mapValues(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_lower_body lower(Body) TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 8
)
ENGINE = MergeTree
PARTITION BY toDate(TimestampTime)
PRIMARY KEY (ServiceName, TimestampTime)
ORDER BY (ServiceName, TimestampTime, Timestamp)
TTL TimestampTime + toIntervalDay(30)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1;

CREATE TABLE IF NOT EXISTS ${DATABASE}.e2e_otel_traces
(
  \`Timestamp\` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
  \`TraceId\` String CODEC(ZSTD(1)),
  \`SpanId\` String CODEC(ZSTD(1)),
  \`ParentSpanId\` String CODEC(ZSTD(1)),
  \`TraceState\` String CODEC(ZSTD(1)),
  \`SpanName\` LowCardinality(String) CODEC(ZSTD(1)),
  \`SpanKind\` LowCardinality(String) CODEC(ZSTD(1)),
  \`ServiceName\` LowCardinality(String) CODEC(ZSTD(1)),
  \`ResourceAttributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`ScopeName\` String CODEC(ZSTD(1)),
  \`ScopeVersion\` String CODEC(ZSTD(1)),
  \`SpanAttributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`Duration\` UInt64 CODEC(ZSTD(1)),
  \`StatusCode\` LowCardinality(String) CODEC(ZSTD(1)),
  \`StatusMessage\` String CODEC(ZSTD(1)),
  \`Events.Timestamp\` Array(DateTime64(9)) CODEC(ZSTD(1)),
  \`Events.Name\` Array(LowCardinality(String)) CODEC(ZSTD(1)),
  \`Events.Attributes\` Array(Map(LowCardinality(String), String)) CODEC(ZSTD(1)),
  \`Links.TraceId\` Array(String) CODEC(ZSTD(1)),
  \`Links.SpanId\` Array(String) CODEC(ZSTD(1)),
  \`Links.TraceState\` Array(String) CODEC(ZSTD(1)),
  \`Links.Attributes\` Array(Map(LowCardinality(String), String)) CODEC(ZSTD(1)),
  \`__hdx_materialized_rum.sessionId\` String MATERIALIZED ResourceAttributes['rum.sessionId'] CODEC(ZSTD(1)),
  \`SampleRate\` UInt64 MATERIALIZED greatest(toUInt64OrZero(SpanAttributes['SampleRate']), 1) CODEC(T64, ZSTD(1)),
  INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1,
  INDEX idx_rum_session_id __hdx_materialized_rum.sessionId TYPE bloom_filter(0.001) GRANULARITY 1,
  INDEX idx_res_attr_key mapKeys(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_res_attr_value mapValues(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_span_attr_key mapKeys(SpanAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_span_attr_value mapValues(SpanAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_duration Duration TYPE minmax GRANULARITY 1,
  INDEX idx_lower_span_name lower(SpanName) TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 8
)
ENGINE = MergeTree
PARTITION BY toDate(Timestamp)
ORDER BY (ServiceName, SpanName, toDateTime(Timestamp))
TTL toDate(Timestamp) + toIntervalDay(30)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1;

CREATE TABLE IF NOT EXISTS ${DATABASE}.e2e_hyperdx_sessions
(
  \`Timestamp\` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
  \`TimestampTime\` DateTime DEFAULT toDateTime(Timestamp),
  \`TraceId\` String CODEC(ZSTD(1)),
  \`SpanId\` String CODEC(ZSTD(1)),
  \`TraceFlags\` UInt8,
  \`SeverityText\` LowCardinality(String) CODEC(ZSTD(1)),
  \`SeverityNumber\` UInt8,
  \`ServiceName\` LowCardinality(String) CODEC(ZSTD(1)),
  \`Body\` String CODEC(ZSTD(1)),
  \`ResourceSchemaUrl\` LowCardinality(String) CODEC(ZSTD(1)),
  \`ResourceAttributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`ScopeSchemaUrl\` LowCardinality(String) CODEC(ZSTD(1)),
  \`ScopeName\` String CODEC(ZSTD(1)),
  \`ScopeVersion\` LowCardinality(String) CODEC(ZSTD(1)),
  \`ScopeAttributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`LogAttributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`__hdx_materialized_rum.sessionId\` String MATERIALIZED ResourceAttributes['rum.sessionId'] CODEC(ZSTD(1)),
  \`__hdx_materialized_type\` LowCardinality(String) MATERIALIZED toString(simpleJSONExtractInt(Body, 'type')) CODEC(ZSTD(1)),
  INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1,
  INDEX idx_rum_session_id __hdx_materialized_rum.sessionId TYPE bloom_filter(0.001) GRANULARITY 1,
  INDEX idx_res_attr_key mapKeys(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_res_attr_value mapValues(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_scope_attr_key mapKeys(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_scope_attr_value mapValues(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_log_attr_key mapKeys(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_log_attr_value mapValues(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_body Body TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 8
)
ENGINE = MergeTree
PARTITION BY toDate(TimestampTime)
PRIMARY KEY (ServiceName, TimestampTime)
ORDER BY (ServiceName, TimestampTime, Timestamp)
TTL TimestampTime + toIntervalDay(30)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1;

CREATE TABLE IF NOT EXISTS ${DATABASE}.e2e_otel_metrics_gauge
(
  \`ResourceAttributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`ResourceSchemaUrl\` String CODEC(ZSTD(1)),
  \`ScopeName\` String CODEC(ZSTD(1)),
  \`ScopeVersion\` String CODEC(ZSTD(1)),
  \`ScopeAttributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`ScopeDroppedAttrCount\` UInt32 CODEC(ZSTD(1)),
  \`ScopeSchemaUrl\` String CODEC(ZSTD(1)),
  \`ServiceName\` LowCardinality(String) CODEC(ZSTD(1)),
  \`MetricName\` String CODEC(ZSTD(1)),
  \`MetricDescription\` String CODEC(ZSTD(1)),
  \`MetricUnit\` String CODEC(ZSTD(1)),
  \`Attributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`StartTimeUnix\` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
  \`TimeUnix\` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
  \`Value\` Float64 CODEC(ZSTD(1)),
  \`Flags\` UInt32 CODEC(ZSTD(1)),
  \`Exemplars.FilteredAttributes\` Array(Map(LowCardinality(String), String)) CODEC(ZSTD(1)),
  \`Exemplars.TimeUnix\` Array(DateTime64(9)) CODEC(ZSTD(1)),
  \`Exemplars.Value\` Array(Float64) CODEC(ZSTD(1)),
  \`Exemplars.SpanId\` Array(String) CODEC(ZSTD(1)),
  \`Exemplars.TraceId\` Array(String) CODEC(ZSTD(1)),
  INDEX idx_res_attr_key mapKeys(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_res_attr_value mapValues(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_scope_attr_key mapKeys(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_scope_attr_value mapValues(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_attr_key mapKeys(Attributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_attr_value mapValues(Attributes) TYPE bloom_filter(0.01) GRANULARITY 1
)
ENGINE = MergeTree
PARTITION BY toDate(TimeUnix)
ORDER BY (ServiceName, MetricName, Attributes, toUnixTimestamp64Nano(TimeUnix))
TTL toDate(TimeUnix) + toIntervalDay(30)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1;

CREATE TABLE IF NOT EXISTS ${DATABASE}.e2e_otel_metrics_sum
(
  \`ResourceAttributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`ResourceSchemaUrl\` String CODEC(ZSTD(1)),
  \`ScopeName\` String CODEC(ZSTD(1)),
  \`ScopeVersion\` String CODEC(ZSTD(1)),
  \`ScopeAttributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`ScopeDroppedAttrCount\` UInt32 CODEC(ZSTD(1)),
  \`ScopeSchemaUrl\` String CODEC(ZSTD(1)),
  \`ServiceName\` LowCardinality(String) CODEC(ZSTD(1)),
  \`MetricName\` String CODEC(ZSTD(1)),
  \`MetricDescription\` String CODEC(ZSTD(1)),
  \`MetricUnit\` String CODEC(ZSTD(1)),
  \`Attributes\` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
  \`StartTimeUnix\` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
  \`TimeUnix\` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
  \`Value\` Float64 CODEC(ZSTD(1)),
  \`Flags\` UInt32 CODEC(ZSTD(1)),
  \`AggregationTemporality\` Int32 CODEC(ZSTD(1)),
  \`IsMonotonic\` Bool CODEC(Delta(1), ZSTD(1)),
  \`Exemplars.FilteredAttributes\` Array(Map(LowCardinality(String), String)) CODEC(ZSTD(1)),
  \`Exemplars.TimeUnix\` Array(DateTime64(9)) CODEC(ZSTD(1)),
  \`Exemplars.Value\` Array(Float64) CODEC(ZSTD(1)),
  \`Exemplars.SpanId\` Array(String) CODEC(ZSTD(1)),
  \`Exemplars.TraceId\` Array(String) CODEC(ZSTD(1)),
  INDEX idx_res_attr_key mapKeys(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_res_attr_value mapValues(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_scope_attr_key mapKeys(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_scope_attr_value mapValues(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_attr_key mapKeys(Attributes) TYPE bloom_filter(0.01) GRANULARITY 1,
  INDEX idx_attr_value mapValues(Attributes) TYPE bloom_filter(0.01) GRANULARITY 1
)
ENGINE = MergeTree
PARTITION BY toDate(TimeUnix)
ORDER BY (ServiceName, MetricName, Attributes, toUnixTimestamp64Nano(TimeUnix))
TTL toDate(TimeUnix) + toIntervalDay(30)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1
EOFSQL