Commit graph

14 commits

Author SHA1 Message Date
Brandon Pereira
43dfb3aaff
chore to move critical path files (#1314)
Moves them into a core folder, which lets us easily track by path when core files are modified.

No changeset, because no version bump is required.

Fixes HDX-2589
2025-10-30 15:16:33 +00:00
Drew Davis
2162a69039
feat: Optimize and fix filtering on toStartOfX primary key expressions (#1265)
Closes HDX-2576
Closes HDX-2491

# Summary

It is a common optimization to have a primary key like `toStartOfDay(Timestamp), ..., Timestamp`. This PR improves the experience when using such a primary key in the following ways:

1. HyperDX will now automatically filter on both `toStartOfDay(Timestamp)` and `Timestamp` in this case, instead of just `Timestamp`. This improves performance by better utilizing the primary index. Previously, this required a manual change to the source's Timestamp Column setting.
2. HyperDX now applies the same `toStartOfX` function to the right-hand-side of timestamp comparisons. So when filtering using an expression like `toStartOfDay(Timestamp)`, the generated SQL will have the condition `toStartOfDay(Timestamp) >= toStartOfDay(<selected start time>) AND toStartOfDay(Timestamp) <= toStartOfDay(<selected end time>)`. This resolves an issue where some data would be incorrectly filtered out when filtering on such timestamp expressions (such as time ranges less than 1 minute).

With this change, teams should no longer need to have multiple columns in their source timestamp column configuration. However, if they do, they will now have correct filtering.
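
As a rough illustration of the second point, the sketch below mimics the comparison in plain TypeScript rather than SQL. The helper names (`toStartOfMinute`, `matchesOld`, `matchesNew`) and the sample timestamps are assumptions made for this example and are not taken from the HyperDX codebase; the point is only to show why a truncated left-hand side compared against untruncated bounds drops rows for sub-minute ranges.

```typescript
// Truncate a Date to the start of its minute (UTC), loosely mirroring
// ClickHouse's toStartOfMinute for the purposes of this sketch.
function toStartOfMinute(d: Date): Date {
  const t = new Date(d.getTime());
  t.setUTCSeconds(0, 0);
  return t;
}

// Previous behavior (illustrative): truncated column vs. raw bounds.
function matchesOld(ts: Date, start: Date, end: Date): boolean {
  const col = toStartOfMinute(ts).getTime();
  return col >= start.getTime() && col <= end.getTime();
}

// New behavior (illustrative): the same truncation on both sides.
function matchesNew(ts: Date, start: Date, end: Date): boolean {
  const col = toStartOfMinute(ts).getTime();
  return (
    col >= toStartOfMinute(start).getTime() &&
    col <= toStartOfMinute(end).getTime()
  );
}

// A 30-second window with a row that falls inside it.
const start = new Date("2025-10-27T17:20:10Z");
const end = new Date("2025-10-27T17:20:40Z");
const row = new Date("2025-10-27T17:20:25Z");

const oldResult = matchesOld(row, start, end); // row is dropped
const newResult = matchesNew(row, start, end); // row is kept
```

The truncated condition on its own is deliberately coarse (it admits the whole minute); the exact bounds still come from the accompanying filter on the raw `Timestamp` column.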

## Testing

### Testing the fix

The part of this PR that fixes time filtering can be tested with the default logs table schema. Simply set the Timestamp Column source setting to `TimestampTime, toStartOfMinute(TimestampTime)`. Then, in the logs search, filter for a timespan < 1 minute.

<details>
<summary>Without the fix, you should see no logs, since they're incorrectly filtered out by the toStartOfMinute(TimestampTime) filter</summary>

https://github.com/user-attachments/assets/915d3922-55f8-4742-b686-5090cdecef60
</details>

<details>
<summary>With the fix, you should see logs in the selected time range</summary>

https://github.com/user-attachments/assets/f75648e4-3f48-47b0-949f-2409ce075a75
</details>

### Testing the optimization

The optimization part of this change is that when a table has a primary key like `toStartOfMinute(TimestampTime), ..., TimestampTime` and the Timestamp Column for the source is just `TimestampTime`, the query will automatically filter by both `toStartOfMinute(TimestampTime)` and `TimestampTime`.

To test this, you'll need to create a table with such a primary key, then create a source based on that table. Optionally, you could copy data from the default `otel_logs` table into the new table (`INSERT INTO default.otel_logs_toStartOfMinute_Key SELECT * FROM default.otel_logs`).

<details>
<summary>DDL for log table with optimized key</summary>

```sql
CREATE TABLE default.otel_logs_toStartOfMinute_Key
(
    `Timestamp` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
    `TimestampTime` DateTime DEFAULT toDateTime(Timestamp),
    `TraceId` String CODEC(ZSTD(1)),
    `SpanId` String CODEC(ZSTD(1)),
    `TraceFlags` UInt8,
    `SeverityText` LowCardinality(String) CODEC(ZSTD(1)),
    `SeverityNumber` UInt8,
    `ServiceName` LowCardinality(String) CODEC(ZSTD(1)),
    `Body` String CODEC(ZSTD(1)),
    `ResourceSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `ResourceAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `ScopeSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `ScopeName` String CODEC(ZSTD(1)),
    `ScopeVersion` LowCardinality(String) CODEC(ZSTD(1)),
    `ScopeAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `LogAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `__hdx_materialized_k8s.pod.name` String MATERIALIZED ResourceAttributes['k8s.pod.name'] CODEC(ZSTD(1)),
    INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1,
    INDEX idx_res_attr_key mapKeys(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_res_attr_value mapValues(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_scope_attr_key mapKeys(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_scope_attr_value mapValues(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_log_attr_key mapKeys(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_log_attr_value mapValues(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_body Body TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 8,
    INDEX idx_lower_body lower(Body) TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 8
)
ENGINE = SharedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')
PARTITION BY toDate(TimestampTime)
PRIMARY KEY (toStartOfMinute(TimestampTime), ServiceName, TimestampTime)
ORDER BY (toStartOfMinute(TimestampTime), ServiceName, TimestampTime, Timestamp)
TTL TimestampTime + toIntervalDay(90)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1
```
</details>

Once you have that source, you can inspect the queries generated for that source. Whenever a date range filter is selected, the query should have a `WHERE` predicate that filters on both `TimestampTime` and `toStartOfMinute(TimestampTime)`, despite `toStartOfMinute(TimestampTime)` not being included in the Timestamp Column of the source's configuration.
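
For a sense of what to look for, here is a minimal sketch of how such a predicate could be derived. Both helpers (`timestampFilterExpressions`, `renderTimeFilter`) and the `{start}` / `{end}` placeholders are hypothetical and are not HyperDX's actual implementation; the regex only recognizes single-argument `toStartOfX(column)` expressions.

```typescript
// Hypothetical helper: find toStartOfX(<column>) expressions in a primary
// key string and return them together with the plain timestamp column.
function timestampFilterExpressions(
  primaryKey: string,
  timestampColumn: string,
): string[] {
  // Escape regex metacharacters in the column name.
  const escaped = timestampColumn.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`toStartOf\\w+\\(\\s*${escaped}\\s*\\)`, "g");
  const matches: string[] = primaryKey.match(pattern) || [];
  // Append the column itself and drop duplicates, preserving order.
  return matches
    .concat(timestampColumn)
    .filter((expr, i, all) => all.indexOf(expr) === i);
}

// Hypothetical renderer: wrap the bounds in the same toStartOfX function
// that wraps the column, and leave plain columns untouched.
function renderTimeFilter(exprs: string[]): string {
  return exprs
    .map((expr) => {
      const fn = /^(toStartOf\w+)\(/.exec(expr);
      const wrap = (bound: string) => (fn ? `${fn[1]}(${bound})` : bound);
      return `${expr} >= ${wrap("{start}")} AND ${expr} <= ${wrap("{end}")}`;
    })
    .join(" AND ");
}

const exprs = timestampFilterExpressions(
  "toStartOfMinute(TimestampTime), ServiceName, TimestampTime",
  "TimestampTime",
);
const where = renderTimeFilter(exprs);
```

With the primary key from the DDL above, `exprs` contains `toStartOfMinute(TimestampTime)` followed by `TimestampTime`, and `where` has one pair of bounds for each.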
2025-10-27 17:20:36 +00:00
Warren
43e32aafc7
feat: revisit Otel metrics semantic convention migration logics (#1267)
Revisits the migration logic, since users can still switch to the new metric names using a feature gate.

Follow-up to https://github.com/hyperdxio/hyperdx/pull/1248
2025-10-14 22:06:31 +00:00
Warren
5efa2ffa0d
feat: handle k8s metrics semantic convention updates (#1248)
Handle OpenTelemetry semantic versions based on the ScopeVersion field (metrics)
Related to [changes](https://opentelemetry.io/blog/2025/kubeletstats-receiver-metrics-deprecation/)

Old (switched to v0.137.0)
<img width="818" height="317" alt="image" src="https://github.com/user-attachments/assets/ceea52c6-ad06-4295-afae-a44f21b2e962" />

New (able to handle multiple versions)
<img width="568" height="329" alt="image" src="https://github.com/user-attachments/assets/d2e282b2-cfd7-490a-a64d-502881a360a2" />


Ref: HDX-2322, HDX-2562
2025-10-08 21:04:40 +00:00
Drew Davis
fa45875d38
feat: Add delta() function for gauge metrics (#1147) 2025-09-11 17:10:43 -04:00
Aaron Knudtson
fa7875c427
feat: add Summary and Exponential Histogram metrics to source form (#832) 2025-05-21 13:22:04 -04:00
Dan Hable
96b8c50898
fix(metrics): fix histogram metric query (#823)
Fix the query to address issues with the value calculation as well as allow for grouping.

Ref: HDX-1726
2025-05-19 14:47:24 +00:00
Dan Hable
b9f7d32efa
refactor: clean up the chart config CTE render logic (#686)
Some additional refactoring and testing around the more complex CTE rendering.

Ref: HDX-1511
2025-03-17 14:45:26 +00:00
Dan Hable
a9dfa14930
fix: use CTE instead of listing all index parts in query (#666)
## feat: allow CTE definitions to be nested chart configs

In order to easily use a CTE for fixing large index issues with delta
trace events, this commit updates the type and `renderWith` function to
render a nested chart config.

Ref: HDX-1343

---

## fix: use CTE instead of listing all index parts in query

Instead of sending 2 queries to the DB and enumerating all of the parts
and offsets in the query, this change uses a CTE to select the parts.
This reduces the size of the HTTP request, which fixes the "URI too
long" response.

Ref: HDX-1343
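
A rough sketch of why the CTE shrinks the request. Everything here is an
assumption for illustration: the table name, the `Duration > 1000` selector,
the part names, and the use of the ClickHouse virtual columns `_part` and
`_part_offset` are not taken from the commit itself.

```typescript
type PartOffset = { part: string; offset: number };

// Illustrative "before": every (part, offset) pair is inlined into the SQL,
// so the query text grows with the number of rows selected.
function enumeratedQuery(table: string, rows: PartOffset[]): string {
  const tuples = rows.map((r) => `('${r.part}', ${r.offset})`).join(", ");
  return `SELECT * FROM ${table} WHERE (_part, _part_offset) IN (${tuples})`;
}

// Illustrative "after": a CTE selects the pairs server-side, so the query
// text stays the same size regardless of how many rows match.
function cteQuery(table: string, selector: string): string {
  return (
    `WITH parts AS (SELECT _part, _part_offset FROM ${table} WHERE ${selector}) ` +
    `SELECT * FROM ${table} ` +
    `WHERE (_part, _part_offset) IN (SELECT _part, _part_offset FROM parts)`
  );
}

// Build a hypothetical list of 10,000 part/offset pairs.
const rows: PartOffset[] = [];
for (let i = 0; i < 10000; i++) {
  rows.push({ part: `all_${i}_${i}_0`, offset: i });
}

const longQuery = enumeratedQuery("default.otel_traces", rows);
const shortQuery = cteQuery("default.otel_traces", "Duration > 1000");
```

The enumerated form grows linearly with the number of pairs, while the CTE
form has a fixed length.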
2025-03-14 13:34:47 +00:00
Dan Hable
99b60d50b2
fix: update sum metric query based on v1 integration test (#650)
Fix the sum query to produce the correct results from the min/max test case from v1.

Ref: HDX-1421
2025-03-07 07:03:03 +00:00
Warren
cd0e4fd71c
fix: correct handling of gauge metrics in renderChartConfig (#654) 2025-03-06 00:06:57 +00:00
Dan Hable
e80630c107
feat: supporting quantile histogram metrics (#635)
Additional `renderChartConfig` support to transform a histogram select into the correct SQL syntax to generate a chart. For parity with v1, this query only handles quantile queries.

<img width="1939" alt="Screenshot 2025-02-26 at 12 58 55 PM" src="https://github.com/user-attachments/assets/1126ac6c-c431-4d89-92d7-9df1e49e25cf" />

<img width="1960" alt="Screenshot 2025-02-26 at 3 11 07 PM" src="https://github.com/user-attachments/assets/e4fa09bf-1e27-4a90-ad25-6c6cb2890877" />

Ref: HDX-1339
2025-02-27 16:55:36 +00:00
Tom Alexander
521793df2d
fix: Ensure group-by works with sum metrics (#636)
Adds all available columns into the query so that we can properly apply the group by clause.

Ref: HDX-1419
2025-02-27 15:51:23 +00:00
Warren
57a6bc399f
feat: BETA metrics support (sum + gauge) (#629)
<img width="1310" alt="Screenshot 2025-02-25 at 3 43 11 PM" src="https://github.com/user-attachments/assets/38c98bc2-2ff2-412c-b26d-4ed9952439f2" />


Co-authored-by: Mike Shi <2781687+MikeShi42@users.noreply.github.com>
Co-authored-by: Dan Hable <418679+dhable@users.noreply.github.com>
Co-authored-by: Tom Alexander <3245235+teeohhem@users.noreply.github.com>
2025-02-26 00:00:48 +00:00