Commit graph

984 commits

Author SHA1 Message Date
github-actions[bot]
d322cd967d
Release HyperDX (#1325)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-19 04:24:41 +01:00
Drew Davis
c4915d45a7
feat: Add custom trace-level attributes above trace waterfall (#1356)
Ref HDX-2752

# Summary

This PR adds a new Source setting which specifies additional Span-Level attributes which should be displayed above the Trace Waterfall. The attributes may be specified for both Log and Trace kind sources.

These are displayed on the trace panel rather than at the top of the side panel because they are trace level (from any span in the trace), whereas the top of the side panel shows information specific to the span. In a followup PR, we'll add row-level attributes, which will appear at the top of the side panel.

Notes:

1. The attributes are pulled from the log and trace source (if configured for each) despite the search page only being set to query one of those sources. If an attribute value comes from the log source while the trace source is currently configured on the search page, the attribute tag will not offer the option to show the "Add to search" button and instead will just show a "search this value" option, which navigates the search page to search the value in the correct source.

## Demo

First, source is configured with highlighted attributes:
<img width="954" height="288" alt="Screenshot 2025-11-14 at 3 02 25 PM" src="https://github.com/user-attachments/assets/e5eccbfa-3389-4521-83df-d63fcb13232a" />

Values for those attributes within the trace show up on the trace panel above the waterfall:
<img width="1257" height="152" alt="Screenshot 2025-11-14 at 3 02 59 PM" src="https://github.com/user-attachments/assets/7e3e9d87-5f3a-409b-9b08-054d6e460970" />

Values are searchable when clicked, and if a Lucene version of the property is provided, the Lucene version will be used in the search box:
<img width="1334" height="419" alt="Screenshot 2025-11-14 at 3 03 10 PM" src="https://github.com/user-attachments/assets/50d98f99-5056-48ce-acca-4219286a68f7" />
2025-11-18 19:34:07 +00:00
Warren
3f29e33852
chore: push local, aio, collector images to clickhouse/clickstack-xxx (#1376)
will migrate to `clickhouse/clickstack-xxx` repos in next PR
2025-11-18 17:17:52 +00:00
Drew Davis
09f07e576a
fix: Prevent incorrect dashboard side panel close (#1372)
Closes #1324 
Closes HDX-2724

This PR fixes a bug that caused the Side Panel to close when it was clicked on a dashboard with more than one Search Table tile. The fix was to remove the `useClickOutside` hook in favor of Mantine's default behavior, which is to close when clicking outside the drawer.
2025-11-17 20:51:06 +00:00
Drew Davis
4d1eaf1073
style: Fix filter color and alert icon alignment (#1369)
## Before

<img width="493" height="105" alt="Screenshot 2025-11-14 at 2 46 59 PM" src="https://github.com/user-attachments/assets/412ce6ce-7c62-484b-a6d8-b4fd71b8980a" />

## After

<img width="510" height="103" alt="Screenshot 2025-11-14 at 2 46 20 PM" src="https://github.com/user-attachments/assets/a39d7406-c032-4ea3-9508-1e18c9aa18a6" />
2025-11-17 15:09:59 +00:00
Brandon Pereira
44a6a08a29
remove react select (#1367)
Co-authored-by: Elizabet Oliveira <elizabet.oliveira@clickhouse.com>
2025-11-14 13:36:10 -07:00
Elizabet Oliveira
af6a8d0dac
Refactor: Remove bootstrap, adopt semantic tokens, and improve Mantine UI usage (#1347)
2025-11-14 18:01:54 +00:00
Drew Davis
a7e150c825
feat: Improve Service Maps (#1353)
# Summary

This PR makes a number of minor fixes and improvements to the Service Maps feature:

1. The Service Map now has its own tab in the side panel. This resolves usability issues such as the trace panel capturing scroll events and appearing too large on the side panel. Closes HDX-2785, Closes HDX-2732.
2. On single-trace service maps (eg. the one on the side panel), request counts are now rendered as exact numbers (eg. `1 request`), rather than approximate numbers (eg. `~1 request`). Closes HDX-2741.
3. Service map viewport bounds are now reset when the input data changes (typically when the source or sampling level changes). Closes HDX-2778.
4. Service maps now have an empty state. Closes HDX-2739.

<img width="1359" height="902" alt="Screenshot 2025-11-11 at 11 00 05 PM" src="https://github.com/user-attachments/assets/6d8c7fda-bf4e-4dbe-83e4-6395f53511cb" />
<img width="1365" height="910" alt="Screenshot 2025-11-11 at 11 05 13 PM" src="https://github.com/user-attachments/assets/af5218f9-43f8-4536-abee-5ce090cf0438" />
2025-11-14 16:40:52 +00:00
Mike Shi
5e440ab390
update docs spelling (#1365)
2025-11-14 15:04:26 +00:00
Brandon Pereira
3fb5ef7083
fix prop warnings (#1366)
Very minor PR which fixes 2 HTML structure issues (div under p, button under button).

Before: 
<img width="968" height="146" alt="Screenshot 2025-11-13 at 6 58 22 PM" src="https://github.com/user-attachments/assets/ca16ea04-d308-47ec-a3f0-74a8c639dd4b" />

After:
No prop errors (on search page at least)
2025-11-14 14:59:55 +00:00
Dan Hable
94a669d3ca
feat(tasks): emit duration and success/failure counts for tasks (#1364)
Emits success/failure counter values as well as execution duration as a gauge for task execution. This allows monitoring the background task health using HyperDX alerts.
2025-11-13 23:26:22 +00:00
Mike Shi
cfba5cb63d
feat: Sort source dropdown alphabetically (#1359)
Fixes HDX-2820
2025-11-13 14:43:27 +00:00
Drew Davis
c42a070a9e
fix: Fix session search behavior (#1357)
# Summary

This PR fixes a few bugs in the session search page:

1. Clicking ENTER now triggers a form submission on the session page for lucene conditions (SQL conditions already worked)
2. Clicking ENTER now triggers a form submission on the session side panel for both lucene and SQL conditions
3. The WHERE condition in the search sidebar is now interpreted in the correct `whereLanguage` instead of assuming lucene. Partially reverts #863, but I confirmed that the page-level search does not filter the sidepanel spans after this change.

This PR also fixes the same issue (ENTER now submits forms) on the dashboard and services page. #1208 introduced the issue by preventing the ENTER event from bubbling up to the form when using `AutocompleteInput` / `SearchInputV2`.

Closes HDX-2816
Closes HDX-2817

https://github.com/user-attachments/assets/b91bdb0f-e241-43c2-9854-88fbe43daec7
2025-11-13 14:28:03 +00:00
Drew Davis
7bb7a878de
feat: Add filter for root spans (#1341)
Closes HDX-2772

This PR adds a filter that allows for quickly viewing just root spans from a Trace source.

Notes:
- The state of this filter is persisted in the URL Query Params
- This filter is not persisted in a saved search to match the behavior of other filters, which are not persisted in saved searches.

<img width="1237" height="833" alt="Screenshot 2025-11-12 at 3 56 18 PM" src="https://github.com/user-attachments/assets/9e6b461d-f201-4521-b546-15f986c7ec5b" />
<img width="1252" height="693" alt="Screenshot 2025-11-12 at 3 56 32 PM" src="https://github.com/user-attachments/assets/0aa02818-93ba-4f57-96fd-58c46aac3d9d" />
2025-11-13 14:11:37 +00:00
Dan Hable
a75ce3be6e
fix(alerts): correct p-queue usage (#1355)
Avoid awaiting the call to `add()`. Doing so waits not only for the function to be enqueued, but also for it to finish executing.

This section of the [documentation](https://www.npmjs.com/package/p-queue) is key:
> [!IMPORTANT] If you await this promise, you will wait for the task to finish running, which may defeat the purpose of using a queue for concurrency. See the [Usage](https://www.npmjs.com/package/p-queue#usage) section for examples.
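
A minimal sketch of the difference, for illustration only. It does not use p-queue itself: `TinyQueue`, `awaitEachAdd`, and `enqueueThenWait` are hypothetical names, and the stand-in queue only mirrors the documented behaviour that the promise returned by `add()` settles once the task has finished.

```typescript
// Hypothetical stand-in for a concurrency-limited queue. This is NOT p-queue's
// implementation; it only mirrors the documented behaviour that the promise
// returned by add() settles when the task has finished running.
class TinyQueue {
  private running = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly concurrency: number;
  maxObserved = 0; // highest number of tasks seen running at once

  constructor(concurrency: number) {
    this.concurrency = concurrency;
  }

  add<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const run = (): void => {
        this.running++;
        this.maxObserved = Math.max(this.maxObserved, this.running);
        // Free the slot and start the next waiting task, if any.
        const release = (): void => {
          this.running--;
          const next = this.waiting.shift();
          if (next) next();
        };
        task().then(
          (value) => { release(); resolve(value); },
          (error) => { release(); reject(error); },
        );
      };
      if (this.running < this.concurrency) run();
      else this.waiting.push(run);
    });
  }
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Anti-pattern: awaiting add() inside the loop runs the tasks one at a time.
async function awaitEachAdd(taskCount: number): Promise<number> {
  const queue = new TinyQueue(3);
  for (let i = 0; i < taskCount; i++) {
    await queue.add(() => sleep(5));
  }
  return queue.maxObserved;
}

// Fix: enqueue everything first, then wait once for the whole batch.
async function enqueueThenWait(taskCount: number): Promise<number> {
  const queue = new TinyQueue(3);
  const pending: Array<Promise<void>> = [];
  for (let i = 0; i < taskCount; i++) {
    pending.push(queue.add(() => sleep(5)));
  }
  await Promise.all(pending);
  return queue.maxObserved;
}
```

With six tasks and a concurrency of 3, `awaitEachAdd(6)` never has more than one task running, while `enqueueThenWait(6)` reaches three. With p-queue itself, the final wait can also be expressed with `queue.onIdle()`.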
2025-11-12 20:12:09 +00:00
Tom Alexander
63fcf145cd
fix: optimize query key for aliasMap to prevent jitter (#1351)
Fixes: HDX-2787

During live tail, the date range changes every few seconds (e.g., from 9:00-9:15 to 9:02-9:17, etc...). The original aliasMap query key included the entire config object, which contains the dateRange property. Every date range change triggered a refetch of the alias map, even though aliases are derived from the SELECT statement and not from the date range.

While refetching, react-query sets aliasMap to undefined. This caused column IDs to change. React-table uses column IDs as keys to track resize state, so when the ID changes, it loses the stored width and resets to the default size, causing the visible jitter.

Now we have a consistent aliasMap with the added benefit of fewer network requests.
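
A rough sketch of the idea, not the actual change: `ChartConfigSketch` and `aliasMapQueryKey` are hypothetical names, and the config shape is trimmed down for illustration. The key is built from everything except `dateRange`, so a sliding live-tail window keeps mapping to the same cache entry.

```typescript
// Hypothetical, trimmed-down config shape; the real config has more fields.
interface ChartConfigSketch {
  select: string;
  from: string;
  where: string;
  dateRange: [Date, Date];
}

// Build the cache key from the config minus dateRange, so that a change in
// the time window alone does not produce a new key (and a refetch).
function aliasMapQueryKey(config: ChartConfigSketch): readonly unknown[] {
  const stable: Partial<ChartConfigSketch> = { ...config };
  delete stable.dateRange;
  return ['aliasMap', stable];
}

// Two live-tail ticks: same SELECT, window shifted by two minutes.
const tick1 = aliasMapQueryKey({
  select: 'Body AS message',
  from: 'otel_logs',
  where: '',
  dateRange: [new Date(0), new Date(900000)],
});
const tick2 = aliasMapQueryKey({
  select: 'Body AS message',
  from: 'otel_logs',
  where: '',
  dateRange: [new Date(120000), new Date(1020000)],
});
```

Here `tick1` and `tick2` serialize to the same value, so a cache keyed on them would treat both ticks as one query; changing `select` would still produce a different key.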
2025-11-12 18:24:44 +00:00
Mike Shi
64b5673089
feat: Format row counts in search page (#1352)
2025-11-11 20:21:49 -05:00
Tom Alexander
b90a0649b8
fix: Switch to 'all' after filters change on kubernetes dashboard page (#1337)
Fixes: HDX-2789
2025-11-11 14:52:47 +00:00
Drew Davis
44caf197b4
feat: Zero-fill empty alert periods (#1340)
Closes HDX-2568

# Summary

This PR adds zero-filling to alert evaluation, meaning that periods with no data returned in the alert query will be interpreted as a 0 value, which will (a) cause BELOW-threshold alerts to ALERT and (b) cause ABOVE-threshold alerts to auto-resolve.

Grouped alerts do not have this behavior except when no group returns any data, since it is not always possible to know which groups should have data. A note about this behavior has been added to the UI. Groups are still auto-resolved (when appropriate) when the following period has no data for that group. 
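
As an illustration of the ungrouped case only, here is a small sketch. `zeroFill`, `isAlerting`, and the bucket shape are hypothetical, and the exact comparison operators (`>=` for above, `<` for below) are an assumption rather than something stated in this PR.

```typescript
type ThresholdType = 'above' | 'below';

interface Bucket {
  start: number; // period start, in seconds
  value: number;
}

// Return one bucket per expected period, using 0 where the query returned no row.
function zeroFill(expectedStarts: number[], rows: Bucket[]): Bucket[] {
  const byStart = new Map(rows.map((r) => [r.start, r.value] as const));
  return expectedStarts.map((start) => ({
    start,
    value: byStart.get(start) ?? 0,
  }));
}

// Assumed comparison semantics; the real evaluation may differ at the boundary.
function isAlerting(value: number, threshold: number, type: ThresholdType): boolean {
  return type === 'above' ? value >= threshold : value < threshold;
}

// The query returned data for the first and third minute only.
const filled = zeroFill([0, 60, 120], [
  { start: 0, value: 5 },
  { start: 120, value: 7 },
]);
```

The middle period in `filled` has value 0, so with a threshold of 1 a below-threshold alert fires for it, while an above-threshold alert is not alerting and can resolve.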

<img width="773" height="460" alt="Screenshot 2025-11-10 at 8 53 05 AM" src="https://github.com/user-attachments/assets/82f6ced1-9bd1-41cd-832d-7bf8abb1253d" />
2025-11-10 19:27:27 +00:00
Drew Davis
78aff3365d
fix: Group alert histories by evaluation time (#1338)
Closes HDX-2728

# Summary

This PR groups AlertHistory records by `createdAt` time to avoid showing multiple alert histories for the same time on the alerts page. There can be multiple AlertHistory records for the same `createdAt` time for grouped alerts.
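
A minimal sketch of the grouping, assuming a simplified record shape. `AlertHistorySketch`, `groupByCreatedAt`, and the `group` and `state` fields are hypothetical stand-ins, not the actual model.

```typescript
// Hypothetical, simplified record; the real AlertHistory model has more fields.
interface AlertHistorySketch {
  createdAt: Date;
  group?: string;
  state: 'ALERT' | 'OK';
}

// Collect all records that share an evaluation time into a single entry,
// keyed by the createdAt timestamp in milliseconds.
function groupByCreatedAt(
  histories: AlertHistorySketch[],
): Map<number, AlertHistorySketch[]> {
  const grouped = new Map<number, AlertHistorySketch[]>();
  for (const history of histories) {
    const key = history.createdAt.getTime();
    const bucket = grouped.get(key);
    if (bucket) bucket.push(history);
    else grouped.set(key, [history]);
  }
  return grouped;
}

// Two groups evaluated at the same time, plus one later evaluation.
const grouped = groupByCreatedAt([
  { createdAt: new Date(60000), group: 'service-a', state: 'ALERT' },
  { createdAt: new Date(60000), group: 'service-b', state: 'OK' },
  { createdAt: new Date(120000), group: 'service-a', state: 'OK' },
]);
```

`grouped` has two entries here: one holding both records from the first evaluation and one holding the single later record, which is the "one history per time" shape the alerts page needs.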

## Testing

To test this, set up a Saved Search alert with a group by configured, then navigate to the alerts page to see one history per time:

<img width="1466" height="154" alt="Screenshot 2025-11-07 at 4 46 40 PM" src="https://github.com/user-attachments/assets/ccc48ba0-07b2-48b1-ad25-de8c88467611" />
<img width="791" height="773" alt="Screenshot 2025-11-07 at 4 46 30 PM" src="https://github.com/user-attachments/assets/2ab0f0c6-1d46-4c65-9fbb-cf4c5d62580e" />
2025-11-10 19:21:20 +00:00
Warren
840d73076c
feat: adjust alert template title and body to reflect alert state (#1339)
Currently, the resolved alert will have the same title and body message as the alerting one, which is misleading.

Ref: HDX-2786

## Slack

### ALERT
<img width="723" height="358" alt="Screenshot 2025-11-09 at 10 02 52 PM" src="https://github.com/user-attachments/assets/b1c6f563-f095-457e-9a70-01c8149796c4" />

### RESOLVED
<img width="650" height="117" alt="Screenshot 2025-11-09 at 10 26 01 PM" src="https://github.com/user-attachments/assets/07ef1e7d-8ee5-4604-92cf-4811a0a5c811" />

## incident.io

### ALERT
<img width="1432" height="398" alt="Screenshot 2025-11-09 at 11 07 30 PM" src="https://github.com/user-attachments/assets/30e25eb3-32b2-4f51-934d-b28e75dd5cf7" />

### RESOLVED
<img width="1427" height="305" alt="Screenshot 2025-11-09 at 11 08 56 PM" src="https://github.com/user-attachments/assets/913a5b99-bb07-47ae-bec9-6b0814e4b400" />
2025-11-10 18:29:19 +00:00
Drew Davis
b33db7660b
refactor: Extract helper functions from processAlert (#1336)
This PR extracts a couple of functions out of the excessively long `processAlert` function, to improve readability. 

This is also intended to simplify HDX-2568, which will be able to re-use some of these functions.
2025-11-10 15:42:06 +00:00
hiasr
c5cb1d4bb0
fix: add json compatibility for infrastructure tab (#1256)
Co-authored-by: Ruben Hias <ruben.hias@techwolf.ai>
2025-11-09 08:58:05 +01:00
Tom Alexander
892e43f889
fix: Improve Kubernetes dashboard performance (#1333)
These are the minimal set of changes needed to improve the kubernetes dashboard with 100k+ pods.

**Changes:**
* Fixed a performance issue in ChartUtils that caused computation to be O(n^2). This caused charts to slow down rendering to a crawl and freeze the page. It's not as noticeable with a smaller data set. This was the main issue.
* Limited the number of items returned in the nodes, namespaces, and pods tables to 10k. This was the second biggest issue.
* Introduced a virtualized table to each of the tables to speed up rendering. This was the third biggest issue.
* Increased the amount of unique items returned from the metadata query so that users can filter for the items they need (UX improvement)

**Future changes that will improve the experience even more:**
1) Fetch 10k, but add pagination (UX)
2) Improve the query for fetching tabular data. It's timeseries, but realistically, we can make a smarter, more performant query
3) To fill out the data in the tables (cpu, memory, uptime, etc...) we make separate queries and combine them on the server side. We could make this one large query (we have an existing ticket in the backlog for it).
4) Chart rendering is very computationally intensive. It would be a better user experience to load these after the table loads.

**Outstanding (existing) bugs that I will fix in follow-up tickets:**
1) The namespaces query uses the wrong time window. It does not respect the global time picker date range.
2) Sorting of the table columns is broken.

Ref: HDX-2370

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-07 18:28:45 +00:00
Brandon Pereira
2743d85b37
Resizable Waterfall Traces Panel (#1328)
As requested by some users, we should have a way to resize the traces waterfall view (if there is enough trace data) so that they can see more of the trace, and still be able to click the trace events to view details

https://github.com/user-attachments/assets/d1b77887-4f6e-4972-b4b4-fefd70dd344e

Fixes HDX-2677
2025-11-07 15:33:06 +00:00
Drew Davis
59c6655a01
test: Refactor lengthy processAlert tests (#1335)
This PR refactors some common setup out of each `processAlert` integration test, as these tests are becoming very lengthy and hard to understand and write.

- Common mocks are now performed in beforeEach() instead of in each test
- `setupSavedSearchAlertTest` and `createAlertDetails` helpers are introduced to create the objects each test case depends on (connection, source, team, savedSearch, client, webhook, alert, AlertDetails)
- `processAlertAtTime` helper is introduced to run `processAlert` at a particular `now` time value, after querying the AlertHistory records that exist at that time.

Helps with HDX-2568
2025-11-07 15:09:56 +00:00
Brandon Pereira
99cb17c620
Webhook UX Improvements (#1323)
Adds the ability to edit existing webhooks and also test them

<img width="1884" height="1674" alt="Screenshot 2025-11-04 at 8 31 00 AM" src="https://github.com/user-attachments/assets/5f220ec4-c5ab-4ec7-89b9-cf39c215b87b" />
<img width="1922" height="590" alt="Screenshot 2025-11-04 at 8 52 39 AM" src="https://github.com/user-attachments/assets/5238df2c-90b7-465a-a029-5392c45a1e1a" />

Fixes HDX-2672

Closes https://github.com/hyperdxio/hyperdx/issues/1069
2025-11-07 14:45:50 +00:00
Brandon Pereira
2faa15a0a3
Add HTML <title> Tags (#1321)
Add title tags to most important pages, add a fallback title page so any other pages will have something basic.

Fixes HDX-2706
2025-11-05 19:45:31 +00:00
Brandon Pereira
1e39e1341b
Fix bug where generated URLs default to live tail (#1326)
This was originally prevented by this useEffect: https://github.com/hyperdxio/hyperdx/pull/1305/files#diff-6a2491347ca591776e19bf42f3b0f76b4fb6ba15f6e70e697d45c30218997b69L739 but I think the logic that deviates from the URL is causing a lot of complexity in this page, so I personally think we should work towards making the URL the source of truth for the state instead.

Original Bug Reproduction:

1. Increase time range beyond default live tail duration
2. Click on a histogram bar and then click "View Events"
3. Note that the time range is updated
4. Wait a few seconds and the time range will be incorrectly reverted back to the default live tailing date range

Fix:

1. Do reproduction steps as above
2. At step 4, see that the URL is displayed as intended.

Fixes HDX-2718
2025-11-05 15:30:30 +00:00
Drew Davis
6e628bcded
feat: Support field:(<term>...) Lucene searches (#1315)
# Summary

This PR updates HyperDX's lucene support to include parenthesized field searches of the form `<field>:(<term>...)`.

Prior to these changes, HyperDX would ignore the `<field>` entirely and search as if the query were just `<term>...`.

With these changes, the search is performed just like a `<term>...` search except:

1. The `field` is used for the search, instead of the implicit field expression (eg. `Body` for `otel_logs`)
2. The search is performed without `hasToken()`, as we assume that fields do not have bloom filters set up (matching the current behavior for how we search fields)

This support has the added benefit of unlocking multi-token substring searches (Ref HDX-1931)
- Previously, you could not search a field for a substring with multiple tokens, eg `error.message:*Method not allowed*` is interpreted as 3 separate terms, and only `*Method` would be associated with `error.message`. `error.message:"Method not allowed"` and `error.message:"*Method not allowed*"` look for exact matches, instead of substrings.
- Now, this can be accomplished with `error.message:("Method not allowed")`. This matches the current behavior of a search like `"Method not allowed"`, which would search the source's default implicit column (eg. `Body`) for the substring "Method not allowed".

## Testing

To test these changes, this PR adds a few dozen query parser unit test cases.
2025-11-04 23:39:58 +00:00
Warren
f612bf3c00
feat: add support for alert auto-resolve + Incident.io integration (#1298)
Plus fixed 'group-by' alert state issues (alert histories)

<img width="836" height="362" alt="image" src="https://github.com/user-attachments/assets/1c132313-25ea-4059-9b7c-0bfaa85408ea" />

Ref: HDX-2661
Ref: HDX-2660
2025-11-04 22:53:04 +00:00
Brandon Pereira
8dee21c81a
Minor Heatmap Improvements (#1317)
Small improvements to heatmap logic:

1. Improve the logic around filtering the outliers. Previously it was hardcoded to Duration; now it correctly uses the `Value` from the user. If the value contains an aggregate function, it will also use a CTE to calculate it properly.
2. If the outliers query fails, we show the user the query error.
3. We prioritize the outlier keys in the event deltas view over inliers (this was how it was before; now it only includes inliers if no outliers are found).
4. Ensure the autocomplete suggestions are displayed (there was a z-index issue).
2025-11-04 21:48:39 +00:00
Aaron Knudtson
19c5085cde
chore: split json otel collector to enable both during dev (#1247)
Gets us closer to a staging instance of json

<img width="216" height="174" alt="image" src="https://github.com/user-attachments/assets/b5cc3cf8-aef0-4ba4-9e9a-8c1d4fad5451" />


Co-authored-by: Warren <5959690+wrn14897@users.noreply.github.com>
2025-11-04 21:16:41 +00:00
Drew Davis
91e443f431
feat: Add service map (beta) (#1319)
Closes HDX-2699

# Summary

This PR adds a Service Map feature to HyperDX, based on (sampled) trace data.

## Demo

https://github.com/user-attachments/assets/602e9b42-1586-4cb1-9c99-024c7ef9d2bb

## How the service map is constructed

The service map is created by querying client-server (or producer-consumer) relationships from a Trace source. Two spans have a client-server/producer-consumer relationship if (a) they have the same trace ID and (b) the server/consumer's parent span ID is equal to the client/producer's span ID. This is accomplished via a self-join on the Trace table (the query can be found in `useServiceMap.ts`).

To help keep this join performant, users can set a sampling level as low as 1% and up to 100%. Lower sampling levels will result in fewer rows being joined, and thus a faster service map load. Sampling is done on `cityHash64(TraceId)` to ensure that either a trace is included in its entirety or not included at all.
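
The relationship rule and the whole-trace sampling can be sketched in memory as follows. This is an illustration under assumptions, not the query in `useServiceMap.ts`: `SpanSketch`, `hashTraceId`, `isSampled`, and `serviceEdges` are hypothetical names, and a 32-bit FNV-1a hash stands in for ClickHouse's `cityHash64`.

```typescript
// Hypothetical, simplified span shape for illustration.
interface SpanSketch {
  traceId: string;
  spanId: string;
  parentSpanId: string;
  serviceName: string;
  kind: 'client' | 'server' | 'producer' | 'consumer' | 'internal';
}

// Stand-in 32-bit FNV-1a hash; the PR samples on cityHash64(TraceId) instead.
function hashTraceId(traceId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < traceId.length; i++) {
    hash ^= traceId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Hashing the trace ID means every span of a trace gets the same decision.
function isSampled(traceId: string, samplePercent: number): boolean {
  return hashTraceId(traceId) % 100 < samplePercent;
}

// Count caller -> callee edges: same trace ID, and the callee's parent span ID
// equals the caller's span ID (client -> server, producer -> consumer).
function serviceEdges(spans: SpanSketch[], samplePercent: number): Map<string, number> {
  const sampled = spans.filter((s) => isSampled(s.traceId, samplePercent));
  const bySpan = new Map<string, SpanSketch>();
  for (const s of sampled) bySpan.set(`${s.traceId}/${s.spanId}`, s);

  const edges = new Map<string, number>();
  for (const callee of sampled) {
    if (callee.kind !== 'server' && callee.kind !== 'consumer') continue;
    const expectedCallerKind = callee.kind === 'server' ? 'client' : 'producer';
    const caller = bySpan.get(`${callee.traceId}/${callee.parentSpanId}`);
    if (!caller || caller.kind !== expectedCallerKind) continue;
    const key = `${caller.serviceName} -> ${callee.serviceName}`;
    edges.set(key, (edges.get(key) ?? 0) + 1);
  }
  return edges;
}

const demoSpans: SpanSketch[] = [
  { traceId: 't1', spanId: 'a', parentSpanId: '', serviceName: 'frontend', kind: 'client' },
  { traceId: 't1', spanId: 'b', parentSpanId: 'a', serviceName: 'api', kind: 'server' },
  // Same span IDs in a different trace must not match spans from t1.
  { traceId: 't2', spanId: 'c', parentSpanId: 'a', serviceName: 'worker', kind: 'server' },
];
const allEdges = serviceEdges(demoSpans, 100);
```

At a 100% sampling level every trace is kept, so `allEdges` contains the single edge `frontend -> api`; the `t2` span is ignored because its parent span ID only matches a span in a different trace.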
2025-11-04 20:20:26 +00:00
Brandon Pereira
24bf2b419d
fix state issues with relative time input not matching reality (#1320)
Fix issues where input state became out of sync in certain edge cases with relative time

**Before** (Demo Video):

https://github.com/user-attachments/assets/5f7b92f1-1dcd-413d-bca1-092d4e4bc360

**After**: issues are not observed.
2025-11-04 18:59:15 +00:00
Brandon Pereira
43dfb3aaff
chore to move critical path files (#1314)
moves them into a core folder, this allows us to easily track when core files are modified via path

no changeset because no version bump required

fixes HDX-2589
2025-10-30 15:16:33 +00:00
github-actions[bot]
f98193852c
Release HyperDX (#1289)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-10-30 11:13:50 +01:00
Brandon Pereira
de0b4fc755
Search Relative Time Queries (#1305)
Adds a "Relative Time" switch to the TimePicker component (if relative time is supported by the parent). When enabled, searches will work similarly to Live Tail but be relative to the option selected.

<img width="555" height="418" alt="Screenshot 2025-10-27 at 2 05 25 PM" src="https://github.com/user-attachments/assets/20d38011-d5d0-479f-a8ea-6b0be441ca87" />

Some notes:

1. When relative is enabled, I disabled very large time ranges to prioritize performance.
2. If you select "Last 15 mins" then reload, the Input will save "Live Tail" because these are the same option; this should be an edge case.
3. In the future, we might want to make "Relative Time" the default, but I didn't want to immediately do that. We could probably improve the UX further (cc @elizabetdev).
4. Moves a lot of the "Live Tail" logic out of various spots and centralizes it in a unified spot to support other values 

Fixes HDX-2653
2025-10-29 15:49:10 +00:00
Brandon Pereira
808413f5e8
Fix TimePicker Popovers (#1312)
Ensures Date Picker and Selects under TimePicker can be accessed. 

## Before

When trying to click a date in the TimePicker it would close the modal (due to click outside)

## After

Modal will remain open and interactive as expected

Fixes HDX-2662
2025-10-29 15:00:15 +00:00
Ruud Kamphuis
c6ad250f3d
Enable auto-provisioning for no-auth mode (#1297)
Co-authored-by: Aaron Knudtson <87577305+knudtty@users.noreply.github.com>
2025-10-29 09:42:39 -04:00
Drew Davis
7b6ed70c22
fix: Support custom Timestamp Columns in Surrounding Context panel (#1309)
Fixes HDX-2664

# Summary

This PR fixes an error in the surrounding context side panel that occurs when a source does not have a `Timestamp` column. To fix the error, the side panel will now reference the __hdx_timestamp alias queried by `useRowData`, which in turn is based on the Timestamp Column (or Displayed Timestamp Column) in the source's config.

## Testing

To reproduce the issue, create a source without a `Timestamp` column:

<details>
<summary>Source schema</summary>

```sql
CREATE TABLE default.otel_logs_other_ts
(
    `timestamp` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
    `timestamp_time` DateTime DEFAULT toDateTime(timestamp),
    `TraceId` String CODEC(ZSTD(1)),
    `SpanId` String CODEC(ZSTD(1)),
    `TraceFlags` UInt8,
    `SeverityText` LowCardinality(String) CODEC(ZSTD(1)),
    `SeverityNumber` UInt8,
    `ServiceName` LowCardinality(String) CODEC(ZSTD(1)),
    `Body` String CODEC(ZSTD(1)),
    `ResourceSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `ResourceAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `ScopeSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `ScopeName` String CODEC(ZSTD(1)),
    `ScopeVersion` LowCardinality(String) CODEC(ZSTD(1)),
    `ScopeAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `LogAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `__hdx_materialized_k8s.cluster.name` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.cluster.name'] CODEC(ZSTD(1)),
    `__hdx_materialized_k8s.container.name` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.container.name'] CODEC(ZSTD(1)),
    `__hdx_materialized_k8s.deployment.name` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.deployment.name'] CODEC(ZSTD(1)),
    `__hdx_materialized_k8s.namespace.name` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.namespace.name'] CODEC(ZSTD(1)),
    `__hdx_materialized_k8s.node.name` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.node.name'] CODEC(ZSTD(1)),
    `__hdx_materialized_k8s.pod.name` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.pod.name'] CODEC(ZSTD(1)),
    `__hdx_materialized_k8s.pod.uid` LowCardinality(String) MATERIALIZED ResourceAttributes['k8s.pod.uid'] CODEC(ZSTD(1)),
    `__hdx_materialized_deployment.environment.name` LowCardinality(String) MATERIALIZED ResourceAttributes['deployment.environment.name'] CODEC(ZSTD(1)),
    INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1,
    INDEX idx_res_attr_key mapKeys(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_res_attr_value mapValues(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_scope_attr_key mapKeys(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_scope_attr_value mapValues(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_log_attr_key mapKeys(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_log_attr_value mapValues(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_lower_body lower(Body) TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 8
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp_time)
PRIMARY KEY (ServiceName, timestamp_time)
ORDER BY (ServiceName, timestamp_time, timestamp)
TTL timestamp_time + toIntervalDay(30)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1
```
</details>

Then try to open the surrounding context panel:

https://github.com/user-attachments/assets/d5f14e3c-83ce-40c0-b2c4-ef9fe8e1467e

With these changes, the error is fixed (as long as the source's timestamp column configuration is correct)

https://github.com/user-attachments/assets/605c50e5-9306-4d2e-a9b1-9afc3adca9b6
2025-10-28 17:58:22 +00:00
Drew Davis
3ee93ae918
feat: Show pinned filter values while filters are loading (#1308)
Closes HDX-2641

# Summary

With this change, HyperDX will now display pinned filter values as soon as the search page loads, without waiting for the filter values to be queried from ClickHouse. This enables users to quickly apply relevant filters, before the (sometimes very slow) filter values query completes.

## Demo

For this demo, I added an artificial delay to the filter query to simulate an environment where filter queries are slow

https://github.com/user-attachments/assets/6345cb91-7aba-4acc-a832-05efb3bf17d0
2025-10-28 16:47:27 +00:00
Drew Davis
d5a38c3e05
fix: Fix pattern sample query for sources with multi-column timestamp expressions (#1304)
Fixes HDX-2621

# Summary

This PR fixes a query error when opening a sample log line in the patterns table from a source with multiple timestamp columns. To fix the issue, we simply use the first of the timestamp columns. This is consistent with several other places in the app where we use just the first timestamp column, since multiple timestamp columns are not fully supported.

## Testing

To reproduce the issue, set the Timestamp Column in source settings for the logs source to `TimestampTime, toStartOfMinute(TimestampTime)`. Then attempt to open a pattern sample:

https://github.com/user-attachments/assets/2464f97e-1423-437c-88f0-b45486feffcc

With these changes, the issue is fixed:

https://github.com/user-attachments/assets/54d8f0f2-532c-4eb4-a676-ab6a606ecac5
2025-10-28 16:19:46 +00:00
Drew Davis
15331acbee
feat: Auto-select correlated sources on k8s dashboard (#1302)
Closes HDX-2586

# Summary

This PR updates the K8s dashboard so that it auto-selects correlated log and metric sources.

Auto-selection of sources happens
1. During page load, if sources aren't specified in the URL params
2. When a new log source is selected, a correlated metric source is auto-selected. In this case, a notification is shown to the user to inform them that the metric source has been updated.

When a new metric source is selected, a correlated log source is not selected. This is to ensure the user has some way of selecting two non-correlated sources, if they truly want to. If the user does select a metric source which is not correlated with the selected log source, a warning notification will be shown to the user.

## Demo


https://github.com/user-attachments/assets/492121a1-0a51-4af9-a749-42771537678e
2025-10-28 12:17:53 +00:00
Brandon Pereira
757196f2e9
close modals when bluring (dates and search hints) (#1294)
When clicking outside the search hints or date modal, the modals close to prevent them appearing on top of other modals when you click results.

<img width="2168" height="1222" alt="Screenshot 2025-10-24 at 11 07 49 AM" src="https://github.com/user-attachments/assets/c930919a-7d91-420d-be46-1db5ca35c2de" />
<img width="1004" height="866" alt="Screenshot 2025-10-24 at 11 07 52 AM" src="https://github.com/user-attachments/assets/8969bc7d-2655-4a1d-8a34-1f301401edf8" />

Fixes HDX-2643
2025-10-27 18:55:39 +00:00
Dan Hable
431a9f01f3
refactor(tasks): create subdirectory for alerts code (#1281)
The alerts code and some of our private source tasks are becoming complex enough for multiple files. In order to keep the code clearer to debug and read, this commit moves the check alerts tasks into a sub-directory for just alert related code.
2025-10-27 18:16:50 +00:00
Aaron Knudtson
778092d34f
fix: long timeranges for task queries truncated (#1276)
Fixes HDX-2618

Related to https://github.com/ClickHouse/support-escalation/issues/6113

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-10-27 17:40:11 +00:00
Drew Davis
2162a69039
feat: Optimize and fix filtering on toStartOfX primary key expressions (#1265)
Closes HDX-2576
Closes HDX-2491

# Summary

It is a common optimization to have a primary key like `toStartOfDay(Timestamp), ..., Timestamp`. This PR improves the experience when using such a primary key in the following ways:

1. HyperDX will now automatically filter on both `toStartOfDay(Timestamp)` and `Timestamp` in this case, instead of just `Timestamp`. This improves performance by better utilizing the primary index. Previously, this required a manual change to the source's Timestamp Column setting.
2. HyperDX now applies the same `toStartOfX` function to the right-hand-side of timestamp comparisons. So when filtering using an expression like `toStartOfDay(Timestamp)`, the generated SQL will have the condition `toStartOfDay(Timestamp) >= toStartOfDay(<selected start time>) AND toStartOfDay(Timestamp) <= toStartOfDay(<selected end time>)`. This resolves an issue where some data would be incorrectly filtered out when filtering on such timestamp expressions (such as time ranges less than 1 minute).

With this change, teams should no longer need to have multiple columns in their source timestamp column configuration. However, if they do, they will now have correct filtering.
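
To see why the right-hand side needs the same function, here is a small numeric sketch using plain millisecond timestamps. The helper names are hypothetical, and `floorToMinute` stands in for `toStartOfMinute`; it is not the SQL that HyperDX generates.

```typescript
const MINUTE_MS = 60000;

// Stand-in for toStartOfMinute(): round a timestamp down to its minute.
const floorToMinute = (ms: number): number =>
  Math.floor(ms / MINUTE_MS) * MINUTE_MS;

// Previous behaviour as described above: bucketed column vs. raw bounds.
function matchesRawBounds(rowMs: number, startMs: number, endMs: number): boolean {
  const bucket = floorToMinute(rowMs);
  return bucket >= startMs && bucket <= endMs;
}

// New behaviour: the same function is applied to both sides of the comparison.
function matchesBucketedBounds(rowMs: number, startMs: number, endMs: number): boolean {
  const bucket = floorToMinute(rowMs);
  return bucket >= floorToMinute(startMs) && bucket <= floorToMinute(endMs);
}

// The bucketed filter is coarse, so it is combined with the plain timestamp filter.
function matchesBoth(rowMs: number, startMs: number, endMs: number): boolean {
  return (
    matchesBucketedBounds(rowMs, startMs, endMs) &&
    rowMs >= startMs &&
    rowMs <= endMs
  );
}

// A 30-second window, 09:00:20 to 09:00:50, with a row at 09:00:30.
const base = Date.UTC(2025, 0, 1, 9, 0, 0);
const start = base + 20000;
const end = base + 50000;
const row = base + 30000;
```

For this window, `matchesRawBounds(row, start, end)` is false because the row's bucket (09:00:00) is earlier than the raw start, which is the "no logs" symptom for ranges under a minute. `matchesBucketedBounds(row, start, end)` is true, and `matchesBoth` still rejects a row at 09:00:55 that shares the bucket but lies outside the selected range.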

## Testing

### Testing the fix

The part of this PR that fixes time filtering can be tested with the default logs table schema. Simply set the Timestamp Column source setting to `TimestampTime, toStartOfMinute(TimestampTime)`. Then, in the logs search, filter for a timespan < 1 minute.

<details>
<summary>Without the fix, you should see no logs, since they're incorrectly filtered out by the toStartOfMinute(TimestampTime) filter</summary>

https://github.com/user-attachments/assets/915d3922-55f8-4742-b686-5090cdecef60
</details>

<details>
<summary>With the fix, you should see logs in the selected time range</summary>

https://github.com/user-attachments/assets/f75648e4-3f48-47b0-949f-2409ce075a75
</details>

### Testing the optimization

The optimization part of this change is that when a table has a primary key like `toStartOfMinute(TimestampTime), ..., TimestampTime` and the Timestamp Column for the source is just `TimestampTime`, the query will automatically filter by both `toStartOfMinute(TimestampTime)` and `TimestampTime`.
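One way to picture this behavior is a helper that scans the primary key for `toStartOfX(<timestamp column>)` expressions and adds them to the set of expressions to filter on. This is a minimal sketch under stated assumptions: the function names are hypothetical, the primary key is taken as a plain string, and only simple single-argument `toStartOfX` calls are recognized. It is not the actual HyperDX code.

```typescript
// Hypothetical helpers for illustration only; not HyperDX's real code.

// Find primary key expressions of the form toStartOfX(<col>) where <col>
// is exactly the source's timestamp column.
function findToStartOfKeys(primaryKey: string, timestampColumn: string): string[] {
  const pattern = /toStartOf[A-Za-z]+\(\s*([A-Za-z_][A-Za-z0-9_]*)\s*\)/g;
  const found: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(primaryKey)) !== null) {
    // m[0] is the whole call, m[1] is the column inside the parentheses.
    if (m[1] === timestampColumn && found.indexOf(m[0]) === -1) {
      found.push(m[0]);
    }
  }
  return found;
}

// Expressions to filter on: any matching toStartOfX keys, then the column itself.
function effectiveTimestampExprs(primaryKey: string, timestampColumn: string): string[] {
  return findToStartOfKeys(primaryKey, timestampColumn).concat([timestampColumn]);
}
```

For example, calling `effectiveTimestampExprs` with the primary key `toStartOfMinute(TimestampTime), ServiceName, TimestampTime` and the column `TimestampTime` yields both `toStartOfMinute(TimestampTime)` and `TimestampTime`.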

To test this, you'll need to create a table with such a primary key, then create a source based on that table. Optionally, you could copy data from the default `otel_logs` table into the new table (`INSERT INTO default.otel_logs_toStartOfMinute_Key SELECT * FROM default.otel_logs`).

<details>
<summary>DDL for log table with optimized key</summary>

```sql
CREATE TABLE default.otel_logs_toStartOfMinute_Key
(
    `Timestamp` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
    `TimestampTime` DateTime DEFAULT toDateTime(Timestamp),
    `TraceId` String CODEC(ZSTD(1)),
    `SpanId` String CODEC(ZSTD(1)),
    `TraceFlags` UInt8,
    `SeverityText` LowCardinality(String) CODEC(ZSTD(1)),
    `SeverityNumber` UInt8,
    `ServiceName` LowCardinality(String) CODEC(ZSTD(1)),
    `Body` String CODEC(ZSTD(1)),
    `ResourceSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `ResourceAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `ScopeSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `ScopeName` String CODEC(ZSTD(1)),
    `ScopeVersion` LowCardinality(String) CODEC(ZSTD(1)),
    `ScopeAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `LogAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `__hdx_materialized_k8s.pod.name` String MATERIALIZED ResourceAttributes['k8s.pod.name'] CODEC(ZSTD(1)),
    INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1,
    INDEX idx_res_attr_key mapKeys(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_res_attr_value mapValues(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_scope_attr_key mapKeys(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_scope_attr_value mapValues(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_log_attr_key mapKeys(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_log_attr_value mapValues(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_body Body TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 8,
    INDEX idx_lower_body lower(Body) TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 8
)
ENGINE = SharedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')
PARTITION BY toDate(TimestampTime)
PRIMARY KEY (toStartOfMinute(TimestampTime), ServiceName, TimestampTime)
ORDER BY (toStartOfMinute(TimestampTime), ServiceName, TimestampTime, Timestamp)
TTL TimestampTime + toIntervalDay(90)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1
```
</details>

Once you have that source, you can inspect the queries generated for that source. Whenever a date range filter is selected, the query should have a `WHERE` predicate that filters on both `TimestampTime` and `toStartOfMinute(TimestampTime)`, despite `toStartOfMinute(TimestampTime)` not being included in the Timestamp Column of the source's configuration.
2025-10-27 17:20:36 +00:00
Drew Davis
8190ee8f6a
perf: Improve getKeyValues query performance for JSON keys (#1284)
Closes HDX-2623

# Summary

This change improves the performance of `getKeyValues` when getting values of a JSON key. 

Generally, columns that are not referenced outside of a CTE are pruned by the query planner. For JSON columns, however, if the outer select references a single field in the column, the inner select appears to read the entire JSON object.
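To make the idea concrete, the sketch below builds a query in which the CTE projects only the requested JSON paths under aliases, so the outer select never references the JSON column itself. This is an assumption about one possible way to avoid the full-object read, not a description of the actual `getKeyValues` change. The function name, the `sampled` CTE name, and the alias scheme are hypothetical.

```typescript
// Hypothetical query builder for illustration only; not the real getKeyValues.
function buildKeyValuesQuery(table: string, keys: string[], limit: number): string {
  // Inner select: project each requested key (e.g. a JSON sub-path) under an alias.
  const projections = keys.map((key, i) => `${key} AS key${i}`);
  // Outer select: aggregate over the aliases only, never the JSON column.
  const aggregates = keys.map((_, i) => `groupUniqArray(${limit})(key${i}) AS values${i}`);
  return [
    `WITH sampled AS (SELECT ${projections.join(', ')} FROM ${table})`,
    `SELECT ${aggregates.join(', ')} FROM sampled`,
  ].join('\n');
}
```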

This PR also adds integration tests for `getKeyValues` to ensure that the function generates queries that work as expected in ClickHouse.

## Performance impact (on single JSON Dashboard Filter)

- Original: 15.03s

<img width="584" height="71" alt="Screenshot 2025-10-21 at 3 28 07 PM" src="https://github.com/user-attachments/assets/184de198-cee1-4b1d-beed-ec4465d3e248" />

- Optimized: 0.443s

<img width="590" height="61" alt="Screenshot 2025-10-21 at 3 25 47 PM" src="https://github.com/user-attachments/assets/690d0ef0-15b8-47c5-9a7e-8b8f6b8f5e92" />
2025-10-27 16:46:46 +00:00
Aaron Knudtson
93edb6f84f
fix: memoize inputs to fix performance issues (#1291)
Fixes HDX-2644
Fixes #1288
2025-10-27 14:53:18 +00:00