Closes HDX-2881
# Summary
This PR adds the row-level highlighted attributes to the row overview panel, so that they appear for the span that is selected in the trace waterfall and in an expanded table row:
<img width="1071" height="966" alt="Screenshot 2025-11-20 at 2 32 09 PM" src="https://github.com/user-attachments/assets/febb6c12-4c58-4eac-b085-cbad3601b2fe" />
<img width="814" height="275" alt="Screenshot 2025-11-20 at 2 32 16 PM" src="https://github.com/user-attachments/assets/b3c6fbeb-205e-4b6a-9dfd-5ed9457a57df" />
This PR also makes some small updates to the descriptions of the highlighted attributes in the source configuration form.
Fixes: HDX-2852
If a user used the time picker and chose today, but with a time in the future, existing code would "infer" that and assume it was for an earlier year. Since the UI time picker doesn't allow users to select a _day_ in the future, we should be smart enough to clamp the time down to `now()`.
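
As a rough illustration of the clamping idea described above, here is a minimal sketch. The helper name `clampToNow` is hypothetical and is not taken from the PR; the real parsing code may handle this differently.

```typescript
// Hypothetical helper (not from the PR): if the picked time is in the future,
// fall back to "now" instead of guessing an earlier year.
function clampToNow(picked: Date, now: Date = new Date()): Date {
  return picked.getTime() > now.getTime() ? new Date(now.getTime()) : picked;
}

// Example: the user picks "today" with a time that has not happened yet.
const exampleNow = new Date("2025-11-20T14:00:00Z");
const pickedLaterToday = new Date("2025-11-20T18:30:00Z");
const pickedEarlierToday = new Date("2025-11-20T09:00:00Z");

console.log(clampToNow(pickedLaterToday, exampleNow).toISOString()); // 2025-11-20T14:00:00.000Z
console.log(clampToNow(pickedEarlierToday, exampleNow).toISOString()); // 2025-11-20T09:00:00.000Z
```

Passing `now` as a parameter keeps the helper easy to test with a fixed clock.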
Closes HDX-2874
# Summary
This PR hides the table header row when there are no displayed columns. This prevents a glitchy behavior where the table header icons would appear vertically aligned rather than in a horizontal row when loading the table data. This occurred because the table header was rendered with an empty, horizontally-skinny column header for the expand button column, and the icons were in that skinny column header.
I'd recommend reviewing with white space changes hidden - this is a very small change.
## Before
https://github.com/user-attachments/assets/fceb489b-d79d-40f8-99ba-d9e4c2c5ee27
## After
https://github.com/user-attachments/assets/3b382e08-43b7-49e4-81c6-45bb2aa00688
Users reported that the value's precision was way off from the threshold's. This change helps ensure the two numbers have the same precision.
Before:
<img width="1280" height="363" alt="image" src="https://github.com/user-attachments/assets/fc1bc72c-a70e-4068-aa06-3a01d6c65b2b" />
After:
<img width="1446" height="618" alt="Screenshot 2025-11-19 at 4 20 38 PM" src="https://github.com/user-attachments/assets/49be78eb-dac9-49f4-b490-a354fb69fb71" />
**Note:** One thing that could be better is if we instead used the Number Format specified on the frontend. This would require us to move the Numbro dependency and logic into common-utils, and we would also probably want to update the alert value UI to use numbro. I can take a stab at this if we think it's better. I figured this was a good interim solution.
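
One way to picture "same precision" is to format the value with as many decimal places as the threshold has. The sketch below assumes that interpretation; `decimalPlaces` and `formatLikeThreshold` are hypothetical names, and the PR's actual logic may differ.

```typescript
// Count the digits after the decimal point of a plain decimal number.
// Simplification: numbers rendered in exponent notation are treated as 0 places.
function decimalPlaces(n: number): number {
  const text = n.toString();
  if (text.includes("e")) return 0;
  const dot = text.indexOf(".");
  return dot === -1 ? 0 : text.length - dot - 1;
}

// Hypothetical helper: render the alert value with the threshold's precision.
function formatLikeThreshold(value: number, threshold: number): string {
  return value.toFixed(decimalPlaces(threshold));
}

console.log(formatLikeThreshold(12.3456789, 10.5)); // "12.3"
console.log(formatLikeThreshold(0.123456, 0.05)); // "0.12"
console.log(formatLikeThreshold(99.9, 100)); // "100"
```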
Fixes HDX-2847
Closes HDX-2753
Closes HDX-2755
Closes HDX-2756
# Summary
This PR adds a search function for filtering spans and logs in the trace waterfall. Because the waterfall can consist of both spans and logs, from two different sources, there is one input per source. If the correlated log source is not available (ensuring there are no logs in the waterfall) then there is only one input.
The input persists in the query parameters so that the filtered side panel can be shared, and the parameters are cleared when the side panel closes.
Currently, only Lucene is supported for searching.
This PR also adds a couple of minor improvements to the waterfall:
1. There is now a count of spans and errors
2. There is now a span status in the waterfall tooltip
## Demo
https://github.com/user-attachments/assets/fb623875-5811-4f7f-9f40-c0b34de1c541
Closes HDX-2858
# Summary
This PR fixes a bug on the sessions page which caused any search submission to revert the source select back to its default.
The cause of this bug was that the form submission was updating the wrong URL query parameter (`source` instead of `sessionSource`), and form submission would trigger the select to take whatever (original) value was set for `sessionSource`.
Ref HDX-2752
# Summary
This PR adds a new Source setting which specifies additional Span-Level attributes which should be displayed above the Trace Waterfall. The attributes may be specified for both Log and Trace kind sources.
These are displayed on the trace panel rather than at the top of the side panel because they are trace level (from any span in the trace), whereas the top of the side panel shows information specific to the span. In a followup PR, we'll add row-level attributes, which will appear at the top of the side panel.
Notes:
1. The attributes are pulled from the log and trace source (if configured for each) despite the search page only being set to query one of those sources. If an attribute value comes from the log source while the trace source is currently configured on the search page, the attribute tag will not offer the option to show the "Add to search" button and instead will just show a "search this value" option, which navigates the search page to search the value in the correct source.
## Demo
First, source is configured with highlighted attributes:
<img width="954" height="288" alt="Screenshot 2025-11-14 at 3 02 25 PM" src="https://github.com/user-attachments/assets/e5eccbfa-3389-4521-83df-d63fcb13232a" />
Values for those attributes within the trace show up on the trace panel above the waterfall:
<img width="1257" height="152" alt="Screenshot 2025-11-14 at 3 02 59 PM" src="https://github.com/user-attachments/assets/7e3e9d87-5f3a-409b-9b08-054d6e460970" />
Values are searchable when clicked, and if a lucene version of the property is provided, the lucene version will be used in the search box
<img width="1334" height="419" alt="Screenshot 2025-11-14 at 3 03 10 PM" src="https://github.com/user-attachments/assets/50d98f99-5056-48ce-acca-4219286a68f7" />
Closes #1324
Closes HDX-2724
This PR fixes a bug that caused the Side Panel to close when it was clicked on a dashboard with more than one Search Table tile. The fix was to remove the `useClickOutside` hook in favor of Mantine's default behavior, which is to close when clicking outside the drawer.
# Summary
This PR makes a number of minor fixes and improvements to the Service Maps feature:
1. The Service Map now has its own tab in the side panel. This resolves usability issues such as the trace panel capturing scroll events and appearing too large on the side panel. Closes HDX-2785, Closes HDX-2732.
2. On single-trace service maps (eg. the one on the side panel), request counts are now rendered as exact numbers (eg. `1 request`), rather than approximate numbers (eg. `~1 request`). Closes HDX-2741.
3. Service map viewport bounds are now reset when the input data changes (typically when the source or sampling level changes). Closes HDX-2778.
4. Service maps now have an empty state. Closes HDX-2739.
<img width="1359" height="902" alt="Screenshot 2025-11-11 at 11 00 05 PM" src="https://github.com/user-attachments/assets/6d8c7fda-bf4e-4dbe-83e4-6395f53511cb" />
<img width="1365" height="910" alt="Screenshot 2025-11-11 at 11 05 13 PM" src="https://github.com/user-attachments/assets/af5218f9-43f8-4536-abee-5ce090cf0438" />
Emits success/failure counter values as well as execution duration as a gauge for task execution. This allows monitoring the background task health using HyperDX alerts.
# Summary
This PR fixes a few bugs in the session search page:
1. Clicking ENTER now triggers a form submission on the session page for lucene conditions (SQL conditions already worked)
2. Clicking ENTER now triggers a form submission on the session side panel for both lucene and SQL conditions
3. The WHERE condition in the search sidebar is now interpreted in the correct `whereLanguage` instead of assuming lucene. Partially reverts #863, but I confirmed that the page-level search does not filter the sidepanel spans after this change.
This PR also fixes the same issue (ENTER now submits forms) on the dashboard and services page. #1208 introduced the issue by preventing the ENTER event from bubbling up to the form when using `AutocompleteInput` / `SearchInputV2`.
Closes HDX-2816
Closes HDX-2817
https://github.com/user-attachments/assets/b91bdb0f-e241-43c2-9854-88fbe43daec7
Closes HDX-2772
This PR adds a filter that allows for quickly viewing just root spans from a Trace source.
Notes:
- The state of this filter is persisted in the URL Query Params
- This filter is not persisted in a saved search to match the behavior of other filters, which are not persisted in saved searches.
<img width="1237" height="833" alt="Screenshot 2025-11-12 at 3 56 18 PM" src="https://github.com/user-attachments/assets/9e6b461d-f201-4521-b546-15f986c7ec5b" />
<img width="1252" height="693" alt="Screenshot 2025-11-12 at 3 56 32 PM" src="https://github.com/user-attachments/assets/0aa02818-93ba-4f57-96fd-58c46aac3d9d" />
Avoid awaiting the call to `add()`. Doing so waits not only for the function to be enqueued, but also for it to finish executing.
This section of the [documentation](https://www.npmjs.com/package/p-queue) is key:
> [!IMPORTANT] If you await this promise, you will wait for the task to finish running, which may defeat the purpose of using a queue for concurrency. See the [Usage](https://www.npmjs.com/package/p-queue#usage) section for examples.
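
To make the difference concrete without pulling in the dependency, the sketch below uses a tiny stand-in queue (`TinyQueue`, hypothetical, not p-queue itself) whose `add()` returns a promise that resolves when the task finishes, which is the behavior the quoted documentation describes. The `enqueueAll` helper is also an assumption for illustration.

```typescript
// Minimal serial queue used only for illustration. Like the documented
// behavior above, add() resolves when the task has FINISHED running.
class TinyQueue {
  private tail: Promise<unknown> = Promise.resolve();

  add<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // Keep the chain alive even if a task rejects.
    this.tail = result.catch(() => undefined);
    return result;
  }

  // Resolves once every task enqueued so far has settled.
  onIdle(): Promise<void> {
    return this.tail.then(() => undefined);
  }
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Enqueue tasks WITHOUT awaiting add(): the loop only schedules work.
function enqueueAll(queue: TinyQueue, names: string[], log: string[]): void {
  for (const name of names) {
    queue
      .add(async () => {
        await sleep(10);
        log.push(`done ${name}`);
      })
      // Errors still need handling, since nothing awaits the promise.
      .catch((err) => log.push(`failed ${name}: ${String(err)}`));
    log.push(`enqueued ${name}`);
  }
}

const demoLog: string[] = [];
const demoQueue = new TinyQueue();
enqueueAll(demoQueue, ["a", "b"], demoLog);
// At this point both tasks are enqueued but neither has finished.
demoQueue.onIdle().then(() => console.log(demoLog.join(", ")));
```

If the loop awaited each `add()` call instead, every iteration would block until its task completed, and the queue would never hold more than one task.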
Fixes: HDX-2787
During live tail, the date range changes every few seconds (e.g., from 9:00-9:15 to 9:02-9:17, etc...). The original aliasMap query key included the entire config object, which contains the dateRange property. Every date range change triggered a refetch of the alias map, even though aliases are derived from the SELECT statement and not from the date range.
While refetching, react-query sets aliasMap to undefined. This caused column IDs to change. React-table uses column IDs as keys to track resize state, so when the ID changes, it loses the stored width and resets to the default size, causing the visible jitter.
Now we have a consistent aliasMap with the added benefit of fewer network requests.
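
The idea of keeping `dateRange` out of the query key can be sketched as follows. The `ChartConfig` shape and the `aliasMapQueryKey` helper are simplified assumptions for illustration, not the PR's actual code.

```typescript
// Hypothetical, simplified config shape; the real config object has more fields.
interface ChartConfig {
  select: string;
  from: string;
  where: string;
  dateRange: [Date, Date];
}

// Build the cache key from everything except dateRange, so that a sliding
// live-tail window does not produce a new key (and therefore a refetch).
function aliasMapQueryKey(config: ChartConfig): unknown[] {
  const { dateRange, ...stableConfig } = config;
  void dateRange; // intentionally excluded from the key
  return ["aliasMap", stableConfig];
}

const base = { select: "count() AS total", from: "otel_logs", where: "" };
const keyAt900 = aliasMapQueryKey({
  ...base,
  dateRange: [new Date("2025-11-20T09:00:00Z"), new Date("2025-11-20T09:15:00Z")],
});
const keyAt902 = aliasMapQueryKey({
  ...base,
  dateRange: [new Date("2025-11-20T09:02:00Z"), new Date("2025-11-20T09:17:00Z")],
});

// Both windows serialize to the same key.
console.log(JSON.stringify(keyAt900) === JSON.stringify(keyAt902)); // true
```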
Closes HDX-2568
# Summary
This PR adds zero-filling to alert evaluation, meaning that periods with no data returned in the alert query will be interpreted as a 0 value, which will (a) cause BELOW-threshold alerts to ALERT and (b) cause ABOVE-threshold alerts to auto-resolve.
Grouped alerts do not have this behavior except when no group returns any data, since it is not always possible to know which groups should have data. A note about this behavior has been added to the UI. Groups are still auto-resolved (when appropriate) when the following period has no data for that group.
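
A minimal sketch of zero-filling for an ungrouped alert, under stated assumptions: the `Bucket` shape, the helper names, and the threshold comparisons (`>=` for above, `<` for below) are illustrative and may not match the actual evaluation code.

```typescript
interface Bucket {
  start: number; // bucket start, epoch milliseconds
  value: number;
}

// Return one bucket per expected period in [from, to), using 0 for any
// period the alert query returned no row for.
function zeroFill(rows: Bucket[], from: number, to: number, stepMs: number): Bucket[] {
  const byStart = new Map(rows.map((r) => [r.start, r.value] as const));
  const filled: Bucket[] = [];
  for (let start = from; start < to; start += stepMs) {
    filled.push({ start, value: byStart.get(start) ?? 0 });
  }
  return filled;
}

type AlertState = "ALERT" | "OK";

// Assumed comparison semantics, for illustration only.
function evaluate(value: number, threshold: number, type: "above" | "below"): AlertState {
  const breached = type === "above" ? value >= threshold : value < threshold;
  return breached ? "ALERT" : "OK";
}

const minute = 60_000;
// Only the first of three one-minute periods returned data.
const filledBuckets = zeroFill([{ start: 0, value: 5 }], 0, 3 * minute, minute);
console.log(filledBuckets.map((b) => b.value)); // [ 5, 0, 0 ]
console.log(evaluate(filledBuckets[2].value, 1, "below")); // ALERT
console.log(evaluate(filledBuckets[2].value, 1, "above")); // OK
```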
<img width="773" height="460" alt="Screenshot 2025-11-10 at 8 53 05 AM" src="https://github.com/user-attachments/assets/82f6ced1-9bd1-41cd-832d-7bf8abb1253d" />
Closes HDX-2728
# Summary
This PR groups AlertHistory records by `createdAt` time to avoid showing multiple alert histories for the same time on the alerts page. There can be multiple AlertHistory records for the same `createdAt` time for grouped alerts.
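
The grouping can be pictured with the sketch below. The record shapes and the rule that a merged entry is in the `ALERT` state if any of its records is are assumptions for illustration; the PR may perform the grouping differently (for example, in the database query).

```typescript
// Hypothetical, simplified record shapes.
interface AlertHistory {
  createdAt: Date;
  state: "ALERT" | "OK";
  group?: string;
}

interface GroupedHistory {
  createdAt: Date;
  state: "ALERT" | "OK";
  count: number; // number of records merged into this entry
}

// Merge records that share a createdAt timestamp, newest first.
function groupByCreatedAt(histories: AlertHistory[]): GroupedHistory[] {
  const groups = new Map<number, GroupedHistory>();
  for (const h of histories) {
    const key = h.createdAt.getTime();
    const existing = groups.get(key);
    if (existing) {
      existing.count += 1;
      // Assumption: any alerting group marks the whole entry as ALERT.
      if (h.state === "ALERT") existing.state = "ALERT";
    } else {
      groups.set(key, { createdAt: h.createdAt, state: h.state, count: 1 });
    }
  }
  return Array.from(groups.values()).sort(
    (a, b) => b.createdAt.getTime() - a.createdAt.getTime(),
  );
}

const t1 = new Date("2025-11-07T16:40:00Z");
const t2 = new Date("2025-11-07T16:45:00Z");
const groupedHistories = groupByCreatedAt([
  { createdAt: t1, state: "OK", group: "service-a" },
  { createdAt: t1, state: "ALERT", group: "service-b" },
  { createdAt: t2, state: "OK", group: "service-a" },
]);
console.log(groupedHistories.length); // 2
```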
## Testing
To test this, set up a Saved Search alert with a group by configured, then navigate to the alerts page to see one history per time:
<img width="1466" height="154" alt="Screenshot 2025-11-07 at 4 46 40 PM" src="https://github.com/user-attachments/assets/ccc48ba0-07b2-48b1-ad25-de8c88467611" />
<img width="791" height="773" alt="Screenshot 2025-11-07 at 4 46 30 PM" src="https://github.com/user-attachments/assets/2ab0f0c6-1d46-4c65-9fbb-cf4c5d62580e" />
This PR extracts a couple of functions out of the excessively long `processAlert` function, to improve readability.
This is also intended to simplify HDX-2568, which will be able to re-use some of these functions.
This is the minimal set of changes needed to improve the Kubernetes dashboard with 100k+ pods.
**Changes:**
* Fixed a performance issue in ChartUtils that caused computation to be O(n^2). This caused charts to slow down rendering to a crawl and freeze the page. It's not as noticeable with a smaller data set. This was the main issue.
* Limited the number of items returned in the nodes, namespaces, and pods tables to 10k. This was the second biggest issue.
* Introduced a virtualized table to each of the tables to speed up rendering. This was the third biggest issue.
* Increased the amount of unique items returned from the metadata query so that users can filter for the items they need (UX improvement)
**Future changes that will improve the experience even more:**
1) Fetch 10k, but add pagination (UX)
2) Improve query for fetching tabular data. It's timeseries, but realistically, we can make a smarter more performant query
3) To fill out the data in the tables (cpu, memory, uptime, etc...) we make separate queries and combine them on the server side. We could make this one large query (we have an existing ticket in the backlog for it).
4) Chart rendering is very computationally intensive. It would be a better user experience to load these after the table loads.
**Outstanding (existing) bugs that exist that I will fix in follow-up tickets:**
1) The namespaces query uses the wrong time window. It does not respect the global time picker date range.
2) Sorting of the table columns is broken.
Ref: HDX-2370
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This PR refactors some common setup out of each `processAlert` integration test, as these tests are becoming very lengthy and hard to understand and write.
- Common mocks are now performed in beforeEach() instead of in each test
- `setupSavedSearchAlertTest` and `createAlertDetails` helpers are introduced to create the objects each test case depends on (connection, source, team, savedSearch, client, webhook, alert, AlertDetails)
- `processAlertAtTime` helper is introduced to run `processAlert` at a particular `now` time value, after querying the AlertHistory records that exist at that time.
Helps with HDX-2568
This was originally prevented by this useEffect: https://github.com/hyperdxio/hyperdx/pull/1305/files#diff-6a2491347ca591776e19bf42f3b0f76b4fb6ba15f6e70e697d45c30218997b69L739 but I think having logic that deviates from the URL is causing a lot of complexity in this page, so I personally think we should work towards making the URL the source of truth for the state instead.
Original Bug Reproduction:
1. Increase time range beyond default live tail duration
2. Click on a histogram bar and then click "View Events"
3. Note that the time range is updated
4. Wait a few seconds and the time range will be incorrectly reverted back to the default live tailing date range
Fix:
1. Do reproduction steps as above
2. At step 4, see that URL is displayed as intended.
Fixes HDX-2718
# Summary
This PR updates HyperDX's lucene support to include parenthesized field searches of the form `<field>:(<term>...)`.
Prior to these changes, HyperDX would ignore the `<field>` entirely and search as if the query were just `<term>...`.
With these changes, the search is performed just like a `<term>...` search except:
1. The `field` is used for the search, instead of the implicit field expression (eg. `Body` for `otel_logs`)
2. The search is performed without `hasToken()`, as we assume that fields do not have bloom filters set up (matching the current behavior for how we search fields)
This support has the added benefit of unlocking multi-token substring searches (Ref HDX-1931)
- Previously, you could not search a field for a substring with multiple tokens, eg `error.message:*Method not allowed*` is interpreted as 3 separate terms, and only `*Method` would be associated with `error.message`. `error.message:"Method not allowed"` and `error.message:"*Method not allowed*"` look for exact matches, instead of substrings.
- Now, this can be accomplished with `error.message:("Method not allowed")`. This matches the current behavior of a search like `"Method not allowed"`, which would search the source's default implicit column (eg. `Body`) for the substring "Method not allowed".
## Testing
To test these changes, this PR adds a few dozen query parser unit test cases.
Small improvements to heatmap logic:
1. Improve the logic around filtering the outliers. Previously it was hardcoded to Duration; now it will correctly use the `Value` from the user. If the value contains an aggregate function, it will also use a CTE to calculate it properly.
2. If the outliers query fails, we show the user the query error
3. We prioritize the outlier keys in the event deltas view over inliers (this was how it was before; now it only includes inliers if no outliers are found)
4. Ensure the autocomplete suggestions are displayed (there was a z-index issue)
Closes HDX-2699
# Summary
This PR adds a Service Map feature to HyperDX, based on (sampled) trace data.
## Demo
https://github.com/user-attachments/assets/602e9b42-1586-4cb1-9c99-024c7ef9d2bb
## How the service map is constructed
The service map is created by querying client-server (or producer-consumer) relationships from a Trace source. Two spans have a client-server/producer-consumer relationship if (a) they have the same trace ID and (b) the server/consumer's parent span ID is equal to the client/producer's span ID. This is accomplished via a self-join on the Trace table (the query can be found in `useServiceMap.ts`).
To help keep this join performant, users can set a sampling level as low as 1% and up to 100%. Lower sampling levels will result in fewer rows being joined, and thus a faster service map load. Sampling is done on `cityHash64(TraceId)` to ensure that either a trace is included in its entirety or not included at all.
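
The join condition and trace-level sampling described above can be sketched in memory as follows. This is not the query from `useServiceMap.ts`: the `Span` shape and helper names are hypothetical, and a 32-bit FNV-1a hash stands in for `cityHash64` purely to show that sampling by a hash of the trace ID keeps or drops each trace as a whole.

```typescript
// Hypothetical, simplified span shape.
interface Span {
  traceId: string;
  spanId: string;
  parentSpanId: string;
  serviceName: string;
  kind: "client" | "server" | "producer" | "consumer" | "internal";
}

// Stand-in for cityHash64: deterministic 32-bit FNV-1a over the trace ID.
function hashTraceId(traceId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < traceId.length; i++) {
    hash ^= traceId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Every span of a trace gets the same decision, so traces are never split.
function isSampled(traceId: string, percent: number): boolean {
  return hashTraceId(traceId) % 100 < percent;
}

// In-memory equivalent of the self-join: match each server/consumer span to
// the client/producer span with the same trace ID whose span ID is its parent.
function buildEdges(spans: Span[], percent: number): Map<string, number> {
  const sampled = spans.filter((s) => isSampled(s.traceId, percent));
  const callers = new Map<string, Span>();
  for (const s of sampled) {
    if (s.kind === "client" || s.kind === "producer") {
      callers.set(`${s.traceId}/${s.spanId}`, s);
    }
  }
  const edges = new Map<string, number>();
  for (const s of sampled) {
    if (s.kind !== "server" && s.kind !== "consumer") continue;
    const caller = callers.get(`${s.traceId}/${s.parentSpanId}`);
    if (!caller) continue;
    const key = `${caller.serviceName} -> ${s.serviceName}`;
    edges.set(key, (edges.get(key) ?? 0) + 1);
  }
  return edges;
}

const exampleSpans: Span[] = [
  { traceId: "t1", spanId: "a", parentSpanId: "", serviceName: "web", kind: "client" },
  { traceId: "t1", spanId: "b", parentSpanId: "a", serviceName: "api", kind: "server" },
  { traceId: "t2", spanId: "a", parentSpanId: "", serviceName: "web", kind: "client" },
  { traceId: "t2", spanId: "c", parentSpanId: "a", serviceName: "api", kind: "server" },
];
console.log(buildEdges(exampleSpans, 100).get("web -> api")); // 2
```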
moves them into a core folder, this allows us to easily track when core files are modified via path
no changeset because no version bump required
fixes HDX-2589