## Summary

This PR updates the `getTableMetadata` and `getSkipIndices` functions to handle distributed tables by looking up primary keys and indexes (respectively) from the underlying local table, since the distributed table does not have them.

- Source config inference works again
- The default order by optimization (adding `toStartOfXX()` to the search page order by when it's present in the primary key) now correctly applies when querying a distributed table source
- The date range filter now correctly filters on both `toStartOfXX(TimestampTime)` and `TimestampTime` when `toStartOfXX(TimestampTime)` is present in the primary key of the local table
- Source schema preview now shows both the distributed table and the local table when the source is defined by a distributed table
- Text indexes are now detected correctly for distributed tables

### Screenshots or video

https://github.com/user-attachments/assets/d1c60964-99f0-4470-9378-a812f963c692

When a text index is present, `hasAllTokens` is used:

<img width="848" height="139" alt="Screenshot 2026-03-16 at 10 55 24 AM" src="https://github.com/user-attachments/assets/2bd780dc-291d-495f-bd12-c636988648c1" />

### How to test locally or on Vercel

<details>
<summary>Testing locally, you'll need to create a distributed logs table with a local table that has a timestamp optimization:</summary>

```sql
-- Local table: holds the primary key, skip indexes, and the toStartOfMinute optimization
CREATE TABLE default.otel_logs_toStartOf on cluster hdx_cluster
(
    `Timestamp` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
    `TimestampTime` DateTime DEFAULT toDateTime(Timestamp),
    `TraceId` String CODEC(ZSTD(1)),
    `SpanId` String CODEC(ZSTD(1)),
    `TraceFlags` UInt8,
    `SeverityText` LowCardinality(String) CODEC(ZSTD(1)),
    `SeverityNumber` UInt8,
    `ServiceName` LowCardinality(String) CODEC(ZSTD(1)),
    `Body` String CODEC(ZSTD(1)),
    `ResourceSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `ResourceAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `ScopeSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `ScopeName` String CODEC(ZSTD(1)),
    `ScopeVersion` LowCardinality(String) CODEC(ZSTD(1)),
    `ScopeAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `LogAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1,
    INDEX idx_res_attr_key mapKeys(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_res_attr_value mapValues(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_scope_attr_key mapKeys(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_scope_attr_value mapValues(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_log_attr_key mapKeys(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_log_attr_value mapValues(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_body Body TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 8
)
ENGINE = MergeTree
PARTITION BY toDate(TimestampTime)
PRIMARY KEY (toStartOfMinute(TimestampTime), ServiceName, TimestampTime)
ORDER BY (toStartOfMinute(TimestampTime), ServiceName, TimestampTime, Timestamp)
TTL TimestampTime + toIntervalDay(30)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1;

-- Distributed table: same columns, no primary key or indexes of its own
CREATE TABLE default.otel_logs_toStartOf_distributed on cluster hdx_cluster
(
    `Timestamp` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
    `TimestampTime` DateTime DEFAULT toDateTime(Timestamp),
    `TraceId` String CODEC(ZSTD(1)),
    `SpanId` String CODEC(ZSTD(1)),
    `TraceFlags` UInt8,
    `SeverityText` LowCardinality(String) CODEC(ZSTD(1)),
    `SeverityNumber` UInt8,
    `ServiceName` LowCardinality(String) CODEC(ZSTD(1)),
    `Body` String CODEC(ZSTD(1)),
    `ResourceSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `ResourceAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `ScopeSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `ScopeName` String CODEC(ZSTD(1)),
    `ScopeVersion` LowCardinality(String) CODEC(ZSTD(1)),
    `ScopeAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `LogAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1))
)
ENGINE = Distributed('hdx_cluster', 'default', 'otel_logs_toStartOf', rand());

-- Add and materialize a text index on the local table
ALTER TABLE otel_logs_toStartOf ON CLUSTER hdx_cluster
    ADD INDEX text_idx(Body) TYPE text(tokenizer=splitByNonAlpha, preprocessor=lower(Body))
    SETTINGS enable_full_text_index=1;

ALTER TABLE otel_logs_toStartOf ON CLUSTER hdx_cluster MATERIALIZE INDEX text_idx;
```

</details>

<details>
<summary>To test text index detection, first enable full text indexes locally in your users.xml file</summary>

```xml
<clickhouse>
    <profiles>
        <default>
            ...
            <enable_full_text_index>1</enable_full_text_index>
        </default>
    </profiles>
    ...
</clickhouse>
```

</details>

### References

- Linear Issue: Closes HDX-3703
- Related PRs:
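
The lookup described in the summary can be illustrated with a minimal sketch. This is not the PR's implementation: the `TableRow` shape, `parseDistributedTarget`, and `resolveMetadataTable` are hypothetical names, and the sketch assumes the engine clause looks like the `Distributed('hdx_cluster', 'default', 'otel_logs_toStartOf', rand())` definition in the test setup, with the cluster, database, and table given as quoted literals.

```typescript
// Hypothetical row shape: only the fields this sketch needs.
interface TableRow {
  database: string;
  name: string;
  engine: string;
  engine_full: string;
}

interface TableRef {
  database: string;
  table: string;
}

// Pull the first three quoted arguments (cluster, database, local table)
// out of a Distributed(...) engine clause. Returns undefined when the clause
// does not match, e.g. when an argument is an expression instead of a literal.
function parseDistributedTarget(
  engineFull: string,
): (TableRef & { cluster: string }) | undefined {
  const match = engineFull.match(
    /Distributed\(\s*'([^']*)'\s*,\s*'([^']*)'\s*,\s*'([^']*)'/,
  );
  if (!match) return undefined;
  const [, cluster, database, table] = match;
  if (cluster === undefined || database === undefined || table === undefined) {
    return undefined;
  }
  return { cluster, database, table };
}

// Decide which table to read primary keys and skip indexes from:
// the underlying local table for a Distributed source, otherwise the table itself.
function resolveMetadataTable(row: TableRow): TableRef {
  if (row.engine === 'Distributed') {
    const target = parseDistributedTarget(row.engine_full);
    if (target) return { database: target.database, table: target.table };
  }
  return { database: row.database, table: row.name };
}
```

With the tables from the test setup, resolving `default.otel_logs_toStartOf_distributed` would point metadata queries at `default.otel_logs_toStartOf`, which is where the `toStartOfMinute(TimestampTime)` primary key and the text index live.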
HyperDX
HyperDX, a core component of ClickStack, helps engineers quickly figure out why production is broken by making it easy to search & visualize logs and traces on top of any ClickHouse cluster (imagine Kibana, for ClickHouse).
Documentation • Chat on Discord • Live Demo • Bug Reports • Contributing • Website
- 🕵️ Correlate/search logs, metrics, session replays and traces all in one place
- 📝 Schema agnostic, works on top of your existing ClickHouse schema
- 🔥 Blazing fast searches & visualizations optimized for ClickHouse
- 🔍 Intuitive full-text search and property search syntax (ex. `level:err`), SQL optional!
- 📊 Analyze trends in anomalies with event deltas
- 🔔 Set up alerts in just a few clicks
- 📈 Dashboard high cardinality events without a complex query language
- { Native JSON string querying
- ⚡ Live tail logs and traces to always get the freshest events
- 🔭 OpenTelemetry supported out of the box
- ⏱️ Monitor health and performance from HTTP requests to DB queries (APM)
Spinning Up HyperDX
HyperDX can be deployed as part of ClickStack, which includes ClickHouse, HyperDX, OpenTelemetry Collector and MongoDB.
docker run -p 8080:8080 -p 4317:4317 -p 4318:4318 docker.hyperdx.io/hyperdx/hyperdx-all-in-one
Afterwards, you can visit http://localhost:8080 to access the HyperDX UI.
If you already have an existing ClickHouse instance, want to use a single container locally, or are looking for production deployment instructions, you can view the different deployment options in our deployment docs.
If your server is behind a firewall, you'll need to open/forward ports 8080, 8000 and 4318 on your firewall for the UI, API and OTel collector respectively.
We recommend at least 4GB of RAM and 2 cores for testing.
Hosted ClickHouse Cloud
You can also deploy HyperDX with ClickHouse Cloud. You can sign up for free and get started in just minutes.
Instrumenting Your App
To get logs, metrics, traces, session replay, etc. into HyperDX, you'll need to instrument your app to collect and send telemetry data over to your HyperDX instance.
We provide a set of SDKs and integration options to make it easier to get started with HyperDX, such as Browser, Node.js, and Python.
You can find the full list in our docs.
OpenTelemetry
Additionally, HyperDX is compatible with OpenTelemetry, a vendor-neutral standard for instrumenting your application backed by CNCF. Supported languages/platforms include:
- Kubernetes
- JavaScript
- Python
- Java
- Go
- Ruby
- PHP
- .NET
- Elixir
- Rust
(Full list here)
Once HyperDX is running, you can point your OpenTelemetry SDK to the
OpenTelemetry collector spun up at http://localhost:4318.
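
As a rough illustration of what an SDK sends to that collector, the sketch below builds a single log record in the standard OTLP/HTTP JSON encoding and derives the logs endpoint (`/v1/logs`) from the collector base URL. The helper names `otlpLogsUrl` and `buildOtlpLogPayload` are hypothetical, and in practice an OpenTelemetry SDK would construct and send this for you. Depending on the deployment, the collector may also require authentication headers, which this sketch does not set.

```typescript
// Hypothetical helper: join the collector base URL with the OTLP/HTTP logs path.
function otlpLogsUrl(collectorBaseUrl: string): string {
  return collectorBaseUrl.replace(/\/+$/, '') + '/v1/logs';
}

// Hypothetical helper: build a one-record OTLP JSON logs payload.
// timeMs is a Unix timestamp in milliseconds; OTLP expects nanoseconds as a string.
function buildOtlpLogPayload(serviceName: string, message: string, timeMs: number) {
  const timeUnixNano = String(Math.trunc(timeMs)) + '000000';
  return {
    resourceLogs: [
      {
        resource: {
          attributes: [
            { key: 'service.name', value: { stringValue: serviceName } },
          ],
        },
        scopeLogs: [
          {
            logRecords: [
              {
                timeUnixNano,
                severityText: 'INFO',
                body: { stringValue: message },
              },
            ],
          },
        ],
      },
    ],
  };
}

const url = otlpLogsUrl('http://localhost:4318');
const payload = buildOtlpLogPayload('my-service', 'hello from OTLP', Date.now());
console.log(url, JSON.stringify(payload));
```

The payload can then be sent as an HTTP POST to the printed URL with a `Content-Type: application/json` header.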
Contributing
We welcome all contributions! There are many ways to contribute to the project, including but not limited to:
- Opening a PR (Contribution Guide)
- Submitting feature requests or bugs
- Improving our product or contribution documentation
- Voting on open issues or contributing use cases to a feature request
Motivation
Our mission is to help engineers ship reliable software. To enable that, we believe every engineer needs to be able to easily leverage production telemetry to quickly solve burning production issues.
However, in our experience, the existing tools we've used tend to fall short in a few ways:
- They're expensive, and the pricing has failed to scale with TBs of telemetry becoming the norm, leading to teams aggressively cutting the amount of data they can collect.
- They're hard to use, requiring full-time SREs to set up, and domain experts to use confidently.
- They require hopping from tool to tool (logs, session replay, APM, exceptions, etc.) to stitch together the clues yourself.
We hope you give HyperDX in ClickStack a try and let us know how we're doing!
Contact
HyperDX Usage Data
HyperDX collects anonymized usage data for open source deployments. This data
supports our mission for observability to be available to any team and helps
us support running our open source product in a variety of different environments.
While we hope you will continue to support our mission in this way, you may opt
out of usage data collection by setting the USAGE_STATS_ENABLED environment
variable to false. Thank you for supporting the development of HyperDX!