# HyperDX Development Guide
## What is HyperDX?
HyperDX is an observability platform that helps engineers search, visualize, and
monitor logs, metrics, traces, and session replays. It's built on ClickHouse for
blazing-fast queries and supports OpenTelemetry natively.

**Core value**: Unified observability with ClickHouse performance,
schema-agnostic design, and correlation across all telemetry types in one place.
## Architecture (WHAT)
This is a **monorepo** with three main packages:
- `packages/app` - Next.js frontend (TypeScript, Mantine UI, TanStack Query)
- `packages/api` - Express backend (Node.js 22+, MongoDB for metadata,
ClickHouse for telemetry)
- `packages/common-utils` - Shared TypeScript utilities for query parsing and
validation

**Data flow**: Apps → OpenTelemetry Collector → ClickHouse (telemetry data) /
MongoDB (configuration/metadata)
## Development Setup (HOW)
```bash
yarn setup # Install dependencies
yarn dev # Start full stack with worktree-isolated ports
```
The project uses **Yarn 4.13.0** workspaces. Docker Compose manages ClickHouse,
MongoDB, and the OTel Collector.

**This repo is multi-agent friendly.** `yarn dev`, `make dev-int`, and
`make dev-e2e` all use slot-based port isolation so multiple worktrees can run
dev servers, integration tests, and E2E tests simultaneously without conflicts.
A dev portal at http://localhost:9900 auto-starts and shows all running stacks.
See [`agent_docs/development.md`](agent_docs/development.md) for the full
multi-worktree setup, port allocation tables, and available commands.
## Working on the Codebase (HOW)
**Before starting a task**, read relevant documentation from the `agent_docs/`
directory:
- `agent_docs/architecture.md` - Detailed architecture patterns and data models
- `agent_docs/tech_stack.md` - Technology stack details and component patterns
- `agent_docs/development.md` - Development workflows, testing, and common tasks
- `agent_docs/code_style.md` - Code patterns and best practices (read only when
actively coding)

**After finishing all code edits**, run `yarn lint:fix` to auto-fix formatting
and lint issues across all packages. Pre-commit hooks handle this when
committing, but if you finish edits without committing, run `yarn lint:fix`
before stopping.
## Key Principles
1. **Multi-tenancy**: All data is scoped to `Team` - ensure proper filtering
2. **Type safety**: Use TypeScript strictly; Zod schemas for validation
3. **Existing patterns**: Follow established patterns in the codebase - explore
similar files before implementing
4. **Component size**: Keep files under 300 lines; break down large components
5. **UI Components**: Use custom Button/ActionIcon variants (`primary`,
`secondary`, `danger`) - see `agent_docs/code_style.md` for required patterns
6. **Testing**: Tests live in `__tests__/` directories; use Jest for
unit/integration tests
## Running Tests
Each package has different test commands available:

**packages/app** (unit tests only):
```bash
cd packages/app
yarn ci:unit # Run unit tests
yarn dev:unit # Watch mode for unit tests
```
**packages/api** (integration tests only):
```bash
make dev-int-build # Build dependencies (run once before tests)
make dev-int FILE=<TEST_FILE_NAME> # Spins up Docker services and runs tests.
# Ctrl-C to stop and wait for all services to tear down.
```
**packages/common-utils** (both unit and integration tests):
```bash
cd packages/common-utils
yarn ci:unit # Run unit tests
yarn dev:unit # Watch mode for unit tests
yarn ci:int # Run integration tests
yarn dev:int # Watch mode for integration tests
```
To run a specific test file or pattern:
```bash
yarn ci:unit <path/to/test.ts> # Run specific test file
yarn ci:unit --testNamePattern="test name pattern" # Run tests matching pattern
```
**Lint, type check, and unit tests across all packages:**
```bash
make ci-lint # Lint + TypeScript check across all packages
make ci-unit # Unit tests across all packages
```
**E2E tests (Playwright):**
```bash
# First-time setup (install Chromium browser):
cd packages/app && yarn playwright install chromium
# Run all E2E tests:
make e2e
# Run a specific test file (dev mode: hot reload):
make dev-e2e FILE=navigation # Match files containing "navigation"
make dev-e2e FILE=navigation GREP="help menu" # Also filter by test name
make dev-e2e GREP="should navigate" # Filter by test name across all files
make dev-e2e FILE=navigation REPORT=1 # Open HTML report after run
make dev-e2e-clean # Remove test artifacts
```
## Important Context
- **Authentication**: Passport.js with team-based access control
- **State management**: Jotai (client), TanStack Query (server), URL params
(filters)
- **UI library**: Mantine components are the standard (not custom UI)
- **Database patterns**: MongoDB for metadata with Mongoose, ClickHouse for
telemetry queries
## PR Hygiene for Agent-Generated Code
When using agentic tools to generate PRs, follow these practices to keep reviews
efficient and accurate:
1. **Scope PRs to a single logical change**, even if the agent can produce more
in one session. Smaller, focused PRs move through the review pipeline faster
and are easier to classify accurately.
2. **Write the PR description to explain intent (the "why"), not just what
changed.** Reviewers need to understand the goal to catch cases where the
agent solved the wrong problem or made a plausible-but-wrong trade-off.
3. **Name agent-generated branches with a `claude/`, `agent/`, or `ai/` prefix**
(e.g., `claude/add-rate-limiting`). This allows the PR triage classifier to
apply appropriate scrutiny and lets reviewers calibrate their attention.
4. **Write or update tests alongside the implementation**, not after. Configure
your agent to produce tests before writing implementation code. See the
Running Tests section above for the commands to use.
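
The branch-naming rule in item 3 can be checked mechanically. The sketch below is only an illustration: the function name `is_agent_branch` is hypothetical and not part of this repo, and it assumes the three prefixes listed above are the complete set.

```bash
# Sketch: succeed only when a branch name starts with one of the agent prefixes.
# ASSUMPTION: is_agent_branch is a hypothetical helper, not a repo script.
is_agent_branch() {
  case "$1" in
    # "?*" requires at least one character after the prefix.
    claude/?*|agent/?*|ai/?*) return 0 ;;
    *) return 1 ;;
  esac
}

branch="claude/add-rate-limiting"
if is_agent_branch "$branch"; then
  echo "$branch: agent prefix present"
else
  echo "$branch: missing claude/, agent/, or ai/ prefix"
fi
```

The `case` patterns match the whole string, so a name such as `feature/ai/foo` is rejected because the prefix is not at the start.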
## GitHub Action Workflow (when invoked via @claude)
When working on issues or PRs through the GitHub Action:
1. **Before writing any code**, post a comment outlining your implementation
plan — which files you'll change, what approach you'll take, and any
trade-offs or risks. Use `gh issue comment` for issues or `gh pr comment` for
PRs.
2. **After making any code changes**, always run these in order and fix any
failures before opening a PR:
- `make ci-lint` — lint + TypeScript type check
- `make ci-unit` — unit tests
3. Write a clear PR description explaining what changed and why.
## Git Commits
When committing code, use the git author's default profile (name and email from
git config). Do not add `Co-Authored-By` trailers.
**Pre-commit hooks must pass before committing.** Do not use `--no-verify` to
skip hooks. If the pre-commit hook fails (e.g. due to husky not being set up in
a worktree), run `npx lint-staged` manually before committing to ensure lint and
formatting checks pass. Fix any issues before creating the commit.
## Merge Conflict Resolution
1. **Never blindly pick a side.** Read both sides of every conflict to
understand the intent of each change before choosing a resolution.
2. **Refactor/move conflicts require extra verification.** When one side
refactored, moved, or extracted code (e.g., inline components to separate
files), always diff the discarded side against the destination files before
declaring the conflict resolved. Code can diverge after extraction — the
other branch may have made fixes or additions that the extracting branch
never picked up. A naive "keep ours" resolution silently drops those changes.
3. **Verify the result compiles.** After resolving, check for missing imports,
broken references, or type errors introduced by the resolution — especially
when discarding a side that added new dependencies or exports.
4. **Ask for help when uncertain.** If you are not 100% confident about which
side to keep, or whether a change can be safely discarded, stop and ask for
manual intervention rather than guessing. A wrong guess silently breaks
things; asking is always cheaper than debugging later.
## Cursor Cloud specific instructions
### Docker requirement
Docker must be installed and running before starting the dev stack or running
integration/E2E tests. The VM update script handles `yarn install` and
`yarn build:common-utils`, but the Docker daemon must already be running
beforehand.
### Starting the dev stack
`yarn dev` uses `sh -c` to source `scripts/dev-env.sh`, which contains
bash-specific syntax (`BASH_SOURCE`). On systems where `/bin/sh` is `dash`
(e.g. Ubuntu), this fails with "Bad substitution". Work around it by running
with bash directly:
```bash
bash -c 'export PATH="/workspace/node_modules/.bin:$PATH" && source ./scripts/dev-env.sh && yarn build:common-utils && dotenvx run --convention=nextjs -- docker compose -p "$HDX_DEV_PROJECT" -f docker-compose.dev.yml up -d && yarn app:dev'
```
Port isolation assigns a slot based on the worktree directory name. In the
default `/workspace` directory, the slot is **76**, so services are at:
- **App**: http://localhost:30276
- **API**: http://localhost:30176
- **ClickHouse**: http://localhost:30576
- **MongoDB**: localhost:30476
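
The slot-to-port mapping above can be sketched as follows. The base ports are inferred from the slot 76 values listed (for example, 30276 minus 76 gives 30200), and the 0-99 slot range and the function name `hdx_dev_ports` are assumptions made for illustration; `scripts/dev-env.sh` remains the source of truth.

```bash
# Sketch: derive dev-stack ports from a worktree slot.
# ASSUMPTION: base ports are inferred from the slot-76 values in this section;
# hdx_dev_ports is a hypothetical helper, not a script in this repo.
hdx_dev_ports() {
  case "$1" in
    [0-9]|[1-9][0-9]) ;;  # accept 0-99 without leading zeros
    *) echo "slot must be an integer in 0-99" >&2; return 1 ;;
  esac
  echo "api=$((30100 + $1))"
  echo "app=$((30200 + $1))"
  echo "mongo=$((30400 + $1))"
  echo "clickhouse=$((30500 + $1))"
}

hdx_dev_ports 76
```

The function sticks to POSIX `sh` syntax (`case` and `$(( ))`), so it also runs under `dash`.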
### Key commands reference
See the sections above and `agent_docs/development.md` for the full command
reference. Quick summary:
- `make ci-lint` — lint + TypeScript type check
- `make ci-unit` — unit tests (all packages)
- `make dev-int FILE=<name>` — integration tests (spins up Docker services)
- `make dev-e2e FILE=<name>` — E2E tests (Playwright)
### First-time registration
When the dev stack starts fresh (empty MongoDB), the app shows a registration
page. Create any account to get started — no external auth provider is needed.

---
_Need more details? Check the `agent_docs/` directory or ask which documentation
to read._