hyperdx/packages/app/tests/e2e
Warren Lee 6e8ddd3736
feat: isolate dev environment for multi-agent worktree support (#1994)
## Summary
- Isolate dev, E2E, and integration test environments so multiple git worktrees can run all three simultaneously without port conflicts
- Each worktree gets a deterministic slot (0-99) with unique port ranges: dev (30100-31199), E2E (20320-21399), CI integration (14320-40098)
- Dev portal dashboard (http://localhost:9900) auto-discovers all running stacks, streams logs, and provides a History tab for past run logs

## Port Isolation

| Environment | Port Range | Project Name |
|---|---|---|
| Dev stack | 30100-31199 | `hdx-dev-<slot>` |
| E2E tests | 20320-21399 | `e2e-<slot>` |
| CI integration | 14320-40098 | `int-<slot>` |

All three can run simultaneously from the same worktree with zero port conflicts.
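
The slot scheme can be pictured with a minimal sketch. The hash function, the `slotFor` and `portFor` names, and the idea that each service owns a block of 100 consecutive ports are assumptions for illustration; the real assignment lives in `scripts/dev-env.sh` and may differ.

```typescript
// Hypothetical sketch: map a worktree path to a slot in 0-99,
// then offset a per-service base port by that slot.
const SLOT_COUNT = 100;

// Simple deterministic string hash (an assumption, not the script's algorithm).
function slotFor(worktreePath: string): number {
  let hash = 0;
  for (let i = 0; i < worktreePath.length; i++) {
    hash = (hash * 31 + worktreePath.charCodeAt(i)) % 1000003;
  }
  return hash % SLOT_COUNT;
}

// blockStart is the first port of a service's block, e.g. 30100 for the dev range.
function portFor(blockStart: number, slot: number): number {
  if (Math.floor(slot) !== slot || slot < 0 || slot >= SLOT_COUNT) {
    throw new RangeError('slot must be an integer in 0-' + (SLOT_COUNT - 1));
  }
  return blockStart + slot;
}
```

Because the slot depends only on the worktree path, the same worktree always resolves to the same ports. In this sketch two worktrees that hash to the same slot would still collide.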

## Dev Portal Features

**Live tab:**
- Auto-discovers dev, E2E, and integration Docker containers + local services (API, App)
- Groups all environments for the same worktree into a single card
- SSE log streaming with ANSI color rendering, capped at 5000 lines
- Auto-starts in background from `make dev`, `make dev-e2e`, `make dev-int`

**History tab:**
- Logs archived to `~/.config/hyperdx/dev-slots/<slot>/history/` on exit (instead of deleted)
- Each archived run includes `meta.json` with worktree/branch metadata
- Grouped by worktree with collapsible cards, search by worktree/branch
- View any past log file in the same log panel, delete individual runs or clear all
- Custom dark-themed confirm modal (no native browser dialogs)

## What Changed

- **`scripts/dev-env.sh`** — Slot-based port assignments, portal auto-start, log archival on exit
- **`scripts/test-e2e.sh`** — E2E port range (20320-21399), log capture via `tee`, portal auto-start, log archival
- **`scripts/ensure-dev-portal.sh`** — Shared singleton portal launcher (works sourced or executed)
- **`scripts/dev-portal/server.js`** — Discovery for dev/E2E/CI containers, history API (list/read/delete), local service port probing
- **`scripts/dev-portal/index.html`** — Live/History tabs, worktree-grouped cards, search, collapse/expand, custom confirm modal, ANSI color log rendering
- **`docker-compose.dev.yml`** — Parameterized ports/volumes/project name with `hdx.dev.*` labels
- **`packages/app/tests/e2e/docker-compose.yml`** — Updated to new E2E port defaults
- **`Makefile`** — `dev-int`/`dev-e2e` targets with log capture + portal auto-start; `dev-portal-stop`; `dev-clean` stops everything + wipes slot data
- **`.env` files** — Ports use `${VAR:-default}` syntax across dev, E2E, and CI environments
- **`agent_docs/development.md`** — Full documentation for isolation, port tables, E2E/CI port ranges
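
For readers less familiar with shell parameter expansion: `${VAR:-default}` yields `default` when `VAR` is unset or empty. A rough TypeScript equivalent is sketched below; the `portFromEnv` helper and the `Env` type are hypothetical and not part of the repository.

```typescript
type Env = { [key: string]: string | undefined };

// Mirrors `${VAR:-default}`: fall back when the variable is unset or empty.
function portFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === '') {
    return fallback;
  }
  const port = Number(raw);
  if (isNaN(port) || Math.floor(port) !== port || port < 1 || port > 65535) {
    throw new Error(name + ' is not a valid port: ' + raw);
  }
  return port;
}
```

Passing the environment in as an argument keeps the helper easy to test; in a Node script the caller would typically pass `process.env`.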

## How to Use

```bash
# Start dev stack (auto-starts portal)
make dev

# Run E2E tests (auto-starts portal, separate ports)
make dev-e2e FILE=navigation

# Run integration tests (auto-starts portal, separate ports)
make dev-int FILE=alerts

# All three can run simultaneously from the same worktree
# Portal at http://localhost:9900 shows everything

# Stop portal
make dev-portal-stop

# Clean up everything (all stacks + portal + history)
make dev-clean
```

## Dev Portal

<img width="1692" height="944" alt="image" src="https://github.com/user-attachments/assets/6ed388a3-43bc-4552-aa8d-688077b79fb7" />

<img width="1689" height="935" alt="image" src="https://github.com/user-attachments/assets/8677a138-0a40-4746-93ed-3b355c8bd45e" />

## Test Plan
- [x] Run `make dev` — verify services start with slot-assigned ports
- [x] Run `make dev` in a second worktree — verify different ports, no conflicts
- [x] Run `make dev-e2e` and `make dev-int` simultaneously — no port conflicts
- [x] Open http://localhost:9900 — verify all stacks grouped by worktree
- [x] Click a service to view logs — verify ANSI colors render correctly
- [x] Stop a stack — verify logs archived to History tab with correct worktree
- [x] History tab — search, collapse/expand, view archived logs, delete
- [x] `make dev-clean` — stops everything, wipes slot data and history
2026-03-31 18:24:24 +00:00
components fix: Fix flaky E2E tests (#2013) 2026-03-30 18:28:42 +00:00
core [HDX-3796] Isolate E2E test environment with slot-based port assignment (#1983) 2026-03-26 18:19:14 +00:00
features feat: Add saved searches listing page (#2012) 2026-03-31 12:39:11 +00:00
fixtures chore: Use local clickhouse instance for playwright tests (#1711) 2026-02-13 15:43:12 +00:00
page-objects feat: Add saved searches listing page (#2012) 2026-03-31 12:39:11 +00:00
utils feat: isolate dev environment for multi-agent worktree support (#1994) 2026-03-31 18:24:24 +00:00
docker-compose.yml feat: isolate dev environment for multi-agent worktree support (#1994) 2026-03-31 18:24:24 +00:00
global-setup-fullstack.ts feat: isolate dev environment for multi-agent worktree support (#1994) 2026-03-31 18:24:24 +00:00
global-setup-local.ts chore: Use local clickhouse instance for playwright tests (#1711) 2026-02-13 15:43:12 +00:00
README.md chore: add playwright agents for cursor and claude (#1847) 2026-03-05 15:16:18 +00:00
seed-clickhouse.ts feat: isolate dev environment for multi-agent worktree support (#1994) 2026-03-31 18:24:24 +00:00

End-to-End Testing

This directory contains Playwright-based end-to-end tests for the HyperDX application. The tests are organized into core functionality and feature-specific test suites.

Prerequisites

  • Node.js (>=22.16.0 as specified in package.json)
  • Dependencies installed via yarn install
  • Development server running (automatically handled by test configuration)

Running Tests

Default: Full-Stack Mode

By default, make e2e runs tests in full-stack mode with MongoDB + API + local Docker ClickHouse for maximum consistency and real backend features:

# Run all tests (full-stack with MongoDB + API + local Docker ClickHouse)
make e2e

# For UI, specific tests, or other options, use the script from repo root:
./scripts/test-e2e.sh --ui                 # Run with Playwright UI
./scripts/test-e2e.sh --grep "@kubernetes"  # Run specific tests
./scripts/test-e2e.sh --grep "@smoke"
./scripts/test-e2e.sh --ui --last-failed   # Re-run only failed tests with UI

Optional: Local Mode (Frontend Only)

For faster iteration during development, use the script with --local to skip MongoDB and run frontend-only tests:

# From repo root - run local tests (no MongoDB, frontend only)
./scripts/test-e2e.sh --local
./scripts/test-e2e.sh --local --ui
./scripts/test-e2e.sh --local --grep "@search"

# From packages/app - run local tests (frontend only)
cd packages/app
yarn test:e2e --local

When to use local mode:

  • Quick frontend iteration during development
  • Testing UI components that don't need auth/persistence
  • Faster test execution when you don't need backend features

Direct Command Usage

From packages/app, you can use the test:e2e command with flags:

# Full-stack mode (default, with backend)
yarn test:e2e

# Local mode (frontend only)
yarn test:e2e --local

# Combine with other flags
yarn test:e2e --ui                    # UI mode (full-stack)
yarn test:e2e --ui --local            # UI mode (local)
yarn test:e2e --debug                 # Debug mode (full-stack)
yarn test:e2e --debug --local         # Debug mode (local)
yarn test:e2e --headed                # Visible browser (full-stack)
yarn test:e2e --headed --local        # Visible browser (local)

# Run specific test with any mode
yarn test:e2e tests/e2e/features/search/search.spec.ts
yarn test:e2e tests/e2e/features/dashboard.spec.ts --local

Watch mode (re-run on file save):

Playwright UI has built-in watch. Run with UI, then enable it per test:

./scripts/test-e2e.sh --keep-running --ui

In the Playwright UI sidebar, click the eye icon next to a test (or file/describe) to turn on watch for it. When you save changes to that test file, that test will re-run automatically.

Available flags:

  • --local - Run in local mode (frontend only), excludes @full-stack tests
  • --ui - Open Playwright UI for interactive debugging and watch mode
  • --debug - Run in debug mode with browser developer tools
  • --headed - Run tests in visible browser (default is headless)

Test Modes

Full-Stack Mode (Default)

Default behavior - runs with a real backend (MongoDB + API) and seeded test data in a local Docker ClickHouse.

What it includes:

  • MongoDB (port 29998) - authentication, teams, users, persistence
  • API Server (port 29000) - full backend logic
  • App Server (port 28081) - frontend
  • Local Docker ClickHouse (localhost:8123) - seeded E2E test data (logs/traces/metrics/K8s). Seeded timestamps span a past+future window (~1h past, ~2h future from seed time) so relative ranges like "last 5 minutes" keep finding data. If you run tests more than ~2 hours after the last seed, re-run the global setup (or full test run) to re-seed.
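
The re-seed rule for ClickHouse amounts to a simple window check. A minimal sketch, assuming the future window is exactly 2 hours (the text above only says ~2h) and using a hypothetical `needsReseed` helper:

```typescript
// Assumed size of the future window written at seed time.
const FUTURE_WINDOW_MS = 2 * 60 * 60 * 1000;

// Returns true once "now" has moved past the newest seeded timestamp,
// which is the point where re-running the global setup is advised.
function needsReseed(seededAtMs: number, nowMs: number): boolean {
  return nowMs - seededAtMs > FUTURE_WINDOW_MS;
}
```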

Benefits:

  • Test authentication flows (login, signup, teams)
  • Test persistence (saved searches, dashboards, alerts)
  • Test real API endpoints and backend logic
  • Consistent with production environment
  • All features work (auth, persistence, data querying)

# Default: full-stack mode
make e2e
./scripts/test-e2e.sh --grep "@kubernetes"   # from repo root, for specific tags

Local Mode (for testing frontend-only features)

Frontend + ClickHouse mode - skips MongoDB/API, uses local Docker ClickHouse with seeded test data.

Use for:

  • Quick frontend iteration during development
  • Testing UI components that don't need auth
  • Faster test execution when backend features aren't needed
  • Consistent test data (same as full-stack mode)

Limitations:

  • No authentication (no login/signup)
  • No persistence (can't save searches/dashboards via API)
  • No API calls (queries go directly to local ClickHouse)

Note: Uses the same Docker ClickHouse and seeded data as full-stack mode, ensuring consistency between local and full-stack tests.

# Opt-in to local mode for speed (from repo root)
./scripts/test-e2e.sh --local
./scripts/test-e2e.sh --local --grep "@search"

Writing Tests

Since full-stack is the default, all tests have access to authentication, persistence, and real backend features:

import { expect, test } from '../../utils/base-test';
import { SearchPage } from '../../page-objects/SearchPage';

test.describe('My Feature', { tag: '@full-stack' }, () => {
  test('should allow authenticated user to save search', async ({ page }) => {
    const ts = Date.now();
    const searchPage = new SearchPage(page);

    await searchPage.goto();
    await searchPage.openSaveSearchModal();
    await searchPage.savedSearchModal.saveSearchAndWaitForNavigation(
      `My Saved Search ${ts}`,
    );

    await expect(searchPage.alertsButton).toBeVisible();
  });
});

Note: Tag tests that require full-stack mode with @full-stack so that they are skipped when running ./scripts/test-e2e.sh --local.
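
The effect of the tag can be modelled in plain TypeScript. The `TestCase` shape and `selectTests` helper below are hypothetical; the actual script may rely on Playwright's own filtering rather than anything like this.

```typescript
interface TestCase {
  title: string;
  tags: string[];
}

// In local mode, drop every test tagged @full-stack; otherwise run everything.
function selectTests(tests: TestCase[], localMode: boolean): TestCase[] {
  if (!localMode) {
    return tests;
  }
  return tests.filter((t) => t.tags.indexOf('@full-stack') === -1);
}

const exampleTests: TestCase[] = [
  { title: 'saves a search', tags: ['@full-stack', '@search'] },
  { title: 'renders the search page', tags: ['@search'] },
];
const localRun = selectTests(exampleTests, true); // only 'renders the search page'
```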

Page Object Pattern

All UI interactions in spec files must go through page objects (page-objects/) and components (components/). Never use raw page.getByTestId(), page.locator(), or page.getByRole() directly in spec files. If a needed interaction doesn't exist in a page object, add it there first.
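
As an illustration of the pattern, a page object wraps locators and exposes intent-level methods. The sketch below uses small structural stand-ins (`PageLike`, `LocatorLike`) instead of importing Playwright so that it stays self-contained; the `ExamplePage` class, its test ID, and its route are hypothetical and do not correspond to files in page-objects/.

```typescript
// Minimal stand-ins for the parts of Playwright's Page and Locator used here.
interface LocatorLike {
  click(): Promise<void>;
}
interface PageLike {
  goto(url: string): Promise<unknown>;
  getByTestId(testId: string): LocatorLike;
}

// Hypothetical page object: selectors live here, never in spec files.
class ExamplePage {
  readonly saveButton: LocatorLike;
  private readonly page: PageLike;

  constructor(page: PageLike) {
    this.page = page;
    this.saveButton = page.getByTestId('example-save-button');
  }

  goto(): Promise<unknown> {
    return this.page.goto('/example');
  }

  save(): Promise<void> {
    return this.saveButton.click();
  }
}
```

A spec would then call `examplePage.save()` rather than reaching for `page.getByTestId(...)` itself, so a changed selector is fixed in one place.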

Data Isolation

Tests run in parallel and share a database. Use Date.now() to make every field that the API checks for uniqueness distinct, not just display names:

const ts = Date.now();
const name = `E2E Thing ${ts}`;
const url = `https://example.com/thing-${ts}`; // URL fields too, not just name

The webhook API enforces uniqueness on (team, service, url). A hardcoded URL will collide between parallel runs or retries and cause the form to stay open (API returns 400).
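
The collision can be reproduced with an in-memory stand-in for the uniqueness check. This simulates the behaviour described above and is not the API's implementation; the `WebhookRegistry` class and the `'slack'` service value are hypothetical.

```typescript
// Simulated uniqueness check on (team, service, url).
class WebhookRegistry {
  private keys: { [key: string]: true } = {};

  // Returns true when accepted, false for a duplicate
  // (the case where the real API is described as responding with 400).
  create(team: string, service: string, url: string): boolean {
    const key = [team, service, url].join('|');
    if (this.keys[key]) {
      return false;
    }
    this.keys[key] = true;
    return true;
  }
}

const registry = new WebhookRegistry();
const hardcodedUrl = 'https://example.com/thing';
const firstRun = registry.create('team-a', 'slack', hardcodedUrl); // true
const retryRun = registry.create('team-a', 'slack', hardcodedUrl); // false: same key
const uniqueRun = registry.create('team-a', 'slack', hardcodedUrl + '-' + Date.now()); // true
```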

Scoped Assertions

Never assert global counts — other tests' data is in the shared DB. Scope assertions to the current test's unique data:

// ❌ Brittle — other tests' alerts pollute the count
await expect(alertsPage.getAlertCards()).toHaveCount(1);

// ✅ Scoped to this test's data
await expect(
  alertsPage.pageContainer.getByRole('link').filter({ hasText: name }),
).toBeVisible();

AI-Assisted Test Writing

The project ships with AI tooling for generating, fixing, and planning E2E tests using a live browser via the Playwright MCP server.

Claude Code

Use the /playwright <description> skill. It orchestrates three agents:

  • playwright-test-generator - drives a real browser, executes steps live, writes spec code following HyperDX conventions
  • playwright-test-healer - debugs failing tests interactively using the MCP browser tools
  • playwright-test-planner - explores the UI and produces a structured test plan before writing code

/playwright write a test that creates an alert from a saved search

The skill automatically runs the test after generation and invokes the healer if it fails. Update .claude/skills/playwright/SKILL.md if the output doesn't match project conventions.

Cursor

The Playwright MCP server is pre-configured in .cursor/mcp.json. Enable it under Settings → Tools & MCP.

To write a test, reference the @playwright rule in your prompt — it loads all HyperDX conventions automatically:

@playwright write a new E2E test at packages/app/tests/e2e/features/search.spec.ts
that verifies a user can save a search and see it in the sidebar

To fix a failing test:

@playwright this test is failing with [error]. Debug and fix it using the Playwright MCP tools.

The @playwright rule is a thin wrapper that points to .claude/skills/playwright/SKILL.md as the single source of truth for conventions — so both Claude Code and Cursor stay in sync automatically.

Test Organization

tests/e2e/
├── core/                 # Core application functionality
│   └── navigation.spec.ts # Navigation and routing
├── features/             # Feature-specific tests
│   ├── alerts.spec.ts
│   ├── chart-explorer.spec.ts
│   ├── dashboard.spec.ts
│   ├── search/
│   │   ├── search.spec.ts
│   │   ├── search-filters.spec.ts
│   │   └── saved-search.spec.ts
│   ├── sessions.spec.ts
│   └── traces-workflow.spec.ts
└── utils/                # Test utilities and helpers
    └── base-test.ts

Debugging Tests

The test:e2e command supports flags for different modes:

Interactive Mode

Run tests with the Playwright UI for interactive debugging:

# Full-stack mode (default)
yarn test:e2e --ui

# Local mode (frontend only)
yarn test:e2e --ui --local

Debug Mode

Run tests in debug mode with browser developer tools:

# Full-stack mode (default)
yarn test:e2e --debug

# Local mode
yarn test:e2e --debug --local

CI Mode

Run tests in CI mode, which runs them in a Docker container with an environment similar to the one used in GitHub Actions:

yarn test:e2e:ci

Single Test Debugging

To debug a specific test file, pass the file path as an argument:

# Full-stack mode (default)
yarn test:e2e tests/e2e/features/search/search.spec.ts --debug

# Local mode
yarn test:e2e tests/e2e/features/search/search.spec.ts --debug --local

Headed Mode

Run tests in headed mode (visible browser):

# Full-stack mode (default)
yarn test:e2e --headed

# Local mode
yarn test:e2e --headed --local

Test Output and Reports

HTML Reports

After test execution, view the detailed HTML report:

yarn playwright show-report

The report includes:

  • Test execution timeline
  • Screenshots of failures
  • Video recordings of failed tests
  • Network logs and console output

Test Results

Test artifacts are stored in:

  • test-results/ - Screenshots, videos, and traces for failed tests
  • playwright-report/ - HTML report files

Trace Viewer

For detailed debugging of failed tests, use the trace viewer:

yarn playwright show-trace test-results/[test-name]/trace.zip

Configuration

The test configuration is defined in playwright.config.ts:

  • Base URL: http://localhost:8080 (configurable via PLAYWRIGHT_BASE_URL)
  • Test Timeout: 60 seconds (increased from default 30s to reduce flaky test failures)
  • Retries: 1 retry locally, 2 on CI
  • Workers: Undefined (uses Playwright defaults)
  • Screenshots: Captured on failure only
  • Videos: Recorded and retained on failure
  • Traces: Collected on first retry
  • Global Setup: Ensures server readiness before tests
  • Web Server: Automatically starts local dev server with local mode enabled
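
The settings above could be expressed roughly as follows. This is a sketch, not the project's playwright.config.ts: the `buildConfig` helper is hypothetical, and in a real config the object would typically be passed to `defineConfig` from `@playwright/test` (the import is omitted here to keep the sketch standalone).

```typescript
// Builds a config-shaped object from the values listed above.
function buildConfig(isCI: boolean, baseURL?: string) {
  return {
    timeout: 60000, // 60-second test timeout
    retries: isCI ? 2 : 1, // 2 retries on CI, 1 locally
    use: {
      // PLAYWRIGHT_BASE_URL would be passed in as baseURL by the caller.
      baseURL: baseURL || 'http://localhost:8080',
      screenshot: 'only-on-failure',
      video: 'retain-on-failure',
      trace: 'on-first-retry',
    },
  };
}
```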

Test Development

Writing Tests

Tests use the extended base test from utils/base-test.ts which provides:

  • Automatic handling of connection/sources
  • Tanstack Query devtools management
  • Network idle waiting after navigation

Best Practices

  • Use data test IDs for reliable element selection
  • Implement proper wait strategies for dynamic content
  • Group related assertions in test steps
  • Use descriptive test names and organize with appropriate tags
  • Clean up test data when necessary

Configuration Details

Port Configuration

Local Environment (make e2e):

  • MongoDB: 29998 (custom port to avoid conflicts)
  • API Server: 29000
  • App Server: 28081

CI Environment (GitHub Actions):

  • MongoDB: 27017 (default, accessed via service name mongodb)
  • API Server: 29000
  • App Server: 28081

The MongoDB port differs between local and CI to:

  • Avoid conflicts with existing MongoDB instances locally (port 27017)
  • Use standard ports in isolated CI containers (port 27017)
  • CI accesses MongoDB via hostname mongodb instead of localhost
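
The difference between the two environments comes down to the host and port in the connection string. A minimal sketch with a hypothetical `mongoUri` helper; the host and port values come from the lists above, while the database name is a placeholder supplied by the caller:

```typescript
// Local runs reach MongoDB on localhost:29998; CI reaches it via the
// service name "mongodb" on the default port 27017.
function mongoUri(isCI: boolean, dbName: string): string {
  const host = isCI ? 'mongodb' : 'localhost';
  const port = isCI ? 27017 : 29998;
  return 'mongodb://' + host + ':' + port + '/' + dbName;
}
```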

Playwright Configuration

The test setup uses Playwright's webServer array feature (v1.32+) to start multiple servers:

  • API server (port 29000) - loads .env.e2e configuration
  • App server (port 28081) - connects to API
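
For orientation, a two-entry `webServer` array has roughly the shape below. The commands are placeholders rather than the project's actual start scripts, and only the `command` and `port` options are shown.

```typescript
// Shape of a multi-server webServer setting; Playwright starts each entry
// and waits for its port before running tests.
const webServer = [
  { command: 'echo "start API server"', port: 29000 }, // placeholder command
  { command: 'echo "start app server"', port: 28081 }, // placeholder command
];
```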

Troubleshooting

Common Issues

Server connection errors:

  • Port 28081 (full-stack) or 8081 (local mode) already in use
  • Check that the development server started successfully
  • Verify environment variables in .env.e2e

MongoDB connection issues (full-stack mode):

  • Check port 29998 is available locally: lsof -i :29998
  • View MongoDB logs: docker compose -p e2e -f tests/e2e/docker-compose.yml logs
  • MongoDB is auto-managed by make e2e (default)
  • Note: CI uses port 27017 internally (accessed via service name)

Sources don't appear in UI:

  • Check API logs for setupTeamDefaults errors
  • Verify DEFAULT_SOURCES in .env.e2e points to local Docker ClickHouse (localhost:8123)
  • Ensure you registered a new user (DEFAULT_SOURCES only applies to new teams)

Tests can't find demo data:

  • Verify sources use default database with e2e_ prefixed tables
  • Check Network tab - should query localhost:8123
  • Verify a source is selected in UI dropdown

Flaky Tests

For intermittent failures:

  1. Check the HTML report for timing issues
  2. Review network logs for failed requests
  3. Consider if individual test steps need longer wait times (global timeout is now 60s)
  4. Use the trace viewer to analyze test execution

CI/CD Integration

Tests run in full-stack mode on CI (GitHub Actions) with:

  • MongoDB service container for authentication and persistence
  • Local Docker ClickHouse for telemetry data (same as local mode)
  • 60-second test timeout (same as local)
  • Multiple retry attempts (2 retries on CI vs 1 locally)
  • Artifact collection for failed tests
  • GitHub Actions integration for PR comments
  • Parallel execution across 4 shards for faster feedback
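
Sharding in Playwright is driven by the `--shard=<index>/<total>` CLI flag, with 1-based indices. The sketch below only generates those arguments for a 4-way split; the `shardArgs` helper is hypothetical, and how the CI workflow actually passes them is not shown in this README.

```typescript
// Builds one --shard argument per shard for an n-way split.
function shardArgs(total: number): string[] {
  const args: string[] = [];
  for (let i = 1; i <= total; i++) {
    args.push('--shard=' + i + '/' + total);
  }
  return args;
}

const fourShards = shardArgs(4);
// ['--shard=1/4', '--shard=2/4', '--shard=3/4', '--shard=4/4']
```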