chore: CLAUDE.md refactor (#1437)

Inspiration: https://www.humanlayer.dev/blog/writing-a-good-claude-md?utm_source=tldrdev
Tom Alexander 2025-12-03 13:35:46 -05:00 committed by GitHub
parent b7789cedb7
commit bd96c98cbf
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
6 changed files with 310 additions and 313 deletions

346
CLAUDE.md

@ -1,336 +1,56 @@
# HyperDX Claude Agent Guide
# HyperDX Development Guide
This guide helps Claude AI agents understand and work effectively with the
HyperDX codebase.
## What is HyperDX?
## 🏗️ Project Overview
HyperDX is an observability platform that helps engineers search, visualize, and monitor logs, metrics, traces, and session replays. It's built on ClickHouse for blazing-fast queries and supports OpenTelemetry natively.
HyperDX is an observability platform built on ClickHouse that helps engineers
search, visualize, and monitor logs, metrics, traces, and session replays. It's
designed as an alternative to tools like Kibana but optimized for ClickHouse's
performance characteristics.
**Core value**: Unified observability with ClickHouse performance, schema-agnostic design, and correlation across all telemetry types in one place.
**Core Value Proposition:**
## Architecture (WHAT)
- Unified observability: correlate logs, metrics, traces, and session replays in
one place
- ClickHouse-powered: blazing fast searches and visualizations
- OpenTelemetry native: works out of the box with OTEL instrumentation
- Schema agnostic: works on top of existing ClickHouse schemas
This is a **monorepo** with three main packages:
## 📁 Architecture Overview
- `packages/app` - Next.js frontend (TypeScript, Mantine UI, TanStack Query)
- `packages/api` - Express backend (Node.js 22+, MongoDB for metadata, ClickHouse for telemetry)
- `packages/common-utils` - Shared TypeScript utilities for query parsing and validation
HyperDX follows a microservices architecture with clear separation between
components:
**Data flow**: Apps → OpenTelemetry Collector → ClickHouse (telemetry data) / MongoDB (configuration/metadata)
### Core Services
- **HyperDX UI (`packages/app`)**: Next.js frontend serving the user interface
- **HyperDX API (`packages/api`)**: Node.js/Express backend handling queries and
business logic
- **OpenTelemetry Collector**: Receives and processes telemetry data
- **ClickHouse**: Primary data store for all telemetry (logs, metrics, traces)
- **MongoDB**: Metadata storage (users, dashboards, alerts, saved searches)
### Data Flow
1. Applications send telemetry via OpenTelemetry → OTel Collector
2. OTel Collector processes and forwards data → ClickHouse
3. Users interact with UI → API queries ClickHouse
4. Configuration/metadata stored in MongoDB
## 🛠️ Technology Stack
### Frontend (`packages/app`)
- **Framework**: Next.js 14 with TypeScript
- **UI Components**: Mantine UI library
- **State Management**: Jotai for global state, TanStack Query for server state
- **Charts/Visualization**: Recharts, uPlot
- **Code Editor**: CodeMirror (for SQL/JSON editing)
- **Styling**: SCSS + CSS Modules
### Backend (`packages/api`)
- **Runtime**: Node.js 22+ with TypeScript
- **Framework**: Express.js
- **Database**:
- ClickHouse (primary telemetry data)
- MongoDB (metadata via Mongoose)
- **Authentication**: Passport.js with local strategy
- **Validation**: Zod schemas
- **OpenTelemetry**: Self-instrumented with `@hyperdx/node-opentelemetry`
### Common Utilities (`packages/common-utils`)
- Shared TypeScript utilities for query parsing, ClickHouse operations
- Zod schemas for data validation
- SQL formatting and query building helpers
## 🏛️ Key Architectural Patterns
### Database Models (MongoDB)
All models follow consistent patterns with:
- Team-based multi-tenancy (most entities belong to a `team`)
- ObjectId references between related entities
- Timestamps for audit trails
- Zod schema validation
**Key Models:**
- `Team`: Multi-tenant organization unit
- `User`: Team members with authentication
- `Source`: ClickHouse data source configuration
- `Connection`: Database connection settings
- `SavedSearch`: Saved queries and filters
- `Dashboard`: Custom dashboard configurations
- `Alert`: Monitoring alerts with thresholds
### Frontend Architecture
- **Page-level components**: Located in `pages/` (Next.js routing)
- **Reusable components**: Located in `src/` directory
- **State management**:
- Server state via TanStack Query
- Client state via Jotai atoms
- URL state via query parameters
- **API communication**: Custom hooks wrapping TanStack Query
### Backend Architecture
- **Router-based organization**: Separate routers for different API domains
- **Middleware stack**: Authentication, CORS, error handling
- **Controller pattern**: Business logic separated from route handlers
- **Service layer**: Reusable business logic (e.g., `agentService`)
## 🔧 Development Environment
### Setup Commands
## Development Setup (HOW)
```bash
# Install dependencies and setup hooks
yarn setup
# Start full development stack (Docker + local services)
yarn dev
yarn setup # Install dependencies
yarn dev # Start full stack (Docker + local services)
```
### Key Development Scripts
The project uses **Yarn 4.5.1** workspaces. Docker Compose manages ClickHouse, MongoDB, and the OTel Collector.
- `yarn app:dev`: Start API, frontend, alerts task, and common-utils in watch
mode
- `yarn lint`: Run linting across all packages
- `yarn dev:int`: Run integration tests in watch mode
- `yarn dev:unit`: Run unit tests in watch mode (per package)
## Working on the Codebase (HOW)
### ⚠️ BEFORE COMMITTING - Run Linting Commands
**Before starting a task**, read relevant documentation from the `agent_docs/` directory:
**Claude AI agents must run these commands before any commit:**
- `agent_docs/architecture.md` - Detailed architecture patterns and data models
- `agent_docs/tech_stack.md` - Technology stack details and component patterns
- `agent_docs/development.md` - Development workflows, testing, and common tasks
- `agent_docs/code_style.md` - Code patterns and best practices (read only when actively coding)
```bash
# 1. Fix linting issues in modified packages
(cd packages/app && yarn run lint:fix)
(cd packages/api && yarn run lint:fix)
(cd packages/common-utils && yarn run lint:fix)
**Tools handle formatting and linting automatically** via pre-commit hooks. Focus on implementation; don't manually format code.
# 2. Check for any remaining linting issues from the main directory
yarn run lint
```
## Key Principles
**If linting issues remain after running lint:fix**: Some linting errors cannot
be automatically fixed and require manual intervention. If `yarn run lint` still
shows errors:
1. **Multi-tenancy**: All data is scoped to `Team` - ensure proper filtering
2. **Type safety**: Use TypeScript strictly; Zod schemas for validation
3. **Existing patterns**: Follow established patterns in the codebase - explore similar files before implementing
4. **Component size**: Keep files under 300 lines; break down large components
5. **Testing**: Tests live in `__tests__/` directories; use Jest for unit/integration tests
1. Read the linting error messages carefully to understand the issue
2. Manually fix the reported issues in the affected files
3. Re-run `yarn run lint` to verify all issues are resolved
4. Only commit once all linting errors are fixed
## Important Context
**Why this is necessary**: While the project has pre-commit hooks (`lint-staged`
with Husky) that automatically fix linting issues on commit, Claude AI agents do
not trigger these hooks. Therefore, you must manually run the lint:fix commands
before committing.
### Environment Configuration
- `.env.development`: Development environment variables
- Docker Compose manages ClickHouse, MongoDB, OTel Collector
- Hot reload enabled for all services in development
## 📝 Code Style & Patterns
### TypeScript Guidelines
- **Strict typing**: Avoid `any` type assertions (use proper typing instead)
- **Zod validation**: Use Zod schemas for runtime validation
- **Interface definitions**: Clear interfaces for all data structures
- **Error handling**: Proper error boundaries and serialization
### Component Patterns
- **Functional components**: Use React hooks over class components
- **Custom hooks**: Extract reusable logic into custom hooks
- **Props interfaces**: Define clear TypeScript interfaces for component props
- **File organization**: Keep files under 300 lines, break down large components
### UI Components & Styling
**Prefer Mantine UI**: Use Mantine components as the primary UI library:
```tsx
// ✅ Good - Use Mantine components
import { Button, TextInput, Modal, Select } from '@mantine/core';
// ✅ Good - Mantine hooks for common functionality
import { useDisclosure } from '@mantine/hooks';
import { useForm } from '@mantine/form';
```
**Component Hierarchy**:
1. **First choice**: Mantine components (`@mantine/core`, `@mantine/dates`,
etc.)
2. **Second choice**: Custom components built on Mantine primitives
3. **Last resort**: Custom styling using CSS Modules and SCSS
**Styling Approach**:
- Use Mantine's built-in styling system and theme
- SCSS modules for component-specific styles when needed
- Avoid inline styles unless absolutely necessary
- Leverage Mantine's responsive design utilities
### API Patterns
- **RESTful design**: Clear HTTP methods and resource-based URLs
- **Middleware composition**: Reusable middleware for auth, validation, etc.
- **Error handling**: Consistent error response format
- **Input validation**: Zod schemas for request validation
## 🧪 Testing Strategy
### Testing Tools
- **Unit Tests**: Jest with TypeScript support
- **Integration Tests**: Jest with database fixtures
- **Frontend Testing**: React Testing Library + Jest
- **E2E Testing**: Custom smoke tests with BATS
### Testing Patterns
- **TDD Approach**: Write tests before implementation for new features
- **Test organization**: Tests co-located with source files in `__tests__`
directories
- **Mocking**: MSW for API mocking in frontend tests
- **Database testing**: Isolated test databases with fixtures
### CI Testing
For integration testing in CI environments:
```bash
# Start CI testing stack (ClickHouse, MongoDB, etc.)
docker compose -p int -f ./docker-compose.ci.yml up -d
# Run integration tests
yarn dev:int
```
**CI Testing Notes:**
- Uses separate Docker Compose configuration optimized for CI
- Isolated test environment with `-p int` project name
- Includes all necessary services (ClickHouse, MongoDB, OTel Collector)
- Tests run against real database instances for accurate integration testing
## 🗄️ Data & Query Patterns
### ClickHouse Integration
- **Query building**: Use `common-utils` for safe query construction
- **Schema flexibility**: Support for various telemetry schemas via `Source`
configuration
### MongoDB Patterns
- **Multi-tenancy**: All queries filtered by team context
- **Relationships**: Use ObjectId references with proper population
- **Indexing**: Strategic indexes for query performance
- **Migrations**: Versioned migrations for schema changes
## 🚀 Common Development Tasks
### Adding New Features
1. **API First**: Define API endpoints and data models
2. **Database Models**: Create/update Mongoose schemas and ClickHouse queries
3. **Frontend Integration**: Build UI components and integrate with API
4. **Testing**: Add unit and integration tests
5. **Documentation**: Update relevant docs
### Performance Considerations
- **Frontend rendering**: Use virtualization for large datasets
- **API responses**: Implement pagination and caching where appropriate
- **Bundle size**: Monitor and optimize JavaScript bundle sizes
## 🔍 Key Files & Directories
### Configuration
- `packages/api/src/config.ts`: API configuration and environment variables
- `packages/app/next.config.js`: Next.js configuration
- `docker-compose.dev.yml`: Development environment setup
### Core Business Logic
- `packages/api/src/models/`: MongoDB data models
- `packages/api/src/routers/`: API route definitions
- `packages/api/src/controllers/`: Business logic controllers
- `packages/common-utils/src/`: Shared utilities and query builders
### Frontend Architecture
- `packages/app/pages/`: Next.js pages and routing
- `packages/app/src/`: Reusable components and utilities
- `packages/app/src/useUserPreferences.tsx`: Global user state management
## 🚨 Common Pitfalls & Guidelines
### Security
- **Server-side validation**: Always validate and sanitize on the backend
- **Team isolation**: Ensure proper team-based access control
- **API authentication**: Use proper authentication middleware
- **Environment variables**: Never commit secrets, use `.env` files
### Performance
- **React rendering**: Use proper keys and memoization for large lists
- **API pagination**: Implement cursor-based pagination for large datasets
### Code Quality
- **Component responsibility**: Single responsibility principle
- **Error boundaries**: Proper error handling at component boundaries
- **Type safety**: Prefer type-safe approaches over runtime checks
## 🔗 Useful Resources
- **OpenTelemetry Docs**: Understanding telemetry data structures
- **ClickHouse Docs**: Query optimization and schema design
- **Mantine UI**: Component library documentation
- **TanStack Query**: Server state management patterns
## 🤝 Contributing Guidelines
1. **Follow existing patterns**: Maintain consistency with current codebase
2. **Test coverage**: Add tests for new functionality
3. **Documentation**: Update relevant documentation
4. **Code review**: Ensure changes align with architectural principles
5. **Performance impact**: Consider impact on query performance and bundle size
- **Authentication**: Passport.js with team-based access control
- **State management**: Jotai (client), TanStack Query (server), URL params (filters)
- **UI library**: Mantine components are the standard (not custom UI)
- **Database patterns**: MongoDB for metadata with Mongoose, ClickHouse for telemetry queries
---
_This guide should be updated as the codebase evolves and new patterns emerge._
*Need more details? Check the `agent_docs/` directory or ask which documentation to read.*

34
agent_docs/README.md Normal file

@ -0,0 +1,34 @@
# Agent Documentation Directory
This directory contains detailed documentation for AI coding agents working on the HyperDX codebase. These files use **progressive disclosure** - they're referenced from `CLAUDE.md` but only read when needed.
## Purpose
Instead of stuffing all instructions into `CLAUDE.md` (which goes into every conversation), we keep detailed, task-specific information here. This ensures:
1. **Better focus**: Only relevant context gets loaded per task
2. **Improved performance**: Less context loaded per conversation = better instruction following
3. **Easier maintenance**: Update specific docs without bloating the main file
## Files
- **`architecture.md`** - System architecture, data models, service relationships, security patterns
- **`tech_stack.md`** - Technology choices, UI component patterns, library usage
- **`development.md`** - Development workflows, testing strategy, common tasks, debugging
- **`code_style.md`** - Code patterns and best practices (read only when actively coding)
## Usage Pattern
When starting a task:
1. Agent reads `CLAUDE.md` first (always included)
2. Agent determines which (if any) docs from this directory are relevant
3. Agent reads only the needed documentation
4. Agent proceeds with focused, relevant context
## Maintenance
- Keep files focused on their specific domain
- Use file/line references instead of code snippets when possible
- Update when patterns or architecture change
- Keep documentation current with the codebase

67
agent_docs/architecture.md Normal file

@ -0,0 +1,67 @@
# HyperDX Architecture
## Core Services
- **HyperDX UI (`packages/app`)**: Next.js frontend serving the user interface
- **HyperDX API (`packages/api`)**: Node.js/Express backend handling queries and business logic
- **OpenTelemetry Collector**: Receives and processes telemetry data
- **ClickHouse**: Primary data store for all telemetry (logs, metrics, traces)
- **MongoDB**: Metadata storage (users, dashboards, alerts, saved searches)
## Data Flow
1. Applications send telemetry via OpenTelemetry → OTel Collector
2. OTel Collector processes and forwards data → ClickHouse
3. Users interact with UI → API queries ClickHouse
4. Configuration/metadata stored in MongoDB
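
The split between the two stores can be pictured with a minimal TypeScript sketch. The names used here (`DataKind`, `storeFor`) are hypothetical and exist only for illustration; they are not taken from the HyperDX codebase.

```typescript
// Hypothetical names for illustration; not part of the HyperDX codebase.
type TelemetryKind = 'log' | 'metric' | 'trace';
type MetadataKind = 'user' | 'dashboard' | 'alert' | 'savedSearch';
type DataKind = TelemetryKind | MetadataKind;
type Store = 'clickhouse' | 'mongodb';

const TELEMETRY_KINDS: readonly string[] = ['log', 'metric', 'trace'];

// Telemetry goes to ClickHouse; configuration/metadata goes to MongoDB.
function storeFor(kind: DataKind): Store {
  return TELEMETRY_KINDS.includes(kind) ? 'clickhouse' : 'mongodb';
}
```

Keeping the mapping in one place makes it easy to see which store a new kind of data belongs to.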
## Key MongoDB Models
All models follow consistent patterns with:
- Team-based multi-tenancy (most entities belong to a `team`)
- ObjectId references between related entities
- Timestamps for audit trails
- Zod schema validation
**Key Models** (see `packages/api/src/models/`):
- `Team`: Multi-tenant organization unit
- `User`: Team members with authentication
- `Source`: ClickHouse data source configuration
- `Connection`: Database connection settings
- `SavedSearch`: Saved queries and filters
- `Dashboard`: Custom dashboard configurations
- `Alert`: Monitoring alerts with thresholds
## Frontend Architecture
- **Pages**: `packages/app/pages/` (Next.js routing)
- **Components**: `packages/app/src/` (reusable components)
- **API communication**: Custom hooks wrapping TanStack Query
- **State**: See `tech_stack.md` for state management details
## Backend Architecture
- **Routers**: `packages/api/src/routers/` - Domain-specific API routes
- **Controllers**: `packages/api/src/controllers/` - Business logic separated from routes
- **Middleware**: Authentication, CORS, error handling
- **Services**: Reusable business logic (e.g., `agentService`)
## Data & Query Patterns
### ClickHouse Integration
- **Query building**: Use `common-utils` for safe query construction
- **Schema flexibility**: Support for various telemetry schemas via `Source` configuration
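
As a rough illustration of what "safe query construction" can mean, the sketch below keeps user-supplied values out of the SQL text by using ClickHouse query parameters (the `{name:Type}` placeholder syntax). The helper `buildSeverityQuery`, the `ParameterizedQuery` shape, and the `SeverityText` column are assumptions made for this example; the actual builders in `common-utils` may look quite different.

```typescript
// Hypothetical helper for illustration; the real builders live in common-utils.
interface ParameterizedQuery {
  query: string;
  params: Record<string, string | number>;
}

// Builds a severity filter where user-supplied values are passed as
// ClickHouse query parameters ({name:Type}) rather than spliced into the SQL.
function buildSeverityQuery(
  table: string,
  severity: string,
  limit: number,
): ParameterizedQuery {
  // The table name is interpolated, so restrict it to a simple identifier.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(table)) {
    throw new Error(`Invalid table name: ${table}`);
  }
  return {
    query: `SELECT * FROM ${table} WHERE SeverityText = {severity:String} LIMIT {limit:UInt32}`,
    params: { severity, limit },
  };
}
```

The returned `query` and `params` would then be handed to a client that supports ClickHouse query parameters, so the server binds the values instead of the application concatenating them.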
### MongoDB Patterns
- **Multi-tenancy**: All queries filtered by team context
- **Relationships**: Use ObjectId references with proper population
- **Indexing**: Strategic indexes for query performance
- **Migrations**: Versioned migrations for schema changes (see `packages/api/migrations/`)
## Security Requirements
- **Server-side validation**: Always validate and sanitize on the backend
- **Team isolation**: All data access must filter by team context
- **API authentication**: Use authentication middleware on protected routes
- **Secrets**: Never commit secrets; use `.env` files
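
One way to make team isolation hard to forget is to route every metadata filter through a single helper. The sketch below is a minimal example under that assumption; `scopeToTeam` and `Filter` are hypothetical names, and the `team` field follows the convention noted above that most entities belong to a `team`.

```typescript
// Hypothetical helper for illustration; not part of the HyperDX codebase.
type Filter = Record<string, unknown>;

// Returns a copy of the filter that is always restricted to the caller's team.
// The team key is written last so a caller-supplied `team` cannot override it.
function scopeToTeam(filter: Filter, teamId: string): Filter {
  if (teamId.length === 0) {
    throw new Error('teamId is required for team-scoped queries');
  }
  return { ...filter, team: teamId };
}
```

With Mongoose, the result could be passed to a model query such as `Dashboard.find(scopeToTeam({ name }, teamId))`; that call shape is an assumption about how the models are queried, not a description of existing code.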

37
agent_docs/code_style.md Normal file

@ -0,0 +1,37 @@
# Code Style & Best Practices
> **Note**: Pre-commit hooks handle formatting automatically. Focus on implementation patterns.
## TypeScript
- Avoid `any` - use proper typing
- Use Zod schemas for runtime validation
- Define clear interfaces for data structures
- Implement proper error boundaries
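
To illustrate avoiding `any` when handling untrusted input, here is a minimal sketch that parses JSON as `unknown` and narrows it with a type guard. The `AlertInput` shape and both function names are hypothetical. In this project a Zod schema would normally take the place of the hand-written guard; the guard is used here only to keep the example free of dependencies.

```typescript
// Hypothetical shape for illustration; not taken from the HyperDX codebase.
interface AlertInput {
  name: string;
  threshold: number;
}

// Type guard: narrows `unknown` to AlertInput after checking each field at runtime.
function isAlertInput(value: unknown): value is AlertInput {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.name === 'string' && typeof record.threshold === 'number';
}

// Parses untrusted JSON without ever widening to `any`.
function parseAlertInput(json: string): AlertInput {
  const parsed: unknown = JSON.parse(json);
  if (!isAlertInput(parsed)) {
    throw new Error('Invalid alert input');
  }
  return parsed;
}
```

Annotating the `JSON.parse` result as `unknown` forces every consumer to validate before use, which is the same guarantee a schema-based parser provides.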
## Code Organization
- **Single Responsibility**: One clear purpose per component/function
- **File Size**: Max 300 lines - refactor when approaching limit
- **DRY**: Reuse existing functionality; consolidate duplicates
- **In-Context Learning**: Explore similar files before implementing
## React Patterns
- Functional components with hooks (not class components)
- Extract reusable logic into custom hooks
- Define TypeScript interfaces for props
- Use proper keys for lists, memoization for expensive computations
## Refactoring
- Edit files directly - don't create `component-v2.tsx` copies
- Look for duplicate code across the affected area
- Verify all callers and integrations after changes
- Refactor to improve clarity or reduce complexity, not for the sake of change
## File Naming
- Clear, descriptive names following package conventions
- Avoid "temp", "refactored", "improved" in permanent filenames

111
agent_docs/development.md Normal file

@ -0,0 +1,111 @@
# Development Workflows
## Setup Commands
```bash
# Install dependencies and setup hooks
yarn setup
# Start full development stack (Docker + local services)
yarn dev
```
## Key Development Scripts
- `yarn app:dev`: Start API, frontend, alerts task, and common-utils in watch mode
- `yarn lint`: Run linting across all packages
- `yarn dev:int`: Run integration tests in watch mode
- `yarn dev:unit`: Run unit tests in watch mode (per package)
## Environment Configuration
- `.env.development`: Development environment variables
- Docker Compose manages ClickHouse, MongoDB, OTel Collector
- Hot reload enabled for all services in development
## Testing Strategy
### Testing Tools
- **Unit Tests**: Jest with TypeScript support
- **Integration Tests**: Jest with database fixtures
- **Frontend Testing**: React Testing Library + Jest
- **E2E Testing**: Custom smoke tests with BATS
### Testing Patterns
- **TDD Approach**: Write tests before implementation for new features
- **Test organization**: Tests co-located with source files in `__tests__/` directories
- **Mocking**: MSW for API mocking in frontend tests
- **Database testing**: Isolated test databases with fixtures
### CI Testing
For integration testing in CI environments:
```bash
# Start CI testing stack (ClickHouse, MongoDB, etc.)
docker compose -p int -f ./docker-compose.ci.yml up -d
# Run integration tests
yarn dev:int
```
**CI Testing Notes:**
- Uses separate Docker Compose configuration optimized for CI
- Isolated test environment with `-p int` project name
- Includes all necessary services (ClickHouse, MongoDB, OTel Collector)
- Tests run against real database instances for accurate integration testing
## Common Development Tasks
### Adding New Features
1. **API First**: Define API endpoints and data models
2. **Database Models**: Create/update Mongoose schemas and ClickHouse queries
3. **Frontend Integration**: Build UI components and integrate with API
4. **Testing**: Add unit and integration tests
5. **Documentation**: Update relevant docs
### Debugging
- Check browser and server console output for errors, warnings, or relevant logs
- Add targeted logging to trace execution and variable states
- For persistent issues, check `fixes/` directory for documented solutions
- Document complex fixes in `fixes/` directory with descriptive filenames
## Code Quality
### Pre-commit Hooks
The project uses Husky + lint-staged to automatically run:
- Prettier for formatting
- ESLint for linting
- API doc generation (for external API changes)
These run automatically on `git commit` for staged files.
### Manual Linting (if needed)
If you need to manually lint:
```bash
# Per-package linting with auto-fix (subshells keep the working directory at the repo root)
(cd packages/app && yarn run lint:fix)
(cd packages/api && yarn run lint:fix)
(cd packages/common-utils && yarn run lint:fix)
# Check all packages
yarn run lint
```
## File Locations Quick Reference
- **Config**: `packages/api/src/config.ts`, `packages/app/next.config.js`, `docker-compose.dev.yml`
- **Models**: `packages/api/src/models/`
- **API Routes**: `packages/api/src/routers/`
- **Controllers**: `packages/api/src/controllers/`
- **Pages**: `packages/app/pages/`
- **Components**: `packages/app/src/`
- **Shared Utils**: `packages/common-utils/src/`

28
agent_docs/tech_stack.md Normal file

@ -0,0 +1,28 @@
# HyperDX Technology Stack
## Frontend (`packages/app`)
- **Framework**: Next.js 14 with TypeScript
- **UI Components**: Mantine UI library (`@mantine/core`, `@mantine/dates`, `@mantine/hooks`)
- **State Management**: Jotai (global client state), TanStack Query (server state), URL params (filters)
- **Charts/Visualization**: Recharts, uPlot
- **Code Editor**: CodeMirror (for SQL/JSON editing)
- **Styling**: Mantine's built-in system, SCSS modules when needed
**UI Component Priority**: Mantine components first → Custom components on Mantine primitives → Custom SCSS modules as last resort
## Backend (`packages/api`)
- **Runtime**: Node.js 22+ with TypeScript
- **Framework**: Express.js
- **Database**: ClickHouse (telemetry data), MongoDB via Mongoose (metadata)
- **Authentication**: Passport.js with local strategy
- **Validation**: Zod schemas
- **Telemetry**: Self-instrumented with `@hyperdx/node-opentelemetry`
## Common Utilities (`packages/common-utils`)
- Shared TypeScript utilities for query parsing and ClickHouse operations
- Zod schemas for data validation
- SQL formatting and query building helpers