mirror of
https://github.com/hyperdxio/hyperdx
synced 2026-04-21 13:37:15 +00:00
# HyperDX Claude Agent Guide

This guide helps Claude AI agents understand and work effectively with the
HyperDX codebase.

## 🏗️ Project Overview

HyperDX is an observability platform built on ClickHouse that helps engineers
search, visualize, and monitor logs, metrics, traces, and session replays. It's
designed as an alternative to tools like Kibana but optimized for ClickHouse's
performance characteristics.

**Core Value Proposition:**

- Unified observability: correlate logs, metrics, traces, and session replays in
  one place
- ClickHouse-powered: blazing fast searches and visualizations
- OpenTelemetry native: works out of the box with OTEL instrumentation
- Schema agnostic: works on top of existing ClickHouse schemas

## 📁 Architecture Overview

HyperDX follows a microservices architecture with clear separation between
components:

### Core Services

- **HyperDX UI (`packages/app`)**: Next.js frontend serving the user interface
- **HyperDX API (`packages/api`)**: Node.js/Express backend handling queries and
  business logic
- **OpenTelemetry Collector**: Receives and processes telemetry data
- **ClickHouse**: Primary data store for all telemetry (logs, metrics, traces)
- **MongoDB**: Metadata storage (users, dashboards, alerts, saved searches)

### Data Flow

1. Applications send telemetry via OpenTelemetry → OTel Collector
2. OTel Collector processes and forwards data → ClickHouse
3. Users interact with UI → API queries ClickHouse
4. Configuration/metadata stored in MongoDB

## 🛠️ Technology Stack

### Frontend (`packages/app`)

- **Framework**: Next.js 14 with TypeScript
- **UI Components**: Mantine UI library + React Bootstrap
- **State Management**: Jotai for global state, TanStack Query for server state
- **Charts/Visualization**: Recharts, uPlot
- **Code Editor**: CodeMirror (for SQL/JSON editing)
- **Styling**: SCSS + CSS Modules

### Backend (`packages/api`)

- **Runtime**: Node.js 22+ with TypeScript
- **Framework**: Express.js
- **Database**:
  - ClickHouse (primary telemetry data)
  - MongoDB (metadata via Mongoose)
- **Authentication**: Passport.js with local strategy
- **Validation**: Zod schemas
- **OpenTelemetry**: Self-instrumented with `@hyperdx/node-opentelemetry`

### Common Utilities (`packages/common-utils`)

- Shared TypeScript utilities for query parsing, ClickHouse operations
- Zod schemas for data validation
- SQL formatting and query building helpers

## 🏛️ Key Architectural Patterns

### Database Models (MongoDB)

All models follow consistent patterns with:

- Team-based multi-tenancy (most entities belong to a `team`)
- ObjectId references between related entities
- Timestamps for audit trails
- Zod schema validation

**Key Models:**

- `Team`: Multi-tenant organization unit
- `User`: Team members with authentication
- `Source`: ClickHouse data source configuration
- `Connection`: Database connection settings
- `SavedSearch`: Saved queries and filters
- `Dashboard`: Custom dashboard configurations
- `Alert`: Monitoring alerts with thresholds

### Frontend Architecture

- **Page-level components**: Located in `pages/` (Next.js routing)
- **Reusable components**: Located in `src/` directory
- **State management**:
  - Server state via TanStack Query
  - Client state via Jotai atoms
  - URL state via query parameters
- **API communication**: Custom hooks wrapping TanStack Query

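To illustrate what "URL state via query parameters" can look like, here is a minimal, dependency-free sketch using the standard `URLSearchParams` API. The `SearchUrlState` shape and the `q`, `from`, and `to` parameter names are assumptions made for this example and are not taken from the HyperDX source.

```typescript
// Hypothetical URL state for a search page (illustrative only).
interface SearchUrlState {
  query: string;
  from: number; // start of the time range, epoch milliseconds
  to: number; // end of the time range, epoch milliseconds
}

// Serialize state into a query string suitable for the address bar.
function encodeSearchState(state: SearchUrlState): string {
  const params = new URLSearchParams({
    q: state.query,
    from: String(state.from),
    to: String(state.to),
  });
  return params.toString();
}

// Parse a query string back into state, falling back to defaults
// when a parameter is missing or not a finite number.
function decodeSearchState(
  search: string,
  defaults: SearchUrlState,
): SearchUrlState {
  const params = new URLSearchParams(search);
  const from = Number(params.get('from'));
  const to = Number(params.get('to'));
  return {
    query: params.get('q') ?? defaults.query,
    from: params.has('from') && Number.isFinite(from) ? from : defaults.from,
    to: params.has('to') && Number.isFinite(to) ? to : defaults.to,
  };
}
```

Keeping this state in the URL makes a view shareable and restorable on reload, which is why it is treated separately from server state and client state.
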
### Backend Architecture

- **Router-based organization**: Separate routers for different API domains
- **Middleware stack**: Authentication, CORS, error handling
- **Controller pattern**: Business logic separated from route handlers
- **Service layer**: Reusable business logic (e.g., `agentService`)

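The split between route handlers and controllers can be sketched roughly as follows. This is a simplified illustration with no Express dependency: `SketchRequest`, `SketchResponse`, `savedSearchStore`, and both function names are hypothetical stand-ins, not the actual HyperDX code.

```typescript
// Stand-ins for HTTP request/response objects; these are NOT Express types.
interface SketchRequest {
  teamId: string;
  body: unknown;
}
interface SketchResponse {
  statusCode: number;
  payload: unknown;
}

interface SavedSearchRecord {
  team: string;
  name: string;
}

// Hypothetical in-memory store standing in for a Mongoose model.
const savedSearchStore: SavedSearchRecord[] = [];

// Controller: business logic only, with no knowledge of HTTP.
function createSavedSearch(teamId: string, name: string): SavedSearchRecord {
  const record: SavedSearchRecord = { team: teamId, name: name.trim() };
  savedSearchStore.push(record);
  return record;
}

// Route handler: validates input, delegates to the controller,
// and shapes the HTTP response.
function handleCreateSavedSearch(req: SketchRequest): SketchResponse {
  const body = req.body;
  if (typeof body !== 'object' || body === null) {
    return { statusCode: 400, payload: { error: 'body must be an object' } };
  }
  const name = (body as Record<string, unknown>)['name'];
  if (typeof name !== 'string' || name.trim() === '') {
    return { statusCode: 400, payload: { error: 'name is required' } };
  }
  return { statusCode: 201, payload: createSavedSearch(req.teamId, name) };
}
```

Because the controller takes plain arguments, it can be unit tested and reused (for example from a background task) without constructing a request object.
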
## 🔧 Development Environment

### Setup Commands

```bash
# Install dependencies and setup hooks
yarn setup

# Start full development stack (Docker + local services)
yarn dev
```

### Key Development Scripts

- `yarn app:dev`: Start API, frontend, alerts task, and common-utils in watch
  mode
- `yarn lint`: Run linting across all packages
- `yarn dev:int`: Run integration tests in watch mode
- `yarn dev:unit`: Run unit tests in watch mode (per package)

### ⚠️ BEFORE COMMITTING - Run Linting Commands

**Claude AI agents must run these commands before any commit:**

```bash
# 1. Fix linting issues in modified packages
#    (each command runs in a subshell, so the working directory
#    returns to the repository root afterwards)
(cd packages/app && yarn run lint:fix)
(cd packages/api && yarn run lint:fix)
(cd packages/common-utils && yarn lint:fix)

# 2. Check for any remaining linting issues from the main directory
yarn run lint
```

**If linting issues remain after running lint:fix**: Some linting errors cannot
be automatically fixed and require manual intervention. If `yarn run lint` still
shows errors:

1. Read the linting error messages carefully to understand the issue
2. Manually fix the reported issues in the affected files
3. Re-run `yarn run lint` to verify all issues are resolved
4. Only commit once all linting errors are fixed

**Why this is necessary**: While the project has pre-commit hooks (`lint-staged`
with Husky) that automatically fix linting issues on commit, Claude AI agents do
not trigger these hooks. Therefore, you must manually run the lint:fix commands
before committing.

### Environment Configuration

- `.env.development`: Development environment variables
- Docker Compose manages ClickHouse, MongoDB, OTel Collector
- Hot reload enabled for all services in development

## 📝 Code Style & Patterns

### TypeScript Guidelines

- **Strict typing**: Avoid `any` type assertions (use proper typing instead)
- **Zod validation**: Use Zod schemas for runtime validation
- **Interface definitions**: Clear interfaces for all data structures
- **Error handling**: Proper error boundaries and serialization

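As a small illustration of avoiding `any`, the sketch below parses untrusted JSON as `unknown` and narrows it with a type guard. In the codebase a Zod schema would normally do this job; the hand-written guard is a dependency-free stand-in, and `AlertPayload` with its fields is a hypothetical shape invented for the example.

```typescript
// Hypothetical payload shape, used only for illustration.
interface AlertPayload {
  threshold: number;
  channel: string;
}

// Type guard: narrows `unknown` to AlertPayload by checking each field at runtime.
function isAlertPayload(value: unknown): value is AlertPayload {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return (
    typeof record['threshold'] === 'number' &&
    typeof record['channel'] === 'string'
  );
}

// Parse JSON without ever widening to `any`: the result stays `unknown`
// until the guard has proven its shape.
function parseAlertPayload(json: string): AlertPayload {
  const parsed: unknown = JSON.parse(json);
  if (!isAlertPayload(parsed)) {
    throw new TypeError('Invalid alert payload');
  }
  return parsed;
}
```
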
### Component Patterns

- **Functional components**: Use React hooks over class components
- **Custom hooks**: Extract reusable logic into custom hooks
- **Props interfaces**: Define clear TypeScript interfaces for component props
- **File organization**: Keep files under 300 lines, break down large components

### UI Components & Styling

**Prefer Mantine UI**: Use Mantine components as the primary UI library:

```tsx
// ✅ Good - Use Mantine components
import { Button, TextInput, Modal, Select } from '@mantine/core';

// ✅ Good - Mantine hooks for common functionality
import { useDisclosure } from '@mantine/hooks';
// Note: `useForm` is exported from `@mantine/form`, not `@mantine/hooks`
import { useForm } from '@mantine/form';
```

**Component Hierarchy**:

1. **First choice**: Mantine components (`@mantine/core`, `@mantine/dates`,
   etc.)
2. **Second choice**: Custom components built on Mantine primitives
3. **Last resort**: React Bootstrap or custom CSS (only when Mantine doesn't
   provide the functionality)

**Styling Approach**:

- Use Mantine's built-in styling system and theme
- SCSS modules for component-specific styles when needed
- Avoid inline styles unless absolutely necessary
- Leverage Mantine's responsive design utilities

### API Patterns

- **RESTful design**: Clear HTTP methods and resource-based URLs
- **Middleware composition**: Reusable middleware for auth, validation, etc.
- **Error handling**: Consistent error response format
- **Input validation**: Zod schemas for request validation

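One way to get a consistent error response format is to funnel every failure through a single mapping function. The sketch below shows the idea only: the `ApiErrorBody` envelope, the `ApiError` class, and the `INTERNAL_ERROR` code are assumptions made for illustration, and the real API's error format may differ.

```typescript
// Hypothetical error envelope (illustrative; not the actual HyperDX format).
interface ApiErrorBody {
  error: { code: string; message: string };
}

// Error type that carries the HTTP status and a machine-readable code.
class ApiError extends Error {
  readonly statusCode: number;
  readonly code: string;

  constructor(statusCode: number, code: string, message: string) {
    super(message);
    // Keeps `instanceof ApiError` working when compiled to an ES5 target.
    Object.setPrototypeOf(this, ApiError.prototype);
    this.name = 'ApiError';
    this.statusCode = statusCode;
    this.code = code;
  }
}

// Map any thrown value to one response shape. Unknown failures become a
// generic 500 so internal details are not leaked to clients.
function toErrorResponse(err: unknown): {
  statusCode: number;
  body: ApiErrorBody;
} {
  if (err instanceof ApiError) {
    return {
      statusCode: err.statusCode,
      body: { error: { code: err.code, message: err.message } },
    };
  }
  return {
    statusCode: 500,
    body: {
      error: { code: 'INTERNAL_ERROR', message: 'Internal server error' },
    },
  };
}
```

In an Express application a function like this would typically be called from a single error-handling middleware, so individual routes never build error bodies by hand.
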
## 🧪 Testing Strategy

### Testing Tools

- **Unit Tests**: Jest with TypeScript support
- **Integration Tests**: Jest with database fixtures
- **Frontend Testing**: React Testing Library + Jest
- **E2E Testing**: Custom smoke tests with BATS

### Testing Patterns

- **TDD Approach**: Write tests before implementation for new features
- **Test organization**: Tests co-located with source files in `__tests__`
  directories
- **Mocking**: MSW for API mocking in frontend tests
- **Database testing**: Isolated test databases with fixtures

### CI Testing

For integration testing in CI environments:

```bash
# Start CI testing stack (ClickHouse, MongoDB, etc.)
docker compose -p int -f ./docker-compose.ci.yml up -d

# Run integration tests
yarn dev:int
```

**CI Testing Notes:**

- Uses separate Docker Compose configuration optimized for CI
- Isolated test environment with `-p int` project name
- Includes all necessary services (ClickHouse, MongoDB, OTel Collector)
- Tests run against real database instances for accurate integration testing

## 🗄️ Data & Query Patterns

### ClickHouse Integration

- **Query building**: Use `common-utils` for safe query construction
- **Schema flexibility**: Support for various telemetry schemas via `Source`
  configuration

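To show the general idea behind safe query construction, the sketch below keeps user-supplied values out of the SQL text by using ClickHouse's `{name:Type}` query parameter placeholders and returning the values separately. It does not reproduce the `common-utils` API: `buildLogQuery`, the `ParameterizedQuery` shape, and the `otel_logs`-style column names (`Timestamp`, `Body`, `SeverityText`, `ServiceName`) are assumptions made for this example.

```typescript
// Query text plus the values to bind to its placeholders.
interface ParameterizedQuery {
  query: string;
  query_params: Record<string, string>;
}

// Conservative allow-list pattern for the table identifier.
const SAFE_IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Hypothetical helper: builds a log search without concatenating
// user-supplied values into the SQL string.
function buildLogQuery(
  table: string,
  severity: string,
  service: string,
): ParameterizedQuery {
  if (!SAFE_IDENTIFIER.test(table)) {
    throw new Error(`Unsafe table name: ${table}`);
  }
  const query = [
    'SELECT Timestamp, Body',
    `FROM ${table}`,
    'WHERE SeverityText = {severity:String}',
    '  AND ServiceName = {service:String}',
    'ORDER BY Timestamp DESC',
    'LIMIT 100',
  ].join('\n');
  return { query, query_params: { severity, service } };
}
```

The values travel alongside the query instead of inside it, so a value such as `' OR 1 = 1 --` is treated as a plain string rather than as SQL.
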
### MongoDB Patterns

- **Multi-tenancy**: All queries filtered by team context
- **Relationships**: Use ObjectId references with proper population
- **Indexing**: Strategic indexes for query performance
- **Migrations**: Versioned migrations for schema changes

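The multi-tenancy rule above can be made hard to forget by routing every filter through one small helper. This is a sketch under assumptions: `withTeamScope` and `MongoFilter` are hypothetical names, and team IDs are shown as plain strings rather than ObjectIds to keep the example dependency-free.

```typescript
// Loose filter type standing in for a Mongoose filter query.
type MongoFilter = Record<string, unknown>;

// Hypothetical helper: returns a copy of `filter` that is always
// restricted to the given team.
function withTeamScope(teamId: string, filter: MongoFilter = {}): MongoFilter {
  if (teamId === '') {
    throw new Error('teamId is required');
  }
  // `team` is written last, so a caller-supplied `team` key can never
  // override the tenant scope.
  return { ...filter, team: teamId };
}
```

A call such as `withTeamScope(teamId, { name: 'Errors' })` would then be passed to the model's find method in place of the raw filter.
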
## 🚀 Common Development Tasks

### Adding New Features

1. **API First**: Define API endpoints and data models
2. **Database Models**: Create/update Mongoose schemas and ClickHouse queries
3. **Frontend Integration**: Build UI components and integrate with API
4. **Testing**: Add unit and integration tests
5. **Documentation**: Update relevant docs

### Performance Considerations

- **Frontend rendering**: Use virtualization for large datasets
- **API responses**: Implement pagination and caching where appropriate
- **Bundle size**: Monitor and optimize JavaScript bundle sizes

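The core of list virtualization is working out which rows intersect the viewport and rendering only those. The sketch below shows that calculation for fixed-height rows; in practice a virtualization library would handle this, and `computeVisibleRange` with its parameters is a hypothetical helper written for illustration.

```typescript
// Half-open range of row indices to render: [start, end).
interface VisibleRange {
  start: number;
  end: number;
}

// Compute which rows to render for a scroll container with fixed-height rows.
// `overscan` adds extra rows above and below to reduce flicker while scrolling.
function computeVisibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan = 2,
): VisibleRange {
  const firstVisible = Math.floor(scrollTop / rowHeight);
  const endVisible = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  return {
    start: Math.max(0, firstVisible - overscan),
    end: Math.min(rowCount, endVisible + overscan),
  };
}

// Example: 10,000 rows of 20px in a 400px viewport scrolled to 1000px.
const exampleRange = computeVisibleRange(1000, 400, 20, 10_000);
console.log(exampleRange); // { start: 48, end: 72 }
```

Only 24 of the 10,000 rows are rendered in this example, which is what keeps large log tables responsive.
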
## 🔍 Key Files & Directories

### Configuration

- `packages/api/src/config.ts`: API configuration and environment variables
- `packages/app/next.config.js`: Next.js configuration
- `docker-compose.dev.yml`: Development environment setup

### Core Business Logic

- `packages/api/src/models/`: MongoDB data models
- `packages/api/src/routers/`: API route definitions
- `packages/api/src/controllers/`: Business logic controllers
- `packages/common-utils/src/`: Shared utilities and query builders

### Frontend Architecture

- `packages/app/pages/`: Next.js pages and routing
- `packages/app/src/`: Reusable components and utilities
- `packages/app/src/useUserPreferences.tsx`: Global user state management

## 🚨 Common Pitfalls & Guidelines

### Security

- **Server-side validation**: Always validate and sanitize on the backend
- **Team isolation**: Ensure proper team-based access control
- **API authentication**: Use proper authentication middleware
- **Environment variables**: Never commit secrets, use `.env` files

### Performance

- **React rendering**: Use proper keys and memoization for large lists
- **API pagination**: Implement cursor-based pagination for large datasets

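Cursor-based pagination returns, with each page, an opaque marker for where the next page starts, instead of a numeric offset. The sketch below demonstrates the mechanics over an in-memory array sorted by ascending `id`; `LogRow`, `CursorPage`, and `paginateById` are hypothetical names, and a real endpoint would apply the same `id > cursor` condition in the database query.

```typescript
interface LogRow {
  id: number;
  body: string;
}

interface CursorPage<T> {
  items: T[];
  nextCursor: string | null; // null means there are no further pages
}

// Return up to `limit` rows that come after `cursor`.
// Assumes `rows` is sorted by ascending `id`.
function paginateById(
  rows: readonly LogRow[],
  limit: number,
  cursor: string | null = null,
): CursorPage<LogRow> {
  const afterId = cursor === null ? -Infinity : Number(cursor);
  if (Number.isNaN(afterId)) {
    throw new Error('Invalid cursor');
  }
  const remaining = rows.filter(row => row.id > afterId);
  const items = remaining.slice(0, limit);
  const last = items[items.length - 1];
  const hasMore = remaining.length > limit;
  return {
    items,
    nextCursor: hasMore && last !== undefined ? String(last.id) : null,
  };
}
```

Unlike offset pagination, the cursor stays valid when new rows are inserted ahead of it, so pages do not skip or repeat rows while data is still arriving.
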
### Code Quality

- **Component responsibility**: Single responsibility principle
- **Error boundaries**: Proper error handling at component boundaries
- **Type safety**: Prefer type-safe approaches over runtime checks

## 🔗 Useful Resources

- **OpenTelemetry Docs**: Understanding telemetry data structures
- **ClickHouse Docs**: Query optimization and schema design
- **Mantine UI**: Component library documentation
- **TanStack Query**: Server state management patterns

## 🤝 Contributing Guidelines

1. **Follow existing patterns**: Maintain consistency with current codebase
2. **Test coverage**: Add tests for new functionality
3. **Documentation**: Update relevant documentation
4. **Code review**: Ensure changes align with architectural principles
5. **Performance impact**: Consider impact on query performance and bundle size

---

_This guide should be updated as the codebase evolves and new patterns emerge._