# Performance Benchmarks

Microbenchmarks for measuring and tracking performance of critical code paths.

## When to Use Benchmarks

**Good fit:**

- Hot paths executed thousands of times (expression evaluation, data transforms)
- Comparing implementation approaches (current vs. proposed)
- Detecting regressions in critical code

**Not a good fit:**

- API endpoint latency (use load testing: k6, Artillery)
- Database query performance (use query analysis tools)
- Frontend rendering (use browser profiling)
- One-off operations (startup time, migrations)

**Rule of thumb:** If it runs millions of times per day across all users, benchmark it.

## Commands

```bash
pnpm --filter=@n8n/performance bench           # Run benchmarks
pnpm --filter=@n8n/performance bench:baseline  # Save new baseline
pnpm --filter=@n8n/performance bench:ci        # CI check (fails if >10% slower)
```

## Adding a Benchmark

### 1. Create a bench file

```typescript
// benchmarks/my-feature/thing.bench.ts
import { bench, describe } from 'vitest';

describe('My Feature', () => {
	bench('operation name', () => {
		// Code to measure - runs thousands of times
		doTheThing();
	});
});
```

### 2. Add setup outside the bench function

```typescript
// Setup runs once, not measured
const data = createTestData();
const instance = new MyClass();

describe('My Feature', () => {
	bench('with small input', () => {
		instance.process(data.small);
	});

	bench('with large input', () => {
		instance.process(data.large);
	});
});
```

### 3. Add warmup if needed

```typescript
// Warmup ensures JIT compilation is done before measuring
for (let i = 0; i < 1000; i++) {
	instance.process(data.small);
}

describe('My Feature', () => {
	// Now benchmarks measure the hot path, not JIT compilation
});
```
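
A hand-rolled loop works, but `bench` also accepts warmup options via tinybench; a sketch, assuming your Vitest version exposes `warmupIterations`/`warmupTime`:

```typescript
import { bench, describe } from 'vitest';

// Setup as in step 2 (createTestData/MyClass are the same illustrative names)
const data = createTestData();
const instance = new MyClass();

describe('My Feature', () => {
	bench(
		'with small input',
		() => {
			instance.process(data.small);
		},
		// Warmup runs before measurement starts; exact option support
		// depends on the Vitest/tinybench version (assumption).
		{ warmupIterations: 1000, warmupTime: 500 },
	);
});
```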
## Reading Results

```
name          hz      min   max   mean  p99   rme    samples
my operation  20,000  0.04  0.20  0.05  0.10  ±0.5%  10000
```

| Column  | Meaning                                          |
|---------|--------------------------------------------------|
| hz      | Operations per second (higher = faster)          |
| mean    | Average time per operation, in ms                |
| p99     | 99th percentile - worst-case latency             |
| rme     | Relative margin of error - lower = more reliable |
| samples | Number of iterations run                         |
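
As a quick sanity check, `mean` should be roughly `1000 / hz` ms: for the sample row, 1000 / 20,000 = 0.05 ms, which matches the `mean` column.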
## Regression Detection

Benchmarks are compared against a saved baseline:

- **>10% slower** = regression (CI fails)
- **>10% faster** = improvement (consider updating the baseline)
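
Conceptually the check is a per-benchmark throughput diff against the stored baseline. A minimal TypeScript sketch of that idea - the file names, JSON shape, and reporting here are illustrative assumptions, not the actual `bench:ci` implementation:

```typescript
// compare-baseline.ts (illustrative; the real bench:ci script may differ)
import { readFileSync } from 'node:fs';

interface BenchResult {
	name: string;
	hz: number; // operations per second
}

const THRESHOLD = 0.1; // 10%

const baseline: BenchResult[] = JSON.parse(readFileSync('baseline.json', 'utf8'));
const current: BenchResult[] = JSON.parse(readFileSync('current.json', 'utf8'));

for (const result of current) {
	const base = baseline.find((b) => b.name === result.name);
	if (!base) continue; // new benchmark: nothing to compare against yet

	const change = (result.hz - base.hz) / base.hz; // negative = slower
	if (change < -THRESHOLD) {
		console.error(`Regression in "${result.name}": ${(-change * 100).toFixed(1)}% slower`);
		process.exitCode = 1; // fail CI
	} else if (change > THRESHOLD) {
		console.log(`Improvement in "${result.name}": consider updating the baseline`);
	}
}
```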
### Local Workflow

```bash
# 1. Before making changes, save a baseline
pnpm --filter=@n8n/performance bench:baseline

# 2. Make your changes/refactors

# 3. Check for regressions
pnpm --filter=@n8n/performance bench:ci
```

### After Intentional Improvements

```bash
# Save new baseline to reflect the improvement
pnpm --filter=@n8n/performance bench:baseline
```

## Current Benchmarks

| Area | What it measures | Why it matters |
|------|------------------|----------------|
| Expression Engine | `={{ }}` evaluation speed | Runs for every node parameter |

## Current Status

This is a proof-of-concept for local regression detection.

### CI Integration (TODO)

Baselines are hardware-specific (an 8-core MacBook baseline is meaningless on a 2-core runner). CI needs its own baseline management:

- **Option A:** Store baselines as CI artifacts, restore before comparison
- **Option B:** External storage (S3, dedicated benchmark service)
- **Option C:** Compare against the previous CI run on the same runner type

## Known Limitations

- **Local noise**: Background processes affect results. Run multiple times to verify.
- **Baselines are machine-specific**: Baselines can't be committed to git - they must be generated on the same hardware they'll be compared against.

## Tips

1. **Keep benchmarks focused** - one operation per bench, not entire workflows
2. **Use realistic data sizes** - 100 items is typical; 10k is a stress test
3. **Compare approaches** - benchmark both before deciding (see the sketch below)
4. **Don't over-benchmark** - only critical hot paths need this
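
For tip 3, putting both implementations in one `describe` lets Vitest report them side by side. A minimal sketch; the two implementations and the data helper are hypothetical placeholders:

```typescript
import { bench, describe } from 'vitest';

// Hypothetical: two implementations of the same transform under consideration
import { transformCurrent, transformProposed } from './transform';
import { createTestData } from './test-data';

// Shared input so both benches measure the same workload
const items = createTestData().large;

describe('transform: current vs. proposed', () => {
	bench('current', () => {
		transformCurrent(items);
	});

	bench('proposed', () => {
		transformProposed(items);
	});
});
```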