|
|
||
|---|---|---|
| .. | ||
| benchmarks/expression-engine | ||
| profiles | ||
| scripts | ||
| package.json | ||
| README.md | ||
| vitest.config.ts | ||
Performance Benchmarks
Microbenchmarks for measuring and tracking performance of critical code paths.
When to Use Benchmarks
Good fit:
- Hot paths executed thousands of times (expression evaluation, data transforms)
- Comparing implementation approaches (current vs proposed)
- Detecting regressions in critical code
Not a good fit:
- API endpoint latency (use load testing - k6, artillery)
- Database query performance (use query analysis tools)
- Frontend rendering (use browser profiling)
- One-off operations (startup time, migrations)
Rule of thumb: If it runs millions of times per day across all users, benchmark it.
Commands
pnpm --filter=@n8n/performance bench # Run benchmarks
pnpm --filter=@n8n/performance bench:baseline # Save baseline for local comparison
pnpm --filter=@n8n/performance bench:compare # Compare against baseline (>10% = fail)
CI Regression Detection
Benchmarks run automatically on PRs that touch packages/testing/performance/** or packages/workflow/src/**. CodSpeed counts CPU instructions instead of wall-clock time, producing deterministic results regardless of runner load. It comments on PRs with results and regression warnings.
You can also trigger benchmarks manually for any branch via Actions > Test: Benchmarks > Run workflow.
Local vs CI
Local (bench) |
CI | |
|---|---|---|
| Measurement | Wall-clock time (Hz, ms) | CPU instruction count |
| Noise | 15-30% variance | Near-zero variance |
| Best for | Quick sanity checks, comparing approaches | Automated regression detection |
Local benchmarks are useful for eyeballing performance during development. Use bench:baseline + bench:compare for before/after comparisons on the same machine in the same session.
Adding a Benchmark
// benchmarks/my-feature/thing.bench.ts
import { bench, describe } from 'vitest';
// Setup runs once, not measured
const data = createTestData();
describe('My Feature', () => {
bench('operation name', () => {
doTheThing(data);
});
});
Reading Results
name hz min max mean p99 rme samples
my operation 20,000 0.04 0.20 0.05 0.10 ±0.5% 10000
| Column | Meaning |
|---|---|
| hz | Operations per second (higher = faster) |
| mean | Average time per operation in ms |
| p99 | 99th percentile - worst case latency |
| rme | Margin of error - lower = more reliable |
| samples | Number of iterations run |
Current Benchmarks
| Area | What it measures | Why it matters |
|---|---|---|
| Expression Engine | ={{ }} evaluation speed |
Runs for every node parameter |
Tips
- Keep benchmarks focused - one thing per bench, not workflows
- Use realistic data sizes - 100 items is typical, 10k is stress test
- Compare approaches - benchmark both before deciding
- Don't over-benchmark - only critical hot paths need this