Add comprehensive QA and testing framework (52→56 skills)

Introduces a full shift-left QA pipeline with Story Type classification
as the backbone of the Definition of Done:

New skills:
- /test-setup: scaffold test framework + CI/CD per engine (Godot/Unity/Unreal)
- /qa-plan: generate sprint test plan classifying stories by type
- /smoke-check: critical path gate (PASS/PASS WITH WARNINGS/FAIL) before QA hand-off
- /team-qa: orchestrate qa-lead + qa-tester through full QA cycle

Story Type classification (Logic/Integration/Visual/Feel/UI/Config/Data):
- Logic and Integration: BLOCKING DoD gate — unit/integration test required
- Visual/Feel and UI: ADVISORY — screenshot + sign-off evidence required
- Config/Data: ADVISORY — smoke check pass sufficient

Updated skills: story-done (test evidence gate), story-readiness (Story Type
check), gate-check (test framework at Technical Setup, test evidence at
Polish/Release), create-epics-stories (Type field + Test Evidence section)

Updated agents: qa-lead (shift-left philosophy + evidence table),
qa-tester (automated test patterns for Godot/Unity/Unreal)

New templates: test-evidence.md (manual sign-off record), test-plan.md
(sprint-oriented QA plan replacing generic feature template)

Updated coding-standards.md: Testing Standards section with DoD table,
test rules, what NOT to automate, and engine-specific CI/CD commands

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Donchitos 2026-03-16 13:48:32 +11:00
parent a2f8ed93ff
commit 168ac96c3a
13 changed files with 1704 additions and 87 deletions


@@ -10,7 +10,10 @@ memory: project
You are the QA Lead for an indie game project. You ensure the game meets
quality standards through systematic testing, bug tracking, and release
readiness evaluation. You practice **shift-left testing** — QA is involved
from the start of each sprint, not just at the end. Testing is a **hard part
of the Definition of Done**: no story is Complete without appropriate test
evidence.
### Collaboration Protocol
@@ -62,22 +65,62 @@ Before writing any code:
- Rules are your friend -- when they flag issues, they're usually right
- Tests prove it works -- offer to write them proactively
### Story Type → Test Evidence Requirements
Every story has a type that determines what evidence is required before it can be marked Done:
| Story Type | Required Evidence | Gate Level |
|---|---|---|
| **Logic** (formulas, AI, state machines) | Automated unit test in `tests/unit/[system]/` | BLOCKING |
| **Integration** (multi-system interaction) | Integration test OR documented playtest | BLOCKING |
| **Visual/Feel** (animation, VFX, feel) | Screenshot + lead sign-off in `production/qa/evidence/` | ADVISORY |
| **UI** (menus, HUD, screens) | Manual walkthrough doc OR interaction test | ADVISORY |
| **Config/Data** (balance, data files) | Smoke check pass | ADVISORY |
**Your role in this system:**
- Classify story types when creating QA plans (if not already classified in the story file)
- Flag Logic/Integration stories missing test evidence as blockers before sprint review
- Accept Visual/Feel/UI stories with documented manual evidence as "Done"
- Run or verify `/smoke-check` passes before any build goes to manual QA
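The gate logic in the table above can be sketched in a few lines. This is a hypothetical Python helper, not part of the shipped skills — the function name, constant name, and return strings are illustrative only:

```python
# Hypothetical sketch of the Story Type -> evidence gate. Story types,
# evidence kinds, and gate levels come from the table above; the names
# here are illustrative, not part of the actual /story-done skill.

EVIDENCE_RULES = {
    "Logic":       {"evidence": "unit test in tests/unit/[system]/", "gate": "BLOCKING"},
    "Integration": {"evidence": "integration test OR documented playtest", "gate": "BLOCKING"},
    "Visual/Feel": {"evidence": "screenshot + lead sign-off", "gate": "ADVISORY"},
    "UI":          {"evidence": "manual walkthrough doc OR interaction test", "gate": "ADVISORY"},
    "Config/Data": {"evidence": "smoke check pass", "gate": "ADVISORY"},
}

def check_story_done(story_type: str, has_evidence: bool) -> str:
    """Return the Done verdict for a story given its type and evidence status."""
    rule = EVIDENCE_RULES[story_type]
    if has_evidence:
        return "DONE"
    # Missing evidence blocks Logic/Integration stories; elsewhere it only warns.
    if rule["gate"] == "BLOCKING":
        return f"BLOCKED: missing {rule['evidence']}"
    return f"DONE WITH WARNING: missing {rule['evidence']}"
```

The key design point is that ADVISORY types still surface a warning — evidence is always expected, but only BLOCKING types stop the story.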
### QA Workflow Integration
**Your skills to use:**
- `/qa-plan [sprint]` — generate test plan from story types at sprint start
- `/smoke-check` — run before every QA hand-off
- `/team-qa [sprint]` — orchestrate full QA cycle
**When you get involved:**
- Sprint planning: Review story types and flag missing test strategies
- Mid-sprint: Check that Logic stories have test files as they are implemented
- Pre-QA gate: Run `/smoke-check`; block hand-off if it fails
- QA execution: Direct qa-tester through manual test cases
- Sprint review: Produce sign-off report with open bug list
**What shift-left means for you:**
- Review story acceptance criteria before implementation starts (`/story-readiness`)
- Flag untestable criteria (e.g., "feels good" without a benchmark) before the sprint begins
- Don't wait until the end to find that a Logic story has no tests
### Key Responsibilities
1. **Test Strategy & QA Planning**: At sprint start, classify stories by type,
   identify what needs automated vs. manual testing, and produce the QA plan.
2. **Test Evidence Gate**: Ensure Logic/Integration stories have test files before
   marking Complete. This is a hard gate, not a recommendation.
3. **Smoke Check Ownership**: Run `/smoke-check` before every build goes to manual QA.
   A failed smoke check means the build is not ready — period.
4. **Test Plan Creation**: For each feature and milestone, create test plans
   covering functional testing, edge cases, regression, performance, and
   compatibility.
5. **Bug Triage**: Evaluate bug reports for severity, priority, reproducibility,
   and assignment. Maintain a clear bug taxonomy.
6. **Regression Management**: Maintain a regression test suite that covers
   critical paths. Ensure regressions are caught before they reach milestones.
7. **Release Quality Gates**: Define and enforce quality gates for each
   milestone: crash rate, critical bug count, performance benchmarks, feature
   completeness.
8. **Playtest Coordination**: Design playtest protocols, create questionnaires,
   and analyze playtest feedback for actionable insights.
### Bug Severity Definitions


@@ -8,7 +8,9 @@ maxTurns: 10
You are a QA Tester for an indie game project. You write thorough test cases
and detailed bug reports that enable efficient bug fixing and prevent
regressions. You also write automated test stubs and understand
engine-specific test patterns — when a story needs a GDScript/C#/C++ test
file, you can scaffold it.
### Collaboration Protocol
@@ -60,19 +62,99 @@ Before writing any code:
- Rules are your friend — when they flag issues, they're usually right
- Tests prove it works — offer to write them proactively
### Automated Test Writing
For Logic and Integration stories, you write the test file (or scaffold it for the developer to complete).
**Test naming convention**: `[system]_[feature]_test.[ext]`
**Test function naming**: `test_[scenario]_[expected]`
**Pattern per engine:**
#### Godot (GDScript / GdUnit4)
```gdscript
extends GdUnitTestSuite


func test_[scenario]_[expected]() -> void:
    # Arrange
    var subject = [ClassName].new()
    # Act
    var result = subject.[method]([args])
    # Assert
    assert_that(result).is_equal([expected])
```
#### Unity (C# / NUnit)
```csharp
using NUnit.Framework;

[TestFixture]
public class [SystemName]Tests
{
    [Test]
    public void [Scenario]_[Expected]()
    {
        // Arrange
        var subject = new [ClassName]();
        // Act
        var result = subject.[Method]([args]);
        // Assert
        Assert.AreEqual([expected], result, delta: 0.001f);
    }
}
```
#### Unreal (C++)
```cpp
#include "Misc/AutomationTest.h"

IMPLEMENT_SIMPLE_AUTOMATION_TEST(
    F[SystemName]Test,
    "MyGame.[System].[Scenario]",
    EAutomationTestFlags::GameFilter
)

bool F[SystemName]Test::RunTest(const FString& Parameters)
{
    // Arrange + Act
    [ClassName] Subject;
    float Result = Subject.[Method]([args]);
    // Assert
    TestEqual("[description]", Result, [expected]);
    return true;
}
```
**What to test for every Logic story formula:**
1. Normal case (typical inputs → expected output)
2. Zero/null input (should not crash; minimum output)
3. Maximum values (should not overflow or produce infinity)
4. Negative modifiers (if applicable)
5. Edge case from GDD (any specific edge case mentioned in the GDD)
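As a concrete illustration of the five cases, here is what the checklist might look like for a hypothetical damage formula. The formula, its clamp range, and every value below are invented for the example (Python for brevity — real tests would use the engine patterns above):

```python
# Illustrative only: a made-up formula and the five checklist cases.
# Real tests live in tests/unit/[system]/ using the engine's framework.

def damage(base: float, multiplier: float) -> float:
    """Damage formula with the multiplier clamped to [0.5, 3.0]."""
    clamped = max(0.5, min(3.0, multiplier))
    return max(0.0, base * clamped)

def test_normal_case_returns_product():              # 1. Normal case
    assert damage(10.0, 2.0) == 20.0

def test_zero_input_gives_minimum_output():          # 2. Zero/null input
    assert damage(0.0, 1.0) == 0.0

def test_maximum_values_do_not_overflow():           # 3. Maximum values
    assert damage(1e6, 100.0) == 3e6                 # multiplier clamped to 3.0

def test_negative_modifier_never_goes_below_zero():  # 4. Negative modifiers
    assert damage(-10.0, 1.0) == 0.0

def test_edge_case_multiplier_floor():               # 5. GDD-specified edge case
    assert damage(10.0, 0.1) == 5.0                  # clamped up to 0.5
```

Note how each test name follows the `test_[scenario]_[expected]` convention.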
### Key Responsibilities
1. **Test File Scaffolding**: For Logic/Integration stories, write or scaffold
   the automated test file. Don't wait to be asked — offer to write it when
   implementing a Logic story.
2. **Formula Test Generation**: Read the Formulas section of the GDD and generate
   test cases covering all formula edge cases automatically.
3. **Test Case Writing**: Write detailed test cases with preconditions, steps,
   expected results, and actual results fields. Cover happy path, edge cases,
   and error conditions.
4. **Bug Report Writing**: Write bug reports with reproduction steps, expected
   vs. actual behavior, severity, frequency, environment, and supporting
   evidence (logs, screenshots described).
5. **Regression Checklists**: Create and maintain regression checklists for
   each major feature and system. Update after every bug fix.
6. **Smoke Test Lists**: Maintain the `tests/smoke/` directory with critical path
   test cases. These are the 10-15 scenarios that run in the `/smoke-check` gate
   before any build goes to manual QA.
7. **Test Coverage Tracking**: Track which features and code paths have test
   coverage and identify gaps.
### Bug Report Format


@@ -23,3 +23,43 @@
7. **Tuning Knobs** -- configurable values identified
8. **Acceptance Criteria** -- testable success conditions
- Balance values must link to their source formula or rationale
# Testing Standards
## Test Evidence by Story Type
All stories must have appropriate test evidence before they can be marked Done:
| Story Type | Required Evidence | Location | Gate Level |
|---|---|---|---|
| **Logic** (formulas, AI, state machines) | Automated unit test — must pass | `tests/unit/[system]/` | BLOCKING |
| **Integration** (multi-system) | Integration test OR documented playtest | `tests/integration/[system]/` | BLOCKING |
| **Visual/Feel** (animation, VFX, feel) | Screenshot + lead sign-off | `production/qa/evidence/` | ADVISORY |
| **UI** (menus, HUD, screens) | Manual walkthrough doc OR interaction test | `production/qa/evidence/` | ADVISORY |
| **Config/Data** (balance tuning) | Smoke check pass | `production/qa/smoke-[date].md` | ADVISORY |
## Automated Test Rules
- **Naming**: `[system]_[feature]_test.[ext]` for files; `test_[scenario]_[expected]` for functions
- **Determinism**: Tests must produce the same result every run — no random seeds, no time-dependent assertions
- **Isolation**: Each test sets up and tears down its own state; tests must not depend on execution order
- **No hardcoded data**: Test fixtures use constant files or factory functions, not inline magic numbers
(exception: boundary value tests where the exact number IS the point)
- **Independence**: Unit tests do not call external APIs, databases, or file I/O — use dependency injection
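The determinism and isolation rules can be seen together in one small sketch — hypothetical code, in Python for brevity: randomness is injected as a dependency rather than pulled from global state, and each test constructs its own subject and fixture data.

```python
import random

# Hypothetical example of the determinism rule: the system under test takes
# its RNG as a dependency, so tests inject a seeded one. No global random
# state, no time-dependent assertions.
class LootRoller:
    def __init__(self, rng: random.Random):
        self.rng = rng  # injected, never random.random() internally

    def roll(self, table: list) -> str:
        return table[self.rng.randrange(len(table))]

def test_roll_is_deterministic_with_seeded_rng():
    # Isolation: this test builds its own subject and fixture; it depends
    # on no other test having run first.
    table = ["sword", "shield", "potion"]
    first = LootRoller(random.Random(42)).roll(table)
    second = LootRoller(random.Random(42)).roll(table)
    assert first == second  # same seed, same result, every run
```

The same shape applies to injected clocks: a test that asserts on elapsed time should receive a fake clock, not read the real one.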
## What NOT to Automate
- Visual fidelity (shader output, VFX appearance, animation curves)
- "Feel" qualities (input responsiveness, perceived weight, timing)
- Platform-specific rendering (test on target hardware, not headlessly)
- Full gameplay sessions (covered by playtesting, not automation)
## CI/CD Rules
- Automated test suite runs on every push to main and every PR
- No merge if tests fail — tests are a blocking gate in CI
- Never disable or skip failing tests to make CI pass — fix the underlying issue
- Engine-specific CI commands:
- **Godot**: `godot --headless --script tests/gdunit4_runner.gd`
- **Unity**: `game-ci/unity-test-runner@v4` (GitHub Actions)
- **Unreal**: headless runner with `-nullrhi` flag
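A CI entry script might dispatch on engine like the sketch below. This is an assumption-laden illustration: the Godot command is the one listed above, but the `UnrealEditor-Cmd` invocation, project name, and flags are hypothetical placeholders, and Unity is excluded because it runs through the `game-ci/unity-test-runner@v4` action rather than a local binary.

```python
# Hypothetical CI helper assembling a per-engine test command from the list
# above. The Unreal arguments and "MyGame" project name are placeholders,
# not a verified configuration.

def build_test_command(engine: str) -> list:
    if engine == "godot":
        return ["godot", "--headless", "--script", "tests/gdunit4_runner.gd"]
    if engine == "unreal":
        # -nullrhi disables rendering so the automation tests run headlessly.
        return ["UnrealEditor-Cmd", "MyGame.uproject", "-nullrhi",
                "-unattended", "-ExecCmds=Automation RunTests MyGame; Quit"]
    if engine == "unity":
        raise ValueError("Unity tests run via game-ci/unity-test-runner@v4 in CI")
    raise ValueError(f"unknown engine: {engine}")
```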

`.claude/docs/templates/test-evidence.md` (new file, 86 lines)

@@ -0,0 +1,86 @@
# Test Evidence: [Story Title]
> **Story**: `[path to story file]`
> **Story Type**: [Visual/Feel | UI]
> **Date**: [date]
> **Tester**: [who performed the test]
> **Build / Commit**: [version or git hash]
---
## What Was Tested
[One paragraph describing the feature or behaviour that was validated. Include
the acceptance criteria numbers from the story that this evidence covers.]
**Acceptance criteria covered**: [AC-1, AC-2, AC-3]
---
## Acceptance Criteria Results
| # | Criterion (from story) | Result | Notes |
|---|----------------------|--------|-------|
| AC-1 | [exact criterion text] | PASS / FAIL | [any observations] |
| AC-2 | [exact criterion text] | PASS / FAIL | |
| AC-3 | [exact criterion text] | PASS / FAIL | |
---
## Screenshots / Video
List all captured evidence below. Store files in the same directory as this
document or in `production/qa/evidence/[story-slug]/`.
| # | Filename | What It Shows | Acceptance Criterion |
|---|----------|--------------|----------------------|
| 1 | `[filename.png]` | [brief description of what is visible] | AC-1 |
| 2 | `[filename.png]` | | AC-2 |
*If video: note the timestamp and what it demonstrates.*
---
## Test Conditions
- **Game state at start**: [e.g., "fresh save, player at level 1, no items"]
- **Platform / hardware**: [e.g., "Windows 11, GTX 1080, 1080p"]
- **Framerate during test**: [e.g., "stable 60fps" or "~45fps — within budget"]
- **Any special setup required**: [e.g., "dev menu used to trigger specific state"]
---
## Observations
[Anything noteworthy that didn't cause a FAIL but should be recorded. Examples:
minor visual jitter, frame dip under load, behaviour that technically passes
but felt slightly off. These become candidates for polish work.]
- [Observation 1]
- [Observation 2]
If nothing notable: *No significant observations.*
---
## Sign-Off
All three sign-offs are required before the story can be marked COMPLETE via
`/story-done`. Visual/Feel stories require the designer or art-lead sign-off.
UI stories require the UX lead or designer sign-off.
| Role | Name | Date | Signature |
|------|------|------|-----------|
| Developer (implemented) | | | [ ] Approved |
| Designer / Art Lead / UX Lead | | | [ ] Approved |
| QA Lead | | | [ ] Approved |
**Any sign-off can be marked "Deferred — [reason]"** if the person is
unavailable. Deferred sign-offs must be resolved before the story advances
past the sprint review.
---
*Template: `.claude/docs/templates/test-evidence.md`*
*Used for: Visual/Feel and UI story type evidence records*
*Location: `production/qa/evidence/[story-slug]-evidence.md`*


@@ -1,97 +1,144 @@
# QA Plan: [Sprint/Feature Name]

> **Date**: [date]
> **Generated by**: /qa-plan
> **Scope**: [N stories across N systems]
> **Engine**: [engine name and version]
> **Sprint file**: [path to sprint plan]

---

## Story Coverage Summary

| Story | Type | Automated Test Required | Manual Verification Required |
|-------|------|------------------------|------------------------------|
| [story title] | Logic | Unit test — `tests/unit/[system]/` | None |
| [story title] | Integration | Integration test — `tests/integration/[system]/` | Smoke check |
| [story title] | Visual/Feel | None (not automatable) | Screenshot + lead sign-off |
| [story title] | UI | None (not automatable) | Manual step-through |
| [story title] | Config/Data | Data validation (optional) | Spot-check in-game values |

**Totals**: [N] Logic, [N] Integration, [N] Visual/Feel, [N] UI, [N] Config/Data

---

## Automated Tests Required

### [Story Title] — Logic

**Test file path**: `tests/unit/[system]/[story-slug]_test.[ext]`

**What to test**:
- [Formula or rule from GDD Formulas section — e.g., "damage = base * multiplier where multiplier ∈ [0.5, 3.0]"]
- [Each named state transition]
- [Each side effect that should / should not occur]

**Edge cases to cover**:
- Zero / minimum input values
- Maximum / boundary input values
- Invalid or null input
- [GDD-specified edge cases]

**Estimated test count**: ~[N] unit tests

---

### [Story Title] — Integration

**Test file path**: `tests/integration/[system]/[story-slug]_test.[ext]`

**What to test**:
- [Cross-system interaction — e.g., "applying buff updates CharacterStats and triggers UI refresh"]
- [Round-trip — e.g., "save → load restores all fields"]

---

## Manual QA Checklist

### [Story Title] — Visual/Feel

**Verification method**: Screenshot + [designer / art-lead] sign-off
**Evidence file**: `production/qa/evidence/[story-slug]-evidence.md`
**Who must sign off**: [designer / lead-programmer / art-lead]

- [ ] [Specific observable condition — e.g., "hit flash appears on frame of impact, not the frame after"]
- [ ] [Another falsifiable condition]

### [Story Title] — UI

**Verification method**: Manual step-through
**Evidence file**: `production/qa/evidence/[story-slug]-evidence.md`

- [ ] [Every acceptance criterion translated into a manual check item]

---

## Smoke Test Scope

Critical paths to verify before QA hand-off (run via `/smoke-check`):

1. Game launches to main menu without crash
2. New game / session can be started
3. [Primary mechanic introduced or changed this sprint]
4. [System with regression risk from this sprint's changes]
5. Save / load cycle completes without data loss (if save system exists)
6. Performance is within budget on target hardware

---

## Playtest Requirements

| Story | Playtest Goal | Min Sessions | Target Player Type |
|-------|--------------|--------------|-------------------|
| [story] | [What question must be answered?] | [N] | [new player / experienced / etc.] |

Sign-off requirement: Playtest notes → `production/session-logs/playtest-[sprint]-[story-slug].md`

If no playtest sessions are required, write: *No playtest sessions required for this sprint.*

---

## Definition of Done — This Sprint

A story is DONE when ALL of the following are true:

- [ ] All acceptance criteria verified — automated test result OR documented manual evidence
- [ ] Test file exists for all Logic and Integration stories and passes
- [ ] Manual evidence document exists for all Visual/Feel and UI stories
- [ ] Smoke check passes (run `/smoke-check sprint` before QA hand-off)
- [ ] No regressions introduced — previous sprint's features still pass
- [ ] Code reviewed (via `/code-review` or documented peer review)
- [ ] Story file updated to `Status: Complete` via `/story-done`

**Stories requiring playtest sign-off before close**: [list, or "None"]

---

## Test Results

*Fill in after testing is complete.*

| Story | Automated | Manual | Result | Notes |
|-------|-----------|--------|--------|-------|
| [title] | PASS | — | PASS | |
| [title] | — | PASS | PASS | |
| [title] | FAIL | — | BLOCKED | [describe failure] |

---

## Bugs Found

| ID | Story | Severity | Description | Status |
|----|-------|----------|-------------|--------|
| BUG-001 | | S[1-4] | | Open |

---

## Sign-Off

- **QA Tester**: [name] — [date]
- **QA Lead**: [name] — [date]
- **Sprint Owner**: [name] — [date]

*Template: `.claude/docs/templates/test-plan.md`*
*Generated by: `/qa-plan` — do not edit this line*


@@ -151,6 +151,19 @@ For each epic, decompose the GDD's acceptance criteria into stories:
3. Each group = one story
4. Order stories within the epic: foundation behaviour first, edge cases last
**Story Type Classification** — assign each story a type based on its acceptance criteria:
| Story Type | Assign when criteria reference... |
|---|---|
| **Logic** | Formulas, numerical thresholds, state transitions, AI decisions, calculations |
| **Integration** | Two or more systems interacting, signals crossing boundaries, save/load round-trips |
| **Visual/Feel** | Animation behaviour, VFX, "feels responsive", timing, screen shake, audio sync |
| **UI** | Menus, HUD elements, buttons, screens, dialogue boxes, tooltips |
| **Config/Data** | Balance tuning values, data file changes only — no new code logic |
Mixed stories: assign the type that carries the highest implementation risk and note the secondary type.
The Story Type determines what evidence is required before `/story-done` can mark the story Complete.
For each story, map:
- **GDD requirement**: Which specific acceptance criterion does this satisfy?
- **TR-ID**: Look up the matching entry in `tr-registry.yaml` by normalizing the
@@ -179,6 +192,7 @@ For each story, produce a story file embedding full context:
> **Epic**: [epic name]
> **Status**: Ready
> **Layer**: [Foundation / Core / Feature / Presentation]
> **Type**: [Logic | Integration | Visual/Feel | UI | Config/Data]
> **Manifest Version**: [date from control-manifest.md header — or "N/A" if manifest not yet created]
## Context
@@ -232,6 +246,19 @@ This boundary prevents scope creep and keeps stories independently reviewable.
---
## Test Evidence
**Required evidence** (based on Story Type):
- Logic: `tests/unit/[system]/[story-slug]_test.[ext]` — must exist and pass
- Integration: `tests/integration/[system]/[story-slug]_test.[ext]` OR playtest doc
- Visual/Feel: `production/qa/evidence/[story-slug]-evidence.md` + sign-off
- UI: `production/qa/evidence/[story-slug]-evidence.md` or interaction test
- Config/Data: smoke check pass (`production/qa/smoke-*.md`)
**Status**: [ ] Not yet created
---
## Dependencies
- Depends on: [Story NNN-1 must be DONE, or "None"]
@@ -313,6 +340,8 @@ After approval, write:
This epic is complete when:
- All stories are implemented and reviewed
- All acceptance criteria from [GDD filename] are passing
- All Logic and Integration stories have passing test files in `tests/`
- All Visual/Feel and UI stories have evidence docs with sign-off in `production/qa/evidence/`
- No Foundation or Core layer stories have open blockers
```


@@ -82,6 +82,9 @@ The project progresses through these stages:
- [ ] At least 3 Architecture Decision Records in `docs/architecture/` covering
Foundation-layer systems (scene management, event architecture, save/load)
- [ ] Engine reference docs exist in `docs/engine-reference/[engine]/`
- [ ] Test framework initialized: `tests/unit/` and `tests/integration/` directories exist
- [ ] CI/CD test workflow exists at `.github/workflows/tests.yml` (or equivalent)
- [ ] At least one example test file exists to confirm the framework is functional
- [ ] Master architecture document exists at `docs/architecture/architecture.md`
- [ ] Architecture traceability index exists at `docs/architecture/architecture-traceability.md`
- [ ] `/architecture-review` has been run (a review report file exists in `docs/architecture/`)
@@ -161,7 +164,9 @@ The project progresses through these stages:
- [ ] `src/` has active code organized into subsystems
- [ ] All core mechanics from GDD are implemented (cross-reference `design/gdd/` with `src/`)
- [ ] Main gameplay path is playable end-to-end
- [ ] Test files exist in `tests/unit/` and `tests/integration/` covering Logic and Integration stories
- [ ] All Logic stories from this sprint have corresponding unit test files in `tests/unit/`
- [ ] Smoke check has been run with a PASS or PASS WITH WARNINGS verdict — report exists in `production/qa/`
- [ ] At least 3 distinct playtest sessions documented in `production/playtests/`
- [ ] Playtest reports cover: new player experience, mid-game systems, and difficulty curve
- [ ] Fun hypothesis from Game Concept has been explicitly validated or revised
@@ -186,7 +191,11 @@ The project progresses through these stages:
- [ ] All features from milestone plan are implemented
- [ ] Content is complete (all levels, assets, dialogue referenced in design docs exist)
- [ ] Localization strings are externalized (no hardcoded player-facing text in `src/`)
- [ ] QA test plan exists (`/qa-plan` output in `production/qa/`)
- [ ] QA sign-off report exists (`/team-qa` output — APPROVED or APPROVED WITH CONDITIONS)
- [ ] All Must Have story test evidence is present (Logic/Integration: test files pass; Visual/Feel/UI: sign-off docs in `production/qa/evidence/`)
- [ ] Smoke check passes cleanly (PASS verdict) on the release candidate build
- [ ] No test regressions from previous sprint (test suite passes fully)
- [ ] Balance data has been reviewed (`/balance-check` run)
- [ ] Release checklist completed (`/release-checklist` or `/launch-checklist` run)
- [ ] Store metadata prepared (if applicable)
@@ -304,6 +313,8 @@ Based on the verdict, suggest specific next steps:
- **No interaction pattern library?** → `/ux-design patterns` to initialize it
- **GDDs not cross-reviewed?** → `/review-all-gdds` (run after all MVP GDDs are individually approved)
- **Cross-GDD consistency issues?** → fix flagged GDDs, then re-run `/review-all-gdds`
- **No test framework?** → `/test-setup` to scaffold the framework for your engine
- **No QA plan for current sprint?** → `/qa-plan sprint` to generate one before implementation begins
- **Missing ADRs?** → `/architecture-decision` for individual decisions
- **No master architecture doc?** → `/create-architecture` for the full blueprint
- **ADRs missing engine compatibility sections?** → Re-run `/architecture-decision`


@@ -0,0 +1,260 @@
---
name: qa-plan
description: "Generate a QA test plan for a sprint or feature. Reads GDDs and story files, classifies stories by test type (Logic/Integration/Visual/UI), and produces a structured test plan covering automated tests required, manual test cases, smoke test scope, and playtest sign-off requirements. Run before sprint begins or when starting a major feature."
argument-hint: "[sprint | feature: system-name | story: path]"
user-invocable: true
allowed-tools: Read, Glob, Grep, Write
context: fork
agent: qa-lead
---
# QA Plan
This skill generates a structured QA plan for a sprint, feature, or individual
story. It reads all in-scope story files and their referenced GDDs, classifies
each story by test type, and produces a plan that tells developers exactly what
to automate, what to verify manually, what the smoke test scope is, and when
to bring in a playtester.
Run this before a sprint begins so the team knows upfront what testing work
is required. A test plan written after implementation is a post-mortem, not a
plan.
**Output:** `production/qa/qa-plan-[sprint-slug]-[date].md`
---
## Phase 1: Parse Scope
**Argument:** `$ARGUMENTS` (blank = ask user via AskUserQuestion)
Determine scope from the argument:
- **`sprint`** — read the most recent file in `production/sprints/`, extract
every story file path referenced. If `production/sprint-status.yaml` exists,
use it as the primary story list and fall back to the sprint plan for story
metadata.
- **`feature: [system-name]`** — glob `production/epics/*/story-*.md`, filter
to stories whose file path or title contains the system name. Also check the
epic index file (`EPIC.md`) in that system's directory.
- **`story: [path]`** — validate that the path exists and load that single file.
- **No argument** — use `AskUserQuestion`:
- "What is the scope for this QA plan?"
- Options: "Current sprint", "Specific feature (enter system name)",
"Specific story (enter path)", "Full epic"
After resolving scope, report: "Building QA plan for [N] stories in [scope]."
If a story file path is referenced but the file does not exist, note it as
MISSING and continue with the remaining stories. Do not fail the entire plan
for one missing file.
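The argument handling above amounts to a small dispatcher. The skill itself runs as agent instructions rather than code, so this Python sketch is purely illustrative; the argument forms mirror the skill's `argument-hint`:

```python
# Hypothetical sketch of Phase 1 scope parsing. Argument forms follow the
# skill's argument-hint: "sprint", "feature: system-name", "story: path".

def parse_scope(argument: str):
    """Return (scope_kind, detail) for a /qa-plan argument."""
    arg = argument.strip()
    if not arg:
        return ("ask_user", None)      # no argument -> AskUserQuestion
    if arg == "sprint":
        return ("sprint", None)        # read latest file in production/sprints/
    if arg.startswith("feature:"):
        return ("feature", arg.split(":", 1)[1].strip())
    if arg.startswith("story:"):
        return ("story", arg.split(":", 1)[1].strip())
    return ("ask_user", None)          # unrecognized -> fall back to asking
```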
---
## Phase 2: Load Inputs
For each in-scope story file, read the full file and extract:
- **Story title** and story ID (from filename or header)
- **Story Type** field (if present in the file header — e.g., `Type: Logic`)
- **Acceptance criteria** — the complete numbered/bulleted list
- **Implementation files** — listed under "Files to Create / Modify" or similar
- **Engine notes** — any engine API warnings or version-specific notes
- **GDD reference** — the GDD path(s) cited
- **ADR reference** — the ADR(s) cited
- **Estimate** — hours or story points if present
- **Dependencies** — other stories this one depends on
After reading stories, load supporting context once (not per story):
- `design/gdd/systems-index.md` — to understand system priorities and which
GDDs are approved
- For each unique GDD referenced across all stories: read only the
**Acceptance Criteria** and **Formulas** sections. Do not load full GDD text —
these two sections contain the testable requirements and the math to verify.
- `docs/architecture/control-manifest.md` — scan for forbidden patterns that
automated tests should guard against (if the file exists)
If no GDD is referenced in a story, note it as a gap but do not block the plan.
The story will be classified using acceptance criteria alone.
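The per-story extraction boils down to pulling labelled header fields out of markdown. A rough sketch, under the assumption that story headers use the `> **Field**: value` blockquote form shown in the story template — a real story file may vary, so the regex is illustrative:

```python
import re

# Rough sketch of Phase 2 field extraction. Assumes "> **Field**: value"
# blockquote header lines as in the story template.
FIELD_RE = re.compile(r"^>\s*\*\*(?P<name>[^*]+)\*\*:\s*(?P<value>.+)$")

def extract_header_fields(story_markdown: str) -> dict:
    fields = {}
    for line in story_markdown.splitlines():
        m = FIELD_RE.match(line.strip())
        if m:
            fields[m.group("name").strip()] = m.group("value").strip()
    return fields

# Example story header (invented for illustration):
story = """# Story: Damage Formula
> **Epic**: combat
> **Status**: Ready
> **Type**: Logic
"""
fields = extract_header_fields(story)
# fields["Type"] is "Logic"; a missing field simply won't be in the dict,
# which is what triggers inference from acceptance criteria in Phase 3.
```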
---
## Phase 3: Classify Each Story
For each story, assign a Story Type. If the story already has a `Type:` field
in its header, use that value and validate it against the criteria below. If the
field is missing or ambiguous, infer the type from the acceptance criteria.
| Story Type | Classification Indicators |
|---|---|
| **Logic** | Acceptance criteria reference calculations, formulas, numerical thresholds, state transitions, AI decisions, data validation, buff/debuff stacking, economy transactions, or any testable computation |
| **Integration** | Criteria involve two or more systems interacting, signals or events propagating across system boundaries, save/load round-trips, network sync, or persistence |
| **Visual/Feel** | Criteria reference animation behaviour, VFX, shader output, "feels responsive", perceived timing, screen shake, particle effects, audio sync, or visual feedback quality |
| **UI** | Criteria reference menus, HUD elements, buttons, screens, dialogue boxes, inventory panels, tooltips, or any player-facing interface element |
| **Config/Data** | Changes are limited to balance tuning values, data files, or configuration — no new code logic is involved |
**Mixed stories** (e.g., a story that adds both a formula and a UI display):
assign the primary type based on which acceptance criteria carry the highest
implementation risk, and note the secondary type. Mixed Logic+Integration or
Visual+UI combinations are the most common.
After classifying all stories, produce a classification summary table in
conversation before proceeding to Phase 4. This gives the user visibility into
how tests will be allocated.
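A first-pass keyword heuristic for the inference step might look like the following. The keyword lists are assumptions loosely drawn from the indicator table, not an authoritative classifier; ambiguous stories still need judgment against the full criteria:

```shell
# Sketch: keyword-based first-pass classification (keyword lists are assumptions)
criteria="Damage is computed as base * crit_multiplier on a critical hit"
case "$criteria" in
  *formula*|*multiplier*|*threshold*|*stacking*) type="Logic" ;;
  *signal*|*save/load*|*sync*|*persistence*)     type="Integration" ;;
  *menu*|*HUD*|*button*|*tooltip*)               type="UI" ;;
  *animation*|*VFX*|*particle*|*"screen shake"*) type="Visual/Feel" ;;
  *)                                             type="Unclassified" ;;
esac
echo "$type"
# → Logic
```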
---
## Phase 4: Generate Test Plan
Assemble the full QA plan document. Use this structure:
````markdown
# QA Plan: [Sprint/Feature Name]
**Date**: [date]
**Generated by**: /qa-plan
**Scope**: [N stories across [N systems]]
**Engine**: [engine name from .claude/docs/technical-preferences.md, or "Not configured"]
**Sprint File**: [path to sprint plan if applicable]
---
## Test Summary
| Story | Type | Automated Test Required | Manual Verification Required |
|-------|------|------------------------|------------------------------|
| [story title] | Logic | Unit test — `tests/unit/[system]/` | None |
| [story title] | Integration | Integration test — `tests/integration/[system]/` | Smoke check |
| [story title] | Visual/Feel | None (not automatable) | Screenshot + lead sign-off |
| [story title] | UI | Interaction walkthrough | Manual step-through |
| [story title] | Config/Data | Data validation test | Spot-check in-game values |
---
## Automated Tests Required
### [Story Title] — [Type]
**Test file path**: `tests/[unit|integration]/[system]/[story-slug]_test.[ext]`
**What to test**:
- [Specific formula or rule from the GDD Formulas section]
- [Each named state transition or decision branch]
- [Each side effect that should or should not occur]
**Edge cases to cover**:
- Zero/minimum input values (e.g., 0 damage, empty inventory)
- Maximum/boundary input values (e.g., max level, stat cap)
- Invalid or null input (e.g., missing target, dead entity)
- [Any edge case explicitly called out in the GDD Edge Cases section]
**Estimated test count**: ~[N] unit tests
[If no GDD formula reference was found for this story, note:]
*No formula found in referenced GDD — test cases must be derived from acceptance
criteria directly. Review the GDD Formulas section before writing tests.*
---
## Manual QA Checklist
### [Story Title] — [Type]
**Verification method**: [Screenshot + designer sign-off | Playtest session |
Manual step-through | Comparison against reference footage]
**Who must sign off**: [designer / lead-programmer / qa-lead / art-lead]
**Evidence to capture**: [screenshot of X | video clip of Y | written playtest
notes | side-by-side comparison]
Checklist:
- [ ] [Specific observable condition — concrete and falsifiable]
- [ ] [Another condition]
- [ ] [Every acceptance criterion translated into a manual check item]
*If any criterion uses subjective language ("feels", "looks", "seems"), it must
be supplemented with a specific benchmark or a playtest protocol note.*
---
## Smoke Test Scope
Critical paths to verify before any QA hand-off for this sprint:
1. Game launches to main menu without crash
2. New game / new session can be started
3. [Primary mechanic introduced or changed this sprint]
4. [Any system with a regression risk from this sprint's changes]
5. Save / load cycle completes without data loss (if save system exists)
6. Performance is within budget on target hardware (no new frame spikes)
*Smoke tests are verified by the developer via `/smoke-check`. Reference this
list when running that skill.*
---
## Playtest Requirements
| Story | Playtest Goal | Min Sessions | Target Player Type |
|-------|--------------|--------------|-------------------|
| [story] | [What question must the session answer?] | [N] | [new player / experienced] |
**Sign-off requirement**: Playtest notes must be written to
`production/session-logs/playtest-[sprint]-[story-slug].md` and reviewed by
the [designer / qa-lead] before the story can be marked COMPLETE.
If no stories require playtest validation: *No playtest sessions required for
this sprint.*
---
## Definition of Done — This Sprint
A story is DONE when ALL of the following are true:
- [ ] All acceptance criteria verified — via automated test result OR documented
manual evidence (screenshot, video, or playtest notes with sign-off)
- [ ] Test file exists at the specified path for all Logic and Integration stories
- [ ] Manual evidence document exists for all Visual/Feel and UI stories
- [ ] Smoke check passes (run `/smoke-check sprint` before QA hand-off)
- [ ] No regressions introduced
- [ ] Code reviewed (via `/code-review` or documented peer review)
- [ ] Story file updated to `Status: Complete` (via `/story-done`)
````
When generating content, use the actual story titles, GDD formula text, and
acceptance criteria extracted in Phase 2. Do not use placeholder text — every
test entry should reflect the real requirements of these specific stories.
---
## Phase 5: Write Output
Show the complete plan in conversation (or a summary if the plan is very long),
then ask:
"May I write this QA plan to `production/qa/qa-plan-[sprint-slug]-[date].md`?"
Write the plan exactly as generated — do not truncate.
After writing:
"QA plan written to `production/qa/qa-plan-[sprint-slug]-[date].md`.
Next steps:
- Share this plan with the team before sprint implementation begins
- Run `/smoke-check sprint` after all stories are implemented to gate QA hand-off
- For Logic/Integration stories, create the test files at the listed paths
before marking stories done — `/story-done` checks for them"
---
## Collaborative Protocol
- **Never write the plan without asking** — Phase 5 requires explicit approval.
- **Classify conservatively**: when a story is ambiguous between Logic and
Integration, classify it as Integration — it requires both unit and
integration tests.
- **Do not invent test cases** beyond what acceptance criteria and GDD formulas
support. If a formula is absent from the GDD, flag it rather than guessing.
- **Playtest requirements are advisory**: the user decides whether a playtest
is warranted for borderline Visual/Feel stories. Flag the case; do not mandate.
- Use `AskUserQuestion` for scope selection when no argument is provided.
Keep all other phases non-interactive — present findings, then ask once to
approve the write.
@ -0,0 +1,338 @@
---
name: smoke-check
description: "Run the critical path smoke test gate before QA hand-off. Executes the automated test suite, verifies core functionality, and produces a PASS/FAIL report. Run after a sprint's stories are implemented and before manual QA begins. A failed smoke check means the build is not ready for QA."
argument-hint: "[sprint | quick]"
user-invocable: true
allowed-tools: Read, Glob, Grep, Bash, Write
---
# Smoke Check
This skill is the gate between "implementation done" and "ready for QA
hand-off". It runs the automated test suite, checks for test coverage gaps,
batch-verifies critical paths with the developer, and produces a PASS/FAIL
report.
The rule is simple: **a build that fails smoke check does not go to QA.**
Handing a broken build to QA wastes their time and demoralises the team.
**Output:** `production/qa/smoke-[date].md`
---
## Phase 1: Detect Test Setup
Before running anything, understand the environment:
1. **Test framework check**: verify `tests/` directory exists.
If it does not: "No test directory found at `tests/`. Run `/test-setup`
to scaffold the testing infrastructure, or create the directory manually
if tests live elsewhere." Then stop.
2. **CI check**: check whether `.github/workflows/` contains a workflow file
referencing tests. Note in the report whether CI is configured.
3. **Engine detection**: read `.claude/docs/technical-preferences.md` and
extract the `Engine:` value. Store this for test command selection in
Phase 2.
4. **Smoke test list**: check whether `production/qa/smoke-tests.md` or
`tests/smoke/` exists. If a smoke test list is found, load it for use in
Phase 4. If neither exists, smoke tests will be drawn from the current QA
plan (Phase 4 fallback).
5. **QA plan check**: glob `production/qa/qa-plan-*.md` and take the most
recently modified file. If found, note the path — it will be used in
Phase 3 and Phase 4. If not found, note: "No QA plan found. Run
`/qa-plan sprint` before smoke-checking for best results."
Report findings before proceeding: "Environment: [engine]. Test directory:
[found / not found]. CI configured: [yes / no]. QA plan: [path / not found]."
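The QA plan lookup in step 5 can be sketched as follows; the temp directory stands in for `production/qa/`, and the file names are illustrative:

```shell
# Sketch: take the most recently modified QA plan (glob pattern from step 5)
qa_dir=$(mktemp -d)   # stand-in for production/qa/
touch -t 202603011200 "$qa_dir/qa-plan-sprint-02-2026-03-01.md"
touch -t 202603161200 "$qa_dir/qa-plan-sprint-03-2026-03-16.md"
# ls -t sorts newest first; head -1 takes the most recent
qa_plan=$(ls -t "$qa_dir"/qa-plan-*.md 2>/dev/null | head -1)
basename "$qa_plan"
# → qa-plan-sprint-03-2026-03-16.md
```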
---
## Phase 2: Run Automated Tests
Attempt to run the test suite via Bash. Select the command based on the engine
detected in Phase 1:
**Godot 4:**
```bash
godot --headless --script tests/gdunit4_runner.gd 2>&1
```
If the GDUnit4 runner script does not exist at that path, try:
```bash
godot --headless -s addons/gdunit4/GdUnitRunner.gd 2>&1
```
If neither path exists, note: "GDUnit4 runner not found — confirm the runner
path for your test framework."
**Unity:**
Unity tests require the editor and cannot be run headlessly via shell in most
environments. Check for recent test result artifacts:
```bash
ls -t test-results/ 2>/dev/null | head -5
```
If test result files exist (XML or JSON), read the most recent one and parse
PASS/FAIL counts. If no artifacts exist: "Unity tests must be run from the
editor or CI pipeline. Please confirm test status manually before proceeding."
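One way to parse such an artifact: Unity's test runner emits NUnit3-style XML, so pass/fail counts can be read from the `<test-run>` element's attributes. The file name and counts below are illustrative:

```shell
# Sketch: pull counts from a Unity NUnit3 result file
# (attribute names total/passed/failed follow the NUnit3 result schema)
results=$(mktemp -d)   # stand-in for test-results/
cat > "$results/results-2026-03-16.xml" <<'EOF'
<test-run id="2" total="12" passed="11" failed="1" skipped="0" />
EOF
latest=$(ls -t "$results"/*.xml | head -1)
grep -o 'passed="[0-9]*"\|failed="[0-9]*"' "$latest"
# → passed="11"
# → failed="1"
```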
**Unreal Engine:**
```bash
ls -t Saved/Logs/ 2>/dev/null | grep -i "test\|automation" | head -5
```
If no matching log found: "UE automation tests must be run via the Session
Frontend or CI pipeline. Please confirm test status manually."
**Unknown engine / not configured:**
"Engine not configured in `.claude/docs/technical-preferences.md`. Run
`/setup-engine` to specify the engine, then re-run `/smoke-check`."
**If the test runner is not available in this environment** (engine binary not
on PATH, runner script not found, etc.), report clearly:
"Automated tests could not be executed — engine binary not found on PATH.
Status will be recorded as NOT RUN. Confirm test results from your local IDE
or CI pipeline. Unconfirmed NOT RUN is treated as PASS WITH WARNINGS, not
FAIL — the developer must manually confirm results."
Do not treat NOT RUN as an automatic FAIL. Record it as a warning. The
developer's manual confirmation in Phase 4 can resolve it.
Parse runner output and extract:
- Total tests run
- Passing count
- Failing count
- Names of any failing tests (up to 10; if more, note the count)
- Any crash or error output from the runner itself
---
## Phase 3: Check Test Coverage
If the `quick` argument was passed, skip this phase entirely and note:
"Coverage scan skipped — run `/smoke-check sprint` for full coverage
analysis."
Otherwise, draw the story list from, in priority order:
1. The QA plan found in Phase 1 (its Test Summary table lists expected test
   file paths per story)
2. The current sprint plan from `production/sprints/` (most recently modified
   file)
For each story in scope:
1. Extract the system slug from the story's file path
(e.g., `production/epics/combat/story-001.md``combat`)
2. Glob `tests/unit/[system]/` and `tests/integration/[system]/` for files
whose name contains the story slug or a closely related term
3. Check the story file itself for a `Test file:` header field or a
"Test Evidence" section
Assign a coverage status to each story:
| Status | Meaning |
|--------|---------|
| **COVERED** | A test file was found matching this story's system and scope |
| **MANUAL** | Story type is Visual/Feel or UI; a test evidence document was found |
| **MISSING** | Logic or Integration story with no matching test file |
| **EXPECTED** | Config/Data story — no test file required; spot-check is sufficient |
| **UNKNOWN** | Story file missing or unreadable |
MISSING entries are advisory gaps. They do not cause a FAIL verdict but must
appear prominently in the report and must be resolved before `/story-done` can
fully close those stories.
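The per-story scan can be sketched as follows; the system, slug, and file names are illustrative, while the search locations come from this phase:

```shell
# Sketch: look for a test file whose name contains the story slug
root=$(mktemp -d); system="combat"; slug="crit-damage"
mkdir -p "$root/tests/unit/$system" "$root/tests/integration/$system"
touch "$root/tests/unit/$system/crit_damage_test.gd"
# match the slug in either hyphen or underscore form
match=$(ls "$root/tests/unit/$system" "$root/tests/integration/$system" 2>/dev/null \
  | grep -i "$(echo "$slug" | tr '-' '_')")
[ -n "$match" ] && echo "COVERED: $match" || echo "MISSING"
# → COVERED: crit_damage_test.gd
```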
---
## Phase 4: Run Manual Smoke Checks
Draw the smoke test checklist from, in priority order:
1. The QA plan's "Smoke Test Scope" section (if QA plan was found in Phase 1)
2. `production/qa/smoke-tests.md` (if it exists)
3. `tests/smoke/` directory contents (if it exists)
4. The standard fallback list below (used only when none of the above exist)
Tailor batches 2 and 3 to the actual systems identified from the sprint or QA
plan. Replace bracketed placeholders with real mechanic names from the current
sprint's stories.
Use `AskUserQuestion` to batch-verify. Keep to at most 3 calls.
**Batch 1 — Core stability (always run):**
```
question: "Smoke check — Batch 1: Core stability. Please verify each:"
options:
- "Game launches to main menu without crash — PASS"
- "Game launches to main menu without crash — FAIL"
- "New game / session starts successfully — PASS"
- "New game / session starts successfully — FAIL"
- "Main menu responds to all inputs — PASS"
- "Main menu responds to all inputs — FAIL"
```
**Batch 2 — Sprint mechanic and regression (always run):**
```
question: "Smoke check — Batch 2: This sprint's changes and regression check:"
options:
- "[Primary mechanic this sprint] — PASS"
- "[Primary mechanic this sprint] — FAIL: [describe what broke]"
- "[Second notable change this sprint, if any] — PASS"
- "[Second notable change this sprint] — FAIL"
- "Previous sprint's features still work (no regressions) — PASS"
- "Previous sprint's features — regression found: [brief description]"
```
**Batch 3 — Data integrity and performance (run unless `quick` argument):**
```
question: "Smoke check — Batch 3: Data integrity and performance:"
options:
- "Save / load completes without data loss — PASS"
- "Save / load — FAIL: [describe what broke]"
- "Save / load — N/A (save system not yet implemented)"
- "No new frame rate drops or hitches observed — PASS"
- "Frame rate drops or hitches found — FAIL: [where]"
- "Performance — not checked in this session"
```
Record each response verbatim for the Phase 5 report.
---
## Phase 5: Generate Report
Assemble the full smoke check report:
````markdown
## Smoke Check Report
**Date**: [date]
**Sprint**: [sprint name / number, or "Not identified"]
**Engine**: [engine]
**QA Plan**: [path, or "Not found — run /qa-plan first"]
**Argument**: [sprint | quick | blank]
---
### Automated Tests
**Status**: [PASS ([N] tests, [N] passing) | FAIL ([N] failures) |
NOT RUN ([reason])]
[If FAIL, list failing tests:]
- `[test name]` — [brief failure description from runner output]
[If NOT RUN:]
"Manual confirmation required: did tests pass in your local IDE or CI? This
will determine whether the automated test row contributes to a FAIL verdict."
---
### Test Coverage
| Story | Type | Test File | Coverage Status |
|-------|------|-----------|----------------|
| [title] | Logic | `tests/unit/[system]/[slug]_test.[ext]` | COVERED |
| [title] | Visual/Feel | `tests/evidence/[slug]-screenshots.md` | MANUAL |
| [title] | Logic | — | MISSING ⚠ |
| [title] | Config/Data | — | EXPECTED |
**Summary**: [N] covered, [N] manual, [N] missing, [N] expected.
---
### Manual Smoke Checks
- [x] Game launches without crash — PASS
- [x] New game starts — PASS
- [x] [Core mechanic] — PASS
- [ ] [Other check] — FAIL: [user's description]
- [x] Save / load — PASS
- [-] Performance — not checked this session
---
### Missing Test Evidence
Stories that must have test evidence before they can be marked COMPLETE via
`/story-done`:
- **[story title]** (`[path]`) — Logic story has no test file.
Expected location: `tests/unit/[system]/[story-slug]_test.[ext]`
[If none:] "All Logic and Integration stories have test coverage."
---
### Verdict: [PASS | PASS WITH WARNINGS | FAIL]
[Verdict rules — first matching rule wins:]
**FAIL** if ANY of:
- Automated test suite ran and reported one or more test failures
- Any Batch 1 (core stability) check returned FAIL
- Any Batch 2 (primary sprint mechanic or regression check) returned FAIL
**PASS WITH WARNINGS** if ALL of:
- Automated tests PASS or NOT RUN (developer has not yet confirmed)
- All Batch 1 and Batch 2 smoke checks PASS
- One or more Logic/Integration stories have MISSING test evidence
**PASS** if ALL of:
- Automated tests PASS
- All smoke checks in all batches PASS or N/A
- No MISSING test evidence entries
````
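The precedence of these rules can be sketched as follows; the counter variables are stand-ins for values gathered in Phases 2 through 4:

```shell
# Sketch of the verdict precedence (first matching rule wins); inputs are stand-ins
auto_failures=0; batch_failures=0; missing_evidence=2; auto_status="PASS"
if [ "$auto_failures" -gt 0 ] || [ "$batch_failures" -gt 0 ]; then
  verdict="FAIL"                  # any automated or Batch 1/2 failure
elif [ "$missing_evidence" -gt 0 ] || [ "$auto_status" = "NOT RUN" ]; then
  verdict="PASS WITH WARNINGS"    # advisory gaps only
else
  verdict="PASS"
fi
echo "$verdict"
# → PASS WITH WARNINGS
```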
---
## Phase 6: Write and Gate
Present the full report in conversation, then ask:
"May I write this smoke check report to `production/qa/smoke-[date].md`?"
Write only after approval.
After writing, deliver the gate verdict:
**If verdict is FAIL:**
"The smoke check failed. Do not hand off to QA until these failures are
resolved:
[List each failing automated test or smoke check with a one-line description]
Fix the failures and run `/smoke-check` again to re-gate before QA hand-off."
**If verdict is PASS WITH WARNINGS:**
"Smoke check passed with warnings. The build is ready for manual QA.
Advisory items to resolve before running `/story-done` on affected stories:
[list MISSING test evidence entries]
QA hand-off: share `production/qa/qa-plan-[sprint].md` with the qa-tester
agent to begin manual verification."
**If verdict is PASS:**
"Smoke check passed cleanly. The build is ready for manual QA.
QA hand-off: share `production/qa/qa-plan-[sprint].md` with the qa-tester
agent to begin manual verification."
---
## Collaborative Protocol
- **Never treat NOT RUN as automatic FAIL** — record it as NOT RUN and let
the developer confirm status manually. Unconfirmed NOT RUN contributes to
PASS WITH WARNINGS, not FAIL.
- **Never auto-fix failures** — report them and state what must be resolved.
Do not attempt to edit source code or test files.
- **PASS WITH WARNINGS does not block QA hand-off** — it records advisory
gaps for `/story-done` to follow up on.
- **`quick` argument** skips Phase 3 (coverage scan) and Phase 4 Batch 3.
Use it for rapid re-checks after fixing a specific failure.
- Use `AskUserQuestion` for all manual smoke check verification.
- **Never write the report without asking** — Phase 6 requires explicit
approval before any file is created.
@ -45,6 +45,7 @@ Read the full story file. Extract and hold in context:
- **ADR reference(s)** cited
- **Acceptance Criteria** — the complete list (every checkbox item)
- **Implementation files** — files listed under "files to create/modify"
- **Story Type** — the `Type:` field from the story header (Logic / Integration / Visual/Feel / UI / Config/Data)
- **Engine notes** — any engine-specific constraints noted
- **Definition of Done** — if present, the story-level DoD
- **Estimated vs actual scope** — if an estimate was noted
@ -133,6 +134,42 @@ For each acceptance criterion in the story:
4. For any ADVISORY untested criteria, add to the Completion Notes in Phase 7:
`"Untested criteria: [AC-N list]. Recommend adding tests in a follow-up story."`
### Test Evidence Requirement
Based on the Story Type extracted in Phase 2, check for required evidence:
| Story Type | Required Evidence | Gate Level |
|---|---|---|
| **Logic** | Automated unit test in `tests/unit/[system]/` — must exist and pass | BLOCKING |
| **Integration** | Integration test in `tests/integration/[system]/` OR playtest doc | BLOCKING |
| **Visual/Feel** | Screenshot + sign-off in `production/qa/evidence/` | ADVISORY |
| **UI** | Manual walkthrough doc OR interaction test in `production/qa/evidence/` | ADVISORY |
| **Config/Data** | Smoke check pass report in `production/qa/smoke-*.md` | ADVISORY |
**For Logic stories**: use `Glob` to check `tests/unit/[system]/` for a test
file matching the story slug. If none found:
- Flag as **BLOCKING**: "Logic story has no unit test file. Expected at
`tests/unit/[system]/[story-slug]_test.[ext]`. Create and run the test
before marking this story Complete."
**For Integration stories**: check `tests/integration/[system]/` AND
`production/session-logs/` for a playtest record referencing this story.
If neither exists: flag as **BLOCKING** (same rule as Logic).
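The Logic/Integration evidence check can be sketched as follows; all paths and the slug are illustrative:

```shell
# Sketch of the blocking evidence check; paths and slug are assumptions
root=$(mktemp -d); system="combat"; slug="crit-damage"
mkdir -p "$root/tests/unit/$system" "$root/production/session-logs"
# no test file or playtest record created, simulating the BLOCKING case
if ls "$root/tests/unit/$system/"*"$(echo "$slug" | tr '-' '_')"* >/dev/null 2>&1 \
   || grep -rqs "$slug" "$root/production/session-logs"; then
  echo "evidence found"
else
  echo "BLOCKING: no test evidence for $slug"
fi
# → BLOCKING: no test evidence for crit-damage
```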
**For Visual/Feel and UI stories**: glob `production/qa/evidence/` for a file
referencing this story. If none: flag as **ADVISORY**
"No manual test evidence found. Create `production/qa/evidence/[story-slug]-evidence.md`
using the test-evidence template and obtain sign-off before final closure."
**For Config/Data stories**: check for any `production/qa/smoke-*.md` file.
If none: flag as **ADVISORY** — "No smoke check report found. Run `/smoke-check`."
**If no Story Type is set**: flag as **ADVISORY**
"Story Type not declared. Add `Type: [Logic|Integration|Visual/Feel|UI|Config/Data]`
to the story header to enable test evidence gate enforcement in future stories."
Any BLOCKING test evidence gap prevents the COMPLETE verdict in Phase 6.
---
## Phase 4: Check for Deviations
@ -220,6 +257,11 @@ Before updating any files, present the full report:
| AC-2: [text] | Manual confirmation | COVERED |
| AC-3: [text] | — | UNTESTED |
### Test Evidence
**Story Type**: [Logic | Integration | Visual/Feel | UI | Config/Data | Not declared]
**Required evidence**: [unit test file | integration test or playtest | screenshot + sign-off | walkthrough doc | smoke check pass]
**Evidence found**: [YES — `[path]` | NO — BLOCKING | NO — ADVISORY]
### Deviations
[NONE] OR:
- BLOCKING: [description] — [GDD/ADR reference]
@ -257,6 +299,7 @@ If yes, edit the story file:
**Completed**: [date]
**Criteria**: [X/Y passing] ([any deferred items listed])
**Deviations**: [None] or [list of advisory deviations]
**Test Evidence**: [Logic: test file at path | Visual/Feel: evidence doc at path | None required (Config/Data)]
**Code Review**: [Pending / Complete / Skipped]
```
@ -169,9 +169,14 @@ items pass or are explicitly marked N/A with a stated reason.
- [ ] **Performance budget noted if applicable**: If this story touches any
part of the gameplay loop, rendering, or physics, a performance budget or
a "no performance impact expected — [reason]" note is present.
- [ ] **Test strategy noted**: The story states whether verification is by
unit test, manual test, or playtest. "Acceptance criteria verified by
[test type]" is sufficient.
- [ ] **Story Type declared**: The story includes a `Type:` field in its header
identifying the test category (Logic / Integration / Visual/Feel / UI / Config/Data).
Without this, test evidence requirements cannot be enforced at story close.
Fix: Add `Type: [Logic|Integration|Visual/Feel|UI|Config/Data]` to the story header.
- [ ] **Test evidence requirement is clear**: If the Story Type is set, the story
includes a `## Test Evidence` section stating where evidence will be stored
(test file path for Logic/Integration, or evidence doc path for Visual/Feel/UI).
Fix: Add `## Test Evidence` with the expected evidence location for the story's type.
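A story header that satisfies both checks might look like this; the ID, system, and path are illustrative:

```markdown
Type: Logic
Status: Ready

## Test Evidence
Expected: `tests/unit/combat/story-014-crit-damage_test.gd`
```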
---
@ -0,0 +1,210 @@
---
name: team-qa
description: "Orchestrate the QA team through a full testing cycle. Coordinates qa-lead (strategy + test plan) and qa-tester (test case writing + bug reporting) to produce a complete QA package for a sprint or feature. Covers: test plan generation, test case writing, smoke check gate, manual QA execution, and sign-off report."
argument-hint: "[sprint | feature: system-name]"
user-invocable: true
allowed-tools: Read, Glob, Grep, Write, Task
agent: qa-lead
---
When this skill is invoked, orchestrate the QA team through a structured testing cycle.
**Decision Points:** At each phase transition, use `AskUserQuestion` to present
the user with the subagent's proposals as selectable options. Write the agent's
full analysis in conversation, then capture the decision with concise labels.
The user must approve before moving to the next phase.
## Team Composition
- **qa-lead** — QA strategy, test plan generation, story classification, sign-off report
- **qa-tester** — Test case writing, bug report writing, manual QA documentation
## How to Delegate
Use the Task tool to spawn each team member as a subagent:
- `subagent_type: qa-lead` — Strategy, planning, classification, sign-off
- `subagent_type: qa-tester` — Test case writing and bug report writing
Always provide full context in each agent's prompt (story file paths, QA plan path, scope constraints). Launch independent qa-tester tasks in parallel where possible (e.g., multiple stories in Phase 5 can be scaffolded simultaneously).
## Pipeline
### Phase 1: Load Context
Before doing anything else, gather the full scope:
1. Detect the current sprint or feature scope from the argument:
- If argument is a sprint identifier (e.g., `sprint-03`): read all story files in `production/sprints/[sprint]/`
- If argument is `feature: [system-name]`: glob story files tagged for that system
- If no argument: read `production/session-state/active.md` and `production/sprint-status.yaml` (if present) to infer the active sprint
2. Read `production/stage.txt` to confirm the current project phase.
3. Count stories found and report to the user:
> "QA cycle starting for [sprint/feature]. Found [N] stories. Current stage: [stage]. Ready to begin QA strategy?"
### Phase 2: QA Strategy (qa-lead)
Spawn `qa-lead` via Task to review all in-scope stories and produce a QA strategy.
Prompt the qa-lead to:
- Read each story file
- Classify each story by type: **Logic** / **Integration** / **Visual/Feel** / **UI** / **Config/Data**
- Identify which stories require automated test evidence vs. manual QA
- Flag any stories with missing acceptance criteria or missing test evidence that would block QA
- Estimate manual QA effort (number of test sessions needed)
- Produce a strategy summary table:
| Story | Type | Automated Required | Manual Required | Blocker? |
|-------|------|--------------------|-----------------|----------|
Present the qa-lead's full strategy to the user, then use `AskUserQuestion`:
```
question: "QA Strategy Review"
options:
- "Looks good — proceed to test plan"
- "Adjust story types before proceeding"
- "Skip blocked stories and proceed with the rest"
- "Cancel — resolve blockers first"
```
If blockers are present: list them explicitly. The user may choose to skip blocked stories or cancel the cycle.
### Phase 3: Test Plan Generation
Using the strategy from Phase 2, produce a structured test plan document.
The test plan should cover:
- **Scope**: sprint/feature name, story count, dates
- **Story Classification Table**: from Phase 2 strategy
- **Automated Test Requirements**: which stories need test files, expected paths in `tests/`
- **Manual QA Scope**: which stories need manual walkthrough and what to validate
- **Out of Scope**: what is explicitly not being tested this cycle and why
- **Entry Criteria**: what must be true before QA can begin (smoke check pass, build stable)
- **Exit Criteria**: what constitutes a completed QA cycle (all stories PASS or FAIL with bugs filed)
Ask: "May I write the QA plan to `production/qa/qa-plan-[sprint]-[date].md`?"
Write only after receiving approval.
### Phase 4: Smoke Check Gate
Before any manual QA begins, run the smoke check.
Spawn `qa-lead` via Task with instructions to:
- Review the `tests/smoke/` directory for the current smoke test list
- Check whether each smoke test scenario can be verified given the current build
- Produce a smoke check result: **PASS** / **PASS WITH WARNINGS** / **FAIL**
Report the result to the user:
- **PASS**: "Smoke check passed. Proceeding to test case writing."
- **PASS WITH WARNINGS**: "Smoke check passed with warnings: [list issues]. These are non-blocking. Proceeding — note these for the sign-off report."
- **FAIL**: "Smoke check failed. QA cannot begin until these issues are resolved:
[list failures]
Fix them and re-run `/smoke-check`, or re-run `/team-qa` once resolved."
On FAIL: stop the cycle and surface the list of failures. Do not proceed.
### Phase 5: Test Case Writing (qa-tester)
For each story requiring manual QA (Visual/Feel, UI, Integration without automated tests):
Spawn `qa-tester` via Task for each story (run in parallel where possible), providing:
- The story file path
- The relevant section of the QA plan for that story
- The GDD acceptance criteria for the system being tested (if available)
- Instructions to write detailed test cases covering all acceptance criteria
Each test case set should include:
- **Preconditions**: game state required before testing begins
- **Steps**: numbered, unambiguous actions
- **Expected Result**: what should happen
- **Actual Result**: field left blank for the tester to fill in
- **Pass/Fail**: field left blank
Present the test cases to the user for review before execution. Group by story.
Use `AskUserQuestion` per story group (batched 3-4 at a time):
```
question: "Test cases ready for [Story Group]. Review before manual QA begins?"
options:
- "Approved — begin manual QA for these stories"
- "Revise test cases for [story name]"
- "Skip manual QA for [story name] — not ready"
```
### Phase 6: Manual QA Execution
Walk through each story in the approved manual QA list.
Batch stories into groups of 3-4 and use `AskUserQuestion` for each:
```
question: "Manual QA — [Story Title]\n[brief description of what to test]"
options:
- "PASS — all acceptance criteria verified"
- "PASS WITH NOTES — minor issues found (describe after)"
- "FAIL — criteria not met (describe after)"
- "BLOCKED — cannot test yet (reason)"
```
After each FAIL result: use `AskUserQuestion` to collect the failure description, then spawn `qa-tester` via Task to write a formal bug report in `production/qa/bugs/`.
Bug report naming: `BUG-[NNN]-[short-slug].md` (increment NNN from existing bugs in the directory).
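Deriving the next ID can be sketched as follows; the temp directory stands in for `production/qa/bugs/`, and the existing file names are illustrative:

```shell
# Sketch: compute the next bug ID from existing BUG-NNN-slug.md reports
bugdir=$(mktemp -d)   # stand-in for production/qa/bugs/
touch "$bugdir/BUG-001-crash-on-load.md" "$bugdir/BUG-007-hud-overlap.md"
# strip everything but the numeric ID, take the highest, then increment
last=$(ls "$bugdir"/BUG-*.md | sed 's/.*BUG-0*\([0-9][0-9]*\)-.*/\1/' | sort -n | tail -1)
printf 'BUG-%03d-%s.md\n' $(( ${last:-0} + 1 )) "save-corruption"
# → BUG-008-save-corruption.md
```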
After collecting all results, summarize:
- Stories PASS: [count]
- Stories PASS WITH NOTES: [count]
- Stories FAIL: [count] — bugs filed: [IDs]
- Stories BLOCKED: [count]
### Phase 7: QA Sign-Off Report
Spawn `qa-lead` via Task to produce the sign-off report using all results from Phases 4-6.
The sign-off report format:
```markdown
## QA Sign-Off Report: [Sprint/Feature]
**Date**: [date]
**QA Lead sign-off**: [pending]
### Test Coverage Summary
| Story | Type | Auto Test | Manual QA | Result |
|-------|------|-----------|-----------|--------|
| [title] | Logic | PASS | — | PASS |
| [title] | Visual | — | PASS | PASS |
### Bugs Found
| ID | Story | Severity | Status |
|----|-------|----------|--------|
| BUG-001 | [story] | S2 | Open |
### Verdict: APPROVED / APPROVED WITH CONDITIONS / NOT APPROVED
**Conditions** (if any): [list what must be fixed before the build advances]
### Next Step
[guidance based on verdict]
```
Verdict rules:
- **APPROVED**: All stories PASS or PASS WITH NOTES; no S1/S2 bugs open
- **APPROVED WITH CONDITIONS**: S3/S4 bugs open, or PASS WITH NOTES issues documented; no S1/S2 bugs
- **NOT APPROVED**: Any S1/S2 bugs open; or stories FAIL without documented workaround
Next step guidance by verdict:
- APPROVED: "Build is ready for the next phase. Run `/gate-check` to validate advancement."
- APPROVED WITH CONDITIONS: "Resolve conditions before advancing. S3/S4 bugs may be deferred to polish."
- NOT APPROVED: "Resolve S1/S2 bugs and re-run `/team-qa` or targeted manual QA before advancing."
Ask: "May I write this QA sign-off report to `production/qa/qa-signoff-[sprint]-[date].md`?"
Write only after receiving approval.
## Output
A summary covering: stories in scope, smoke check result, manual QA results, bugs filed (with IDs and severities), and the final APPROVED / APPROVED WITH CONDITIONS / NOT APPROVED verdict.

View file

@@ -0,0 +1,423 @@
---
name: test-setup
description: "Scaffold the test framework and CI/CD pipeline for the project's engine. Creates the tests/ directory structure, engine-specific test runner configuration, and GitHub Actions workflow. Run once during Technical Setup phase before the first sprint begins."
argument-hint: "[force]"
user-invocable: true
allowed-tools: Read, Glob, Grep, Bash, Write
---
# Test Setup
This skill scaffolds the automated testing infrastructure for the project.
It detects the configured engine, generates the appropriate test runner
configuration, creates the standard directory layout, and wires up CI/CD
so tests run on every push.
Run this once during the Technical Setup phase, before any implementation
begins. A test framework installed at sprint start costs 30 minutes.
A test framework installed at sprint four costs 3 sprints.
**Output:** `tests/` directory structure + `.github/workflows/tests.yml`
---
## Phase 1: Detect Engine and Existing State
1. **Read engine config**:
- Read `.claude/docs/technical-preferences.md` and extract the `Engine:` value.
- If engine is not configured (`[TO BE CONFIGURED]`), stop:
"Engine not configured. Run `/setup-engine` first, then re-run `/test-setup`."
2. **Check for existing test infrastructure**:
- Glob `tests/` — does the directory exist?
- Glob `tests/unit/` and `tests/integration/` — do subdirectories exist?
- Glob `.github/workflows/` — does a CI workflow file exist?
- Glob `tests/gdunit4_runner.gd` (Godot) or `tests/EditMode/` (Unity) or
`Source/Tests/` (Unreal) for engine-specific artifacts.
3. **Report findings**:
- "Engine: [engine]. Test directory: [found / not found]. CI workflow: [found / not found]."
- If everything already exists AND `force` argument was not passed:
"Test infrastructure appears to be in place. Re-run with `/test-setup force`
to regenerate. Proceeding will not overwrite existing test files."
If the `force` argument is passed, skip the "already exists" early-exit and
proceed — but still do not overwrite files that already exist at a given path.
Only create files that are missing.
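The `force` semantics amount to a create-if-missing write. A minimal Python sketch of the rule (the helper name is hypothetical):

```python
from pathlib import Path

def create_if_missing(path: str, content: str) -> bool:
    """Write content only when no file exists at path; never overwrite."""
    p = Path(path)
    if p.exists():
        return False  # leave existing test files untouched
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(content)
    return True
```

With or without `force`, every file write in this skill goes through this check, so re-running the skill is always safe.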
---
## Phase 2: Present Plan
Based on the engine detected and the existing state, present a plan:
```
## Test Setup Plan — [Engine]
I will create the following (skipping any that already exist):
tests/
unit/ — Isolated unit tests for formulas, state, and logic
integration/ — Cross-system tests and save/load round-trips
smoke/ — Critical path test list (15-minute manual gate)
evidence/ — Screenshot and manual test sign-off records
README.md — Test framework documentation
[Engine-specific files — see per-engine details below]
.github/workflows/tests.yml — CI: run tests on every push to main
Estimated time: ~5 minutes to create all files.
```
Ask: "May I create these files? I will not overwrite any test files that
already exist at these paths."
Do not proceed without approval.
---
## Phase 3: Create Directory Structure
After approval, create the following files:
### `tests/README.md`
````markdown
# Test Infrastructure
**Engine**: [engine name + version]
**Test Framework**: [GdUnit4 | Unity Test Framework | UE Automation]
**CI**: `.github/workflows/tests.yml`
**Setup date**: [date]
## Directory Layout
```
tests/
  unit/          # Isolated unit tests (formulas, state machines, logic)
  integration/   # Cross-system and save/load tests
  smoke/         # Critical path test list for /smoke-check gate
  evidence/      # Screenshot logs and manual test sign-off records
```
## Running Tests
[Engine-specific command — see below]
## Test Naming
- **Files**: `[system]_[feature]_test.[ext]`
- **Functions**: `test_[scenario]_[expected]`
- **Example**: `combat_damage_test.gd` → `test_base_attack_returns_expected_damage()`
## Story Type → Test Evidence
| Story Type | Required Evidence | Location |
|---|---|---|
| Logic | Automated unit test — must pass | `tests/unit/[system]/` |
| Integration | Integration test OR playtest doc | `tests/integration/[system]/` |
| Visual/Feel | Screenshot + lead sign-off | `tests/evidence/` |
| UI | Manual walkthrough OR interaction test | `tests/evidence/` |
| Config/Data | Smoke check pass | `production/qa/smoke-*.md` |
## CI
Tests run automatically on every push to `main` and on every pull request.
A failed test suite blocks merging.
````
### Engine-specific files
#### Godot 4 (`Engine: Godot`)
Create `tests/gdunit4_runner.gd`:
```gdscript
# GdUnit4 test runner — invoked by CI and /smoke-check
# Usage: godot --headless --script tests/gdunit4_runner.gd
# NOTE: the runner script path below may vary between GdUnit4 versions;
# recent versions ship a CLI entry point under addons/gdUnit4/bin/ instead.
extends SceneTree

func _init() -> void:
    var runner := load("res://addons/gdunit4/GdUnitRunner.gd")
    if runner == null:
        push_error("GdUnit4 not found. Install via AssetLib or addons/.")
        quit(1)
        return
    var instance = runner.new()
    instance.run_tests()
    quit(0)
```
Create `tests/unit/.gdignore_placeholder` with content:
`# Unit tests go here — one subdirectory per system (e.g., tests/unit/combat/)`
Create `tests/integration/.gdignore_placeholder` with content:
`# Integration tests go here — one subdirectory per system`
Note in the README: **Installing GdUnit4**
```
1. Open Godot → AssetLib → search "GdUnit4" → Download & Install
2. Enable the plugin: Project → Project Settings → Plugins → GdUnit4 ✓
3. Restart the editor
4. Verify: res://addons/gdunit4/ exists
```
#### Unity (`Engine: Unity`)
Create `tests/EditMode/` placeholder file `tests/EditMode/README.md`:
```markdown
# Edit Mode Tests
Unit tests that run without entering Play Mode.
Use for pure logic: formulas, state machines, data validation.
Assembly definition required: `tests/EditMode/EditModeTests.asmdef`
```
Create `tests/PlayMode/README.md`:
```markdown
# Play Mode Tests
Integration tests that run in a real game scene.
Use for cross-system interactions, physics, and coroutines.
Assembly definition required: `tests/PlayMode/PlayModeTests.asmdef`
```
Note in the README: **Enabling Unity Test Framework**
```
Window → General → Test Runner
(Unity Test Framework is included by default in Unity 2019+)
```
#### Unreal Engine (`Engine: Unreal` or `Engine: UE5`)
Create `Source/Tests/README.md`:
```markdown
# Unreal Automation Tests
Tests use the UE Automation Testing Framework.
Run via: Session Frontend → Automation → select "MyGame." tests
Or headlessly: UnrealEditor -nullrhi -ExecCmds="Automation RunTests MyGame.; Quit"
Test class naming: F[SystemName]Test
Test category naming: "MyGame.[System].[Feature]"
```
---
## Phase 4: Create CI/CD Workflow
### Godot 4
Create `.github/workflows/tests.yml`:
```yaml
name: Automated Tests
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
test:
name: Run GdUnit4 Tests
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
with:
lfs: true
- name: Run GdUnit4 Tests
uses: MikeSchulze/gdUnit4-action@v1
with:
godot-version: '[VERSION FROM docs/engine-reference/godot/VERSION.md]'
paths: |
tests/unit
tests/integration
report-name: test-results
- name: Upload Test Results
if: always()
uses: actions/upload-artifact@v4
with:
name: test-results
path: reports/
```
### Unity
Create `.github/workflows/tests.yml`:
```yaml
name: Automated Tests
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
test:
name: Run Unity Tests
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
with:
lfs: true
- name: Run Edit Mode Tests
uses: game-ci/unity-test-runner@v4
env:
UNITY_LICENSE: ${{ secrets.UNITY_LICENSE }}
with:
testMode: editmode
artifactsPath: test-results/editmode
- name: Run Play Mode Tests
uses: game-ci/unity-test-runner@v4
env:
UNITY_LICENSE: ${{ secrets.UNITY_LICENSE }}
with:
testMode: playmode
artifactsPath: test-results/playmode
- name: Upload Test Results
if: always()
uses: actions/upload-artifact@v4
with:
name: test-results
path: test-results/
```
Note: Unity CI requires a `UNITY_LICENSE` secret. Add it to the repository's
GitHub secrets before the first CI run.
### Unreal Engine
Create `.github/workflows/tests.yml`:
```yaml
name: Automated Tests
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
test:
name: Run UE Automation Tests
runs-on: self-hosted # UE requires a local runner with the editor installed
steps:
- name: Checkout
uses: actions/checkout@v4
with:
lfs: true
- name: Run Automation Tests
run: |
"$UE_EDITOR_PATH" "${{ github.workspace }}/[ProjectName].uproject" \
-nullrhi -nosound \
-ExecCmds="Automation RunTests MyGame.; Quit" \
-log -unattended
shell: bash
- name: Upload Logs
if: always()
uses: actions/upload-artifact@v4
with:
name: test-logs
path: Saved/Logs/
```
Note: UE CI requires a self-hosted runner with Unreal Editor installed.
Set the `UE_EDITOR_PATH` environment variable on the runner.
---
## Phase 5: Create Smoke Test Seed
Create `tests/smoke/critical-paths.md`:
```markdown
# Smoke Test: Critical Paths
**Purpose**: Run these 10-15 checks in under 15 minutes before any QA hand-off.
**Run via**: `/smoke-check` (which reads this file)
**Update**: Add new entries when new core systems are implemented.
## Core Stability (always run)
1. Game launches to main menu without crash
2. New game / session can be started from the main menu
3. Main menu responds to all inputs without freezing
## Core Mechanic (update per sprint)
<!-- Add the primary mechanic for each sprint here as it is implemented -->
<!-- Example: "Player can move, jump, and the camera follows correctly" -->
4. [Primary mechanic — update when first core system is implemented]
## Data Integrity
5. Save game completes without error (once save system is implemented)
6. Load game restores correct state (once load system is implemented)
## Performance
7. No visible frame rate drops on target hardware (60fps target)
8. No memory growth over 5 minutes of play (once core loop is implemented)
```
---
## Phase 6: Post-Setup Summary
After writing all files, report:
```
Test infrastructure created for [engine].
Files created:
- tests/README.md
- tests/unit/ (directory)
- tests/integration/ (directory)
- tests/smoke/critical-paths.md
- tests/evidence/ (directory)
[engine-specific files]
- .github/workflows/tests.yml
Next steps:
1. [Engine-specific install step, e.g., "Install GdUnit4 via AssetLib"]
2. Write your first test: create tests/unit/[first-system]/[system]_test.[ext]
3. Run `/qa-plan sprint` before your first sprint to classify stories and set
test evidence requirements
4. Run `/smoke-check` before every QA hand-off
Gate note: /gate-check Technical Setup → Pre-Production now requires:
- tests/ directory with unit/ and integration/ subdirectories
- .github/workflows/tests.yml
- At least one example test file
Write at least one example test before running /gate-check to advance.
```
---
## Collaborative Protocol
- **Never overwrite existing test files** — only create files that are missing.
If a test runner file exists, leave it as-is.
- **Always ask before creating files** — Phase 2 requires explicit approval.
- **Engine detection is non-negotiable** — if the engine is not configured,
stop and redirect to `/setup-engine`. Do not guess.
- **`force` flag skips the "already exists" early-exit but never overwrites.**
It means "create any missing files even if the directory already exists."
- For Unity CI, note that the `UNITY_LICENSE` secret must be configured
manually. Do not attempt to automate license management.