Add comprehensive QA and testing framework (52→56 skills)

Introduces a full shift-left QA pipeline with Story Type classification
as the backbone of the Definition of Done:

New skills:
- /test-setup: scaffold test framework + CI/CD per engine (Godot/Unity/Unreal)
- /qa-plan: generate sprint test plan classifying stories by type
- /smoke-check: critical path gate (PASS/PASS WITH WARNINGS/FAIL) before QA hand-off
- /team-qa: orchestrate qa-lead + qa-tester through full QA cycle

Story Type classification (Logic/Integration/Visual/Feel/UI/Config/Data):
- Logic and Integration: BLOCKING DoD gate — unit/integration test required
- Visual/Feel and UI: ADVISORY — screenshot + sign-off evidence required
- Config/Data: ADVISORY — smoke check pass sufficient

Updated skills: story-done (test evidence gate), story-readiness (Story Type
check), gate-check (test framework at Technical Setup, test evidence at
Polish/Release), create-epics-stories (Type field + Test Evidence section)

Updated agents: qa-lead (shift-left philosophy + evidence table),
qa-tester (automated test patterns for Godot/Unity/Unreal)

New templates: test-evidence.md (manual sign-off record), test-plan.md
(sprint-oriented QA plan replacing generic feature template)

Updated coding-standards.md: Testing Standards section with DoD table,
test rules, what NOT to automate, and engine-specific CI/CD commands

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Donchitos 2026-03-16 13:48:32 +11:00
parent a2f8ed93ff
commit 168ac96c3a
13 changed files with 1704 additions and 87 deletions


@@ -10,7 +10,10 @@ memory: project
You are the QA Lead for an indie game project. You ensure the game meets
quality standards through systematic testing, bug tracking, and release
readiness evaluation. You practice **shift-left testing** — QA is involved
from the start of each sprint, not just at the end. Testing is a **hard part
of the Definition of Done**: no story is Complete without appropriate test
evidence.
### Collaboration Protocol
@@ -62,22 +65,62 @@ Before writing any code:
- Rules are your friend -- when they flag issues, they're usually right
- Tests prove it works -- offer to write them proactively
### Story Type → Test Evidence Requirements
Every story has a type that determines what evidence is required before it can be marked Done:
| Story Type | Required Evidence | Gate Level |
|---|---|---|
| **Logic** (formulas, AI, state machines) | Automated unit test in `tests/unit/[system]/` | BLOCKING |
| **Integration** (multi-system interaction) | Integration test OR documented playtest | BLOCKING |
| **Visual/Feel** (animation, VFX, feel) | Screenshot + lead sign-off in `production/qa/evidence/` | ADVISORY |
| **UI** (menus, HUD, screens) | Manual walkthrough doc OR interaction test | ADVISORY |
| **Config/Data** (balance, data files) | Smoke check pass | ADVISORY |
**Your role in this system:**
- Classify story types when creating QA plans (if not already classified in the story file)
- Flag Logic/Integration stories missing test evidence as blockers before sprint review
- Accept Visual/Feel/UI stories with documented manual evidence as "Done"
- Run or verify `/smoke-check` passes before any build goes to manual QA
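The gate logic in the table above can be sketched in a few lines. This is a hypothetical Python helper, not part of the shipped skills — the function name, constant name, and return strings are illustrative only:

```python
# Hypothetical sketch of the Story Type -> evidence gate. Story types,
# evidence kinds, and gate levels come from the table above; the names
# here are illustrative, not part of the actual /story-done skill.

EVIDENCE_RULES = {
    "Logic":       {"evidence": "unit test in tests/unit/[system]/", "gate": "BLOCKING"},
    "Integration": {"evidence": "integration test OR documented playtest", "gate": "BLOCKING"},
    "Visual/Feel": {"evidence": "screenshot + lead sign-off", "gate": "ADVISORY"},
    "UI":          {"evidence": "manual walkthrough doc OR interaction test", "gate": "ADVISORY"},
    "Config/Data": {"evidence": "smoke check pass", "gate": "ADVISORY"},
}

def check_story_done(story_type: str, has_evidence: bool) -> str:
    """Return the Done verdict for a story given its type and evidence status."""
    rule = EVIDENCE_RULES[story_type]
    if has_evidence:
        return "DONE"
    # Missing evidence blocks Logic/Integration stories; elsewhere it only warns.
    if rule["gate"] == "BLOCKING":
        return f"BLOCKED: missing {rule['evidence']}"
    return f"DONE WITH WARNING: missing {rule['evidence']}"
```

The key design point is that ADVISORY types still surface a warning — evidence is always expected, but only BLOCKING types stop the story.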
### QA Workflow Integration
**Your skills to use:**
- `/qa-plan [sprint]` — generate test plan from story types at sprint start
- `/smoke-check` — run before every QA hand-off
- `/team-qa [sprint]` — orchestrate full QA cycle
**When you get involved:**
- Sprint planning: Review story types and flag missing test strategies
- Mid-sprint: Check that Logic stories have test files as they are implemented
- Pre-QA gate: Run `/smoke-check`; block hand-off if it fails
- QA execution: Direct qa-tester through manual test cases
- Sprint review: Produce sign-off report with open bug list
**What shift-left means for you:**
- Review story acceptance criteria before implementation starts (`/story-readiness`)
- Flag untestable criteria (e.g., "feels good" without a benchmark) before the sprint begins
- Don't wait until the end to find that a Logic story has no tests
### Key Responsibilities
1. **Test Strategy & QA Planning**: At sprint start, classify stories by type,
   identify what needs automated vs. manual testing, and produce the QA plan.
2. **Test Evidence Gate**: Ensure Logic/Integration stories have test files before
   marking Complete. This is a hard gate, not a recommendation.
3. **Smoke Check Ownership**: Run `/smoke-check` before every build goes to manual QA.
   A failed smoke check means the build is not ready — period.
4. **Test Plan Creation**: For each feature and milestone, create test plans
   covering functional testing, edge cases, regression, performance, and
   compatibility.
5. **Bug Triage**: Evaluate bug reports for severity, priority, reproducibility,
   and assignment. Maintain a clear bug taxonomy.
6. **Regression Management**: Maintain a regression test suite that covers
   critical paths. Ensure regressions are caught before they reach milestones.
7. **Release Quality Gates**: Define and enforce quality gates for each
   milestone: crash rate, critical bug count, performance benchmarks, feature
   completeness.
8. **Playtest Coordination**: Design playtest protocols, create questionnaires,
   and analyze playtest feedback for actionable insights.
### Bug Severity Definitions


@@ -8,7 +8,9 @@ maxTurns: 10
You are a QA Tester for an indie game project. You write thorough test cases
and detailed bug reports that enable efficient bug fixing and prevent
regressions. You also write automated test stubs and understand
engine-specific test patterns — when a story needs a GDScript/C#/C++ test
file, you can scaffold it.
### Collaboration Protocol
@@ -60,19 +62,99 @@ Before writing any code:
- Rules are your friend — when they flag issues, they're usually right
- Tests prove it works — offer to write them proactively
### Automated Test Writing
For Logic and Integration stories, you write the test file (or scaffold it for the developer to complete).
**Test naming convention**: `[system]_[feature]_test.[ext]`
**Test function naming**: `test_[scenario]_[expected]`
**Pattern per engine:**
#### Godot (GDScript / GdUnit4)
```gdscript
extends GdUnitTestSuite


func test_[scenario]_[expected]() -> void:
    # Arrange
    var subject = [ClassName].new()
    # Act
    var result = subject.[method]([args])
    # Assert
    assert_that(result).is_equal([expected])
```
#### Unity (C# / NUnit)
```csharp
using NUnit.Framework;

[TestFixture]
public class [SystemName]Tests
{
    [Test]
    public void [Scenario]_[Expected]()
    {
        // Arrange
        var subject = new [ClassName]();
        // Act
        var result = subject.[Method]([args]);
        // Assert
        Assert.AreEqual([expected], result, delta: 0.001f);
    }
}
```
#### Unreal (C++)
```cpp
#include "Misc/AutomationTest.h"

IMPLEMENT_SIMPLE_AUTOMATION_TEST(
    F[SystemName]Test,
    "MyGame.[System].[Scenario]",
    EAutomationTestFlags::GameFilter
)

bool F[SystemName]Test::RunTest(const FString& Parameters)
{
    // Arrange + Act
    [ClassName] Subject;
    float Result = Subject.[Method]([args]);
    // Assert
    TestEqual("[description]", Result, [expected]);
    return true;
}
```
**What to test for every Logic story formula:**
1. Normal case (typical inputs → expected output)
2. Zero/null input (should not crash; minimum output)
3. Maximum values (should not overflow or produce infinity)
4. Negative modifiers (if applicable)
5. Edge case from GDD (any specific edge case mentioned in the GDD)
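As a concrete illustration of the five cases, here is what the checklist might look like for a hypothetical damage formula. The formula, its clamp range, and every value below are invented for the example (Python for brevity — real tests would use the engine patterns above):

```python
# Illustrative only: a made-up formula and the five checklist cases.
# Real tests live in tests/unit/[system]/ using the engine's framework.

def damage(base: float, multiplier: float) -> float:
    """Damage formula with the multiplier clamped to [0.5, 3.0]."""
    clamped = max(0.5, min(3.0, multiplier))
    return max(0.0, base * clamped)

def test_normal_case_returns_product():              # 1. Normal case
    assert damage(10.0, 2.0) == 20.0

def test_zero_input_gives_minimum_output():          # 2. Zero/null input
    assert damage(0.0, 1.0) == 0.0

def test_maximum_values_do_not_overflow():           # 3. Maximum values
    assert damage(1e6, 100.0) == 3e6                 # multiplier clamped to 3.0

def test_negative_modifier_never_goes_below_zero():  # 4. Negative modifiers
    assert damage(-10.0, 1.0) == 0.0

def test_edge_case_multiplier_floor():               # 5. GDD-specified edge case
    assert damage(10.0, 0.1) == 5.0                  # clamped up to 0.5
```

Note how each test name follows the `test_[scenario]_[expected]` convention.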
### Key Responsibilities
1. **Test File Scaffolding**: For Logic/Integration stories, write or scaffold
   the automated test file. Don't wait to be asked — offer to write it when
   implementing a Logic story.
2. **Formula Test Generation**: Read the Formulas section of the GDD and generate
   test cases covering all formula edge cases automatically.
3. **Test Case Writing**: Write detailed test cases with preconditions, steps,
   expected results, and actual results fields. Cover happy path, edge cases,
   and error conditions.
4. **Bug Report Writing**: Write bug reports with reproduction steps, expected
   vs. actual behavior, severity, frequency, environment, and supporting
   evidence (logs, screenshots described).
5. **Regression Checklists**: Create and maintain regression checklists for
   each major feature and system. Update after every bug fix.
6. **Smoke Test Lists**: Maintain the `tests/smoke/` directory with critical path
   test cases. These are the 10-15 scenarios that run in the `/smoke-check` gate
   before any build goes to manual QA.
7. **Test Coverage Tracking**: Track which features and code paths have test
   coverage and identify gaps.
### Bug Report Format


@@ -23,3 +23,43 @@
7. **Tuning Knobs** -- configurable values identified
8. **Acceptance Criteria** -- testable success conditions
- Balance values must link to their source formula or rationale
# Testing Standards
## Test Evidence by Story Type
All stories must have appropriate test evidence before they can be marked Done:
| Story Type | Required Evidence | Location | Gate Level |
|---|---|---|---|
| **Logic** (formulas, AI, state machines) | Automated unit test — must pass | `tests/unit/[system]/` | BLOCKING |
| **Integration** (multi-system) | Integration test OR documented playtest | `tests/integration/[system]/` | BLOCKING |
| **Visual/Feel** (animation, VFX, feel) | Screenshot + lead sign-off | `production/qa/evidence/` | ADVISORY |
| **UI** (menus, HUD, screens) | Manual walkthrough doc OR interaction test | `production/qa/evidence/` | ADVISORY |
| **Config/Data** (balance tuning) | Smoke check pass | `production/qa/smoke-[date].md` | ADVISORY |
## Automated Test Rules
- **Naming**: `[system]_[feature]_test.[ext]` for files; `test_[scenario]_[expected]` for functions
- **Determinism**: Tests must produce the same result every run — no random seeds, no time-dependent assertions
- **Isolation**: Each test sets up and tears down its own state; tests must not depend on execution order
- **No hardcoded data**: Test fixtures use constant files or factory functions, not inline magic numbers
(exception: boundary value tests where the exact number IS the point)
- **Independence**: Unit tests do not call external APIs, databases, or file I/O — use dependency injection
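The determinism and isolation rules can be seen together in one small sketch — hypothetical code, in Python for brevity: randomness is injected as a dependency rather than pulled from global state, and each test constructs its own subject and fixture data.

```python
import random

# Hypothetical example of the determinism rule: the system under test takes
# its RNG as a dependency, so tests inject a seeded one. No global random
# state, no time-dependent assertions.
class LootRoller:
    def __init__(self, rng: random.Random):
        self.rng = rng  # injected, never random.random() internally

    def roll(self, table: list) -> str:
        return table[self.rng.randrange(len(table))]

def test_roll_is_deterministic_with_seeded_rng():
    # Isolation: this test builds its own subject and fixture; it depends
    # on no other test having run first.
    table = ["sword", "shield", "potion"]
    first = LootRoller(random.Random(42)).roll(table)
    second = LootRoller(random.Random(42)).roll(table)
    assert first == second  # same seed, same result, every run
```

The same shape applies to injected clocks: a test that asserts on elapsed time should receive a fake clock, not read the real one.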
## What NOT to Automate
- Visual fidelity (shader output, VFX appearance, animation curves)
- "Feel" qualities (input responsiveness, perceived weight, timing)
- Platform-specific rendering (test on target hardware, not headlessly)
- Full gameplay sessions (covered by playtesting, not automation)
## CI/CD Rules
- Automated test suite runs on every push to main and every PR
- No merge if tests fail — tests are a blocking gate in CI
- Never disable or skip failing tests to make CI pass — fix the underlying issue
- Engine-specific CI commands:
- **Godot**: `godot --headless --script tests/gdunit4_runner.gd`
- **Unity**: `game-ci/unity-test-runner@v4` (GitHub Actions)
- **Unreal**: headless runner with `-nullrhi` flag
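A CI entry script might dispatch on engine like the sketch below. This is an assumption-laden illustration: the Godot command is the one listed above, but the `UnrealEditor-Cmd` invocation, project name, and flags are hypothetical placeholders, and Unity is excluded because it runs through the `game-ci/unity-test-runner@v4` action rather than a local binary.

```python
# Hypothetical CI helper assembling a per-engine test command from the list
# above. The Unreal arguments and "MyGame" project name are placeholders,
# not a verified configuration.

def build_test_command(engine: str) -> list:
    if engine == "godot":
        return ["godot", "--headless", "--script", "tests/gdunit4_runner.gd"]
    if engine == "unreal":
        # -nullrhi disables rendering so the automation tests run headlessly.
        return ["UnrealEditor-Cmd", "MyGame.uproject", "-nullrhi",
                "-unattended", "-ExecCmds=Automation RunTests MyGame; Quit"]
    if engine == "unity":
        raise ValueError("Unity tests run via game-ci/unity-test-runner@v4 in CI")
    raise ValueError(f"unknown engine: {engine}")
```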

`.claude/docs/templates/test-evidence.md` (new file, 86 lines)

@@ -0,0 +1,86 @@
# Test Evidence: [Story Title]
> **Story**: `[path to story file]`
> **Story Type**: [Visual/Feel | UI]
> **Date**: [date]
> **Tester**: [who performed the test]
> **Build / Commit**: [version or git hash]
---
## What Was Tested
[One paragraph describing the feature or behaviour that was validated. Include
the acceptance criteria numbers from the story that this evidence covers.]
**Acceptance criteria covered**: [AC-1, AC-2, AC-3]
---
## Acceptance Criteria Results
| # | Criterion (from story) | Result | Notes |
|---|----------------------|--------|-------|
| AC-1 | [exact criterion text] | PASS / FAIL | [any observations] |
| AC-2 | [exact criterion text] | PASS / FAIL | |
| AC-3 | [exact criterion text] | PASS / FAIL | |
---
## Screenshots / Video
List all captured evidence below. Store files in the same directory as this
document or in `production/qa/evidence/[story-slug]/`.
| # | Filename | What It Shows | Acceptance Criterion |
|---|----------|--------------|----------------------|
| 1 | `[filename.png]` | [brief description of what is visible] | AC-1 |
| 2 | `[filename.png]` | | AC-2 |
*If video: note the timestamp and what it demonstrates.*
---
## Test Conditions
- **Game state at start**: [e.g., "fresh save, player at level 1, no items"]
- **Platform / hardware**: [e.g., "Windows 11, GTX 1080, 1080p"]
- **Framerate during test**: [e.g., "stable 60fps" or "~45fps — within budget"]
- **Any special setup required**: [e.g., "dev menu used to trigger specific state"]
---
## Observations
[Anything noteworthy that didn't cause a FAIL but should be recorded. Examples:
minor visual jitter, frame dip under load, behaviour that technically passes
but felt slightly off. These become candidates for polish work.]
- [Observation 1]
- [Observation 2]
If nothing notable: *No significant observations.*
---
## Sign-Off
All three sign-offs are required before the story can be marked COMPLETE via
`/story-done`. Visual/Feel stories require the designer or art-lead sign-off.
UI stories require the UX lead or designer sign-off.
| Role | Name | Date | Signature |
|------|------|------|-----------|
| Developer (implemented) | | | [ ] Approved |
| Designer / Art Lead / UX Lead | | | [ ] Approved |
| QA Lead | | | [ ] Approved |
**Any sign-off can be marked "Deferred — [reason]"** if the person is
unavailable. Deferred sign-offs must be resolved before the story advances
past the sprint review.
---
*Template: `.claude/docs/templates/test-evidence.md`*
*Used for: Visual/Feel and UI story type evidence records*
*Location: `production/qa/evidence/[story-slug]-evidence.md`*


@@ -1,97 +1,144 @@
# QA Plan: [Sprint/Feature Name]

> **Date**: [date]
> **Generated by**: /qa-plan
> **Scope**: [N stories across N systems]
> **Engine**: [engine name and version]
> **Sprint file**: [path to sprint plan]

---

## Story Coverage Summary

| Story | Type | Automated Test Required | Manual Verification Required |
|-------|------|------------------------|------------------------------|
| [story title] | Logic | Unit test — `tests/unit/[system]/` | None |
| [story title] | Integration | Integration test — `tests/integration/[system]/` | Smoke check |
| [story title] | Visual/Feel | None (not automatable) | Screenshot + lead sign-off |
| [story title] | UI | None (not automatable) | Manual step-through |
| [story title] | Config/Data | Data validation (optional) | Spot-check in-game values |

**Totals**: [N] Logic, [N] Integration, [N] Visual/Feel, [N] UI, [N] Config/Data

---

## Automated Tests Required

### [Story Title] — Logic

**Test file path**: `tests/unit/[system]/[story-slug]_test.[ext]`

**What to test**:
- [Formula or rule from GDD Formulas section — e.g., "damage = base * multiplier where multiplier ∈ [0.5, 3.0]"]
- [Each named state transition]
- [Each side effect that should / should not occur]

**Edge cases to cover**:
- Zero / minimum input values
- Maximum / boundary input values
- Invalid or null input
- [GDD-specified edge cases]

**Estimated test count**: ~[N] unit tests

---

### [Story Title] — Integration

**Test file path**: `tests/integration/[system]/[story-slug]_test.[ext]`

**What to test**:
- [Cross-system interaction — e.g., "applying buff updates CharacterStats and triggers UI refresh"]
- [Round-trip — e.g., "save → load restores all fields"]

---

## Manual QA Checklist

### [Story Title] — Visual/Feel

**Verification method**: Screenshot + [designer / art-lead] sign-off
**Evidence file**: `production/qa/evidence/[story-slug]-evidence.md`
**Who must sign off**: [designer / lead-programmer / art-lead]

- [ ] [Specific observable condition — e.g., "hit flash appears on frame of impact, not the frame after"]
- [ ] [Another falsifiable condition]

### [Story Title] — UI

**Verification method**: Manual step-through
**Evidence file**: `production/qa/evidence/[story-slug]-evidence.md`

- [ ] [Every acceptance criterion translated into a manual check item]

---

## Smoke Test Scope

Critical paths to verify before QA hand-off (run via `/smoke-check`):

1. Game launches to main menu without crash
2. New game / session can be started
3. [Primary mechanic introduced or changed this sprint]
4. [System with regression risk from this sprint's changes]
5. Save / load cycle completes without data loss (if save system exists)
6. Performance is within budget on target hardware

---

## Playtest Requirements

| Story | Playtest Goal | Min Sessions | Target Player Type |
|-------|--------------|--------------|-------------------|
| [story] | [What question must be answered?] | [N] | [new player / experienced / etc.] |

Sign-off requirement: Playtest notes → `production/session-logs/playtest-[sprint]-[story-slug].md`

If no playtest sessions are required, write: *No playtest sessions required for this sprint.*

---

## Definition of Done — This Sprint

A story is DONE when ALL of the following are true:

- [ ] All acceptance criteria verified — automated test result OR documented manual evidence
- [ ] Test file exists for all Logic and Integration stories and passes
- [ ] Manual evidence document exists for all Visual/Feel and UI stories
- [ ] Smoke check passes (run `/smoke-check sprint` before QA hand-off)
- [ ] No regressions introduced — previous sprint's features still pass
- [ ] Code reviewed (via `/code-review` or documented peer review)
- [ ] Story file updated to `Status: Complete` via `/story-done`

**Stories requiring playtest sign-off before close**: [list, or "None"]

---

## Test Results

*Fill in after testing is complete.*

| Story | Automated | Manual | Result | Notes |
|-------|-----------|--------|--------|-------|
| [title] | PASS | — | PASS | |
| [title] | — | PASS | PASS | |
| [title] | FAIL | — | BLOCKED | [describe failure] |

---

## Bugs Found

| ID | Story | Severity | Description | Status |
|----|-------|----------|-------------|--------|
| BUG-001 | | S[1-4] | | Open |

---

## Sign-Off

- **QA Tester**: [name] — [date]
- **QA Lead**: [name] — [date]
- **Sprint Owner**: [name] — [date]

*Template: `.claude/docs/templates/test-plan.md`*
*Generated by: `/qa-plan` — do not edit this line*


@@ -151,6 +151,19 @@ For each epic, decompose the GDD's acceptance criteria into stories:
3. Each group = one story
4. Order stories within the epic: foundation behaviour first, edge cases last
**Story Type Classification** — assign each story a type based on its acceptance criteria:
| Story Type | Assign when criteria reference... |
|---|---|
| **Logic** | Formulas, numerical thresholds, state transitions, AI decisions, calculations |
| **Integration** | Two or more systems interacting, signals crossing boundaries, save/load round-trips |
| **Visual/Feel** | Animation behaviour, VFX, "feels responsive", timing, screen shake, audio sync |
| **UI** | Menus, HUD elements, buttons, screens, dialogue boxes, tooltips |
| **Config/Data** | Balance tuning values, data file changes only — no new code logic |
Mixed stories: assign the type that carries the highest implementation risk and note the secondary type.
The Story Type determines what evidence is required before `/story-done` can mark the story Complete.
For each story, map:
- **GDD requirement**: Which specific acceptance criterion does this satisfy?
- **TR-ID**: Look up the matching entry in `tr-registry.yaml` by normalizing the
@@ -179,6 +192,7 @@ For each story, produce a story file embedding full context:
> **Epic**: [epic name]
> **Status**: Ready
> **Layer**: [Foundation / Core / Feature / Presentation]
> **Type**: [Logic | Integration | Visual/Feel | UI | Config/Data]
> **Manifest Version**: [date from control-manifest.md header — or "N/A" if manifest not yet created]
## Context
@@ -232,6 +246,19 @@ This boundary prevents scope creep and keeps stories independently reviewable.
---
## Test Evidence
**Required evidence** (based on Story Type):
- Logic: `tests/unit/[system]/[story-slug]_test.[ext]` — must exist and pass
- Integration: `tests/integration/[system]/[story-slug]_test.[ext]` OR playtest doc
- Visual/Feel: `production/qa/evidence/[story-slug]-evidence.md` + sign-off
- UI: `production/qa/evidence/[story-slug]-evidence.md` or interaction test
- Config/Data: smoke check pass (`production/qa/smoke-*.md`)
**Status**: [ ] Not yet created
---
## Dependencies
- Depends on: [Story NNN-1 must be DONE, or "None"]
@@ -313,6 +340,8 @@ After approval, write:
This epic is complete when:
- All stories are implemented and reviewed
- All acceptance criteria from [GDD filename] are passing
- All Logic and Integration stories have passing test files in `tests/`
- All Visual/Feel and UI stories have evidence docs with sign-off in `production/qa/evidence/`
- No Foundation or Core layer stories have open blockers
```


@@ -82,6 +82,9 @@ The project progresses through these stages:
- [ ] At least 3 Architecture Decision Records in `docs/architecture/` covering
Foundation-layer systems (scene management, event architecture, save/load)
- [ ] Engine reference docs exist in `docs/engine-reference/[engine]/`
- [ ] Test framework initialized: `tests/unit/` and `tests/integration/` directories exist
- [ ] CI/CD test workflow exists at `.github/workflows/tests.yml` (or equivalent)
- [ ] At least one example test file exists to confirm the framework is functional
- [ ] Master architecture document exists at `docs/architecture/architecture.md`
- [ ] Architecture traceability index exists at `docs/architecture/architecture-traceability.md`
- [ ] `/architecture-review` has been run (a review report file exists in `docs/architecture/`)
@@ -161,7 +164,9 @@ The project progresses through these stages:
- [ ] `src/` has active code organized into subsystems
- [ ] All core mechanics from GDD are implemented (cross-reference `design/gdd/` with `src/`)
- [ ] Main gameplay path is playable end-to-end
- [ ] Test files exist in `tests/unit/` and `tests/integration/` covering Logic and Integration stories
- [ ] All Logic stories from this sprint have corresponding unit test files in `tests/unit/`
- [ ] Smoke check has been run with a PASS or PASS WITH WARNINGS verdict — report exists in `production/qa/`
- [ ] At least 3 distinct playtest sessions documented in `production/playtests/`
- [ ] Playtest reports cover: new player experience, mid-game systems, and difficulty curve
- [ ] Fun hypothesis from Game Concept has been explicitly validated or revised
@@ -186,7 +191,11 @@ The project progresses through these stages:
- [ ] All features from milestone plan are implemented
- [ ] Content is complete (all levels, assets, dialogue referenced in design docs exist)
- [ ] Localization strings are externalized (no hardcoded player-facing text in `src/`)
- [ ] QA test plan exists (`/qa-plan` output in `production/qa/`)
- [ ] QA sign-off report exists (`/team-qa` output — APPROVED or APPROVED WITH CONDITIONS)
- [ ] All Must Have story test evidence is present (Logic/Integration: test files pass; Visual/Feel/UI: sign-off docs in `production/qa/evidence/`)
- [ ] Smoke check passes cleanly (PASS verdict) on the release candidate build
- [ ] No test regressions from previous sprint (test suite passes fully)
- [ ] Balance data has been reviewed (`/balance-check` run)
- [ ] Release checklist completed (`/release-checklist` or `/launch-checklist` run)
- [ ] Store metadata prepared (if applicable)
@@ -304,6 +313,8 @@ Based on the verdict, suggest specific next steps:
- **No interaction pattern library?** → `/ux-design patterns` to initialize it
- **GDDs not cross-reviewed?** → `/review-all-gdds` (run after all MVP GDDs are individually approved)
- **Cross-GDD consistency issues?** → fix flagged GDDs, then re-run `/review-all-gdds`
- **No test framework?** → `/test-setup` to scaffold the framework for your engine
- **No QA plan for current sprint?** → `/qa-plan sprint` to generate one before implementation begins
- **Missing ADRs?** → `/architecture-decision` for individual decisions
- **No master architecture doc?** → `/create-architecture` for the full blueprint
- **ADRs missing engine compatibility sections?** → Re-run `/architecture-decision`


@@ -0,0 +1,260 @@
---
name: qa-plan
description: "Generate a QA test plan for a sprint or feature. Reads GDDs and story files, classifies stories by test type (Logic/Integration/Visual/UI), and produces a structured test plan covering automated tests required, manual test cases, smoke test scope, and playtest sign-off requirements. Run before sprint begins or when starting a major feature."
argument-hint: "[sprint | feature: system-name | story: path]"
user-invocable: true
allowed-tools: Read, Glob, Grep, Write
context: fork
agent: qa-lead
---
# QA Plan
This skill generates a structured QA plan for a sprint, feature, or individual
story. It reads all in-scope story files and their referenced GDDs, classifies
each story by test type, and produces a plan that tells developers exactly what
to automate, what to verify manually, what the smoke test scope is, and when
to bring in a playtester.
Run this before a sprint begins so the team knows upfront what testing work
is required. A test plan written after implementation is a post-mortem, not a
plan.
**Output:** `production/qa/qa-plan-[sprint-slug]-[date].md`
---
## Phase 1: Parse Scope
**Argument:** `$ARGUMENTS` (blank = ask user via AskUserQuestion)
Determine scope from the argument:
- **`sprint`** — read the most recent file in `production/sprints/`, extract
every story file path referenced. If `production/sprint-status.yaml` exists,
use it as the primary story list and fall back to the sprint plan for story
metadata.
- **`feature: [system-name]`** — glob `production/epics/*/story-*.md`, filter
to stories whose file path or title contains the system name. Also check the
epic index file (`EPIC.md`) in that system's directory.
- **`story: [path]`** — validate that the path exists and load that single file.
- **No argument** — use `AskUserQuestion`:
- "What is the scope for this QA plan?"
- Options: "Current sprint", "Specific feature (enter system name)",
"Specific story (enter path)", "Full epic"
After resolving scope, report: "Building QA plan for [N] stories in [scope]."
If a story file path is referenced but the file does not exist, note it as
MISSING and continue with the remaining stories. Do not fail the entire plan
for one missing file.
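The argument handling above amounts to a small dispatcher. The skill itself runs as agent instructions rather than code, so this Python sketch is purely illustrative; the argument forms mirror the skill's `argument-hint`:

```python
# Hypothetical sketch of Phase 1 scope parsing. Argument forms follow the
# skill's argument-hint: "sprint", "feature: system-name", "story: path".

def parse_scope(argument: str):
    """Return (scope_kind, detail) for a /qa-plan argument."""
    arg = argument.strip()
    if not arg:
        return ("ask_user", None)      # no argument -> AskUserQuestion
    if arg == "sprint":
        return ("sprint", None)        # read latest file in production/sprints/
    if arg.startswith("feature:"):
        return ("feature", arg.split(":", 1)[1].strip())
    if arg.startswith("story:"):
        return ("story", arg.split(":", 1)[1].strip())
    return ("ask_user", None)          # unrecognized -> fall back to asking
```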
---
## Phase 2: Load Inputs
For each in-scope story file, read the full file and extract:
- **Story title** and story ID (from filename or header)
- **Story Type** field (if present in the file header — e.g., `Type: Logic`)
- **Acceptance criteria** — the complete numbered/bulleted list
- **Implementation files** — listed under "Files to Create / Modify" or similar
- **Engine notes** — any engine API warnings or version-specific notes
- **GDD reference** — the GDD path(s) cited
- **ADR reference** — the ADR(s) cited
- **Estimate** — hours or story points if present
- **Dependencies** — other stories this one depends on
After reading stories, load supporting context once (not per story):
- `design/gdd/systems-index.md` — to understand system priorities and which
GDDs are approved
- For each unique GDD referenced across all stories: read only the
**Acceptance Criteria** and **Formulas** sections. Do not load full GDD text —
these two sections contain the testable requirements and the math to verify.
- `docs/architecture/control-manifest.md` — scan for forbidden patterns that
automated tests should guard against (if the file exists)
If no GDD is referenced in a story, note it as a gap but do not block the plan.
The story will be classified using acceptance criteria alone.
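The per-story extraction boils down to pulling labelled header fields out of markdown. A rough sketch, under the assumption that story headers use the `> **Field**: value` blockquote form shown in the story template — a real story file may vary, so the regex is illustrative:

```python
import re

# Rough sketch of Phase 2 field extraction. Assumes "> **Field**: value"
# blockquote header lines as in the story template.
FIELD_RE = re.compile(r"^>\s*\*\*(?P<name>[^*]+)\*\*:\s*(?P<value>.+)$")

def extract_header_fields(story_markdown: str) -> dict:
    fields = {}
    for line in story_markdown.splitlines():
        m = FIELD_RE.match(line.strip())
        if m:
            fields[m.group("name").strip()] = m.group("value").strip()
    return fields

# Example story header (invented for illustration):
story = """# Story: Damage Formula
> **Epic**: combat
> **Status**: Ready
> **Type**: Logic
"""
fields = extract_header_fields(story)
# fields["Type"] is "Logic"; a missing field simply won't be in the dict,
# which is what triggers inference from acceptance criteria in Phase 3.
```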
---
## Phase 3: Classify Each Story
For each story, assign a Story Type. If the story already has a `Type:` field
in its header, use that value and validate it against the criteria below. If the
field is missing or ambiguous, infer the type from the acceptance criteria.
| Story Type | Classification Indicators |
|---|---|
| **Logic** | Acceptance criteria reference calculations, formulas, numerical thresholds, state transitions, AI decisions, data validation, buff/debuff stacking, economy transactions, or any testable computation |
| **Integration** | Criteria involve two or more systems interacting, signals or events propagating across system boundaries, save/load round-trips, network sync, or persistence |
| **Visual/Feel** | Criteria reference animation behaviour, VFX, shader output, "feels responsive", perceived timing, screen shake, particle effects, audio sync, or visual feedback quality |
| **UI** | Criteria reference menus, HUD elements, buttons, screens, dialogue boxes, inventory panels, tooltips, or any player-facing interface element |
| **Config/Data** | Changes are limited to balance tuning values, data files, or configuration — no new code logic is involved |
**Mixed stories** (e.g., a story that adds both a formula and a UI display):
assign the primary type based on which acceptance criteria carry the highest
implementation risk, and note the secondary type. Mixed Logic+Integration or
Visual+UI combinations are the most common.
After classifying all stories, produce a classification summary table in
conversation before proceeding to Phase 4. This gives the user visibility into
how tests will be allocated.
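A first-pass keyword heuristic for the inference step might look like the following. The keyword lists are assumptions loosely drawn from the indicator table, not an authoritative classifier; ambiguous stories still need judgment against the full criteria:

```shell
# Sketch: keyword-based first-pass classification (keyword lists are assumptions)
criteria="Damage is computed as base * crit_multiplier on a critical hit"
case "$criteria" in
  *formula*|*multiplier*|*threshold*|*stacking*) type="Logic" ;;
  *signal*|*save/load*|*sync*|*persistence*)     type="Integration" ;;
  *menu*|*HUD*|*button*|*tooltip*)               type="UI" ;;
  *animation*|*VFX*|*particle*|*"screen shake"*) type="Visual/Feel" ;;
  *)                                             type="Unclassified" ;;
esac
echo "$type"
# → Logic
```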
---
## Phase 4: Generate Test Plan
Assemble the full QA plan document. Use this structure:
````markdown
# QA Plan: [Sprint/Feature Name]
**Date**: [date]
**Generated by**: /qa-plan
**Scope**: [N stories across [N systems]]
**Engine**: [engine name from .claude/docs/technical-preferences.md, or "Not configured"]
**Sprint File**: [path to sprint plan if applicable]
---
## Test Summary
| Story | Type | Automated Test Required | Manual Verification Required |
|-------|------|------------------------|------------------------------|
| [story title] | Logic | Unit test — `tests/unit/[system]/` | None |
| [story title] | Integration | Integration test — `tests/integration/[system]/` | Smoke check |
| [story title] | Visual/Feel | None (not automatable) | Screenshot + lead sign-off |
| [story title] | UI | Interaction walkthrough | Manual step-through |
| [story title] | Config/Data | Data validation test | Spot-check in-game values |
---
## Automated Tests Required
### [Story Title] — [Type]
**Test file path**: `tests/[unit|integration]/[system]/[story-slug]_test.[ext]`
**What to test**:
- [Specific formula or rule from the GDD Formulas section]
- [Each named state transition or decision branch]
- [Each side effect that should or should not occur]
**Edge cases to cover**:
- Zero/minimum input values (e.g., 0 damage, empty inventory)
- Maximum/boundary input values (e.g., max level, stat cap)
- Invalid or null input (e.g., missing target, dead entity)
- [Any edge case explicitly called out in the GDD Edge Cases section]
**Estimated test count**: ~[N] unit tests
[If no GDD formula reference was found for this story, note:]
*No formula found in referenced GDD — test cases must be derived from acceptance
criteria directly. Review the GDD Formulas section before writing tests.*
---
## Manual QA Checklist
### [Story Title] — [Type]
**Verification method**: [Screenshot + designer sign-off | Playtest session |
Manual step-through | Comparison against reference footage]
**Who must sign off**: [designer / lead-programmer / qa-lead / art-lead]
**Evidence to capture**: [screenshot of X | video clip of Y | written playtest
notes | side-by-side comparison]
Checklist:
- [ ] [Specific observable condition — concrete and falsifiable]
- [ ] [Another condition]
- [ ] [Every acceptance criterion translated into a manual check item]
*If any criterion uses subjective language ("feels", "looks", "seems"), it must
be supplemented with a specific benchmark or a playtest protocol note.*
---
## Smoke Test Scope
Critical paths to verify before any QA hand-off for this sprint:
1. Game launches to main menu without crash
2. New game / new session can be started
3. [Primary mechanic introduced or changed this sprint]
4. [Any system with a regression risk from this sprint's changes]
5. Save / load cycle completes without data loss (if save system exists)
6. Performance is within budget on target hardware (no new frame spikes)
*Smoke tests are verified by the developer via `/smoke-check`. Reference this
list when running that skill.*
---
## Playtest Requirements
| Story | Playtest Goal | Min Sessions | Target Player Type |
|-------|--------------|--------------|-------------------|
| [story] | [What question must the session answer?] | [N] | [new player / experienced] |
**Sign-off requirement**: Playtest notes must be written to
`production/session-logs/playtest-[sprint]-[story-slug].md` and reviewed by
the [designer / qa-lead] before the story can be marked COMPLETE.
If no stories require playtest validation: *No playtest sessions required for
this sprint.*
---
## Definition of Done — This Sprint
A story is DONE when ALL of the following are true:
- [ ] All acceptance criteria verified — via automated test result OR documented
manual evidence (screenshot, video, or playtest notes with sign-off)
- [ ] Test file exists at the specified path for all Logic and Integration stories
- [ ] Manual evidence document exists for all Visual/Feel and UI stories
- [ ] Smoke check passes (run `/smoke-check sprint` before QA hand-off)
- [ ] No regressions introduced
- [ ] Code reviewed (via `/code-review` or documented peer review)
- [ ] Story file updated to `Status: Complete` (via `/story-done`)
````
When generating content, use the actual story titles, GDD formula text, and
acceptance criteria extracted in Phase 2. Do not use placeholder text — every
test entry should reflect the real requirements of these specific stories.
---
## Phase 5: Write Output
Show the complete plan in conversation (or a summary if the plan is very long),
then ask:
"May I write this QA plan to `production/qa/qa-plan-[sprint-slug]-[date].md`?"
Write the plan exactly as generated — do not truncate.
After writing:
"QA plan written to `production/qa/qa-plan-[sprint-slug]-[date].md`.
Next steps:
- Share this plan with the team before sprint implementation begins
- Run `/smoke-check sprint` after all stories are implemented to gate QA hand-off
- For Logic/Integration stories, create the test files at the listed paths
before marking stories done — `/story-done` checks for them"
---
## Collaborative Protocol
- **Never write the plan without asking** — Phase 5 requires explicit approval.
- **Classify conservatively**: when a story is ambiguous between Logic and
Integration, classify it as Integration — it requires both unit and
integration tests.
- **Do not invent test cases** beyond what acceptance criteria and GDD formulas
support. If a formula is absent from the GDD, flag it rather than guessing.
- **Playtest requirements are advisory**: the user decides whether a playtest
is warranted for borderline Visual/Feel stories. Flag the case; do not mandate.
- Use `AskUserQuestion` for scope selection when no argument is provided.
Keep all other phases non-interactive — present findings, then ask once to
approve the write.
@ -0,0 +1,338 @@
---
name: smoke-check
description: "Run the critical path smoke test gate before QA hand-off. Executes the automated test suite, verifies core functionality, and produces a PASS/FAIL report. Run after a sprint's stories are implemented and before manual QA begins. A failed smoke check means the build is not ready for QA."
argument-hint: "[sprint | quick]"
user-invocable: true
allowed-tools: Read, Glob, Grep, Bash, Write
---
# Smoke Check
This skill is the gate between "implementation done" and "ready for QA
hand-off". It runs the automated test suite, checks for test coverage gaps,
batch-verifies critical paths with the developer, and produces a PASS/FAIL
report.
The rule is simple: **a build that fails smoke check does not go to QA.**
Handing a broken build to QA wastes their time and demoralises the team.
**Output:** `production/qa/smoke-[date].md`
---
## Phase 1: Detect Test Setup
Before running anything, understand the environment:
1. **Test framework check**: verify `tests/` directory exists.
If it does not: "No test directory found at `tests/`. Run `/test-setup`
to scaffold the testing infrastructure, or create the directory manually
if tests live elsewhere." Then stop.
2. **CI check**: check whether `.github/workflows/` contains a workflow file
referencing tests. Note in the report whether CI is configured.
3. **Engine detection**: read `.claude/docs/technical-preferences.md` and
extract the `Engine:` value. Store this for test command selection in
Phase 2.
4. **Smoke test list**: check whether `production/qa/smoke-tests.md` or
`tests/smoke/` exists. If a smoke test list is found, load it for use in
Phase 4. If neither exists, smoke tests will be drawn from the current QA
plan (Phase 4 fallback).
5. **QA plan check**: glob `production/qa/qa-plan-*.md` and take the most
recently modified file. If found, note the path — it will be used in
Phase 3 and Phase 4. If not found, note: "No QA plan found. Run
`/qa-plan sprint` before smoke-checking for best results."
Report findings before proceeding: "Environment: [engine]. Test directory:
[found / not found]. CI configured: [yes / no]. QA plan: [path / not found]."
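The QA plan lookup in step 5 can be sketched as follows; the temp directory stands in for `production/qa/`, and the file names are illustrative:

```shell
# Sketch: take the most recently modified QA plan (glob pattern from step 5)
qa_dir=$(mktemp -d)   # stand-in for production/qa/
touch -t 202603011200 "$qa_dir/qa-plan-sprint-02-2026-03-01.md"
touch -t 202603161200 "$qa_dir/qa-plan-sprint-03-2026-03-16.md"
# ls -t sorts newest first; head -1 takes the most recent
qa_plan=$(ls -t "$qa_dir"/qa-plan-*.md 2>/dev/null | head -1)
basename "$qa_plan"
# → qa-plan-sprint-03-2026-03-16.md
```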
---
## Phase 2: Run Automated Tests
Attempt to run the test suite via Bash. Select the command based on the engine
detected in Phase 1:
**Godot 4:**
```bash
godot --headless --script tests/gdunit4_runner.gd 2>&1
```
If the GDUnit4 runner script does not exist at that path, try:
```bash
godot --headless -s addons/gdunit4/GdUnitRunner.gd 2>&1
```
If neither path exists, note: "GDUnit4 runner not found — confirm the runner
path for your test framework."
**Unity:**
Unity tests require the editor and cannot be run headlessly via shell in most
environments. Check for recent test result artifacts:
```bash
ls -t test-results/ 2>/dev/null | head -5
```
If test result files exist (XML or JSON), read the most recent one and parse
PASS/FAIL counts. If no artifacts exist: "Unity tests must be run from the
editor or CI pipeline. Please confirm test status manually before proceeding."
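One way to parse such an artifact: Unity's test runner emits NUnit3-style XML, so pass/fail counts can be read from the `<test-run>` element's attributes. The file name and counts below are illustrative:

```shell
# Sketch: pull counts from a Unity NUnit3 result file
# (attribute names total/passed/failed follow the NUnit3 result schema)
results=$(mktemp -d)   # stand-in for test-results/
cat > "$results/results-2026-03-16.xml" <<'EOF'
<test-run id="2" total="12" passed="11" failed="1" skipped="0" />
EOF
latest=$(ls -t "$results"/*.xml | head -1)
grep -o 'passed="[0-9]*"\|failed="[0-9]*"' "$latest"
# → passed="11"
# → failed="1"
```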
**Unreal Engine:**
```bash
ls -t Saved/Logs/ 2>/dev/null | grep -i "test\|automation" | head -5
```
If no matching log found: "UE automation tests must be run via the Session
Frontend or CI pipeline. Please confirm test status manually."
**Unknown engine / not configured:**
"Engine not configured in `.claude/docs/technical-preferences.md`. Run
`/setup-engine` to specify the engine, then re-run `/smoke-check`."
**If the test runner is not available in this environment** (engine binary not
on PATH, runner script not found, etc.), report clearly:
"Automated tests could not be executed — engine binary not found on PATH.
Status will be recorded as NOT RUN. Confirm test results from your local IDE
or CI pipeline. Unconfirmed NOT RUN is treated as PASS WITH WARNINGS, not
FAIL — the developer must manually confirm results."
Do not treat NOT RUN as an automatic FAIL. Record it as a warning. The
developer's manual confirmation in Phase 4 can resolve it.
Parse runner output and extract:
- Total tests run
- Passing count
- Failing count
- Names of any failing tests (up to 10; if more, note the count)
- Any crash or error output from the runner itself
---
## Phase 3: Check Test Coverage
If the `quick` argument was passed, skip this phase entirely and note:
"Coverage scan skipped — run `/smoke-check sprint` for full coverage
analysis."
Otherwise, draw the story list from, in priority order:
1. The QA plan found in Phase 1 (its Test Summary table lists expected test
   file paths per story)
2. The current sprint plan from `production/sprints/` (most recently modified
   file)
For each story in scope:
1. Extract the system slug from the story's file path
(e.g., `production/epics/combat/story-001.md``combat`)
2. Glob `tests/unit/[system]/` and `tests/integration/[system]/` for files
whose name contains the story slug or a closely related term
3. Check the story file itself for a `Test file:` header field or a
"Test Evidence" section
Assign a coverage status to each story:
| Status | Meaning |
|--------|---------|
| **COVERED** | A test file was found matching this story's system and scope |
| **MANUAL** | Story type is Visual/Feel or UI; a test evidence document was found |
| **MISSING** | Logic or Integration story with no matching test file |
| **EXPECTED** | Config/Data story — no test file required; spot-check is sufficient |
| **UNKNOWN** | Story file missing or unreadable |
MISSING entries are advisory gaps. They do not cause a FAIL verdict but must
appear prominently in the report and must be resolved before `/story-done` can
fully close those stories.
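The per-story scan can be sketched as follows; the system, slug, and file names are illustrative, while the search locations come from this phase:

```shell
# Sketch: look for a test file whose name contains the story slug
root=$(mktemp -d); system="combat"; slug="crit-damage"
mkdir -p "$root/tests/unit/$system" "$root/tests/integration/$system"
touch "$root/tests/unit/$system/crit_damage_test.gd"
# match the slug in either hyphen or underscore form
match=$(ls "$root/tests/unit/$system" "$root/tests/integration/$system" 2>/dev/null \
  | grep -i "$(echo "$slug" | tr '-' '_')")
[ -n "$match" ] && echo "COVERED: $match" || echo "MISSING"
# → COVERED: crit_damage_test.gd
```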
---
## Phase 4: Run Manual Smoke Checks
Draw the smoke test checklist from, in priority order:
1. The QA plan's "Smoke Test Scope" section (if QA plan was found in Phase 1)
2. `production/qa/smoke-tests.md` (if it exists)
3. `tests/smoke/` directory contents (if it exists)
4. The standard fallback list below (used only when none of the above exist)
Tailor batches 2 and 3 to the actual systems identified from the sprint or QA
plan. Replace bracketed placeholders with real mechanic names from the current
sprint's stories.
Use `AskUserQuestion` to batch-verify. Keep to at most 3 calls.
**Batch 1 — Core stability (always run):**
```
question: "Smoke check — Batch 1: Core stability. Please verify each:"
options:
- "Game launches to main menu without crash — PASS"
- "Game launches to main menu without crash — FAIL"
- "New game / session starts successfully — PASS"
- "New game / session starts successfully — FAIL"
- "Main menu responds to all inputs — PASS"
- "Main menu responds to all inputs — FAIL"
```
**Batch 2 — Sprint mechanic and regression (always run):**
```
question: "Smoke check — Batch 2: This sprint's changes and regression check:"
options:
- "[Primary mechanic this sprint] — PASS"
- "[Primary mechanic this sprint] — FAIL: [describe what broke]"
- "[Second notable change this sprint, if any] — PASS"
- "[Second notable change this sprint] — FAIL"
- "Previous sprint's features still work (no regressions) — PASS"
- "Previous sprint's features — regression found: [brief description]"
```
**Batch 3 — Data integrity and performance (run unless `quick` argument):**
```
question: "Smoke check — Batch 3: Data integrity and performance:"
options:
- "Save / load completes without data loss — PASS"
- "Save / load — FAIL: [describe what broke]"
- "Save / load — N/A (save system not yet implemented)"
- "No new frame rate drops or hitches observed — PASS"
- "Frame rate drops or hitches found — FAIL: [where]"
- "Performance — not checked in this session"
```
Record each response verbatim for the Phase 5 report.
---
## Phase 5: Generate Report
Assemble the full smoke check report:
````markdown
## Smoke Check Report
**Date**: [date]
**Sprint**: [sprint name / number, or "Not identified"]
**Engine**: [engine]
**QA Plan**: [path, or "Not found — run /qa-plan first"]
**Argument**: [sprint | quick | blank]
---
### Automated Tests
**Status**: [PASS ([N] tests, [N] passing) | FAIL ([N] failures) |
NOT RUN ([reason])]
[If FAIL, list failing tests:]
- `[test name]` — [brief failure description from runner output]
[If NOT RUN:]
"Manual confirmation required: did tests pass in your local IDE or CI? This
will determine whether the automated test row contributes to a FAIL verdict."
---
### Test Coverage
| Story | Type | Test File | Coverage Status |
|-------|------|-----------|----------------|
| [title] | Logic | `tests/unit/[system]/[slug]_test.[ext]` | COVERED |
| [title] | Visual/Feel | `tests/evidence/[slug]-screenshots.md` | MANUAL |
| [title] | Logic | — | MISSING ⚠ |
| [title] | Config/Data | — | EXPECTED |
**Summary**: [N] covered, [N] manual, [N] missing, [N] expected.
---
### Manual Smoke Checks
- [x] Game launches without crash — PASS
- [x] New game starts — PASS
- [x] [Core mechanic] — PASS
- [ ] [Other check] — FAIL: [user's description]
- [x] Save / load — PASS
- [-] Performance — not checked this session
---
### Missing Test Evidence
Stories that must have test evidence before they can be marked COMPLETE via
`/story-done`:
- **[story title]** (`[path]`) — Logic story has no test file.
Expected location: `tests/unit/[system]/[story-slug]_test.[ext]`
[If none:] "All Logic and Integration stories have test coverage."
---
### Verdict: [PASS | PASS WITH WARNINGS | FAIL]
[Verdict rules — first matching rule wins:]
**FAIL** if ANY of:
- Automated test suite ran and reported one or more test failures
- Any Batch 1 (core stability) check returned FAIL
- Any Batch 2 (primary sprint mechanic or regression check) returned FAIL
**PASS WITH WARNINGS** if ALL of:
- Automated tests PASS or NOT RUN (developer has not yet confirmed)
- All Batch 1 and Batch 2 smoke checks PASS
- One or more Logic/Integration stories have MISSING test evidence
**PASS** if ALL of:
- Automated tests PASS
- All smoke checks in all batches PASS or N/A
- No MISSING test evidence entries
````
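The precedence of these rules can be sketched as follows; the counter variables are stand-ins for values gathered in Phases 2 through 4:

```shell
# Sketch of the verdict precedence (first matching rule wins); inputs are stand-ins
auto_failures=0; batch_failures=0; missing_evidence=2; auto_status="PASS"
if [ "$auto_failures" -gt 0 ] || [ "$batch_failures" -gt 0 ]; then
  verdict="FAIL"                  # any automated or Batch 1/2 failure
elif [ "$missing_evidence" -gt 0 ] || [ "$auto_status" = "NOT RUN" ]; then
  verdict="PASS WITH WARNINGS"    # advisory gaps only
else
  verdict="PASS"
fi
echo "$verdict"
# → PASS WITH WARNINGS
```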
---
## Phase 6: Write and Gate
Present the full report in conversation, then ask:
"May I write this smoke check report to `production/qa/smoke-[date].md`?"
Write only after approval.
After writing, deliver the gate verdict:
**If verdict is FAIL:**
"The smoke check failed. Do not hand off to QA until these failures are
resolved:
[List each failing automated test or smoke check with a one-line description]
Fix the failures and run `/smoke-check` again to re-gate before QA hand-off."
**If verdict is PASS WITH WARNINGS:**
"Smoke check passed with warnings. The build is ready for manual QA.
Advisory items to resolve before running `/story-done` on affected stories:
[list MISSING test evidence entries]
QA hand-off: share `production/qa/qa-plan-[sprint].md` with the qa-tester
agent to begin manual verification."
**If verdict is PASS:**
"Smoke check passed cleanly. The build is ready for manual QA.
QA hand-off: share `production/qa/qa-plan-[sprint].md` with the qa-tester
agent to begin manual verification."
---
## Collaborative Protocol
- **Never treat NOT RUN as automatic FAIL** — record it as NOT RUN and let
the developer confirm status manually. Unconfirmed NOT RUN contributes to
PASS WITH WARNINGS, not FAIL.
- **Never auto-fix failures** — report them and state what must be resolved.
Do not attempt to edit source code or test files.
- **PASS WITH WARNINGS does not block QA hand-off** — it records advisory
gaps for `/story-done` to follow up on.
- **`quick` argument** skips Phase 3 (coverage scan) and Phase 4 Batch 3.
Use it for rapid re-checks after fixing a specific failure.
- Use `AskUserQuestion` for all manual smoke check verification.
- **Never write the report without asking** — Phase 6 requires explicit
approval before any file is created.
@ -45,6 +45,7 @@ Read the full story file. Extract and hold in context:
- **ADR reference(s)** cited
- **Acceptance Criteria** — the complete list (every checkbox item)
- **Implementation files** — files listed under "files to create/modify"
- **Story Type** — the `Type:` field from the story header (Logic / Integration / Visual/Feel / UI / Config/Data)
- **Engine notes** — any engine-specific constraints noted
- **Definition of Done** — if present, the story-level DoD
- **Estimated vs actual scope** — if an estimate was noted
@ -133,6 +134,42 @@ For each acceptance criterion in the story:
4. For any ADVISORY untested criteria, add to the Completion Notes in Phase 7:
`"Untested criteria: [AC-N list]. Recommend adding tests in a follow-up story."`
### Test Evidence Requirement
Based on the Story Type extracted in Phase 2, check for required evidence:
| Story Type | Required Evidence | Gate Level |
|---|---|---|
| **Logic** | Automated unit test in `tests/unit/[system]/` — must exist and pass | BLOCKING |
| **Integration** | Integration test in `tests/integration/[system]/` OR playtest doc | BLOCKING |
| **Visual/Feel** | Screenshot + sign-off in `production/qa/evidence/` | ADVISORY |
| **UI** | Manual walkthrough doc OR interaction test in `production/qa/evidence/` | ADVISORY |
| **Config/Data** | Smoke check pass report in `production/qa/smoke-*.md` | ADVISORY |
**For Logic stories**: use `Glob` to check `tests/unit/[system]/` for a test
file matching the story slug. If none found:
- Flag as **BLOCKING**: "Logic story has no unit test file. Expected at
`tests/unit/[system]/[story-slug]_test.[ext]`. Create and run the test
before marking this story Complete."
**For Integration stories**: check `tests/integration/[system]/` AND
`production/session-logs/` for a playtest record referencing this story.
If neither exists: flag as **BLOCKING** (same rule as Logic).
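The Logic/Integration evidence check can be sketched as follows; all paths and the slug are illustrative:

```shell
# Sketch of the blocking evidence check; paths and slug are assumptions
root=$(mktemp -d); system="combat"; slug="crit-damage"
mkdir -p "$root/tests/unit/$system" "$root/production/session-logs"
# no test file or playtest record created, simulating the BLOCKING case
if ls "$root/tests/unit/$system/"*"$(echo "$slug" | tr '-' '_')"* >/dev/null 2>&1 \
   || grep -rqs "$slug" "$root/production/session-logs"; then
  echo "evidence found"
else
  echo "BLOCKING: no test evidence for $slug"
fi
# → BLOCKING: no test evidence for crit-damage
```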
**For Visual/Feel and UI stories**: glob `production/qa/evidence/` for a file
referencing this story. If none: flag as **ADVISORY**
"No manual test evidence found. Create `production/qa/evidence/[story-slug]-evidence.md`
using the test-evidence template and obtain sign-off before final closure."
**For Config/Data stories**: check for any `production/qa/smoke-*.md` file.
If none: flag as **ADVISORY** — "No smoke check report found. Run `/smoke-check`."
**If no Story Type is set**: flag as **ADVISORY**
"Story Type not declared. Add `Type: [Logic|Integration|Visual/Feel|UI|Config/Data]`
to the story header to enable test evidence gate enforcement in future stories."
Any BLOCKING test evidence gap prevents the COMPLETE verdict in Phase 6.
---
## Phase 4: Check for Deviations
@ -220,6 +257,11 @@ Before updating any files, present the full report:
| AC-2: [text] | Manual confirmation | COVERED |
| AC-3: [text] | — | UNTESTED |
### Test Evidence
**Story Type**: [Logic | Integration | Visual/Feel | UI | Config/Data | Not declared]
**Required evidence**: [unit test file | integration test or playtest | screenshot + sign-off | walkthrough doc | smoke check pass]
**Evidence found**: [YES — `[path]` | NO — BLOCKING | NO — ADVISORY]
### Deviations
[NONE] OR:
- BLOCKING: [description] — [GDD/ADR reference]
@ -257,6 +299,7 @@ If yes, edit the story file:
**Completed**: [date]
**Criteria**: [X/Y passing] ([any deferred items listed])
**Deviations**: [None] or [list of advisory deviations]
**Test Evidence**: [Logic: test file at path | Visual/Feel: evidence doc at path | None required (Config/Data)]
**Code Review**: [Pending / Complete / Skipped]
```
@ -169,9 +169,14 @@ items pass or are explicitly marked N/A with a stated reason.
- [ ] **Performance budget noted if applicable**: If this story touches any
part of the gameplay loop, rendering, or physics, a performance budget or
a "no performance impact expected — [reason]" note is present.
- [ ] **Test strategy noted**: The story states whether verification is by
unit test, manual test, or playtest. "Acceptance criteria verified by
[test type]" is sufficient.
- [ ] **Story Type declared**: The story includes a `Type:` field in its header
identifying the test category (Logic / Integration / Visual/Feel / UI / Config/Data).
Without this, test evidence requirements cannot be enforced at story close.
Fix: Add `Type: [Logic|Integration|Visual/Feel|UI|Config/Data]` to the story header.
- [ ] **Test evidence requirement is clear**: If the Story Type is set, the story
includes a `## Test Evidence` section stating where evidence will be stored
(test file path for Logic/Integration, or evidence doc path for Visual/Feel/UI).
Fix: Add `## Test Evidence` with the expected evidence location for the story's type.
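A story header that satisfies both checks might look like this; the ID, system, and path are illustrative:

```markdown
Type: Logic
Status: Ready

## Test Evidence
Expected: `tests/unit/combat/story-014-crit-damage_test.gd`
```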
---
@ -0,0 +1,210 @@
---
name: team-qa
description: "Orchestrate the QA team through a full testing cycle. Coordinates qa-lead (strategy + test plan) and qa-tester (test case writing + bug reporting) to produce a complete QA package for a sprint or feature. Covers: test plan generation, test case writing, smoke check gate, manual QA execution, and sign-off report."
argument-hint: "[sprint | feature: system-name]"
user-invocable: true
allowed-tools: Read, Glob, Grep, Write, Task
agent: qa-lead
---
When this skill is invoked, orchestrate the QA team through a structured testing cycle.
**Decision Points:** At each phase transition, use `AskUserQuestion` to present
the user with the subagent's proposals as selectable options. Write the agent's
full analysis in conversation, then capture the decision with concise labels.
The user must approve before moving to the next phase.
## Team Composition
- **qa-lead** — QA strategy, test plan generation, story classification, sign-off report
- **qa-tester** — Test case writing, bug report writing, manual QA documentation
## How to Delegate
Use the Task tool to spawn each team member as a subagent:
- `subagent_type: qa-lead` — Strategy, planning, classification, sign-off
- `subagent_type: qa-tester` — Test case writing and bug report writing
Always provide full context in each agent's prompt (story file paths, QA plan path, scope constraints). Launch independent qa-tester tasks in parallel where possible (e.g., multiple stories in Phase 5 can be scaffolded simultaneously).
## Pipeline
### Phase 1: Load Context
Before doing anything else, gather the full scope:
1. Detect the current sprint or feature scope from the argument:
- If argument is a sprint identifier (e.g., `sprint-03`): read all story files in `production/sprints/[sprint]/`
- If argument is `feature: [system-name]`: glob story files tagged for that system
- If no argument: read `production/session-state/active.md` and `production/sprint-status.yaml` (if present) to infer the active sprint
2. Read `production/stage.txt` to confirm the current project phase.
3. Count stories found and report to the user:
> "QA cycle starting for [sprint/feature]. Found [N] stories. Current stage: [stage]. Ready to begin QA strategy?"
### Phase 2: QA Strategy (qa-lead)
Spawn `qa-lead` via Task to review all in-scope stories and produce a QA strategy.
Prompt the qa-lead to:
- Read each story file
- Classify each story by type: **Logic** / **Integration** / **Visual/Feel** / **UI** / **Config/Data**
- Identify which stories require automated test evidence vs. manual QA
- Flag any stories with missing acceptance criteria or missing test evidence that would block QA
- Estimate manual QA effort (number of test sessions needed)
- Produce a strategy summary table:
| Story | Type | Automated Required | Manual Required | Blocker? |
|-------|------|--------------------|-----------------|----------|
Present the qa-lead's full strategy to the user, then use `AskUserQuestion`:
```
question: "QA Strategy Review"
options:
- "Looks good — proceed to test plan"
- "Adjust story types before proceeding"
- "Skip blocked stories and proceed with the rest"
- "Cancel — resolve blockers first"
```
If blockers are present: list them explicitly. The user may choose to skip blocked stories or cancel the cycle.
### Phase 3: Test Plan Generation
Using the strategy from Phase 2, produce a structured test plan document.
The test plan should cover:
- **Scope**: sprint/feature name, story count, dates
- **Story Classification Table**: from Phase 2 strategy
- **Automated Test Requirements**: which stories need test files, expected paths in `tests/`
- **Manual QA Scope**: which stories need manual walkthrough and what to validate
- **Out of Scope**: what is explicitly not being tested this cycle and why
- **Entry Criteria**: what must be true before QA can begin (smoke check pass, build stable)
- **Exit Criteria**: what constitutes a completed QA cycle (all stories PASS or FAIL with bugs filed)
Ask: "May I write the QA plan to `production/qa/qa-plan-[sprint]-[date].md`?"
Write only after receiving approval.
### Phase 4: Smoke Check Gate
Before any manual QA begins, run the smoke check.
Spawn `qa-lead` via Task with instructions to:
- Review the `tests/smoke/` directory for the current smoke test list
- Check whether each smoke test scenario can be verified given the current build
- Produce a smoke check result: **PASS** / **PASS WITH WARNINGS** / **FAIL**
Report the result to the user:
- **PASS**: "Smoke check passed. Proceeding to test case writing."
- **PASS WITH WARNINGS**: "Smoke check passed with warnings: [list issues]. These are non-blocking. Proceeding — note these for the sign-off report."
- **FAIL**: "Smoke check failed. QA cannot begin until these issues are resolved:
[list failures]
Fix them and re-run `/smoke-check`, or re-run `/team-qa` once resolved."
On FAIL: stop the cycle and surface the list of failures. Do not proceed.
### Phase 5: Test Case Writing (qa-tester)
For each story requiring manual QA (Visual/Feel, UI, Integration without automated tests):
Spawn `qa-tester` via Task for each story (run in parallel where possible), providing:
- The story file path
- The relevant section of the QA plan for that story
- The GDD acceptance criteria for the system being tested (if available)
- Instructions to write detailed test cases covering all acceptance criteria
Each test case set should include:
- **Preconditions**: game state required before testing begins
- **Steps**: numbered, unambiguous actions
- **Expected Result**: what should happen
- **Actual Result**: field left blank for the tester to fill in
- **Pass/Fail**: field left blank
Present the test cases to the user for review before execution. Group by story.
Use `AskUserQuestion` per story group (batched 3-4 at a time):
```
question: "Test cases ready for [Story Group]. Review before manual QA begins?"
options:
- "Approved — begin manual QA for these stories"
- "Revise test cases for [story name]"
- "Skip manual QA for [story name] — not ready"
```
### Phase 6: Manual QA Execution
Walk through each story in the approved manual QA list.
Batch stories into groups of 3-4 and use `AskUserQuestion` for each:
```
question: "Manual QA — [Story Title]\n[brief description of what to test]"
options:
- "PASS — all acceptance criteria verified"
- "PASS WITH NOTES — minor issues found (describe after)"
- "FAIL — criteria not met (describe after)"
- "BLOCKED — cannot test yet (reason)"
```
After each FAIL result: use `AskUserQuestion` to collect the failure description, then spawn `qa-tester` via Task to write a formal bug report in `production/qa/bugs/`.
Bug report naming: `BUG-[NNN]-[short-slug].md` (increment NNN from existing bugs in the directory).
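Deriving the next ID can be sketched as follows; the temp directory stands in for `production/qa/bugs/`, and the existing file names are illustrative:

```shell
# Sketch: compute the next bug ID from existing BUG-NNN-slug.md reports
bugdir=$(mktemp -d)   # stand-in for production/qa/bugs/
touch "$bugdir/BUG-001-crash-on-load.md" "$bugdir/BUG-007-hud-overlap.md"
# strip everything but the numeric ID, take the highest, then increment
last=$(ls "$bugdir"/BUG-*.md | sed 's/.*BUG-0*\([0-9][0-9]*\)-.*/\1/' | sort -n | tail -1)
printf 'BUG-%03d-%s.md\n' $(( ${last:-0} + 1 )) "save-corruption"
# → BUG-008-save-corruption.md
```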
After collecting all results, summarize:
- Stories PASS: [count]
- Stories PASS WITH NOTES: [count]
- Stories FAIL: [count] — bugs filed: [IDs]
- Stories BLOCKED: [count]
### Phase 7: QA Sign-Off Report
Spawn `qa-lead` via Task to produce the sign-off report using all results from Phases 4-6.
The sign-off report format:
```markdown
## QA Sign-Off Report: [Sprint/Feature]
**Date**: [date]
**QA Lead sign-off**: [pending]
### Test Coverage Summary
| Story | Type | Auto Test | Manual QA | Result |
|-------|------|-----------|-----------|--------|
| [title] | Logic | PASS | — | PASS |
| [title] | Visual | — | PASS | PASS |
### Bugs Found
| ID | Story | Severity | Status |
|----|-------|----------|--------|
| BUG-001 | [story] | S2 | Open |
### Verdict: APPROVED / APPROVED WITH CONDITIONS / NOT APPROVED
**Conditions** (if any): [list what must be fixed before the build advances]
### Next Step
[guidance based on verdict]
```
Verdict rules:
- **APPROVED**: All stories PASS or PASS WITH NOTES; no S1/S2 bugs open
- **APPROVED WITH CONDITIONS**: S3/S4 bugs open, or PASS WITH NOTES issues documented; no S1/S2 bugs
- **NOT APPROVED**: Any S1/S2 bugs open; or stories FAIL without documented workaround
Next step guidance by verdict:
- APPROVED: "Build is ready for the next phase. Run `/gate-check` to validate advancement."
- APPROVED WITH CONDITIONS: "Resolve conditions before advancing. S3/S4 bugs may be deferred to polish."
- NOT APPROVED: "Resolve S1/S2 bugs and re-run `/team-qa` or targeted manual QA before advancing."
Ask: "May I write this QA sign-off report to `production/qa/qa-signoff-[sprint]-[date].md`?"
Write only after receiving approval.
## Output
A summary covering: stories in scope, smoke check result, manual QA results, bugs filed (with IDs and severities), and the final APPROVED / APPROVED WITH CONDITIONS / NOT APPROVED verdict.

View file

@@ -0,0 +1,423 @@
---
name: test-setup
description: "Scaffold the test framework and CI/CD pipeline for the project's engine. Creates the tests/ directory structure, engine-specific test runner configuration, and GitHub Actions workflow. Run once during Technical Setup phase before the first sprint begins."
argument-hint: "[force]"
user-invocable: true
allowed-tools: Read, Glob, Grep, Bash, Write
---
# Test Setup
This skill scaffolds the automated testing infrastructure for the project.
It detects the configured engine, generates the appropriate test runner
configuration, creates the standard directory layout, and wires up CI/CD
so tests run on every push.
Run this once during the Technical Setup phase, before any implementation
begins. A test framework installed at sprint start costs 30 minutes.
A test framework installed at sprint four costs 3 sprints.
**Output:** `tests/` directory structure + `.github/workflows/tests.yml`
---
## Phase 1: Detect Engine and Existing State
1. **Read engine config**:
- Read `.claude/docs/technical-preferences.md` and extract the `Engine:` value.
- If engine is not configured (`[TO BE CONFIGURED]`), stop:
"Engine not configured. Run `/setup-engine` first, then re-run `/test-setup`."
2. **Check for existing test infrastructure**:
- Glob `tests/` — does the directory exist?
- Glob `tests/unit/` and `tests/integration/` — do subdirectories exist?
- Glob `.github/workflows/` — does a CI workflow file exist?
- Glob `tests/gdunit4_runner.gd` (Godot) or `tests/EditMode/` (Unity) or
`Source/Tests/` (Unreal) for engine-specific artifacts.
3. **Report findings**:
- "Engine: [engine]. Test directory: [found / not found]. CI workflow: [found / not found]."
- If everything already exists AND `force` argument was not passed:
"Test infrastructure appears to be in place. Re-run with `/test-setup force`
to regenerate. Proceeding will not overwrite existing test files."
If the `force` argument is passed, skip the "already exists" early-exit and
proceed — but still do not overwrite files that already exist at a given path.
Only create files that are missing.
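The `force` semantics amount to a create-if-missing write. A minimal Python sketch of the rule (the helper name is hypothetical):

```python
from pathlib import Path

def create_if_missing(path: str, content: str) -> bool:
    """Write content only when no file exists at path; never overwrite."""
    p = Path(path)
    if p.exists():
        return False  # leave existing test files untouched
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(content)
    return True
```

With or without `force`, every file write in this skill goes through this check, so re-running the skill is always safe.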
---
## Phase 2: Present Plan
Based on the engine detected and the existing state, present a plan:
```
## Test Setup Plan — [Engine]
I will create the following (skipping any that already exist):
tests/
unit/ — Isolated unit tests for formulas, state, and logic
integration/ — Cross-system tests and save/load round-trips
smoke/ — Critical path test list (15-minute manual gate)
evidence/ — Screenshot and manual test sign-off records
README.md — Test framework documentation
[Engine-specific files — see per-engine details below]
.github/workflows/tests.yml — CI: run tests on every push to main
Estimated time: ~5 minutes to create all files.
```
Ask: "May I create these files? I will not overwrite any test files that
already exist at these paths."
Do not proceed without approval.
---
## Phase 3: Create Directory Structure
After approval, create the following files:
### `tests/README.md`
````markdown
# Test Infrastructure
**Engine**: [engine name + version]
**Test Framework**: [GdUnit4 | Unity Test Framework | UE Automation]
**CI**: `.github/workflows/tests.yml`
**Setup date**: [date]
## Directory Layout
```
tests/
  unit/          # Isolated unit tests (formulas, state machines, logic)
  integration/   # Cross-system and save/load tests
  smoke/         # Critical path test list for /smoke-check gate
  evidence/      # Screenshot logs and manual test sign-off records
```
## Running Tests
[Engine-specific command — see below]
## Test Naming
- **Files**: `[system]_[feature]_test.[ext]`
- **Functions**: `test_[scenario]_[expected]`
- **Example**: `combat_damage_test.gd` → `test_base_attack_returns_expected_damage()`
## Story Type → Test Evidence
| Story Type | Required Evidence | Location |
|---|---|---|
| Logic | Automated unit test — must pass | `tests/unit/[system]/` |
| Integration | Integration test OR playtest doc | `tests/integration/[system]/` |
| Visual/Feel | Screenshot + lead sign-off | `tests/evidence/` |
| UI | Manual walkthrough OR interaction test | `tests/evidence/` |
| Config/Data | Smoke check pass | `production/qa/smoke-*.md` |
## CI
Tests run automatically on every push to `main` and on every pull request.
A failed test suite blocks merging.
````
### Engine-specific files
#### Godot 4 (`Engine: Godot`)
Create `tests/gdunit4_runner.gd`:
```gdscript
# GdUnit4 test runner — invoked by CI and /smoke-check
# Usage: godot --headless --script tests/gdunit4_runner.gd
# NOTE: the runner script path below may vary between GdUnit4 versions;
# recent versions ship a CLI entry point under addons/gdUnit4/bin/ instead.
extends SceneTree

func _init() -> void:
    var runner := load("res://addons/gdunit4/GdUnitRunner.gd")
    if runner == null:
        push_error("GdUnit4 not found. Install via AssetLib or addons/.")
        quit(1)
        return
    var instance = runner.new()
    instance.run_tests()
    quit(0)
```
Create `tests/unit/.gdignore_placeholder` with content:
`# Unit tests go here — one subdirectory per system (e.g., tests/unit/combat/)`
Create `tests/integration/.gdignore_placeholder` with content:
`# Integration tests go here — one subdirectory per system`
Note in the README: **Installing GdUnit4**
```
1. Open Godot → AssetLib → search "GdUnit4" → Download & Install
2. Enable the plugin: Project → Project Settings → Plugins → GdUnit4 ✓
3. Restart the editor
4. Verify: res://addons/gdunit4/ exists
```
#### Unity (`Engine: Unity`)
Create `tests/EditMode/` placeholder file `tests/EditMode/README.md`:
```markdown
# Edit Mode Tests
Unit tests that run without entering Play Mode.
Use for pure logic: formulas, state machines, data validation.
Assembly definition required: `tests/EditMode/EditModeTests.asmdef`
```
Create `tests/PlayMode/README.md`:
```markdown
# Play Mode Tests
Integration tests that run in a real game scene.
Use for cross-system interactions, physics, and coroutines.
Assembly definition required: `tests/PlayMode/PlayModeTests.asmdef`
```
Note in the README: **Enabling Unity Test Framework**
```
Window → General → Test Runner
(Unity Test Framework is included by default in Unity 2019+)
```
#### Unreal Engine (`Engine: Unreal` or `Engine: UE5`)
Create `Source/Tests/README.md`:
```markdown
# Unreal Automation Tests
Tests use the UE Automation Testing Framework.
Run via: Session Frontend → Automation → select "MyGame." tests
Or headlessly: UnrealEditor -nullrhi -ExecCmds="Automation RunTests MyGame.; Quit"
Test class naming: F[SystemName]Test
Test category naming: "MyGame.[System].[Feature]"
```
---
## Phase 4: Create CI/CD Workflow
### Godot 4
Create `.github/workflows/tests.yml`:
```yaml
name: Automated Tests
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
test:
name: Run GdUnit4 Tests
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
with:
lfs: true
- name: Run GdUnit4 Tests
uses: MikeSchulze/gdUnit4-action@v1
with:
godot-version: '[VERSION FROM docs/engine-reference/godot/VERSION.md]'
paths: |
tests/unit
tests/integration
report-name: test-results
- name: Upload Test Results
if: always()
uses: actions/upload-artifact@v4
with:
name: test-results
path: reports/
```
### Unity
Create `.github/workflows/tests.yml`:
```yaml
name: Automated Tests
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
test:
name: Run Unity Tests
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
with:
lfs: true
- name: Run Edit Mode Tests
uses: game-ci/unity-test-runner@v4
env:
UNITY_LICENSE: ${{ secrets.UNITY_LICENSE }}
with:
testMode: editmode
artifactsPath: test-results/editmode
- name: Run Play Mode Tests
uses: game-ci/unity-test-runner@v4
env:
UNITY_LICENSE: ${{ secrets.UNITY_LICENSE }}
with:
testMode: playmode
artifactsPath: test-results/playmode
- name: Upload Test Results
if: always()
uses: actions/upload-artifact@v4
with:
name: test-results
path: test-results/
```
Note: Unity CI requires a `UNITY_LICENSE` secret. Add it to the repository's
GitHub secrets before the first CI run.
### Unreal Engine
Create `.github/workflows/tests.yml`:
```yaml
name: Automated Tests
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
test:
name: Run UE Automation Tests
runs-on: self-hosted # UE requires a local runner with the editor installed
steps:
- name: Checkout
uses: actions/checkout@v4
with:
lfs: true
- name: Run Automation Tests
run: |
"$UE_EDITOR_PATH" "${{ github.workspace }}/[ProjectName].uproject" \
-nullrhi -nosound \
-ExecCmds="Automation RunTests MyGame.; Quit" \
-log -unattended
shell: bash
- name: Upload Logs
if: always()
uses: actions/upload-artifact@v4
with:
name: test-logs
path: Saved/Logs/
```
Note: UE CI requires a self-hosted runner with Unreal Editor installed.
Set the `UE_EDITOR_PATH` environment variable on the runner.
---
## Phase 5: Create Smoke Test Seed
Create `tests/smoke/critical-paths.md`:
```markdown
# Smoke Test: Critical Paths
**Purpose**: Run these 10-15 checks in under 15 minutes before any QA hand-off.
**Run via**: `/smoke-check` (which reads this file)
**Update**: Add new entries when new core systems are implemented.
## Core Stability (always run)
1. Game launches to main menu without crash
2. New game / session can be started from the main menu
3. Main menu responds to all inputs without freezing
## Core Mechanic (update per sprint)
<!-- Add the primary mechanic for each sprint here as it is implemented -->
<!-- Example: "Player can move, jump, and the camera follows correctly" -->
4. [Primary mechanic — update when first core system is implemented]
## Data Integrity
5. Save game completes without error (once save system is implemented)
6. Load game restores correct state (once load system is implemented)
## Performance
7. No visible frame rate drops on target hardware (60fps target)
8. No memory growth over 5 minutes of play (once core loop is implemented)
```
---
## Phase 6: Post-Setup Summary
After writing all files, report:
```
Test infrastructure created for [engine].
Files created:
- tests/README.md
- tests/unit/ (directory)
- tests/integration/ (directory)
- tests/smoke/critical-paths.md
- tests/evidence/ (directory)
[engine-specific files]
- .github/workflows/tests.yml
Next steps:
1. [Engine-specific install step, e.g., "Install GdUnit4 via AssetLib"]
2. Write your first test: create tests/unit/[first-system]/[system]_test.[ext]
3. Run `/qa-plan sprint` before your first sprint to classify stories and set
test evidence requirements
4. Run `/smoke-check` before every QA hand-off
Gate note: /gate-check Technical Setup → Pre-Production now requires:
- tests/ directory with unit/ and integration/ subdirectories
- .github/workflows/tests.yml
- At least one example test file
Write at least one example test before running /gate-check to advance.
```
---
## Collaborative Protocol
- **Never overwrite existing test files** — only create files that are missing.
If a test runner file exists, leave it as-is.
- **Always ask before creating files** — Phase 2 requires explicit approval.
- **Engine detection is non-negotiable** — if the engine is not configured,
stop and redirect to `/setup-engine`. Do not guess.
- **`force` flag skips the "already exists" early-exit but never overwrites.**
It means "create any missing files even if the directory already exists."
- For Unity CI, note that the `UNITY_LICENSE` secret must be configured
manually. Do not attempt to automate license management.