Appendix F · Document requirements matrix (Project → Milestone → Task)
This appendix maps every AIDD document to a three-level project hierarchy, so that at any level a team can answer three questions: which documents must exist, who owns them, and what proves the level is complete. It is the traceability backbone of the method — read it alongside the stage-depth matrix in 10 Setup and stages, which covers step depth; this appendix covers document requirements.
The three levels
| Level | What it is | AIDD meaning | Spans |
|---|---|---|---|
| Project | the whole product or engagement | the living documentation — documents created once and kept for the life of the product | all milestones |
| Milestone | a stage or release | one pass of the flow at a chosen depth: Prototype, POC, MVP, or Production-Ready; groups many tasks | many tasks |
| Task | one feature through the flow | a single pass of Specify → … → Verify → Observe; the smallest unit with its own gate records | the seven steps |
A project sets up the living documentation once. A milestone is a depth-bounded goal that groups tasks and has its own entry and exit document gates. A task is one feature, and it produces the per-feature artifacts.
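The three levels can be pictured as nested records. The sketch below is illustrative only: the class and field names (`Project`, `Milestone`, `Task`, `living_docs`, `gate_records`) are assumptions made for this example and are not taken from any AIDD tooling.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout; this only mirrors the levels table above.
DEPTHS = ("Prototype", "POC", "MVP", "Production-Ready")


@dataclass
class Task:
    name: str  # one feature through the flow
    gate_records: list[str] = field(default_factory=list)  # per-step gate outcomes


@dataclass
class Milestone:
    depth: str  # one of DEPTHS
    tasks: list[Task] = field(default_factory=list)

    def __post_init__(self) -> None:
        # A milestone is depth-bounded, so reject depths the method does not name.
        if self.depth not in DEPTHS:
            raise ValueError(f"unknown depth: {self.depth!r}")


@dataclass
class Project:
    name: str
    living_docs: list[str] = field(default_factory=list)  # created once, kept for life
    milestones: list[Milestone] = field(default_factory=list)


project = Project(
    name="Mobile Banking App",
    living_docs=["PROJECT.md", "CONVENTIONS.md", "GLOSSARY.md"],
    milestones=[
        Milestone("MVP", [Task("Transfer between accounts"), Task("View balance")])
    ],
)

# Walk the hierarchy top-down: project -> milestones -> tasks.
task_names = [t.name for m in project.milestones for t in m.tasks]
print(task_names)
```

Keeping the living documents on the project and the gate records on the task reflects where each artifact is owned; a milestone only groups tasks and fixes the depth.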
How the hierarchy decomposes
```mermaid
flowchart TD
    P["PROJECT — the product<br/>PROJECT.md (foundation) · CONVENTIONS · GLOSSARY · MODEL_REGISTRY · allowlist · playbook"]
    P --> M1["MILESTONE · Prototype"]
    P --> M2["MILESTONE · POC"]
    P --> M3["MILESTONE · MVP"]
    P --> M4["MILESTONE · Production-Ready"]
    M3 --> T1["TASK · Transfer between accounts<br/>SPEC · feature · contract · tests · code · gate records"]
    M3 --> T2["TASK · View balance"]
    M3 --> T3["TASK · Transaction history"]
    classDef p fill:#F1EFE8,stroke:#5F5E5A,color:#2C2C2A;
    classDef m fill:#FAEEDA,stroke:#BA7517,color:#633806;
    classDef t fill:#E6F1FB,stroke:#185FA5,color:#042C53;
    class P p; class M1,M2,M3,M4 m; class T1,T2,T3 t;
```
Matrix 1 — Documents by level (ownership and lifespan)
Which document lives at which level, who is accountable for it, and how long it lasts.
| Document | Level | Created | Lifespan | Accountable owner |
|---|---|---|---|---|
| `PROJECT.md` (foundation: domain · spec · UI/UX) | Project | setup, grows | whole project | Product / Architect |
| `CONVENTIONS.md` | Project | setup | whole project | Architect / Lead |
| `GLOSSARY.md` | Project | setup, grows | whole project | Product / Domain |
| `MODEL_REGISTRY.md` | Project | setup | whole project | Architect / Lead |
| `dependencies.allowlist` | Project | setup | whole project | Security |
| `playbook/*.md` (prompts) | Project | setup, versioned | whole project | Eng Lead |
| Stage plan / roadmap | Milestone | per milestone | the milestone | EM / Delivery |
| Milestone exit report | Milestone | milestone end | the milestone | EM / Delivery |
| `SLO.md` (objectives) | Milestone (MVP+) | from MVP | from MVP onward | DevOps / SRE |
| `SPEC.md` | Task | per feature | living | Product / Domain |
| `features/*.feature` | Task | per feature | living | QA / Test |
| `contracts/*.md` | Task → Project | per feature, then frozen | living doc (promoted to project) | Architect / Lead |
| `tests/*` | Task | per feature | living | QA / Engineer |
| Source code | Task | per feature | disposable | Engineer |
| Gate outcome records | Task | per step | kept for audit | the reviewer |
Note the one promotion: a contract is authored at task level but, once frozen, becomes part of the project's living documentation — other tasks depend on it. That promotion is why a contract change is a project-level change request, not a task-local edit.
Matrix 2 — Documents required by milestone
Which documents must exist, and at what depth, to exit each milestone. Depth: Deep · Core · Light · — (not required).
| Document | Prototype | POC | MVP | Production-Ready |
|---|---|---|---|---|
| `CONVENTIONS.md` | Light | Core | Required | Required |
| `GLOSSARY.md` | seed | Core | Required | Required |
| `MODEL_REGISTRY.md` | Required | Required | Required | Required |
| `dependencies.allowlist` | optional | Required | Required | Required |
| `playbook/*.md` | Required | Required | Required | Required |
| `SPEC.md` | Light | Deep (risky slice) | Required (full) | Required (full) |
| Design: flows + screen states | Deep | Light | Core | Deep |
| `features/*.feature` | — | Core | Required | Exhaustive |
| `contracts/*.md` (frozen) | — | Core (risky slice) | Required (frozen) | Required (versioned) |
| `tests/*` | — | Core | Core | Full coverage |
| `SLO.md` | — | — | Light | Required |
| Gate outcome records | — | Core | Required | Required (all HARD-STOP) |
| Operate / observe report | — | — | Light | Required |
| Milestone exit report | Light | Core | Required | Required |
Reading it: a Prototype exits on a deep design and little else; a POC adds a deep spec plus core scenarios, contract, and tests on the risky slice; an MVP requires the full per-feature document set plus light operations; Production-Ready requires everything at full depth with operations and audit-grade gate records.
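Matrix 2 can also be read as a lookup table. The following is a minimal sketch that transcribes the matrix into a dictionary and lists what a given milestone must produce; the names `MATRIX_2` and `docs_for` are hypothetical, `None` stands in for the "not required" cell, and keeping "optional" entries in the result (with their label) is a choice made for this example.

```python
from typing import Optional

MILESTONES = ("Prototype", "POC", "MVP", "Production-Ready")

# Matrix 2 transcribed row by row; None marks a "not required" cell.
MATRIX_2: dict[str, tuple[Optional[str], ...]] = {
    "CONVENTIONS.md": ("Light", "Core", "Required", "Required"),
    "GLOSSARY.md": ("seed", "Core", "Required", "Required"),
    "MODEL_REGISTRY.md": ("Required", "Required", "Required", "Required"),
    "dependencies.allowlist": ("optional", "Required", "Required", "Required"),
    "playbook/*.md": ("Required", "Required", "Required", "Required"),
    "SPEC.md": ("Light", "Deep (risky slice)", "Required (full)", "Required (full)"),
    "Design: flows + screen states": ("Deep", "Light", "Core", "Deep"),
    "features/*.feature": (None, "Core", "Required", "Exhaustive"),
    "contracts/*.md (frozen)": (
        None, "Core (risky slice)", "Required (frozen)", "Required (versioned)",
    ),
    "tests/*": (None, "Core", "Core", "Full coverage"),
    "SLO.md": (None, None, "Light", "Required"),
    "Gate outcome records": (None, "Core", "Required", "Required (all HARD-STOP)"),
    "Operate / observe report": (None, None, "Light", "Required"),
    "Milestone exit report": ("Light", "Core", "Required", "Required"),
}


def docs_for(milestone: str) -> dict[str, str]:
    """Return {document: depth} for every cell that is not empty at this milestone."""
    column = MILESTONES.index(milestone)  # raises ValueError for an unknown milestone
    return {
        doc: depths[column]
        for doc, depths in MATRIX_2.items()
        if depths[column] is not None
    }


for doc, depth in docs_for("Prototype").items():
    print(f"{doc}: {depth}")
```

Calling `docs_for("MVP")` in the same way gives the exit checklist for an MVP, including the light `SLO.md` that a Prototype does not need.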
Matrix 3 — Documents required per task (the seven steps)
Every task, regardless of milestone, produces this artifact chain. The depth varies by milestone (Matrix 2); the sequence and exit gate do not.
| Step | Required document | Exit gate (the proof) | Detail |
|---|---|---|---|
| 1 Specify | `SPEC.md` | rules + named rejections, assumptions ranked lowest-confidence first (biggest risk ⚠-flagged) | 03 |
| 2 Scenarios | `features/<task>.feature` | one scenario per rule | 04 |
| 3 Contract | `contracts/<task>.md` | frozen + contract tests green | 05 |
| 4 Tests | `tests/<task>_*` | one test per scenario, red first | 06 |
| 5 Build | source code + evidence bundle | all tests green, nothing weakened | 07 |
| 6 Verify | gate outcome record | `PASS` / `RISK-ACCEPTED` / `HARD-STOP` (auto-resolved on evidence under `autonomy: auto`; security always escalates) | 08 |
| 7 Observe | `TASK.md` §7 OBSERVE block | released behind a flag; scenario-monitors live; spec delta + lessons learned captured | 09 |
A task is done when the build's documents exist and the Verify record reads PASS (or a signed RISK-ACCEPTED); the seventh step — Observe (§7) — then runs in production and feeds the next loop's Specify. See the master shippable checklist in Appendix E.
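The first four rows of Matrix 3 name their documents by path pattern, so a rough presence check is easy to sketch. The helper names below (`required_patterns`, `missing_steps`) are hypothetical, and steps 5 to 7 are left out because the matrix does not identify them by a file pattern; this is an illustration of the artifact chain, not the engine's behavior.

```python
from fnmatch import fnmatch


def required_patterns(task: str) -> dict[str, str]:
    """Path patterns for steps 1-4 of Matrix 3, with <task> filled in."""
    return {
        "Specify": "SPEC.md",
        "Scenarios": f"features/{task}.feature",
        "Contract": f"contracts/{task}.md",
        "Tests": f"tests/{task}_*",
    }


def missing_steps(task: str, present: list[str]) -> list[str]:
    """Return the steps whose required document has no matching path."""
    return [
        step
        for step, pattern in required_patterns(task).items()
        if not any(fnmatch(path, pattern) for path in present)
    ]


present = ["SPEC.md", "features/transfer.feature", "tests/transfer_test.py"]
print(missing_steps("transfer", present))  # ['Contract']
```

A check like this only proves that files exist; whether each one meets its exit gate is still decided at the gate itself.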
Matrix 4 — Executable proofs (the claims the engine enforces)
The rows above are the method's promises. A promise a tool quietly breaks is worse than none — so the add engine ships a proof-harness: each invariant below is pinned by an automated test that fails loudly if the Story (this book) and the State (the engine) drift apart. This table is the coverage so far, not a completeness claim — but the minimalism-and-coverage audit has now run once over Matrices 1–3 (see Sweep findings below); what it could cheaply prove, it added; what it deliberately left unenforced, it recorded.
| Claim (where it lives) | The engine enforces | Proof test |
|---|---|---|
| No silent skips (principle 7) · "done only when Verify reads PASS" (Matrix 3) | `gate PASS` is refused unless the task has reached `verify` | `test_gate_pass_refused_before_verify` |
| A passed task is genuinely done | `gate PASS` at `verify` advances to `done` | `test_gate_pass_at_verify_reaches_done` |
| Deliberate ≠ silent | the explicit `phase` command is a logged escape hatch the guardrail does not block | `test_phase_override_escape_hatch` |
| "A security finding is ALWAYS HARD-STOP" | `HARD-STOP` is recordable from any phase and never forces `done` | `test_hardstop_recordable_mid_build` |
| "done … or a signed RISK-ACCEPTED" (Matrix 3) | `gate RISK-ACCEPTED` at `verify` advances to `done` (same guard as `PASS`) | `test_risk_accepted_complete_reaches_done` |
| A waived task can complete its milestone (the point of the waiver) | the completeness predicate counts a signed `RISK-ACCEPTED` as done, so `milestone-done` / `ready` / `check` / `archive` accept it; it does not silently block | `test_milestone_done_accepts_a_waived_task` · `test_check_tolerates_a_recorded_waiver` |
| A waiver is signed (owner · ticket · expiry) | `gate RISK-ACCEPTED` is refused without all three; they are stored in state | `test_risk_accepted_requires_waiver` · `test_risk_accepted_partial_waiver_refused` |
| A waiver can expire: a lapsed one is caught, not trusted forever | `check` FAILS a `RISK-ACCEPTED` task whose stored `expires` is before today; fail-closed on a missing/unparseable date (`waiver_expired`) | `test_check_flags_expired_waiver` · `test_check_passes_unexpired_waiver` · `test_check_failclosed_on_unparseable_expires` |
| The Story is never auto-loaded (principle 9, the Minimal pillar) | no command reads a `docs/` chapter at runtime, and the spy runs every subcommand the parser exposes, so "no command" is universal, not a subset; a project with no `docs/` runs the whole lifecycle | `test_full_lifecycle_runs_with_no_story` · `test_no_command_reads_a_docs_chapter` · `test_every_subcommand_is_covered` |
| The book's gate outcomes are the engine's | `PASS` · `RISK-ACCEPTED` · `HARD-STOP` exist in both prose and `GATES` | `test_book_gate_outcomes_match_engine` |
The tests are the source of truth; this table is their index. If a row here is ever unproven, that is a gap in the method, not a detail — the proof-harness exists to make such gaps fail loudly. (Tests: add-method/tooling/test_proof_harness.py, test_waiver.py.)
Now closed: an earlier version of this table flagged a RISK-ACCEPTED gap — the engine advanced only PASS to done, so a waived task could not complete its milestone, and the waiver fields were uncaptured. The RISK-ACCEPTED rows above close it: a signed waiver (owner · ticket · expiry) now completes a verify-phase task and is stored in state for a later check to expire. Closing it took two edits, not one — advancing the gate to done was necessary but not sufficient, because the shared completeness predicate (milestone-done / ready / check / archive all read it) still counted only PASS; a waived task reached done yet silently blocked its milestone until that predicate was taught to count a signed RISK-ACCEPTED too. The end-to-end row above is what catches that class of half-fix — proving the task completes is not proving the milestone can. The pattern that found it — book-claim → engine-enforces → named test — is the standing way to audit the remaining rows.
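The half-fix described above is easier to see in code. The sketch below is not the engine's implementation: `Waiver`, `TaskState`, `counts_as_done`, and `milestone_done` are hypothetical names, and the field layout is assumed from the prose (owner, ticket, expiry). It shows a single shared predicate that treats a signed `RISK-ACCEPTED` the same as `PASS`, so every caller that reads it agrees on what "done" means.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Waiver:
    owner: str
    ticket: str
    expires: str


@dataclass
class TaskState:
    phase: str
    gate: Optional[str] = None
    waiver: Optional[Waiver] = None


def counts_as_done(task: TaskState) -> bool:
    """Shared completeness predicate: PASS, or RISK-ACCEPTED with a full waiver."""
    if task.phase != "done":
        return False
    if task.gate == "PASS":
        return True
    if task.gate == "RISK-ACCEPTED":
        w = task.waiver
        # "Signed" here means all three waiver fields are present and non-empty.
        return w is not None and all((w.owner, w.ticket, w.expires))
    return False


def milestone_done(tasks: list[TaskState]) -> bool:
    """A milestone completes only when every task satisfies the shared predicate."""
    return all(counts_as_done(t) for t in tasks)


passed = TaskState("done", "PASS")
waived = TaskState("done", "RISK-ACCEPTED", Waiver("alice", "TICKET-1", "2030-01-01"))
unsigned = TaskState("done", "RISK-ACCEPTED")

print(milestone_done([passed, waived]))    # True
print(milestone_done([passed, unsigned]))  # False
```

If `counts_as_done` accepted only `PASS`, the first call would return `False` even though the waived task had reached `done`, which is the silent block the end-to-end test is there to catch.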
Sweep findings (minimalism-and-coverage audit, v2): the audit walked Matrices 1–3 for claims the engine could enforce but did not yet prove.
- Proved and added (rows above): the Minimal pillar's headline, "the Story is never auto-loaded" (principle 9), was written but unproven; it is now pinned behaviorally (the engine runs the whole lifecycle with no `docs/` present, and a read-spy over every subcommand confirms none reads a chapter at runtime). And waiver expiry, which Matrix 4 already promised the state captured "for a later `check` to expire," is now enforced.
- Recorded, deliberately not enforced: Matrix 3 says a task is done only when "all six documents exist." The engine checks that `TASK.md` exists and that its phase marker matches state; it does not parse the file to confirm each of the seven sections is filled. Teaching it to grade section completeness would push the engine toward reading and judging the Story it is supposed to keep off the runtime path, which would be an anti-minimal move. The reviewer owns section completeness at the Verify gate (the human-led half of the method); the engine owns the cheap structural invariants. This is a chosen boundary, not an oversight.
- Lean check: the audit confirmed `state.json` carries no redundant per-task fields: `title`, `phase`, `gate`, `milestone`, `depends_on`, the two timestamps, and a `waiver` only once one is signed; nothing to trim. A clean bill is a finding too.
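The waiver-expiry rule mentioned in the findings is a small fail-closed check. The sketch below assumes the expiry is stored as an ISO 8601 date string (`YYYY-MM-DD`), which the text does not specify; the function name `is_lapsed` is likewise hypothetical and is not the engine's API.

```python
from datetime import date


def is_lapsed(expires: object, today: date) -> bool:
    """Return True when a waiver must be treated as expired.

    Fail-closed: a missing or unparseable date counts as lapsed,
    so a malformed waiver is never trusted by default.
    """
    if not isinstance(expires, str):
        return True  # missing or wrong type
    try:
        expiry = date.fromisoformat(expires)
    except ValueError:
        return True  # unparseable date
    # "Before today" means the expiry day itself is still valid.
    return expiry < today


today = date(2024, 6, 1)
print(is_lapsed("2024-01-01", today))  # True
print(is_lapsed("2024-06-01", today))  # False
print(is_lapsed("soon", today))        # True
print(is_lapsed(None, today))          # True
```

Passing `today` in as an argument, rather than reading the clock inside the function, keeps the check deterministic and easy to pin with a test.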
Worked example — the hierarchy filled in
- Project: Mobile Banking App. Living documentation: `CONVENTIONS.md`, `GLOSSARY.md` (defines account, balance, transfer), `MODEL_REGISTRY.md`, `dependencies.allowlist`, `playbook/`.
- Milestone: MVP, core money movement. Exit requires the full per-feature document set for each task below, plus a light `SLO.md` and a milestone exit report.
- Task: Transfer between own accounts → `SPEC.md`, `features/transfer.feature`, `contracts/transfer.md` (frozen at v1), `tests/transfer_test.py`, code, and a `PASS` gate record. (The full set is in Appendix D.)
- Task: View balance → its own SPEC, feature, contract, tests, code, record.
- Task: Transaction history → its own set.
When all three tasks read PASS and the milestone documents exist, the MVP milestone exits — and the frozen transfer contract is now a project-level living-documentation artifact the next milestone builds on.
Traceability chain
The hierarchy gives a clean line of evidence from a business goal down to a passing test — which is what makes an AIDD project auditable:
```text
PROJECT goal "let customers move their own money safely"
└─ MILESTONE (MVP) "core money movement"
   └─ TASK "transfer between own accounts"
      └─ SPEC rule "source must have enough balance"
         └─ SCENARIO "insufficient funds -> rejected, no change"
            └─ TEST test_insufficient_funds (was red, now green)
               └─ VERIFY record PASS (atomicity checked)
```
Every level points down to the evidence beneath it and up to the goal above it. To audit any claim — "we handle insufficient funds correctly" — you follow the chain to a specific test and a specific gate record. Nothing rests on assertion.
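Following the chain is a walk over parent links. As a rough illustration, the sketch below models each level as a node that points up to the level above it; the `Link` class and `trace_up` function are hypothetical and exist only to show the audit path from a gate record back to the project goal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Link:
    level: str
    label: str
    parent: Optional["Link"] = None  # the level this evidence supports


def trace_up(link: Link) -> list[str]:
    """Follow parent links from a piece of evidence up to the project goal."""
    chain = []
    node: Optional[Link] = link
    while node is not None:
        chain.append(f"{node.level}: {node.label}")
        node = node.parent
    return chain


goal = Link("PROJECT", "let customers move their own money safely")
milestone = Link("MILESTONE", "core money movement", goal)
task = Link("TASK", "transfer between own accounts", milestone)
rule = Link("SPEC rule", "source must have enough balance", task)
scenario = Link("SCENARIO", "insufficient funds -> rejected, no change", rule)
test = Link("TEST", "test_insufficient_funds", scenario)
record = Link("VERIFY record", "PASS (atomicity checked)", test)

for line in trace_up(record):
    print(line)
```

Starting from the gate record, the output lists all seven levels in order and ends at the project goal, which is the path an auditor would follow in reverse.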
This matrix is the requirements view of the method. The flow (Part II) tells you the order; the stages (10) tell you the depth; this appendix tells you, at each level of the project, exactly which documents must exist and who owns them.
End of book. AIDD is one repeatable loop: Specify → Scenarios → Contract → Tests → Build → Verify → Observe, then repeat. People own direction and verification; the AI owns the build; the artifacts are the asset and the code is disposable.