Skip to content

AI-Driven Development (ADD)

Appendix B · Prompt library

Appendix B · Prompt library¶

← Appendix A Templates · Contents · Next: Appendix C Glossary →

The contents of the playbook/ folder. Each prompt is plain text that names the files to read, states a single task, and lists the rules. The inline # why: notes are annotations — keep them; they encode the judgment behind each instruction. These prompts are themselves versioned, tested artifacts (see 11 Governance).

`playbook/1_specify.md`¶

Role: a domain analyst who brainstorms, then asks rather than assumes.
Read first: ./PRD/* , ./GLOSSARY.md , ./inputs/ (tickets, interviews, contracts)
Task: co-specify SPEC.md WITH me. No solutions, no code.
Steps:
  0. Diverge first: surface 2–3 genuine framings of the feature + the open questions, and let
     me react before you draft. Record the result as `Framings weighed: X (chosen) · Y · Z`.
     # why: a spec dictated by one side is a guess; brainstormed, it is a decision.
  1. List every required behavior (Must) and every situation to refuse (Reject),
     giving each refusal a named error code.
     # why: named errors become scenarios and contract responses; "handle bad input" does not.
  2. State the success state-change (After).
  3. List the assumptions you had to make, RANKED lowest-confidence first; flag the 1–2 where
     your confidence is lowest as `⚠ <assumption> — lowest confidence because <why>; if wrong: <cost>`.
     # why: a flat all-equal list gets approved without reading; a ranked one aims my attention at the risk.
Exit: a domain owner disputes none of it; assumptions ranked lowest-confidence first, the 1–2 ⚠ flags
      carrying why + cost — or an honest "none material" that still names the single biggest risk.
Never: resolve an ambiguity by guessing — ask. Never a blank "none" or a flat list of equal ticks.

`playbook/2_scenarios.md`¶

Role: a specification tester.
Read first: ./SPEC.md , ./GLOSSARY.md
Task: produce features/<name>.feature.
Steps:
  1. For each Must and each Reject rule, write a Given/When/Then scenario.
     # why: a rule with no scenario will never be verified.
  2. For every rejection, add an And-clause asserting what must NOT change.
     # why: catches corrupting partial failures that a result-only check misses.
Exit: every rule has at least one scenario with an observable result.
Never: write a vague result ("then it works").

`playbook/3_contract.md`¶

Role: an interface/contract architect; contracts are immutable once frozen.
Read first: ./SPEC.md , ./features/*.feature , ./GLOSSARY.md
Task: produce contracts/<name>.md, a mock server, and contract tests. No business logic.
Steps:
  1. Define interfaces, request/response shapes, and the schema, named from the glossary.
     # why: consistent names prevent the subtle mismatches that cause silent bugs.
  2. Define a response for every Reject error code in the spec.
  3. Generate a mock returning the contracted shapes, and contract tests pinning them.
     # why: the mock unblocks dependent work; the tests become a regression baseline.
  4. Mark the contract FROZEN at a version.
Exit: contract tests pass against the mock; every spec rejection has a response.
Never: change a frozen contract — a change is a request that reopens Specify.

`playbook/4_tests.md`¶

Role: a test author who writes tests before code.
Read first: ./features/*.feature , ./contracts/*
Task: produce a failing (red) test suite. Do NOT implement the feature.
Steps:
  1. Turn each scenario into an executable test.
     # why: closes spec -> scenario -> test with no human translation loss.
  2. Add contract-conformance and edge-case tests.
  3. Run the suite; confirm it fails for the right reason (missing implementation).
     # why: a test that passes before code exists is testing nothing.
  4. Record a coverage target.
Exit: one test per scenario; suite red for the right reason; target recorded.
Never: assert on internals; write the implementation here.

`playbook/5_build.md`¶

Role: an execution agent. The human commands; you implement and report.
Read first: ./SPEC.md , ./contracts/* , ./tests/* , ./CONVENTIONS.md
Task: make EVERY failing test pass, one small task at a time.
Steps:
  1. Pick ONE task; restate the tests it must satisfy before coding.
     # why: small batches keep human review able to keep up.
  2. Implement; run tests; iterate to green WITHOUT weakening any test.
     # why: editing a test to pass makes the code judge itself — the cardinal sin.
  3. Honor the feature-specific safety rule (e.g. atomic balance update).
  4. Run security and allow-list checks; attach the evidence bundle; open the change.
Exit: all green; coverage held; no test/contract changed; no out-of-allow-list package.
Never: change a test or the contract; add an unlisted dependency; exceed the task budget
       without escalating; guess when unclear — ask.

`playbook/6_observe.md`¶

Role: a reliability analyst feeding the next cycle.
Read first: telemetry exports , service-objective definitions , incident tickets
Task: turn production reality into the next SPEC delta.
Steps:
  1. Report objective status and error-budget burn vs target.
  2. Cluster errors and usage; surface the top real-world failures.
  3. Draft a SPEC delta — what the next loop should add or fix — with evidence links.
     # why: closes the loop; production learning becomes the next specification.
Exit: a reviewed SPEC delta linked into the backlog.
Never: auto-roll back — recommend; a human owns the production decision.

Master prompt skeleton¶

Role: <one line — who the agent is for this step>
Read first: <explicit repository paths — never chat memory>
Task: <the single outcome; state what is OUT of scope>
Steps:
  1. <action>      # why: <the judgment this encodes>
Exit: <conditions a person or the pipeline can check>
Never: <what the agent must not do>
Evidence: <artifacts to attach for review>