What it does
Splits feature work into two strict phases: write tests that fail, commit them, then implement the minimum code to make them pass. The agent is not allowed to write implementation before the failing tests exist on disk and have been verified to fail.
When to use it
- Implementing a story or feature with clear acceptance criteria
- Working in a codebase where tests are the safety net for non-technical reviewers
- Onboarding a new contributor (human or agent) who needs the discipline enforced
- Anywhere you’ve watched an agent retrofit tests around the implementation it just wrote
How it works
Add the rule to your CLAUDE.md (or .cursorrules):
## Strict TDD
Implement features in two phases:
### Phase 1 — Write failing tests (red)
1. Read the story or feature spec carefully
2. Identify all behaviors to test — happy path, edge cases, error cases
3. Write tests in the appropriate test directory using {{test_runner}}
4. Run the test suite — confirm tests FAIL
5. Commit: `test: add failing tests for [feature]`
### Phase 2 — Implement (green)
6. Write the minimum code to make all tests pass — no more
7. Run tests — confirm all pass
8. Run typecheck — confirm zero errors
9. Commit: `feat: implement [feature]`
### Rules
- Never modify tests to make them pass — fix the implementation
- Never skip Phase 1 (tests must exist and fail before any implementation code is written)
- If a test appears genuinely wrong, flag it and ask before changing
- Keep implementation minimal — don't build beyond what tests require
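Outside the rule file, steps 4 and 7 can also be checked mechanically instead of trusting the agent's report. The following is a minimal sketch in Python, assuming your runner follows the common convention of exiting non-zero when a test fails. `suite_fails` and `check_phase` are hypothetical helper names, and the two stand-in commands only simulate a failing and a passing suite.

```python
import subprocess
import sys


def suite_fails(command: list[str]) -> bool:
    """Return True when the test command exits non-zero (the red state)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode != 0


def check_phase(command: list[str], phase: str) -> bool:
    """Return True when the suite is in the state the phase requires.

    "red" expects a failing suite; "green" expects a passing one.
    """
    if phase not in ("red", "green"):
        raise ValueError(f"unknown phase: {phase!r}")
    failed = suite_fails(command)
    return failed if phase == "red" else not failed


# Stand-in commands so the sketch runs anywhere. Swap in your real runner,
# for example ["pytest", "-q"] (an assumption about your stack).
always_fails = [sys.executable, "-c", "raise SystemExit(1)"]
always_passes = [sys.executable, "-c", "pass"]

print(check_phase(always_fails, "red"))    # True: safe to commit the tests
print(check_phase(always_passes, "red"))   # False: the tests did not fail
```

A non-zero exit can also come from a syntax or import error, so still read the output to confirm the tests fail for the reason you expect.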
Configure for your stack
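Replace {{test_runner}} with your runner: bun test, npm test, pnpm test, vitest, pytest, etc. Replace "the appropriate test directory" in step 3 with your project's actual test path. Step 8 assumes a typed stack; name your typecheck command there, or drop the step if your language has none.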
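If you maintain the rule for several projects, the substitution can be scripted. This is a small sketch, not part of the rule itself: `fill_placeholder` is a hypothetical helper, and `RULE_LINE` copies step 3 from the rule above.

```python
RULE_LINE = "3. Write tests in the appropriate test directory using {{test_runner}}"


def fill_placeholder(template: str, test_runner: str) -> str:
    """Substitute the runner and refuse to return a half-filled template."""
    rendered = template.replace("{{test_runner}}", test_runner)
    if "{{" in rendered:
        raise ValueError("unfilled placeholder remains in the rule text")
    return rendered


print(fill_placeholder(RULE_LINE, "pytest"))
# 3. Write tests in the appropriate test directory using pytest
```

The leftover-placeholder check is there so a typo in the template fails loudly instead of shipping a literal `{{...}}` to the agent.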
Why it matters
TDD works because the failing test pins down the spec before the agent has a chance to drift. If the agent writes code first and tests second, the tests conform to whatever the code happens to do, not to what you actually wanted. Strict red/green prevents that drift.
For non-technical product owners, the failing test commit is also a checkpoint: you can read the test name and assertion in plain English and confirm it captures what you asked for, before any implementation effort is spent.
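One way to support that review is to list the test names from the failing-test commit as plain sentences. This is a rough sketch, assuming tests follow a `test_` prefix and snake_case naming; `describe` is a hypothetical helper and the names passed to it are made up for illustration.

```python
def describe(test_name: str) -> str:
    """Turn a snake_case test name into a readable line for reviewers."""
    if test_name.startswith("test_"):
        test_name = test_name[len("test_"):]
    words = test_name.replace("_", " ")
    # Capitalize only the first letter so acronyms elsewhere are preserved.
    return words[:1].upper() + words[1:]


for name in ["test_rejects_empty_title", "test_sends_receipt_after_payment"]:
    print("-", describe(name))
# - Rejects empty title
# - Sends receipt after payment
```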