What it does

Splits feature work into two strict phases: write tests that fail, commit them, then implement the minimum code to make them pass. The agent is not allowed to write implementation before the failing tests exist on disk and have been verified to fail.

When to use it

Implementing a story or feature with clear acceptance criteria
Working in a codebase where tests are the safety net for non-technical reviewers
Onboarding a new contributor (human or agent) who needs the discipline enforced
Anywhere you’ve watched an agent retrofit tests around the implementation it just wrote

How it works

Add the rule to your CLAUDE.md (or .cursorrules):

## Strict TDD

Implement features in two phases:

### Phase 1 — Write failing tests (red)

1. Read the story or feature spec carefully
2. Identify all behaviors to test — happy path, edge cases, error cases
3. Write tests in the appropriate test directory using {{test_runner}}
4. Run the test suite — confirm tests FAIL
5. Commit: `test: add failing tests for [feature]`

### Phase 2 — Implement (green)

6. Write the minimum code to make all tests pass — no more
7. Run tests — confirm all pass
8. Run typecheck — confirm zero errors
9. Commit: `feat: implement [feature]`

### Rules

- Never modify tests to make them pass — fix the implementation
- Never skip Phase 1 (tests must fail before any implementation code)
- If a test appears genuinely wrong, flag it and ask before changing
- Keep implementation minimal — don't build beyond what tests require

Configure for your stack

Replace {{test_runner}} with your runner: bun test, npm test, pnpm test, vitest, pytest, etc. Replace test directory references with your project’s convention.

Why it matters

The reason TDD works is that the failing test pins down the spec before the agent has a chance to drift. If you let the agent write code first and tests second, the tests will conform to whatever the code happens to do — not what you actually wanted. Strict red/green prevents that drift.

For non-technical product owners, the failing test commit is also a checkpoint: you can read the test name and assertion in plain English and confirm it captures what you asked for, before any implementation effort is spent.