What it does

Blocks the agent from declaring a task done until the typecheck and the full test suite have actually run and passed. If anything fails, the agent fixes it without asking. A “should be good” with no proof behind it does not count.

When to use it

- Before every commit, in any project with a real test suite
- When you’re a non-technical reviewer who can’t spot type errors in a diff
- In CI/CD-heavy workflows where failing locally is cheaper than failing in the pipeline
- Anywhere an agent has previously claimed “all done” only for CI to fail

How it works

Add to your CLAUDE.md:

## Verify Before Done

Never mark a task complete without proving it works:

1. Run `{{typecheck_command}}` from the project root
2. Run `{{test_command}}` from the project root
3. For user-facing changes, run E2E tests (`{{e2e_command}}`)
4. Report results clearly:
   - List any type errors with file and line
   - List any test failures with test name and assertion
5. If all pass: confirm "Ready to commit"
6. If anything fails: fix every issue without asking, then re-run steps 1 to 3 before marking the task done

Configure for your stack

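| Stack | `{{typecheck_command}}` | `{{test_command}}` | `{{e2e_command}}` |
| --- | --- | --- | --- |
| Bun + TS | `bun run typecheck` | `bun test` | `bun run test:e2e` |
| Node + TS | `npx tsc --noEmit` | `npm test` | `npx playwright test` |
| pnpm + TS | `pnpm typecheck` | `pnpm test` | `pnpm test:e2e` |
| Python | `mypy .` | `pytest` | (none) |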
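
If you would rather fill in the placeholders with a script than by hand, the substitution can be sketched as below. This is a minimal illustration, not part of the skill itself: the `render` helper and the `PNPM_TS` dictionary are hypothetical names, and the commands are copied from the pnpm + TS row above.

```python
import re

# Matches placeholders of the form {{name}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

# Hypothetical mapping for one stack, copied from the pnpm + TS row.
PNPM_TS = {
    "typecheck_command": "pnpm typecheck",
    "test_command": "pnpm test",
    "e2e_command": "pnpm test:e2e",
}


def render(template: str, commands: dict) -> str:
    """Replace every {{name}} in the template with its configured command.

    Raises ValueError if any placeholder has no command, so an unfilled
    {{...}} never ends up in the rendered text.
    """
    missing = sorted({n for n in PLACEHOLDER.findall(template) if n not in commands})
    if missing:
        raise ValueError("no command configured for: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda m: commands[m.group(1)], template)
```

Calling `render` on the three "Run ..." steps with `PNPM_TS` yields the same lines with the pnpm commands in place. For a stack with no E2E command, such as the Python row, this sketch raises instead of rendering, which is a prompt to remove the E2E step from your copy of the snippet first.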

Example

Without this skill: Agent finishes a feature, says “Done — implementation looks correct.” You merge. CI fails on a type error in an unrelated file the agent forgot it touched.

With this skill: Agent finishes the feature, runs typecheck (catches the cross-file type error), fixes it, runs tests, reports “Typecheck clean, 142 tests passed. Ready to commit.”
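
The "with this skill" flow amounts to a gate: run each check, collect the failures, and only say "Ready to commit" when every exit code is zero. The sketch below shows that logic in Python under stated assumptions. `run_gate` is a hypothetical helper, not something the agent or CLAUDE.md provides, and it treats a non-zero exit code as failure, which is how `mypy`, `pytest`, `tsc`, and most test runners signal errors.

```python
import subprocess


def run_gate(checks):
    """Run each (name, argv) check in order and build a report.

    Returns (all_passed, report_lines). Every check runs even if an
    earlier one fails, so the report lists all failures at once.
    """
    report = []
    all_passed = True
    for name, argv in checks:
        try:
            result = subprocess.run(argv, capture_output=True, text=True)
        except FileNotFoundError:
            # The executable is not installed or not on PATH.
            all_passed = False
            report.append(f"{name}: FAILED (command not found: {argv[0]})")
            continue
        if result.returncode == 0:
            report.append(f"{name}: passed")
        else:
            all_passed = False
            report.append(f"{name}: FAILED (exit {result.returncode})")
            # Keep the last 20 lines of output so the failure is visible.
            tail = (result.stdout + result.stderr).strip().splitlines()[-20:]
            report.extend(f"  {line}" for line in tail)
    report.append(
        "Ready to commit" if all_passed else "Not done: fix the failures and re-run"
    )
    return all_passed, report


# Example call, assuming the Python row of the table:
#   ok, lines = run_gate([("typecheck", ["mypy", "."]), ("tests", ["pytest"])])
#   print("\n".join(lines))
```

Running every check rather than stopping at the first failure matches the reporting step of the snippet, which asks for type errors and test failures to be listed together.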

Why it matters

LLM-generated code looks plausible at a glance. A clean typecheck and a passing test suite are the evidence that it actually does what it claims. Treating them as a hard gate, not a courtesy, turns “looks done” into “is done.” This matters most when the human reviewer doesn’t read code and relies on the agent’s report.