Test-Driven Development (TDD) Skill
Guide users through disciplined test-first development using the red-green-refactor cycle.
Core TDD Workflow
RED → GREEN → REFACTOR → REPEAT
RED: Write a Failing Test
- Identify next small behavior to implement
- Write test that specifies that behavior
- Run test to verify it fails for the right reason
- If test passes unexpectedly, test is wrong
GREEN: Make It Pass
- Write minimal code to make test pass
- Don't worry about perfection yet
- Simplest solution that works
- Run test to verify it passes
REFACTOR: Improve the Code
- Improve code quality while keeping tests green
- Remove duplication
- Improve names and structure
- Run tests after each change to ensure still passing
REPEAT
- Commit when tests are green
- Identify next behavior
- Start cycle again with new test
TDD Discipline
Critical rules to follow:
Test First (RED Phase)
- Always write test before implementation
- Resist urge to write code first
- Test defines what "done" means
- See test fail before making it pass
Minimal Implementation (GREEN Phase)
- Write simplest code to pass
- Don't over-engineer
- Don't add features not tested
- One test at a time
Refactor Only When Green
- Never refactor with failing tests
- Keep tests passing during refactoring
- Small, incremental improvements
- Run tests after each refactoring step
Run Tests Frequently
- After writing test (should fail)
- After writing implementation (should pass)
- After each refactoring step (should stay green)
- Before committing
When to Use This Skill
Activate for requests involving:
- "Use TDD for..." / "Test-driven development..."
- "Write tests first..." / "Red-green-refactor..."
- Developing new features test-first
- Learning TDD practices
- Setting up test infrastructure
- Test design and organization
Test Structure Patterns
Arrange-Act-Assert (AAA)
Arrange - Set up test data and environment Act - Execute the code under test Assert - Verify the results
def test_add_two_numbers():
calculator = Calculator() # Arrange
result = calculator.add(2, 3) # Act
assert result == 5 # Assert
Given-When-Then (BDD Style)
Given - Initial context/preconditions When - Action/event occurs Then - Expected outcome
Test Design Principles
What to Test
- Public interface - Test behavior users depend on
- Edge cases - Boundaries, empty inputs, max values
- Error conditions - Invalid inputs, exceptions
- Business logic - Core algorithms and rules
- Integration points - Where components interact
What NOT to Test
- Private implementation details - Test behavior, not internals
- Third-party libraries - Trust they work, test your usage
- Simple getters/setters - Unless they have logic
- Framework code - Test your code, not the framework
One Behavior Per Test
- Each test should verify single behavior
- Makes failures easier to diagnose
- Keeps tests focused and readable
- Prefer multiple small tests over one large test
Make Tests Readable
- Descriptive names -
test_add_returns_sum_of_two_positive_numbers - Clear structure - AAA or Given-When-Then
- Self-documenting - Test shows how code should be used
- Minimal setup - Only what's needed for this test
Keep Tests Independent
- Tests should not depend on each other
- Tests can run in any order
- Each test starts with clean state
- No shared mutable state between tests
See references/test-design-patterns.md for comprehensive guidance.
Language-Specific Guidance
For Python
See references/python-tdd.md for:
- pytest and unittest frameworks
- Fixtures and parametrized tests
- Mocking with unittest.mock
- Testing async code
- Coverage with pytest-cov
- Running and organizing tests
For Emacs Lisp
See references/elisp-tdd.md for:
- ERT (Emacs Lisp Regression Testing)
- Testing interactive functions
- Buffer manipulation testing
- Mocking with cl-letf
- Buttercup (BDD alternative)
- Running tests in Emacs and batch mode
For Other Languages
See references/general-tdd.md for:
- Finding testing frameworks
- Universal test patterns
- Common testing concepts
- Build tool integration
- Language-agnostic principles
Test Types and When to Use
Unit Tests
What: Test individual functions/methods in isolation
When:
- Testing pure functions
- Testing business logic
- Testing algorithms
- Fast, focused tests
Example: test_calculate_discount(price, percentage)
Integration Tests
What: Test multiple components working together
When:
- Testing database interactions
- Testing API calls
- Testing service integration
- Verifying components connect correctly
Example: test_user_service_saves_to_database()
End-to-End Tests
What: Test complete user workflows
When:
- Testing critical user paths
- Verifying system as a whole
- Smoke tests for deployment
Example: test_user_can_register_and_login()
Test Pyramid:
/\ ← Few E2E tests (slow, brittle)
/ \
/ IT \ ← Some Integration tests
/______\
/ Unit \ ← Many Unit tests (fast, focused)
/__________\
TDD Red-Green-Refactor Example
Goal: Implement factorial function
Iteration 1 - Base case:
- RED:
test_factorial_of_zero_is_one()→ ❌ factorial not defined - GREEN:
def factorial(n): return 1→ ✅ Passes - REFACTOR: Nothing yet. Commit.
Iteration 2 - Positive numbers:
- RED:
test_factorial_of_five()expects 120 → ❌ Got 1 - GREEN: Implement loop to calculate factorial → ✅ Passes
- REFACTOR: Use recursion for elegance → ✅ Still passes. Commit.
Iteration 3 - Error handling:
- RED:
test_factorial_negative_raises_error()→ ❌ No error raised - GREEN: Add if n < 0: raise ValueError → ✅ All tests pass
- REFACTOR: Add docstring → ✅ Still passes. Commit.
Done! Function is complete, fully tested, documented. Three test-driven iterations.
Mocking and Test Doubles
When to Mock
- External dependencies - Databases, APIs, file systems
- Slow operations - Network calls, large computations
- Unpredictable behavior - Random, time-dependent, external state
- Hard to trigger scenarios - Error conditions, edge cases
When NOT to Mock
- Your own code - Prefer real objects for your code
- Simple objects - Data classes, value objects
- Logic being tested - Don't mock what you're testing
Types of Test Doubles
Mock - Programmed with expectations, verifies interactions Stub - Provides canned responses, doesn't verify Fake - Working implementation, simpler than real Spy - Records calls, allows verification after
See references/test-design-patterns.md for detailed mocking strategies.
Common TDD Anti-Patterns
Don't:
❌ Write implementation before test - Defeats TDD purpose
❌ Write multiple tests before making them pass - Stay in rhythm (one test, make it pass, next test)
❌ Refactor with red tests - Only refactor when green
❌ Test implementation details - Test behavior, not internals
❌ Skip refactor step - Technical debt accumulates
❌ Write tests that are hard to understand - Tests are documentation
❌ Create dependencies between tests - Tests must be independent
❌ Mock everything - Use real objects when practical
❌ Fake it with hardcoded values forever - "Fake it till you make it" is temporary
❌ Write slow tests - Slow test suite won't be run frequently
Test Naming Conventions
Good test names are descriptive and specific:
Pattern: test_<function>_<scenario>_<expected_result>
test_add_two_positive_numbers_returns_sum()
test_add_with_negative_number_returns_correct_result()
test_add_with_zero_returns_other_number()
Pattern: should_<expected_behavior>_when_<condition>
should_return_empty_list_when_no_items_match()
should_raise_error_when_input_is_null()
should_calculate_discount_when_user_is_premium()
Pattern: <behavior>_<state>_<expected>
(ert-deftest save-buffer-modified-saves-to-file ())
(ert-deftest load-file-missing-raises-error ())
Test name should:
- Describe what's being tested
- Describe the scenario/condition
- Describe expected outcome
- Be readable as documentation
Test Organization
Directory Structure
Python:
project/
├── src/
│ └── calculator.py
└── tests/
├── __init__.py
├── test_calculator.py
└── conftest.py # pytest fixtures
Elisp:
package/
├── my-package.el
└── test/
└── test-my-package.el
Naming Conventions
- Test files:
test_*.py,*_test.py,test-*.el - Test functions: Start with
test_orert-deftest - Test classes:
Test*(if using classes)
Grouping Tests
- One test file per source file (generally)
- Group related tests in same file
- Separate unit/integration/e2e tests
See language-specific references for detailed organization patterns.
Refactoring with Tests
Safe Refactoring Process
- Ensure all tests are green before starting
- Make small changes - One refactoring at a time
- Run tests after each change - Catch breaks immediately
- Commit frequently - When tests pass
- Don't add features while refactoring - Separate concerns
Common Refactorings
- Extract function (break up large functions)
- Rename for clarity
- Remove duplication (DRY)
- Simplify conditional logic
- Extract variable for readability
- Inline unnecessary abstraction
When Tests Break During Refactoring
If test is testing implementation detail:
- Update test to test behavior instead
- Make test more resilient to changes
If test is testing behavior:
- Fix the code, not the test
- Behavior shouldn't change during refactoring
If too many tests break:
- Change is too large, revert
- Make smaller incremental changes
See references/refactoring-with-tests.md for detailed guidance.
Test Coverage
Coverage measures which code is executed by tests, not whether tests are good.
Types of Coverage
- Line coverage - Which lines executed
- Branch coverage - Which paths taken
- Function coverage - Which functions called
Coverage Goals
- Aim for high coverage (80%+) but not 100%
- 100% coverage doesn't mean bug-free
- Focus on critical code paths
- Don't test just to increase coverage
Using Coverage Tools
- Python: pytest-cov, coverage.py
- JavaScript: Jest with coverage
- Java: JaCoCo
- Ruby: SimpleCov
Use scripts/coverage_analyzer.py to identify coverage gaps.
Using Supporting Resources
Additional resources in this skill:
- references/python-tdd.md: Comprehensive Python testing guide (pytest, unittest, mocking, async)
- references/elisp-tdd.md: Comprehensive Elisp testing guide (ERT, Buttercup, interactive functions)
- references/general-tdd.md: Universal TDD principles for any language
- references/test-design-patterns.md: What to test, test organization, anti-patterns
- references/refactoring-with-tests.md: Safe refactoring process and common refactorings
- scripts/test_template_generator.py: Generate test file boilerplate
- scripts/coverage_analyzer.py: Analyze coverage reports
- assets/templates/: Test file templates for multiple languages
Quick Reference
TDD Cycle:
- RED - Write failing test
- GREEN - Make it pass (minimal code)
- REFACTOR - Improve while keeping green
- REPEAT
Test Structure:
- Arrange (setup)
- Act (execute)
- Assert (verify)
Test Principles:
- Test first
- One test at a time
- One behavior per test
- Independent tests
- Fast tests
- Readable tests
Refactoring Rules:
- Only refactor when green
- Small changes
- Run tests frequently
- Don't add features
Remember: TDD is a discipline. The value comes from following the cycle strictly. Test first. See it fail. Make it pass. Refactor. Repeat. The rhythm creates quality code.
