Adversarial Testing — Think Like the Attacker
Every line of code makes assumptions. Your job is to find them and violate them — systematically, not randomly. The goal is distrust, not coverage. A passing test suite proves nothing if it only tests the happy path.
The Adversarial Mindset
- Every input is a lie. Callers will send garbage, nulls, negative numbers, empty strings, and types that satisfy the compiler but violate intent.
- Implicit contracts are targets. If the code assumes ordering, uniqueness, non-emptiness, or positive values without enforcing it — that is your entry point.
- The system is your adversary. Files disappear, connections drop, clocks jump, memory runs out, permissions change between check and use.
- Passing tests prove nothing. They prove the happy path works. Adversarial tests prove the sad paths do not silently corrupt.
Assumption Hunting (Core Technique)
For every function or module under test, ask these six questions:
- What does it assume about inputs? Violate each assumption: wrong type coercion, boundary values, null/nil/None, empty collections, maximum-size payloads.
- What does it assume about ordering? Reorder arguments, reverse sequences, interleave concurrent calls, call methods out of lifecycle order.
- What does it assume about timing? Delay responses past timeouts, deliver results before the consumer is ready, inject clock skew, expire tokens mid-operation.
- What does it assume about state? Start from half-initialized state, corrupt shared state mid-operation, test post-error recovery state, double-close resources.
- What does it assume about resources? Exhaust file descriptors, fill disk, revoke permissions, return allocation failures, saturate connection pools.
- What does it assume will NOT happen? Make it happen. Concurrent modification during iteration, recursive re-entry, self-referential data, stack overflow via deep nesting.
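The six questions above can be sketched concretely. Here is a minimal Python example of violating input assumptions; total_price is a hypothetical function under test, and the pattern, not the function, is the point:

```python
import math

# Hypothetical function under test: assumes quantity is a positive integer.
def total_price(unit_price: float, quantity: int) -> float:
    if quantity <= 0:
        raise ValueError(f"quantity must be positive, got {quantity}")
    return unit_price * quantity

def expect_rejection(fn, *args):
    """True if fn raises a meaningful ValueError/TypeError (non-empty message)."""
    try:
        fn(*args)
    except (ValueError, TypeError) as e:
        return bool(str(e))
    return False

assert expect_rejection(total_price, 9.99, 0)      # boundary: zero
assert expect_rejection(total_price, 9.99, -1)     # negative
assert expect_rejection(total_price, 9.99, None)   # null-ish input
assert total_price(9.99, 1) == 9.99                # happy path still works

# NaN satisfies neither "<= 0" nor the intent: it slips past the guard
# and propagates. That is a finding, not a passing test.
assert not expect_rejection(total_price, 9.99, math.nan)
assert math.isnan(total_price(9.99, math.nan))
```

One assumption, one violation, one check that the failure is loud and descriptive.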
Attack Vectors (Thinking Prompts)
Data:
- Zero, negative, MAX_INT, NaN, Infinity, negative zero
- Empty string, null bytes in strings, multi-byte Unicode (emoji, RTL, ZWJ sequences)
- Empty collections, single-element, collections at capacity
- Encode a value, corrupt one byte, decode it
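The last prompt above (encode, corrupt one byte, decode) can be sketched with JSON. The takeaway: a format with no integrity check will sometimes decode corrupted bytes "successfully" into different data, which is exactly the silent corruption this section warns about:

```python
import json

original = {"balance": 100}
encoded = json.dumps(original).encode()

outcomes = {"error": 0, "silent_change": 0, "unchanged": 0}
for i in range(len(encoded)):
    corrupted = bytearray(encoded)
    corrupted[i] ^= 0x01  # flip one bit of one byte
    try:
        decoded = json.loads(bytes(corrupted))
    except (json.JSONDecodeError, UnicodeDecodeError):
        outcomes["error"] += 1
    else:
        outcomes["silent_change" if decoded != original else "unchanged"] += 1

# Some corruptions are caught by the parser, but some decode cleanly
# into a *different* value -- the argument for adding a checksum.
assert outcomes["error"] > 0
assert outcomes["silent_change"] > 0
```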
State:
- Double-close, use-after-free/dispose, read-after-error
- Concurrent mutation during iteration or serialization
- Half-written state from interrupted operation (crash mid-transaction)
- State machine receiving events for a different state
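The first two state attacks (double-close, use-after-close) can be demonstrated against a stdlib resource; apply the same probes to your own resource types:

```python
import io

buf = io.StringIO("data")
buf.close()
buf.close()  # double-close must be idempotent, not crash

try:
    buf.read()  # use-after-close must fail loudly
except ValueError as e:
    assert "closed" in str(e)  # meaningful error, not silence
else:
    raise AssertionError("read on a closed stream returned silently")
```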
Environment:
- File not found, permission denied, disk full, read-only filesystem
- Network timeout, connection reset, DNS failure, partial write
- Clock jumps (forward 1 hour, backward 5 minutes, NTP correction)
- OOM at the worst possible moment (during cleanup/rollback)
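Environment failures are easiest to provoke by injecting a failing dependency rather than degrading a real machine. A sketch, using a hypothetical save_report that accepts its open function as a parameter so the test can substitute one that raises ENOSPC:

```python
import errno

def save_report(path, data, opener=open):
    # Hypothetical function under test: assumes the disk never fills up.
    with opener(path, "w") as f:
        f.write(data)

def disk_full_open(*args, **kwargs):
    # Simulate "disk full" without actually filling a disk.
    raise OSError(errno.ENOSPC, "No space left on device")

try:
    save_report("/tmp/report.txt", "x" * 1024, opener=disk_full_open)
except OSError as e:
    assert e.errno == errno.ENOSPC  # error propagates, not swallowed
else:
    raise AssertionError("disk-full error was silently swallowed")
```

The same injection point covers permission denied (EACCES), read-only filesystem (EROFS), and interrupted writes.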
Protocol:
- Out-of-order messages, duplicate delivery, missing acknowledgment
- Partial writes (half a JSON object, truncated protobuf)
- Version mismatch between client and server
- Request after connection close, response after timeout already fired
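The partial-write prompt generalizes to: truncate the payload at every byte boundary and confirm the parser fails loudly at each one. A JSON sketch:

```python
import json

message = json.dumps({"op": "transfer", "amount": 250})

# A partial write can cut the payload at any byte; every truncation of
# this object must be rejected, never parsed into a partial result.
for cut in range(len(message)):
    try:
        json.loads(message[:cut])
    except json.JSONDecodeError:
        continue
    raise AssertionError(f"truncated payload parsed at byte {cut}")
```

Note this only holds because the payload is an object; a bare number like 250 has valid prefixes, which is itself worth knowing about your wire format.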
The No-Cheating Rule
- Test through the public API only. If you need private access to break it, the abstraction is leaking — file that as a finding.
- If a scenario is "impossible," prove it with types or contracts. If you cannot prove it, it is not impossible — test it.
- Every test scenario must be production-plausible. Cosmic rays flipping bits are not plausible; a user pasting 10MB into a text field is.
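The production-plausible rule in practice: the 10MB paste is a one-line fixture. The normalize_username handler and its 64-character limit below are hypothetical:

```python
# Production-plausible oversized input: a user pasting 10 MB into a text field.
huge_input = "A" * (10 * 1024 * 1024)

def normalize_username(name: str) -> str:
    # Hypothetical handler: assumes usernames are short.
    if len(name) > 64:
        raise ValueError(f"username too long: {len(name)} chars (max 64)")
    return name.strip().lower()

try:
    normalize_username(huge_input)
except ValueError as e:
    assert "10485760" in str(e)  # error names the actual size
else:
    raise AssertionError("10 MB username was accepted")
```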
Writing Strategy
- Read the code. Understand what it does, not what the docs say it does.
- List assumptions. Write them down explicitly — one per line, no hedging.
- Write violation tests. One test per assumption. Name it after what it violates: test_rejects_negative_quantity, test_handles_empty_result_set, test_recovers_from_mid_write_crash.
- Verify error quality. When the code fails, does it produce a meaningful error? Silent corruption is worse than a crash.

- Test boundaries from both sides. If the limit is 100, test 99, 100, and 101. If the limit is 0, test -1, 0, and 1.
- Run sanitizers and race detectors. After writing tests, run ASan, MSan, TSan, Go's -race, Miri, or your language's equivalent. Tests that pass without sanitizers may hide undefined behavior.
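The naming and both-sides-of-the-boundary advice combine naturally into a table-driven test. Here validate_cart_size and its limit of 100 are hypothetical stand-ins:

```python
# Hypothetical validator: assumes a cart holds between 1 and 100 items.
MAX_ITEMS = 100

def validate_cart_size(n: int) -> bool:
    if not 0 < n <= MAX_ITEMS:
        raise ValueError(f"cart size must be in 1..{MAX_ITEMS}, got {n}")
    return True

# Each boundary tested from both sides, plus the boundary itself.
for n, should_pass in [(-1, False), (0, False), (1, True),
                       (99, True), (100, True), (101, False)]:
    try:
        assert validate_cart_size(n) and should_pass, f"{n} should be rejected"
    except ValueError:
        assert not should_pass, f"{n} should be accepted"
```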
Validation Gates
| Gate | Condition |
|---|---|
| Assumptions documented | Every implicit assumption in the code under test is written down |
| Violations tested | Each documented assumption has at least one test that violates it |
| Errors are meaningful | Every failure path produces a descriptive error, not silence or generic message |
| Sanitizers pass | All tests pass under sanitizers / race detectors with zero warnings |
Exit Codes
| Code | Meaning |
|---|---|
| 0 | All assumptions identified, violated, and handled — error paths produce meaningful output |
| 1 | Untested assumptions remain — some assumptions lack violation tests |
| 2 | Silent failures found — code swallows errors or produces wrong output without signaling |
| 3 | Crashes or panics discovered — unhandled exceptions, segfaults, or undefined behavior found |
