Development Estimation
Purpose
Create consistent, defensible estimates by:
- Triaging and clarifying scope (especially for bug lists)
- Decomposing work into deliverable components
- Accounting for testing, QA, and release overhead
- Identifying risks and unknowns with explicit mitigation
- Producing confidence-based ranges (P50/P80/P90)
Optimized for reviewing an existing project or backlog and producing a realistic delivery forecast.
When to Use
- Estimating feature or project scope and delivery timelines
- Reviewing a backlog and producing a time estimate
- Producing risk-aware estimates with confidence intervals
- Converting a list of problems into an execution plan
Avoid When
- The task is trivial and a gut-check is sufficient
- Requirements are unknown and discovery is the real work
- You lack access to the codebase or requirements
Inputs Required
Provide as much as possible; estimate quality depends on clarity.
Minimum
- List of items (bugs/features) with short descriptions
- Target platform/environment (e.g., iOS/Android/Web, staging/prod)
- Constraints (deadline, must-fix vs can-slip)
Helpful
- Repro steps, logs, screenshots
- Architecture notes or codebase access
- Deployment pipeline and QA process
- Team capacity model (engineers, focus hours/day, meeting load)
- Definition of done (tests required, QA sign-off, release scope)
Estimation Frames
Always confirm the estimate frame up front:
- Effort: engineer-days or engineer-weeks
- Calendar: derived from capacity (effort / throughput)
- Cost: optional, requires rate assumptions
- Complexity: t-shirt sizing or points (only if requested)
Default Capacity Model
Unless provided, assume:
- 5–6 hours/day of effective build time per engineer
- Review and coordination overhead included per item
- QA/release overhead is stated explicitly as an assumption
Triage Rubric (Bug Backlog)
For each bug, capture:
Repro
- A: Always
- B: Sometimes / flaky
- C: Cannot repro / unclear
Clarity
- A: Clear cause or clear fix path
- B: Partial (needs investigation)
- C: Unknown (needs discovery)
Impact
- Blocker / High / Medium / Low
Fix Type
- UI / Backend / Infra / Data / Dependency / Config / Unknown
Risk Flags
- Concurrency / migrations / auth/security / payments / third-party API
- Performance / cross-platform behavior / state persistence
- Touches core path / unknown blast radius
Rule: If Repro=C or Clarity=C, include a discovery slice.
Workflow
1) Define Scope and Estimate Type
- Confirm what is in/out of scope
- Confirm estimate type: effort, calendar, cost, complexity
- Confirm confidence target(s): P50 / P80 (default), optionally P90
- Confirm constraints: deadlines, must-fix items, release boundary
2) Intake and Normalize the Backlog
- De-duplicate issues
- Merge near-duplicates and note uncertainty
- Group items by subsystem (auth, billing, UI, sync, infra)
- Identify unknown-unknown areas (legacy zones, flaky tests, poor observability)
3) Per-Item Breakdown
Estimate each item using this structure:
- Triage / reproduce
- Investigation / diagnosis (if needed)
- Implementation
- Tests (unit/integration/e2e as appropriate)
- Code review and iteration
- QA verification
- Release / deployment steps (if applicable)
Use hours only for work expected to be under one day; otherwise use engineer-days.
4) Dependencies and Sequencing
- Identify dependencies (internal modules, data migrations, external APIs, app-store releases)
- Mark blockers and parallelizable streams
- Flag sequencing needs (must land before X)
5) Risks, Unknowns, and Buffers
Maintain an Unknowns Register:
- Unknown → discovery task → decision/validation method → timebox
Risk buffer guidance:
- Low risk: +0–10%
- Medium risk: +10–25%
- High risk: +25–50%
- Unknown-heavy: include discovery and a re-estimate gate
6) Roll-Up Totals and Confidence
Produce:
- Per-item ranges (min/most-likely/max or P50/P80)
- Totals by bucket (must/should/nice)
- Overall confidence:
- P50: likely if things go normally
- P80: realistic commitment range
- P90: conservative range for high-stakes delivery
7) Validation and Re-Estimation Plan
Provide:
- Fastest validation steps to reduce uncertainty (logs, repro harness, spike)
- Re-estimation checkpoints (after discovery spikes, after first tranche)
Output Format (Required)
A) Executive Summary
- Scope statement (included/excluded)
- Estimate frame (effort vs calendar + capacity assumptions)
- Total estimates: P50 / P80 (and P90 if requested)
- Key risks and biggest unknowns
B) Backlog Estimate Table
For each item:
- ID / Title
- Bucket (Must/Should/Nice)
- Subsystem
- Triage scores (Repro/Clarity/Impact)
- Estimate (P50/P80)
- Risk flags
- Notes / assumptions
C) Totals by Bucket
- Must-fix total (P50/P80)
- Should-fix total (P50/P80)
- Nice-to-have total (P50/P80)
D) Assumptions and Exclusions
- Explicit assumptions (env access, test maturity, release cadence)
- Explicit exclusions (UX redesign, perf overhaul, refactors unless listed)
E) Risks and Unknowns Register
- Unknown → discovery plan → timebox → impact if true
F) Next Steps
- Top 3 actions to validate quickly
- Proposed sequencing / milestones
- Re-estimation trigger points
Common Mistakes
- Estimating a bug list as if it is already well-specified
- Skipping repro/triage and undercounting diagnosis time
- Ignoring testing, QA verification, and release overhead
- Mixing effort and calendar without a capacity model
- Overstating precision (single-number estimates with no confidence)
- Not separating must-fix from nice-to-have
Quick Reference
- Reference template: skills/development-estimation/references/estimate.md
- Preferred output: P50/P80 effort plus derived calendar via stated capacity
- Default unit: engineer-days (hours only for sub-day tasks)
Quality Bar
A good estimate is:
- Transparent about assumptions
- Broken down into components
- Risk-aware with explicit unknown handling
- Delivered as a range with confidence levels
- Paired with a plan to validate and tighten the range
