Why Estimation Is Broken (And What Actually Works)
Twenty years of software estimation taught me that hour estimates are fiction. Here's what actually works for planning.
Core: Your company probably asks engineers to estimate how long tasks will take. Your engineers probably give estimates that are wrong. This isn’t the engineers’ fault—estimation is fundamentally unreliable for novel work.
Why Hour Estimates Are Fiction
Detail: In 2010, we were asked to estimate a feature: “Add payment retry logic.” Seemed straightforward: retry a failed payment up to two more times before giving up. A senior engineer estimated 8 hours.
The estimate was wrong. Not because they were bad at estimating, but because estimation is fiction for novel work. Here’s what we encountered:
- Payment provider API documentation was inconsistent with actual behavior (lost 3 hours reading docs and testing against real behavior)
- Retry logic needed to interact with billing system (unexpected dependency, 3 hours)
- Tests discovered edge case in existing payment flow (1 hour fixing)
- Deployment revealed race condition in retry logic (1 hour fixing)
- Monitoring showed retry logic was logging too verbosely (created log storage issues, 2 hours tuning)
The 8-hour estimate turned into 18 hours. Nobody was incompetent; the work had unknowns that appeared during execution.
Here’s the thing: any expert looking at the task would probably also estimate 8 hours. The unknowns weren’t obvious before diving in. They became obvious during work.
Application: Hour estimates are fiction for novel work. If the work is routine (you’ve done it 50 times), estimates are reasonable. If the work is novel, estimates are guesses. Stop pretending they’re reliable.
The Estimation Bias: Planning Fallacy
Core: Humans are bad at estimating because we’re optimistic by default. We imagine the happy path.
Detail: When asked to estimate, your brain imagines: “I’ll code feature X, tests pass, deploy succeeds, done.” Your brain doesn’t imagine: “I’ll spend 3 hours on a bug in dependency Y that appears at 2 AM.”
This is planning fallacy. We estimate the time for happy-path execution, then act surprised when things go wrong. Things always go wrong. The difference between “estimate” and “actual” is usually the cumulative time spent on surprises.
Research shows planning fallacy is pervasive across domains. Students estimate study time for exams, then take longer. Companies estimate software project timelines, then miss them. This isn’t stupidity—it’s how human brains work.
The consequence: schedules are based on optimistic fiction. Every organization then adds a “buffer” (multiply estimates by 1.5x or 2x) hoping to account for unknowns. This is crude, but pragmatically reasonable.
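One way to see why a flat multiplier is crude: simulate a small project where most tasks hit their happy-path estimate and a few blow up. The sketch below (Python, with an overrun distribution I invented for illustration, not measured from any real project) sums task durations across many simulated runs.

```python
import random

def simulate_project(happy_path_hours, n_runs=10_000):
    """Sum task durations where each task occasionally blows past its estimate.

    Crude overrun model (invented for illustration): 60% of tasks hit the
    estimate, 30% take 1.5x, 10% take 3x.
    """
    totals = []
    for _ in range(n_runs):
        total = 0.0
        for est in happy_path_hours:
            roll = random.random()
            if roll < 0.60:
                total += est          # happy path
            elif roll < 0.90:
                total += est * 1.5    # minor surprises
            else:
                total += est * 3.0    # the 2 AM dependency bug
        totals.append(total)
    return sorted(totals)

tasks = [8, 4, 6, 8, 2, 12]           # happy-path hour estimates
runs = simulate_project(tasks)
print(f"happy-path total: {sum(tasks)} hours")
print(f"median simulated total: {runs[len(runs) // 2]:.0f} hours")
print(f"90th percentile: {runs[int(len(runs) * 0.9)]:.0f} hours")
```

The specific numbers don’t matter; the point is that the gap between “estimate” and “actual” lives in the tail, and a single multiplier flattens that tail into one number.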
Application: If you must estimate, provide confidence ranges, not single numbers. “8 hours ±4 hours” is more honest than “8 hours.” Recognize that confidence shrinks with novelty.
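Ranges are also easy to carry through a plan. A minimal sketch, assuming you track low/high bounds per task (the task names and numbers here are invented): sum the optimistic and pessimistic bounds separately.

```python
from dataclasses import dataclass

@dataclass
class Estimate:
    """An estimate as a range, not a point. Wider range = more novelty."""
    task: str
    low_hours: float
    high_hours: float

def total(estimates):
    """Sum the optimistic and pessimistic bounds separately."""
    low = sum(e.low_hours for e in estimates)
    high = sum(e.high_hours for e in estimates)
    return low, high

plan = [
    Estimate("payment retry logic", 4, 12),    # novel work: wide range
    Estimate("update admin dashboard", 2, 3),  # routine work: narrow range
    Estimate("add retry metrics", 1, 4),
]
low, high = total(plan)
print(f"plan: {low}-{high} hours")  # "plan: 7-19 hours"
```

Novel tasks get wide ranges, routine tasks get narrow ones, and the plan’s total inherits that honesty.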
What Actually Works: Story Points
Core: Instead of hour estimates, use story points—relative sizing without claiming accuracy.
Detail: Instead of asking “how many hours,” ask “compared to that other task we did, how much work is this?” This removes the illusion of accuracy while providing useful comparison.
If your team called task A a “3” and task B is similar but slightly more complex, B gets a “5.” You’re not claiming accuracy; you’re estimating relative complexity.
The magic: story points work because they’re calibrated by comparison, not by absolute hours. “5 points means 20 hours” is meaningless, but “5 points is bigger than 3 points” is useful for planning.
Teams can track velocity (points per sprint) and plan future sprints. If you consistently deliver 40 points per sprint, you can plan 6 sprints of work knowing it’s 240 points. Individual task accuracy doesn’t matter if your team’s aggregate accuracy is calibrated.
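A sketch of that planning math, assuming a rolling average of the last few sprints as velocity (the sprint numbers below are invented for illustration):

```python
import math

def velocity(points_per_sprint, window=3):
    """Rolling average of points delivered over the last few sprints."""
    recent = points_per_sprint[-window:]
    return sum(recent) / len(recent)

def sprints_remaining(backlog_points, points_per_sprint):
    """Forecast how many sprints the remaining backlog will take."""
    v = velocity(points_per_sprint)
    # Round up: a forecast of 5.9 sprints still occupies six on the calendar.
    return math.ceil(backlog_points / v)

delivered = [32, 41, 38, 44, 39]   # points delivered per past sprint
backlog = 240                      # story points left on the roadmap
print(f"velocity ≈ {velocity(delivered):.0f} points/sprint")
print(f"forecast: about {sprints_remaining(backlog, delivered)} sprints")
```

The three-sprint window is arbitrary; a longer window is steadier, a shorter one reacts faster to team changes.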
Application: Use story points for planning. Track team velocity (points per sprint). Use velocity for roadmapping, not for individual task accuracy.
The Real Bottleneck: Unknowns
Core: Estimation isn’t really the problem. Unknowns are.
Detail: When the payment retry task took 18 hours instead of 8, the extra 10 hours were unknowns: “I didn’t know the API behaved differently from its docs” (3 hours), “I didn’t know retry would interact with billing” (3 hours), “I didn’t know about this edge case and race condition” (2 hours), “I didn’t know the logging would swamp storage” (2 hours).
These unknowns are hard to estimate because they’re unknown. You can’t know what you don’t know.
The real solution: reduce unknowns before estimating. Spend time researching: “How does the payment provider API actually work?” “What systems does payment touch?” “What edge cases exist in current payment logic?”
After 2-3 hours of research, suddenly the unknowns shrink. The estimate becomes more accurate not because estimation improved, but because unknowns became known.
The implication: if management rushes you to estimate before you’ve researched, the estimate will be fiction. Push back on that.
The Broken Incentive: Estimates as Promises
Core: Estimation breaks when estimates become promises.
Detail: Early in my career, I’d estimate “3 days” and the manager would say “Great, 3 days for feature X.” Six days later, I’d explain it took longer. The manager would respond, “But you estimated 3 days.”
The estimate had become a promise. I’d missed my “commitment.” This was backwards. The estimate wasn’t a promise—it was a guess. But the incentive system treated it as a promise.
When estimates become promises, engineers start padding them. “I’ll say 10 days knowing it might take 5, so I’m covered.” Padding inflates plans and hides the team’s actual capacity.
The real question isn’t “do you promise to deliver in 3 days?” It’s “what’s your best guess with what you know now?” and “what are the unknowns that could change that guess?”
Application: Never use estimates as performance metrics. “You estimated 3 days, missed it, therefore you’re bad at estimating” is backwards. Treat estimates as forecasts, not promises. Measure performance on outcome (did we solve the problem?), not adherence to estimates (did we hit the 3-day guess?).
What I Wish I’d Done Differently
Core: I spent years trying to estimate accurately. It was wasted effort.
I should have:
- Stopped trying to estimate novel work with hours
- Invested in research to reduce unknowns before estimating
- Used story points for relative sizing
- Tracked velocity to calibrate future estimates
- Communicated that estimates are forecasts, not promises
The companies that worked best were those that accepted estimation uncertainty and planned around it. “We think this is 3 weeks. If we encounter unknowns, it could be 4-5 weeks. Let’s start and recalibrate as we learn.”
The companies that failed were those that tried to eliminate estimation uncertainty through oversight. “Estimate 3 weeks, hit 3 weeks, good.” That forced engineers either to overestimate (padding) or to finish early and quietly sit on the work until the 3 weeks were up.