Lesson 05

Decomposition & Analogy

Two powerful techniques for taming uncertainty: break it apart, or compare to the past.

PERT: Three-Point Estimation

Single-point estimates are comfortable but dangerous. When someone says "this will take 5 days," they're hiding the uncertainty. The Program Evaluation and Review Technique (PERT) forces you to confront that uncertainty by asking three questions:

Best case — everything goes perfectly (optimistic)
Most likely — the realistic scenario with normal hiccups
Worst case — significant problems occur (pessimistic)

These three data points are combined using a weighted average that gives four times the weight to the most likely case:

PERT Expected Value: E = (Best + 4 × MostLikely + Worst) / 6

This is a beta distribution approximation. It slightly biases the result toward the most likely case while still accounting for the extremes. The standard deviation captures the spread of uncertainty:

Standard Deviation: σ = (Worst − Best) / 6
Roughly 99.7% of outcomes fall within Best to Worst.

Why does the formula use 4 and 6? The math behind PERT.

The PERT formula approximates a beta distribution, which is more flexible than a normal distribution because it can be skewed. If "worst case" is much further from "most likely" than "best case" is, the distribution skews right (a long tail of risk). The weighting of 1-4-1 divided by 6 is a simplification that assumes the beta distribution's shape parameters place the mode (most likely value) at the peak, with the mean pulled slightly toward the longer tail.

The (Worst - Best) / 6 for standard deviation comes from the fact that in a normal distribution, 99.7% of values fall within 3 standard deviations of the mean. Since Best-to-Worst represents the practical full range (approximately 6 sigma), dividing by 6 gives one sigma. This is an approximation, but it's proven remarkably useful in practice.
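
As a minimal sketch, the two formulas can be written as a small Python class. The name ThreePoint and the sample values are illustrative assumptions, not part of PERT itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreePoint:
    """A three-point estimate; all values share one unit (e.g. days)."""
    best: float
    most_likely: float
    worst: float

    def expected(self) -> float:
        # 1-4-1 weighting: the most likely case counts four times.
        return (self.best + 4 * self.most_likely + self.worst) / 6

    def sigma(self) -> float:
        # Treat Best-to-Worst as roughly a six-sigma range.
        return (self.worst - self.best) / 6


# Illustrative task: 3 days best, 5 days most likely, 8 days worst.
task = ThreePoint(best=3, most_likely=5, worst=8)
print(f"E = {task.expected():.2f} days, sigma = {task.sigma():.2f} days")
```

Because the worst case sits further from the most likely value than the best case does, the expected value lands slightly above 5 days, which is the right-skew effect described above.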

Why PERT Works

The real power isn't the formula itself — it's the conversation it forces. When you ask for three points, you're requiring the estimator to think about risk scenarios. "What could go wrong?" and "What would have to go perfectly?" are far more useful questions than "How long will this take?"

Interactive: PERT Calculator

Enter best, most likely, and worst case estimates (in days) for each task. The calculator will compute the expected value and standard deviation for each task, then aggregate them statistically.

Columns: Task, Best, Most Likely, Worst, Expected, σ
Tasks: Database schema design, API endpoints, Frontend components, Authentication flow, Testing & QA
Final row: AGGREGATE TOTAL

Decomposition via Work Breakdown Structure

Humans are reliably bad at estimating large, complex things. But we're surprisingly decent at estimating small, well-understood things. Decomposition exploits this asymmetry: break a big task into pieces small enough to estimate well, then recombine.

"The objective of decomposition is to decompose the estimate into pieces that are small enough to produce low estimation error. Generally, you should try to produce pieces that represent 2 days of effort or less."
— Steve McConnell, Chapter 10

The Law of Large Numbers

Here's the mathematical magic: when you estimate many small items, some estimates will be too high, others too low. If the errors are independent and unbiased, they tend to cancel out in the aggregate. This is the Law of Large Numbers at work.

More precisely: the standard error of the mean decreases in proportion to 1 / √n, where n is the number of items. Ten items give you roughly 3x less relative error than a single item. That's a massive improvement for free: all you had to do was break the work down.
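
The 1 / √n effect can be checked with a small simulation. This is a sketch under stated assumptions: every item has a true size of 1.0, estimation errors are independent, unbiased, and normally distributed, and the 50% per-item spread (item_cv) is a made-up figure chosen only to make the trend visible.

```python
import math
import random
import statistics


def simulated_relative_error(n_items: int, item_cv: float = 0.5,
                             trials: int = 5_000, seed: int = 1) -> float:
    """Relative spread of a total built from n_items independent estimates.

    Each item has true size 1.0 and an unbiased error with standard
    deviation item_cv (a hypothetical 50% by default).
    """
    rng = random.Random(seed)
    totals = [
        sum(rng.gauss(1.0, item_cv) for _ in range(n_items))
        for _ in range(trials)
    ]
    return statistics.pstdev(totals) / statistics.fmean(totals)


for n in (1, 10, 100):
    predicted = 0.5 / math.sqrt(n)
    print(f"n={n:>3}  simulated={simulated_relative_error(n):.3f}  "
          f"predicted={predicted:.3f}")
```

The simulated column should track the predicted 0.5 / √n column closely, with ten items showing roughly a third of the single-item relative error.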

Important caveat: errors must be unbiased

The Law of Large Numbers only helps if your individual errors are unbiased — equally likely to be high or low. If every estimate is systematically optimistic (as is common with developers), then decomposition won't fix the bias. It'll just give you a very precise wrong answer.

This is why McConnell recommends combining decomposition with historical data and calibration. Decomposition reduces random error; calibration reduces systematic error.
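
The caveat above can be illustrated the same way. In this sketch every item truly takes 1.0 unit, and the 20% optimism and 50% noise figures are hypothetical; the point is only that the average error of the total stays near the bias no matter how many pieces there are.

```python
import random
import statistics


def mean_total_error(n_items: int, bias: float, noise: float = 0.5,
                     trials: int = 5_000, seed: int = 2) -> float:
    """Average relative error of the total estimate across many trials.

    Every item truly takes 1.0 unit. Each estimate is drawn around
    (1.0 + bias): bias=-0.2 models estimators who are 20% optimistic.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        estimate = sum(rng.gauss(1.0 + bias, noise) for _ in range(n_items))
        errors.append((estimate - n_items) / n_items)
    return statistics.fmean(errors)


for n in (1, 10, 50):
    print(f"n={n:>2}  unbiased={mean_total_error(n, 0.0):+.3f}  "
          f"optimistic={mean_total_error(n, -0.2):+.3f}")
```

The unbiased column hovers around zero, while the optimistic column stays close to -0.2 at every level of decomposition.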

Interactive: Build a Work Breakdown Structure

Click on each category to expand it into subtasks. Watch how the aggregate estimate changes as you decompose further. The goal: break everything down to pieces of 2 days or less.

Rule of Thumb

Break every task down until no piece exceeds 2 days of effort. If you can't break a task down further but it still feels large, that's a sign you don't understand it well enough yet — and that is valuable information for your estimate.

The Best-Case / Worst-Case Trap

It's tempting to think: "if every task goes as well as possible, the project's best case is the sum of all best cases." This sounds logical. It is wildly wrong.

Think about it: what's the probability that all 10 tasks will simultaneously hit their best case? If each task has a 10% chance of best case, the probability of all hitting best case is:

Probability all tasks hit best case: P = 0.10¹⁰ = 0.0000000001
One in 10 billion. Not a useful planning scenario.

The correct approach is statistical aggregation. When tasks are independent, variances add (not standard deviations), and the resulting distribution is much narrower than the naive sum of extremes.
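
A short sketch of both points, assuming independent tasks. The function names and the ten tasks with a 1-day standard deviation each are illustrative assumptions.

```python
import math


def p_all_best_case(p_single: float, n_tasks: int) -> float:
    """Chance that n_tasks independent tasks all hit their best case."""
    return p_single ** n_tasks


def combined_sigma(sigmas: list[float]) -> float:
    """Standard deviation of a sum of independent tasks: variances add."""
    return math.sqrt(sum(s ** 2 for s in sigmas))


for n in (1, 5, 10):
    print(f"{n:>2} tasks: P(all best case) = {p_all_best_case(0.10, n):.0e}")

# Ten hypothetical tasks, each with a standard deviation of 1 day.
sigmas = [1.0] * 10
print(f"sum of sigmas:  {sum(sigmas):.2f} days")
print(f"combined sigma: {combined_sigma(sigmas):.2f} days")
```

Adding the standard deviations directly would suggest a spread of 10 days; adding variances and taking the square root gives about 3.16 days, which is why the statistical range is so much narrower.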

Interactive: Naive vs. Statistical Aggregation

Here are 10 independent tasks. Toggle between the naive approach (summing all best or worst cases) and the statistical approach (combining expected values and variances).

This is why experienced estimators don't give "sum of best cases" as a project best case. Instead, they use Sum(Expected) − 2 × σ_aggregate for a meaningful best case (roughly the 2nd percentile), and Sum(Expected) + 2 × σ_aggregate for a meaningful worst case (roughly the 98th percentile).

Creating meaningful overall best/worst case estimates

McConnell recommends creating overall project ranges this way:

1. PERT-estimate each task to get its expected value and standard deviation.
2. Sum the expected values to get the aggregate expected value.
3. Compute the aggregate standard deviation: σ_total = √(σ_1² + σ_2² + ... + σ_n²).
4. Best case (5th percentile) = Expected − 1.645 × σ_total.
5. Worst case (95th percentile) = Expected + 1.645 × σ_total.

This gives you a meaningful 90% confidence interval: there's a 90% chance the actual result falls between your best and worst case. Much more useful than the naive approach, which produces an astronomically improbable range.
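
The five steps above can be sketched in a few lines of Python. The task values are hypothetical, and the sketch assumes the tasks are independent and that their total is close enough to normal for the 1.645 multiplier to apply; the function name project_range is likewise an illustration.

```python
import math
from statistics import NormalDist

# Hypothetical (best, most likely, worst) estimates in days.
tasks = [(2, 4, 9), (1, 2, 4), (3, 5, 8), (2, 3, 7), (4, 6, 12)]


def project_range(estimates, confidence=0.90):
    """Return (low, expected, high) for the summed project estimate.

    Assumes tasks are independent and the total is roughly normal.
    """
    # Steps 1 and 2: PERT expected value per task, then sum.
    expected = sum((b + 4 * m + w) / 6 for b, m, w in estimates)
    # Step 3: variances add; take the square root at the end.
    variance = sum(((w - b) / 6) ** 2 for b, m, w in estimates)
    sigma_total = math.sqrt(variance)
    # Steps 4 and 5: z is about 1.645 for a 90% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return expected - z * sigma_total, expected, expected + z * sigma_total


low, expected, high = project_range(tasks)
print(f"5th pct: {low:.1f}  expected: {expected:.1f}  95th pct: {high:.1f}")
print(f"naive range: {sum(t[0] for t in tasks)} to {sum(t[2] for t in tasks)}")
```

For these sample tasks the statistical range is only a few days either side of the expected value, while the naive sum of extremes spans 12 to 40 days.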

Estimation by Analogy

Sometimes the best predictor of the future is the past. Estimation by analogy works by finding a completed project similar to the one you're estimating and adjusting for known differences.

The simplest form: "Project X is similar to Project Y, which took 8 months. Project X is about 30% bigger, so it should take about 10-11 months."

The Triad Estimate

For better accuracy, McConnell recommends the triad approach: find the past project closest in size, the next largest, and the next smallest. Use all three as reference points, weighting the closest match most heavily. This triangulation reduces the risk of basing your estimate on a single potentially unrepresentative data point.
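
One way to turn the triad into a number is sketched below. The text only says to weight the closest match most heavily, so the 25/50/25 weights, the assumption that effort scales linearly with size, and all project figures here are illustrative assumptions rather than a prescribed method.

```python
from typing import NamedTuple


class PastProject(NamedTuple):
    name: str
    size_kloc: float
    effort_sm: float  # staff-months


def triad_estimate(new_size_kloc, smaller, closest, larger,
                   weights=(0.25, 0.50, 0.25)):
    """Weighted effort estimate (staff-months) from three references.

    Each reference contributes its own effort-per-KLOC rate scaled to
    the new size. The weights are an illustrative assumption.
    """
    refs = (smaller, closest, larger)
    scaled = [p.effort_sm / p.size_kloc * new_size_kloc for p in refs]
    return sum(w * s for w, s in zip(weights, scaled)) / sum(weights)


# Hypothetical historical records.
smaller = PastProject("Project A", size_kloc=30, effort_sm=12)
closest = PastProject("Project B", size_kloc=40, effort_sm=18)
larger = PastProject("Project C", size_kloc=60, effort_sm=30)

estimate = triad_estimate(45, smaller, closest, larger)
print(f"Triad effort estimate: {estimate:.2f} staff-months")
```

If the three scaled estimates disagree widely, that disagreement is itself a signal that the references may not be comparable.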

The Schedule-Effort Relationship

There's a well-established empirical relationship between effort and schedule. If you know the effort ratio between a past and new project, you can estimate the schedule impact:

Schedule-Effort Scaling Formula: Schedule_new = Schedule_past × (Effort_new / Effort_past)^(1/3)
The cube root reflects that schedule grows much slower than effort (Brooks's Law territory).

Why the cube root? The science of schedule compression.

The exponent of 1/3 comes from extensive empirical research (Barry Boehm's COCOMO model and others). It reflects two realities:

1. Parallelism. Doubling effort doesn't mean doubling duration, because you can add more people working in parallel.

2. Communication overhead. But adding people also adds communication channels (n(n-1)/2), so the parallelism gains diminish.

The cube root balances these forces. If a past project took 1,000 person-hours over 10 months, and you estimate the new project at 8,000 person-hours (8x effort), the schedule estimate is 10 × 8^(1/3) = 10 × 2.0 = 20 months, not 80 months.
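
The scaling rule is short enough to sketch as a function. The name scaled_schedule and the adjustable exponent parameter are assumptions for illustration; the default of 1/3 is the value used in this lesson.

```python
def scaled_schedule(past_schedule: float, past_effort: float,
                    new_effort: float, exponent: float = 1 / 3) -> float:
    """Scale a past schedule by the effort ratio raised to `exponent`.

    Both effort values must use the same unit (hours or staff-months).
    """
    if past_effort <= 0 or new_effort <= 0:
        raise ValueError("effort values must be positive")
    return past_schedule * (new_effort / past_effort) ** exponent


# The worked example: 1,000 person-hours over 10 months, 8x the effort.
months = scaled_schedule(past_schedule=10, past_effort=1_000, new_effort=8_000)
print(f"Estimated schedule: {months:.1f} months")
```

Passing exponent=1 reproduces naive linear scaling, which makes it easy to compare the two assumptions side by side.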

Interactive: Estimation by Analogy

Your team needs to estimate a new project: a customer portal with user management, dashboards, reporting, and API integration. Estimated size: ~45,000 lines of code, ~18 staff-months effort.

Select the past project you think is the best analogy:

Columns: Project, Type, Size (KLOC), Effort (staff-months), Schedule (months)

When Analogy Works Best

Analogy estimation is most effective when:

  • You have detailed records of past projects (not just memories)
  • The analogy project used similar technology and team structure
  • You adjust for known differences (new technology, different team size, different complexity)
  • You use multiple analogies (the triad approach) rather than just one

Key Takeaways

What We Learned

  • PERT (three-point estimation) forces you to quantify uncertainty with best/most likely/worst cases, producing a weighted expected value and standard deviation
  • Decomposition via Work Breakdown Structure reduces estimation error through the Law of Large Numbers — break tasks to 2 days or less
  • You cannot sum best cases to get a project best case — the probability of all tasks hitting their optimistic estimate simultaneously is vanishingly small
  • Meaningful project ranges use statistical aggregation: sum the expected values, then compute aggregate σ as the square root of the sum of the individual variances
  • Estimation by analogy leverages past project data, with the schedule-effort formula Schedule = Past × (Effort Ratio)^(1/3) capturing how schedule grows slower than effort
  • The triad estimate (smallest, closest, largest analogy) provides more robust results than relying on a single past project

Check Your Understanding

Question 1: You have 10 independent tasks, each estimated at 3 days (best), 5 days (most likely), 8 days (worst). What is the PERT expected value for the entire project?

30 days (sum of best cases)
50 days (sum of most likely)
51.7 days (sum of PERT expected values)
80 days (sum of worst cases)

Question 2: Why does decomposition improve estimate accuracy?

Smaller tasks take less time, so there's less to estimate
The Law of Large Numbers means independent over/under-estimates tend to cancel out
Managers can track progress more easily with smaller tasks
It eliminates all estimation uncertainty

Question 3: A past project took 12 months with 20 staff-months of effort. Your new project is estimated at 60 staff-months (3x effort). Using the cube-root formula, what's the estimated schedule?

36 months (linear scaling)
~17.3 months (cube root scaling)
12 months (same schedule, just add people)
~20.8 months (square root scaling)

Next: Lesson 06 — Count, Compute, Judge

We'll explore McConnell's hierarchy of estimation approaches: why counting beats computing, computing beats judging, and how to use proxy-based estimation and parametric models to ground your estimates in data rather than gut feel.

Think about this: if you could count the number of features, database tables, and integrations — and you knew the historical cost per unit — how much better would your estimates be?