Lesson 05 — Decomposition & Analogy

PERT: Three-Point Estimation

Single-point estimates are comfortable but dangerous. When someone says "this will take 5 days," they're hiding the uncertainty. The Program Evaluation and Review Technique (PERT) forces you to confront that uncertainty by asking three questions:

Best case — everything goes perfectly (optimistic)
Most likely — the realistic scenario with normal hiccups
Worst case — significant problems occur (pessimistic)

These three data points are combined using a weighted average that gives four times the weight to the most likely case:

PERT Expected Value
E = (Best + 4 × MostLikely + Worst) / 6

This is a beta distribution approximation. It slightly biases the result toward the most likely case while still accounting for the extremes. The standard deviation captures the spread of uncertainty:

Standard Deviation
σ = (Worst − Best) / 6
Roughly 99.7% of outcomes fall within Best to Worst
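The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the class name `ThreePoint` and its field names are assumptions made for the example, and the sample numbers are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreePoint:
    """A three-point estimate in days (names are illustrative)."""
    best: float
    most_likely: float
    worst: float

    @property
    def expected(self) -> float:
        # Weighted average: the most likely case counts four times.
        return (self.best + 4 * self.most_likely + self.worst) / 6

    @property
    def sigma(self) -> float:
        # Treats Best-to-Worst as roughly a six-sigma range.
        return (self.worst - self.best) / 6


task = ThreePoint(best=3, most_likely=5, worst=8)
print(f"E = {task.expected:.2f} days, sigma = {task.sigma:.2f} days")
```

Note that the expected value lands slightly above the most likely value here, because the worst case sits further from the most likely case than the best case does.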

Why does the formula use 4 and 6? The math behind PERT.

The PERT formula approximates a beta distribution, which is more flexible than a normal distribution because it can be skewed. If "worst case" is much further from "most likely" than "best case" is, the distribution skews right (a long tail of risk). The weighting of 1-4-1 divided by 6 is a simplification that treats the most likely value as the mode (the peak) of the beta distribution, with the mean pulled slightly toward the longer tail.

The (Worst - Best) / 6 for standard deviation comes from the fact that in a normal distribution, 99.7% of values fall within 3 standard deviations of the mean. Since Best-to-Worst represents the practical full range (approximately 6 sigma), dividing by 6 gives one sigma. This is an approximation, but it's proven remarkably useful in practice.
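As a worked illustration of the skew, take a hypothetical task with Best = 2, MostLikely = 4, and Worst = 12 days (numbers chosen only for the example). The long right tail pulls the expected value above the most likely value:

```latex
E = \frac{2 + 4 \times 4 + 12}{6} = \frac{30}{6} = 5 \text{ days},
\qquad
\sigma = \frac{12 - 2}{6} \approx 1.67 \text{ days}
```

The expected value of 5 days sits one day above the most likely value of 4 days, which is the "mean pulled toward the longer tail" effect described above.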

Why PERT Works

The real power isn't the formula itself — it's the conversation it forces. When you ask for three points, you're requiring the estimator to think about risk scenarios. "What could go wrong?" and "What would have to go perfectly?" are far more useful questions than "How long will this take?"

Interactive: PERT Calculator

Enter best, most likely, and worst case estimates (in days) for each task. The calculator will compute the expected value and standard deviation for each task, then aggregate them statistically.

Task	Best	Most Likely	Worst	Expected	σ
Database schema design				—	—
API endpoints				—	—
Frontend components				—	—
Authentication flow				—	—
Testing & QA				—	—
AGGREGATE TOTAL	—	—	—	—	—

Decomposition via Work Breakdown Structure

Humans are reliably bad at estimating large, complex things. But we're surprisingly decent at estimating small, well-understood things. Decomposition exploits this asymmetry: break a big task into pieces small enough to estimate well, then recombine.

"The objective of decomposition is to decompose the estimate into pieces that are small enough to produce low estimation error. Generally, you should try to produce pieces that represent 2 days of effort or less."
— Steve McConnell, Chapter 10

The Law of Large Numbers

Here's the mathematical magic: when you estimate many small items, some estimates will be too high, others too low. If the errors are independent and unbiased, they tend to cancel out in the aggregate. This is the Law of Large Numbers at work.

More precisely: the standard error of the mean decreases in proportion to 1 / √n, where n is the number of items, and the relative error of the total shrinks the same way. Ten items give you roughly 3x less relative error than a single item (√10 ≈ 3.2). That's a massive improvement for free: all you had to do was break the work down.
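The 1 / √n relationship can be tabulated with a short sketch. It assumes the items are of similar size and that their errors are independent and unbiased; the function name and the 50% per-item error are illustrative assumptions.

```python
import math


def relative_error_of_total(per_item_relative_error: float, n_items: int) -> float:
    """Relative standard error of a total made of n similar-sized items.

    Assumes independent, unbiased errors with the same spread per item.
    """
    return per_item_relative_error / math.sqrt(n_items)


for n in (1, 4, 10, 25, 100):
    print(f"{n:>3} items -> {relative_error_of_total(0.50, n):.1%} relative error")
```

The gain flattens out quickly: going from 1 to 10 items helps far more than going from 10 to 20.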

Important caveat: errors must be unbiased

The Law of Large Numbers only helps if your individual errors are unbiased — equally likely to be high or low. If every estimate is systematically optimistic (as is common with developers), then decomposition won't fix the bias. It'll just give you a very precise wrong answer.

This is why McConnell recommends combining decomposition with historical data and calibration. Decomposition reduces random error; calibration reduces systematic error.
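A small simulation can make the caveat concrete. The sketch below is hypothetical throughout: it assumes every item is underestimated by 20% on average, with a random per-item error of 50%, and the function name and parameters are invented for the example.

```python
import random
import statistics


def simulate_total_error(n_items, bias, noise_sigma, trials=2000, seed=42):
    """Relative error of a summed estimate over many simulated projects.

    Each item's estimate is its true size times (1 + bias + noise); all
    parameters here are illustrative assumptions, not measured data.
    """
    rng = random.Random(seed)
    true_item = 1.0  # equal-sized items; only the relative error matters
    errors = []
    for _ in range(trials):
        estimate = sum(true_item * (1 + bias + rng.gauss(0, noise_sigma))
                       for _ in range(n_items))
        errors.append(estimate / (true_item * n_items) - 1)
    return statistics.mean(errors), statistics.stdev(errors)


for n in (1, 10, 100):
    mean_err, spread = simulate_total_error(n, bias=-0.20, noise_sigma=0.50)
    print(f"{n:>3} items: mean error {mean_err:+.1%}, spread {spread:.1%}")
```

Under these assumptions the spread narrows as the item count grows, while the mean error stays near the assumed bias: decomposition tightens the estimate around the wrong number.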

Interactive: Build a Work Breakdown Structure

Click on each category to expand it into subtasks. Watch how the aggregate estimate changes as you decompose further. The goal: break everything down to pieces of 2 days or less.

Rule of Thumb

Break every task down until no piece exceeds 2 days of effort. If you can't break a task down further but it still feels large, that's a sign you don't understand it well enough yet — and that is valuable information for your estimate.

The Best-Case / Worst-Case Trap

It's tempting to think: "if every task goes as well as possible, the project's best case is the sum of all best cases." This sounds logical. It is wildly wrong.

Think about it: what's the probability that all 10 tasks will simultaneously hit their best case? If each task independently has a 10% chance of hitting its best case, the probability of all of them doing so is:

Probability all tasks hit best case
P = 0.10¹⁰ = 0.0000000001
One in 10 billion. Not a useful planning scenario.

The correct approach is statistical aggregation. When tasks are independent, variances add (not standard deviations), and the resulting distribution is much narrower than the naive sum of extremes.
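To see how much narrower the statistical range is, here is a minimal sketch comparing the two approaches. It assumes ten identical, independent tasks with illustrative three-point estimates, and uses the expected value plus or minus two aggregate standard deviations as the statistical range.

```python
import math

# Ten identical, independent tasks as (best, most likely, worst) in days.
# The numbers are illustrative.
tasks = [(3, 5, 8)] * 10

# Naive approach: add up the extremes.
naive_best = sum(b for b, _, _ in tasks)
naive_worst = sum(w for _, _, w in tasks)

# Statistical approach: expected values add, and so do variances.
expected_total = sum((b + 4 * m + w) / 6 for b, m, w in tasks)
variance_total = sum(((w - b) / 6) ** 2 for b, _, w in tasks)
sigma_total = math.sqrt(variance_total)

print(f"Naive range:       {naive_best} to {naive_worst} days")
print(f"Statistical range: {expected_total - 2 * sigma_total:.1f} to "
      f"{expected_total + 2 * sigma_total:.1f} days (expected +/- 2 sigma)")
```

The naive range spans 50 days, while the statistical range spans only about four aggregate standard deviations, which is a much tighter and more defensible planning window.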

Interactive: Naive vs. Statistical Aggregation

Here are 10 independent tasks. Toggle between the naive approach (summing all best or worst cases) and the statistical approach (combining expected values and variances).


This is why experienced estimators don't give "sum of best cases" as a project best case. Instead, they use Sum(Expected) − 2 × σ_aggregate for a meaningful best case (roughly the 2nd percentile), and Sum(Expected) + 2 × σ_aggregate for a meaningful worst case (roughly the 98th percentile). Together these bound about 95% of outcomes, assuming the total is approximately normally distributed.

Creating meaningful overall best/worst case estimates

McConnell recommends creating overall project ranges this way:

1. PERT-estimate each task to get its expected value and standard deviation.
2. Sum the expected values to get the aggregate expected value.
3. Compute the aggregate standard deviation: σ_total = √(σ₁² + σ₂² + ... + σ_n²).
4. Best case (5th percentile) = Expected − 1.645 × σ_total.
5. Worst case (95th percentile) = Expected + 1.645 × σ_total.

This gives you a meaningful 90% confidence interval: there's a 90% chance the actual result falls between your best and worst case. Much more useful than the naive approach, which produces an astronomically improbable range.
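The five steps above can be sketched as a single function. This is one possible reading of the procedure, not a definitive implementation: the function name `project_range` and the sample task estimates are hypothetical, and the multiplier is derived from the standard library's `statistics.NormalDist` rather than hard-coded, on the assumption that the project total is approximately normal.

```python
import math
from statistics import NormalDist


def project_range(tasks, confidence=0.90):
    """Return (low, expected, high) for independent PERT-estimated tasks.

    `tasks` is a list of (best, most_likely, worst) tuples in days.
    Assumes the total is approximately normally distributed.
    """
    # Steps 1-2: per-task expected values, summed.
    expected = sum((b + 4 * m + w) / 6 for b, m, w in tasks)
    # Step 3: root of the summed per-task variances.
    sigma_total = math.sqrt(sum(((w - b) / 6) ** 2 for b, _, w in tasks))
    # Steps 4-5: two-sided multiplier, about 1.645 for a 90% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return expected - z * sigma_total, expected, expected + z * sigma_total


# Hypothetical estimates for five tasks, in days.
tasks = [(2, 3, 6), (4, 6, 12), (5, 8, 15), (2, 4, 9), (3, 5, 10)]
low, expected, high = project_range(tasks)
print(f"90% range: {low:.1f} to {high:.1f} days (expected {expected:.1f})")
```

Passing a different `confidence` value (for example 0.80) narrows or widens the interval without changing the rest of the calculation.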

Estimation by Analogy

Sometimes the best predictor of the future is the past. Estimation by analogy works by finding a completed project similar to the one you're estimating and adjusting for known differences.

The simplest form: "Project X is similar to Project Y, which took 8 months. Project X is about 30% bigger, so it should take about 10-11 months."

The Triad Estimate

For better accuracy, McConnell recommends the triad approach: find the past project closest in size, the next largest, and the next smallest. Use all three as reference points, weighting the closest match most heavily. This triangulation reduces the risk of basing your estimate on a single potentially unrepresentative data point.
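One way to turn the triad into a number is a weighted average of effort per unit of size across the three reference projects. Everything in the sketch below is an assumption made for illustration: the three past projects and their figures are hypothetical, and the 1-2-1 weighting is just one plausible way to weight the closest match most heavily, not a scheme prescribed by the source.

```python
# Hypothetical past projects: (name, size in KLOC, effort in staff-months).
next_smallest = ("Project A", 30, 11)
closest = ("Project B", 42, 17)
next_largest = ("Project C", 60, 27)

# Assumed weights: the closest match counts double. This weighting is an
# illustrative choice, not one taken from the source.
weighted = [(next_smallest, 1), (closest, 2), (next_largest, 1)]

new_size_kloc = 45  # size of the project being estimated

# Weighted average of effort per KLOC across the three reference points.
total_weight = sum(w for _, w in weighted)
effort_per_kloc = sum(
    w * effort / size for (_, size, effort), w in weighted
) / total_weight

estimate = effort_per_kloc * new_size_kloc
print(f"Triad estimate: {estimate:.1f} staff-months")
```

Because the result is an average of three observed ratios, it always falls between the lowest and highest effort-per-KLOC values in the triad, which is what protects it from a single unrepresentative project.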

The Schedule-Effort Relationship

There's a well-established empirical relationship between effort and schedule. If you know the effort ratio between a past and new project, you can estimate the schedule impact:

Schedule-Effort Scaling Formula
Schedule_new = Schedule_past × (Effort_new / Effort_past)^(1/3)
The cube root reflects that schedule grows much slower than effort (Brooks's Law territory)

Why the cube root? The science of schedule compression.

The exponent of 1/3 comes from extensive empirical research (Barry Boehm's COCOMO model and others). It reflects two realities:

1. Parallelism. Doubling effort doesn't mean doubling duration, because you can add more people working in parallel.

2. Communication overhead. But adding people also adds communication channels (n(n-1)/2), so the parallelism gains diminish.

The cube root balances these forces. If a past project took 1,000 person-hours over 10 months, and you estimate the new project at 8,000 person-hours (8x effort), the schedule estimate is 10 × 8^(1/3) = 10 × 2.0 = 20 months, not 80 months.
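The scaling formula is a one-liner in code. The sketch below reproduces the example above; the function name `scaled_schedule` is an assumption for illustration, and the formula itself is a rule of thumb rather than a precise model.

```python
def scaled_schedule(schedule_past: float, effort_past: float, effort_new: float) -> float:
    """Schedule_new = Schedule_past * (Effort_new / Effort_past) ** (1/3)."""
    return schedule_past * (effort_new / effort_past) ** (1 / 3)


# The example from the text: 10 months at 1,000 person-hours, scaled to 8,000.
print(f"{scaled_schedule(10, 1_000, 8_000):.1f} months")
```

Effort units cancel in the ratio, so person-hours and staff-months work equally well as long as both projects use the same unit.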

Interactive: Estimation by Analogy

Your team needs to estimate a new project: a customer portal with user management, dashboards, reporting, and API integration. Estimated size: ~45,000 lines of code, ~18 staff-months effort.


When Analogy Works Best

Analogy estimation is most effective when:

You have detailed records of past projects (not just memories)
The analogy project used similar technology and team structure
You adjust for known differences (new technology, different team size, different complexity)
You use multiple analogies (the triad approach) rather than just one

Key Takeaways

What We Learned

PERT (three-point estimation) forces you to quantify uncertainty with best/most likely/worst cases, producing a weighted expected value and standard deviation
Decomposition via Work Breakdown Structure reduces estimation error through the Law of Large Numbers — break tasks to 2 days or less
You cannot sum best cases to get a project best case — the probability of all tasks hitting their optimistic estimate simultaneously is vanishingly small
Meaningful project ranges use statistical aggregation: sum the expected values, then compute aggregate σ as the square root of the sum of the individual variances
Estimation by analogy leverages past project data, with the schedule-effort formula Schedule = Past × (Effort Ratio)^(1/3) capturing how schedule grows slower than effort
The triad estimate (smallest, closest, largest analogy) provides more robust results than relying on a single past project

Check Your Understanding

Question 1: You have 10 independent tasks, each estimated at 3 days (best), 5 days (most likely), 8 days (worst). What is the PERT expected value for the entire project?

30 days (sum of best cases)

50 days (sum of most likely)

51.7 days (sum of PERT expected values)

80 days (sum of worst cases)

Question 2: Why does decomposition improve estimate accuracy?

Smaller tasks take less time, so there's less to estimate

The Law of Large Numbers means independent over/under-estimates tend to cancel out

Managers can track progress more easily with smaller tasks

It eliminates all estimation uncertainty

Question 3: A past project took 12 months with 20 staff-months of effort. Your new project is estimated at 60 staff-months (3x effort). Using the cube-root formula, what's the estimated schedule?

36 months (linear scaling)

~17.3 months (cube root scaling)

12 months (same schedule, just add people)

~20.8 months (square root scaling)


Next: Lesson 06 — Proxy-Based & Group Estimation Methods

We'll explore proxy-based estimation — sizing work by story points and t-shirt sizes — and group techniques like Wideband Delphi and Planning Poker, where simultaneous reveals defeat anchoring and sharpen the estimate.

Think about this: if a single expert's estimate can be anchored and biased, how much better might an estimate be when a whole team reveals their numbers simultaneously and reconciles the differences?
