Lesson 03 — Count, Compute, Judge

The Estimation Hierarchy

If you take away only one idea from McConnell's entire book, make it this:

"Count if at all possible. Compute when you can't count. Use judgment alone only as a last resort."

Steve McConnell, Software Estimation, Chapter 7

This deceptively simple hierarchy is the organizing principle of all good estimation. Most teams get it exactly backwards — they start with judgment (gut feel, past experience, rough guesses) when they should be looking for things to count.

Count: Most accurate. Count real artifacts: pages, tables, interfaces.

↓

Compute: Use formulas and calibration data to derive estimates from counts.

↓

Judge: Last resort. Use structured expert judgment when data isn't available.

Why this order? Because human judgment is the least accurate estimation method we have. Study after study shows that expert judgment, even from experienced developers, is systematically biased. Counting removes the human from the equation as much as possible.

Why Judgment Fails

As we saw in Lesson 02, our estimates are distorted by anchoring, optimism bias, and the planning fallacy. Even when experts are well-calibrated on average, individual judgment calls have enormous variance. Counting doesn't have those biases — 47 web pages is 47 web pages.

But don't expert developers have great intuition?

Sometimes. But even good intuition is inconsistent. Daniel Kahneman's research shows that experts often perform worse than simple statistical models because they weigh irrelevant factors and are influenced by how a question is framed. McConnell doesn't say judgment is worthless — he says it should be your last resort, not your first instinct.

What Can You Count?

The key insight is that there are far more countable things in a software project than most people realize. You're not limited to lines of code. McConnell identifies dozens of countable artifacts:

Countable Software Artifacts

Any of these can serve as the basis for an estimate:

Web pages / screens

Database tables

Reports

Dialog boxes

Business rules

Interfaces (APIs, imports/exports)

Function points

User stories

Use cases

Configuration files

Lines of code (for reestimation)

Test cases

Defect reports

Requirements

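The trick is to count things that are close to the actual work and that are available early enough to be useful. Web pages and database tables are often defined early in design. Lines of code become known only as the code is written, which is too late for an initial estimate, though they remain useful for reestimation.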

Interactive Exercise — Find the Countable Artifacts

Below is a mock project description. Identify every countable artifact you can find: items that could serve as a basis for estimation by counting. How many can you spot?

The customer portal will consist of 12 web pages including a dashboard, account settings screen, and 3 report views. The system needs 8 database tables to store customer, order, and product data. We'll integrate with 2 external APIs (payment processor and shipping provider) and build 1 internal REST API with 15 endpoints. There are 23 business rules governing pricing tiers, discounts, and eligibility. The application requires 4 dialog boxes for confirmations and error handling. Users will be able to generate 5 types of reports (sales summary, inventory, customer activity, returns, and forecasting). The team needs to handle 3 data import formats (CSV, XML, JSON) and 2 data export formats. The admin section has 6 configuration screens.
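Once you have made your own pass, a tally like this can also be sketched in code. The following is a minimal illustration, not a general-purpose parser: the phrase list `ARTIFACT_PHRASES` and the helper `tally_artifacts` are hypothetical names introduced here, and the matching only works because the description states each count as a number followed directly by the artifact name.

```python
import re

# The mock project description from the exercise, copied verbatim.
DESCRIPTION = (
    "The customer portal will consist of 12 web pages including a dashboard, "
    "account settings screen, and 3 report views. The system needs 8 database "
    "tables to store customer, order, and product data. We'll integrate with "
    "2 external APIs (payment processor and shipping provider) and build "
    "1 internal REST API with 15 endpoints. There are 23 business rules "
    "governing pricing tiers, discounts, and eligibility. The application "
    "requires 4 dialog boxes for confirmations and error handling. Users will "
    "be able to generate 5 types of reports (sales summary, inventory, "
    "customer activity, returns, and forecasting). The team needs to handle "
    "3 data import formats (CSV, XML, JSON) and 2 data export formats. "
    "The admin section has 6 configuration screens."
)

# Hypothetical vocabulary of artifact names; a real team would keep its own.
ARTIFACT_PHRASES = [
    "web pages", "report views", "database tables", "external APIs",
    "endpoints", "business rules", "dialog boxes", "types of reports",
    "data import formats", "data export formats", "configuration screens",
]


def tally_artifacts(text, phrases):
    """Return {phrase: count} for each 'N <phrase>' pattern found in text."""
    counts = {}
    for phrase in phrases:
        match = re.search(rf"(\d+)\s+{re.escape(phrase)}", text)
        if match:
            counts[phrase] = int(match.group(1))
    return counts


counts = tally_artifacts(DESCRIPTION, ARTIFACT_PHRASES)
for phrase, n in counts.items():
    print(f"{n:>3}  {phrase}")
```

The point of the sketch is the output shape: a named count per artifact type, which is exactly what the calibration step later in this lesson consumes.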

How do you choose what to count?

McConnell offers these guidelines:

Count things available early — web pages and DB tables are typically known after initial design
Count things with low variability — reports tend to be similar in effort; modules can vary wildly
Count things that correlate with effort — the count should predict how much work there is
Use multiple counts — cross-check by counting different artifact types

Converting Counts to Estimates

A count alone doesn't tell you how long the project will take. You need calibration data — historical data or industry averages that tell you how much effort each counted item typically requires.

The formula is straightforward:

Effort = Count × Average effort per item

For example, if you've counted 12 web pages and your historical data shows each page takes an average of 2.5 staff-days, then your computed estimate is 30 staff-days. The quality of the estimate depends entirely on the quality of the calibration data.

Worked Example: Count to Estimate

The counts below come from the mock customer portal described above. The multipliers are based on industry-average data for a mid-complexity web application.

Artifact	Count	Days per item	Staff-days
Web pages / screens	12	2.5	30.0
Database tables	8	3.0	24.0
Reports	5	4.0	20.0
External API integrations	2	5.0	10.0
Business rules	23	0.5	11.5
Dialog boxes	4	1.0	4.0
Total estimated effort			99.5

With a 3-person team: ~33 days (~6.6 weeks)

Remember: These are simplified averages. Real calibration data should come from your own project history. The multipliers vary widely by complexity, team experience, and technology.
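The figures above can be reproduced with a few lines of code. This is a minimal sketch: the counts and multipliers are the ones shown in this lesson, while the function name `estimate_effort`, the 5-day working week, and the assumption that work divides evenly across three people are illustrative choices made here.

```python
# Counts from the mock customer portal description.
counts = {
    "web pages / screens": 12,
    "database tables": 8,
    "reports": 5,
    "external API integrations": 2,
    "business rules": 23,
    "dialog boxes": 4,
}

# Staff-days per item: the simplified multipliers used in this lesson.
days_per_item = {
    "web pages / screens": 2.5,
    "database tables": 3.0,
    "reports": 4.0,
    "external API integrations": 5.0,
    "business rules": 0.5,
    "dialog boxes": 1.0,
}


def estimate_effort(counts, days_per_item):
    """Effort = sum over artifact types of count x average effort per item."""
    return sum(counts[name] * days_per_item[name] for name in counts)


TEAM_SIZE = 3
WORKDAYS_PER_WEEK = 5  # assumption: a standard five-day week

total_staff_days = estimate_effort(counts, days_per_item)
team_days = total_staff_days / TEAM_SIZE  # assumes work splits evenly
team_weeks = team_days / WORKDAYS_PER_WEEK

print(f"Total effort: {total_staff_days:.1f} staff-days")
print(f"With {TEAM_SIZE} people: ~{team_days:.0f} days (~{team_weeks:.1f} weeks)")
```

Keeping counts and multipliers in separate tables makes it easy to swap the industry averages for your own calibration data without touching the counts.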

The Power of Calibration Data

The difference between a good computed estimate and a wild guess is calibration data. If you know from past projects that your team averages 3.2 days per database table (not the industry average of 3.0), your estimate reflects how your team actually works rather than how an average team works.

This is why McConnell emphasizes collecting historical data. Every completed project is an opportunity to improve your calibration for the next one. Teams that track their actuals against estimates can improve their accuracy by 20–40% within a few projects.
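One way to arrive at a figure like the 3.2 days per table mentioned above is to pool actuals from completed projects. The sketch below is illustrative only: the project history is invented for the example, and `calibrated_rate` is a hypothetical helper name.

```python
# Hypothetical history: (database tables delivered, actual staff-days spent).
past_projects = [
    (10, 31.0),
    (5, 17.0),
    (15, 48.0),
]


def calibrated_rate(history):
    """Average staff-days per item, pooled across all past projects."""
    total_items = sum(items for items, _ in history)
    total_days = sum(days for _, days in history)
    if total_items == 0:
        raise ValueError("history contains no counted items")
    return total_days / total_items


rate = calibrated_rate(past_projects)  # 96 staff-days over 30 tables
new_project_tables = 8
estimate = new_project_tables * rate

print(f"Calibrated rate: {rate:.1f} staff-days per table")
print(f"Estimate for {new_project_tables} tables: {estimate:.1f} staff-days")
```

Pooling totals (rather than averaging each project's own rate) weights larger projects more heavily. Whether that is the right choice depends on how similar your past projects are to the one being estimated.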

What about function points?

Function points are a formalized counting method developed by Allan Albrecht at IBM. They count five types of elements: external inputs, external outputs, external inquiries, internal logical files, and external interface files. Each gets weighted by complexity (low/average/high).

Function points are technology-independent, which makes them useful for cross-project comparisons. But they require training to count accurately, and many organizations find simpler counts (web pages, stories) are "good enough" in practice.
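To make the weighting concrete, here is a minimal sketch of an unadjusted function point count. The lesson does not list the weights; the values below are the commonly published IFPUG-style weights and should be checked against your own counting manual. The element counts are invented for illustration.

```python
# Commonly published unadjusted weights (assumption: verify against the
# counting manual your organization uses).
WEIGHTS = {
    "external_inputs":          {"low": 3, "average": 4,  "high": 6},
    "external_outputs":         {"low": 4, "average": 5,  "high": 7},
    "external_inquiries":       {"low": 3, "average": 4,  "high": 6},
    "internal_logical_files":   {"low": 7, "average": 10, "high": 15},
    "external_interface_files": {"low": 5, "average": 7,  "high": 10},
}

# Hypothetical counts per element type and complexity level.
element_counts = {
    "external_inputs":          {"low": 6, "average": 2, "high": 1},
    "external_outputs":         {"low": 2, "average": 3, "high": 0},
    "external_inquiries":       {"low": 4, "average": 1, "high": 0},
    "internal_logical_files":   {"low": 5, "average": 3, "high": 0},
    "external_interface_files": {"low": 1, "average": 1, "high": 0},
}


def unadjusted_function_points(element_counts, weights):
    """Sum count x weight over every element type and complexity level."""
    return sum(
        n * weights[element][level]
        for element, by_level in element_counts.items()
        for level, n in by_level.items()
    )


ufp = unadjusted_function_points(element_counts, WEIGHTS)
print(f"Unadjusted function points: {ufp}")
```

The result is a size measure, not an effort estimate. Turning it into staff-days still requires calibration data, just as with simpler counts.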

Multiple counts for cross-checking

One of McConnell's most valuable recommendations: use more than one counting approach and compare the results. If counting web pages gives you 100 staff-days and counting database tables gives you 95, you can be fairly confident. If one gives 100 and the other gives 250, you've uncovered a discrepancy that needs investigation.

This technique of triangulation is far more powerful than relying on a single number, no matter how carefully that number was derived.
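A cross-check like this is easy to automate. The sketch below compares estimates derived from different counts and flags a large spread. The 20% tolerance and the function name `cross_check` are assumptions made for illustration, not values from McConnell.

```python
def cross_check(estimates, tolerance=0.20):
    """Return (relative spread, converged?) for a set of named estimates.

    Spread is measured as (highest - lowest) / lowest. The default
    tolerance of 20% is an arbitrary illustrative threshold.
    """
    values = list(estimates.values())
    if len(values) < 2:
        raise ValueError("need at least two estimates to cross-check")
    low, high = min(values), max(values)
    spread = (high - low) / low
    return spread, spread <= tolerance


# The two scenarios from the text, in staff-days.
agreeing = {"by web pages": 100, "by database tables": 95}
conflicting = {"by web pages": 100, "by database tables": 250}

for label, estimates in [("agreeing", agreeing), ("conflicting", conflicting)]:
    spread, converged = cross_check(estimates)
    verdict = "converges" if converged else "needs investigation"
    print(f"{label}: spread {spread:.0%}, {verdict}")
```

A failed check does not say which estimate is wrong. It only says that at least one count or multiplier deserves a closer look.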

When Judgment Is Unavoidable

Despite the clear hierarchy, sometimes you genuinely cannot count or compute. Maybe you're estimating a novel research project. Maybe you're in the earliest concept stage where requirements don't yet exist. In those cases, judgment is all you have — but there's a right way and a wrong way to use it.

Structured vs. Unstructured Judgment

McConnell makes a critical distinction:

Unstructured Judgment

"How long will this take?" → "Hmm... about 3 months."

Quick. Intuitive. Usually wrong. Subject to all the biases we covered: anchoring, optimism, the planning fallacy.

Structured Judgment

Decompose the work. Consider each piece. Use checklists. Get multiple opinions. Apply explicit adjustments.

Slower. Deliberate. Significantly more accurate.

Interactive Exercise — Structured vs. Unstructured

Let's estimate the same task two different ways and compare the results.

The Task: Build a user authentication system for a web application. It needs login/logout, password reset via email, OAuth integration with Google and GitHub, session management, rate limiting, and an admin panel to manage users.

Step 1: Gut Feel Estimate

Without overthinking it, how many staff-days would this take a mid-level developer?

What makes structured judgment more accurate?

Structured judgment works better for several reasons, all grounded in cognitive science:

Decomposition reduces bias — estimating small pieces is easier than estimating one big unknown
Checklists catch omissions — the most common estimation error is forgetting a task entirely
Explicit adjustments counter optimism — forcing yourself to consider risk factors adds realistic buffers
Multiple perspectives help — Wideband Delphi and other group techniques average out individual biases
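The four points above can be combined into a small worked sketch for the authentication task. Everything numeric here is hypothetical: the three estimators' figures, the checklist items, and the 1.2 risk factor are invented for illustration. A plain average also stands in for a real group technique such as Wideband Delphi, which involves discussion and iteration rather than simple averaging.

```python
from statistics import mean

# Decomposition: each piece of the task, with staff-day estimates from
# three hypothetical estimators.
component_estimates = {
    "login/logout":             [2, 3, 2.5],
    "password reset via email": [3, 4, 3.5],
    "OAuth: Google":            [3, 5, 4],
    "OAuth: GitHub":            [2, 3, 2.5],
    "session management":       [2, 4, 3],
    "rate limiting":            [1, 2, 1.5],
    "admin panel":              [4, 6, 5],
}

# Checklist: work that gut-feel estimates tend to forget (hypothetical items).
checklist_estimates = {
    "automated tests":              [3, 5, 4],
    "deployment and configuration": [1, 2, 1.5],
}

RISK_ADJUSTMENT = 1.2  # explicit adjustment; an assumed 20% allowance


def structured_estimate(*groups, adjustment=1.0):
    """Average the opinions per item, sum the items, apply the adjustment."""
    base = sum(mean(opinions) for group in groups for opinions in group.values())
    return base, base * adjustment


base, adjusted = structured_estimate(
    component_estimates, checklist_estimates, adjustment=RISK_ADJUSTMENT
)
print(f"Base: {base:.1f} staff-days, adjusted: {adjusted:.1f} staff-days")
```

With these invented figures the base comes to 27.5 staff-days before adjustment. The value of the exercise is less the final number than the visible list of pieces, each of which can be challenged separately.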

Choosing the Right Technique

Different estimation techniques are appropriate at different project stages and for different types of projects. McConnell provides guidance on when each approach shines:

Project Phase	Best Approach	Why
Initial concept	Judge	Nothing concrete to count yet. Use analogy with past projects.
Requirements defined	Count	Requirements list features, screens, rules: all countable.
Design complete	Count, then Compute	Detailed design reveals precise counts. Apply calibration data.
Partway through coding	Count, then Compute	Count completed vs. remaining work. Compute using actual velocity.
Novel technology	Judge, then Compute	No historical data. Prototype first, then compute from results.
Similar past project	Compute	Scale from known actuals. Adjust for size and complexity differences.

The key principle: as the project progresses, more things become countable. Early on, you have no choice but to judge. But the moment you have a requirements document or a design spec, you should switch from judgment to counting.
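The "partway through coding" row in the table lends itself to a short sketch: count what is done, count what remains, and compute from the velocity observed so far. The story counts and elapsed time below are hypothetical, as is the helper name `remaining_days`.

```python
def remaining_days(total_items, completed_items, elapsed_days):
    """Project the remaining duration from the velocity observed so far."""
    if completed_items <= 0 or elapsed_days <= 0:
        raise ValueError("need completed work and elapsed time to compute velocity")
    if completed_items > total_items:
        raise ValueError("completed items cannot exceed total items")
    velocity = completed_items / elapsed_days  # items per working day
    return (total_items - completed_items) / velocity


# Hypothetical project: 40 user stories, 16 finished after 20 working days.
days_left = remaining_days(total_items=40, completed_items=16, elapsed_days=20)
print(f"Projected remaining time: about {days_left:.0f} working days")
```

This projection assumes the remaining items are about as large as the completed ones. If the hard stories were left for last, the count is still useful but the velocity is optimistic.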

The Most Common Mistake

Teams continue to use judgment-based estimation (developer gut feel) long after they have enough information to count and compute. By the time detailed requirements exist, an experienced estimator should be counting artifacts — not asking developers "how long do you think this will take?" If you're still using unstructured judgment at requirements-complete, you're leaving significant accuracy on the table.

Key Takeaways

What We Learned

Count first. Always look for countable artifacts before falling back to computation or judgment.
Countable things are everywhere: web pages, database tables, reports, business rules, interfaces, dialog boxes, user stories. There are far more than most teams realize.
Calibration data converts counts into estimates. Your own historical data is the best calibration; industry averages are a fallback.
When you must use judgment, structure it: decompose, use checklists, get multiple opinions, apply explicit adjustments.
As projects progress, more things become countable. Update your estimation approach accordingly; don't keep guessing when you could be counting.
Use multiple counting approaches and cross-check. Convergence builds confidence; divergence signals hidden risk.

Next: Lesson 04 — Calibration & Historical Data

We'll see why your own historical data beats industry averages, how to build a calibration factor from past projects, and how feeding actuals back into your estimates can improve accuracy by 20–40% within a few projects.

Think about this: if feeding actuals back into your estimates can improve accuracy by 20–40% on its own, what happens when you combine it with decomposition and multiple expert opinions?