
Experts vs. Simple Heuristics

Can simple rules beat the NCAA selection committee?

193 computer ranking systems • 40 seasons

- Committee's accuracy (higher seed wins): 71.4%
- Simple win-loss record accuracy: 64.7%
- Best computer algorithm (KenPom): 70.9%
- Computer systems that scored worse than the committee: 87%

The Contestants

We pitted the expert committee against five alternative approaches, from sophisticated algorithms to a coin flip

Here's what we tested, ranging from "no information at all" to "team of experts with months of analysis":

| Model | What It Uses | Expert Knowledge? | Accuracy |
|---|---|---|---|
| Coin Flip | Nothing — pure random | None | 50.0% |
| Win-Loss Record | Just regular season wins | None — a child could do this | 64.7% |
| Point Differential | Average scoring margin per game | Minimal — basic arithmetic | 65.2% |
| Wins vs Tourney Teams | Wins against teams that made the tournament | Some — requires knowing the field | 69.1% |
| KenPom (best algorithm) | Adjusted efficiency ratings, tempo, SOS | Sophisticated math, no human bias | 70.9% |
| Selection Committee | Everything — stats, film, eye test, debate | Maximum — panel of experts | 71.4% |
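To make the scoring concrete, here's a minimal sketch of how a single-stat model gets graded, in Python. The records, field names, and numbers below are invented for illustration; they are not the article's actual dataset, only the shape of the computation: pick the team that's higher on one stat, then count how often that pick won.

```python
# Minimal sketch of grading a single-stat heuristic. The records below are made up;
# the real dataset covers 2,518 tournament games across 40 seasons.

def pick_winner(game, key):
    """Predict the team with the higher value of `key`; skip exact ties."""
    a, b = game["team_a"], game["team_b"]
    if a[key] == b[key]:
        return None
    return a["name"] if a[key] > b[key] else b["name"]

def accuracy(games, key):
    """Share of decided games where the higher-`key` team actually won."""
    decided = [g for g in games if pick_winner(g, key) is not None]
    correct = sum(pick_winner(g, key) == g["winner"] for g in decided)
    return correct / len(decided)

games = [
    {"team_a": {"name": "A", "wins": 28, "point_diff": 11.3},
     "team_b": {"name": "B", "wins": 22, "point_diff": 6.0},
     "winner": "A"},
    {"team_a": {"name": "C", "wins": 25, "point_diff": 9.1},
     "team_b": {"name": "D", "wins": 27, "point_diff": 4.2},
     "winner": "C"},
]

print(accuracy(games, "wins"), accuracy(games, "point_diff"))  # 0.5 1.0 on this toy data
```

The committee is scored the same way: its "stat" is the seed line, and a correct call means the better seed won.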

Finding 1: The Committee Beats Every Simple Rule — But Not by Much

The experts' edge over "just pick the team with more wins" is 6.7 percentage points
- Coin Flip: 50.0%
- Win-Loss Record: 64.7%
- Point Differential: 65.2%
- Wins vs Tourney Teams: 69.1%
- Best Computer (KenPom): 70.9%
- Selection Committee: 71.4%

The committee is the best predictor — but the margins are thin. A simple win-loss count, something literally anyone can look up in 30 seconds, gets you to 64.7%. The committee's months of film study, debate, data analysis, and deliberation add just 6.7 points on top of that.

Or to put it differently: about 90% of the committee's predictive power can be captured by counting wins. The remaining 10% is where their expertise lives.
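A quick sanity check on that arithmetic, using the accuracies above (the second line is the framing Finding 3 uses):

```python
committee, win_loss, coin_flip = 0.714, 0.647, 0.50

# Share of the committee's raw accuracy reached by counting wins (~90%).
share_of_accuracy = win_loss / committee
# Share of the committee's edge over a coin flip reached by counting wins (~69%).
share_of_edge = (win_loss - coin_flip) / (committee - coin_flip)

print(f"{share_of_accuracy:.1%} of accuracy, {share_of_edge:.1%} of the edge")  # 90.6%, 68.7%
```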

The economic framing: The committee is like an actively managed fund. It does outperform the "index" (simple rules), but the outperformance is modest relative to the cost and complexity. Most of the value comes from the "market" (basic win-loss data), not from expert stock-picking.

Finding 2: The Best Algorithm Essentially Ties the Committee

KenPom (70.9%) comes within half a percentage point of the committee (71.4%) — with zero human judgment

We tested all 193 computer ranking systems in the Massey Ordinals database against the committee's seedings. The results:

| Category | Count | % of the 52 systems with sufficient data |
|---|---|---|
| Systems that beat the committee | 7 | 13% |
| Systems within 1% of the committee | 9 | 17% |
| Systems that trail by more than 1% | 36 | 70% |
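Mechanically, the tally is a comparison of each system's head-to-head accuracy against the committee's. A sketch with placeholder accuracies (these are not the study's actual per-system results, only the bucketing logic):

```python
# Placeholder accuracies; the real study evaluates 52 systems with sufficient data.
committee_acc = 0.714
system_acc = {"POM": 0.709, "MOR": 0.706, "SAG": 0.705, "RPI": 0.690, "COL": 0.690}

beat = sum(acc > committee_acc for acc in system_acc.values())
within_1pt = sum(committee_acc - 0.01 <= acc <= committee_acc for acc in system_acc.values())
trail_by_more = sum(acc < committee_acc - 0.01 for acc in system_acc.values())

print(beat, within_1pt, trail_by_more)  # the three buckets in the table above
```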

Only 7 of the 52 systems with sufficient data beat the committee, so the committee does add value over the vast majority of individual algorithms. But the best algorithms come remarkably close:

| System | What It Is | Accuracy | vs Committee |
|---|---|---|---|
| KenPom (POM) | Adjusted efficiency — the gold standard of analytics | 70.9% | +0.1% |
| Massey (MOR) | Massey Ratings — pure mathematical ranking | 70.6% | -0.2% |
| Sagarin (SAG) | Jeff Sagarin's computer rankings | 70.5% | -0.4% |
| RPI | Rating Percentage Index — simple formula | 69.0% | -1.8% |
| Colley (COL) | Colley Matrix — linear algebra approach | 69.0% | -1.8% |
What KenPom tells us: KenPom uses only two inputs — adjusted offensive efficiency and adjusted defensive efficiency, both corrected for strength of schedule and tempo. That's it. No film. No "eye test." No committee debates. Just two numbers per team. And it matches a room full of experts within half a percentage point.
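As a rough illustration of how small that input really is, here's a toy version of the comparison in Python. The efficiency numbers are invented, and this is not KenPom's actual model, only the basic idea of picking the team with the better adjusted efficiency margin:

```python
# Toy KenPom-style pick: compare adjusted efficiency margins (points per 100 possessions).
# Ratings below are invented for illustration.

def net_rating(team):
    """Adjusted offensive efficiency minus adjusted defensive efficiency."""
    return team["adj_off"] - team["adj_def"]

team_a = {"name": "Team A", "adj_off": 118.2, "adj_def": 94.1}   # net +24.1
team_b = {"name": "Team B", "adj_off": 112.5, "adj_def": 96.8}   # net +15.7

pick = max((team_a, team_b), key=net_rating)
print(pick["name"])  # Team A
```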

Finding 3: Expertise Has Diminishing Returns

Going from zero information to basic stats gets you to about 65% accuracy. Everything after that is marginal.

Think of prediction accuracy as a staircase. Each step represents adding more information or sophistication:

- Step 0: Coin flip → 50%
- Step 1: Count wins (+14.7 pts) → 64.7%
- Step 2: Look at margins (+0.5 pts) → 65.2%
- Step 3: Consider opponents (+3.9 pts) → 69.1%
- Step 4: Sophisticated algo (+1.8 pts) → 70.9%
- Step 5: Expert committee (+0.5 pts) → 71.4%

The biggest jump is from knowing nothing to counting wins: +14.7 points. After that, each additional layer of sophistication adds less. The jump from the best algorithm to the expert committee is just 0.5 points — the smallest increment on the entire staircase.
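The per-step gains are just differences of the cumulative accuracies in the staircase; a short check:

```python
steps = [("Coin flip", 50.0), ("Count wins", 64.7), ("Look at margins", 65.2),
         ("Consider opponents", 69.1), ("Sophisticated algo", 70.9), ("Expert committee", 71.4)]

for (_, prev), (name, acc) in zip(steps, steps[1:]):
    print(f"{name}: +{acc - prev:.1f} pts (cumulative {acc:.1f}%)")
# Count wins: +14.7 ... Expert committee: +0.5
```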

The uncomfortable implication: If you're filling out a bracket and all you know is each team's win-loss record, you're already capturing about 69% of the committee's edge over random guessing (14.7 of its 21.4-point advantage over a coin flip). Earning the remaining 31% of that edge requires vastly more knowledge, data, and analysis.

Finding 4: The Committee's Edge Is Biggest in the First Round and Elite 8

Experts add the most value in rounds with the widest talent gaps
| Round | Committee | Win-Loss | Point Diff | Committee Edge over Pt Diff |
|---|---|---|---|---|
| Round of 64 | 72.9% | 64.7% | 64.9% | +8.0% |
| Round of 32 | 69.3% | 65.5% | 65.2% | +4.1% |
| Sweet 16 | 68.9% | 61.5% | 68.0% | +0.9% |
| Elite 8 | 71.1% | 61.7% | 59.2% | +11.9% |
| Final Four | 71.4% | 66.7% | 90.0% | -18.6% |

In the first round, where 1-seeds face 16-seeds and the talent disparity is obvious, the committee's seedings are 8 points better than point differential alone. The committee is good at identifying mismatches.

But in the Final Four — where we'd expect expert judgment to matter most — point differential actually flips and beats the committee. Small sample size caveat applies (only 10 games in the dataset with clear seed differences), but the direction is interesting: when the remaining teams are all elite, the experts' subjective rankings may be less reliable than raw performance data.
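For reference, the round-by-round table is a grouped average of per-game hits. Here's a sketch with a toy pandas DataFrame; the column names are ours, not the dataset's, and the rows are made up:

```python
import pandas as pd

# One row per game; each *_correct column marks whether that predictor picked the winner.
games = pd.DataFrame({
    "round":             ["Round of 64", "Round of 64", "Elite 8", "Elite 8"],
    "committee_correct": [True, True, True, False],
    "winloss_correct":   [True, False, False, True],
    "pointdiff_correct": [True, False, False, False],
})

by_round = games.groupby("round")[
    ["committee_correct", "winloss_correct", "pointdiff_correct"]
].mean()
by_round["committee_edge_over_pointdiff"] = (
    by_round["committee_correct"] - by_round["pointdiff_correct"]
)
print((by_round * 100).round(1))
```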

Finding 5: When Simple Disagrees with Expert, the Expert Usually Wins

But the simple rule is right a third of the time — far more than zero

There are 495 tournament games where the win-loss record would have picked a different team than the committee's seeding. In those contested cases:

- Committee was right: 65.7%
- Win-loss record was right: 34.3%

The committee wins the tiebreaker about two-thirds of the time. This is where their expertise genuinely earns its keep — they're seeing something (strength of schedule, injuries, conference quality, late-season trends) that a raw win count misses. But a third of the time, the simple metric had it right and the experts were wrong. That's a meaningful error rate for "the best judgment available."
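The contested-games split is computed the same mechanical way: filter to games where the two picks disagree, then count who matched the actual winner. A sketch with made-up records (the real split over 495 games is the 65.7% / 34.3% above):

```python
# Made-up game records: each carries the win-loss pick, the seed-based pick, and the winner.
games = [
    {"winloss_pick": "A", "seed_pick": "B", "winner": "B"},
    {"winloss_pick": "C", "seed_pick": "D", "winner": "C"},
    {"winloss_pick": "E", "seed_pick": "E", "winner": "E"},  # picks agree: excluded below
]

disputed = [g for g in games if g["winloss_pick"] != g["seed_pick"]]
committee_right = sum(g["seed_pick"] == g["winner"] for g in disputed) / len(disputed)
print(f"Committee right in {committee_right:.0%} of contested games")
# Prints 50% on this toy data; 65.7% in the full dataset.
```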

Finding 6: Among 193 Computer Systems, Only 7 Beat the Committee

But the algorithms are pure math with no biases, no debates, and no cost

The committee outperforms 87% of the computer ranking systems. That's a legitimate feather in their cap. But consider the economics:

The committee: 10 members, months of work, hundreds of hours of film, extensive travel, heated debates, occasional controversies, and exposure to every cognitive bias known to psychology (recency bias, conference brand bias, anchoring to preseason expectations).
KenPom: Two numbers per team. Runs in seconds. No bias, no fatigue, no politics. Gets within 0.5% of the committee's accuracy. Year after year.

The 7 systems that beat the committee are interesting precisely because they suggest the committee's errors aren't random — they're systematic. A purely mathematical model that ignores brand names, conference prestige, and "eye test" narratives can match or exceed the experts. The committee's biases (which we documented in our seeding analysis — conference favoritism, the 11-seed anomaly, mid-seed noise) are exactly the kind of errors an algorithm wouldn't make.

What Does This Mean?

The NCAA Selection Committee is a real-world laboratory for studying expert judgment. After 40 years and 2,518 games, the data tells a clear story:

1. Expertise is real but its marginal value is small. The committee beats every simple rule and most computer systems. But the gap between "count the wins" (64.7%) and "expert committee" (71.4%) is just 6.7 points. Most of the predictive signal is in the basic data, not the expert analysis.

2. Algorithms match experts at a fraction of the cost. KenPom hits 70.9% with two numbers and no human input. The committee's remaining 0.5% edge comes with enormous cost, complexity, and the introduction of human biases.

3. The "casual fan" is closer to the expert than you'd think. If your friend picks brackets by "going with the team that won more games," they'll get about 65% right. The expert gets 71%. The gap is real but narrow — which is why office bracket pools are competitive and why your coworker who "doesn't even watch basketball" occasionally wins.

4. This mirrors findings across expert domains. From financial analysts to political forecasters to medical diagnosticians, the pattern repeats: experts outperform simple rules by small margins, algorithms match experts closely, and more information doesn't proportionally improve accuracy. The NCAA tournament is just a particularly clean dataset to prove it.