Leadership · Advanced · 7 min read

Talent Reviews

A talent review is a structured forum where managers calibrate ratings of their team members across two dimensions, performance (current results) and potential (future capacity), typically using a 9-box grid. The output: a shared, defensible view of who's a top performer, who's a high-potential leader, who's a solid cornerstone, and who's a flight risk or low-fit. Done well, talent reviews surface succession candidates, reduce manager bias by forcing peer comparison, and inform promotions, pay actions, and stretch assignments. Done poorly, they become a ratings horse-trade where the loudest manager protects their team and political capital determines who's labeled 'high potential.' GE's Session C under Jack Welch was the canonical example: every manager defended their roster annually in front of peers and the CEO.

Also known as: Talent Calibration, 9-Box Review, Performance & Potential Review, Talent Mapping, Session C

The Trap

The trap is the 9-box becoming a static label. Once an employee is in the 'solid performer / low potential' box, managers stop investing in them, the employee senses it, performance degrades, and the label becomes self-fulfilling. The other trap: confusing 'potential' with 'looks like the current leadership.' Most 'high potential' lists end up being the people who present well in front of executives โ€” typically extroverted, polished, and demographically similar to the senior leaders. Without explicit guards, talent reviews calcify the existing power structure rather than surfacing genuine future leaders.

What to Do

Run a structured calibration: (1) Each manager pre-rates their team on a defined rubric. (2) Cross-functional peer manager group meets, and every rating must be defended with specific evidence. (3) Force a loose distribution (don't let everyone be 'top quartile'). (4) Identify 3 buckets: top 20% (invest aggressively), critical middle 70% (specific development plans), bottom 10% (action within 90 days). (5) Each high-potential gets a specific 6-12 month stretch assignment, not just a label. (6) Review every 6 months, not annually; talent shifts faster than that. (7) Audit the demographics of the 'high potential' list: if it doesn't roughly match your overall workforce, you have a bias problem to fix.
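
A minimal sketch of steps (3)-(4) in Python, assuming each employee carries a post-calibration rubric score; the Employee shape, field names, and cut points are illustrative assumptions, not a prescribed system:

    # Bucket a calibrated roster into top 20% / middle 70% / bottom 10%.
    # Assumes 'score' is the post-calibration rubric score (higher = better)
    # and the roster is large enough for a forced distribution to make sense.
    from dataclasses import dataclass

    @dataclass
    class Employee:
        name: str
        score: float

    def bucket_roster(roster: list[Employee]) -> dict[str, list[Employee]]:
        ranked = sorted(roster, key=lambda e: e.score, reverse=True)
        n = len(ranked)
        top = max(1, round(n * 0.20))     # invest aggressively
        bottom = max(1, round(n * 0.10))  # action within 90 days
        return {
            "top_20": ranked[:top],
            "middle_70": ranked[top:n - bottom],  # specific development plans
            "bottom_10": ranked[n - bottom:],
        }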

Formula

Talent Review Health = Calibration Discipline × Evidence-Based Defense × Demographic Audit × Action Follow-Through
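
The formula reads multiplicatively: a zero on any factor zeroes the whole thing, no matter how strong the others are. A hedged sketch, assuming each factor is scored on a 0-1 scale (the scale and function name are assumptions, not part of the source formula):

    # Score each factor 0-1; multiplication means a zero anywhere zeroes the result.
    def talent_review_health(calibration: float, evidence: float,
                             audit: float, follow_through: float) -> float:
        factors = (calibration, evidence, audit, follow_through)
        if any(not 0.0 <= f <= 1.0 for f in factors):
            raise ValueError("score each factor on a 0-1 scale")
        health = 1.0
        for f in factors:
            health *= f
        return health

    # Strong calibration and evidence but no demographic audit:
    # talent_review_health(0.9, 0.8, 0.0, 0.7) -> 0.0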

In Practice

Jack Welch's GE Session C was the gold-standard talent review for two decades. Every business unit leader presented their full talent roster annually to Welch and HR head Bill Conaty, defending every rating with specifics. Welch enforced the famous 20/70/10 rule: the top 20% of performers got the bulk of equity and stretch assignments, the middle 70% got steady development, and the bottom 10% were performance-managed out within a year. The discipline produced an extraordinary leadership pipeline: at one point, current and former GE executives ran more than 30 Fortune 500 companies, including Home Depot (Bob Nardelli) and 3M and later Boeing (Jim McNerney). The model also drew sustained criticism for fostering internal political competition over collaboration.

Pro Tips

  • 01. The 'potential' axis should be operationalized as 'could take on a role 1-2 levels up within 24 months.' Without that grounding, 'high potential' devolves into 'manager likes them.' Microsoft uses 'demonstrated growth mindset and learning velocity' as a more behavior-anchored signal.

  • 02. Track the predictive validity of your reviews. If you label someone 'high potential' in 2024, do they get promoted by 2026? If your hit rate is below 40%, your calibration is broken: you're labeling people based on charisma, not capacity. (A minimal hit-rate sketch follows this list.)

  • 03. Run a 'shadow review' with the people NOT in the room: ask high-performing ICs and L1 managers who they think the next-level leaders are. Compare to your formal output. Massive divergence usually means the formal review is missing real talent that doesn't network upward well.
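
The hit-rate check from tip 02, as a minimal sketch; the record shape, function name, and the 30-day month approximation are assumptions:

    from datetime import date

    # Each record: (date labeled 'high potential', promotion date or None).
    def hipo_hit_rate(labels: list[tuple[date, date | None]],
                      window_months: int = 24) -> float:
        if not labels:
            return 0.0
        window_days = window_months * 30  # rough month length
        hits = sum(1 for labeled, promoted in labels
                   if promoted is not None
                   and (promoted - labeled).days <= window_days)
        return hits / len(labels)

    # Labeled Jan 2024, promoted Jun 2025 counts; never promoted doesn't:
    # hipo_hit_rate([(date(2024, 1, 15), date(2025, 6, 1)),
    #                (date(2024, 1, 15), None)])  # -> 0.5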

Myth vs Reality

Myth

"Forced rankings are inherently unfair"

Reality

Forced rankings are unfair when applied to small teams (n<20), where a statistical distribution doesn't hold. Across 100+ employees, forced ranking surfaces real differences that consensus-driven reviews bury under inflation. The problem isn't the math; it's applying it at the wrong granularity.

Myth

"High potentials should be told they're high potentials"

Reality

Mostly they shouldn't, in those words. Telling someone they're a 'high potential' creates entitlement and sets them up to feel betrayed if the next promotion isn't immediate. Better: tell them specifically what they're being trusted to take on next ('I want you leading the platform migration in Q3'). Action-based recognition outperforms label-based recognition.

Try it

Run the numbers.

Pressure-test the concept against your own knowledge: answer the challenge or try the live scenario.

🧪

Knowledge Check

After a talent review, you realize 9 of your 10 'high potential' employees are direct reports of one VP; the other 4 VPs nominated just one between them. What's the most likely diagnosis?

Industry benchmarks

Is your number good?

Calibrate against real-world tiers. Use these ranges as targets, not absolutes.

Talent Review Maturity

Promotion rate of 'high potential' employees within 24 months of the label:

  • Predictive (40%+ hit rate): top 10% of orgs
  • Useful (25-40% hit rate): healthy
  • Decorative (10-25% hit rate): most orgs
  • Theater (<10% hit rate): common

Source: hypothetical, based on practitioner-published norms
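
If you compute the hit rate with the sketch under Pro Tips, mapping it onto these tiers is a small lookup; the thresholds come from the table above, while the function name is an assumption:

    # Map a 24-month promotion hit rate onto the maturity tiers above.
    def maturity_tier(hit_rate: float) -> str:
        if hit_rate >= 0.40:
            return "Predictive (top 10% of orgs)"
        if hit_rate >= 0.25:
            return "Useful (healthy)"
        if hit_rate >= 0.10:
            return "Decorative (most orgs)"
        return "Theater (common)"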

Real-world cases

Companies that lived this.

Verified narratives with the numbers that prove (or break) the concept.

๐Ÿญ

GE (under Jack Welch)

1981-2001 · Outcome: mixed

Jack Welch institutionalized Session C, an annual talent review where every business unit leader defended their full talent roster to Welch personally. Each employee was rated on performance and potential, with the famous 20/70/10 distribution: top 20% received outsized investment, middle 70% got steady development, bottom 10% were managed out. Welch spent roughly 50% of his time on people decisions. The result was the most extraordinary executive pipeline in modern business: dozens of GE alumni went on to run Fortune 500 companies. The model also created intense internal political competition and was eventually softened post-Welch.

Years Running: 20+ at GE under Welch
Forced Distribution: 20/70/10
Welch's Time on People: ~50%
F500 CEOs from GE Pipeline: 30+

Talent reviews work when leadership genuinely invests time and is willing to make hard calls. The downside: forced ranking can corrode collaboration if not balanced with team-based incentives.

🪟

Microsoft (under Satya Nadella)

2014-present · Outcome: success

Satya Nadella inherited from Steve Ballmer a notoriously toxic stack-ranking system where teams were forced to rate a percentage of employees as below-average regardless of actual performance, fostering internal competition that crippled collaboration. Nadella scrapped stack ranking and replaced it with a 'growth mindset' review framework based on individual impact, contribution to others' success, and personal growth. The shift coincided with Microsoft's market cap rising from ~$300B in 2014 to over $3T by 2024. The new model demonstrated that talent reviews can be high-rigor without being adversarial, though the trade-off is harder calibration discipline.

Old System: forced stack ranking
New System: growth mindset, contribution to others
Market Cap (2014): ~$300B
Market Cap (2024): $3T+

Talent reviews need rigor but not adversarial mechanics. Microsoft's pivot showed that you can keep calibration discipline while scrapping the parts that erode collaboration.


Decision scenario

Calibrating a Lopsided Roster

You're running annual calibration across 5 VPs. VP-A nominates 8 of his 12 reports as 'high potential.' VPs B, C, D, and E each nominate 0-2. Total: 12 high-pots, 8 from one org. The room is uncomfortable. VP-A is your most political peer and also has the loudest voice.

Total Reports Across All VPs: 60
Expected High-Pots (~15%): 9
Actual Nominated: 12
From VP-A's Org Alone: 8
VP-A's Org Size: 12 (20% of total)

Decision 1

VP-A's org is producing 8 of 12 high-pots: more than 3x his fair share by headcount (his org is 20% of total headcount but 67% of nominations; the arithmetic is sketched after this scenario). Either his org genuinely has several times the talent density of his peers', or he's anchoring everyone to his rubric, or his rubric is more lenient than theirs.

Option A: Accept the nominations. VP-A defended each one with examples and the others were quiet.
Six months later, only 1 of VP-A's 8 high-pots gets promoted (versus 2 of 4 from the other VPs combined, a 12.5% vs 50% hit rate). Two of the other VPs' un-labeled high performers leave for promotions elsewhere. Your CEO asks why the talent pipeline looks so thin outside VP-A's org. The calibration didn't calibrate; it ratified VP-A's ratings inflation.
Hit Rate (VP-A's High-Pots): 12.5% · Hit Rate (Other VPs): 50% · Lost Talent (other VPs' un-labeled): 2 departures
Option B: Pause the meeting. Force each VP to bring 3 specific examples per nomination, then re-rate using a shared rubric. Specifically push the quiet VPs: 'Who's the strongest person on your team you DIDN'T nominate, and why not?'
The re-do surfaces 4 strong candidates from the quiet VPs' orgs (they had been over-modest or assumed nominations were 'pre-allocated'). VP-A's count drops from 8 to 4 once held to the same evidentiary standard. Final list: 12 → 11 high-pots, distributed proportionally. Six-month hit rate is 45% (5 of 11 promoted), close to the industry benchmark. The quiet VPs become more vocal in subsequent rounds. VP-A is initially annoyed but adjusts, and stops being able to dominate the room.
Distribution: 8/12 from one org → 4/11 · Hit Rate: 45% · Quiet VPs' Engagement: higher in future rounds
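
The proportionality check behind Decision 1, sketched in Python with the scenario's numbers; the ~2x escalation threshold is an illustrative assumption:

    # Skew = a VP's share of nominations relative to their share of headcount.
    def nomination_skew(org_size: int, total_headcount: int,
                        nominated: int, total_nominated: int) -> float:
        headcount_share = org_size / total_headcount    # VP-A: 12/60 = 0.20
        nomination_share = nominated / total_nominated  # VP-A: 8/12 ~ 0.67
        return nomination_share / headcount_share

    # nomination_skew(12, 60, 8, 12) -> ~3.33x fair share; anything beyond
    # ~2x is a cue to pause and re-rate against a shared rubric.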


Beyond the concept

Turn Talent Reviews into a live operating decision.

Use this concept as the framing layer, then move into a diagnostic if it maps directly to a current bottleneck.

Typical response time: 24h · No retainer required
