Statistical Process Control
Statistical Process Control (SPC), invented by Walter Shewhart at Bell Labs in 1924 and operationalized by W. Edwards Deming, distinguishes COMMON-CAUSE variation (random noise inherent to a stable process) from SPECIAL-CAUSE variation (signals that something has changed). The tool is the control chart: plot a metric over time with statistical control limits at ±3 standard deviations from the mean. Points inside the limits = stable process, leave it alone. Points outside the limits OR forming non-random patterns (runs, trends, shifts) = something has changed, investigate. The brutal insight Deming hammered into managers: reacting to common-cause variation as if it were a signal (called 'tampering') makes the process WORSE. KnowMBA take: most engineering metric reviews are tampering, overreacting to a noisy week of customer churn, deploys, or NPS as if every wiggle is a signal. SPC tells you when to act and, more importantly, when to leave it alone.
The Trap
The dominant trap is tampering: a metric is at 3.2% when last week it was 2.8%, the leadership team panics, launches a 'task force,' and changes the process. Both readings were actually common-cause variation in a stable process, and the 'task force' added new variation that made things worse. Deming's funnel experiment showed that adjusting a stable process based on each output INCREASES variance. The opposite trap is using SPC as a way to DEFEND bad performance: 'it's just within control limits, we don't need to improve.' Control limits describe what the process IS doing, not what it SHOULD do. A stable process producing 5% defects is stable AND unacceptable.
What to Do
Pick one critical metric (cycle time, defect rate, response time). Plot the last 25-30 data points as a run chart. Calculate the mean and standard deviation; draw control limits at ±3σ. Now check: (1) Are any points outside the limits? Investigate THOSE specific incidents. (2) Are there runs of 8+ points on one side of the mean? That's a shift: the process changed. (3) Are there 6+ consecutive increasing or decreasing points? That's a trend. If none of those: the process is stable. To IMPROVE the level of the metric, change the system itself (different equipment, different method); don't react to individual points.
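The three checks above can be sketched in a few lines of Python. This is a minimal illustration, not a full SPC package: the function names (`control_limits`, `longest_run_one_side`, `longest_trend`, `check_chart`) and the sample data are hypothetical, and sigma is taken as the plain sample standard deviation, as the text describes. Many SPC tools estimate sigma from the moving range instead.

```python
from statistics import mean, stdev


def control_limits(points, k=3.0):
    """Return (LCL, center, UCL) using the sample mean and sample std dev."""
    center = mean(points)
    sigma = stdev(points)  # assumption: plain sample std dev, not moving range
    return center - k * sigma, center, center + k * sigma


def points_outside(points, lcl, ucl):
    """Indices of points beyond either control limit (check 1)."""
    return [i for i, x in enumerate(points) if x < lcl or x > ucl]


def longest_run_one_side(points, center):
    """Longest streak of consecutive points on the same side of the center."""
    longest = current = 0
    prev_side = 0
    for x in points:
        side = (x > center) - (x < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == prev_side:
            current += 1
        elif side != 0:
            current = 1
        else:
            current = 0  # a point exactly on the center line breaks the run
        prev_side = side
        longest = max(longest, current)
    return longest


def longest_trend(points):
    """Longest count of consecutive points that strictly rise or strictly fall."""
    if not points:
        return 0
    longest = current = 1
    prev_dir = 0
    for a, b in zip(points, points[1:]):
        direction = (b > a) - (b < a)
        if direction != 0 and direction == prev_dir:
            current += 1
        elif direction != 0:
            current = 2  # a new trend starts with these two points
        else:
            current = 1  # a tie breaks the trend
        prev_dir = direction
        longest = max(longest, current)
    return longest


def check_chart(points):
    """Apply the three checks: outside limits, run of 8+, trend of 6+."""
    lcl, center, ucl = control_limits(points)
    return {
        "limits": (lcl, ucl),
        "outside": points_outside(points, lcl, ucl),
        "shift": longest_run_one_side(points, center) >= 8,
        "trend": longest_trend(points) >= 6,
    }


# Hypothetical cycle times (days): 24 ordinary points, then one spike.
cycle_times = [10, 12, 9, 11, 10, 8, 11, 9, 12, 10, 9, 11, 10,
               12, 8, 10, 11, 9, 10, 11, 9, 12, 10, 9, 30]
print(check_chart(cycle_times))
```

If `outside` is empty and both `shift` and `trend` are false, the sketch treats the process as stable. Anything else points to the specific rows worth investigating.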
Formula
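The ±3σ control limits described in this article can be written out as follows. The symbols are assumptions for illustration: x_i are the n plotted points, \bar{x} is their mean, and s is their sample standard deviation used as the estimate of σ.

```latex
\begin{align}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i \\
s &= \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2} \\
\mathrm{UCL} &= \bar{x} + 3s \\
\mathrm{LCL} &= \bar{x} - 3s
\end{align}
```

A point x_i is flagged as a possible special cause when x_i > UCL or x_i < LCL.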
In Practice
Jack Welch's GE Six Sigma program (1995-2001) embedded SPC at industrial scale. GE's appliance plants tracked dozens of process metrics with control charts; operators were trained to distinguish common from special cause and only escalate the latter. When a sensor reading drifted outside control limits on a refrigerator compressor line in Louisville, the line was halted within minutes. Root cause: a worn bearing on a CNC machine, fixed before any defective compressors shipped. SPC moved GE from end-of-line inspection (catch defects after the fact) to in-process control (prevent them in real-time). Welch credited Six Sigma, which is built on SPC, with $12B in cumulative savings.
Pro Tips
- 01
Deming's Red Bead Experiment is the fastest way to internalize SPC: workers draw beads from a box; the number of red (defective) beads varies from 4 to 16 per draw, but the system (the bead box) is constant. Punishing/rewarding workers for above/below average is meaningless theater. The only way to reduce red beads is to change the box (the system).
- 02
Western Electric Rules: tests for non-random patterns on a control chart. The four classic zone tests are: any 1 point beyond ±3σ; 2 of 3 consecutive points beyond ±2σ on the same side of the mean; 4 of 5 consecutive points beyond ±1σ on the same side; 8+ in a row on one side of the mean. The later Nelson rules extend the set to eight tests. Modern SPC software automates these. Most teams only check rule 1 and miss the early-warning signals from the other rules.
- 03
Capability (Cpk) vs. Stability: a process can be STABLE (in control) and INCAPABLE (consistently producing defects within its natural variation). Stability is the prerequisite: you can't improve an unstable process predictably. Capability is the goal: your natural variation must fit inside customer requirements.
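The two early-warning zone tests from tip 02 (2 of 3 points beyond 2σ, 4 of 5 points beyond 1σ) are the ones teams most often skip, so here is a rough sketch of how they might be automated. The function name `zone_rule_hits` and the rule labels are hypothetical. It assumes the center line and sigma have already been estimated from a stable baseline, and that the qualifying points must fall on the same side of the center.

```python
def zone_rule_hits(points, center, sigma):
    """Return, per rule, the indices at which a zone test fires.

    A rule fires at index i when, within the window ending at i, at least
    `needed` points lie beyond `threshold` sigmas on the SAME side.
    """
    rules = (
        ("2_of_3_beyond_2sigma", 3, 2, 2.0),
        ("4_of_5_beyond_1sigma", 5, 4, 1.0),
    )
    # Express every point as a distance from the center in sigma units.
    z = [(x - center) / sigma for x in points]
    hits = {name: [] for name, _, _, _ in rules}
    for i in range(len(z)):
        for name, window, needed, threshold in rules:
            if i + 1 < window:
                continue  # not enough history yet for this rule
            recent = z[i + 1 - window : i + 1]
            above = sum(1 for v in recent if v > threshold)
            below = sum(1 for v in recent if v < -threshold)
            if above >= needed or below >= needed:
                hits[name].append(i)
    return hits


# Hypothetical readings, already centered at 0 with sigma = 1.
print(zone_rule_hits([0, 2.5, 0.1, 2.2, 0], center=0, sigma=1))
```

Neither of the two high readings in the example crosses the ±3σ limit, so rule 1 alone would stay silent. The 2-of-3 test is what flags them.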
Myth vs Reality
Myth
"SPC is only useful for high-volume manufacturing"
Reality
SPC works on any time-ordered metric: ER wait times, software incident counts, ad CTR, weekly sales, app crash rates. The math doesn't care if you're making widgets or measuring user behavior. Etsy uses SPC-style anomaly detection on every key metric to avoid tampering on noisy data.
Myth
"If a point is within control limits, the process is fine"
Reality
Control limits show what the process IS doing; they say nothing about what's acceptable to customers. A process can be stable inside control limits at a 5% defect rate that customers will not tolerate. Stability and capability are separate concepts; you need both. SPC tells you when to act, not whether the level is good enough.
Try it
Run the numbers.
Knowledge Check
Your weekly customer churn rate has been: 2.1%, 2.4%, 2.0%, 2.3%, 2.2%, 2.5%, 1.9%, 2.4% (mean ~2.2%, σ ~0.2%). This week it's 2.6%. Three months ago it ran 1.8-2.1%. What does SPC suggest?
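The arithmetic behind this question can be checked directly. The sketch below recomputes the mean and sample standard deviation from the eight readings and tests whether this week's 2.6% falls inside the ±3σ limits. Variable names are illustrative. Eight points is well short of the 25-30 recommended earlier, so the limits are rough.

```python
from statistics import mean, stdev

# The eight weekly churn readings from the question, in percent.
recent = [2.1, 2.4, 2.0, 2.3, 2.2, 2.5, 1.9, 2.4]
this_week = 2.6

center = mean(recent)
sigma = stdev(recent)  # sample standard deviation
lcl = center - 3 * sigma
ucl = center + 3 * sigma

# True means this week's point alone is not a special-cause signal.
within_limits = lcl <= this_week <= ucl

print(f"mean={center:.3f}  sigma={sigma:.3f}  limits=({lcl:.2f}, {ucl:.2f})")
print("this week within limits:", within_limits)
```

This only tests the single new point against recent behavior. Whether the level has shifted since the 1.8-2.1% period is a separate question that needs the older weekly readings on the same chart.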
Industry benchmarks
Is your number good?
Calibrate against real-world tiers. Use these ranges as targets, not absolutes.
Process Capability Index (Cpk)
Manufacturing and service processes meeting customer specifications
World-Class (Six Sigma): ≥ 2.0
Strong: 1.67-2.0
Capable: 1.33-1.67
Marginal: 1.0-1.33
Incapable: < 1.0
Source: Walter Shewhart / W. Edwards Deming / GE Six Sigma standards
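For readers who want to place their own process on these tiers, here is a small sketch using the standard Cpk definition: the distance from the process mean to the nearer specification limit, divided by 3σ. The function names `cpk` and `cpk_tier`, the spec limits in the example, and the handling of boundary values (a value exactly on a boundary goes to the higher tier) are assumptions for illustration.

```python
def cpk(process_mean, sigma, lsl, usl):
    """Cpk = min(USL - mean, mean - LSL) / (3 * sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return min(usl - process_mean, process_mean - lsl) / (3 * sigma)


def cpk_tier(value):
    """Map a Cpk value onto the benchmark tiers listed above."""
    if value >= 2.0:
        return "World-Class (Six Sigma)"
    if value >= 1.67:
        return "Strong"
    if value >= 1.33:
        return "Capable"
    if value >= 1.0:
        return "Marginal"
    return "Incapable"


# Hypothetical process: mean 10, sigma 1, customer spec limits 7 to 16.
value = cpk(process_mean=10, sigma=1, lsl=7, usl=16)
print(value, cpk_tier(value))
```

Because the mean sits closer to the lower limit in this example, the lower side determines Cpk. Re-centering the process would raise the index without reducing variation at all.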
Real-world cases
Companies that lived this.
Verified narratives with the numbers that prove (or break) the concept.
General Electric (Six Sigma Era)
1995-2001
Under Jack Welch, GE deployed Six Sigma, built on Shewhart/Deming SPC, across every business unit. Every process was charted; every defect tracked. GE Plastics in Mt. Vernon, IN reduced color variation in resin production by tightening SPC limits and re-centering processes; Cpk went from 0.9 (incapable) to 2.1 (world-class) over 18 months without major capex. Operators were trained to distinguish common vs. special cause and stop tampering. Welch publicly credited Six Sigma for $12B in cumulative savings by 2002, with SPC as the core technical method.
Cpk Improvement: 0.9 → 2.1 (typical)
Cumulative Savings: $12B by 2002
Trained Black Belts: ~5,000
Tampering Reduction: ~90% (informal estimate)
SPC is the technical engine of Six Sigma. Without it, 'continuous improvement' devolves into reacting to noise. With it, every team has a shared language for what's signal vs. what's chatter.
Hypothetical: B2B SaaS Customer Success
Recent
A 400-employee SaaS company's CS team was 'in crisis mode' weekly because NPS bounced between 32 and 48, and leadership demanded action every time it dipped. The CS lead applied SPC: NPS mean 40, σ ~5, control limits 25-55 (mean ±3σ). Almost every weekly reading was within control limits, so the panic-and-respond cycle was tampering. Once leadership stopped reacting to noise and only investigated readings outside the ±3σ control limits or runs of 8+ on one side, NPS actually rose to a stable 47 (the constant interventions had been adding variance). Real special-cause investigations led to two structural improvements that lifted the mean.
NPS Mean Before: 40 (panicky volatility)
NPS Mean After: 47 (stable)
Weekly Crisis Meetings: 1+ → 1/quarter
CS Team Burnout: Down significantly
Most leadership 'crisis response' on noisy weekly metrics is tampering. SPC gives leaders permission to ignore noise and focus only on real signals, so the team works on actual problems instead of imagined ones.