Your dashboard shows success: conversion rate up 15%, user engagement climbing, revenue per customer increasing. Every metric green. The board presentation looks excellent. Yet three months later, the company is struggling—churn accelerating, support costs exploding, product quality complaints surging. How did all the metrics look great while the business deteriorated?

Metrics mislead not because they lie (though manipulation happens), but because they tell partial truths easily mistaken for complete pictures. A metric shows one number from one angle at one point in time, and organizations treat it as comprehensive reality. The map becomes the territory. The proxy becomes the goal. The measurement becomes divorced from what it was meant to measure.

Understanding how and why metrics mislead—through gaming, misinterpretation, misalignment, and the systemic pressures that corrupt good metrics once they become targets—is essential for using measurement effectively without being deceived.


The Core Mechanisms of Misleading

Mechanism 1: Goodhart's Law

Statement: "When a measure becomes a target, it ceases to be a good measure."

Why it happens:

  1. Metric chosen because it correlates with goal
  2. Metric becomes target
  3. People optimize for metric
  4. Correlation between metric and goal breaks down
  5. Metric improves while goal performance deteriorates

"Measures which appear simple are not always consistent: consider the case of a profit objective. Profit is easy to specify, but difficult to achieve: an increase in profits may be brought about by cutting R&D, maintenance, or marketing expenditure, all of which may damage long-term performance." — Charles Goodhart, economist and originator of Goodhart's Law


Example: Soviet Nail Factory

Goal: Produce useful nails

Metric: Weight of nails produced (tons)

Result:

  • Factory produces huge, heavy, useless nails
  • Metric (tonnage) maximized
  • Goal (useful nails) sacrificed

Alternative metric: Number of nails produced

Result:

  • Factory produces tiny, useless nails
  • Metric (count) maximized
  • Goal still sacrificed

Lesson: Optimizing a proxy metric destroys its relationship to the underlying goal.


Example: Wells Fargo Account Openings

Goal: Grow customer relationships

Metric: New accounts opened per employee

Intent: More accounts = deeper customer relationships

Result:

  • Employees opened millions of unauthorized accounts
  • Customers didn't want or use accounts
  • Metric (accounts) soared
  • Goal (real customer relationships) harmed
  • Reputation destroyed, billions in fines

The metric-as-target corrupted the system.


Mechanism 2: Gaming and Manipulation

Gaming: Achieving metric targets without improving (or while degrading) actual performance.

Common gaming tactics:

Tactic                     Description                                     Example
Cherry-picking             Report only favorable data                      Select best time period, exclude bad segments
Reclassification           Change definitions to improve numbers           Reclassify customers to hide churn
Threshold gaming           Bunch activity to just meet targets             Close deals at month-end to hit quota
Sandbagging                Delay good results to next period               Hold deals if quota already met
Output shifting            Hit metric by sacrificing unmeasured quality    Fast support resolution, problems unresolved
Measurement manipulation   Change how you measure                          Adjust survey timing, question wording

As Jerry Muller wrote in The Tyranny of Metrics, "The key is to recognize that [gaming] is not the product of a few bad individuals but of a system that creates pressures to which many conscientious people will respond—and that those pressures grow stronger the more consequential the metric becomes."


Example: British Ambulance Response Times

Target: Respond to emergencies within 8 minutes

Gaming tactics:

  • Stop clock when ambulance dispatched, not when it arrives
  • Send fast motorcycle paramedic first (hits 8-minute target), ambulance later
  • Reclassify emergencies to less urgent categories (looser targets)

Result: Response time metrics improved, but actual emergency care quality questionable.


Example: Teacher Test Score Gaming

Metric: Student test scores

Gaming:

  • Narrow curriculum to tested subjects
  • Teach test-taking strategies, not deep learning
  • Exclude low-performers from test day
  • In extreme cases: Change student answers, give answers during test

Result: Scores rise, actual educational outcomes unclear or worse.


Mechanism 3: Partial Visibility

Metrics show what they measure and hide everything else.

The illumination problem:

  • Metric illuminates measured aspect
  • Makes unmeasured aspects darker by comparison
  • "Drunk searching for keys under streetlight" problem

As systems theorist Russell Ackoff observed, "The one thing that all foolish plans have in common is that they are unidimensional. They focus on one variable. Improvement, however, requires the simultaneous management of many variables."


Example: Page Views

What it shows: Traffic volume

What it hides:

  • Traffic quality (bots vs. humans? Engaged vs. bounce?)
  • Traffic intent (ready to buy vs. random visitor?)
  • Traffic outcome (converted? Got value?)

Risk: Optimize for traffic volume, get useless traffic.


Example: Employee Productivity Metrics

What it shows: Hours worked, tasks completed, output produced

What it hides:

  • Quality of work
  • Collaboration and helping others
  • Innovation and creative thinking
  • Institutional knowledge building

Risk: Optimize for measured output, destroy unmeasured but critical factors.


Mechanism 4: Misinterpretation

Metrics are misinterpreted when:

  • Correlation confused with causation
  • Context ignored
  • Statistical significance misunderstood
  • Metric definition unclear

As statistician Darrell Huff wrote in How to Lie with Statistics, "The secret language of statistics, so appealing in a fact-minded culture, is employed to sensationalize, inflate, confuse, and oversimplify."


Example: Ice Cream and Drowning

Observation: Ice cream sales correlate with drowning deaths

Misinterpretation: Ice cream causes drowning

Reality: Both caused by warm weather (confounding variable)

Lesson: Correlation ≠ causation


Example: Simpson's Paradox

Phenomenon: Trend appears in subgroups but reverses when combined.

UC Berkeley admissions (1973):

  • Overall: Men admitted at higher rate than women (appears discriminatory)
  • By department: Women admitted at higher rates in most departments
  • Explanation: Women applied to more competitive departments

Lesson: Aggregated metrics can mislead. Segmentation reveals reality.
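The reversal is easy to reproduce. The sketch below uses invented two-department numbers (not the actual Berkeley data) in the same shape:

```python
# Hypothetical admissions data: dept -> (men applied, men admitted,
#                                        women applied, women admitted)
depts = {
    "Easy dept": (800, 480, 100, 70),   # men 60%, women 70% admitted
    "Hard dept": (100, 10, 800, 160),   # men 10%, women 20% admitted
}

# Within EVERY department, women are admitted at the higher rate.
for name, (ma, mad, wa, wad) in depts.items():
    assert wad / wa > mad / ma

# Aggregated, the trend reverses: women mostly applied to the hard dept.
men_rate = sum(v[1] for v in depts.values()) / sum(v[0] for v in depts.values())
women_rate = sum(v[3] for v in depts.values()) / sum(v[2] for v in depts.values())
print(f"men {men_rate:.1%}, women {women_rate:.1%}")  # men 54.4%, women 25.6%
```

The aggregate hides the confounding variable (which department was applied to), which is exactly why segmentation reveals reality.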


Example: Survivorship Bias

Phenomenon: Analyzing survivors without considering those who didn't survive.

WWII aircraft armor:

  • Observe bullet holes on returning planes
  • Temptation: Reinforce areas with holes
  • Reality: Reinforce areas without holes (planes hit there didn't return)

Lesson: Metrics based on survivors miss critical information from non-survivors.


Mechanism 5: Metric Decay

Even good metrics degrade over time.

Decay process:

  1. Metric initially correlates with goal
  2. Metric becomes target
  3. People learn to game it
  4. Metric-goal correlation weakens
  5. Eventually: Metric decoupled from goal

Example: Citation Counts in Academia

Original use: Citations as proxy for research impact

Early days: Reasonable correlation

As target:

  • Strategic citation networks
  • Self-citation
  • Citation rings (we cite each other)
  • Incremental publishing (more papers = more citations)

Result: Citation counts inflated, relationship to actual impact weakened.


Types of Misleading Metrics

Type 1: Vanity Metrics

Definition: Metrics that look impressive but don't correlate with meaningful outcomes.

Characteristics:

  • Easy to increase artificially
  • Make you feel good
  • Don't inform decision-making
  • Don't predict business success

Examples:

Vanity Metric            Why It Misleads                    Meaningful Alternative
Total page views         Doesn't mean engagement or value   Conversion rate, engagement rate
Social media followers   Many inactive, don't convert       Engagement rate, conversion from social
Registered users         Most never activate                Activated users, retained users
App downloads            Most never opened                  Day-7 retention, activated users
Email list size          Many unengaged                     Open rate, click rate, engaged subscribers

Danger: Celebrate vanity metrics, miss real performance.


Type 2: Proxy Metrics

Definition: Metrics that represent something else, assumed to correlate with goals.

Problem: Proxies degrade when they become targets.


Example: Hospital Readmission Rates

Proxy for: Quality of care

Logic: Better care → fewer readmissions

Gaming:

  • Extend initial hospital stays (no "readmission" if never discharged)
  • Discourage readmissions (treat in ER, don't formally admit)
  • Select healthier patients

Result: Readmission rates improve, actual care quality unclear.


Example: Employee Satisfaction Surveys

Proxy for: Workplace health, retention risk

Logic: Satisfied employees stay, perform better

Gaming:

  • Survey timing (avoid stressful periods)
  • Implicit pressure to rate highly
  • Survey fatigue (only most engaged respond)

Result: Scores rise, underlying issues persist.


Type 3: Ratio Distortion

Problem: Ratios can be improved by manipulating numerator or denominator, sometimes perversely.


Example: Acceptance Rate (College Rankings)

Metric: % of applicants accepted

Desired interpretation: Selectivity indicates quality

Gaming:

  • Encourage unqualified students to apply (increases applications, lowers acceptance rate)
  • Reject more students
  • Accept students "off waitlist" (not counted in initial acceptance rate)

Result: Acceptance rate drops, doesn't mean quality increased.


Example: Conversion Rate

Metric: Conversions / Visitors

Gaming options:

  • Increase numerator: Lower prices, worse targeting (more low-value conversions)
  • Decrease denominator: Filter traffic aggressively (fewer visitors raise the rate, even as total business shrinks)

Result: Conversion rate improves, revenue may decline.
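A toy calculation (all figures invented) shows how the ratio can move in the opposite direction from the business:

```python
# Baseline funnel: broad traffic, modest conversion.
visitors, conversions, avg_order = 10_000, 200, 120.0
base_rate = conversions / visitors        # 2.0%
base_revenue = conversions * avg_order    # $24,000

# "Optimized" funnel: shrink the denominator by filtering out all but
# the surest buyers. The rate rises; total revenue falls.
visitors2, conversions2, avg_order2 = 3_000, 90, 110.0
new_rate = conversions2 / visitors2       # 3.0%
new_revenue = conversions2 * avg_order2   # $9,900

assert new_rate > base_rate and new_revenue < base_revenue
```

Anyone tracking only the ratio would report a 50% improvement while revenue fell by more than half.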


Type 4: Threshold Effects

Problem: Behavior clusters around metric thresholds, creating distortions.


Example: Standardized Test Cutoffs

Metric: % of students scoring above threshold

Gaming:

  • Focus resources on "bubble students" (just below threshold)
  • Neglect high-performers (already above threshold)
  • Neglect low-performers (unlikely to reach threshold)

Result: More students hit threshold, but resource allocation becomes perverse.


Example: Sales Quotas

Threshold: Monthly revenue target

Distortion:

  • End-of-month scramble
  • Discounts to close marginal deals
  • Sandbagging (delay deals if quota met)
  • Revenue pulled forward (future months suffer)

Result: Monthly target hit, but annual performance and customer relationships suffer.


Domain-Specific Misleading Examples

Software Development

Misleading metric: Lines of code written

Problem:

  • Incentivizes verbosity
  • Discourages refactoring (reduces lines)
  • Conflates activity with value

Alternative: Features delivered and adopted, bug rates, code maintainability.


Misleading metric: Story points completed

Problem:

  • Story point inflation
  • Focus on volume, not value
  • Gaming estimation process

Alternative: User value delivered, cycle time, customer satisfaction.


Sales

Misleading metric: Pipeline value

Problem:

  • Easy to inflate by adding low-quality leads
  • Doesn't account for close probability
  • Creates false confidence

Alternative: Weighted pipeline (probability-adjusted), win rate, actual closed revenue.
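The probability-adjusted figure is straightforward to compute. A minimal sketch with invented deals and stage probabilities:

```python
# Each deal: (contract value, close probability estimated from stage).
pipeline = [
    (50_000, 0.10),  # early-stage: easy to add, rarely closes
    (80_000, 0.25),
    (30_000, 0.60),
    (20_000, 0.90),  # late-stage: near-certain
]

raw_total = sum(value for value, _ in pipeline)     # 180,000
weighted = sum(value * p for value, p in pipeline)  # ~61,000

# A junk lead inflates the raw number but barely moves the weighted
# one, which is why the weighted figure is harder to game.
pipeline.append((100_000, 0.02))  # a lead that will almost surely die
raw_after = sum(value for value, _ in pipeline)
weighted_after = sum(value * p for value, p in pipeline)
print(raw_after - raw_total, round(weighted_after - weighted))  # 100000 2000
```

The junk lead adds $100,000 to raw pipeline but only $2,000 to the weighted figure; the gaming incentive largely disappears.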


Misleading metric: Number of calls/meetings

Problem:

  • Activity, not outcome
  • Incentivizes quantity over quality
  • Doesn't predict revenue

Alternative: Conversion rates at each stage, deal velocity, revenue per rep.


Customer Support

Misleading metric: Tickets closed per hour

Problem:

  • Incentivizes quick closure, not resolution
  • Encourages closing without solving problem
  • Degrades customer experience

Alternative: First-contact resolution, customer satisfaction, issue recurrence rate.


Misleading metric: Average handle time

Problem:

  • Rushes complex issues
  • Discourages thoroughness
  • Reduces help quality

Alternative: Resolution rate, customer satisfaction, issue escalation rate.


Healthcare

Misleading metric: Patient satisfaction scores

Problem:

  • Can be gamed (avoid difficult conversations, over-prescribe pain meds)
  • Doesn't correlate strongly with health outcomes
  • May incentivize patient appeasement over best medical practice

Alternative: Health outcomes, evidence-based care adherence, patient safety indicators.


Misleading metric: Length of stay

Problem:

  • Pressure to discharge quickly
  • May compromise recovery
  • Readmission risk increases

Alternative: Readmission rates, recovery outcomes, patient safety, patient readiness for discharge.


Education

Misleading metric: Graduation rates

Problem:

  • Pressure to pass unprepared students
  • Grade inflation
  • Reduced academic standards

Alternative: Actual learning assessments, post-graduation outcomes, job placement rates.


Misleading metric: Test score averages

Problem:

  • Teaching to test
  • Narrow curriculum
  • Doesn't capture deep learning

Alternative: Critical thinking assessments, project quality, long-term learning retention.


The Measurement-Target Problem

Campbell's Law

Statement: "The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor."

Translation: Using a metric as a target corrupts it.


The Lifecycle of Metric Corruption

Stage 1: Valid Proxy

  • Metric correlates with goal
  • Useful for monitoring

Stage 2: Increased Attention

  • Metric reported prominently
  • Discussed in meetings
  • Used for evaluation

Stage 3: Becoming Target

  • Consequences attached to metric
  • Bonuses, promotions, reputation depend on it
  • Metric now high-stakes

Stage 4: Gaming Emerges

  • People discover how to improve metric without improving goal
  • Early gaming subtle
  • Metric-goal correlation weakens

Stage 5: Institutionalized Gaming

  • Gaming becomes normal practice
  • "Everyone does it"
  • Metric fully decoupled from goal

Stage 6: Metric Crisis

  • Obvious that metric no longer represents reality
  • Metric changed or abandoned
  • Cycle begins again with new metric

Example: British Healthcare Waiting Times

Goal: Reduce patient wait times for treatment

Metric: % of patients treated within target time (4 hours in emergency, 18 weeks for elective surgery)

Stage-by-stage corruption:

Stage 1-2: Valid proxy

  • Tracks real wait times
  • Identifies problem areas

Stage 3: High stakes

  • Hospital funding tied to hitting targets
  • Managers' careers depend on metrics

Stage 4-5: Gaming emerges and spreads

  • Ambulances wait outside ER until 4-hour window achievable
  • Patients reclassified to categories with longer targets
  • Elective surgeries scheduled just under 18-week deadline
  • Patients "pause" on waiting list (clock stops, not counted)

Stage 6: Crisis

  • Obvious gaming, public outcry
  • Metric no longer trusted
  • Actual care quality questionable despite hitting targets

Why Organizations Keep Using Misleading Metrics

Reason 1: Metrics Look Objective

Appeal: Numbers feel scientific, unbiased, fair

Reality: Metric choice is subjective, measurement contains biases, interpretation requires judgment

Result: False confidence in flawed metrics


Reason 2: Alternatives Are Harder

Qualitative assessment:

  • Requires judgment
  • Time-intensive
  • Harder to scale
  • Less "defensible" (no single number)

Metrics:

  • Quick, scalable
  • Easy to compare
  • Simple to report

Result: Organizations default to metrics even when misleading, because alternatives require more effort.


Reason 3: Accountability Pressure

Managers need to demonstrate results.

Metrics provide:

  • "Proof" of performance
  • Comparability (vs. goals, peers, past)
  • Defensibility in evaluations

Without metrics: "We improved" sounds vague

With metrics: "We improved X by 23%" sounds concrete

Problem: Even misleading metrics provide cover.

W. Edwards Deming, the quality management pioneer whose work transformed postwar Japanese manufacturing, put the organizational trap plainly: "It is wrong to suppose that if you can't measure it, you can't manage it—a costly myth."


Reason 4: Gaming Is Incremental

Gaming doesn't announce itself.

Evolution:

  • Start: Slight optimization (reasonable)
  • Middle: Aggressive optimization (questionable)
  • End: Full gaming (clear corruption)

At each step, individuals rationalize:

  • "I'm just being efficient"
  • "Everyone does this"
  • "The metric is the goal"

Result: Gaming normalized before anyone notices.


Reason 5: Inertia and Path Dependence

Once established:

  • Historical data accumulated
  • Comparisons over time matter
  • Changing metric feels like admitting past measurement was wrong
  • Political cost to change

Result: Broken metrics persist long after the problems become obvious.


Detecting Misleading Metrics

Red Flag 1: Metric Improves, Reality Doesn't

Test: Does improving the metric correspond to actual goal achievement?

Example:

  • Customer satisfaction scores rising
  • Yet churn increasing, complaints up
  • Red flag: Metric decoupled from reality

Red Flag 2: Everyone Hits Targets Easily

If targets consistently achieved:

  • Targets too easy, OR
  • Widespread gaming

Healthy: Some hit targets, some miss (indicates stretch goals and honesty)

Suspicious: Everyone always hits targets (indicates gaming or sandbagging)


Red Flag 3: Unmeasured Aspects Deteriorating

If measured areas improve while unmeasured areas degrade:

  • Tunnel vision
  • Resources shifted from unmeasured to measured

Example:

  • Metric: Feature velocity (features shipped per sprint)
  • Reality: Code quality declining, technical debt rising, bugs increasing

Red Flag 4: Metric Behavior Clusters Around Thresholds

If results bunch just above threshold:

  • Indicates gaming to hit target
  • Natural distributions don't cluster at arbitrary thresholds

Example: Test scores clustering just above passing threshold suggests teaching narrowly to threshold.
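One crude screen for this pattern (a sketch, not a substitute for a proper distributional test) compares counts in narrow bands just below and just above the cutoff:

```python
def clustering_ratio(scores, threshold, band=2):
    """Count results in narrow bands on either side of the threshold.
    A natural distribution keeps the two bands roughly balanced;
    gamed results pile up just above the cutoff."""
    below = sum(1 for s in scores if threshold - band <= s < threshold)
    above = sum(1 for s in scores if threshold <= s < threshold + band)
    return above / max(below, 1)

# Invented score lists for illustration (passing threshold = 60).
natural = [55, 58, 59, 60, 61, 62, 59, 61, 58, 60, 64, 66]
gamed   = [55, 60, 61, 60, 60, 61, 61, 60, 59, 61, 60, 64]

print(clustering_ratio(natural, 60))  # ~1: balanced around threshold
print(clustering_ratio(gamed, 60))    # >> 1: suspicious pile-up
```

A ratio far above 1 doesn't prove gaming, but it flags where to look.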


Red Flag 5: People Can't Explain How Metric Connects to Goal

Ask: "How does improving this metric advance our actual goals?"

If answers are:

  • Vague
  • Circular ("We measure it because it's important")
  • Inconsistent across people

Red flag: Metric has become ritualized without clear purpose.


Preventing Metric Misleading

Strategy 1: Measure Outcomes, Not Just Proxies

Closer to actual goal = harder to game.

Hierarchy:

  • Worst: Activity metrics (calls made, features shipped)
  • Better: Output metrics (deals closed, features adopted)
  • Best: Outcome metrics (revenue, customer retention, mission impact)

Strategy 2: Use Multiple Complementary Metrics

Single metrics get gamed. Balanced scorecards resist gaming.

Example: Balanced customer support metrics

  • Speed (response time)
  • Quality (customer satisfaction)
  • Effectiveness (first-contact resolution)
  • Efficiency (cost per ticket)

Can't optimize all simultaneously without real improvement.
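One way to operationalize a balanced scorecard is a guardrail check: a headline gain counts only if no complementary metric regressed. The function, metric names, and tolerance below are illustrative:

```python
def counts_as_improvement(primary_delta, guardrail_deltas, tolerance=0.02):
    """primary_delta: fractional change in the headline metric.
    guardrail_deltas: fractional changes in complementary metrics.
    A gain counts only if no guardrail fell by more than `tolerance`."""
    if primary_delta <= 0:
        return False
    return all(d >= -tolerance for d in guardrail_deltas.values())

# Support example: response speed improved 20%, but in one scenario the
# quality and effectiveness guardrails regressed.
faster_but_worse = {"csat": -0.10, "first_contact_resolution": -0.05}
faster_and_fine  = {"csat": 0.00, "first_contact_resolution": 0.03}

assert not counts_as_improvement(0.20, faster_but_worse)
assert counts_as_improvement(0.20, faster_and_fine)
```

Gaming the speed metric by sacrificing quality now fails the check instead of earning a reward.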


Strategy 3: Include Qualitative Assessment

Don't rely on metrics alone.

Balanced approach:

  • Metrics for scale, trends, patterns
  • Qualitative methods (conversations, observations, stories) for context, gaming detection, meaning

Strategy 4: Separate Measurement from Evaluation

When metrics are used for:

  • Learning: Honest reporting, problem-solving
  • Punishment: Gaming, hiding problems

Approach:

  • Measure for learning and improvement (formative)
  • Supplement with periodic evaluation (summative) that's harder to game

Strategy 5: Rotate Metrics

If metric becomes corrupted:

  • Change or retire it
  • Introduce new metric
  • Forces people to refocus on goal, not metric

Strategy 6: Audit for Gaming

Regularly check:

  • Are there suspicious patterns? (clustering at thresholds, sudden changes)
  • Do metric improvements correspond to real outcomes?
  • What are people doing to hit metrics?

If gaming detected, address root causes (incentives, consequences), not just symptoms.


Conclusion: Metrics as Tools, Not Truth

Metrics are not reality. They are models of reality—simplified, partial, distorted.

The map is not the territory.

Metrics mislead when:

  • They become targets (Goodhart's Law)
  • People game them
  • They're misinterpreted
  • They show partial picture (hide important factors)
  • They decay over time as gaming evolves

Despite risks, metrics are useful:

  • Enable scale (can't qualitatively assess millions)
  • Identify patterns
  • Track trends
  • Focus attention

The path forward:

  • Use metrics (don't abandon measurement)
  • Don't trust metrics blindly (supplement with qualitative understanding)
  • Measure outcomes (not just proxies)
  • Use multiple metrics (resist gaming)
  • Monitor for corruption (metrics degrade over time)
  • Remember the goal (metric is tool, not objective)

Good measurement requires:

  • Humility (metrics are flawed tools)
  • Vigilance (watch for gaming and distortion)
  • Balance (metrics + qualitative understanding)
  • Purpose (remember why you're measuring)

"What gets measured gets managed"—sometimes in ways that help, often in ways that hurt.

Measure thoughtfully. Interpret carefully. Act wisely.


What Research Shows About Metric Misleading

The literature on metric dysfunction is extensive, but several researchers have produced the most influential analyses of why and how metrics mislead.

Charles Goodhart identified the most fundamental mechanism. Working as an economist at the Bank of England in the 1970s, Goodhart observed that every monetary aggregate the Bank used as a policy target lost its predictive validity once it became a target. The intuition: metrics are chosen because they correlate with outcomes. But correlation depends on people not optimizing for the metric. Once they do, the behavior that sustained the correlation changes, and the metric decouples from the outcome it was supposed to track. This is not a failure of measurement design -- it is a structural property of any proxy metric in an adversarial context. The 2016 Wells Fargo scandal, in which employees opened 3.5 million unauthorized accounts to hit cross-selling metrics, is the canonical modern business example. The metric (accounts per customer) correlated with genuine customer relationship depth when employees were trying to build relationships. It decoupled completely when they were trying to hit a number.

Donald Campbell's 1979 paper "Assessing the Impact of Planned Social Change" extended Goodhart's insight to social policy. Campbell's Law: the more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures. Campbell was drawing on his own research evaluating Great Society programs in the 1960s, where he observed that programs evaluated on narrow quantitative metrics consistently gamed those metrics while the broader social problems they targeted remained unaddressed. Campbell's contribution was to identify the institutional mechanism: when a metric determines funding, careers, and reputation, entire organizations align toward improving the metric. This is not individual moral failure but rational collective adaptation to incentive structures.

Jerry Muller's The Tyranny of Metrics (2018) is the most comprehensive empirical survey of metric misleading across sectors. Muller, a historian at Catholic University of America, examined education, healthcare, policing, universities, the military, and business. His core finding is that metric fixation -- the institutional substitution of quantitative metrics for substantive judgment -- reliably produces the same set of dysfunctions: gaming, tunnel vision (unmeasured factors neglected), short-termism, risk aversion, and the demotivation of intrinsically motivated professionals. Muller's historical perspective adds an important dimension: these patterns are not new. They appeared in British colonial administration, in Soviet industrial planning, and in early 20th-century American scientific management. Metric fixation is a recurring institutional pathology, not a product of digital-age data abundance.

W. Edwards Deming approached the problem from the perspective of manufacturing quality. His critique of "management by objective" and numerical targets was grounded in statistical thinking: most variation in outcomes is produced by system factors, not individual performance. When organizations use metrics to evaluate and rank individuals, they misattribute system-level variation to individual performance, creating perverse incentives and destroying the psychological safety necessary for genuine improvement. Deming's approach was to use metrics as tools for understanding systems -- tracking variation over time to identify special causes and common causes -- rather than as targets or performance evaluations. His work transforming Japanese manufacturing quality after World War II demonstrated that this approach could produce sustained improvement that metric-based management typically could not.
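Deming's distinction can be sketched with a basic Shewhart-style control rule (a simplification; real SPC practice adds subgrouping and run rules): points within three standard deviations of the historical mean are treated as common-cause system variation, not individual performance signals.

```python
def control_limits(history):
    """Naive 3-sigma control limits from historical observations."""
    n = len(history)
    mean = sum(history) / n
    sd = (sum((x - mean) ** 2 for x in history) / n) ** 0.5
    return mean - 3 * sd, mean + 3 * sd

def is_special_cause(history, point):
    lo, hi = control_limits(history)
    return not (lo <= point <= hi)

weekly_defects = [12, 15, 11, 14, 13, 16, 12, 14, 13, 15]  # mean 13.5, sd 1.5

# 16 defects is ordinary system variation; reacting to it is tampering.
assert not is_special_cause(weekly_defects, 16)
# 25 defects falls outside the limits: investigate for a special cause.
assert is_special_cause(weekly_defects, 25)
```

In Deming's terms, rewarding or punishing the team for week-to-week movement inside the limits misattributes system variation to individuals.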

Robert Kaplan and David Norton, developing the Balanced Scorecard at Harvard Business School in the early 1990s, documented a specific form of metric misleading: the exclusive focus on financial metrics in corporate performance management. Their research showed that companies relying solely on financial KPIs consistently underinvested in the drivers of future performance -- employee learning, internal process quality, customer relationships. The financial metrics looked fine until the underlying factors deteriorated sufficiently to show up in the numbers, typically 18 to 24 months later. By then, the damage was expensive to reverse. The Balanced Scorecard was designed as a multi-perspective measurement system that would reveal deterioration in leading indicators before it became visible in financial results.


Real-World Case Studies in Metric Misleading

Soviet nail production. The Soviet planned economy provides the most documented historical examples of metric gaming at industrial scale. When central planners set quotas for nail production measured in tons, factories shifted to manufacturing large, heavy nails that maximized tonnage while minimizing usefulness. When planners switched to quota by unit count, factories shifted to tiny nails. When steel sheet quotas were measured by weight, factories produced thick sheets; when measured by area, they produced thin sheets that tore in use. These examples are not apocryphal -- they are drawn from documented Soviet economic analyses and represent a predictable consequence of using proxy metrics in a system with no market feedback. The lesson is not unique to communist economies: any system that ties strong incentives to proxy metrics while eliminating market feedback mechanisms will produce similar results.

The UK waiting time targets. The Blair government introduced waiting time targets for the National Health Service in the early 2000s as a transparency and accountability measure. Patients were waiting unacceptably long for both emergency and elective treatment. The targets were genuine improvements: the 4-hour emergency department limit and 18-week elective surgery limit addressed real patient harm. As Gerry Bevan and Christopher Hood documented in their 2006 analysis in Public Administration, the targets created measurable improvement in reported wait times -- and equally measurable gaming. Emergency departments developed practices for holding ambulances outside until staff were confident the patient could be processed within the 4-hour window from formal arrival. Elective surgery waiting lists were "administratively paused" -- patients remained waiting but were temporarily removed from the counted list. Patients were reclassified into clinical categories with longer permitted wait times. The metric improved while the patient experience it was supposed to represent was manipulated in ways that did not necessarily improve care.

Enron's revenue metrics. Enron's collapse in 2001 is partly a story about how accounting metrics can be structured to mislead. The company used mark-to-market accounting to record the full expected lifetime value of long-term contracts as revenue in the current period. This produced revenue growth numbers (from $13 billion in 1996 to $101 billion in 2000) that made Enron appear to be one of the most successful companies in American history. The metric was not technically fraudulent at first -- mark-to-market was an accepted accounting method. But it created a systematic decoupling between reported revenue and cash generation. Investors and analysts who relied on the revenue metric without understanding its calculation methodology were misled not by false numbers but by a metric whose relationship to economic value had been severed. When the underlying contracts failed to generate the projected cash flows, the house of cards collapsed.

UK policing and crime statistics. British police forces, under pressure from government targets to reduce crime, have been repeatedly documented manipulating the crime statistics used to measure their performance. The methods are well-understood: downgrading offenses to less serious categories (reclassification), failing to record crimes reported by victims (no-criming), discouraging victims from making formal reports. A 2014 report by the UK Statistics Authority removed its quality mark from Home Office crime statistics, citing concerns about data integrity. Her Majesty's Inspectorate of Constabulary investigations found substantial evidence of under-recording across multiple forces. The metric (recorded crime rate) had become a performance target; the response was to manage the metric rather than crime itself. The British Crime Survey -- an independent victimization survey that asks citizens about crimes experienced regardless of whether they were reported to police -- consistently shows different trends from police-recorded statistics, revealing the gap between the metric and the reality it was supposed to represent.

Academic citation inflation. The h-index, developed by physicist Jorge Hirsch in 2005 as a metric for research impact, became widely used in academic hiring and tenure decisions within years of its introduction. The predictable consequence documented by researchers including Ludo Waltman and Nees Jan van Eck: citation practices changed. Citation rings -- groups of researchers who agree to cite each other -- became more common. Self-citation rates increased. Researchers published more shorter papers (maximizing citable units) rather than fewer comprehensive works. Journals competed for citations by publishing more review articles (heavily cited) and fewer replication studies (rarely cited). The h-index was a reasonable proxy for research impact when it was a passive measurement. As it became a high-stakes target, behavior adapted to improve the metric rather than the research quality it was supposed to represent.


Evidence-Based Principles for Resisting Metric Misleading

Principle 1: Treat all metrics as hypotheses that require validation. A metric is a claim that this number tracks that outcome. This claim may be true initially and false later, as gaming develops. Organizations should test the relationship between metrics and outcomes systematically and regularly. If a metric improves but the outcome it is supposed to predict does not improve, the metric has been corrupted or was never valid. This requires tracking both the metric and the outcome it is supposed to represent, which means maintaining some form of gold-standard outcome measurement even when it is expensive.
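A minimal version of that test, with invented quarterly pairs of (proxy metric, gold-standard outcome): track both together and alarm when their correlation degrades.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical quarterly pairs of (proxy metric, gold-standard outcome).
early_quarters = [(70, 68), (75, 74), (80, 79), (85, 83)]  # proxy tracks outcome
late_quarters  = [(88, 80), (92, 78), (95, 77), (97, 75)]  # proxy up, outcome down

r_early = pearson(*zip(*early_quarters))  # close to +1: proxy still valid
r_late = pearson(*zip(*late_quarters))    # negative: the proxy is corrupted

assert r_early > 0.9 and r_late < 0
```

The expensive part is not the arithmetic but maintaining the gold-standard outcome series; without it, there is nothing to validate the proxy against.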

Principle 2: Separate measurement from evaluation. Deming's insight was that using metrics for performance evaluation creates exactly the pressure that drives gaming and corruption. Metrics used for learning and system improvement -- where the purpose is to understand what is happening, not to evaluate and rank people -- generate different behavior. Organizations that use the same metrics for both operational learning and individual performance evaluation typically end up with both functions compromised: the metrics get gamed (undermining operational learning) and the evaluation becomes disconnected from actual performance.

Principle 3: Use metrics that require value delivery to improve. The most gaming-resistant metrics are those structurally tied to genuine value creation. Customer retention can be improved by actually retaining customers; it cannot be improved by reclassifying churned customers as active. Revenue per customer can be improved by delivering more value to existing customers; it cannot be improved by opening unauthorized accounts. Metrics that can be improved through accounting manipulations, reclassifications, or procedural gaming are more susceptible to Goodhart dynamics. Metrics that require actual behavior change in service of genuine customer or organizational value are more robust.
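One way to make the structural point concrete: derive the metric from observed behavior rather than from an editable status field. The sketch below is hypothetical (the activity log, window, and names are assumptions, not from the sources cited); its only point is that a retention number computed from activity timestamps cannot be raised by relabeling customers.

```python
"""Sketch: retention derived from raw activity, not from a status label."""
from datetime import date

# Hypothetical activity log: customer id -> date of last observed activity.
last_activity = {
    "c1": date(2024, 5, 30),
    "c2": date(2024, 1, 12),
    "c3": date(2024, 6, 2),
    "c4": date(2023, 11, 3),
}

def retention_rate(last_activity, as_of, window_days=90):
    """Share of customers with observed activity in the trailing window.
    Because the input is behavior, not a classification, the only way to
    raise this number is for customers to actually remain active."""
    active = sum(1 for d in last_activity.values()
                 if (as_of - d).days <= window_days)
    return active / len(last_activity)

print(retention_rate(last_activity, as_of=date(2024, 6, 15)))  # 0.5
```

A metric computed this way can still be misread, but it cannot be improved by the reclassification moves that corrupt status-based counts.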

Principle 4: Maintain qualitative assessment alongside quantitative measurement. Muller's historical analysis shows that metric misleading is most severe in organizations that have eliminated qualitative judgment in favor of pure quantification. The reason is structural: gaming a number is much easier than gaming an experienced human observer who is asking substantive questions. Healthcare organizations that combine outcome metrics with clinical peer review, educational institutions that combine test scores with portfolio assessment, and businesses that combine performance metrics with customer relationship data are consistently more resistant to Goodhart dynamics than those relying solely on quantitative measurement.


References

  1. Goodhart, C. (1975). "Problems of Monetary Management: The U.K. Experience." Papers in Monetary Economics (Reserve Bank of Australia).

  2. Campbell, D. T. (1979). "Assessing the Impact of Planned Social Change." Evaluation and Program Planning, 2(1), 67–90.

  3. Muller, J. Z. (2018). The Tyranny of Metrics. Princeton University Press.

  4. Kerr, S. (1975). "On the Folly of Rewarding A, While Hoping for B." Academy of Management Journal, 18(4), 769–783.

  5. Ridgway, V. F. (1956). "Dysfunctional Consequences of Performance Measurements." Administrative Science Quarterly, 1(2), 240–247.

  6. Austin, R. D. (1996). Measuring and Managing Performance in Organizations. Dorset House.

  7. Strathern, M. (1997). "'Improving Ratings': Audit in the British University System." European Review, 5(3), 305–321.

  8. Power, M. (1997). The Audit Society: Rituals of Verification. Oxford University Press.

  9. Levitt, S. D., & Dubner, S. J. (2005). Freakonomics: A Rogue Economist Explores the Hidden Side of Everything. William Morrow.

  10. Bevan, G., & Hood, C. (2006). "What's Measured Is What Matters: Targets and Gaming in the English Public Health Care System." Public Administration, 84(3), 517–538.

  11. de Bruijn, H. (2007). Managing Performance in the Public Sector (2nd ed.). Routledge.

  12. Hood, C. (2006). "Gaming in Targetworld: The Targets Approach to Managing British Public Services." Public Administration Review, 66(4), 515–521.

  13. Kahneman, D., & Tversky, A. (1973). "On the Psychology of Prediction." Psychological Review, 80(4), 237–251.

  14. Croll, A., & Yoskovitz, B. (2013). Lean Analytics: Use Data to Build a Better Startup Faster. O'Reilly Media.

  15. Seddon, J. (2008). Systems Thinking in the Public Sector: The Failure of the Reform Regime...and a Manifesto for a Better Way. Triarchy Press.


About This Series: This article is part of a larger exploration of measurement, metrics, and evaluation. For related concepts, see [Goodhart's Law Breaks Metrics], [Why Measurement Changes Behavior], [Vanity Metrics vs Meaningful Metrics], and [Designing Useful Measurement Systems].

Frequently Asked Questions

Why do metrics often mislead?

People game them, they get disconnected from underlying goals, they're misinterpreted, or they measure proxies poorly correlated with what matters.

What is Goodhart's Law?

When a measure becomes a target, it ceases to be a good measure—people optimize for the metric rather than the underlying goal.

How do people game metrics?

By hitting targets without improving actual performance: optimizing the metric's appearance while degrading what it was meant to measure.

What are vanity metrics?

Metrics that look impressive but don't correlate with meaningful outcomes or business goals—they make you feel good but aren't useful.

Can good metrics go bad?

Yes. Once metrics become targets with consequences, behavior shifts to optimize the metric, often degrading actual performance.

What causes metric misinterpretation?

Confusing correlation with causation, ignoring context, cherry-picking time periods, or not understanding what metrics actually measure.

How do you prevent metrics from misleading?

Use multiple metrics together, understand limitations, monitor for gaming, maintain qualitative understanding, and tie metrics to actual outcomes.

Should you ever stop measuring something?

Yes. When metrics get gamed beyond usefulness, create perverse incentives, or when behavior optimizes metrics over real goals.