Why Measurement Changes Behavior
You announce a new metric: customer response time will now be tracked for every support interaction. Within days, support tickets get closed faster. Success? Not quite. Customers complain about receiving "Is this resolved?" messages before their issues are actually fixed. Agents discovered they could close tickets quickly by asking customers to confirm resolution and reopening if needed, technically improving the metric while degrading actual support quality.
This is measurement's paradox: the act of measuring changes what you're measuring. Announce you'll track something, and behavior immediately shifts—sometimes in desirable directions, often in perverse ones. The metric itself becomes the objective, not the underlying goal it was meant to represent.
This phenomenon appears everywhere: students study to the test instead of learning, employees optimize measured behaviors while neglecting unmeasured but critical work, organizations hit metric targets while real performance deteriorates. Understanding why and how measurement changes behavior is essential for using metrics effectively without being deceived by them.
The Core Mechanisms
Mechanism 1: Attention and Focus
When something is measured, it captures attention.
Before measurement:
- Support team handles tickets based on judgment
- Some agents prioritize complex issues
- Others focus on quick wins
- Natural variation, no clear priority signal
After announcing response time metric:
- Entire team focuses on response time
- Complex issues get delayed (hurt response time average)
- Quick issues get prioritized
- Unmeasured aspects (solution quality, customer satisfaction) get less attention
Why it happens:
- Limited attention and working memory
- Measurement creates salience
- Tracked items feel more important
- Untracked items fade to background
The principle: What gets measured gets attention. What doesn't get measured gets ignored.
Mechanism 2: Accountability and Evaluation
Measurement creates accountability.
When performance is measured:
- Results become visible
- Comparisons become possible (across people, teams, time periods)
- Evaluation feels imminent
- Stakes feel higher
Behavioral response:
- Increased effort (positive)
- Strategic behavior to look good (neutral to negative)
- Gaming and manipulation (negative)
Example: Sales Quotas
Before quotas measured:
- Salespeople balance short-term deals with relationship building
- Mix of large and small deals
- Long-term customer value prioritized
After monthly quotas measured:
- End-of-month scramble to hit numbers
- Discount offers to close marginal deals
- Pressure customers for early commitments
- Delay deals to next month if quota already hit (sandbagging)
The metric changed behavior, and not always positively.
Mechanism 3: Feedback Loops
Measurement provides feedback that enables learning and adjustment.
Positive feedback loops:
- See metric → understand performance → adjust behavior → see result
- Enables improvement when metric aligns with goals
Example: Personal fitness tracker
- See daily step count
- Notice low days
- Adjust: take stairs, walk during lunch
- See improvement
- Reinforced behavior
But feedback cuts both ways—it also enables gaming.
Negative feedback loops:
- See metric → realize it's tracked → optimize metric appearance
- Behavior shifts to metric, not underlying goal
Example: Teacher evaluation by test scores
- See that raises depend on scores
- Realize teaching to test boosts scores more than deep learning
- Shift instruction: test-taking strategies, narrow curriculum
- Scores improve, actual learning may decline
Mechanism 4: Signaling and Interpretation
Choosing to measure something sends a message.
Implicit message: "This is important."
Even without explicit consequences, measurement signals priorities.
Example: Company adds "innovation" metric
Announced: "We'll track number of ideas submitted per employee"
Interpretation (even if unstated):
- Innovation matters to leadership
- Quantity of ideas is valued
- Submitting ideas is career-positive
Behavioral response:
- More ideas submitted (desired)
- Many low-quality ideas to hit numbers (undesired)
- Less time refining good ideas (undesired)
The metric signaled "submit lots of ideas," not "innovate thoughtfully."
The Hawthorne Effect
The Original Studies
Background: Western Electric Hawthorne Works (1924-1932)
Initial question: Does lighting affect productivity?
Study design:
- Increase lighting → productivity improves
- Decrease lighting → productivity still improves
- Change nothing (control) → productivity still improves
Interpretation: Workers improved simply because they were being studied and observed, independent of actual changes.
Conclusion: Observation itself changes behavior.
Modern Understanding
Contemporary research refined the original interpretation:
Key factors:
- Attention and novelty: Being studied made workers feel special, valued
- Feedback: Workers got more feedback during study periods
- Autonomy: Research gave workers some control over conditions
- Demand characteristics: Workers inferred expectations and complied
Critically: The specific intervention (lighting) mattered less than the fact of being observed and studied.
Implications for Measurement
The Hawthorne effect means:
1. Measuring changes what you're measuring
- Before measurement: natural behavior
- During measurement: people aware they're observed alter behavior
2. Short-term improvements may not persist
- Initial novelty creates bump
- Effect fades as measurement becomes routine
- Need to distinguish the Hawthorne bump from real improvement (see the sketch after this list)
3. Blinding and hidden measurement have ethical issues
- Can reduce observer effect
- But raise consent and privacy concerns
- Often not feasible in organizational settings
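One practical way to separate a Hawthorne bump from durable change is to compare the window just after measurement begins with a later window, both against the pre-measurement baseline. A minimal sketch in Python; all numbers, window sizes, and the tolerance are invented for illustration:

```python
"""Check whether a post-measurement gain looks like a Hawthorne bump."""
from statistics import mean

# Hypothetical weekly throughput (tickets resolved per agent).
weekly = [50, 52, 49, 51,    # before measurement was announced
          60, 61, 58,        # just after the announcement
          53, 51, 52, 50]    # weeks later, once tracking is routine

ANNOUNCED_AT = 4    # index of the first measured week
NOVELTY_WEEKS = 3   # window where a novelty bump is expected

baseline = mean(weekly[:ANNOUNCED_AT])
bump = mean(weekly[ANNOUNCED_AT:ANNOUNCED_AT + NOVELTY_WEEKS])
later = mean(weekly[ANNOUNCED_AT + NOVELTY_WEEKS:])

print(f"baseline={baseline:.1f}  bump={bump:.1f}  later={later:.1f}")

# An early gain that fades back to baseline suggests novelty, not change.
if bump > baseline and later <= baseline * 1.02:
    print("Pattern consistent with a Hawthorne bump.")
```

If the early window beats the baseline but the later window falls back to it, the gain was likely observation novelty rather than real improvement.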
Examples in Practice
Example 1: Monitoring employee computer usage
Announced monitoring:
- Productivity metrics improve immediately
- People appear more focused
- Result: Combination of real focus + gaming (looking busy, minimizing non-work windows)
Effect fades:
- After weeks, people adapt
- Find ways to appear productive while doing other things
- Or focus less as monitoring becomes routine
Example 2: Customer satisfaction surveys
When customers know they'll be surveyed:
- Employees become extra attentive during survey periods
- Experience improves
- Scores go up
- After survey period, attention drops, scores decline
The measurement itself temporarily improved the experience; it did not produce sustainable change.
The Observer Effect
Beyond Hawthorne: Measurement as Intervention
Observer effect: The act of measurement changes the system being measured.
Distinction from Hawthorne:
- Hawthorne: People change behavior when observed
- Observer effect: Measurement itself alters what's measured (even independent of awareness)
Examples
Example 1: Asking about voting intentions
Phenomenon: Surveying people about voting makes them more likely to vote.
Mechanism:
- Being asked activates identity ("Am I a person who votes?")
- Creates commitment (stated intention)
- Increases salience (voting now on mind)
Result: Polls don't just measure voting intention—they increase it.
Example 2: Weighing yourself daily
Phenomenon: Daily weigh-ins change weight beyond just awareness.
Mechanism:
- Weight becomes salient daily
- Each weigh-in is a decision point
- Creates short feedback loops
- Motivates micro-adjustments
Result: Daily weighing doesn't just track weight—it influences it.
Example 3: Tracking work hours
Phenomenon: Time tracking changes how people work.
Mechanism:
- Awareness of time passing alters pace
- Tasks get broken into trackable units
- Non-trackable work (thinking, collaboration) may decrease
- Billing by hour creates incentive to work slowly
Result: Time tracking changes both productivity and work quality.
"What Gets Measured Gets Managed"
The Principle
Common saying: "What gets measured gets managed."
Meaning:
- Attention: Measured things receive focus
- Accountability: Metrics create responsibility
- Improvement: Measurement enables optimization
- Prioritization: Unmeasured things deprioritized
When It's Good
Beneficial cases:
| Situation | Metric | Positive Behavior Change |
|---|---|---|
| Vague goals | Define clear metric | Creates focus, enables coordination |
| Hidden performance | Make visible | Identifies problems, highlights successes |
| No feedback | Provide measurement | Enables learning, adjustment |
| Ambiguous priorities | Measure what matters | Aligns team efforts |
Example: Safety in manufacturing
Before measurement:
- Accidents happen but not systematically tracked
- No visibility into causes
- Varies by manager
After measurement (days since accident, incident reports):
- Accidents visible
- Patterns identified
- Improvement possible
- Safety becomes priority
Measurement improved outcomes.
When It's Bad: Goodhart's Law
Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure."
Why:
- People optimize for metric, not underlying goal
- Gaming, distortion, tunnel vision
- Metric decouples from what it was meant to represent
Example: Teaching to standardized tests
Metric: Student test scores
Intended goal: Improve student learning
What happened:
- Teachers optimize for test performance specifically
- Narrow curriculum to tested topics
- Teach test-taking strategies
- Actual learning breadth decreases
- Scores rise, learning quality questionable
The metric (test scores) became the target and ceased to represent learning.
The Dual Nature
"What gets measured gets managed" is both:
| Aspect | Positive | Negative |
|---|---|---|
| Focus | Directs attention to important areas | Ignores unmeasured but important factors |
| Feedback | Enables learning and improvement | Enables gaming and optimization for metric |
| Accountability | Creates responsibility | Creates pressure to hit numbers regardless of method |
| Alignment | Coordinates efforts toward goals | Coordinates efforts toward metrics (which may diverge from goals) |
Key: Whether measurement improves or degrades performance depends on metric design and organizational response.
Positive Uses of Measurement's Influence
Strategy 1: Measure What Truly Matters
If measurement changes behavior, measure the behavior you want.
Wrong: Measure output (features shipped, calls made)
- Incentivizes quantity over quality
- Ignores unmeasured outcomes (user value, deal quality)
Right: Measure outcome (user retention, revenue)
- Incentivizes actual goal achievement
- Can't easily game without real improvement
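As a concrete contrast, an outcome metric like retention has to be computed from what users actually did, which makes it harder to inflate than an output count. A minimal sketch in Python; the users, dates, and the 30-day window are hypothetical:

```python
"""Compute an outcome metric (30-day retention) from user activity."""
from datetime import date, timedelta

# Hypothetical log: user id -> (signup date, dates the user was active).
users = {
    "u1": (date(2024, 1, 5), [date(2024, 1, 20), date(2024, 2, 6)]),
    "u2": (date(2024, 1, 8), [date(2024, 1, 9)]),
    "u3": (date(2024, 1, 10), [date(2024, 2, 12)]),
}

def retained_30d(signup: date, activity: list[date]) -> bool:
    """True if the user was active 30+ days after signing up."""
    return any(seen >= signup + timedelta(days=30) for seen in activity)

retained = sum(retained_30d(s, a) for s, a in users.values())
print(f"30-day retention: {retained}/{len(users)} = {retained / len(users):.0%}")
```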
Strategy 2: Use Multiple Complementary Metrics
Single metrics get gamed. Balanced metrics resist gaming.
Example: Customer support
Single metric: Response time
- Gaming: Close tickets fast, reopen later
- Degraded: Quality, actual resolution
Balanced metrics:
- Response time (speed)
- Customer satisfaction score (quality)
- First-contact resolution rate (effectiveness)
- Reopened ticket rate (gaming detection)
Harder to game all simultaneously.
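A useful property of a balanced set is that all four numbers can come from the same ticket log, so the gaming signal travels with the speed signal. A minimal sketch in Python; the ticket fields and alert thresholds are hypothetical, not a real helpdesk schema:

```python
"""Read speed, quality, and a gaming signal from the same ticket log."""
from statistics import mean

tickets = [
    {"response_hours": 1.5, "csat": 5, "first_contact": True,  "reopened": False},
    {"response_hours": 0.5, "csat": 2, "first_contact": False, "reopened": True},
    {"response_hours": 3.0, "csat": 4, "first_contact": True,  "reopened": False},
    {"response_hours": 0.4, "csat": 3, "first_contact": False, "reopened": True},
]

metrics = {
    "avg_response_hours": mean(t["response_hours"] for t in tickets),
    "avg_csat": mean(t["csat"] for t in tickets),
    "first_contact_resolution": mean(t["first_contact"] for t in tickets),
    "reopen_rate": mean(t["reopened"] for t in tickets),
}

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")

# Fast responses alongside a high reopen rate suggests tickets are being
# closed prematurely to flatter the speed metric.
if metrics["avg_response_hours"] < 1.5 and metrics["reopen_rate"] > 0.2:
    print("Warning: speed may be coming from premature closes.")
```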
Strategy 3: Communicate the "Why"
Explain the goal behind the metric.
Without "why":
- Metric feels like arbitrary target
- Invites gaming
- Loses meaning
With "why":
- Metric connected to purpose
- Gaming feels like betraying mission
- People focus on goal, not just metric
Example:
Announcement: "We'll track average delivery time."
- Response: Focus on fast delivery, possibly sacrificing accuracy
Better announcement: "We'll track delivery time because customers need products when promised. Let's aim for fast, reliable delivery."
- Response: Balance speed with reliability
Strategy 4: Review and Rotate Metrics
Metrics degrade over time as people learn to game them.
Solution:
- Periodically review whether metrics still predict goals
- Rotate metrics when gaming becomes problematic
- Keep people focused on goals, not gaming specific metrics
Example: Rotating quality metrics in manufacturing
- Year 1: Track defect rate
- People optimize for defect metric (may hide edge cases)
- Year 2: Switch to customer return rate
- Harder to game (real customer impact)
- Forces focus back on actual quality
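The review itself can be partly automated: periodically check whether the proxy metric still correlates with the goal it stands in for. A minimal sketch in Python (statistics.correlation requires 3.10+); the monthly series and the 0.3 threshold are invented for illustration:

```python
"""Check whether a proxy metric still tracks the goal it stands in for."""
from statistics import correlation  # Python 3.10+

# Proxy: internal defect rate. Goal: customer return rate.
# Early on the proxy tracks the goal; later it keeps "improving"
# while returns quietly rise.
defect_rate = [4.0, 3.6, 3.1, 2.5, 1.9, 1.2, 0.8, 0.5]
return_rate = [5.1, 4.7, 4.2, 4.0, 4.1, 4.2, 4.4, 4.5]

WINDOW = 4
early = correlation(defect_rate[:WINDOW], return_rate[:WINDOW])
late = correlation(defect_rate[-WINDOW:], return_rate[-WINDOW:])

print(f"early corr={early:+.2f}  late corr={late:+.2f}")

# A correlation that collapses or flips means the proxy has decoupled
# from the goal: a candidate for rotation.
if late < 0.3 <= early:
    print("Proxy no longer tracks the goal; consider rotating the metric.")
```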
Strategy 5: Combine Quantitative and Qualitative
Numbers can be gamed. Stories reveal gaming.
Quantitative: Response time improved 30%
Qualitative: Customer feedback reveals issues weren't actually resolved
Together: Reveals that metric improved through gaming, not real improvement.
Negative Consequences of Measurement
Problem 1: Tunnel Vision
Focusing intensely on measured aspects blinds you to unmeasured but critical factors.
Example: Hospital emergency department
Metric: Average wait time
Optimization:
- Triage faster
- Start treatment quickly (even if not complete)
- Move patients through system rapidly
Unmeasured but important:
- Thoroughness of diagnosis
- Patient understanding of treatment
- Post-discharge outcomes
Result: Wait times down, but diagnostic errors and readmissions may increase.
Problem 2: Crowding Out Intrinsic Motivation
External measurement can undermine internal drive.
Before measurement:
- Employees motivated by craft, purpose, autonomy
- Work quality driven by pride
- Discretionary effort common
After heavy measurement:
- Motivation shifts to hitting numbers
- "Why bother if it's not measured?"
- Discretionary effort declines
Research finding (Deci & Ryan): Extrinsic rewards and measurement can reduce intrinsic motivation for tasks people previously enjoyed.
Problem 3: Metric Fixation
Mistaking the metric for the goal.
The map becomes the territory.
Example: Academic citations
Original purpose: Citations as proxy for research impact
Metric fixation:
- Researchers optimize for citation count
- Strategic citation rings
- Self-citation
- Publish incrementally (more papers = more citations)
Result: Citation counts rise, but don't reliably indicate true research impact anymore.
Problem 4: Creating Perverse Incentives
Metrics can incentivize opposite of intended behavior.
Example: Surgeon mortality rates
Intent: Measure surgeon skill, improve patient outcomes
Perverse incentive:
- High-risk patients increase mortality rates
- Surgeons avoid high-risk patients to protect stats
- Sickest patients can't find surgeons
Result: Metric meant to improve care creates access barriers for those who most need it.
Managing Measurement's Influence
Principle 1: Accept That Measurement Changes Behavior
Don't pretend measurement is neutral observation.
Instead:
- Design metrics assuming they'll shape behavior
- Ask: "If people optimize for this metric, what behavior results?"
- Choose metrics that incentivize desired behavior
Principle 2: Measure Outcomes, Not Just Outputs
Outputs: Activities (features shipped, calls made)
Outcomes: Results (user value, deals closed)
Outputs are easier to game and less aligned with goals.
Principle 3: Use Metrics to Guide, Not Punish
Measurement for learning vs. measurement for judgment:
| Purpose | Effect on Behavior |
|---|---|
| Learning and improvement | Honest reporting, problem-solving focus |
| Punishment and consequences | Gaming, hiding problems, risk aversion |
When metrics tied to punishment:
- Underreporting of issues
- Gaming to avoid consequences
- Optimization for metric, not goal
When metrics used for learning:
- Transparent sharing
- Focus on improvement
- Less gaming
Principle 4: Maintain Qualitative Understanding
Don't let metrics replace actual understanding.
Balanced approach:
- Use metrics for scale, trends, patterns
- Use qualitative methods (conversations, observations, stories) for context, mechanisms, edge cases
Metrics without qualitative insight: Easy to miss gaming, lose context
Qualitative insight without metrics: Hard to scale, spot trends, prioritize
Principle 5: Monitor for Unintended Consequences
Regularly ask:
- Is the metric still predicting the goal?
- Are people gaming it?
- What unmeasured factors are suffering?
- What perverse incentives have emerged?
If measurement creates more problems than it solves, change or eliminate it.
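One lightweight way to operationalize this review is to pair each target metric with a guardrail metric that gaming tends to hurt, and flag divergence between them. A minimal sketch in Python; the metric pairs and quarterly values are assumptions you would replace with your own system's data:

```python
"""Flag target metrics that improve while a paired guardrail degrades."""

# Each target metric is paired with a guardrail that gaming tends to
# hurt. Values are (last quarter, this quarter); direction says which
# way is "better" for that metric.
PAIRS = [
    ("avg_wait_minutes", (42.0, 28.0), "down",
     "readmission_rate", (0.08, 0.13), "down"),
    ("ideas_submitted", (120, 210), "up",
     "ideas_implemented", (18, 16), "up"),
]

def improved(before: float, after: float, direction: str) -> bool:
    """True if the metric moved in its 'better' direction."""
    return after < before if direction == "down" else after > before

for target, t_vals, t_dir, guard, g_vals, g_dir in PAIRS:
    if improved(*t_vals, t_dir) and not improved(*g_vals, g_dir):
        # Headline number moved the right way while its guardrail moved
        # the wrong way: investigate for gaming or tunnel vision.
        print(f"Investigate: {target} improved but {guard} degraded.")
```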
Conclusion: Measurement Is Intervention
Key insight: Measurement is never neutral observation. It's an intervention that changes the system.
Why measurement changes behavior:
- Focus: Measured things get attention
- Accountability: Metrics create responsibility and evaluation
- Feedback: Metrics enable optimization (for better or worse)
- Signaling: Measurement communicates priorities
Implications:
Positive potential:
- Focus attention on what matters
- Enable learning and improvement
- Align efforts toward goals
- Provide feedback for adjustment
Negative risks:
- Tunnel vision (unmeasured factors neglected)
- Gaming (optimizing metric appearance, not underlying goal)
- Crowding out intrinsic motivation
- Perverse incentives
The path forward:
- Measure what truly matters (not proxies)
- Use multiple complementary metrics (resist gaming)
- Communicate why metrics matter (connect to purpose)
- Balance metrics with qualitative understanding
- Monitor for gaming and unintended consequences
- Accept that measurement shapes behavior—design accordingly
"What gets measured gets managed"—for better or worse.
Design measurement systems assuming they'll change behavior. Because they will.
References
Mayo, E. (1933). The Human Problems of an Industrial Civilization. Macmillan.
Roethlisberger, F. J., & Dickson, W. J. (1939). Management and the Worker: An Account of a Research Program Conducted by the Western Electric Company, Hawthorne Works, Chicago. Harvard University Press.
Goodhart, C. (1975). "Problems of Monetary Management: The U.K. Experience." Papers in Monetary Economics (Reserve Bank of Australia).
Campbell, D. T. (1979). "Assessing the Impact of Planned Social Change." Evaluation and Program Planning, 2(1), 67–90.
Deci, E. L., & Ryan, R. M. (2000). "The 'What' and 'Why' of Goal Pursuits: Human Needs and the Self-Determination of Behavior." Psychological Inquiry, 11(4), 227–268.
Muller, J. Z. (2018). The Tyranny of Metrics. Princeton University Press.
Strathern, M. (1997). "'Improving Ratings': Audit in the British University System." European Review, 5(3), 305–321.
Austin, R. D. (1996). Measuring and Managing Performance in Organizations. Dorset House.
Kerr, S. (1975). "On the Folly of Rewarding A, While Hoping for B." Academy of Management Journal, 18(4), 769–783.
Ridgway, V. F. (1956). "Dysfunctional Consequences of Performance Measurements." Administrative Science Quarterly, 1(2), 240–247.
Pink, D. H. (2009). Drive: The Surprising Truth About What Motivates Us. Riverhead Books.
Kohn, A. (1999). Punished by Rewards: The Trouble with Gold Stars, Incentive Plans, A's, Praise, and Other Bribes. Houghton Mifflin.
Seddon, J. (2008). Systems Thinking in the Public Sector: The Failure of the Reform Regime...and a Manifesto for a Better Way. Triarchy Press.
Levitt, S. D., & Dubner, S. J. (2005). Freakonomics: A Rogue Economist Explores the Hidden Side of Everything. William Morrow.
Power, M. (1997). The Audit Society: Rituals of Verification. Oxford University Press.
About This Series: This article is part of a larger exploration of measurement, metrics, and evaluation. For related concepts, see [Why Metrics Often Mislead], [Goodhart's Law Breaks Metrics], [Designing Useful Measurement Systems], and [Vanity Metrics vs Meaningful Metrics].