Why Measurement Changes Behavior

You announce a new metric: customer response time will now be tracked for every support interaction. Within days, support tickets get closed faster. Success? Not quite. Customers complain about receiving "Is this resolved?" messages before their issue is actually fixed. Agents discovered they could close tickets quickly by asking customers to confirm resolution, then reopening if needed, technically improving the metric while degrading actual support quality.

This is measurement's paradox: the act of measuring changes what you're measuring. Announce you'll track something, and behavior immediately shifts—sometimes in desirable directions, often in perverse ones. The metric itself becomes the objective, not the underlying goal it was meant to represent.

This phenomenon appears everywhere: students study to the test instead of learning, employees optimize measured behaviors while neglecting unmeasured but critical work, organizations hit metric targets while real performance deteriorates. Understanding why and how measurement changes behavior is essential for using metrics effectively without being deceived by them.


The Core Mechanisms

Mechanism 1: Attention and Focus

When something is measured, it captures attention.

Before measurement:

  • Support team handles tickets based on judgment
  • Some agents prioritize complex issues
  • Others focus on quick wins
  • Natural variation, no clear priority signal

After announcing response time metric:

  • Entire team focuses on response time
  • Complex issues get delayed (they hurt the response-time average)
  • Quick issues get prioritized
  • Unmeasured aspects (solution quality, customer satisfaction) get less attention

Why it happens:

  • Limited attention and working memory
  • Measurement creates salience
  • Tracked items feel more important
  • Untracked items fade to background

The principle: What gets measured gets attention. What doesn't get measured gets ignored.


Mechanism 2: Accountability and Evaluation

Measurement creates accountability.

When performance is measured:

  • Results become visible
  • Comparisons become possible (across people, teams, time periods)
  • Evaluation feels imminent
  • Stakes feel higher

Behavioral response:

  • Increased effort (positive)
  • Strategic behavior to look good (neutral to negative)
  • Gaming and manipulation (negative)

Example: Sales Quotas

Before quotas measured:

  • Salespeople balance short-term deals with relationship building
  • Mix of large and small deals
  • Long-term customer value prioritized

After monthly quotas measured:

  • End-of-month scramble to hit numbers
  • Discount offers to close marginal deals
  • Pressure customers for early commitments
  • Delay deals to next month if quota already hit (sandbagging)

The metric changed behavior, and not always positively.


Mechanism 3: Feedback Loops

Measurement provides feedback that enables learning and adjustment.

Beneficial feedback loops:

  • See metric → understand performance → adjust behavior → see result
  • Enables improvement when metric aligns with goals

Example: Personal fitness tracker

  • See daily step count
  • Notice low days
  • Adjust: take stairs, walk during lunch
  • See improvement
  • Reinforced behavior

But feedback cuts both ways—it also enables gaming.

Harmful feedback loops:

  • See metric → realize it's tracked → optimize metric appearance
  • Behavior shifts to metric, not underlying goal

Example: Teacher evaluation by test scores

  • See that raises depend on scores
  • Realize teaching to test boosts scores more than deep learning
  • Shift instruction: test-taking strategies, narrow curriculum
  • Scores improve, actual learning may decline

Mechanism 4: Signaling and Interpretation

Choosing to measure something sends a message.

Implicit message: "This is important."

Even without explicit consequences, measurement signals priorities.

Example: Company adds "innovation" metric

Announced: "We'll track number of ideas submitted per employee"

Interpretation (even if unstated):

  • Innovation matters to leadership
  • Quantity of ideas is valued
  • Submitting ideas is career-positive

Behavioral response:

  • More ideas submitted (desired)
  • Many low-quality ideas to hit numbers (undesired)
  • Less time refining good ideas (undesired)

The metric signaled "submit lots of ideas," not "innovate thoughtfully."


The Hawthorne Effect

The Original Studies

Background: Western Electric Hawthorne Works (1924-1932)

Initial question: Does lighting affect productivity?

Study design:

  • Increase lighting → productivity improves
  • Decrease lighting → productivity still improves
  • Change nothing (control) → productivity still improves

Interpretation: Workers improved simply because they were being studied and observed, independent of actual changes.

Conclusion: Observation itself changes behavior.


Modern Understanding

Contemporary research refined the original interpretation:

Key factors:

  1. Attention and novelty: Being studied made workers feel special, valued
  2. Feedback: Workers got more feedback during study periods
  3. Autonomy: Research gave workers some control over conditions
  4. Demand characteristics: Workers inferred expectations and complied

Critically: The specific intervention (lighting) mattered less than the fact of being observed and studied.


Implications for Measurement

The Hawthorne effect means:

1. Measuring changes what you're measuring

  • Before measurement: natural behavior
  • During measurement: people aware they're observed alter behavior

2. Short-term improvements may not persist

  • Initial novelty creates bump
  • Effect fades as measurement becomes routine
  • Need to distinguish a Hawthorne bump from real improvement (see the sketch after this list)

3. Blinding and hidden measurement have ethical issues

  • Can reduce observer effect
  • But raise consent and privacy concerns
  • Often not feasible in organizational settings
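
To make the second implication concrete, here is a minimal sketch of one way to check whether a post-announcement lift persists, assuming weekly metric snapshots and an arbitrary four-week novelty window (both are illustrative assumptions, not standards):

```python
from statistics import mean

def hawthorne_check(pre, post, bump_window=4):
    """Compare the first post-announcement weeks with later ones.

    A lift that fades back toward the pre-announcement baseline once
    measurement becomes routine suggests a Hawthorne bump rather than
    durable improvement.
    """
    baseline = mean(pre)
    early_lift = mean(post[:bump_window]) - baseline   # novelty period
    late_lift = mean(post[bump_window:]) - baseline    # measurement now routine
    return {
        "baseline": baseline,
        "early_lift": early_lift,
        "late_lift": late_lift,
        "bump_faded": early_lift > 0 and late_lift < 0.5 * early_lift,
    }

# Hypothetical weekly productivity scores around an announcement
pre = [100, 98, 101, 99, 100, 102]
post = [110, 112, 109, 111, 103, 101, 102, 100]
print(hawthorne_check(pre, post))  # bump_faded: True
```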

Examples in Practice

Example 1: Monitoring employee computer usage

Announced monitoring:

  • Productivity metrics improve immediately
  • People appear more focused
  • Result: Combination of real focus and gaming (looking busy, minimizing non-work windows)

Effect fades:

  • After weeks, people adapt
  • Find ways to appear productive while doing other things
  • Or focus less as monitoring becomes routine

Example 2: Customer satisfaction surveys

When customers know they'll be surveyed:

  • Employees become extra attentive during survey periods
  • Experience improves
  • Scores go up
  • After survey period, attention drops, scores decline

The measurement itself temporarily improved the experience; it did not create sustainable change.


The Observer Effect

Beyond Hawthorne: Measurement as Intervention

Observer effect: The act of measurement changes the system being measured.

Distinction from Hawthorne:

  • Hawthorne: People change behavior when observed
  • Observer effect: Measurement itself alters what's measured (even independent of awareness)

Examples

Example 1: Asking about voting intentions

Phenomenon: Surveying people about voting makes them more likely to vote.

Mechanism:

  • Being asked activates identity ("Am I a person who votes?")
  • Creates commitment (stated intention)
  • Increases salience (voting is now top of mind)

Result: Polls don't just measure voting intention—they increase it.


Example 2: Weighing yourself daily

Phenomenon: Daily weigh-ins change weight beyond just awareness.

Mechanism:

  • Weight becomes salient daily
  • Each weigh-in is decision point
  • Creates short feedback loops
  • Motivates micro-adjustments

Result: Daily weighing doesn't just track weight—it influences it.


Example 3: Tracking work hours

Phenomenon: Time tracking changes how people work.

Mechanism:

  • Awareness of time passing alters pace
  • Tasks get broken into trackable units
  • Non-trackable work (thinking, collaboration) may decrease
  • Billing by hour creates incentive to work slowly

Result: Time tracking changes both productivity and work quality.


"What Gets Measured Gets Managed"

The Principle

Common saying: "What gets measured gets managed."

Meaning:

  1. Attention: Measured things receive focus
  2. Accountability: Metrics create responsibility
  3. Improvement: Measurement enables optimization
  4. Prioritization: Unmeasured things deprioritized

When It's Good

Beneficial cases:

Situation            | Metric               | Positive Behavior Change
---------------------|----------------------|-------------------------------------------
Vague goals          | Define clear metric  | Creates focus, enables coordination
Hidden performance   | Make visible         | Identifies problems, highlights successes
No feedback          | Provide measurement  | Enables learning, adjustment
Ambiguous priorities | Measure what matters | Aligns team efforts

Example: Safety in manufacturing

Before measurement:

  • Accidents happen but not systematically tracked
  • No visibility into causes
  • Safety practices vary by manager

After measurement (days since accident, incident reports):

  • Accidents visible
  • Patterns identified
  • Improvement possible
  • Safety becomes priority

Measurement improved outcomes.


When It's Bad: Goodhart's Law

Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure."

Why:

  • People optimize for metric, not underlying goal
  • Gaming, distortion, tunnel vision
  • Metric decouples from what it was meant to represent

Example: Teaching to standardized tests

Metric: Student test scores

Intended goal: Improve student learning

What happened:

  • Teachers optimize for test performance specifically
  • Narrow curriculum to tested topics
  • Teach test-taking strategies
  • Actual learning breadth decreases
  • Scores rise, learning quality questionable

Metric (test scores) became target, ceased to represent learning.


The Dual Nature

"What gets measured gets managed" is both:

Aspect         | Positive                              | Negative
---------------|---------------------------------------|------------------------------------------------------
Focus          | Directs attention to important areas  | Ignores unmeasured but important factors
Feedback       | Enables learning and improvement      | Enables gaming and optimization for metric
Accountability | Creates responsibility                | Creates pressure to hit numbers regardless of method
Alignment      | Coordinates efforts toward goals      | Coordinates efforts toward metrics (which may diverge from goals)

Key: Whether measurement improves or degrades performance depends on metric design and organizational response.


Positive Uses of Measurement's Influence

Strategy 1: Measure What Truly Matters

If measurement changes behavior, measure the behavior you want.

Wrong: Measure output (features shipped, calls made)

  • Incentivizes quantity over quality
  • Ignores unmeasured outcomes (user value, deal quality)

Right: Measure outcome (user retention, revenue)

  • Incentivizes actual goal achievement
  • Can't easily game without real improvement
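
As a concrete outcome metric, here is a minimal sketch of n-day user retention computed from a toy event log; the (user_id, date) event format and the seven-day default window are assumptions for illustration:

```python
from datetime import date, timedelta

def day_n_retention(events, n=7):
    """Fraction of users active again n or more days after first use.

    `events` is an iterable of (user_id, date) pairs; a real pipeline
    would query an event warehouse rather than an in-memory list.
    """
    first_seen, retained = {}, set()
    for user, day in sorted(events, key=lambda e: e[1]):
        if user not in first_seen:
            first_seen[user] = day
        elif day >= first_seen[user] + timedelta(days=n):
            retained.add(user)
    return len(retained) / len(first_seen) if first_seen else 0.0

events = [
    ("a", date(2024, 1, 1)), ("a", date(2024, 1, 9)),  # retained
    ("b", date(2024, 1, 2)),                           # never returned
    ("c", date(2024, 1, 3)), ("c", date(2024, 1, 4)),  # returned too soon
]
print(day_n_retention(events))  # 0.333...
```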

Strategy 2: Use Multiple Complementary Metrics

Single metrics get gamed. Balanced metrics resist gaming.

Example: Customer support

Single metric: Response time

  • Gaming: Close tickets fast, reopen later
  • Degraded: solution quality, actual resolution

Balanced metrics:

  • Response time (speed)
  • Customer satisfaction score (quality)
  • First-contact resolution rate (effectiveness)
  • Reopened ticket rate (gaming detection)

Harder to game all simultaneously.
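
A minimal sketch of what such a balanced view might look like, with the ticket fields and thresholds as illustrative assumptions; note how fast responses combined with frequent reopens surface the gaming pattern from the opening example:

```python
def support_scorecard(tickets):
    """Summarize complementary support metrics from a list of ticket dicts.

    Each ticket is assumed to have: response_hours, csat (1-5),
    resolved_first_contact (bool), and reopened (bool).
    """
    n = len(tickets)
    avg_response = sum(t["response_hours"] for t in tickets) / n
    avg_csat = sum(t["csat"] for t in tickets) / n
    fcr_rate = sum(t["resolved_first_contact"] for t in tickets) / n
    reopen_rate = sum(t["reopened"] for t in tickets) / n
    return {
        "avg_response_hours": avg_response,
        "avg_csat": avg_csat,
        "first_contact_resolution": fcr_rate,
        "reopen_rate": reopen_rate,
        # fast responses plus many reopens is the close-then-reopen signature
        "gaming_suspected": avg_response < 2 and reopen_rate > 0.15,
    }

tickets = [
    {"response_hours": 1.0, "csat": 3, "resolved_first_contact": False, "reopened": True},
    {"response_hours": 1.5, "csat": 4, "resolved_first_contact": True, "reopened": False},
    {"response_hours": 0.5, "csat": 2, "resolved_first_contact": False, "reopened": True},
]
print(support_scorecard(tickets))  # gaming_suspected: True
```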


Strategy 3: Communicate the "Why"

Explain the goal behind the metric.

Without "why":

  • Metric feels like arbitrary target
  • Invites gaming
  • Loses meaning

With "why":

  • Metric connected to purpose
  • Gaming feels like betraying mission
  • People focus on goal, not just metric

Example:

Announcement: "We'll track average delivery time."

  • Response: Focus on fast delivery, possibly sacrificing accuracy

Better announcement: "We'll track delivery time because customers need products when promised. Let's aim for fast, reliable delivery."

  • Response: Balance speed with reliability

Strategy 4: Review and Rotate Metrics

Metrics degrade over time as people learn to game them.

Solution:

  • Periodically review whether metrics still predict goals
  • Rotate metrics when gaming becomes problematic
  • Keep people focused on goals, not gaming specific metrics

Example: Rotating quality metrics in manufacturing

  • Year 1: Track defect rate
  • People optimize for defect metric (may hide edge cases)
  • Year 2: Switch to customer return rate
  • Harder to game (real customer impact)
  • Forces focus back on actual quality
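
A minimal sketch of that periodic review, assuming paired historical readings of the metric and the goal it stands in for; the series, window size, and correlation floor are all illustrative:

```python
from statistics import correlation  # Python 3.10+

def metric_still_predicts(metric, goal, window=8, floor=0.4):
    """Compare metric-goal correlation in the earliest vs. most recent window.

    A correlation that collapses over time is the Goodhart signature:
    people have learned to move the metric without moving the goal.
    """
    early_r = correlation(metric[:window], goal[:window])
    recent_r = correlation(metric[-window:], goal[-window:])
    return {"early_r": early_r, "recent_r": recent_r,
            "consider_rotating": recent_r < floor}

# Hypothetical quarterly series: internal defect metric vs. customer returns
defect_metric = [9, 8, 8, 7, 6, 6, 5, 5, 4, 3, 3, 2, 2, 2, 1, 1]
customer_returns = [9, 9, 8, 8, 7, 6, 6, 5, 6, 6, 7, 6, 7, 6, 7, 7]
print(metric_still_predicts(defect_metric, customer_returns))
# consider_rotating: True (the metric keeps falling while returns creep up)
```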

Strategy 5: Combine Quantitative and Qualitative

Numbers can be gamed. Stories reveal gaming.

Quantitative: Response time improved 30%.
Qualitative: Customer feedback reveals issues weren't actually resolved.

Together, these reveal that the metric improved through gaming, not real improvement.


Negative Consequences of Measurement

Problem 1: Tunnel Vision

Focusing intensely on measured aspects blinds you to unmeasured but critical factors.

Example: Hospital emergency department

Metric: Average wait time

Optimization:

  • Triage faster
  • Start treatment quickly (even if not complete)
  • Move patients through system rapidly

Unmeasured but important:

  • Thoroughness of diagnosis
  • Patient understanding of treatment
  • Post-discharge outcomes

Result: Wait times down, but diagnostic errors and readmissions may increase.


Problem 2: Crowding Out Intrinsic Motivation

External measurement can undermine internal drive.

Before measurement:

  • Employees motivated by craft, purpose, autonomy
  • Work quality driven by pride
  • Discretionary effort common

After heavy measurement:

  • Motivation shifts to hitting numbers
  • "Why bother if it's not measured?"
  • Discretionary effort declines

Research finding (Deci & Ryan): Extrinsic rewards and measurement can reduce intrinsic motivation for tasks people previously enjoyed.


Problem 3: Metric Fixation

Mistaking the metric for the goal.

The map becomes the territory.

Example: Academic citations

Original purpose: Citations as proxy for research impact

Metric fixation:

  • Researchers optimize for citation count
  • Strategic citation rings
  • Self-citation
  • Publish incrementally (more papers = more citations)

Result: Citation counts rise but no longer reliably indicate research impact.


Problem 4: Creating Perverse Incentives

Metrics can incentivize opposite of intended behavior.

Example: Surgeon mortality rates

Intent: Measure surgeon skill, improve patient outcomes

Perverse incentive:

  • High-risk patients increase mortality rates
  • Surgeons avoid high-risk patients to protect stats
  • Sickest patients can't find surgeons

Result: Metric meant to improve care creates access barriers for those who most need it.


Managing Measurement's Influence

Principle 1: Accept That Measurement Changes Behavior

Don't pretend measurement is neutral observation.

Instead:

  • Design metrics assuming they'll shape behavior
  • Ask: "If people optimize for this metric, what behavior results?"
  • Choose metrics that incentivize desired behavior

Principle 2: Measure Outcomes, Not Just Outputs

Outputs: Activities (features shipped, calls made)
Outcomes: Results (user value, deals closed)

Outputs are easier to game and less aligned with goals.


Principle 3: Use Metrics to Guide, Not Punish

Measurement for learning vs. measurement for judgment:

Purpose                     | Effect on Behavior
----------------------------|------------------------------------------
Learning and improvement    | Honest reporting, problem-solving focus
Punishment and consequences | Gaming, hiding problems, risk aversion

When metrics tied to punishment:

  • Underreporting of issues
  • Gaming to avoid consequences
  • Optimization for metric, not goal

When metrics used for learning:

  • Transparent sharing
  • Focus on improvement
  • Less gaming

Principle 4: Maintain Qualitative Understanding

Don't let metrics replace actual understanding.

Balanced approach:

  • Use metrics for scale, trends, patterns
  • Use qualitative inputs (conversations, observations, stories) for context, mechanisms, edge cases

Metrics without qualitative insight: Easy to miss gaming, lose context
Qualitative insight without metrics: Hard to scale, spot trends, prioritize


Principle 5: Monitor for Unintended Consequences

Regularly ask:

  • Is the metric still predicting the goal?
  • Are people gaming it?
  • What unmeasured factors are suffering?
  • What perverse incentives have emerged?

If measurement creates more problems than it solves, change or eliminate it.
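
Even a crude detector helps with the gaming question. Here is a minimal sketch that flags the close-then-reopen pattern from the opening support example; the event format and the 48-hour threshold are assumptions:

```python
from datetime import datetime, timedelta

def flag_close_reopen_gaming(events, within=timedelta(hours=48)):
    """Flag tickets that were closed and then reopened shortly afterwards.

    `events` is a list of (ticket_id, action, timestamp) tuples with
    action in {"closed", "reopened"}; a quick reopen after a close is
    the pattern that inflates response-time metrics without resolution.
    """
    last_closed, flagged = {}, set()
    for ticket, action, ts in sorted(events, key=lambda e: e[2]):
        if action == "closed":
            last_closed[ticket] = ts
        elif action == "reopened" and ticket in last_closed:
            if ts - last_closed[ticket] <= within:
                flagged.add(ticket)
    return flagged

events = [
    ("T1", "closed", datetime(2024, 5, 1, 9)),
    ("T1", "reopened", datetime(2024, 5, 1, 15)),  # reopened within 6h: flagged
    ("T2", "closed", datetime(2024, 5, 1, 10)),
    ("T2", "reopened", datetime(2024, 5, 6, 10)),  # five days later: not flagged
]
print(flag_close_reopen_gaming(events))  # {'T1'}
```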


Conclusion: Measurement Is Intervention

Key insight: Measurement is never neutral observation. It's an intervention that changes the system.

Why measurement changes behavior:

  1. Focus: Measured things get attention
  2. Accountability: Metrics create responsibility and evaluation
  3. Feedback: Metrics enable optimization (for better or worse)
  4. Signaling: Measurement communicates priorities

Implications:

Positive potential:

  • Focus attention on what matters
  • Enable learning and improvement
  • Align efforts toward goals
  • Provide feedback for adjustment

Negative risks:

  • Tunnel vision (unmeasured factors neglected)
  • Gaming (optimizing metric appearance, not underlying goal)
  • Crowding out intrinsic motivation
  • Perverse incentives

The path forward:

  • Measure what truly matters (not proxies)
  • Use multiple complementary metrics (resist gaming)
  • Communicate why metrics matter (connect to purpose)
  • Balance metrics with qualitative understanding
  • Monitor for gaming and unintended consequences
  • Accept that measurement shapes behavior—design accordingly

"What gets measured gets managed"—for better or worse.

Design measurement systems assuming they'll change behavior. Because they will.


References

  1. Mayo, E. (1933). The Human Problems of an Industrial Civilization. Macmillan.

  2. Roethlisberger, F. J., & Dickson, W. J. (1939). Management and the Worker: An Account of a Research Program Conducted by the Western Electric Company, Hawthorne Works, Chicago. Harvard University Press.

  3. Goodhart, C. (1975). "Problems of Monetary Management: The U.K. Experience." Papers in Monetary Economics (Reserve Bank of Australia).

  4. Campbell, D. T. (1979). "Assessing the Impact of Planned Social Change." Evaluation and Program Planning, 2(1), 67–90.

  5. Deci, E. L., & Ryan, R. M. (2000). "The 'What' and 'Why' of Goal Pursuits: Human Needs and the Self-Determination of Behavior." Psychological Inquiry, 11(4), 227–268.

  6. Muller, J. Z. (2018). The Tyranny of Metrics. Princeton University Press.

  7. Strathern, M. (1997). "'Improving Ratings': Audit in the British University System." European Review, 5(3), 305–321.

  8. Austin, R. D. (1996). Measuring and Managing Performance in Organizations. Dorset House.

  9. Kerr, S. (1975). "On the Folly of Rewarding A, While Hoping for B." Academy of Management Journal, 18(4), 769–783.

  10. Ridgway, V. F. (1956). "Dysfunctional Consequences of Performance Measurements." Administrative Science Quarterly, 1(2), 240–247.

  11. Pink, D. H. (2009). Drive: The Surprising Truth About What Motivates Us. Riverhead Books.

  12. Kohn, A. (1999). Punished by Rewards: The Trouble with Gold Stars, Incentive Plans, A's, Praise, and Other Bribes. Houghton Mifflin.

  13. Seddon, J. (2008). Systems Thinking in the Public Sector: The Failure of the Reform Regime...and a Manifesto for a Better Way. Triarchy Press.

  14. Levitt, S. D., & Dubner, S. J. (2005). Freakonomics: A Rogue Economist Explores the Hidden Side of Everything. William Morrow.

  15. Power, M. (1997). The Audit Society: Rituals of Verification. Oxford University Press.


About This Series: This article is part of a larger exploration of measurement, metrics, and evaluation. For related concepts, see [Why Metrics Often Mislead], [Goodhart's Law Breaks Metrics], [Designing Useful Measurement Systems], and [Vanity Metrics vs Meaningful Metrics].