Measurement and Metrics Terms You Should Know
Why Metrics Vocabulary Matters
A startup CEO proudly reports: "We have 100,000 users!" (Vanity metric—says nothing about revenue, engagement, or retention)
A product manager tracks: "Page views are up 40%!" (Misleading—could be from bots, confusion, or users repeatedly failing to find what they need)
A marketer celebrates: "We're measuring everything!" (No—you're collecting everything, not measuring what matters)
Imprecise metrics language leads to measuring the wrong things, optimizing for the wrong goals, and making decisions based on meaningless numbers.
Measurement and metrics terminology comes from statistics, business analytics, operations research, and management science. Each term has specific meaning that distinguishes useful metrics from useless ones, predictive indicators from historical records, and actionable data from feel-good numbers.
The maxim "What gets measured gets managed" is usually attributed to Peter Drucker, though he appears never to have said it. The version that better fits reality (Muller, 2018) is "What gets measured gets gamed"—which is exactly why precise metrics terminology matters.
Understanding these distinctions helps you:
- Design metrics that actually drive behavior
- Distinguish signal from noise
- Avoid Goodhart's Law (optimizing metrics instead of goals)
- Communicate clearly about performance
This is the vocabulary that separates data-driven decisions from data-decorated guesses.
Core Metrics Concepts
Metric vs. KPI
Metric:
- Definition: Any quantifiable measure of performance, behavior, or outcomes
- Scope: Broad—anything you can count, rate, or measure
- Quantity: Organizations track hundreds or thousands
- Purpose: Monitor, understand, diagnose
Examples: Page views, support tickets, response time, lines of code, coffee consumption
Key Performance Indicator (KPI):
- Definition: Specific metrics most critical to achieving strategic goals
- Scope: Narrow—the vital few that matter most
- Quantity: Organizations focus on 3-7 primary KPIs per team/level
- Purpose: Drive strategy, make high-stakes decisions, track progress toward goals
Examples: Monthly Recurring Revenue (MRR), Net Promoter Score (NPS), Customer Acquisition Cost (CAC), Gross Margin
The Relationship
All KPIs are metrics, but not all metrics are KPIs.
Visual hierarchy:
All Measurements
└─ Metrics (quantifiable, tracked regularly)
└─ KPIs (strategic, tied to goals)
Selection criteria for KPIs:
- Strategic alignment: Directly reflects progress toward goal
- Actionability: You can influence it through decisions
- Measurability: Can be accurately quantified
- Clarity: Everyone understands what it means
- Timeliness: Updates frequently enough to inform decisions
Common mistake: Calling everything a "KPI" dilutes focus. If you have 50 KPIs, you have 0 KPIs—you have metrics.
Example - SaaS company:
Metrics tracked (dozens):
- Website visitors, trial signups, activation rate, feature usage, support tickets, page load time, server uptime, team velocity, bug count, NPS, referrals...
KPIs (3-5 primary):
- Monthly Recurring Revenue (MRR)
- Customer Churn Rate
- Customer Acquisition Cost (CAC)
- Lifetime Value (LTV)
Application: Track many metrics, but focus leadership attention on the few KPIs that determine success or failure.
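As a minimal sketch (all names and values are hypothetical), the split can be made explicit in code: track everything, but report the KPI subset separately.

```python
# Hypothetical SaaS metrics snapshot; every KPI is also a tracked metric.
metrics = {
    "website_visitors": 48_200,
    "trial_signups": 1_450,
    "activation_rate": 0.38,
    "support_tickets": 212,
    "mrr": 84_000,        # Monthly Recurring Revenue (USD)
    "churn_rate": 0.031,  # monthly customer churn
    "cac": 310,           # Customer Acquisition Cost (USD)
    "ltv": 2_480,         # Customer Lifetime Value (USD)
}

KPI_KEYS = {"mrr", "churn_rate", "cac", "ltv"}  # the strategic subset

kpis = {name: value for name, value in metrics.items() if name in KPI_KEYS}
for name, value in kpis.items():
    print(f"{name}: {value}")
```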
Leading vs. Lagging Indicators
Lagging Indicators
Definition: Metrics that measure past results—outcomes that have already occurred.
Characteristics:
- Historical: Tell you what happened
- Easy to measure: Usually clear, objective
- Hard to influence: Past can't be changed
- Definitive: Actual outcomes, not predictions
Examples:
- Revenue (result of past sales)
- Profit (result of past operations)
- Customer churn (already left)
- Graduation rates (already completed)
- Accidents (already occurred)
Value: Definitive assessment of whether you succeeded.
Limitation: By the time you know, it's too late to change.
Leading Indicators
Definition: Metrics that predict future performance—early signals of likely outcomes.
Characteristics:
- Predictive: Indicate what will happen
- Harder to measure: Often require inference
- Actionable: You can still influence outcome
- Imperfect: Probabilistic, not certain
Examples:
- Sales pipeline (predicts future revenue)
- Employee engagement (predicts retention)
- Product trial rate (predicts conversions)
- Student attendance (predicts graduation)
- Near-miss incidents (predict future accidents)
Value: Early warning system—alerts you to problems before they materialize.
Limitation: Correlation isn't perfect; leading indicators can be wrong.
Why Both Matter
Lagging indicators tell you if you achieved goals (accountability, scorekeeping).
Leading indicators tell you if you're on track to achieve goals (management, course correction).
| Aspect | Lagging Indicator | Leading Indicator |
|---|---|---|
| Timing | Past results | Future predictions |
| Certainty | Definitive | Probabilistic |
| Actionability | Low (too late) | High (can intervene) |
| Ease | Easy to measure | Harder to measure |
| Example | Revenue earned | Sales calls made |
Best practice: Pair leading and lagging indicators.
Example - Weight loss:
- Lagging: Weight on scale (definitive but delayed)
- Leading: Daily calorie intake, exercise minutes (predict future weight)
Application: Don't just track outcomes (lagging). Identify and track the activities/behaviors (leading) that drive those outcomes.
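To make the pairing concrete, here is a hedged sketch that reads a leading indicator (sales pipeline coverage) against the lagging target it is meant to predict. The 3x coverage threshold is a common rule of thumb, not a law, and all figures are illustrative.

```python
# Leading indicator (pipeline coverage) paired with the lagging
# indicator it predicts (quarterly revenue). Figures are illustrative.
quarterly_revenue_target = 500_000  # lagging: only known for sure at quarter end
open_pipeline_value = 1_200_000     # leading: known today, still influenceable

coverage = open_pipeline_value / quarterly_revenue_target
print(f"Pipeline coverage: {coverage:.1f}x")

if coverage < 3.0:  # rule-of-thumb threshold; tune per sales cycle
    print("Warning: pipeline may be too thin to hit the revenue target.")
else:
    print("On track by the leading indicator; keep monitoring.")
```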
Vanity vs. Actionable Metrics
Vanity Metrics
Definition (Eric Ries, Lean Startup): Metrics that look impressive but don't correlate with business success or inform decisions.
Characteristics:
- Make you feel good (big numbers)
- Don't predict revenue or retention
- Don't suggest specific actions
- Easy to manipulate or game
- Lack context (absolute numbers without rates)
Common vanity metrics:
| Metric | Why It's Vanity | Better Alternative |
|---|---|---|
| Total registered users | Includes inactive, churned users | Monthly Active Users (MAU) |
| Total page views | Could be confusion, bots, same user | Unique engaged users |
| Total downloads | Says nothing about usage | Daily Active Users (DAU) |
| Social media followers | Many are bots, inactive | Engagement rate |
| Total revenue | Ignores costs, growth rate | Net profit, MRR growth rate |
Why they're dangerous:
- Create false sense of progress
- Distract from metrics that matter
- Enable self-deception
- Waste resources optimizing wrong things
Example:
- Vanity: "We have 1 million app downloads!"
- Reality: 95% used it once and never returned. Company is dying.
Actionable Metrics
Definition: Metrics that inform specific decisions and suggest clear actions.
Characteristics:
- Tie to business outcomes
- Segmented and context-rich
- Lead to specific interventions
- Hard to game without real improvement
Transformation - Vanity to Actionable:
| Vanity Metric | Actionable Version | What It Tells You |
|---|---|---|
| Total users | Weekly Active Users / Total Signups | Activation rate—how many become engaged |
| Page views | Time to task completion | Whether users find what they need efficiently |
| Revenue | Revenue per user segment | Which segments are profitable |
| Followers | Engagement rate by content type | What content resonates |
Test for actionability: Ask "If this metric changes, what do I do differently?"
- If answer is clear → Actionable
- If answer is "feel good" or "nothing" → Vanity
Application: Audit your dashboard. For each metric, ask: "Does this inform a decision or just make me feel good?" Remove vanity metrics ruthlessly.
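The actionability test lends itself to a tiny worked example: turn a vanity count into a rate that implies an action. Names, numbers, and the threshold below are hypothetical.

```python
# Turning a vanity count into an actionable rate. All figures hypothetical.
total_signups = 100_000        # vanity: can only go up
weekly_active_users = 12_400   # engagement signal

activation_rate = weekly_active_users / total_signups
print(f"Activation rate: {activation_rate:.1%}")  # 12.4%

# The actionability test: a change in this number implies an action.
if activation_rate < 0.20:  # threshold is an assumption; tune per product
    print("Low activation -> fix onboarding before spending more on acquisition.")
```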
Proxy Metrics and Goodhart's Law
Proxy Metrics
Definition: Measurable substitutes that approximate something hard or impossible to measure directly.
Why needed: Some important outcomes are:
- Too delayed (long-term health)
- Too expensive (full user satisfaction survey)
- Too abstract (happiness, understanding)
- Too rare (catastrophic failures you're trying to prevent)
Common proxies:
| True Goal (Hard to Measure) | Proxy Metric (Easier to Measure) |
|---|---|
| Long-term health | Blood pressure, cholesterol, BMI |
| Customer satisfaction | Net Promoter Score (NPS) |
| Learning | Test scores |
| Software quality | Bug count, test coverage |
| Economic health | GDP, unemployment rate |
| Employee happiness | Retention rate, engagement surveys |
The problem with proxies: They're imperfect. Optimizing the proxy doesn't guarantee optimizing the goal.
Example - Education:
- Goal: Deep understanding, critical thinking, creativity
- Proxy: Standardized test scores
- Result: Schools teach to the test (proxy improves, but goal may not)
Goodhart's Law
Definition (Marilyn Strathern's 1997 paraphrase, the form usually quoted): "When a measure becomes a target, it ceases to be a good measure."
Goodhart's original formulation (1975): "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes." In plain terms: people optimize for the measure rather than the underlying goal.
Why it happens:
- Metric is imperfect proxy for goal
- Metric becomes target (measured, rewarded, tracked)
- People game the metric (consciously or unconsciously)
- Metric-goal correlation breaks (metric improves without real progress)
Classic examples:
| Domain | Target Metric | Gaming Behavior | Actual Outcome |
|---|---|---|---|
| Soviet factories | Nail production (by weight) | Made fewer, heavier nails | Unusable products |
| Cobra bounty (India) | Dead cobras turned in | Breed cobras for bounty | More cobras after program ended |
| Hospital wait times | % seen within 4 hours | Patients held waiting in ambulances so the 4-hour clock doesn't start | Gaming the metric, not reducing waits |
| Software engineering | Lines of code written | Write verbose, redundant code | More code, worse quality |
| Academia | Publication count | Publish minimal publishable units, quantity over quality | Citation inflation, replication crisis |
Modern examples:
Social media metrics:
- Target: Engagement (likes, shares, comments)
- Gaming: Outrage content, clickbait, sensationalism
- Result: Engagement up, discourse quality down
Wells Fargo (2016):
- Target: Accounts opened per employee
- Gaming: Opened fake accounts without customer knowledge
- Result: Scandal, fines, reputation damage
COVID-19 testing:
- Target: Low reported case counts
- Gaming: Some locations restricted testing to the most obvious cases
- Result: Fewer confirmed cases on paper, worse outbreak detection
Defending Against Goodhart's Law
Strategies:
1. Use multiple metrics (no single metric captures everything)
- Don't just measure accounts opened; measure legitimate accounts used
- Don't just measure engagement; measure user satisfaction
2. Monitor for gaming (check if metric-goal correlation holds)
- Rising test scores + declining real performance? Gaming likely
3. Rotate metrics (prevents long-term optimization)
- Change what you measure periodically
4. Balance competing metrics (creates trade-offs)
- Revenue AND customer satisfaction
- Speed AND quality
- Quantity AND accuracy
5. Focus on outcomes, not outputs
- Not "lines of code" but "working features deployed"
- Not "sales calls made" but "revenue generated"
6. Tie metrics to actual goals
- Regularly ask: "Is optimizing this metric actually achieving our goal?"
Application: When designing metrics, assume people will game them. How could they optimize the metric without achieving the goal? Design defenses accordingly.
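Defense #2 can be roughly automated: periodically check whether the targeted proxy still correlates with the outcome you actually care about. The sketch below uses made-up data and an arbitrary alert threshold.

```python
# Check whether a targeted proxy metric still tracks the real goal.
# Data and the 0.5 alert threshold are illustrative assumptions.
from statistics import correlation  # Python 3.10+

proxy   = [100, 110, 125, 140, 160, 185, 210, 240]  # e.g. accounts opened
outcome = [ 90,  98, 104, 106, 104,  99,  93,  88]  # e.g. accounts actively used

r = correlation(proxy, outcome)
print(f"Proxy-goal correlation: r = {r:.2f}")  # negative: proxy rose, usage fell

if r < 0.5:
    print("Correlation has broken down: the metric is likely being gamed.")
```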
Validity and Reliability
Validity
Definition: Does the metric actually measure what you think it's measuring?
Types:
Face validity: Appears to measure the construct
- Example: Asking "Are you satisfied?" seems to measure satisfaction
Construct validity: Actually captures the theoretical concept
- Example: Does an IQ test actually measure intelligence, or just test-taking ability?
Predictive validity: Predicts outcomes it should predict
- Example: Do SAT scores predict college success?
Content validity: Covers all aspects of what you're measuring
- Example: Does a customer satisfaction survey cover all dimensions of satisfaction?
Threats to validity:
- Measuring wrong thing (test scores ≠ learning)
- Missing important dimensions (revenue growth without profitability)
- Confounding variables (correlation without causation)
Example - Validity problem:
- Goal: Measure employee productivity
- Metric: Hours worked
- Problem: Invalid—hours ≠ output. Measures time, not productivity.
Reliability
Definition: Does the metric produce consistent results under consistent conditions?
Characteristics:
- Repeatability: Same measurement process yields same result
- Precision: Low random error
- Consistency: Different measurers get same result
Types:
Test-retest reliability: Measure same thing twice, get same result
Inter-rater reliability: Different people measuring get same result
Internal consistency: Multiple items measuring same construct correlate
Threats to reliability:
- Measurement error (inconsistent instruments)
- Subjective judgment (different raters, different results)
- Environmental variation (conditions change between measurements)
Example - Reliability problem:
- Metric: "Employee engagement" rated by managers
- Problem: Unreliable—different managers rate differently, same manager rates differently at different times
The Relationship
Validity: Are you measuring the right thing?
Reliability: Are you measuring consistently?
Ideal: High validity AND high reliability (measuring right thing consistently)
Possible problems:
- High reliability, low validity: Consistently measuring the wrong thing
- High validity, low reliability: Measuring right thing inconsistently (random noise)
- Low both: Useless metric
| Validity | Reliability | Example |
|---|---|---|
| ✅ High | ✅ High | Blood pressure reading with calibrated instrument (measures cardiovascular health consistently) |
| ✅ High | ❌ Low | Customer satisfaction via unstructured interviews (relevant but inconsistent) |
| ❌ Low | ✅ High | Hours worked as productivity measure (consistent but doesn't measure actual output) |
| ❌ Low | ❌ Low | Random number generator (measures nothing and gives different results every time) |
Application: When designing metrics, ask:
- Validity: "Does this actually measure what matters?"
- Reliability: "Will I get consistent results?"
Both are necessary. Neither alone is sufficient.
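As a quick, hedged check of test-retest reliability, you can correlate two rounds of the same measurement taken under unchanged conditions; proper psychometric analysis goes further, but the core idea is this simple. Scores below are hypothetical.

```python
# Test-retest reliability: same employees rated twice, two weeks apart,
# under unchanged conditions. Hypothetical scores on a 1-10 scale.
from statistics import correlation  # Python 3.10+

rating_t1 = [7, 4, 8, 5, 9, 6, 3, 7]
rating_t2 = [6, 5, 8, 4, 9, 6, 4, 7]

r = correlation(rating_t1, rating_t2)
print(f"Test-retest reliability: r = {r:.2f}")  # ~0.93 here

# Conventions vary by field, but roughly:
#   r >= 0.8 -> stable enough to compare across periods
#   r <  0.6 -> too noisy; fix the measurement process first
```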
Advanced Metrics Concepts
Composite Metrics
Definition: Single metric combining multiple sub-metrics, weighted to reflect priorities.
Examples:
- Credit score: Payment history + credit utilization + credit age + types of credit...
- Happiness index: GDP per capita + social support + life expectancy + freedom + generosity + corruption perception
- Developer productivity: Code quality + velocity + bug rate + collaboration
Advantages:
- Captures multi-dimensional concepts
- Provides single scoreboard number
- Allows weighting of priorities
Risks:
- Obscures underlying components (score drops—but why?)
- Weighting is subjective (what's "right" weight?)
- More complex to understand and trust
- Easier to game (optimize easier components, ignore hard ones)
Best practice: Show composite score AND underlying components.
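A minimal sketch of that best practice, assuming components pre-normalized to a 0-100 scale and hypothetical weights: report the composite alongside its parts so a drop in the score is traceable.

```python
# Weighted composite shown with its components. Components, weights,
# and scores are all hypothetical; the weighting is the subjective part.
components = {           # each sub-metric pre-normalized to 0-100
    "code_quality": 82,
    "velocity": 67,
    "bug_rate": 74,      # inverted so that higher = better
    "collaboration": 90,
}
weights = {"code_quality": 0.4, "velocity": 0.2,
           "bug_rate": 0.3, "collaboration": 0.1}  # sums to 1.0

composite = sum(components[k] * weights[k] for k in components)
print(f"Composite: {composite:.1f}")  # 77.4
for name, score in components.items():
    print(f"  {name}: {score} (weight {weights[name]:.0%})")
```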
Ratio Metrics
Definition: Metrics expressed as ratios rather than absolute numbers, providing context.
Why ratios matter: Absolute numbers lack context.
Examples:
| Absolute (Context-free) | Ratio (With Context) | Why Ratio Is Better |
|---|---|---|
| 100 sales | 100 sales / 1000 leads = 10% conversion | Shows efficiency, not just volume |
| $1M revenue | $1M revenue / $500K costs = 2x revenue-to-cost ratio | Shows profitability, not just size |
| 50 bugs | 50 bugs / 10,000 lines of code = 0.5% bug rate | Normalizes by complexity |
| 1,000 complaints | 1,000 complaints / 100,000 customers = 1% | Shows proportion affected |
Key ratios:
- Rates: Events per time period (churn rate, growth rate)
- Proportions: Part-to-whole (market share, conversion rate)
- Efficiency: Output per input (revenue per employee, profit margin)
Application: Convert absolute metrics to ratios for meaningful comparison across time, teams, or companies.
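As a tiny illustration (numbers are hypothetical), the same absolute count tells opposite stories once normalized:

```python
# Normalizing absolute counts into ratios for fair comparison.
def conversion_rate(conversions: int, leads: int) -> float:
    return conversions / leads if leads else 0.0

print(f"Team A: {conversion_rate(100, 1_000):.1%}")   # 100 sales, 1,000 leads  -> 10.0%
print(f"Team B: {conversion_rate(100, 10_000):.1%}")  # 100 sales, 10,000 leads ->  1.0%
```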
Cohort Metrics
Definition: Metrics segmented by groups that share common characteristics or a common experience at the same time.
Why cohorts matter: Aggregated metrics hide important patterns.
Common cohort types:
- Time-based: Users acquired in January vs. February
- Channel-based: Users from Google vs. Facebook
- Feature-based: Users who tried Feature X vs. didn't
- Demographic: Age groups, locations, segments
Example - User retention:
Aggregated: "80% retention rate"
Problem: Mixes old users (high retention) with new users (low retention)
Cohort analysis:
- January cohort: 90% retention
- February cohort: 85% retention
- March cohort: 70% retention
Insight: Retention is declining for new users—something changed. Aggregated metric hid this.
Application: When metrics seem stable but you suspect problems, segment by cohort to reveal hidden patterns.
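The retention example above translates directly into a sketch (cohort sizes and retained counts are hypothetical) showing how the aggregate hides the decline:

```python
# Cohort retention vs. the aggregate number that hides the decline.
cohorts = {                      # signup month -> (users, still active)
    "January":  (1_000, 900),
    "February": (1_000, 850),
    "March":    (1_000, 700),
}

total_users = sum(users for users, _ in cohorts.values())
total_active = sum(active for _, active in cohorts.values())
print(f"Aggregate retention: {total_active / total_users:.0%}")  # ~82%, looks fine

for month, (users, active) in cohorts.items():
    print(f"  {month} cohort: {active / users:.0%}")  # 90%, 85%, 70%: declining
```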
Practical Application
Designing a Metrics System
Framework:
1. Define goals clearly
- What are you trying to achieve? (Strategy)
- What does success look like? (Outcomes)
2. Identify KPIs (3-7 per level/team)
- What metrics best reflect progress toward goals?
- Leading indicators (predict future)
- Lagging indicators (confirm results)
3. Add supporting metrics (dashboard context)
- Metrics that explain KPI movements
- Diagnostic metrics for troubleshooting
4. Test validity and reliability
- Does each metric measure what you think?
- Are measurements consistent?
5. Check for gaming potential
- How could people optimize metrics without achieving goals?
- Add balancing metrics
6. Review and iterate
- Do metrics still align with goals?
- Are you measuring what matters?
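One way to make steps 1-5 concrete (an assumption about encoding, not a standard schema) is a lightweight metric registry that forces each metric to declare its goal, indicator type, gaming risk, and balancing metric:

```python
# A lightweight metric registry encoding steps 1-5 of the framework.
# Field names and the example values are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class MetricSpec:
    name: str
    goal: str              # step 1: the goal this metric serves
    is_kpi: bool           # step 2: part of the strategic subset?
    indicator: str         # step 2: "leading" or "lagging"
    gaming_risk: str       # step 5: how it could be gamed
    balancing_metric: str  # step 5: the counterweight metric

mrr = MetricSpec(
    name="Monthly Recurring Revenue",
    goal="Sustainable revenue growth",
    is_kpi=True,
    indicator="lagging",
    gaming_risk="Discount-heavy deals that inflate MRR but churn quickly",
    balancing_metric="Net revenue retention",
)
print(mrr)
```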
Metric Red Flags
Warning signs of bad metrics:
1. Vanity symptoms:
- You can't explain what action to take if it changes
- It always goes up (no failure mode)
- You celebrate the number but business isn't improving
2. Goodhart's Law symptoms:
- Metric improves but underlying goal doesn't
- Obvious gaming behavior emerges
- People optimize metric instead of customer value
3. Validity problems:
- Metric doesn't correlate with business outcomes
- Proxy has drifted from goal
- Measuring wrong thing entirely
4. Reliability problems:
- Results vary wildly without real change
- Different teams report different numbers for same thing
- Can't reproduce measurements
5. Complexity problems:
- No one understands how it's calculated
- Requires a PhD to interpret
- Changes for unknown reasons
Application: Audit existing metrics against these red flags. If metric fails multiple tests, replace it.
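A hedged sketch of such an audit: score each metric against the five red-flag families and replace it when too many fire. The questions, threshold, and example answers are all assumptions.

```python
# Audit a metric against the five red-flag families above.
# Threshold and example answers are illustrative.
RED_FLAGS = [
    "No clear action if it changes",       # vanity
    "Improves while the goal doesn't",     # Goodhart's Law
    "Doesn't correlate with outcomes",     # validity
    "Varies wildly without real change",   # reliability
    "No one can explain the calculation",  # complexity
]

def audit(metric: str, fired: list[bool], threshold: int = 2) -> None:
    hits = [flag for flag, f in zip(RED_FLAGS, fired) if f]
    verdict = "replace" if len(hits) >= threshold else "keep"
    print(f"{metric}: {len(hits)} red flag(s) -> {verdict}")
    for flag in hits:
        print(f"  - {flag}")

audit("Total page views", [True, True, True, False, False])
```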
The Meta-Principle
Metrics are tools, not goals. They help you understand reality, not replace it.
Peter Drucker (actual quote): "There is surely nothing quite so useless as doing with great efficiency what should not be done at all."
Translation: Measuring the wrong things precisely is worse than not measuring at all—it directs effort toward worthless goals.
The vocabulary of metrics exists to help you:
- Distinguish signal from noise (actionable vs. vanity)
- Predict future from past (leading vs. lagging)
- Measure what matters (validity)
- Measure consistently (reliability)
- Avoid gaming (Goodhart's Law awareness)
Use metrics vocabulary precisely because imprecise language leads to imprecise measurement, which leads to imprecise decisions.
Measure what matters. Ignore what doesn't. Know the difference.
Essential Readings
Metrics Fundamentals:
- Ries, E. (2011). The Lean Startup. New York: Crown Business. [Vanity vs. actionable metrics]
- Croll, A., & Yoskovitz, B. (2013). Lean Analytics. Sebastopol, CA: O'Reilly. [Metrics for startups]
- Kaplan, R. S., & Norton, D. P. (1996). The Balanced Scorecard. Boston: Harvard Business School Press. [Strategic measurement framework]
Goodhart's Law and Gaming:
- Goodhart, C. A. E. (1984). "Problems of Monetary Management: The UK Experience." In Monetary Theory and Practice (pp. 91-121). London: Macmillan.
- Muller, J. Z. (2018). The Tyranny of Metrics. Princeton: Princeton University Press. [Comprehensive critique of metric fixation]
- Campbell, D. T. (1979). "Assessing the Impact of Planned Social Change." Evaluation and Program Planning, 2(1), 67-90. [Campbell's Law]
Measurement Theory:
- Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin. [Validity types]
- Stevens, S. S. (1946). "On the Theory of Scales of Measurement." Science, 103(2684), 677-680. [Measurement scales]
- Carmines, E. G., & Zeller, R. A. (1979). Reliability and Validity Assessment. Beverly Hills: Sage. [Accessible treatment]
Leading and Lagging Indicators:
- Parmenter, D. (2015). Key Performance Indicators (3rd ed.). Hoboken, NJ: Wiley. [KPI design and implementation]
- Marr, B. (2012). Key Performance Indicators: The 75 Measures Every Manager Needs to Know. Harlow: Pearson.
Business Metrics:
- Farris, P. W., Bendle, N. T., Pfeifer, P. E., & Reibstein, D. J. (2010). Marketing Metrics (2nd ed.). Upper Saddle River, NJ: Wharton School Publishing.
- Davenport, T. H., & Harris, J. (2017). Competing on Analytics (Updated ed.). Boston: Harvard Business Review Press.
Practical Application:
- Redman, T. C. (2013). Data Driven: Profiting from Your Most Important Business Asset. Boston: Harvard Business Review Press.
- Provost, F., & Fawcett, T. (2013). Data Science for Business. Sebastopol, CA: O'Reilly.