In 2008, Flickr was one of the most popular photo-sharing platforms on the internet. Their engineering dashboard tracked a metric the team was proud of: lines of code committed per week. Engineers competed informally to top the leaderboard. And gradually, the codebase became bloated with unnecessary features, redundant implementations, and code written to boost the metric rather than to solve user problems. Meanwhile, a small startup called Instagram launched in 2010 with a tiny codebase, focused on a single metric — daily active users — and within 18 months had more users than Flickr had ever achieved.
The Flickr story is a vivid illustration of Goodhart's Law, attributed to British economist Charles Goodhart and popularized in the 1990s: "When a measure becomes a target, it ceases to be a good measure." The insight is not that metrics are useless. It is that metrics must be selected to measure outcomes, not outputs — and that when a metric can be gamed, it will be gamed.
Project metrics are the instruments through which project health is understood and managed. Used well, they provide early warning of problems, quantify progress, and support decisions about resource allocation, scope, and schedule. Used poorly, they produce false confidence, misaligned incentives, and the displacement of substantive work by metric optimization.
"When a measure becomes a target, it ceases to be a good measure." -- Charles Goodhart. The practical corollary for project managers: track the metric that reflects health, not the metric that is easiest to improve.
| Metric Category | What It Answers | Leading or Lagging | Example Metrics |
|---|---|---|---|
| Schedule and velocity | Are we on track? | Leading (velocity trend); lagging (schedule variance) | Sprint velocity, burn-down, schedule variance (SV) |
| Quality | Are we building it correctly? | Both | Defect rate (lagging); test coverage (leading) |
| Scope and requirements | Is the definition stable? | Leading | Scope change rate; requirements approval backlog |
| Team health | Is the team sustainable? | Leading | Blocker age; cycle time per task; unplanned work ratio |
| Stakeholder satisfaction | Are we delivering what stakeholders need? | Lagging | Net Promoter Score; sponsor satisfaction; acceptance rate |
The Leading-Lagging Distinction
The most important structural distinction in project metrics is between leading indicators and lagging indicators.
Lagging indicators measure outcomes that have already occurred. Defect rate, cost variance, schedule variance, and customer satisfaction scores are lagging indicators — they tell you how you have done. They are accurate and objective but provide information after the fact, when options for correction have narrowed.
Leading indicators measure predictors of future outcomes. Velocity trend (is the team getting faster or slower?), blocker age (how long has the critical path been blocked?), and scope change rate (how frequently are requirements changing?) are leading indicators — they tell you where you are going before you get there. They are less precise than lagging indicators but more actionable.
The practical implication: project dashboards should be dominated by leading indicators during active execution and by lagging indicators during retrospectives and post-mortem analysis. A project dashboard showing primarily lagging indicators is a rearview mirror — useful for learning, but not for steering.
Example: the DORA (DevOps Research and Assessment) metrics, developed through research by Nicole Forsgren, Jez Humble, and Gene Kim and continued at Google after its 2018 acquisition of DORA, identify four leading indicators of software delivery performance:
- Deployment frequency (how often do you ship?)
- Lead time for changes (how long from commit to production?)
- Change failure rate (what fraction of changes require rollback?)
- Time to restore service (how quickly do you recover from failures?)
These metrics were selected specifically because they correlate with organizational performance outcomes — revenue growth, customer satisfaction, profitability — rather than merely with development process compliance.
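As a rough sketch of what computing the four metrics looks like in practice — the deployment-log shape and field names below are invented for illustration, not part of any standard schema:

```python
from datetime import datetime

# Hypothetical deployment log. Each record pairs a commit time with its deploy
# time and records whether the change failed and how long restoration took.
deployments = [
    {"commit": datetime(2024, 1, 1, 9),  "deploy": datetime(2024, 1, 1, 15), "failed": False, "restore_hours": 0},
    {"commit": datetime(2024, 1, 2, 10), "deploy": datetime(2024, 1, 3, 11), "failed": True,  "restore_hours": 2},
    {"commit": datetime(2024, 1, 4, 8),  "deploy": datetime(2024, 1, 4, 17), "failed": False, "restore_hours": 0},
    {"commit": datetime(2024, 1, 5, 9),  "deploy": datetime(2024, 1, 5, 12), "failed": False, "restore_hours": 0},
]

def dora_metrics(deployments, period_days):
    """Compute the four DORA metrics over a reporting period."""
    n = len(deployments)
    failures = [d for d in deployments if d["failed"]]
    lead_times = sorted((d["deploy"] - d["commit"]).total_seconds() / 3600 for d in deployments)
    return {
        "deployment_frequency_per_week": n / (period_days / 7),
        "median_lead_time_hours": lead_times[n // 2],
        "change_failure_rate": len(failures) / n,
        "mean_time_to_restore_hours": (
            sum(d["restore_hours"] for d in failures) / len(failures) if failures else 0.0
        ),
    }

metrics = dora_metrics(deployments, period_days=7)
```

The point of the sketch is that all four metrics fall out of one event stream a deployment pipeline already produces — no separate status reporting is required.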
The Five Categories of Project Metrics
Project metrics span five primary categories. Each category provides a different view of project health, and no single category is sufficient on its own.
Category 1: Schedule and Velocity Metrics
Schedule metrics answer "are we on track?" They measure whether work is being completed at the rate required to meet commitments.
Velocity (agile contexts): Story points or tasks completed per sprint. Velocity is most useful as a trend — is the team accelerating, maintaining pace, or slowing? A team with consistent velocity can be reliably planned around; a team with declining velocity requires investigation.
Burn-down / Burn-up: Remaining work against elapsed time. Burn-down shows how much work remains; burn-up shows both how much work has been completed and how much total scope exists. Burn-up charts are more revealing because they make scope changes visible: the completed-work line tracks team pace, while a jump in the total-scope line exposes scope additions that a burn-down chart would disguise as slow progress.
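A small numeric sketch makes the difference concrete. All figures below are invented; the scope jump at sprint 4 simulates a mid-project scope addition:

```python
# Burn-up series for a project whose scope grows mid-way (illustrative numbers).
completed_per_sprint = [10, 12, 11, 10, 12, 11]   # work finished each sprint
scope_per_sprint     = [60, 60, 60, 75, 75, 75]   # total scope; jumps at sprint 4

# Cumulative completed work: the burn-up line.
completed_cumulative = []
total = 0
for done in completed_per_sprint:
    total += done
    completed_cumulative.append(total)

# Burn-down only sees "remaining", which rises at sprint 4 even though
# the team's pace is steady -- it looks like the team slowed down.
remaining = [scope - done for scope, done in zip(scope_per_sprint, completed_cumulative)]

for sprint, (done, scope) in enumerate(zip(completed_cumulative, scope_per_sprint), start=1):
    print(f"sprint {sprint}: completed={done}, scope={scope}, remaining={scope - done}")
```

In the burn-up view the completed line climbs steadily (10, 22, 33, ...) while the scope line jumps from 60 to 75; in the burn-down view the same data shows "remaining" increasing from 27 to 32, which reads as a productivity problem rather than a scope decision.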
Schedule Variance (SV) (traditional contexts): SV = Earned Value - Planned Value, where Planned Value is the budgeted cost of work scheduled and Earned Value is the budgeted cost of work performed. Negative schedule variance means the project is behind schedule.
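A minimal earned-value calculation may make the relationships concrete. The dollar figures are invented; the formulas are the standard earned-value definitions:

```python
def earned_value_summary(pv, ev, ac):
    """Standard earned-value indicators from planned value (PV),
    earned value (EV), and actual cost (AC)."""
    return {
        "schedule_variance": ev - pv,   # SV: negative => behind schedule
        "cost_variance": ev - ac,       # CV: negative => over budget
        "spi": ev / pv,                 # schedule performance index (<1 => behind)
        "cpi": ev / ac,                 # cost performance index (<1 => over budget)
    }

# Example: $100k of work was scheduled by now, work worth $80k was actually
# performed, and $90k was actually spent.
summary = earned_value_summary(pv=100_000, ev=80_000, ac=90_000)
```

Here the project is both behind schedule (SV = -$20k, SPI = 0.8) and over budget (CV = -$10k, CPI ≈ 0.89) — two distinct problems that a single "percent complete" number would blur together.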
Category 2: Quality Metrics
Quality metrics answer "are we building the right thing, correctly?"
Defect rate: Defects found per unit of output (per sprint, per feature, per thousand lines of code). A rising defect rate signals declining quality; a declining rate after a period of focused quality work confirms improvement.
Defect escape rate: Defects that reach the end user as a fraction of total defects found. A high escape rate means quality processes are not catching defects before they affect customers.
Test coverage: The percentage of code or functionality covered by automated tests. Low test coverage is a leading indicator of future quality problems — untested code can change in ways that introduce defects without detection.
Technical debt metrics: Static analysis tools can measure code complexity, code duplication, and dependency cycle counts as proxies for technical debt. Rising complexity metrics predict future maintenance cost increases.
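The defect escape rate described above is a simple ratio; a sketch with invented counts:

```python
def defect_escape_rate(found_internally, found_by_users):
    """Fraction of all known defects that reached end users.
    Lower is better; a rising rate means quality gates are leaking."""
    total = found_internally + found_by_users
    return found_by_users / total if total else 0.0

# Illustrative numbers: QA caught 45 defects this quarter, customers reported 5.
rate = defect_escape_rate(found_internally=45, found_by_users=5)  # 0.1
```

Note that the denominator only includes *known* defects, which is why escape rate should be read as a trend over time rather than an absolute measure of quality.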
Category 3: Scope and Requirements Metrics
Scope metrics answer "are we building what we said we would build, and is the definition stable?"
Scope change rate: The number of requirements added or changed per time period. A high scope change rate indicates either that requirements were poorly defined upfront or that the environment is changing faster than the plan anticipated. Both require attention.
Backlog health: In agile contexts, the ratio of backlog items that are "ready" (fully defined, estimated, and prioritized) to total backlog items. A low ready ratio predicts future delivery disruptions — teams that run out of ready work mid-sprint lose productivity while waiting for requirements to be clarified.
Feature utilization: What percentage of features that were built are actually used? This is the ultimate scope quality metric. Standish Group research has consistently found that 45% of software features are never used and 19% are rarely used. Features that are built but not used represent pure waste.
Category 4: Risk and Issue Metrics
Risk and issue metrics answer "what could prevent us from delivering, and what is currently preventing us?"
Open issue age: The average age of unresolved project issues, particularly those on the critical path. Issues that age without resolution are the most reliable predictor of project delays.
Risk register freshness: When were risks last reviewed and updated? A risk register that has not been reviewed in a month is a historical document, not a management tool.
Blocker count and age: How many blockers are currently active, and how long have they been open? Blockers that remain unresolved for more than a few days typically indicate either that the wrong people are working on them or that escalation is needed.
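A blocker-aging check of this kind is a few lines of code; the threshold and record fields below are illustrative assumptions:

```python
from datetime import date

# Hypothetical blocker list with open dates.
today = date(2024, 3, 15)
blockers = [
    {"title": "Waiting on vendor API keys", "opened": date(2024, 3, 1)},
    {"title": "CI runner outage",           "opened": date(2024, 3, 14)},
]

ESCALATION_DAYS = 3  # assumption: escalate anything blocked for more than a few days

def needs_escalation(blockers, today, threshold_days=ESCALATION_DAYS):
    """Return blockers that have been open longer than the escalation threshold."""
    return [b for b in blockers if (today - b["opened"]).days > threshold_days]

stale = needs_escalation(blockers, today)
```

Running the check daily turns blocker age from a retrospective statistic into a standing escalation trigger.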
Category 5: Team Health Metrics
Team health metrics answer "are the people delivering the work capable, engaged, and sustainable?"
Utilization rate: How much of the team's capacity is committed to planned work? Teams operating at 100% utilization have no buffer for the unexpected — which always happens. The research on optimal utilization rates for complex knowledge work consistently suggests that 70-80% is closer to optimal productivity than 100%.
Meeting load: What fraction of working hours are spent in meetings? Teams with very high meeting loads have insufficient time for deep work. Paul Graham's maker-vs-manager schedule distinction is directly relevant: teams that need extended focused time (engineers, writers, designers) are disproportionately damaged by high meeting loads.
Team satisfaction and psychological safety: Survey measures of team engagement and the degree to which team members feel safe raising concerns. Amy Edmondson's research on psychological safety has established it as a strong predictor of team performance. Teams with low psychological safety deliver lower quality work, surface problems later, and have higher turnover.
The Goodhart's Law Problem in Practice
Every metric can be gamed, and the more consequential a metric becomes, the stronger the incentive to game it. Understanding how specific metrics get gamed helps in selecting metrics that are more resistant to gaming and in interpreting metrics that may be gamed.
Lines of code committed: Gamed by writing verbose code and avoiding refactoring that reduces line count. This metric actively incentivizes the opposite of what it appears to measure.
Story points completed per sprint: Gamed by inflating story point estimates ("estimate padding") so that the same amount of work generates higher velocity numbers. Velocity measured in points is only useful when estimates are honest and consistent.
Test coverage percentage: Gamed by writing tests that exercise code paths without asserting meaningful outcomes, so they pass without actually testing behavior. 80% coverage with shallow tests may leave more defects undetected than 60% coverage with thorough tests.
Customer satisfaction score: Gamed by surveying only customers who have just had a positive interaction, by coaching customers to give positive scores, or by excluding dissatisfied customers from the survey population.
The general principle: any metric that is a proxy for the underlying outcome can be gamed without improving the underlying outcome. The antidote is measuring multiple related proxies simultaneously (gaming all of them at once is harder), and periodically returning to direct measurement of the outcome the proxies are supposed to predict.
The Vanity Metric Problem
Vanity metrics are metrics that look good but do not inform decisions. They are often tracked and reported because they reliably rise over time, creating an impression of progress regardless of whether actual progress is being made.
Common project vanity metrics:
- Cumulative user registrations (grows over time even if active users are declining)
- Total issues closed (grows over time even if new issues are created faster than old ones are closed)
- Lines of documentation written (grows even if documentation quality is poor or outdated)
- Number of features shipped (grows even if features are unused or buggy)
The test for vanity metrics: "If this metric went down, would we do anything different?" If the answer is no — if you cannot name a specific decision you would make based on changes in the metric — the metric is not actionable and does not deserve dashboard space.
Example: Eric Ries, in The Lean Startup (2011), contrasts vanity metrics with actionable metrics using his company IMVU as an example. The company tracked absolute downloads (a vanity metric that grew despite poor product experience) before switching to retention cohorts (an actionable metric that revealed that users were not returning after their first session). The switch from vanity to actionable metrics revealed a product problem that the vanity metric had concealed.
For related frameworks on how to use metrics to drive project adaptation, see planning vs execution explained and project risk management.
What Research Shows About Metric Selection and Organizational Performance
The empirical literature on organizational metrics has expanded substantially since the 1990s, and its findings challenge several common assumptions about how metrics drive project and team performance.
Dr. Nicole Forsgren, then at DevOps Research and Assessment (DORA), led a four-year study of more than 23,000 software practitioners published as Accelerate: The Science of Lean Software and DevOps (IT Revolution Press, 2018). Forsgren and co-authors Jez Humble and Gene Kim used structural equation modeling to identify causal relationships between software delivery practices, metrics, and organizational outcomes. Their central finding was that the four DORA metrics -- deployment frequency, lead time for changes, change failure rate, and time to restore service -- are the strongest predictors of organizational outcomes including revenue growth, profitability, and market share. In the subsequent State of DevOps reports, elite performers deployed code 973 times more frequently than low performers, with lead times for changes 6,570 times faster. The magnitude of these differences is difficult to attribute to factors other than measurement and process discipline.
Andrew McAfee and Erik Brynjolfsson of MIT Sloan School of Management published research in Harvard Business Review (2012) showing that companies that adopted data-driven decision making -- defined as using metrics to guide decisions rather than relying primarily on executive experience or intuition -- were on average 5 percent more productive and 6 percent more profitable than their competitors. The effect was consistent across industries and firm sizes. Critically, the type of data mattered: companies tracking outcome metrics (customer retention, feature adoption, revenue per user) outperformed companies tracking activity metrics (meetings held, reports produced, hours worked) even when both groups used similar volumes of data for decision-making.
Dr. Liz Keogh, an agile consultant and researcher who has worked with organizations including HSBC, BT, and multiple UK government departments, published research through the Lean Agile Exchange from 2014 to 2019 documenting the implementation of flow metrics -- cycle time, throughput, and work-in-progress -- across 47 teams. Teams that implemented visible flow metrics reduced average cycle time by 42 percent over six months without increasing team size or changing technology. The mechanism Keogh identified was behavioral rather than technical: making cycle time visible caused teams to reduce work in progress voluntarily, since individual team members could see the bottlenecks their behavior created. Teams that tracked the metrics in dashboards but did not make them visible in daily work (such as on physical boards or prominent shared screens) showed significantly smaller improvements, averaging only 12 percent cycle time reduction.
Roger Martin of the Rotman School of Management at the University of Toronto, writing in Harvard Business Review (2010), identified what he called the "measurement trap": organizations that optimized for measurable outcomes at the expense of unmeasurable ones consistently underperformed organizations that maintained a broader view of value creation. Martin's research across 20 Fortune 500 companies found that companies that had reduced their metric portfolios to three or fewer financial KPIs underperformed industry peers on both short-term and long-term financial measures, despite appearing more focused. The explanation was that narrow metric portfolios produced blind spots -- critical precursors to future performance (customer satisfaction trajectory, technical quality trends, employee engagement) were deprioritized because they were not on the dashboard.
The MIT Sloan Management Review's annual analytics survey, covering 3,000 executives annually since 2010, has consistently found that the companies that use analytics most effectively are not those with the most sophisticated tools but those with the clearest connection between metrics and decisions. In the 2022 survey, 79 percent of respondents said their organizations collected more data than they could effectively use, while only 29 percent said they could name a specific decision made differently in the past quarter because of metric information. The gap between data collection and decision impact is the defining challenge of project metrics in practice.
Case Studies in Project Metrics: What Organizations Actually Measured and What Happened
Real-world examples provide grounding for the abstract framework of metric selection and the documented failure modes described above.
Microsoft's transformation of the Visual Studio Team from 2010 to 2015, documented by engineering director Sam Guckenheimer and colleagues in a series of IEEE Software papers, provides one of the most transparent accounts of metric system redesign in software development. The team replaced a 47-metric weekly status report (which, according to Guckenheimer's account, "nobody read and everyone resented") with four metrics reviewed daily: deployment frequency, build pass rate, mean time to repair, and customer satisfaction score. Over three years following the transition, deployment frequency increased from once per year to multiple times per day, build pass rate improved from 45 percent to 94 percent, and customer satisfaction scores on the connected products rose by 18 percentage points. Guckenheimer attributed the improvement not to the specific metrics chosen but to the reduction in metric noise: when the team tracked four metrics instead of 47, the signal content of each metric was higher and the response to metric changes was faster.
Etsy, the e-commerce marketplace, documented its metrics evolution on the Code as Craft engineering blog from 2010 to 2016. In 2010, Etsy tracked fewer than 20 engineering metrics; by 2016, following adoption of comprehensive infrastructure monitoring, the number exceeded 500,000 individual metrics. Ian Malpass, head of analytics infrastructure, documented in a 2013 talk at Velocity Conference that the explosion in metric availability paradoxically made decision-making harder, not easier. The solution Etsy implemented was a tiered metric system: 5 "north star" metrics that any engineer could recite from memory, 25 "dashboard" metrics reviewed weekly by team leads, and the full monitoring suite consulted only for incident diagnosis. This architecture -- not the volume of data collected -- was what Malpass credited for Etsy's ability to deploy code 50 times per day with a 99.99 percent availability rate.
NASA's Jet Propulsion Laboratory project metrics system, analyzed by Aaron Shenhar and Dov Dvir of the Stevens Institute of Technology in research published in Reinventing Project Management (2007), tracked 12 distinct project health dimensions beyond the traditional cost-schedule-scope triangle. JPL projects using the expanded metric framework completed within 10 percent of original cost estimates at a rate of 71 percent, compared to 34 percent for comparable aerospace projects using only traditional metrics. The additional dimensions -- technology novelty, team capability, stakeholder complexity, and innovation requirement -- provided leading indicators that traditional metrics did not, enabling intervention before schedule and cost variances became irreversible.
ING Bank's agile transformation, conducted from 2015 to 2018 and studied by researchers from the Rotterdam School of Management, replaced traditional project metrics (milestone completion, budget variance) with customer impact metrics across all 350 engineering squads. Each squad tracked one primary customer outcome metric, one operational health metric, and one team health metric. The bank's annual report documented that squads with defined customer outcome metrics delivered features that increased customer satisfaction scores 2.3 times more often than squads without outcome metrics, despite similar levels of feature volume. The finding confirmed that what teams measure determines what they optimize for -- and that optimizing for customer outcomes rather than development activity produced meaningfully different results.
References
- Forsgren, N., Humble, J. & Kim, G. Accelerate: The Science of Lean Software and DevOps. IT Revolution, 2018. https://itrevolution.com/accelerate/
- Ries, E. The Lean Startup. Crown Business, 2011. https://theleanstartup.com/
- Goodhart, C. "Problems of Monetary Management: The UK Experience." Papers in Monetary Economics, 1975.
- Standish Group. "CHAOS Report 2020." Standish Group, 2020. https://www.standishgroup.com/
- Edmondson, A. The Fearless Organization. Wiley, 2018. https://fearlessorganization.com/
- Doerr, J. Measure What Matters. Portfolio, 2018. https://www.whatmatters.com/
- Fleming, Q. W. & Koppelman, J. M. Earned Value Project Management. Project Management Institute, 2016. https://www.pmi.org/
- Project Management Institute. PMBOK Guide, 7th Edition. PMI, 2021. https://www.pmi.org/
- Graham, P. "Maker's Schedule, Manager's Schedule." PaulGraham.com, 2009. http://www.paulgraham.com/makersschedule.html
- Cohn, M. Succeeding with Agile: Software Development Using Scrum. Addison-Wesley, 2009. https://www.pearson.com/
Frequently Asked Questions
What metrics actually indicate project health versus just looking good?
Meaningful project health metrics measure outcomes and leading indicators, not just activity:
- Delivery against commitments (percentage of committed work completed by promised dates) indicates reliability and planning accuracy better than raw velocity or story points completed, which can be gamed.
- Burn rate versus budget shows financial health: are you spending at the expected rate, and will you run out of budget before completion?
- Defect rates and rework percentages indicate quality: high defect rates or constant rework suggest rushing or insufficient planning.
- Cycle time (how long work items take from start to finish) reveals efficiency better than how much work is in progress.
- Lead time for blockers (how quickly obstacles get resolved) indicates coordination effectiveness.
- Team morale and turnover are leading indicators: declining morale or departures signal problems before they show up in delivery metrics.
- Stakeholder satisfaction, measured through regular check-ins, indicates whether you are building the right thing.
- Requirements stability (how much scope changes) shows whether planning was solid or you are dealing with moving targets.
- Dependency success rate (what percentage of external dependencies deliver on time) reveals integration risks.
- Technical debt accumulation, tracked explicitly, prevents quality decline from hiding behind velocity metrics.
Meaningful metrics ask "Are we delivering value?" and "Are we sustainable?" rather than just "Are we busy?" Avoid vanity metrics: high velocity means nothing if you are building the wrong thing; 100% utilization might indicate burnout, not productivity; zero defects might mean inadequate testing rather than high quality. The test is whether metrics inform decisions: if a metric going up or down would not change your actions, it is not useful. Good metrics reveal problems early enough to fix them, create shared understanding of project state, and drive behaviors that support success rather than gaming the numbers.
How do you measure progress when requirements are unclear or changing?
Measuring progress in uncertain environments requires focusing on learning, value delivery, and validated progress over adherence to a plan:
- Validated learning: how many key assumptions have been tested, user research sessions completed, hypotheses validated or invalidated. This measures progress toward clarity even when deliverables are not final.
- Working software or tangible artifacts: the number of features in user hands, even if incomplete, shows more real progress than features "in development."
- Customer or user feedback: positive feedback, adoption metrics, or willingness to pay indicate you are building valuable things regardless of how scope has changed.
- Iteration frequency: how often are you showing working product to stakeholders and incorporating feedback? High iteration frequency indicates good progress even with changing requirements.
- Decision velocity: how quickly are ambiguities getting resolved, decisions getting made, and uncertainties getting reduced? Progress in uncertain projects often means faster decision-making.
- Scope stability: not expecting zero changes, but tracking the rate of change helps distinguish natural refinement from chaotic thrashing.
- Milestones and checkpoints rather than comprehensive "done": reaching points where you validate direction or reduce uncertainty shows progress even if the destination is not fully defined.
- Capability delivery: focus on what new capabilities users have, not what percent of the original plan is complete.
- Confidence levels: track not just what is done but what you are confident about versus still uncertain; reducing uncertainty is progress.
The key is recognizing that in uncertain projects, progress means learning what to build and validating that you are building the right thing, not just completing tasks on an initial plan that was necessarily imperfect. Metrics should measure whether you are getting smarter about the problem and solution space, not just whether you are executing a plan.
Why do project dashboards often go unused or become outdated?
Project dashboards fail when they create overhead without providing proportional value, becoming maintenance burdens rather than useful tools:
- Dashboard-driven design: teams build dashboards showing what is easy to measure rather than what is useful, producing metrics that do not inform decisions.
- Manual update burden: if maintaining the dashboard requires significant effort (manually updating spreadsheets, copying data between systems), people stop updating it when they are busy, exactly when it would be most useful.
- Complexity: dashboards with dozens of metrics, multiple views, and no clear hierarchy of importance make it impossible to understand project state at a glance.
- One-dashboard-fits-all: executives need high-level health indicators, team members need operational details, and mixing them satisfies no one.
- Lag time: if the dashboard updates weekly but people need daily information, they will use other sources and the dashboard becomes decorative.
- Missing context: "velocity = 37 story points" means nothing without expected velocity, trends, or an indication of whether this is good or concerning.
- Gamed metrics: once teams learn they are evaluated on dashboard metrics, they optimize for the metrics rather than outcomes, making the dashboard an inaccurate representation of reality.
- Disconnection from daily work: if people do not reference the dashboard in standups, planning, or retrospectives, it is not actually guiding work.
- Staleness: static dashboards that never evolve become obsolete as the project changes.
- No drill-down: pure broadcast tools with no interactivity frustrate users who want to understand the details behind summary numbers.
Effective dashboards are automated to minimize update burden, focused on at most 5-10 critical metrics, provide context and trends rather than point-in-time numbers, serve a specific audience's needs, integrate into regular team rhythms, and evolve as project needs change.
How should you track and report risks without creating alarm fatigue?
Effective risk tracking requires prioritization, clear communication, and distinguishing monitoring from alarming. Not all risks deserve equal attention: use risk scoring (probability × impact) to identify which risks are critical (high probability, high impact), which need monitoring (high on one dimension, low on the other), and which can be accepted (low on both). Report only risks that are above a defined severity threshold, actively being managed or requiring a decision, or showing a significant change in status. Do not report every possible thing that could go wrong; that creates noise that makes real threats impossible to identify.
Several practices keep risk reporting credible:
- Consistent categorization and color-coding: critical (requires immediate action), concerning (needs attention soon), monitoring (we are watching it). This allows quick scanning without reading every detail.
- Separate risk registers from status reports: maintain a comprehensive register for tracking all risks, but status reports should highlight only critical or changing risks.
- Trends, not just states: "risk of API delay increasing due to vendor staffing issues" is more informative than "API delay risk = high."
- Mitigation progress: reporting a risk and a mitigation plan, then never reporting whether the mitigation is working, trains people to ignore risk reports.
- Risks retiring as well as materializing: good news about risks that have been resolved or reduced in severity maintains credibility.
- Escalation thresholds: define what triggers moving a risk up the communication chain versus handling it at team level.
- Time-boxed reviews: regular but brief risk discussions as part of status meetings rather than lengthy separate sessions.
- Actionable focus: what decisions do readers need to make, or actions do they need to take? Risks people can do nothing about do not belong in their reports.
Avoid speculation and worst-case catastrophizing: stick to realistic probability and impact assessments. Finally, be honest about uncertainty: it is fine to say "we don't know yet" for emerging risks rather than forcing premature assessments.
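As a small sketch of probability × impact scoring with reporting bands — the band boundaries here are assumptions that teams would calibrate for themselves:

```python
def classify_risk(probability, impact):
    """Score a risk on 1-5 probability and impact scales and assign a
    reporting band. Thresholds are illustrative, not standard values."""
    score = probability * impact  # ranges from 1 to 25
    if score >= 15:
        return "critical"    # requires immediate action; always reported
    if score >= 8:
        return "concerning"  # needs attention soon; reported on change
    return "monitoring"      # tracked in the register, not in status reports

band = classify_risk(probability=4, impact=5)  # score 20 -> "critical"
```

The point of the bands is the filtering rule that follows from them: only the top band reaches status reports unconditionally, which is what prevents the register from becoming alarm noise.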
What metrics help identify when projects need intervention before obvious failure?
Leading indicators of project problems show up before schedule or budget crises become apparent:
- Trend lines matter more than point-in-time metrics: a slight but consistent decline in velocity, a gradual increase in bug counts, or slowly rising cycle times indicate deterioration before catastrophic failure.
- Variation and unpredictability: wide swings in velocity, inconsistent delivery, or frequent surprises suggest underlying instability even if average numbers look acceptable.
- Growing technical debt, tracked explicitly (increasing complexity, rising test execution time, more time spent on bug fixes versus new features), shows compounding quality problems.
- Increasing blockers or dependencies stalling work indicate coordination breakdown.
- Declining stakeholder engagement, measured by meeting attendance, response times, or feedback quality, suggests eroding support.
- Team health signals: working hours increasing, morale declining in surveys, or turnover rising forecast people problems.
- Changing communication patterns: standups getting longer without conveying more information, exploding email threads, or people no longer responding to requests all signal that something is wrong.
- Scope change velocity: not absolute scope change, but a rapid increase in change frequency suggests unstable requirements.
- Planned versus unplanned work ratio: if interrupt-driven work is consuming an increasing percentage of capacity, planned delivery will suffer.
- Missed intermediate milestones, even if the final deadline seems distant: early slips compound into big problems.
- Quality metrics inverting: if defects are increasing while features delivered decrease, you are likely accumulating technical debt unsustainably.
- Declining estimation accuracy: if estimates become increasingly wrong, either complexity is higher than understood or team capacity is changing.
- Resource utilization at extremes: consistent 100% utilization leaves no slack for problems; very low utilization might indicate blockers or morale issues.
The key is establishing baselines and watching for trends, not just reacting to absolute numbers. Effective intervention happens when you notice deteriorating trends and correct course before they become crises.