In January 1986, seven astronauts died when the Space Shuttle Challenger broke apart 73 seconds after launch. The cause was a failed O-ring seal — a known problem. Engineers at Morton Thiokol had explicitly warned NASA management the night before that low temperatures would compromise the seals. They were overruled. The decision to launch was made not because the engineering data supported it, but because of schedule pressure, organizational hierarchy, and the deeply human tendency to filter out information that contradicts what you want to be true.

The Challenger disaster is an extreme example of a universal problem: the gap between how people think they make decisions and how they actually do. Most people believe they weigh evidence rationally and choose the best available option. The research on human judgment — from Kahneman and Tversky's foundational work in the 1970s to the forecasting studies of Philip Tetlock and the organizational decision research of Gary Klein — paints a more complicated picture.

Good decisions do not require certainty, which rarely exists for decisions that matter. They require a process that is systematically better than intuition alone. Several specific tools have strong evidence behind them.


Why Decisions Go Wrong

Before discussing what to do, it is worth understanding why smart people make bad decisions. The causes are largely predictable.

Overconfidence: People consistently overestimate the accuracy of their own judgments. In calibration studies, events assigned a 90 percent probability of occurring happen only about 70 to 75 percent of the time. Experts are not reliably better calibrated than informed non-experts; in some domains they are worse, because domain expertise can increase confidence faster than it increases accuracy.

The inside view: When planning or evaluating a decision, people naturally focus on the details of the specific case — the particular team, the particular market, the particular circumstances. This inside view produces optimistic estimates because the vivid details of "why this case is different" crowd out the base rate evidence on how similar cases have actually turned out.

Confirmation bias: The systematic tendency to seek, interpret, and remember information in ways that confirm existing beliefs. Once a conclusion has formed, the mind preferentially processes evidence that supports it and discounts evidence that challenges it. This operates largely below conscious awareness.

The planning fallacy: Kahneman and Tversky's term for the systematic tendency to underestimate the time, costs, and risks of future actions while overestimating benefits. Projects almost always take longer and cost more than planned. This is not because people are dishonest; it is because the inside view drives planning.

"People make predictions by constructing scenarios of how the future will unfold, not by estimating base rates of similar previous cases. These scenarios are typically optimistic." — Daniel Kahneman, Thinking, Fast and Slow


Expected Value Thinking

The most fundamental tool in decision-making under uncertainty is expected value — the probability-weighted average of possible outcomes.

The expected value of a decision is calculated by: identifying all plausible outcomes, estimating the probability of each, estimating the value of each (positive or negative), multiplying probability by value for each outcome, and summing the results.

Example: You are considering a business investment that will return $200,000 with a 25 percent probability, $50,000 with a 50 percent probability, and lose $30,000 with a 25 percent probability.

Expected value = (0.25 x $200,000) + (0.50 x $50,000) + (0.25 x -$30,000) = $50,000 + $25,000 - $7,500 = $67,500

The expected value is positive, which is relevant information — but it is not the only relevant information. The distribution matters too. The same expected value can arise from a low-variance distribution (highly predictable) or a high-variance distribution (lots of uncertainty about which outcome occurs). Risk tolerance and the ability to absorb the downside scenario should factor into the decision alongside expected value.
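The example above, together with the point about distributions, can be sketched in a few lines of Python. The figures are the ones from the investment example; the standard-deviation step is an illustrative addition showing how two decisions with the same expected value can carry very different risk:

```python
# Expected value and spread for the investment example above.
# Outcomes are (probability, value) pairs; figures come from the text.
outcomes = [
    (0.25, 200_000),   # best case
    (0.50, 50_000),    # middle case
    (0.25, -30_000),   # loss case
]

expected_value = sum(p * v for p, v in outcomes)

# Variance measures how widely outcomes scatter around the expected
# value -- the "distribution matters too" point from the text.
variance = sum(p * (v - expected_value) ** 2 for p, v in outcomes)
std_dev = variance ** 0.5

print(f"Expected value: ${expected_value:,.0f}")   # $67,500
print(f"Std deviation:  ${std_dev:,.0f}")
```

A standard deviation of roughly $83,000 against an expected value of $67,500 makes the uncertainty explicit: the positive expected value coexists with a real chance of the $30,000 loss.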

Where Expected Value Thinking Is Most Useful

Expected value thinking is most valuable when:

  • Decisions are repeated, so outcomes average out toward expected value over many trials
  • Outcomes are measurable enough to assign rough values
  • You are choosing between multiple options, not just accepting or rejecting one

It is least reliable when probabilities are deeply uncertain (not just unknown but unknowable), when outcomes are non-linear (e.g., catastrophic downside scenarios), or when values are genuinely incommensurable (comparing financial outcomes with moral obligations, for example).


Base Rates and the Outside View

The most consistent finding in forecasting research is that base rates — the historical frequency of an outcome across a reference class of similar cases — are systematically underweighted in favor of the specific details of the case at hand.

This pattern was documented extensively by Philip Tetlock in his long-running research on expert political forecasting, later described in "Superforecasting" (2015). The forecasters who performed best — those Tetlock called "superforecasters" — shared a specific habit: they began with the base rate and then adjusted for specific factors, rather than beginning with the specific case and ignoring the base rate.

Reference class forecasting is the formalization of this approach, developed by Kahneman and Tversky and later systematized by Bent Flyvbjerg for large infrastructure projects. The procedure:

  1. Identify the reference class — the population of similar past cases (e.g., "comparable IT system implementations," "similarly sized business expansions")
  2. Identify the base rate for the outcome in question in that reference class (e.g., "what fraction of comparable IT projects were delivered on time and budget?")
  3. Use this base rate as the starting estimate
  4. Adjust based on specific features of the current case that are genuinely different from the reference class average

The adjustment in step 4 should be modest. Most people believe their case is more exceptional than it actually is. The base rate is usually a better estimate than the inside view, even after adjustment.
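The four-step procedure can be sketched as a small Python function. The reference-class data and the adjustment cap below are illustrative assumptions, not real project figures; the cap encodes the point that the step-4 adjustment should be modest:

```python
# Sketch of reference class forecasting (steps 1-4 above).
# Reference-class figures are made up for illustration.

def outside_view_estimate(reference_outcomes, inside_view_adjustment=0.0,
                          max_adjustment=0.10):
    """Start from the base rate of a reference class (steps 1-3), then
    apply a deliberately capped adjustment for case-specific factors
    (step 4). The 0.10 cap is an illustrative assumption."""
    base_rate = sum(reference_outcomes) / len(reference_outcomes)
    # Most cases are less exceptional than they feel, so clamp the
    # inside-view correction to a modest range.
    adjustment = max(-max_adjustment, min(max_adjustment, inside_view_adjustment))
    return base_rate + adjustment

# Hypothetical reference class: 1 = delivered on time/budget, 0 = not.
comparable_projects = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]

# The inside view says "we're far better than average" (+0.30),
# but the cap holds the adjustment to +0.10 above the 30% base rate.
estimate = outside_view_estimate(comparable_projects, inside_view_adjustment=0.30)
print(f"On-time probability estimate: {estimate:.0%}")  # 40%
```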

Flyvbjerg's research on large capital projects found substantial base rates for cost overrun in transportation infrastructure: roughly 45 percent for rail projects, with roads and fixed links lower but still significant. Projects whose inside-view estimates assumed budget compliance routinely overran by 50 percent or more. Had planners started with the base rate, their estimates would have been more accurate and their contingency planning more appropriate.


The Pre-Mortem

The pre-mortem is a technique developed by psychologist Gary Klein and popularized by Kahneman. It is one of the most reliably useful interventions available for improving decision quality in groups.

The standard forward-looking risk analysis asks: "What could go wrong?" This question is addressed in an atmosphere where the plan has already achieved some social momentum — where optimism is the default and raising risks can feel like opposing the team. The result is that known risks are often discussed superficially and novel risks are rarely surfaced.

The pre-mortem reframes the question. At the outset of implementation — before commitment is locked in — the group is told:

"Imagine it is one year from today. We implemented this plan, and it failed. The failure was significant. I want each of you to write down, independently, the most plausible reasons the failure occurred."

Several features make this more effective than standard risk review:

  • The failure is assumed, not speculated: This bypasses the tendency to defend the plan. You are not asked "could this fail?" but "why did it fail?"
  • Independent generation before sharing: Prevents anchoring on the first risk raised and suppresses social conformity effects
  • Prospective hindsight: Research by Deborah Mitchell and colleagues found that people identify more reasons for an event when asked to explain it after imagining it has already occurred than when asked to explain it prospectively

The pre-mortem is particularly effective for surfacing implementation risks — things that seem unlikely given optimistic assumptions but become plausible when execution complexity is considered honestly.

What to Do With Pre-Mortem Output

The goal is not to abandon the plan; it is to identify the top two or three risks and ask: what would we do differently given these? Can any of these risks be mitigated at acceptable cost? If one of the pre-mortem scenarios is highly plausible and the mitigation is prohibitively expensive, that is important information about whether to proceed at all.


Type 1 vs Type 2 Decisions: The Bezos Framework

Jeff Bezos's framework for categorizing decisions, first described in his 2015 letter to Amazon shareholders and elaborated in subsequent communications, has become one of the most widely cited frameworks in management because it addresses a specific failure mode of mature organizations.

Bezos distinguishes between:

Type 1 decisions (one-way doors): Decisions that are consequential, difficult to reverse, and require careful analysis before commitment. Once you walk through the door, returning to the prior state is expensive or impossible. Examples: shutting down a product line, making a large acquisition, entering a new country market, changing a fundamental organizational structure.

Type 2 decisions (two-way doors): Decisions that are reversible at low cost. If you walk through the door and it is the wrong call, you can walk back. Examples: trying a new product feature for a quarter, changing an internal process, hiring for a new role on a provisional basis, running a new marketing channel.

The failure mode Bezos identified is that large organizations apply Type 1 deliberation processes to Type 2 decisions. Multiple approval layers, extensive documentation requirements, cross-functional sign-off, long timelines — these make sense for decisions that cannot be undone but are pure overhead for decisions that can be reversed quickly if they do not work.

The result is that organizations slow down their learning cycles, because the fastest way to learn in complex environments is often to try something, observe the result, and adjust. Type 1 processes applied to Type 2 decisions remove the ability to run fast experiments.

The two decision types at a glance:

  • Type 1 (one-way door): Reversibility low to none. Recommended process: high deliberation (extensive analysis, diverse perspectives, pre-mortem, scenario planning). Examples: acquisitions, major restructuring, platform architecture choices.
  • Type 2 (two-way door): Reversibility high. Recommended process: low deliberation (decide with available information, act quickly, monitor and adjust). Examples: feature tests, process experiments, tactical resource allocation.

The challenge in applying this framework is that the line between Type 1 and Type 2 is not always obvious in advance. Decisions that seem reversible sometimes are not; decisions that seem irreversible sometimes have more flexibility than initially apparent. And there is organizational incentive to misclassify decisions as Type 1 when the actual motivation is risk aversion or bureaucratic caution. The framework requires honest assessment of actual reversibility, not desired reversibility.
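One way to force the honest-reversibility assessment is to classify by the estimated cost of undoing the decision rather than by gut feel. The sketch below is an illustrative heuristic, not Amazon's actual process; the 50 percent threshold and the function names are assumptions:

```python
# Illustrative routing of decisions by actual reversibility.
# The 0.5 threshold is an assumed heuristic, not a published rule.

from enum import Enum

class DecisionType(Enum):
    TYPE_1 = "one-way door: high deliberation"
    TYPE_2 = "two-way door: decide fast, monitor, adjust"

def classify(reversal_cost, decision_value):
    """Classify by actual (not desired) reversibility: if undoing the
    decision would cost a large fraction of its value, treat it as
    Type 1 and apply the heavier process."""
    if reversal_cost >= 0.5 * decision_value:
        return DecisionType.TYPE_1
    return DecisionType.TYPE_2

# A feature test that is cheap to roll back vs. a change that is not.
print(classify(reversal_cost=5_000, decision_value=100_000).name)   # TYPE_2
print(classify(reversal_cost=80_000, decision_value=100_000).name)  # TYPE_1
```

Making the reversal-cost estimate explicit also counters the misclassification incentive the paragraph above describes: a claim that something is a one-way door has to be backed by a number.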


Separating Decision Quality from Outcome Quality

One of the most important — and least practiced — habits in decision improvement is evaluating decisions on the quality of their process rather than the quality of their outcomes.

A decision can be excellent and still produce a bad outcome, because outcomes under uncertainty are partly determined by factors outside the decision-maker's control. A decision can be poor — based on inadequate analysis, biased information, and wishful thinking — and still produce a good outcome by chance. Evaluating only outcomes provides distorted feedback.

Annie Duke, former professional poker player and author of "Thinking in Bets," uses the term resulting for the error of judging the quality of a decision by its outcome. Resulting produces systematic mislearning: you reinforce the habits that happened to work in specific cases rather than the habits that have positive expected value over many cases.

The antidote is prospective documentation: writing down the reasoning behind decisions before the outcome is known, including:

  • The options considered and rejected, and why
  • The key uncertainties and how they were estimated
  • The expected distribution of outcomes
  • The confidence level in the assessment

Reviewing these notes after outcomes are known allows honest assessment of where the reasoning was sound and where it failed — independently of whether the outcome was good or bad. This feedback loop, maintained consistently over time, is the most reliable mechanism for genuine improvement in decision quality.
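A prospective decision record can be as simple as a structured data type with the four fields listed above. This is a minimal sketch; the field names and the example entry are illustrative:

```python
# A minimal prospective decision log: reasoning is recorded before the
# outcome is known, and the outcome field is filled in only afterwards.
# All names and figures here are illustrative.

from dataclasses import dataclass

@dataclass
class DecisionRecord:
    decision: str
    options_rejected: list    # options considered and why they lost
    key_uncertainties: list   # uncertainties and how they were estimated
    expected_outcomes: dict   # outcome -> estimated probability
    confidence: float         # stated confidence in the assessment, 0..1
    outcome: str = ""         # left empty until the outcome is known

record = DecisionRecord(
    decision="Enter market X in Q3",
    options_rejected=["Wait a year: competitor would entrench"],
    key_uncertainties=["Regulatory approval timing (est. 70% by Q2)"],
    expected_outcomes={"profitable in 18 months": 0.6,
                       "break even": 0.3,
                       "exit": 0.1},
    confidence=0.6,
)

# Later, once the outcome is known, fill it in and review the reasoning
# against what happened -- not just whether the result was good.
record.outcome = "profitable in 22 months"
```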


Calibration: Knowing What You Know

Calibration is the alignment between confidence and accuracy: a well-calibrated person who says they are 80 percent confident in a prediction is right approximately 80 percent of the time. A poorly calibrated person who says they are 90 percent confident may be right only 60 percent of the time.

Calibration is trainable. The research from Tetlock's Good Judgment Project found that individuals who practiced probabilistic forecasting — expressing beliefs as specific probabilities, tracking outcomes, and reviewing results — improved their calibration substantially over time. The practice required:

  • Expressing beliefs as numbers ("70 percent likely" rather than "probably")
  • Tracking predictions against outcomes in a structured log
  • Decomposing predictions into component estimates where possible
  • Updating estimates as new information became available

The benefits extended beyond the specific domains being forecasted. Calibration training appears to generalize, making reasoners better at knowing the limits of their knowledge across domains.

A practical starting point: for decisions with uncertain outcomes, practice stating your confidence as a probability. If you think a project will finish on time, how confident are you — 60 percent? 80 percent? Track whether your 80 percent predictions actually occur 80 percent of the time. Most people will initially find they are overconfident. The feedback is uncomfortable and productive.
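The tracking step can be sketched directly: group logged predictions by stated confidence and compare each bucket's stated probability with its observed hit rate. The prediction log below is made-up data for illustration:

```python
# Calibration check over a prediction log: for each stated-confidence
# bucket, compare the stated probability with the observed hit rate.
# The log entries are invented for illustration.

from collections import defaultdict

# Each entry: (stated confidence, did the predicted event happen?)
log = [
    (0.8, True), (0.8, True), (0.8, False), (0.8, False), (0.8, True),
    (0.6, True), (0.6, False), (0.6, True), (0.6, False), (0.6, False),
]

buckets = defaultdict(list)
for confidence, happened in log:
    buckets[confidence].append(happened)

for confidence in sorted(buckets):
    hits = buckets[confidence]
    hit_rate = sum(hits) / len(hits)   # True counts as 1
    print(f"Stated {confidence:.0%} -> observed {hit_rate:.0%} "
          f"({len(hits)} predictions)")
# Here the 80% bucket lands at 60% and the 60% bucket at 40%:
# the pattern of overconfidence the text describes.
```

With real data the buckets accumulate over months; the gap between stated and observed frequency is the overconfidence the text predicts most people will find at first.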


A Framework for High-Stakes Decisions

Combining these tools, a practical decision process for decisions that genuinely matter looks like this:

  1. Identify the reference class and find the base rate for the outcome you care about in that class. Start with the outside view.

  2. Apply expected value thinking: map out the plausible scenarios, estimate probabilities and values, compute expected value, and note the uncertainty around the estimates.

  3. Run a pre-mortem: assume failure and generate reasons why. Address the top risks before commitment.

  4. Classify the decision: is it Type 1 (low reversibility, high deliberation warranted) or Type 2 (high reversibility, faster process appropriate)?

  5. Document the reasoning: write it down before the outcome is known.

  6. Review after outcomes are known: evaluate the reasoning quality independently of the outcome.

This process will not eliminate bad outcomes. Uncertainty is real, and some fraction of good-process decisions will still produce bad outcomes. But over time, decisions made through this process will outperform decisions made by intuition alone, because the process is designed to counteract the specific biases — optimism, base rate neglect, confirmation bias, outcome dependency — that most predictably lead smart people to wrong conclusions.

Frequently Asked Questions

What is expected value thinking in decision making?

Expected value thinking involves estimating the probability of different outcomes and multiplying each by its value, then summing across scenarios to produce an overall expected value for each option. A decision with a 30 percent chance of gaining $100 and a 70 percent chance of losing $20 has an expected value of (0.3 x $100) + (0.7 x -$20) = $30 - $14 = $16. This framework shifts focus from 'what will happen' to 'what is the probability-weighted outcome,' which is more useful under genuine uncertainty.

What is a pre-mortem?

A pre-mortem, introduced by psychologist Gary Klein, is a technique in which a team imagines that a project or decision has already failed and works backward to identify why. Unlike traditional risk analysis, which asks 'what could go wrong?', the pre-mortem assumes failure has occurred and asks 'what happened?' This shifts the cognitive framing and reliably surfaces failure modes that forward-looking risk analysis misses, because it bypasses the planning fallacy and overcomes the social pressure to be optimistic.

What are base rates and why do they matter?

A base rate is the historical frequency of an outcome across a reference class of similar cases. If 60 percent of small businesses fail within five years, that is the base rate for any specific small business, regardless of how exceptional its founder believes it to be. People systematically underweight base rates in favor of specific case details (the planning fallacy). Deliberately starting with the base rate and then adjusting for specific factors — rather than ignoring the base rate — consistently improves prediction accuracy.

What is the difference between Type 1 and Type 2 decisions?

Jeff Bezos's Type 1/Type 2 framework distinguishes between one-way doors (Type 1 decisions that are difficult or impossible to reverse and require extensive deliberation) and two-way doors (Type 2 decisions that are easily reversible and should be made quickly with less process). Most large organizations make the mistake of treating Type 2 decisions like Type 1 decisions, creating bottlenecks and slowing down learning cycles. The framework prescribes matching the level of deliberation to the reversibility of the decision.

How can you improve decision quality without more information?

Several techniques improve decision quality without requiring more data: explicitly considering base rates before diving into case details, using the outside view (asking how similar situations turned out) before the inside view, conducting a pre-mortem to identify blind spots, separating the decision from its outcome to evaluate decision quality on process rather than results, and writing down the reasoning before the outcome is known to enable honest retrospective analysis.