Philip Tetlock spent two decades studying the accuracy of expert predictions. He recruited nearly 300 experts — political scientists, economists, intelligence analysts, policy specialists — and asked them to make specific, testable predictions about political and economic events. He collected over 27,000 predictions and tracked their accuracy against outcomes.
The headline finding was devastating for expert confidence: most experts performed barely better than chance. But buried in the data was a more interesting finding. Some forecasters were substantially better than others — not because of their credentials, their field, or their ideology, but because of how they thought about their predictions.
The better forecasters treated their predictions as hypotheses to be tested and refined, not positions to be defended. They kept records. They tracked accuracy. They updated their views when evidence came in. They were genuinely interested in where they had been wrong and why. They treated every resolved prediction as a learning event.
The worse forecasters did none of this. They rarely reviewed their old predictions. When outcomes were unfavorable, they explained the failure as an aberration or attributed it to factors outside their model. They moved on without updating. They accumulated decades of experience without accumulating accuracy.
The difference was not intelligence, credential, or access to information. It was epistemic hygiene: the discipline of maintaining an honest record of your thinking and subjecting it to scrutiny when outcomes are known.
This is the core of decision journaling: creating a written record of your reasoning at the time of a decision, then reviewing that record against outcomes to identify systematic errors you would otherwise never discover. It is among the most direct, evidence-backed tools available for improving the quality of human judgment over time — not the only tool, but one of the few that directly addresses the root problem of how we learn (or fail to learn) from experience.
"The biggest source of poor decisions is not lack of information. It is the stories we tell ourselves afterward about why things happened the way they did. A journal is the only antidote to that story-telling." — Annie Duke, Thinking in Bets (2018)
Key Definitions
Decision journal — A structured written record of important decisions, capturing the decision-maker's reasoning, the options considered, predictions, confidence levels, and emotional state at the time of the decision, to enable accurate retrospective evaluation when outcomes become known.
Hindsight bias — The tendency, after outcomes are known, to believe that you always knew they would occur. Also called the "knew-it-all-along" effect. Hindsight bias makes it nearly impossible to learn from experience without explicit records: you rewrite your memory of your prior reasoning to match what actually happened, making it look like your judgment was better than it was.
Outcome bias — The tendency to evaluate the quality of a decision by the quality of its outcome rather than by the quality of the reasoning at the time of the decision. A good decision can produce a bad outcome due to bad luck. A bad decision can produce a good outcome due to good luck. Decision quality and outcome quality are not the same thing.
Resulting — Annie Duke's term for the specific form of outcome bias in which a good outcome reinforces the decision process that produced it, regardless of whether that process was sound. A poker player who wins after calling with a weak hand is "resulting" if the win convinces them the call was good when the probabilities said it was bad — and it leads them to keep making the same bad call.
Calibration — The match between confidence levels and accuracy rates. A well-calibrated forecaster who says they are 70% confident in predictions is right about 70% of the time. A poorly calibrated forecaster who says they are 70% confident is right significantly more or less often. Decision journaling enables calibration tracking: by recording confidence at decision time and tracking outcomes, you can identify whether your confidence levels are systematically inflated or deflated.
Metacognition — Thinking about thinking: awareness and analysis of your own cognitive processes, including how you make decisions, what biases affect you, and how your reasoning could be improved. Decision journaling is applied metacognition: using records to make the reasoning process itself visible and evaluable.
Post-decision rationalization — The cognitive process by which you construct justifications for decisions after they are made, drawing on information available after the decision, as if you had considered it before. Post-decision rationalization is mostly unconscious. It produces memories of decision processes that are more deliberate, more informed, and more rational than the actual processes were. Written records defeat it by capturing the actual reasoning.
Reference forecasting — Recording specific, numerical predictions alongside decisions to enable later accuracy assessment. Instead of recording "I think this will probably work," recording "I think there is a 75% chance this succeeds within 12 months." Numerical predictions can be evaluated against outcomes and tracked over time; verbal impressions cannot.
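To make the arithmetic behind these last two definitions concrete, here is a minimal sketch with hypothetical numbers (not drawn from any study): suppose a journal contains 40 resolved predictions, all recorded at 70% confidence.

```python
# Hypothetical: 40 resolved predictions recorded at 70% confidence,
# of which 22 turned out to be correct.
stated_confidence = 0.70
hits, total = 22, 40

hit_rate = hits / total                         # 0.55
calibration_gap = stated_confidence - hit_rate  # positive means overconfident

print(f"hit rate {hit_rate:.0%}, gap {calibration_gap:+.0%}")
# -> hit rate 55%, gap +15%
```

A forecaster with records like these is overconfident by about 15 percentage points at the 70% level, a fact that only becomes visible once the predictions are written down and tracked.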
The Psychological Case for Keeping Records
The Hindsight Problem
Baruch Fischhoff demonstrated hindsight bias in a landmark 1975 study. He asked participants to read about historical events whose outcomes were unknown to them and to estimate the probability of each possible outcome. After revealing the actual outcomes, he asked participants to recall what probability they had originally assigned to each.
The results were unambiguous: participants consistently misremembered their estimates, recalling the probability they had assigned to the outcome that actually occurred as higher than it had been. Knowledge of the outcome systematically biased recollection of prior belief. Participants believed, falsely, that they had "always known" the outcome was likely.
This finding has been replicated hundreds of times across different contexts, populations, and domains. Hindsight bias is one of the most robust effects in cognitive psychology.
The implication for decision-making is severe: if you try to learn from experience by reflecting on past decisions without written records, you are not actually reflecting on your past decisions. You are reflecting on a reconstructed version of your past decisions that incorporates the knowledge of what happened. The reconstruction almost always makes you look better than you were.
This is not a failure of honesty or character. It is an automatic feature of how human memory works. Memory is reconstructive, not archival — every time you recall an event, you rebuild it partly from scratch using currently available knowledge. Hindsight contamination of memory is not something you can simply decide to avoid; you need structural defenses against it.
"Hindsight bias makes surprises vanish. After an event occurs, people consistently report that they knew the event would happen, that they saw it coming, even when they clearly did not." — Baruch Fischhoff, Journal of Experimental Psychology (1975)
The Outcome Bias Problem
Jonathan Baron and John Hershey documented outcome bias in 1988. They showed that people systematically evaluate the quality of decisions based on outcomes, even when they know the decision-maker could not have anticipated the outcome at the time.
In one experiment, a medical procedure that produced a good outcome was rated as a better decision than the identical procedure that produced a bad outcome — even when participants were told that the procedure was identical in both cases and that outcomes were determined by chance. The good outcome retrospectively made the decision look better.
This creates a specific problem for learning: if you evaluate your past decisions primarily by whether they worked out, you will reinforce processes that got lucky and criticize processes that were sound but produced bad outcomes. Over time, this produces worse judgment, not better. You are learning from noise rather than signal.
Decision journaling separates process evaluation from outcome evaluation by capturing the reasoning at the time of decision — before outcomes are known — and then evaluating that reasoning on its own terms when outcomes are later revealed.
The Experience Trap
It is widely believed that experience automatically produces wisdom: spend enough years making decisions in a domain, and your judgment in that domain will improve. Tetlock's research is one of several lines of evidence suggesting this is wrong in the absence of structured feedback.
Robin Hogarth, in Educating Intuition (2001), distinguished between "kind" and "wicked" learning environments. A kind learning environment is one where feedback is rapid, unambiguous, and accurately related to the quality of the decision — a surgeon knows almost immediately whether a procedure worked; a chess player knows immediately whether a move was sound. In kind environments, experience does build expertise.
A wicked learning environment is one where feedback is delayed, ambiguous, or disconnected from the quality of the decision — a hiring manager may not know for years whether a hire was excellent or poor; a business leader may receive good outcomes from bad strategies due to favorable market conditions. In wicked environments, experience does not automatically build expertise. It may build false confidence instead.
Most high-stakes decisions in life — career choices, relationships, business strategy, investment decisions — occur in wicked learning environments. Decision journaling is a tool for converting wicked learning environments into kinder ones: by creating explicit records of reasoning and tracking outcomes against predictions, you create the feedback loop that would otherwise be absent.
The Decision Journal Format
There is no single required format. The goal is to capture enough to reconstruct your actual reasoning state at the time of the decision. A working format used by many practitioners:
At Decision Time (Required)
The decision: What exactly are you deciding? Write it precisely. Vague decisions produce vague records that cannot be meaningfully evaluated later.
The options you considered: List the alternatives you actually evaluated, not just the one you chose. This captures whether your deliberation was broad or narrow. Reviewing this field often reveals that important options were not considered at all.
The key factors: What information or reasoning drove your choice? What were the most important considerations? Write what you actually found most persuasive — not the most rational-sounding justification. The goal is to record actual reasoning, not ideal reasoning.
Your prediction: What do you expect to happen as a result of this decision? Write it specifically and, where possible, numerically. "I think there is a 70% chance the product reaches 1,000 users within 6 months." Vague predictions — "I think it will probably work out" — cannot be evaluated.
Your confidence level: How confident are you in this decision? A percentage or simple scale (low/medium/high). This enables calibration tracking over time.
Your emotional state: Are you excited, anxious, reluctant, pressured, calm? Emotional state at decision time is important retrospective context. Many systematic errors are driven by emotional state — knowing you were anxious when you made a conservative decision, or excited when you made an optimistic one, helps explain the error later.
What you are missing: What information would you want that you do not have? What is your biggest uncertainty?
What would change your mind: What evidence or event would cause you to conclude this was the wrong decision? This forces you to think about falsifiability and prevents indefinite post-hoc rationalization.
At Review Time (After Outcome Is Known)
What actually happened: Describe the outcome specifically and accurately.
Predicted vs. actual: How did your prediction compare to the actual outcome? Were you right? How far off?
Quality of your reasoning: Looking at your written reasoning at the time, was the process sound? Did you consider the right factors? Were you missing something important? Were your assumptions reasonable given what you knew then?
What you would do differently: Not "what I would have done with hindsight" — but "given only what I knew then, what would have been a better process?"
Pattern identification: Does this entry fit a pattern you have seen in your other entries? Are you consistently overconfident in similar domains? Do you consistently underweight certain types of risk?
Decision Journal Template: What to Record
| Field | At Decision Time | At Review Time |
|---|---|---|
| The decision | Write precisely | Confirm what was decided |
| Options considered | List all alternatives evaluated | Note any you missed |
| Key reasoning | Record actual drivers | Assess whether they were sound |
| Prediction | Specific and numerical | Compare to actual outcome |
| Confidence level | Percentage or simple scale | Note calibration error |
| Emotional state | Record honestly | Identify if it biased reasoning |
| Information gaps | List uncertainties | Note which gaps mattered |
| Change conditions | What would reverse you | Assess whether you held firm correctly |
| Outcome notes | (empty at entry time) | Record what happened |
| Pattern tag | Optional category | Link to similar entries |
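To make the template concrete, here is one way an entry could be represented as a structured record. This is a sketch only; the class and field names are illustrative, not a standard schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DecisionEntry:
    """One journal entry. Decision-time fields are filled when the
    decision is made; review-time fields stay empty until the outcome
    is known."""
    # At decision time
    decision: str                      # what exactly is being decided
    options_considered: list[str]      # alternatives actually evaluated
    key_reasoning: str                 # the actual drivers of the choice
    prediction: str                    # specific, testable expectation
    confidence: float                  # stated probability, 0.0 to 1.0
    emotional_state: str               # e.g. "anxious", "excited", "calm"
    information_gaps: list[str]        # known unknowns at decision time
    change_conditions: str             # evidence that would reverse the call
    pattern_tag: Optional[str] = None  # optional category for later filtering
    # At review time
    outcome: Optional[str] = None              # what actually happened
    prediction_correct: Optional[bool] = None  # did the prediction hold?
    process_notes: Optional[str] = None        # was the reasoning sound?
```

Structuring entries this way makes the pattern reviews described below straightforward to automate.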
What to Review For
The value of a decision journal is not in any single entry. It is in the patterns that emerge across entries over time. Common patterns that decision journals reliably reveal:
Confidence Calibration
Are you consistently overconfident? Many people are: they say 80% confident and are right barely more than half the time. The journal allows you to track this explicitly. If you discover that your 80%-confidence predictions are right 60% of the time, you have learned something immediately actionable: reduce your stated confidence in similar situations by roughly 20 percentage points.
Superforecasters in Tetlock's Good Judgment Project systematically tracked calibration as a metric and updated their confidence habits accordingly. This is the single most directly applicable lesson from the superforecasting research: calibration is improvable, but only if you measure it.
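A sketch of what that measurement could look like across resolved entries, assuming the hypothetical DecisionEntry record sketched earlier (any record with a stated confidence and a resolved correctness flag would work):

```python
from collections import defaultdict

def calibration_report(entries, bucket_width=0.1):
    """Group resolved predictions by stated confidence and compare each
    stated level to the observed hit rate in that bucket."""
    buckets = defaultdict(list)
    for entry in entries:
        if entry.prediction_correct is None:  # skip unresolved entries
            continue
        bucket = round(entry.confidence / bucket_width) * bucket_width
        buckets[bucket].append(entry.prediction_correct)

    for stated in sorted(buckets):
        results = buckets[stated]
        hit_rate = sum(results) / len(results)
        gap = stated - hit_rate  # positive means overconfident
        print(f"stated {stated:.0%}: right {hit_rate:.0%} of the time "
              f"over {len(results)} predictions (gap {gap:+.0%})")
```

A report showing the 80% bucket resolving correctly only 60% of the time is exactly the 20-point adjustment signal described above.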
Domain-Specific Accuracy
You may find that your judgment is well-calibrated in some domains and systematically off in others. Technical judgments about your area of expertise may be accurate; predictions about other people's behavior may be systematically optimistic; market timing predictions may be near-random. This domain-specific knowledge is extremely valuable — it tells you where to trust your instincts and where to seek outside input or apply extra skepticism.
Emotional State Effects
Reviewing entries with their emotional state context often reveals that certain emotional states reliably predict poor decisions. Excitement leads to underestimating risks. Pressure leads to premature convergence on available options. Anxiety leads to excessive caution. Knowing your own emotional signatures for poor judgment is actionable knowledge — you can build in pauses or external review when you notice those states.
Information Gaps
What information were you consistently missing when your predictions failed? Are there categories of information you systematically fail to seek? Do you consistently rely on one type of information and neglect others? A pattern of missing a certain type of information is directly actionable: you can build a checklist that forces you to seek that information before finalizing similar decisions.
Option Set Limitations
Many poor decisions are poor primarily because the best option was never considered. Reviewing the "options considered" field across entries may reveal that you consistently consider only two or three options when more exist, or that you tend to frame decisions as binary when they admit of more creative solutions.
Common Errors in Decision Journaling
Journaling After the Fact
Recording a decision after you already know the outcome defeats the primary purpose. The journal is designed to capture reasoning before outcome knowledge contaminates it. If you cannot record at decision time, record as soon as possible afterward — but note that even a few hours of outcome knowledge can distort the record.
Optimizing for Appearance
Some practitioners unconsciously write journal entries that cast their reasoning in the best possible light — entries designed to look good on review rather than accurately record the actual decision process. The journal is private and the only person deceived by this is yourself. Accurate records of flawed reasoning are more valuable than polished records of fictional reasoning.
Reviewing Too Soon
Reviewing an entry the same week you wrote it does not give you temporal distance or additional outcome information. The most useful reviews happen weeks or months later, when outcomes have developed and when you have enough distance to read your past reasoning without immediately identifying with it.
Ignoring Good Decisions
The instinct is to review entries where things went wrong. But reviewing entries where predictions were accurate is equally important — it tells you what your reliable strengths are, which domains to trust your judgment in, and what patterns of reasoning produce good outcomes.
Evaluating by Outcome Rather Than Process
The most common error in reviewing a decision journal is still evaluating the decision by whether it worked out, rather than by whether the reasoning was sound. If you notice yourself thinking "that was a good decision because it worked," reframe: was the reasoning sound given what you knew then? Outcome information is useful for calibration tracking, but the primary evaluation target should always be the quality of the process.
Who Uses Decision Journals and Why
The practice of maintaining explicit decision records is most developed in domains where the stakes of poor judgment are highest and where feedback loops are long.
Professional investors are among the most consistent users. Howard Marks, co-founder of Oaktree Capital, is known for extensive memos that document his reasoning behind key investment decisions, many of which he has made public. Ray Dalio's principle-based decision-making at Bridgewater is built on similar foundations: explicit rules, tested against outcomes, refined over time. Michael Mauboussin, at Morgan Stanley, has written extensively on why investment firms should maintain decision journals as a standard practice.
Intelligence analysts at the US Central Intelligence Agency have been required to maintain structured analytical records since reforms following intelligence failures in the early 2000s. The explicitly stated purpose is to enable retrospective evaluation of analytical reasoning — to identify what information was available, what assumptions were made, and where reasoning failed.
Medical professionals use morbidity and mortality conferences — structured retrospective reviews of cases where outcomes were poor — as an institutionalized form of the same principle. The goal is to identify reasoning failures while protecting against hindsight bias by examining what information was available at the time of the clinical decision.
In each of these domains, the underlying problem is identical: decisions are made under uncertainty with delayed and ambiguous feedback, and without explicit records, experience does not reliably produce learning.
Building the Habit
Decide What Qualifies
Not every decision warrants a journal entry. A useful heuristic: journal decisions that are consequential enough that you would genuinely want to understand them in retrospect — decisions where being wrong is costly or where understanding your reasoning would be valuable. Career decisions, major financial decisions, significant relationship decisions, important business choices.
For most people, this is one to three entries per week. More frequent recording often produces lower-quality entries because the decisions are too routine to generate genuine insight.
Keep It Fast
Long, elaborate journal entries are hard to maintain. A complete entry should take five to ten minutes. If the format feels burdensome, simplify it: a minimum viable entry is (1) the decision, (2) why you made it, (3) what you predict will happen, and (4) your confidence.
Separate Recording From Review
Do not review entries at the time of recording. The value comes from reading entries weeks or months later, when you have temporal distance from the decision and when outcomes may already be known. Reviewing immediately after writing defeats the purpose.
Build in Scheduled Reviews
Most practitioners recommend a quarterly review in addition to reviewing entries as outcomes resolve. The quarterly review looks for patterns across entries — the kinds of systematic errors that do not show up in any single entry but emerge across many. An annual review identifies longer-term patterns and tracks whether judgment is actually improving.
"The goal is not to evaluate decisions by outcomes. The goal is to get better at the process of deciding — and the only way to improve a process is to have an honest record of what that process actually was." — Shane Parrish, Farnam Street (2017)
Digital vs. Paper Journals
There is no strong evidence that either digital or paper formats are superior. The choice should be based on which you will actually maintain consistently.
Paper journals have the advantage of being linear and bounded: you are less tempted to edit past entries, and the physical record is more difficult to retroactively alter. The limitation is searchability: identifying patterns across entries requires manual review.
Digital journals — whether a dedicated app, a spreadsheet, or a notes application — enable tagging, searching, and filtering that makes pattern identification much easier. A spreadsheet-based journal can track confidence levels and outcomes quantitatively, enabling systematic calibration analysis. The limitation is that digital records are easy to edit, making it technically possible to alter past entries, and the temptation to polish past reasoning is greater.
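As a sketch of that quantitative workflow, assuming a hypothetical CSV export with confidence and correct columns (the column names and layout are illustrative, not a standard):

```python
import csv

def load_resolved(path):
    """Read resolved predictions from a CSV journal export. Assumes
    hypothetical columns: 'confidence' (0-100) and 'correct'
    (yes/no, or blank for entries not yet resolved)."""
    pairs = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            answer = row["correct"].strip().lower()
            if answer in ("yes", "no"):  # skip unresolved rows
                pairs.append((float(row["confidence"]) / 100, answer == "yes"))
    return pairs
```

The resulting (probability, outcome) pairs can feed a calibration table like the one sketched earlier, or an accuracy score like the one sketched in the next section.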
Hybrid approaches — initial entry on paper, periodic transfer to a searchable digital format — capture the advantages of both.
The Relationship to Other Decision Tools
Decision journaling pairs with, but is distinct from, several related practices.
Pre-mortem analysis (imagining the decision has failed and working backward to identify why) is a tool used before a decision is made, to identify risks and failure modes. Decision journaling documents the process at the time of the decision and enables retrospective review. The two are complementary: pre-mortem findings should be recorded in the journal entry as part of the reasoning process.
Probabilistic thinking — the practice of assigning explicit probability estimates rather than binary predictions — is most useful in combination with journaling. Probabilities that are not recorded cannot be tracked for calibration. Journaling provides the record; probabilistic thinking provides the numerical content that makes systematic calibration assessment possible.
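One standard way to turn those recorded probabilities into a single trackable accuracy number, central to the superforecasting research cited below though not prescribed by the journaling practice itself, is the Brier score: the mean squared difference between each stated probability and the 0-or-1 outcome, where lower is better. A minimal sketch:

```python
def brier_score(forecasts):
    """Mean squared error between stated probabilities and outcomes.
    forecasts: iterable of (probability, came_true) pairs.
    0.0 is perfect; always guessing 50% scores 0.25."""
    pairs = list(forecasts)
    return sum((p - (1.0 if came_true else 0.0)) ** 2
               for p, came_true in pairs) / len(pairs)

# Three resolved journal predictions (hypothetical):
print(brier_score([(0.75, True), (0.60, False), (0.90, True)]))  # ~0.14
```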
Base rate research — looking up reference class data on outcomes of similar decisions before making a final judgment — is relevant at the point of constructing predictions in the journal entry. Recording what base rates you consulted and how you weighted them against inside-view reasoning is useful retrospective information.
References
- Fischhoff, B. (1975). Hindsight ≠ foresight: The effect of outcome knowledge on judgment under uncertainty. Journal of Experimental Psychology: Human Perception and Performance, 1(3), 288-299.
- Baron, J., & Hershey, J. C. (1988). Outcome bias in decision evaluation. Journal of Personality and Social Psychology, 54(4), 569-579.
- Duke, A. (2018). Thinking in Bets: Making Smarter Decisions When You Don't Have All the Facts. Portfolio/Penguin.
- Tetlock, P. E. (2005). Expert Political Judgment: How Good Is It? How Can We Know? Princeton University Press.
- Tetlock, P. E., & Gardner, D. (2015). Superforecasting: The Art and Science of Prediction. Crown Publishers.
- Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.
- Hogarth, R. M. (2001). Educating Intuition. University of Chicago Press.
- Parrish, S. (2017). The decision journal. Farnam Street Blog. fs.blog/decision-journal/
- Russo, J. E., & Schoemaker, P. J. H. (2002). Winning Decisions: Getting It Right the First Time. Currency/Doubleday.
- Marks, H. (2011). The Most Important Thing: Uncommon Sense for the Thoughtful Investor. Columbia University Press.
- Mauboussin, M. J. (2012). The Success Equation: Untangling Skill and Luck in Business, Sports, and Investing. Harvard Business Review Press.
Frequently Asked Questions
What is a decision journal?
A decision journal is a written record of your reasoning at the time you make an important decision, including the options you considered, your prediction, and your confidence level. You review it later against actual outcomes to identify systematic errors in your thinking.
Why keep a decision journal instead of just reflecting on outcomes?
Without a written record, hindsight bias causes you to rewrite your memory of your reasoning to match what actually happened. A journal captures your actual thinking before outcomes were known, making honest evaluation possible.
What should you include in a decision journal entry?
At minimum: the decision, the options you considered, the key reasoning, a specific prediction, your confidence level, and your emotional state. The goal is to capture enough to reconstruct your actual reasoning state later.
How often should you review a decision journal?
Review entries when outcomes become clear enough to evaluate, plus a periodic full review (quarterly or annually) to identify patterns across multiple decisions. Never review immediately after writing — you need temporal distance.
Who developed decision journaling?
Annie Duke and Farnam Street's Shane Parrish are the most prominent advocates. The underlying research on hindsight bias and outcome bias comes from Baruch Fischhoff, Jonathan Baron, and Daniel Kahneman.
What is the difference between outcome quality and decision quality?
A decision is high-quality if the reasoning was sound given information available at the time, regardless of whether the outcome was good. Outcome bias — judging decisions by results rather than process — is the mistake decision journaling is designed to correct.
Is a decision journal the same as a regular diary?
No. A regular diary records what happened and how you felt. A decision journal records reasoning and predictions specifically to enable later evaluation of whether your thinking process was sound.