Definition
Chain-of-Thought (CoT) is a prompting technique that enhances the reasoning capabilities of LLMs by requiring the model to explicitly articulate intermediate reasoning steps before providing the final answer.
Instead of requesting a direct answer, you ask the model to “think step by step,” producing an explicit, inspectable chain of reasoning that leads to the conclusion.
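As a minimal illustration, compare a direct prompt with its CoT counterpart (the question is a standard puzzle used here only as an example):

```python
# Direct prompt: requests only the answer.
direct_prompt = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)

# CoT prompt: the same question, plus an explicit request for intermediate steps.
cot_prompt = direct_prompt + (
    "\nThink step by step and show your reasoning before giving the final answer."
)
```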
Variants
Few-shot CoT: provide examples of problems solved with explicit reasoning. The model learns the pattern and applies it to new problems.
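A minimal sketch of a few-shot CoT prompt, modeled on the worked examples in Wei et al. (2022):

```python
# One solved example with explicit reasoning, followed by the new problem.
few_shot_cot_prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?
A:"""
```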
Zero-shot CoT: simply add “Let’s think step by step” (or equivalent) to the prompt, without examples. Surprisingly effective on modern models.
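The zero-shot variant is a one-line change; the trigger phrase comes from Kojima et al. (2022):

```python
question = "If a train travels 60 km in 45 minutes, what is its average speed in km/h?"

# Zero-shot CoT: append the trigger phrase; no worked examples are needed.
zero_shot_cot_prompt = question + "\nLet's think step by step."
```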
Self-Consistency: generate multiple independent chain-of-thought responses and select the most frequent answer. Improves accuracy at the cost of additional compute.
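A minimal self-consistency sketch, assuming a placeholder `generate` function (not a real API) that samples one CoT completion at nonzero temperature and returns the extracted final answer:

```python
from collections import Counter

def self_consistency(generate, prompt, n=10):
    """Sample n independent CoT completions and majority-vote on the
    final answer. Answer extraction is assumed to happen inside
    `generate` (e.g. parsing the text after "The answer is")."""
    answers = [generate(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```

The gain comes from marginalizing over reasoning paths: different chains that converge on the same answer reinforce it.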
Tree-of-Thought: explore multiple reasoning paths in parallel, with backtracking. More costly but more powerful for complex problems.
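A highly simplified beam-style sketch of the idea; real Tree-of-Thought implementations use richer search strategies (BFS/DFS with genuine backtracking), and `expand` and `score` here are placeholders for model calls:

```python
def tree_of_thought(expand, score, root, width=3, depth=2):
    """Explore a tree of partial reasoning states. `expand(state)` is
    assumed to sample candidate next steps from the model; `score(state)`
    is assumed to rate a partial solution so weak branches get pruned."""
    frontier = [root]
    for _ in range(depth):
        candidates = [nxt for state in frontier for nxt in expand(state)]
        # Keep only the most promising partial solutions at each level.
        frontier = sorted(candidates, key=score, reverse=True)[:width]
    return max(frontier, key=score)
```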
When to Use It
Effective for:
- Mathematical and arithmetic problems
- Multi-step logical reasoning
- Tasks requiring decomposition
- Questions that benefit from explicit process articulation
Less useful for:
- Simple tasks (factual lookups, direct classification)
- Creative generation
- Tasks where “reasoning” is not the bottleneck
Practical Considerations
Costs: CoT produces longer output (the reasoning tokens themselves), so each request is more expensive. For high-volume applications, evaluate whether the accuracy improvement justifies the extra spend.
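A back-of-the-envelope comparison; every number below is a hypothetical placeholder, not real pricing:

```python
# Hypothetical figures for illustration only.
price_per_1k_output_tokens = 0.002   # assumed dollar rate
direct_tokens = 20                   # assumed length of a direct answer
cot_tokens = 300                     # assumed length of answer plus reasoning trace
requests_per_day = 100_000

direct_cost = requests_per_day * direct_tokens / 1000 * price_per_1k_output_tokens
cot_cost = requests_per_day * cot_tokens / 1000 * price_per_1k_output_tokens
print(f"direct: ${direct_cost:.2f}/day  CoT: ${cot_cost:.2f}/day")  # $4.00 vs $60.00
```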
Latency: more tokens generated = more time. For real-time applications, CoT may be too slow.
Interpretability: explicit reasoning makes output more verifiable. Useful for debugging and understanding where the model fails.
Common Misconceptions
“CoT makes the model reason like a human”
No. It produces output that resembles human reasoning, but the underlying process remains token prediction. The model can generate convincing but incorrect reasoning.
“More steps = better answer”
Not necessarily. Unnecessary steps can introduce errors or confusion. The reasoning must be pertinent to the problem.
“CoT always outperforms direct prompting”
On simple tasks, CoT can degrade performance by adding unnecessary complexity. It’s a technique, not a universal solution.
Related Terms
- Prompt Engineering: broader discipline of which CoT is a technique
- LLM: models to which CoT applies
Sources
- Wei, J. et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS
- Kojima, T. et al. (2022). Large Language Models are Zero-Shot Reasoners. NeurIPS
- Wang, X. et al. (2023). Self-Consistency Improves Chain of Thought Reasoning in Language Models. ICLR