Definition
Chain-of-Thought (CoT) is a prompting technique that enhances the reasoning capabilities of LLMs by requiring the model to explicitly articulate intermediate reasoning steps before providing the final answer.
Instead of requesting a direct answer, you ask the model to “think step by step,” producing an explicit, inspectable chain of reasoning that leads to the conclusion.
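As a minimal illustration, compare a direct prompt with its CoT counterpart (the question is a standard puzzle used here only as an example):

```python
# Direct prompt: requests only the answer.
direct_prompt = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)

# CoT prompt: the same question, plus an explicit request for intermediate steps.
cot_prompt = direct_prompt + (
    "\nThink step by step and show your reasoning before giving the final answer."
)
```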
Variants
Few-shot CoT: provide examples of problems solved with explicit reasoning. The model learns the pattern and applies it to new problems.
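A minimal sketch of a few-shot CoT prompt, modeled on the worked examples in Wei et al. (2022):

```python
# One solved example with explicit reasoning, followed by the new problem.
few_shot_cot_prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?
A:"""
```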
Zero-shot CoT: simply add “Let’s think step by step” (or equivalent) to the prompt, without examples. Surprisingly effective on modern models.
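The zero-shot variant is a one-line change; the trigger phrase comes from Kojima et al. (2022):

```python
question = "If a train travels 60 km in 45 minutes, what is its average speed in km/h?"

# Zero-shot CoT: append the trigger phrase; no worked examples are needed.
zero_shot_cot_prompt = question + "\nLet's think step by step."
```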
Self-Consistency: generate multiple independent chain-of-thought responses and select the most frequent answer. Improves accuracy at the cost of additional compute.
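A minimal self-consistency sketch, assuming a placeholder `generate` function (not a real API) that samples one CoT completion at nonzero temperature and returns the extracted final answer:

```python
from collections import Counter

def self_consistency(generate, prompt, n=10):
    """Sample n independent CoT completions and majority-vote on the
    final answer. Answer extraction is assumed to happen inside
    `generate` (e.g. parsing the text after "The answer is")."""
    answers = [generate(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```

The gain comes from marginalizing over reasoning paths: different chains that converge on the same answer reinforce it.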
Tree-of-Thought: explore multiple reasoning paths in parallel, with backtracking. More costly but more powerful for complex problems.
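A highly simplified beam-style sketch of the idea; real Tree-of-Thought implementations use richer search strategies (BFS/DFS with genuine backtracking), and `expand` and `score` here are placeholders for model calls:

```python
def tree_of_thought(expand, score, root, width=3, depth=2):
    """Explore a tree of partial reasoning states. `expand(state)` is
    assumed to sample candidate next steps from the model; `score(state)`
    is assumed to rate a partial solution so weak branches get pruned."""
    frontier = [root]
    for _ in range(depth):
        candidates = [nxt for state in frontier for nxt in expand(state)]
        # Keep only the most promising partial solutions at each level.
        frontier = sorted(candidates, key=score, reverse=True)[:width]
    return max(frontier, key=score)
```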
When to Use It
Effective for:
- Mathematical and arithmetic problems
- Multi-step logical reasoning
- Tasks requiring decomposition
- Questions that benefit from explicit process articulation
Less useful for:
- Simple tasks (factual lookups, direct classification)
- Creative generation
- Tasks where “reasoning” is not the bottleneck
Practical Considerations
Costs: CoT produces longer output (the reasoning tokens themselves), so each request is more expensive. For high-volume applications, evaluate whether the accuracy improvement justifies the extra spend.
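A back-of-the-envelope comparison; every number below is a hypothetical placeholder, not real pricing:

```python
# Hypothetical figures for illustration only.
price_per_1k_output_tokens = 0.002   # assumed dollar rate
direct_tokens = 20                   # assumed length of a direct answer
cot_tokens = 300                     # assumed length of answer plus reasoning trace
requests_per_day = 100_000

direct_cost = requests_per_day * direct_tokens / 1000 * price_per_1k_output_tokens
cot_cost = requests_per_day * cot_tokens / 1000 * price_per_1k_output_tokens
print(f"direct: ${direct_cost:.2f}/day  CoT: ${cot_cost:.2f}/day")  # $4.00 vs $60.00
```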
Latency: more tokens generated = more time. For real-time applications, CoT may be too slow.
Interpretability: explicit reasoning makes output more verifiable. Useful for debugging and understanding where the model fails.
Common Misconceptions
“CoT makes the model reason like a human”
No. It produces output that resembles human reasoning, but the underlying process remains token prediction. The model can generate convincing but incorrect reasoning.
“More steps = better answer”
Not necessarily. Unnecessary steps can introduce errors or confusion. The reasoning must be pertinent to the problem.
“CoT always outperforms direct prompting”
On simple tasks, CoT can degrade performance by adding unnecessary complexity. It’s a technique, not a universal solution.
Related Terms
- Prompt Engineering: broader discipline of which CoT is a technique
- LLM: models to which CoT applies
Sources
- Wei, J. et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS
- Kojima, T. et al. (2022). Large Language Models are Zero-Shot Reasoners. NeurIPS
- Wang, X. et al. (2023). Self-Consistency Improves Chain of Thought Reasoning in Language Models. ICLR