How much does chain-of-thought improve LLM accuracy?
Prompt Engineering Patterns: Chain-of-Thought (CoT), Zero-Shot CoT, Reasoning for Complex Tasks
On the GSM8K mathematical reasoning benchmark, the original chain-of-thought paper (Wei et al., 2022) reported PaLM 540B improving from 17.7 percent accuracy with standard few-shot prompting to 58.1 percent with chain-of-thought prompting. More recent models combined with CoT achieve over 90 percent on the same benchmark. The size of the improvement varies by task and model, but it is consistently large for problems that require multiple reasoning steps.
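For context, here is a minimal sketch of the three prompt formats the card contrasts: standard few-shot, few-shot chain-of-thought, and zero-shot CoT. It builds plain prompt strings only, so no particular model API is assumed; the tennis-ball exemplar is paraphrased from the paper's illustrative figure, and the target question is an arbitrary stand-in.

```python
# Sketch of the three prompting patterns. The exemplar wording is
# illustrative, not the paper's exact few-shot set.

QUESTION = "A bakery sells 4 trays of 6 muffins each and 5 single muffins. How many muffins is that in total?"

# Standard few-shot: the exemplar maps the question directly to the answer.
standard_prompt = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. How many tennis balls does he have now?
A: 11

Q: {question}
A:"""

# Few-shot chain-of-thought: the same exemplar, but the answer spells out
# the intermediate reasoning steps before the final result.
cot_prompt = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. How many tennis balls does he have now?
A: Roger starts with 5 balls. 2 cans of 3 balls each is 6 balls. 5 + 6 = 11. The answer is 11.

Q: {question}
A:"""

# Zero-shot CoT (Kojima et al., 2022): no exemplars, just a reasoning trigger.
zero_shot_cot_prompt = """\
Q: {question}
A: Let's think step by step."""

for name, template in [("standard few-shot", standard_prompt),
                       ("few-shot CoT", cot_prompt),
                       ("zero-shot CoT", zero_shot_cot_prompt)]:
    print(f"--- {name} ---")
    print(template.format(question=QUESTION))
    print()
```

The only difference between the first two prompts is the worked-out reasoning in the exemplar answer, which is what elicits step-by-step reasoning (and the accuracy gains above) at inference time.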
Source: Wei et al., "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" (arXiv:2201.11903)