Chain-of-Thought Prompting Explained (With 10 Real Examples)

11 minute read · Updated April 2026

Chain-of-thought (CoT) prompting is one of the simplest and most impactful techniques in modern AI. It's the difference between an AI that gives you an answer and an AI that gives you the CORRECT answer — often 30-50% more accurate on complex tasks.

If you've ever been frustrated by an AI confidently producing a wrong answer, CoT is usually the fix.

What chain-of-thought actually is

Instead of asking the AI for a direct answer, you ask it to show its reasoning step-by-step before answering. This forces the model to work through intermediate steps rather than pattern-matching to a final response.

The original academic paper (Wei et al., 2022) showed that adding "Let's think step by step" to prompts improved accuracy on math problems from 18% to 79% — a transformative gain from 5 words.

Modern implementations have gone further with structured reasoning spaces, verification steps, and explicit "thinking" tags (especially in Claude).

When to use it

CoT helps on tasks with these features:

Multi-step reasoning (math, logic, strategy)
Constraint satisfaction (scheduling, allocation, optimization)
Analysis + synthesis (evaluating arguments, diagnosing issues)
Complex classification (legal categorization, medical triage)

CoT doesn't help (or hurts slightly) on:

Simple retrieval ("what's the capital of France?")
Creative tasks with no correct answer
Tasks where the model's first instinct is already good (summarization, translation)

The 4 forms of CoT

Form 1: Basic (the classic)

```

Question: A store sells pens for $3 and pencils for $1. I bought 12 items for $20. How many of each?

Let's think step by step.

```

Why it works: "step by step" triggers the model to show intermediate work. Simple but effective.

Form 2: Explicit thinking space

```

[complex question]

Work through this step by step in this section.

Give final answer in this section.

```

Why it works better: Gives the model a dedicated space for reasoning + a separate space for the final answer. Cleaner output, better reasoning. This is the Claude-preferred style.

Form 3: Structured reasoning template

```

Problem: [statement]

Step 1 — Understand the problem:

Step 2 — Identify known constraints:

Step 3 — Identify unknowns:

Step 4 — Apply relevant principles:

Step 5 — Work through the calculation:

Step 6 — Sanity-check the answer:

Final answer:

```

Why it works: The template forces specific reasoning moves. Prevents skipping steps. Especially useful for math, logic, and engineering problems.

Form 4: Multi-pass reasoning (advanced)

```

Solve this problem. Then:

Re-read the problem
Check if your solution addresses the actual question
Identify one weakness in your reasoning
Produce a refined final answer

Problem: [statement]

```

Why it works: Adds self-critique to the reasoning. Catches the common failure mode where the model answers a related-but-different question.

10 real examples

Example 1: Math word problem

```

A train leaves Station A at 9:00 AM traveling at 60 mph.

Another train leaves Station B (300 miles away) at 10:00 AM traveling at 75 mph toward Station A.

When do they meet?

Work through step-by-step:

What's the head start distance?
What's the closing speed?
How long until they meet after 10:00 AM?
What's the clock time?

```

Without CoT: models often skip the 1-hour head-start and produce wrong answer.

With CoT: near-100% accuracy.

Example 2: Strategic analysis

```

My company has 18 months of runway. Revenue is flat. We can either:

A) Raise money (probably at down-round valuation)

B) Cut 30% of staff + reduce burn

C) Pivot the product (3-6 month detour)

For each option, work through:

Financial math: what does 24 months look like?
Team impact: what do we lose?
Opportunity cost: what's foreclosed?
Reversibility: if this fails, what's plan B?

Then rank the options.

```

CoT forces multi-dimensional analysis. Without CoT, the model often recommends the option that sounds "right" without fully exploring consequences.

Example 3: Debugging

```


[paste broken code]

[paste error message]

Work through:

What does the error literally say?
What line is it on?
What's the state of the relevant variables at that point?
What are the 3 most likely root causes, ranked?
What's the minimum-risk first thing to try?

```

CoT in debugging catches bugs that "just try X" doesn't. Forces structured diagnostic thinking.

Example 4: Legal analysis

```

[fact pattern]

Using IRAC method:

Issue: What's the legal question?
Rule: What's the applicable rule?
Application: Apply the rule to the facts.
Conclusion: What's the result?

Show each section explicitly.

```

CoT maps directly to the IRAC framework law students learn. Dramatically better legal reasoning.

Example 5: Medical triage (example only — not medical advice)

```

[patient description]

Using differential diagnosis:

List 5 possible conditions
For each, list which symptoms support or refute
Rank by probability given the full picture
Identify most urgent action step

```

Forces systematic differential thinking. Better than "what's wrong with me?"

Example 6: Code review

```

[paste code]

Review systematically:

Layer 1 — Correctness: bugs, edge cases

Layer 2 — Security: injection, auth, data

Layer 3 — Performance: N+1, memory, CPU

Layer 4 — Maintainability: naming, complexity

For each layer: what did you find?

```

CoT + structured layers = more thorough review. (See our Code Review Architect Original.)

Example 7: Complex research question

```

Why has urban crime decreased in the US since the 1990s despite rising inequality?

State the consensus answer (most commonly cited causes)
State the strongest counter-argument
Evaluate each proposed cause: evidence for, evidence against
Synthesize: what's most likely true?
What's uncertain?

```

Produces balanced, epistemically-honest research vs. one-sided assertions.

Example 8: Product decision

```

Should we build feature X?

Estimated cost: 6 weeks of engineering.

Work through:

What problem does feature X solve?
For whom? (what % of users?)
What's the expected outcome (retention, revenue, etc.)?
What's the opportunity cost? (what else could 6 weeks do?)
What's the minimum version?
Should we build that instead?

Final recommendation:

```

CoT forces opportunity-cost thinking. Kills the "let's build it because we want to" trap.

Example 9: Negotiation prep

```

I'm asking my boss for a 15% raise.

Current salary: $100k.

Tenure: 3 years.

Work through every objection they'll raise:

"That's above our salary band" — my response:
"Review cycle is in 6 months, let's wait" — my response:
"The budget is tight this year" — my response:
"What specifically have you done to earn it?" — my response:
"What if we offered a bonus instead?" — my response:

For each, what's the strongest counter?

```

This is CoT-as-war-gaming. Our Salary Negotiation Pre-Mortem Original is built on this pattern.

Example 10: Moral dilemma

```

[complex ethical situation]

Consider from 3 ethical frameworks:

Consequentialist: what produces best outcomes?
Deontological: what does duty/principle require?
Virtue ethics: what would a virtuous person do?

For each, reason through the answer. Then synthesize.

```

Forces consideration of multiple frameworks rather than defaulting to one. Produces more nuanced moral analysis.

When CoT backfires

1. Simple tasks get over-engineered. "What's 2+2?" with CoT wastes tokens. Use only for genuinely complex problems.

2. Narrative creep in long reasoning. Very long CoT can drift from the original question. Cap thinking at reasonable length.

3. Reasoning that looks right but isn't. CoT can produce plausible-sounding wrong reasoning. Always verify the final answer, not just the process.

4. Confirmation bias. If the reasoning supports an obvious answer, it might just be justifying a pre-baked conclusion. Ask for counter-arguments explicitly.

The 3 advanced techniques

1. Self-consistency: Run the same CoT prompt 3-5 times, take the majority answer. Dramatically improves accuracy on ambiguous problems.

2. Least-to-most prompting: Break complex questions into simpler sub-questions. Solve each. Synthesize.

3. Tree of thoughts: Explore multiple reasoning paths in parallel, evaluate each, select best. Research-intensive technique for critical decisions.

The meta-takeaway

Chain-of-thought works because LLMs are better at pattern-matching than at reasoning, but pattern-matching their OWN reasoning traces is a form of approximate reasoning. When you ask them to show their work, they're leveraging training data of people showing their work — which is typically better thinking than inferring a conclusion directly.

If you adopt one prompt engineering technique from 2026, make it CoT. It's free (adds minimal tokens), easy (a few words of instruction), and substantially improves output on any task that involves multi-step thinking.

More: XML Prompt Method · Promptolis Originals · Decisions & Reasoning