Chain-of-Thought Prompting
Platforms: Claude, OpenAI, Gemini, M365 Copilot
What It Is
Chain-of-Thought (CoT) prompting asks the model to work through a problem step by step before giving a final answer. Instead of jumping straight to a conclusion, the model shows its reasoning — making it more likely to arrive at a correct answer and allowing you to verify the logic along the way.
Why It Works
When an LLM (large language model) generates intermediate reasoning steps, each step provides context for the next. This breaks complex problems into simpler sub-problems the model can handle more reliably. Without CoT, the model attempts to compute the answer in a single forward pass — essentially trying to “think” the entire solution at once — which fails for multi-step reasoning tasks like math, logic, and comparative analysis.
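To make the contrast concrete, here is a minimal sketch that sends the same question both ways. It assumes the OpenAI Python SDK (v1+) with an API key in the environment; the model name is illustrative, and any of the platforms listed above works the same way.

```python
# Sketch only: compare a direct prompt against a CoT prompt for one question.
# Assumes the OpenAI Python SDK (v1+) and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

question = (
    "A company has 150 employees. 40% are remote, 30% are hybrid, and the rest "
    "are in-office. Hybrid workers share desks at a 3:1 ratio. "
    "In-office workers each need their own desk. How many desks are needed?"
)

# Direct prompt: the model must produce the answer in a single pass.
direct = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": question}],
)

# CoT prompt: same question, plus an instruction to reason before answering.
cot = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": question + "\n\nThink through this step by step. "
        "Show your reasoning before giving your final answer.",
    }],
)

print("Direct:", direct.choices[0].message.content)
print("CoT:", cot.choices[0].message.content)
```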
When to Use It
- Math and numerical reasoning
- Multi-step logic problems
- Comparing options with tradeoffs
- Debugging and root cause analysis
- Any task where the reasoning matters as much as the answer
The Pattern
{Problem statement}

Think through this step by step:
1. First, {identify/analyze the key elements}
2. Then, {work through the logic}
3. Finally, {arrive at your conclusion}

Show your reasoning before giving your final answer.

Zero-Shot CoT variation — just append this to any prompt:

Let's think step by step.

This simple addition can dramatically improve reasoning performance without any examples (Kojima et al. 2022).
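If you build prompts programmatically, the Zero-Shot CoT variation is just string concatenation. A hypothetical helper in Python (the function name is illustrative):

```python
def zero_shot_cot(prompt: str) -> str:
    """Append the Zero-Shot CoT trigger phrase (Kojima et al. 2022) to any prompt."""
    return f"{prompt.rstrip()}\n\nLet's think step by step."

# Usage: wrap an existing task prompt before sending it to the model.
print(zero_shot_cot("How many desks does the office need?"))
```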
Filled-in example:
A company has 150 employees. 40% are remote, 30% are hybrid, and the rest are in-office. Hybrid workers share desks at a 3:1 ratio (3 hybrid workers per desk). In-office workers each need their own desk. How many desks does the office need?
Think through this step by step. Show your reasoning before giving your final answer.

Examples in Practice
Business decision
Context: Your team is evaluating CRM vendors and needs a structured comparison that accounts for multiple constraints.
We're choosing between three vendors for our CRM migration:
- Vendor A: $50K cost, 6-month timeline
- Vendor B: $35K cost, 12-month timeline
- Vendor C: $45K cost, 4-month timeline
Our current CRM contract expires in 8 months. We have a $48K budget.
Think through the tradeoffs step by step, considering cost, timeline risk, and transition complexity. Then recommend the best option with your reasoning.

Why this works: It forces the model to weigh multiple factors systematically rather than jumping to the cheapest or fastest option.
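As a sanity check on whatever the model returns, the hard constraints alone already eliminate two vendors, which is the first step a good CoT response should surface. A short sketch with the numbers above (variable names are illustrative):

```python
# Check each vendor against the hard constraints from the prompt:
# the $48K budget and the 8-month contract expiry.
vendors = {
    "A": {"cost": 50_000, "timeline_months": 6},
    "B": {"cost": 35_000, "timeline_months": 12},
    "C": {"cost": 45_000, "timeline_months": 4},
}
budget = 48_000
contract_expires_months = 8

for name, v in vendors.items():
    problems = []
    if v["cost"] > budget:
        problems.append("over budget")
    if v["timeline_months"] > contract_expires_months:
        problems.append("misses the contract expiry")
    print(f"Vendor {name}: {'feasible' if not problems else ', '.join(problems)}")

# Vendor A: over budget
# Vendor B: misses the contract expiry
# Vendor C: feasible
```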
Debugging
Context: Your engineering team noticed a performance regression after a deployment and needs to narrow down the cause.
Our API response times increased from 200ms to 800ms after last week's deployment. The deployment included: a database schema migration, an upgrade to the authentication library, and a new logging middleware.
Walk through potential causes step by step, starting with the most likely. For each cause, explain how we could verify or rule it out.

Why this works: Debugging requires sequential elimination of hypotheses. CoT prevents the model from fixating on one cause and ignoring others.
Math with multiple calculations
Context: You need to calculate office space requirements based on employee work patterns.
A company has 150 employees. 40% are remote, 30% are hybrid, and the rest are in-office. Hybrid workers share desks at a 3:1 ratio (3 hybrid workers per 1 desk). In-office workers each need their own desk.
How many desks does the office need? Show your work.

Why this works: The model needs to track multiple calculations — percentages, different desk ratios for different groups, then a final total. Showing work prevents arithmetic errors that commonly occur when the model tries to compute everything at once.
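For reference, the arithmetic a correct response should reproduce; a short sketch you can use to spot-check the model's steps:

```python
import math

# Reference calculation for the desk problem above.
employees = 150
remote = round(employees * 0.40)          # 60 remote workers, no desks needed
hybrid = round(employees * 0.30)          # 45 hybrid workers
in_office = employees - remote - hybrid   # 45 in-office workers

hybrid_desks = math.ceil(hybrid / 3)      # 3 hybrid workers per desk -> 15 desks
in_office_desks = in_office               # one desk each -> 45 desks

print(hybrid_desks + in_office_desks)     # 60 desks in total
```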
Common Pitfalls
Related Techniques
Section titled “Related Techniques”- Zero-Shot Prompting — Zero-Shot CoT (“Let’s think step by step”) bridges these two techniques
- Self-Consistency and Reflection — sample multiple reasoning paths for higher accuracy (a minimal sampling sketch follows this list)
- Direct Instruction — combine with CoT by explicitly specifying the reasoning steps
- Prompt Engineering Overview
- Research use case
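The Self-Consistency entry above amounts to sampling several reasoning paths and taking a majority vote on the final answer. A minimal sketch of that loop, where ask_model stands in for whatever chat-completion call you use and the "Final answer:" convention is an assumption made for easy parsing:

```python
import re
from collections import Counter

def ask_model(prompt: str) -> str:
    """Placeholder for a real API call (OpenAI, Claude, Gemini, etc.)."""
    raise NotImplementedError

def self_consistent_answer(prompt: str, samples: int = 5) -> str:
    """Sample several CoT completions and return the majority-vote answer."""
    answers = []
    for _ in range(samples):
        completion = ask_model(
            prompt + "\n\nThink step by step, then write 'Final answer:' "
            "followed by your answer on the last line."
        )
        match = re.search(r"Final answer:\s*(.+)", completion)
        if match:
            answers.append(match.group(1).strip())
    return Counter(answers).most_common(1)[0][0] if answers else ""
```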
Further Reading
- Wei et al. 2022 — Chain-of-Thought Prompting Elicits Reasoning in Large Language Models — arxiv.org/abs/2201.11903
- Kojima et al. 2022 — Large Language Models are Zero-Shot Reasoners — arxiv.org/abs/2205.11916
- Wang et al. 2022 — Self-Consistency Improves Chain of Thought Reasoning in Language Models — arxiv.org/abs/2203.11171
- Yao et al. 2023 — Tree of Thoughts: Deliberate Problem Solving with Large Language Models — arxiv.org/abs/2305.10601