Thinking Mode and Context Management

Thinking Mode

When you ask Claude Code to do something complex, you can request deeper reasoning by adding specific words to your prompt. These aren't magic incantations — they're signals to allocate a larger "thinking budget" before responding.

The Three Levels

"think about this"          → light extended thinking
"think hard about this"     → medium extended thinking
"ultrathink"                → maximum thinking budget

What actually happens: Claude enters an extended reasoning phase before producing output. It works through the problem more carefully — considering alternatives, checking assumptions, thinking through failure modes. The response takes longer but tends to be more accurate on complex problems.

What this is NOT: switching to a different model. Same model, more thinking tokens.

When to Use Each Level

No modifier (most of the time): routine coding tasks, simple refactors, generating boilerplate, answering questions about your codebase. Most of what you'll do day-to-day doesn't need extended thinking.

"think about this": design questions, debugging non-obvious bugs, and tasks where you want to be sure Claude has weighed the options before answering.

"Think about this before implementing:
we need to add rate limiting to our API routes.
What's the right approach given we're on Next.js with Vercel?"

"think hard": genuinely complex multi-file changes, subtle performance issues, anything with security implications.

"Think hard about the implications of changing the session
token format before you implement anything. We have active
users with existing tokens."

"ultrathink": the hardest problems. Architectural decisions that are expensive to reverse. Complex debugging where simpler approaches have failed. Significant refactors with many moving pieces.

"ultrathink: I need to migrate our authentication from
custom JWT handling to Auth.js v5. We have 50+ API routes
with manual auth checks, three different token types, and
active users. What's the migration plan that minimizes
downtime and keeps the app functional throughout?"

Use ultrathink sparingly. It is slow, it consumes more tokens, and it adds nothing on tasks that don't need deep reasoning: "ultrathink about adding a console.log" is wasted effort.

Practical Guideline

Start without a modifier. If the first response misses something important or makes a wrong assumption, that's a signal to retry with "think hard." If it keeps getting it wrong, that's a signal for ultrathink — or that the problem needs to be broken down more clearly before asking.
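The escalation rule above can be written down as a small lookup. This is only an illustrative sketch: `next_modifier` is a hypothetical helper, not part of Claude Code, and counting failed attempts this way is an assumption made for the example.

```shell
# Hypothetical helper: map the number of failed attempts so far to the
# prompt modifier suggested by the guideline above.
next_modifier() {
  case "$1" in
    0) echo "" ;;             # first try: no modifier
    1) echo "think hard" ;;   # the first response missed something
    *) echo "ultrathink" ;;   # still wrong: maximum budget, or break the task down
  esac
}

next_modifier 1   # prints: think hard
```

The point of the sketch is the ordering, not the function: escalate one step at a time, and treat repeated failure as a hint that the task itself may need to be split up.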

Context Management

Context accumulates as you work. Your entire conversation history is re-sent with every message — every exchange, every file the AI read, every output it produced. Eventually this becomes a problem:

Early instructions drop out of the effective context
The AI gets confused by contradictory intermediate states
Context fills up and responses get slower
The model "loses the thread"

You have two options when this happens: compact or start fresh.

/compact — Summarize and Continue

> /compact

Claude compresses the conversation history into a summary. You lose the verbatim exchange but keep the key decisions, file states, and current understanding. The conversation continues with more room.

Use /compact when:

You're mid-task and don't want to lose the current understanding
The task has been going well and you just need to continue with a lighter context
Early messages are clearly no longer relevant

The risk: compacting loses detail. If the summary omits something important, the AI might revert a decision you made 20 messages ago. After compacting, re-state any critical constraints:

> /compact
> [after compacting] Just to confirm the constraints that still apply:
  we're using Sonnet, we're not touching the database schema, and
  we're keeping the existing API contract unchanged.

Start a New Chat

The other option: start fresh. If the task is done, or if the context has become genuinely confused, a clean slate is often better than trying to compress a broken one.

# Just start a new session
claude

Start fresh when:

You're switching to a genuinely different task
The previous conversation went sideways
The AI is clearly confused and fixing it in the same session is making it worse

The Safety Net: Commit Often

People hesitate to start new sessions because they fear losing work. The fix is to commit frequently.

Good work done → git commit → start new session for the next thing

Each commit is a checkpoint. Once committed, the work is safe. Starting a new chat costs nothing — the code is in git.

If you're in a session and not committing, you're accumulating both risk and context debt. Small loops, frequent commits, fresh contexts — that's the rhythm.
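One way to make the checkpoint habit cheap is a small shell function. This is a minimal sketch, assuming you are inside a git repository and are happy to stage everything in the working tree; `checkpoint` is a hypothetical name, not a git or Claude Code command.

```shell
# Hypothetical helper: commit everything in the working tree as a checkpoint.
# Assumes the current directory is inside a git repository.
checkpoint() {
  message="${1:-checkpoint}"
  git add -A || return 1
  # `git diff --cached --quiet` exits 0 when nothing is staged.
  if git diff --cached --quiet; then
    echo "nothing to commit"
    return 0
  fi
  git commit -q -m "$message"
}
```

A typical loop would be `checkpoint "add rate limiting middleware"` followed by a fresh `claude` session for the next task. Staging with `git add -A` is a convenience for the sketch; stage selectively if the session left files you don't want committed.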

/status — Checking Your Position

> /status
Model: claude-sonnet-4-6
Context: 18,432 / 200,000 tokens (9.2%)

Check /status before starting a significant task. If context is already around 70% full, compact or start fresh first; otherwise you will hit context pressure before the task is finished.

A useful habit: check /status at the start of a session, and again when you feel responses getting less precise.
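The 70% rule of thumb can be sketched as a small check. This assumes a usage line in exactly the format of the sample output above (`Context: used / total tokens (…)`); `context_advice` is a hypothetical helper, and real output may be formatted differently.

```shell
# Hypothetical helper: read a line such as
#   "Context: 18,432 / 200,000 tokens (9.2%)"
# and suggest whether to compact before starting a large task.
# The line format is assumed from the sample output above.
context_advice() {
  line="$1"
  threshold="${2:-70}"   # percent; 70 follows the rule of thumb above
  # Strip thousands separators, then pull out the two token counts.
  counts=$(printf '%s\n' "$line" | tr -d ',' |
    sed -n 's/^Context: *\([0-9][0-9]*\) *\/ *\([0-9][0-9]*\) tokens.*/\1 \2/p')
  [ -n "$counts" ] || { echo "unrecognized"; return 1; }
  used=${counts% *}
  total=${counts#* }
  [ "$total" -gt 0 ] || { echo "unrecognized"; return 1; }
  percent=$(( used * 100 / total ))   # integer percent, rounded down
  if [ "$percent" -ge "$threshold" ]; then
    echo "compact or start fresh"
  else
    echo "ok"
  fi
}

context_advice "Context: 18,432 / 200,000 tokens (9.2%)"   # prints: ok
```

The threshold is passed as an optional second argument so it can be lowered for very large tasks.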

Putting It Together

Start session: /status to check where you are
Describe task — add "think about this" if complex
Let it work
Review output: is it right? commit if yes
Continue — add "think hard" if you hit something tricky
When context fills: /compact to continue, new session to switch
Repeat

The tools are simple. The discipline is in applying them consistently: reserve extended thinking for genuinely hard problems, don't let context accumulate past the point of usefulness, and commit often so fresh starts are low-cost.