Thinking Mode and Context Management
Claude Code has three thinking modes that trade speed for reasoning quality. Knowing when to trigger "ultrathink" vs. just typing is the difference between thoughtful solutions and fast ones.
Thinking Mode
When you ask Claude Code to do something complex, you can request deeper reasoning by adding specific words to your prompt. These aren't magic incantations — they're signals to allocate a larger "thinking budget" before responding.
The Three Levels
"think about this" → light extended thinking
"think hard about this" → medium extended thinking
"ultrathink" → maximum thinking budget
What actually happens: Claude enters an extended reasoning phase before producing output. It works through the problem more carefully, considering alternatives, checking assumptions, and thinking through failure modes. The response takes longer but tends to be more accurate on complex problems.
What this is NOT: switching to a different model. Same model, more thinking tokens.
When to Use Each Level
No modifier (most of the time): routine coding tasks, simple refactors, generating boilerplate, answering questions about your codebase. Most of what you'll do day-to-day doesn't need extended thinking.
"think about this": design questions, debugging non-obvious bugs, tasks where you want to make sure it considered the options.
"Think about this before implementing:
we need to add rate limiting to our API routes.
What's the right approach given we're on Next.js with Vercel?"
"think hard": genuinely complex multi-file changes, subtle performance issues, anything with security implications.
"Think hard about the implications of changing the session
token format before you implement anything. We have active
users with existing tokens."
"ultrathink": the hardest problems. Architectural decisions that are expensive to reverse. Complex debugging where simpler approaches have failed. Significant refactors with many moving pieces.
"ultrathink: I need to migrate our authentication from
custom JWT handling to Auth.js v5. We have 50+ API routes
with manual auth checks, three different token types, and
active users. What's the migration plan that minimizes
downtime and keeps the app functional throughout?"
Use ultrathink sparingly. It's slow, it consumes tokens, and it's wasted on tasks that don't need deep reasoning. "ultrathink about adding a console.log" is a waste.
Practical Guideline
Start without a modifier. If the first response misses something important or makes a wrong assumption, that's a signal to retry with "think hard." If it keeps getting it wrong, that's a signal for ultrathink — or that the problem needs to be broken down more clearly before asking.
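The escalation rule above can be written down as a small lookup. This is only a sketch of the habit, not anything Claude Code provides: the names `MODIFIERS`, `next_modifier`, and `build_prompt` are hypothetical, and the mapping (no modifier first, "think hard" after one miss, "ultrathink" after repeated misses) is simply the guideline from this section.

```python
# Hypothetical helper: encodes the "start plain, then escalate" guideline.
# Index = number of failed attempts so far.
MODIFIERS = ["", "think hard", "ultrathink"]


def next_modifier(failed_attempts: int) -> str:
    """Return the thinking modifier to use after the given number of misses."""
    if failed_attempts < 0:
        raise ValueError("failed_attempts must be non-negative")
    # Anything beyond the last level stays at the maximum level.
    index = min(failed_attempts, len(MODIFIERS) - 1)
    return MODIFIERS[index]


def build_prompt(task: str, failed_attempts: int) -> str:
    """Prefix the task with the modifier, or leave it untouched on the first try."""
    modifier = next_modifier(failed_attempts)
    return f"{modifier}: {task}" if modifier else task


print(build_prompt("add rate limiting to our API routes", 0))
print(build_prompt("add rate limiting to our API routes", 1))
print(build_prompt("add rate limiting to our API routes", 2))
```

If the prompt has already reached the last level and still fails, the lookup has nothing more to offer, which matches the advice to break the problem down instead.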
Context Management
Context accumulates as you work. Your entire conversation history is re-sent with every message — every exchange, every file the AI read, every output it produced. Eventually this becomes a problem:
- Early instructions drop out of the effective context
- The AI gets confused by contradictory intermediate states
- Context fills up and responses get slower
- The model "loses the thread"
You have two options when this happens: compact or start fresh.
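To see why re-sending the history adds up, here is a rough back-of-the-envelope sketch. It assumes every exchange adds the same number of tokens (the 2,000 figure is made up for illustration; real exchanges vary widely, and file reads can be far larger), and the function name is hypothetical.

```python
def tokens_sent_per_turn(tokens_per_exchange: int, turns: int) -> list[int]:
    """Size of the context sent on each turn when every turn adds a fixed amount."""
    return [tokens_per_exchange * turn for turn in range(1, turns + 1)]


per_turn = tokens_sent_per_turn(tokens_per_exchange=2_000, turns=5)
print(per_turn)       # [2000, 4000, 6000, 8000, 10000]
print(sum(per_turn))  # 30000
```

Under that assumption, five exchanges of 2,000 tokens each mean the fifth message carries a 10,000-token context, and 30,000 tokens have been processed in total.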
/compact — Summarize and Continue
> /compact
Claude compresses the conversation history into a summary. You lose the verbatim exchange but keep the key decisions, file states, and current understanding. The conversation continues with more room.
Use /compact when:
- You're mid-task and don't want to lose the current understanding
- The task has been going well and you just need to continue with a lighter context
- Early messages are clearly no longer relevant
The risk: compacting loses detail. If the summary omits something important, the AI might revert a decision you made 20 messages ago. After compacting, re-state any critical constraints:
> /compact
> [after compacting] Just to confirm the constraints that still apply:
we're using Sonnet, we're not touching the database schema, and
we're keeping the existing API contract unchanged.
Start a New Chat
The other option: start fresh. If the task is done, or if the context has become genuinely confused, a clean slate is often better than trying to compress a broken one.
# Just start a new session
claude
Start fresh when:
- You're switching to a genuinely different task
- The previous conversation went sideways
- The AI is clearly confused and fixing it in the same session is making it worse
The Signal: Commit Often
The reason people hesitate to start new sessions is fear of losing work. The solution is committing frequently.
Good work done → git commit → start new session for the next thing
Each commit is a checkpoint. Once committed, the work is safe. Starting a new chat costs nothing: the code is in git.
If you're in a session and not committing, you're accumulating both risk and context debt. Small loops, frequent commits, fresh contexts — that's the rhythm.
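One way to make the habit mechanical is to check for uncommitted work before opening a new session. The sketch below relies on `git status --porcelain`, which prints one line per changed or untracked path and nothing for a clean tree. The function names are hypothetical, and it assumes `git` is on the PATH and the directory is a repository.

```python
import subprocess


def has_uncommitted_changes(porcelain_output: str) -> bool:
    """True if `git status --porcelain` reported any changed or untracked path."""
    return bool(porcelain_output.strip())


def working_tree_is_clean(repo_dir: str = ".") -> bool:
    """Run git in repo_dir and report whether everything is committed."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. outside a repository
    )
    return not has_uncommitted_changes(result.stdout)
```

Calling `working_tree_is_clean()` from the project root before closing a session tells you whether a fresh start would leave anything behind.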
/status — Checking Your Position
> /status
Model: claude-sonnet-4-6
Context: 18,432 / 200,000 tokens (9.2%)
Check /status before starting a significant task. If you're at 70% context, compact or start fresh first. Starting a large task at 70% context means you'll be dealing with context pressure before you finish.
A useful habit: check /status at the start of a session, and again when you feel responses getting less precise.
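The percentage in the sample output and the 70% rule of thumb are simple arithmetic. The sketch below reproduces them; the function names and the `COMPACT_THRESHOLD` constant are hypothetical, and the threshold is just the guideline from this section rather than a limit enforced by the tool.

```python
COMPACT_THRESHOLD = 0.70  # rule of thumb from the text, not a tool setting


def context_usage(used_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window currently in use."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens


def should_free_context(used_tokens: int, window_tokens: int,
                        threshold: float = COMPACT_THRESHOLD) -> bool:
    """True when usage has reached the point where compacting or restarting is advised."""
    return context_usage(used_tokens, window_tokens) >= threshold


print(f"{context_usage(18_432, 200_000):.1%}")   # 9.2%
print(should_free_context(18_432, 200_000))      # False
print(should_free_context(150_000, 200_000))     # True
```

At 18,432 of 200,000 tokens there is plenty of room to start a large task; at 150,000 the same task would begin already past the threshold.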
Putting It Together
1. Start session: /status to check where you are
2. Describe task — add "think about this" if complex
3. Let it work
4. Review output: is it right? commit if yes
5. Continue — add "think hard" if you hit something tricky
6. When context fills: /compact to continue, new session to switch
7. Repeat
The tools are simple. The discipline is in applying them consistently: reserve extended thinking for genuinely hard problems, don't let context accumulate past the point of usefulness, and commit often so fresh starts are low-cost.