The AI Dev Tools Landscape

Cursor, Claude Code, Codex, Gemini CLI — what each tool is for, how they differ, and why "which one should I use?" has a different answer depending on what you're doing.

March 30, 2026 · 5 min read

The instinct is to pick one tool and stick with it. The reality is that Cursor and Claude Code solve different problems — and knowing which to reach for, and when, is the real skill.

The Vibe Coding Problem

There's a real phenomenon that happens with AI coding tools:

"The code grows beyond my usual comprehension."

That line captures the failure mode perfectly. You're in a flow state. The agent is generating code. Things feel productive. Then you surface and realize you don't fully understand what was built — and the codebase has grown past the point where anyone quickly could.

AI tools are very good at generating a lot of code quickly. But a lot of code isn't always a good thing. Anyone who has parachuted into a legacy codebase knows: more code is usually more problems. The goal of today's tools should be generating the right code, not the most code.

The good news: decades-old best practices — git discipline, test-driven development, architectural decision records, small focused commits — turn out to work extremely well as guardrails for AI tools. The discipline you've always aspired to follow now pays compounding dividends.
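The git-discipline guardrail can be sketched concretely. The idea: commit a known-good checkpoint before the agent runs, then review (or discard) its diff afterward. The repo path, file, and commit messages below are illustrative only — the agent's edit is simulated with a plain `echo`.

```shell
# Guardrail sketch: checkpoint before the agent, review after.
set -e
repo="${TMPDIR:-/tmp}/ai-guardrail-demo"
rm -rf "$repo" && mkdir -p "$repo" && cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name Demo
echo 'def add(a, b): return a + b' > app.py
git add app.py
git commit -qm "baseline: known-good state before the agent runs"
# ...agent session happens here; simulate one edit it made...
echo 'def add(a, b): return a + b  # TODO: overflow check' > app.py
git diff --stat    # review exactly what changed; `git restore .` throws it away
```

A small, reviewable diff per agent run is the whole point — if `git diff --stat` surprises you, you bail out with one command instead of archaeology.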

The Two Main Tools

Cursor

Cursor is a VS Code fork. If it looks a lot like VS Code, that's because it is — effectively a superset of VS Code with AI features built in.

The experience: you're in your editor. You can see your code and the AI's changes side by side. You accept or reject inline. The agent mode works from inside the same environment you're already in.

Best for:

  • Staring at a function you need to refactor
  • Inline surgical edits to specific files
  • Tasks where you want to watch what's happening and stop it if needed
  • Working with the code directly visible

Pricing: free 14-day Pro trial, then a limited Hobby plan. The Pro plan covers everything in this series; you can also supply your own API keys to bypass Cursor's billing.

Claude Code

Claude Code is a terminal app. It runs in your terminal — you're not necessarily looking at the same code the agent is touching.

The experience: more autonomous. You give it a task, it works, you review the result. You're not tied to what file you're looking at while it runs.

Best for:

  • Changing function signatures across an entire codebase
  • Larger refactors that span many files
  • Tasks where you want to delegate and review the outcome
  • Running in the background while you do other work
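This delegate-and-review style can be driven non-interactively. A sketch, with assumptions: Claude Code is installed as the `claude` command, and `-p` (print mode) runs a single prompt headlessly and exits — confirm both against `claude --help` on your version. The task text and function name are made up for illustration.

```shell
# Delegation sketch: hand off one scoped task, review the result afterward.
task="Change parseConfig to take an options object across the repo, then run the tests"
if command -v claude >/dev/null 2>&1; then
  claude -p "$task"    # headless run: prints the outcome and exits
else
  echo "Claude Code not installed; would have delegated: $task"
fi
```

Pair this with the git checkpoint habit: because you are not watching the edits happen, the post-run diff is your only review surface.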

Pricing: No free plan. Pay via API usage or a monthly subscription. Rate-limited even on the Max plan.

Alternative if cost is a concern: Google's Gemini CLI is free within limits if you have a Google account, and covers the terminal-based paradigm well for learning.

The "Which One?" Answer

  • Staring at a specific function → Cursor inline edit
  • Refactoring one component → Cursor agent
  • Changing a function signature across the entire codebase → Claude Code
  • Large background task while you do something else → Claude Code
  • Brainstorming architecture → Gemini (large context window) or OpenAI o3
  • Code review on a PR → GitHub Copilot / Codex

The real answer is: use both, for different things. They're not competitors — they're tools with different interaction paradigms.

OpenAI Codex: The Third Paradigm

Codex takes a completely different approach. You point it at a git repo, tell it what you want, review its plan, and it opens a PR when done.

This is the traditional code reviewer paradigm — you're reviewing a pull request, not watching code get written. Useful for:

  • Kicking off work remotely (away from your computer)
  • Tasks you want treated like a PR you'll review in the morning
  • Understanding a large codebase without running it locally

Model Selection

Every tool gives you a model picker. The choice matters more than people think.

Model Family                 Best For
Claude Sonnet (Anthropic)    Daily coding driver — best balance of quality and speed
Claude Opus                  Complex tasks; costs 4–6x more
Claude Haiku                 Fast and cheap; great for simple, repetitive tasks
Gemini Flash (Google)        Speed, large context (2M tokens), brainstorming
Gemini Pro                   Thinking mode, understanding whole repos
o3 (OpenAI)                  Brainstorming, "what am I missing?", factual reasoning

Key insight: a small model with a well-detailed plan outperforms a large model with a vague request. This is the central theme of working effectively with these tools. Opus let loose on an underspecified task will produce worse results than Haiku given a precise, scoped specification.
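What "a well-detailed plan" looks like in practice: a short, scoped spec the agent can execute against, instead of a one-line ask. A sketch — the `PLAN.md` filename is our own convention, not something any tool requires, and the function names are invented for illustration.

```shell
# Sketch: turn a vague ask into a scoped spec before picking a model.
cat > "${TMPDIR:-/tmp}/PLAN.md" <<'EOF'
## Task: rename `getUser` to `fetchUser`
- Scope: src/api/ only; leave tests untouched for now
- Update every call site; change no other signatures
- Run the type checker; stop and report if anything unrelated breaks
EOF
cat "${TMPDIR:-/tmp}/PLAN.md"
```

With a spec this tight, a cheap model like Haiku has little room to wander; without it, even Opus is guessing at scope.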

For budget-conscious use: Sonnet hits the sweet spot of affordable and capable. Save the big models for tasks that genuinely need them.

The Systems Thinking Gap

AI tools generate code quickly. But generating code has always been the less valuable part of engineering. As you move up the career ladder, the code-writing decreases and the systems thinking increases — and the pay goes up.

The interesting opportunity: you can now generate a large, messy codebase on purpose and practice navigating it. You can experience the architectural challenges that normally only come with years in production systems. That's genuinely useful for building the judgment that AI tools can't replace.

What AI tools won't do for you:

  • Decide what to build
  • Catch their own architectural mistakes
  • Know your team's conventions unless you tell them
  • Review their own output critically

What they do extremely well:

  • Execute a well-specified plan
  • Handle the mechanical parts of refactoring
  • Generate boilerplate
  • Surface possibilities you hadn't considered

The engineers getting the most leverage are the ones who bring the systems thinking and delegate the execution — not the ones who let the agent decide both.
