Context Placement

Where you place information in a prompt matters as much as what you say. Critical info buried in the middle gets ignored — sometimes worse than saying nothing at all.

March 30, 2026 · 4 min read · 3 / 6

It's not just what you say in a prompt — it's where you say it.

Research shows that critical information placed in the middle of a long context gets worse model attention than information at the beginning or end. In some cases, middle-buried information performs worse than providing no information at all.

The "Lost in the Middle" Study

The paper "Lost in the Middle: How Language Models Use Long Contexts" tested model performance on multi-document question answering, placing the one document containing the answer at different positions in the context: early, middle, or late.

Findings:

  • Beginning of context → highest accuracy
  • End of context → high accuracy
  • Middle of context → lower accuracy than the baseline (no documents provided)

This is striking. The model literally did better with no information than with the relevant information buried at positions 7–16 of a 20-document context.

Why This Happens: Primacy and Recency Bias

Human psychology has names for this:

Primacy bias — we remember the beginning of a list better than the middle. Recency bias — we remember the end of a list better than the middle.

LLMs, trained on human-generated data, exhibit the same pattern: the model attends to the beginning and end of a context more strongly than to its middle.

This means long conversations and big context windows don't give you unlimited reliable attention. They give you a buffer — but not an even one.

Practical Implications

1. Critical instructions belong at the start

Plain text
❌ "...and by the way, don't add any features I didn't ask for."
   [buried at line 12 of a 20-line prompt]

✅ "IMPORTANT: Do not add any features beyond what is specified below.
   [then the full requirements]"

2. Supporting details go at the end

Put context that provides depth or background after your core request. The model registers the main task first (primacy) and the supporting details last (recency), and forms its plan accordingly.

3. Long conversations degrade silently

What you put at the end of message 5 becomes the "middle" of your conversation by message 20. If something is important, re-state it — don't assume it's still in view.

Plain text
// After 15+ messages of code generation:
"Reminder: all functions should be pure — no side effects.
Continue building the auth system with this constraint in mind."
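In an API-driven workflow, the same re-statement can be automated. A minimal sketch, assuming the common `{"role", "content"}` chat-message convention; the helper name and the `every=10` cadence are illustrative choices, not recommended values:

```python
def with_constraint_reminder(messages, constraints, every=10):
    """Append a reminder once the history reaches a multiple of `every`
    messages, so the constraints sit at the end of the context
    (recency) instead of drifting into the middle."""
    out = list(messages)
    if len(out) >= every and len(out) % every == 0:
        out.append({
            "role": "user",
            "content": "Reminder: " + "; ".join(constraints),
        })
    return out


history = [{"role": "user", "content": f"message {i}"} for i in range(10)]
history = with_constraint_reminder(
    history, ["all functions must be pure", "no side effects"]
)
# The final message in the history is now the constraint reminder.
```

The point is not the specific helper but the habit: constraints you care about should periodically reappear near the end of the context, where attention is strong.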

4. The "start a new chat" signal

If you're getting worse outputs as a conversation grows — code drifting from your stated conventions, features appearing that you explicitly ruled out — context has likely been lost or buried.

Signal: start a new chat. Ask the model to summarize first:

Plain text
"Before we continue, summarize what we've built so far, the key constraints we've agreed on, and what the next feature is. I'll paste this into a new chat."

Context Window vs. Useful Attention Window

A model might have a 200,000-token context window. That doesn't mean it reliably attends to all 200,000 tokens. The useful attention window — where the model actually processes information well — is smaller, and it's weighted toward the beginning and end.

The "Lost in the Middle" study found this degradation starting around 2,000–4,000 tokens in. In a tool like Copilot or Cursor with a system message, attached files, and conversation history, you can hit that zone faster than you'd expect.
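A rough way to gauge how deep an instruction sits is a character-based token estimate. The ~4 characters-per-token figure is a common heuristic for English text, not an exact count; a real tokenizer (such as the model vendor's) is needed for precision:

```python
def estimate_token_depth(context_before: str, chars_per_token: int = 4) -> int:
    """Estimate how many tokens precede an instruction.
    ~4 chars/token is a rough heuristic for English prose;
    use the model's actual tokenizer for exact counts."""
    return len(context_before) // chars_per_token


# A system message plus a couple of attached files can easily total
# ~9 KB of text, already pushing the next instruction past the
# ~2,000-token zone where degradation was observed.
preamble = "x" * 9000
depth = estimate_token_depth(preamble)  # 2250
```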

Structure That Fights "Lost in the Middle"

When writing long prompts, repeat critical constraints in two places:

Plain text
[TOP]
You are building a Prompt Library app.
CONSTRAINTS:
- Plain HTML, CSS, JavaScript only
- No external dependencies
- No features beyond what is listed

[... middle: detailed specifications ...]

[BOTTOM]
Remember: use only plain HTML, CSS, and JavaScript.
Implement only the features listed above — nothing else.

This exploits both primacy and recency bias to keep the most important constraints visible.
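If you assemble prompts programmatically, the sandwich can be a small helper. A sketch; the function name and argument names are made up for illustration:

```python
def build_sandwich_prompt(task, constraints, details):
    """Repeat the constraints at the top (primacy) and the bottom
    (recency), sandwiching the long specification in between."""
    bullet_list = "\n".join(f"- {c}" for c in constraints)
    return (
        f"{task}\n"
        f"CONSTRAINTS:\n{bullet_list}\n\n"
        f"{details}\n\n"
        f"Remember: {'; '.join(constraints)}. "
        "Implement only the features listed above, nothing else."
    )


prompt = build_sandwich_prompt(
    task="You are building a Prompt Library app.",
    constraints=["Plain HTML, CSS, JavaScript only",
                 "No external dependencies"],
    details="[... detailed specifications ...]",
)
# Each constraint now appears twice: once at the top, once at the bottom.
```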

When Context is Too Much

Given the "Lost in the Middle" finding, sometimes less context is better.

If you're working in a large codebase:

  • Don't paste the entire repo into context
  • Don't @codebase if you only need 2 files
  • Provide the minimal files needed for the specific task
Plain text
❌ Attaching 15 files "for context"
   → Key constraints in file #8 effectively don't exist

✅ Attaching only the 2 files directly relevant to the current task
   → Everything the model sees is high-signal
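"Attach less, on purpose" can even be scripted. The sketch below uses a deliberately crude relevance filter — keep a file only if its base name appears in the task description; real tools use embeddings or import graphs, and all names here are hypothetical:

```python
def select_context_files(task, candidate_files):
    """Crude relevance filter: keep a file only if its base name
    (without directory or extension) appears in the task text."""
    task_lower = task.lower()
    selected = []
    for path in candidate_files:
        stem = path.rsplit("/", 1)[-1].rsplit(".", 1)[0].lower()
        if stem in task_lower:
            selected.append(path)
    return selected


files = ["src/auth.py", "src/session.py", "src/billing.py", "src/report.py"]
task = "Fix the token refresh bug in auth and session handling"
relevant = select_context_files(task, files)
# → keeps src/auth.py and src/session.py only
```

Even a filter this naive beats pasting all four files: everything that reaches the model is plausibly relevant to the task.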

Summary

| Position  | Attention level | Use for                                      |
|-----------|-----------------|----------------------------------------------|
| Beginning | Highest         | Critical constraints, core task description  |
| End       | High            | Supporting details, reminders, output format |
| Middle    | Lowest          | Avoid putting anything critical here         |

When a conversation gets long, re-state critical information. When quality degrades, start fresh. The model's context window is large — but its useful attention window is not.
