Context Placement
Where you place information in a prompt matters as much as what you say. Critical info buried in the middle gets ignored — sometimes worse than saying nothing at all.
It's not just what you say in a prompt — it's where you say it.
Research shows that critical information placed in the middle of a long context gets worse model attention than information at the beginning or end. In some cases, middle-buried information performs worse than providing no information at all.
The "Lost in the Middle" Study
The paper "Lost in the Middle: How Language Models Use Long Contexts" tested models on multi-document question answering: each question's answer appeared in exactly one of the retrieved documents, and the researchers varied where that document sat in the context — early, middle, or late.
Findings:
- Beginning of context → highest accuracy
- End of context → high accuracy
- Middle of context → lower accuracy than the baseline (no documents provided)
This is striking. The model literally did better with no information than with the relevant information buried in position 7–16 of a 20-document context.
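You can reproduce the shape of this experiment yourself. The sketch below (my own code, not the paper's) assembles a multi-document context with the single relevant document at a chosen position; sweep that position and compare your model's accuracy at each one.

```python
def build_context(gold_doc: str, distractors: list[str], gold_position: int) -> str:
    """Place the one relevant document at a chosen index among distractors,
    mirroring the "Lost in the Middle" setup. A sketch for running your own
    position sweep, not the study's actual harness."""
    docs = list(distractors)
    docs.insert(gold_position, gold_doc)
    return "\n\n".join(f"Document [{i + 1}]: {d}" for i, d in enumerate(docs))

# Sweep the relevant document across all 20 positions, then ask the model
# the same question against each context and record accuracy per position.
contexts = [
    build_context("Paris is the capital of France.", ["(irrelevant filler)"] * 19, pos)
    for pos in range(20)
]
```

Plot accuracy against position and you should see the U-shape: strong at both ends, weakest in the middle.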
Why This Happens: Primacy and Recency Bias
Human psychology has names for this:
Primacy bias — we remember the beginning of a list better than the middle. Recency bias — we remember the end of a list better than the middle.
LLMs, trained on human-generated data, exhibit the same pattern: the model attends to the beginning and end of a context more strongly than to its middle.
This means long conversations and big context windows don't give you unlimited reliable attention. They give you a buffer — but not an even one.
Practical Implications
1. Critical instructions belong at the start
❌ "...and by the way, don't add any features I didn't ask for."
[buried at line 12 of a 20-line prompt]
✅ "IMPORTANT: Do not add any features beyond what is specified below.
[then the full requirements]"

2. Supporting details go at the end
Put context that provides depth or background after your core request. The model reads the main task first (primacy), sees the supporting details (recency if it's the last thing), and forms its plan.
3. Long conversations degrade silently
What you put at the end of message 5 becomes the "middle" of your conversation by message 20. If something is important, re-state it — don't assume it's still in view.
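In a scripted workflow, this re-statement can be automated. Below is a hypothetical helper (not any tool's built-in API) that re-injects your constraints every N messages so they never drift into the low-attention middle:

```python
def reinject_constraints(messages: list[str], constraints: list[str],
                         every: int = 10) -> list[str]:
    """Append a reminder message every `every` turns so critical constraints
    stay near the end of the context. Hypothetical helper, illustration only."""
    reminder = "Reminder: " + "; ".join(constraints)
    out = []
    for i, msg in enumerate(messages, start=1):
        out.append(msg)
        if i % every == 0:
            out.append(reminder)
    return out
```

In a chat UI you do the same thing by hand: periodically paste the reminder yourself.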
// After 15+ messages of code generation:
"Reminder: all functions should be pure — no side effects.
Continue building the auth system with this constraint in mind."

4. The "start a new chat" signal
If you're getting worse outputs as a conversation grows — code that's drifting from your stated conventions, features being added that you explicitly said not to — context has likely been lost or buried.
That's the signal to start a new chat. Ask the model to summarize first:
"Before we continue, summarize what we've built so far, the key constraints
we've agreed on, and what the next feature is. I'll paste this into a new chat."

Context Window vs. Useful Attention Window
A model might have a 200,000-token context window. That doesn't mean it reliably attends to all 200,000 tokens. The useful attention window — where the model actually processes information well — is smaller, and it's weighted toward the beginning and end.
The "Lost in the Middle" study found this degradation starting around 2,000–4,000 tokens in. In a tool like Copilot or Cursor with a system message, attached files, and conversation history, you can hit that zone faster than you'd expect.
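A rough budget check makes this concrete. The sketch below (my own helper, with a crude 4-characters-per-token heuristic; use a real tokenizer for accurate counts) estimates when the combined system prompt, attached files, and history cross the zone where the study observed degradation:

```python
def rough_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English prose.
    # Use a real tokenizer for accurate counts; this is a ballpark only.
    return max(1, len(text) // 4)

def context_budget(system_prompt: str, files: list[str], history: list[str],
                   danger_zone: int = 2000) -> tuple[int, bool]:
    """Estimate total context size and flag when it enters the range where
    middle-position degradation was observed. The 2,000-token threshold
    echoes the study's finding cited above; tune it for your model."""
    total = sum(rough_tokens(t) for t in [system_prompt, *files, *history])
    return total, total > danger_zone
```

One medium-sized attached file plus a system prompt is often already past the threshold, which is why degradation shows up sooner than people expect.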
Structure That Fights "Lost in the Middle"
When writing long prompts, repeat critical constraints in two places:
[TOP]
You are building a Prompt Library app.
CONSTRAINTS:
- Plain HTML, CSS, JavaScript only
- No external dependencies
- No features beyond what is listed
[... middle: detailed specifications ...]
[BOTTOM]
Remember: use only plain HTML, CSS, and JavaScript.
Implement only the features listed above — nothing else.

This exploits both primacy and recency bias to keep the most important constraints visible.
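If you build prompts programmatically, the sandwich structure is easy to encode. A minimal sketch (a hypothetical helper following the template above, not a library API):

```python
def sandwich_prompt(task: str, constraints: list[str], details: str) -> str:
    """Build a prompt that states the constraints at the top and repeats
    them at the bottom, exploiting both primacy and recency."""
    rules = "\n".join(f"- {c}" for c in constraints)
    return (
        f"{task}\n\n"
        f"CONSTRAINTS:\n{rules}\n\n"
        f"{details}\n\n"
        f"Remember the constraints:\n{rules}\n"
        "Implement only the features listed above."
    )
```

The detailed specifications sit in the middle, where losing a little attention is acceptable; the constraints never do.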
When Context is Too Much
Given the "Lost in the Middle" finding, sometimes less context is better.
If you're working in a large codebase:
- Don't paste the entire repo into context
- Don't `@codebase` if you only need 2 files
- Provide the minimal files needed for the specific task
❌ Attaching 15 files "for context"
→ Key constraints in file #8 effectively don't exist
✅ Attaching only the 2 files directly relevant to the current task
→ Everything the model sees is high-signal

Summary
| Position | Attention level | Use for |
|---|---|---|
| Beginning | Highest | Critical constraints, core task description |
| End | High | Supporting details, reminders, output format |
| Middle | Lowest | Avoid putting anything critical here |
When a conversation gets long, re-state critical information. When quality degrades, start fresh. The model's context window is large — but its useful attention window is not.