Few-Shot Prompts

Few-shot prompts use 2+ examples to teach the model nuances, edge cases, and format variations. More powerful than one-shot — but requires more upfront effort.

March 30, 2026 · 5 min read · 4 / 5

Few-shot prompting is providing two or more examples alongside your request. The model learns nuances, variations, and edge cases from the diversity of those examples.

It's more powerful than one-shot — and research shows it gets more effective as models get larger. Since models are trending larger, this is one of the most future-proof techniques you can learn.

Why Two or More Examples?

With one-shot, you show the general case. With few-shot, you show:

  • Multiple input variations
  • Multiple output formats
  • Edge cases and error cases
  • The range of acceptable inputs and outputs

The model generalizes a richer pattern from diverse examples than it can from a single one.
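To make this concrete, a few-shot prompt is often just a template stitched together from example pairs. The sketch below is a minimal, hypothetical builder (the function name, examples, and labels are all made up for illustration) showing how diverse shots, including an edge case, slot into one prompt:

```python
# Minimal sketch: assemble a few-shot prompt from diverse example pairs.
# All names and example data here are illustrative, not from a real system.

def build_few_shot_prompt(instruction, examples, new_input):
    """Join several input/output examples, then append the new input."""
    parts = [instruction, ""]
    for inp, out in examples:
        parts.append(f"Input: {inp}")
        parts.append(f"Output: {out}")
        parts.append("---")
    parts.append(f"Input: {new_input}")
    parts.append("Output:")
    return "\n".join(parts)

examples = [
    ("refund my order #123", "category: billing"),         # typical case
    ("app crashes on launch", "category: bug"),            # different domain
    ("asdfgh", "category: unclear (ask for details)"),     # edge case
]
prompt = build_few_shot_prompt(
    "Classify the support message into a category.",
    examples,
    "how do I reset my password?",
)
print(prompt)
```

The trailing bare `Output:` nudges the model to complete the pattern rather than comment on it.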

The Research

The paper "Language Models are Few-Shot Learners" found that few-shot prompting improved accuracy more rapidly than zero-shot or one-shot as model size increased.

On benchmarks of complex reasoning tasks, few-shot examples produced exponential accuracy gains on larger models, while zero-shot and one-shot showed only linear improvement. In a world where models keep getting bigger, few-shot prompting will keep getting more effective.

Technique   Accuracy gain as model size increases
---------   -------------------------------------
Zero-shot   Linear improvement
One-shot    Linear improvement
Few-shot    Exponential improvement

How Many Examples?

Research suggests 4–8 examples is the sweet spot. After 10 examples, there are diminishing returns — and some models actually degrade with too many.

More important than the number is the quality and diversity of the examples. Four well-chosen examples outperform eight mediocre ones, and low-quality examples can actively hurt model performance.

Example: Business Decision Analysis

This few-shot prompt asks a model to classify decisions at different analysis depths:

Plain text
Analyze the following business decision with the requested level of detail.

---
Decision: Opening a new store location
Analysis level: Quick take
Response: New location appears viable given foot traffic data and competitor absence, though initial investment is substantial.
---

---
Decision: Launching a mobile app
Analysis level: Standard review
Response: Mobile app launch presents a calculated opportunity with mixed timing signals. App store competition is intense, with over 2.8 million apps competing for discoverability. However, your existing customer base of 45,000 represents a captive audience... [2-3 sentences]
---

---
Decision: Acquiring a competitor
Analysis level: Deep dive
Response: Acquisition strategy requires careful multi-factor analysis... [detailed multi-paragraph breakdown with data points]
---

Now analyze:

Decision: Implementing remote work policy
Analysis level: Standard review

The model learns from the examples what "quick take," "standard review," and "deep dive" mean in terms of response length and depth — without you having to define them explicitly. It then applies that pattern to a new decision at the requested depth.
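Because the depth labels are defined only implicitly by the examples, one way to sanity-check responses is against rough length bands. The bands and function below are guesses for illustration, not part of the original prompt:

```python
# Hypothetical sanity check: rough sentence-count bands per analysis level,
# loosely inferred from the example responses above. Adjust to taste.

LEVEL_BANDS = {
    "quick take": (1, 2),       # one short verdict
    "standard review": (2, 4),  # a few sentences
    "deep dive": (5, 50),       # multi-paragraph
}

def within_band(level: str, response: str) -> bool:
    """Crudely count sentences by splitting on periods."""
    lo, hi = LEVEL_BANDS[level]
    sentences = [s for s in response.split(".") if s.strip()]
    return lo <= len(sentences) <= hi

print(within_band("quick take", "Looks viable. Investment is high."))  # True
```

A check like this catches the common failure where a "quick take" balloons into paragraphs, which usually means the examples were not distinct enough.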

Example: Generating a Feature Implementation Prompt

One powerful use of few-shot prompting is asking the model to generate a prompt for you. Here's a real example:

Plain text
I need you to create a prompt for implementing a new feature. Here are examples of effective feature implementation prompts:

Example 1: Save and delete functionality
---
Create a save and delete system for a Prompt Library with these requirements:

Technical specifications:
- Save button that persists prompts to localStorage
- Delete button with confirmation dialog before removal
- Visual feedback on save (green checkmark animation)
- Trash icon with hover effect for delete

Provide complete HTML, CSS, and JavaScript with:
- Semantic HTML structure
- CSS animations for feedback
- localStorage integration with JSON serialization

Data structure: { id: string, title: string, content: string, createdAt: ISO string }
---

Example 2: Star rating component
---
Build a 5-star rating system with:

Core requirements:
- Interactive 5-star display
- Shows average rating and total count
- Updates without page refresh
- Allows users to change their rating

Implementation details:
- SVG stars for crisp rendering
- Gold fill for rated, gray outline for unrated
- Smooth hover animations

Deliver production-ready HTML, CSS, and JavaScript with comments.

Data model: { rating: number (1-5) | null, ratingCount: number, ratingSum: number }
---

Your task: Create a detailed implementation prompt for a **notes section** where users can add, edit, save, and delete notes for each prompt in the library. Keep it as simple as possible. Only include the features mentioned.

The model generates a full specification — technical requirements, implementation details, data structure — before any code is written. Review the plan, refine it, then implement in a follow-up prompt.

When to Use Few-Shot (and When Not To)

Use few-shot when:

  • The task has complex logic or multiple variations
  • You're building a classification system with many categories
  • You're standardizing output format across diverse inputs
  • The task is domain-specific and requires context

Skip few-shot when:

  • A zero-shot prompt already gets you 80–90% of the way there
  • The time to craft good examples exceeds the time to just build the thing manually
  • You can't come up with truly diverse, high-quality examples

By some estimates, few-shot prompting appears in roughly 15–20% of professional prompts. It's not for everything, but when you need it, nothing else is as effective.

Building Good Few-Shot Examples

The examples are the most important part. Guidelines:

Plain text
✅ Include variety — different inputs, different output structures
✅ Cover edge cases — what happens with unusual or boundary inputs
✅ Include failure cases — show how errors should be handled
✅ Keep examples concise but complete — no unnecessary padding
✅ Use realistic examples — not toy data that doesn't reflect actual use
❌ Don't repeat the same input shape with minor variations
❌ Don't use edge cases as your *only* examples
❌ Don't sacrifice quality for quantity (4 great > 8 mediocre)
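The "don't repeat the same input shape" rule can even be checked mechanically. The sketch below is a rough heuristic, assuming word-overlap (Jaccard similarity) is a good enough proxy for "too similar"; the threshold and sample shots are arbitrary:

```python
# Rough heuristic sketch: flag near-duplicate few-shot examples by
# word overlap (Jaccard similarity). Threshold is an arbitrary guess.

def jaccard(a: str, b: str) -> float:
    """Similarity of two strings as overlap of their word sets."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def near_duplicates(examples, threshold=0.6):
    """Return index pairs of examples that look too similar."""
    pairs = []
    for i in range(len(examples)):
        for j in range(i + 1, len(examples)):
            if jaccard(examples[i], examples[j]) >= threshold:
                pairs.append((i, j))
    return pairs

shots = [
    "Decision: open a new store. Analysis level: quick take.",
    "Decision: open a new shop. Analysis level: quick take.",   # near-duplicate
    "Decision: acquire a competitor. Analysis level: deep dive.",
]
print(near_duplicates(shots))  # [(0, 1)]
```

Flagged pairs are candidates to replace with a genuinely different input shape or an edge case.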

Asking the Model to Generate the Examples

If you're struggling to come up with good few-shot examples, ask the model:

Plain text
I want to use few-shot prompting to build a feature that does X.
Can you help me create 3 diverse, high-quality examples I can use as shots?
Include edge cases and ensure the examples cover different variations of the input.

Then review what it generates carefully. The examples need to be correct — a wrong example teaches the model a wrong pattern.

Practice: Try It Yourself

Exercise 1: Classification with depth

Create 3 examples of classifying a product review as "rant," "balanced," or "glowing" — each with a different review style and length. Then test with 5 new reviews and see how consistently the model applies the pattern.
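One way to start this exercise is to template the three shots in code so you can swap reviews in and out quickly. Everything below (reviews, labels, function name) is made up as a starting point:

```python
# Sketch for Exercise 1: a few-shot review-tone classifier prompt.
# The example reviews and labels are invented for illustration.

EXAMPLES = [
    ("Worst purchase ever. Broke in a day and support ignored me.", "rant"),
    ("Solid build and fast shipping, though the battery life is mediocre.", "balanced"),
    ("Absolutely love it! Exceeded every expectation, ten stars if I could.", "glowing"),
]

def review_prompt(new_review: str) -> str:
    """Build the few-shot prompt for one new review."""
    lines = ["Classify each product review as rant, balanced, or glowing.", ""]
    for review, label in EXAMPLES:
        lines += [f"Review: {review}", f"Label: {label}", "---"]
    lines += [f"Review: {new_review}", "Label:"]
    return "\n".join(lines)

print(review_prompt("Decent product, but delivery took three weeks."))
```

Run the same template over your five test reviews and compare how consistently the model picks a label.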

Exercise 2: Code generation with format

Show the model 2–3 of your existing unit tests as examples, then ask it to write tests for a new function. Compare format consistency vs. using a zero-shot approach.

Exercise 3: Generate your own shots

Describe a task you do regularly. Ask the model to generate few-shot examples for it, review and fix them, then use the prompt on a real task.
