Cache Eviction Policies: LRU, FIFO, LFU, and the Principle of Locality
The Eviction Problem 🗑️
Scenario: Your cache is full, but you need to add new data.
The fridge joke:
- How do you put an elephant in the fridge? Open it, put the elephant in, close it.
- How do you put a giraffe in the fridge? Open it, take the elephant out, put the giraffe in, close it.
Eviction = Forceful removal of existing data to make space for new data (like a landlord evicting a tenant who doesn't pay rent).
Common Eviction Policies 📋
Standard eviction policies (from DSA):
LRU (Least Recently Used): Hash map + doubly linked list (O(1) lookup and eviction)
FIFO (First In, First Out): Queue implementation
LIFO (Last In, First Out): Stack implementation
LFU (Least Frequently Used): Frequency-based priority queue
MRU (Most Recently Used): Reverse of LRU; evicts the most recently used item first
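As a sketch, LRU maps nicely onto Python's `OrderedDict`, which bundles the hash map and the doubly linked list into one structure (the class and method names here are illustrative, not a standard API):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least recently used key when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()  # order of keys tracks recency of use

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used key

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now the most recently used key
cache.put("c", 3)      # cache is full, so "b" (least recently used) is evicted
print(cache.get("b"))  # → None
print(cache.get("a"))  # → 1
```

Every `get` and `put` is O(1), which is why this hash-map-plus-linked-list shape is the standard LRU implementation.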
Data Locality Principles 📍
Temporal Locality (Time-Based)
Principle: If data is accessed now, it will likely be accessed again soon.
Example: You open a file → High probability you'll open it again within 5-15 minutes.
LRU is based on temporal locality: Keep recently-accessed items, evict old items.
Spatial Locality (Physical Proximity)
Principle: If you access element D, you'll likely access nearby elements (C, E, F) next.
Example: Array [A, B, C, D, E, F, G, H]
- Just accessed D
- High probability next access: C, E, or F
- Low probability next access: A or H
Application: Memory paging uses spatial locality (load adjacent memory pages).
Limitation: Spatial locality isn't always definable (e.g., non-sequential data structures). Temporal locality is universally applicable.
Choosing an Eviction Policy: The Simple Answer 🎯
For 99% of use cases: Use LRU. Don't overthink it.
The Decision Framework
Step 1: Use LRU by default (no thinking required)
Step 2: Optimize everything else in your system first
- Database queries
- Network latency
- Code performance
- Infrastructure scaling
Step 3: Only after exhausting all other optimizations, consider alternative eviction policies
Step 4: Don't guess which policy is best. Benchmark all of them with production data.
Why Not Guess?
Bad approach: "My use case is X, so policy Y should work best" (assumptions lead to poor decisions)
Good approach:
- Implement LRU, FIFO, LFU
- Benchmark with real traffic
- Choose based on data, not assumptions
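A minimal sketch of that benchmark-don't-guess approach, with a tiny hand-built trace standing in for real production traffic (the function names are illustrative):

```python
from collections import OrderedDict, deque

def lru_hit_count(trace, capacity):
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # refresh recency on a hit
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits

def fifo_hit_count(trace, capacity):
    cache = set()
    order = deque()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1                      # FIFO never refreshes on a hit
        else:
            cache.add(key)
            order.append(key)
            if len(cache) > capacity:
                cache.discard(order.popleft())  # evict oldest insertion
    return hits

trace = ["a", "b", "a", "c", "b"]
print(lru_hit_count(trace, 2))   # → 1
print(fifo_hit_count(trace, 2))  # → 2
```

Even this five-access trace shows why guessing fails: FIFO beats LRU here, which few people would predict from theory alone. Replay your real traffic through each policy and let the hit rates decide.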
Analogy: Touching a hot pan → You pull your hand back immediately, without deliberating. Reaching for LRU should be the same kind of reflex.
Eviction vs. Invalidation: Key Differences 🔄
Many developers confuse these two concepts:
Eviction (Space Problem)
Trigger: Cache is full
Solution: Remove least important data to free up space
Criteria: Usage patterns (LRU, LFU, etc.)
Example: Refrigerator is full → Remove eggs to add apples
Invalidation (Freshness Problem)
Trigger: Data updated in database
Solution: Mark cached data as invalid/stale
Criteria: Data consistency requirements
Example: Rotten eggs in fridge → Throw them out immediately
Both problems exist simultaneously. Any production cache system must implement both eviction and invalidation strategies.
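A sketch of a cache that handles both concerns at once, assuming LRU for eviction and a fixed TTL for invalidation (the explicit `now` parameter is an assumption made so the example is testable without sleeping):

```python
from collections import OrderedDict

class TtlLruCache:
    """LRU eviction (space problem) + TTL invalidation (freshness problem)."""

    def __init__(self, capacity: int, ttl_seconds: float):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.data = OrderedDict()  # key -> (value, stored_at)

    def get(self, key, now: float):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if now - stored_at > self.ttl:  # invalidation: the data went stale
            del self.data[key]
            return None
        self.data.move_to_end(key)      # eviction bookkeeping: refresh recency
        return value

    def put(self, key, value, now: float):
        self.data[key] = (value, now)
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:   # eviction: out of space
            self.data.popitem(last=False)    # drop least recently used

cache = TtlLruCache(capacity=2, ttl_seconds=10)
cache.put("x", 1, now=0.0)
print(cache.get("x", now=5.0))   # → 1 (fresh, within TTL)
print(cache.get("x", now=20.0))  # → None (invalidated: too old)
```

Note the two removal paths are independent: `popitem` fires because space ran out, the TTL check fires because the data aged out, and a production cache needs both.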
The House Analogy 🏠
Eviction (No Longer Worth Keeping):
- Tenant stops paying rent → Landlord evicts them (the space is needed for a paying tenant)
Invalidation (No Longer Valid):
- Tenant dies or moves away → The rent agreement itself is void (the data is no longer valid)
When to Use Alternatives to LRU 🤔
Rarely. If you've optimized everything else and still need performance gains:
- Implement all policies (LRU, FIFO, LFU, MRU)
- Benchmark with production traffic
- Measure cache hit rates for each policy
- Choose based on empirical data
Never assume which policy will work best based on theory alone. Real-world access patterns often defy expectations.
Why LRU Wins 🏆
Temporal locality is universal:
- Recently accessed data is likely to be accessed again
- Works well for most application patterns
- Simple to implement and understand
- Predictable performance
Edge cases where LRU might not be optimal:
- Circular access patterns (A, B, C, A, B, C...)
- Sequential scans of large datasets
- Purely random access patterns
But even in these cases: LRU is "good enough" for 99% of applications.
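The circular-pattern edge case is easy to demonstrate. In this sketch (the `evict_mru` flag is illustrative), LRU scores zero hits on a cyclic scan that slightly exceeds capacity, while MRU does noticeably better:

```python
from collections import OrderedDict

def hit_count(trace, capacity, evict_mru=False):
    """Replay a trace; evict from the LRU end, or the MRU end if evict_mru."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)          # refresh recency
        else:
            if len(cache) >= capacity:
                cache.popitem(last=evict_mru)  # last=True evicts the MRU end
            cache[key] = True
    return hits

trace = list("abc") * 4  # circular access pattern: a, b, c, a, b, c, ...

print(hit_count(trace, 2))                  # LRU → 0 hits
print(hit_count(trace, 2, evict_mru=True))  # MRU → 5 hits
```

With capacity 2 and a cycle over 3 keys, LRU always evicts exactly the key it will need next. This is the kind of pathological pattern worth knowing about, even though, as above, LRU remains the right default.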
Next: Cache invalidation policies — TTL, lazy deletion, and consistency models.