Write-Back Cache Strategy: High-Throughput Systems with Acceptable Data Loss
What Is a Write-Back Cache? 🔄
Definition: Writes go to the cache first, then asynchronously flush to the database later.
Flow:
- Client sends write request
- App server writes to cache (fast)
- App server returns success to client immediately
- Cache periodically dumps data to database (background process)
Key characteristic: Database writes are deferred (not immediate).
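The flow above can be sketched in a few lines. This is a hypothetical minimal implementation, not a production cache: the `db` object and its bulk `write()` method are assumed for illustration.

```python
import threading
import time

class WriteBackCache:
    """Minimal write-back sketch: writes land in memory and are
    flushed to the database in the background (assumed db.write())."""

    def __init__(self, db, flush_interval=30):
        self.db = db                    # assumed: object with a bulk write(dict) method
        self.flush_interval = flush_interval
        self.dirty = {}                 # key -> latest value, not yet persisted
        self.lock = threading.Lock()

    def write(self, key, value):
        # Steps 1-3: write to cache and report success immediately.
        with self.lock:
            self.dirty[key] = value     # the database has not seen this yet

    def flush(self):
        # Step 4: background dump of accumulated writes to the database.
        with self.lock:
            batch, self.dirty = self.dirty, {}
        if batch:
            self.db.write(batch)        # one bulk write instead of many small ones

    def run_flusher(self):
        # Background loop; a crash between flushes loses the pending batch.
        while True:
            time.sleep(self.flush_interval)
            self.flush()
```

Note that `flush()` swaps the dirty map out under the lock and writes the batch outside it, so client writes are never blocked on database I/O.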
When to Use Write-Back Cache ✅
Ideal for: High-throughput systems where occasional data loss is acceptable.
Two Key Requirements
- Extremely high write volume — Millions of writes per second
- 1-2% data loss tolerance — Losing some data is not catastrophic
Use Case 1: View Counting on YouTube/Twitter 👁️
The Problem
Scenario: Billions of users watching videos on YouTube.
Challenge:
- Millions of view requests per second
- Writing each view to disk (database) is too slow
- Need extremely high throughput
The Write-Back Solution
Implementation:
- User watches a video → Increment view counter in cache (RAM)
- Cache accumulates views in memory
- Every 30 seconds, dump aggregated count to database
- If cache crashes before dump, some views are lost
Example:
Cache: Video X views = 1,523,891 (in memory)
Database: Video X views = 1,520,000 (last dump 30 seconds ago)
[Cache crashes before next dump]
Result: 3,891 views lost
Why This Is Acceptable
Question: Is losing 1-2% of views catastrophic?
Answer: No, because we care about trends, not individual data points.
Rationale:
- Overall trend remains accurate (millions of views)
- Losing 3,891 out of 1,523,891 views ≈ 0.26% error
- Nobody notices or cares about this margin
- Benefit: Massively higher throughput (can handle millions of requests/second)
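The view-counting scheme above can be sketched as a counter that accumulates deltas in RAM and only touches the database on a periodic dump. A hedged sketch: the `db.increment(video_id, delta)` method is an assumption for illustration.

```python
from collections import Counter

class ViewCounter:
    """Views accumulate in RAM; only aggregated deltas reach the
    database on each periodic dump (assumed db.increment method)."""

    def __init__(self, db):
        self.db = db              # assumed: exposes increment(video_id, delta)
        self.pending = Counter()  # in-memory view deltas since the last dump

    def record_view(self, video_id):
        self.pending[video_id] += 1   # hot path: pure RAM, no disk I/O

    def dump(self):
        # Called every ~30 seconds; a crash before this loses self.pending.
        for video_id, delta in self.pending.items():
            self.db.increment(video_id, delta)
        self.pending.clear()
```

A million `record_view` calls thus collapse into one database increment per video per dump interval, which is where the throughput win comes from.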
Server Crash Frequency
How often do servers crash? Approximately once per year (for well-maintained systems).
Impact: If cache dumps to database every 30 seconds, and the server crashes once per year, you lose at most 30 seconds of view data — negligible in the grand scheme.
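The "negligible" claim is easy to check with back-of-the-envelope arithmetic: one crash per year times a 30-second flush window bounds the lost fraction of a year's data.

```python
# Worst case: the server crashes right before a dump, losing one full window.
SECONDS_PER_YEAR = 365 * 24 * 3600   # 31,536,000
flush_window = 30                    # seconds of un-flushed data at risk

loss_fraction = flush_window / SECONDS_PER_YEAR
print(f"{loss_fraction:.8f}")        # under one millionth of the year's data
```

Even at one crash per year, the expected loss is well below the stated 1-2% tolerance.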
Use Case 2: Multiplayer Gaming (PUBG, CS:GO, Dota) 🎮
The Problem
Scenario: 10-100 players in a live game match.
Challenge:
- Real-time game state (player positions, health, ammo, kills)
- Extremely high write frequency (every player action)
- Writing every action to database is too slow
The Write-Back Solution
Architecture:
10 Players → App Server (local cache stores game state) → Database
Implementation:
- All players in a match connect to one app server
- App server stores entire game state in local cache (RAM)
- During the match: All updates stay in cache (no database writes)
- Match ends successfully: Dump final statistics to database
  - Winner
  - Kill count per player
  - Death count
  - MVP
  - Cheater detection results
- Server crashes mid-match: Game is lost (cannot resume)
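The implementation steps above can be sketched as a match-state object that lives entirely in the app server's RAM and writes a single summary at the end. A hypothetical sketch: the `db.save_match` method and the summary fields are assumptions.

```python
class MatchState:
    """Entire match lives in the app server's RAM; only the final
    summary is persisted when the match ends (assumed db.save_match)."""

    def __init__(self, db, match_id):
        self.db = db                 # assumed: exposes save_match(match_id, summary)
        self.match_id = match_id
        self.kills = {}              # player -> kills, cache-only during play
        self.deaths = {}             # player -> deaths, cache-only during play

    def record_kill(self, killer, victim):
        # During the match: updates touch RAM only, never the database.
        self.kills[killer] = self.kills.get(killer, 0) + 1
        self.deaths[victim] = self.deaths.get(victim, 0) + 1

    def finish(self, winner):
        # Match ended successfully: one write with the final statistics.
        summary = {"winner": winner, "kills": self.kills, "deaths": self.deaths}
        self.db.save_match(self.match_id, summary)
        # If the server crashes before finish() runs, the match is simply lost.
```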
Why This Is Acceptable
Question: If the server crashes and all game data is lost, isn't that bad?
Answer: Yes, but it's acceptable for this use case.
Rationale:
- Server crashes are rare (once per year)
- If a crash happens, players simply start a new game
- Users accept this risk in exchange for smooth, lag-free gameplay
- Alternative (writing every action to database) would make the game unplayable due to latency
Not stored in database during match:
- Every bullet fired
- Every player movement
- Every health/ammo change
Stored in database after match:
- Final game metadata (winner, kills, deaths, MVP)
- Game recording (for replay/review)
Write-Back Cache: Key Takeaways 🎯
When to Use
- High-throughput systems (millions of writes/second)
- Data loss of 1-2% is acceptable
- Trend accuracy matters more than individual precision
Examples:
- View counts (YouTube, Twitter, Instagram)
- Like counts (social media)
- Analytics dashboards
- Gaming sessions
- Leaderboards (with periodic updates)
When NOT to Use
- Financial transactions (bank transfers, payments)
- Critical user data (account passwords, personal information)
- Legal records (contracts, compliance data)
- Medical records (patient data)
- Any system where data loss is unacceptable
Trade-off Summary
Gain:
- Massive throughput increase (10-100x faster writes)
- Near-zero write latency for users
Cost:
- Risk of data loss on cache failure
- Eventual consistency (data may be stale until next flush)
Advanced Topic: Bloom Filters for Non-Existent Keys 🔍
Problem: What if users frequently request keys that don't exist in the database?
Scenario:
- User requests non-existent key
- Cache miss → Check database
- Database doesn't have the key either
- Return empty result
- This happens repeatedly → Database gets overwhelmed with useless queries
Solution: Bloom filters provide a fast containment check before querying the database.
How Bloom filters work:
- Probabilistic data structure
- Quickly answers: "Is this key definitely NOT in the database?"
- If Bloom filter says "not present" → Skip database query entirely
- If Bloom filter says "maybe present" → Check database
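The behavior described above can be sketched with a toy Bloom filter. This is an illustrative sketch (fixed bit array, positions derived from one SHA-256 digest), not a tuned implementation; the sizing parameters are arbitrary.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: 'not present' answers are definite,
    'maybe present' answers can be false positives (never false negatives)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, key):
        # Derive several bit positions from slices of one digest.
        digest = hashlib.sha256(key.encode()).digest()
        for i in range(self.num_hashes):
            chunk = int.from_bytes(digest[i * 4:(i + 1) * 4], "big")
            yield chunk % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False => definitely absent, so the database query can be skipped.
        return all(self.bits[pos] for pos in self._positions(key))
```

In the cache-miss path, a `might_contain(key) == False` result lets the server return "not found" without ever touching the database.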
Note: Bloom filters will be covered in detail during the NoSQL internals lecture. For now, understand they optimize cache misses for non-existent keys.
Next: CDN infrastructure deep dive — ISP partnerships and redundancy architecture.