Consistency Models: Strong vs. Eventual Consistency in Caching

March 5, 2026 · 4 min read
system design, high level design, HLD, distributed systems, scalability, microservices, load balancing, caching, database design, API design, software architecture

Understanding Consistency 🔄

Core question: When you read data from a cache, can it be stale?

Three Types of Consistency

1. Strong Consistency (Immediate Consistency)

Definition: Every read is guaranteed to return the latest data. No stale reads are possible.

Characteristics:

  • All replicas/caches are updated synchronously, before the write is acknowledged
  • Zero window between a write completing and the new value becoming visible
  • Readers never need to wait or retry for data to converge

Example: Bank Transactions

  • You spend ₹10
  • Query your balance immediately
  • Bank returns updated balance (not the old one)
  • Stale balance is never served
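The bank example can be sketched as a store that fans a write out to every replica before acknowledging it. This is a minimal illustration, not a real banking API; `StronglyConsistentStore` and its plain-dict "replicas" are invented for the sketch.

```python
# Sketch: strong consistency. A write is acknowledged only after EVERY
# replica is updated, so no read can ever observe a stale value.

class StronglyConsistentStore:
    def __init__(self, replica_count=3):
        # each dict stands in for one cache/DB replica node
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key, value):
        # synchronous fan-out: update ALL replicas before returning
        for replica in self.replicas:
            replica[key] = value  # in reality, a blocking network call

    def read(self, key, replica_id=0):
        # any replica serves the latest value; stale reads are impossible
        return self.replicas[replica_id].get(key)

store = StronglyConsistentStore()
store.write("balance", 100)
store.write("balance", 90)   # you spend ₹10
# every replica immediately agrees on the new balance
assert all(store.read("balance", i) == 90 for i in range(3))
```

The cost of this guarantee is write latency: the writer blocks until every replica confirms.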

2. Eventual Consistency

Definition: Data becomes consistent after some delay. Temporary inconsistencies are possible.

Characteristics:

  • Writes propagate to all replicas/caches asynchronously
  • Short window where different users see different data
  • Eventually (after sync completes), all users see the same data

3. Inconsistent (No Guarantee)

Definition: It's possible to read stale values indefinitely. No consistency guarantees.
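The difference between the first two models can be made concrete with a deterministic sketch: writes land on one replica immediately and reach the others only when a simulated background sync runs. `EventuallyConsistentStore` and its `sync()` step are illustrative names, not a real library.

```python
# Sketch: eventual consistency. The primary replica is updated right
# away; the others catch up later, during a background sync.

class EventuallyConsistentStore:
    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # writes not yet propagated to other replicas

    def write(self, key, value):
        self.replicas[0][key] = value      # primary updated immediately
        self.pending.append((key, value))  # others updated "eventually"

    def read(self, key, replica_id):
        return self.replicas[replica_id].get(key)

    def sync(self):
        # the background replication catching up: the "eventually" part
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()

store = EventuallyConsistentStore()
store.write("post", "edited")
print(store.read("post", 0))  # 'edited' — fresh replica
print(store.read("post", 1))  # None — stale replica, sync hasn't run yet
store.sync()
print(store.read("post", 1))  # 'edited' — all replicas have converged
```

Between `write()` and `sync()` there is a window where two users reading different replicas see different data, which is exactly the LinkedIn behavior described next.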

Live Example: LinkedIn Post Inconsistency 🧪

The Experiment

Setup:

  1. Create a LinkedIn post: "Do you see a post?"
  2. Edit the post to: "This post has now been edited"
  3. Share the URL with users after editing
  4. Ask: Do you see the original or edited version?

Results

Some users: See edited post (latest version)

Other users: See original post (stale data)

Critical observation: The URL was shared after the edit was saved. Yet some users still saw the old version.

Why This Happens: Distributed Local Cache

LinkedIn's architecture:

User 1 → Load Balancer → App Server A (updated cache) → Sees edited post
User 2 → Load Balancer → App Server B (stale cache) → Sees original post

Explanation:

  • LinkedIn uses distributed local caches (each app server has its own cache)
  • User 1's request routes to a server with the updated cache
  • User 2's request routes to a different server with stale cache
  • After refresh, all servers sync and everyone sees the edited post

Eventual Consistency in Action

Initial state: Some users see old data, some see new data

After 5-10 seconds: All users see the new data

This is NOT permanent inconsistency, just temporary lag while caches sync.
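The two-server scenario above can be simulated with a round-robin load balancer in front of app servers that each keep a local cache. `AppServer` and `LoadBalancer` are illustrative names for the sketch, not LinkedIn's actual components.

```python
# Sketch of the LinkedIn scenario: each app server has its OWN local
# cache, and the load balancer decides which server a request hits.

class AppServer:
    def __init__(self, name):
        self.name = name
        self.local_cache = {}

    def get_post(self, post_id, database):
        if post_id in self.local_cache:       # serve whatever we cached,
            return self.local_cache[post_id]  # even if it is now stale
        value = database[post_id]
        self.local_cache[post_id] = value
        return value

class LoadBalancer:
    def __init__(self, servers):
        self.servers = servers
        self.next = 0

    def route(self):  # simple round-robin routing
        server = self.servers[self.next % len(self.servers)]
        self.next += 1
        return server

database = {"post-1": "Do you see a post?"}
servers = [AppServer("A"), AppServer("B")]
lb = LoadBalancer(servers)

lb.route().get_post("post-1", database)   # server A caches the original
database["post-1"] = "This post has now been edited"

# After the edit, users get different answers depending on routing:
print(lb.route().get_post("post-1", database))  # B: edited (miss → DB)
print(lb.route().get_post("post-1", database))  # A: original (stale cache)
```

Once server A's cache entry expires or is refreshed, both servers converge on the edited post: eventual consistency.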

Typical Cache Read-Write Flow 📝

Architecture:

Client ↔ Application Server ↔ Cache Server ↔ Database Server

Write Operation (Bypasses Cache)

Standard behavior: Writes go directly to the database, not to the cache.

  1. Client sends write request (update value A)
  2. Application server writes to database (bypasses cache)
  3. Cache is NOT updated

Why bypass the cache? Multiple reasons:

  • Simpler logic (single source of truth)
  • Avoids cache-database sync complexity
  • We'll cover alternative strategies (write-through, write-back) in advanced topics

Read Operation: Cache Miss

Scenario: Data was just written to the database, but not in the cache.

  1. Client sends read request (get value A)
  2. Application server checks cache
  3. Cache returns 404 (key not found) → CACHE MISS
  4. Application server fetches from database
  5. Application server returns data to client
  6. Application server updates cache asynchronously (background task)

Result: Next read will be a cache hit.

Read Operation: Cache Hit

Scenario: Data exists in the cache.

  1. Client sends read request (get value A)
  2. Application server checks cache
  3. Cache returns value → CACHE HIT
  4. Application server returns data to client (does NOT check database)

Critical point: We trust the cache blindly. Even if the cache has stale data, we serve it. The database is not consulted during a cache hit.
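The whole flow fits in a few lines. This is a minimal sketch of the pattern described above: writes bypass the cache, a miss populates it, and a hit is served without consulting the database. For simplicity the cache is populated synchronously here, where the article describes a background task; the names `read`/`write` and the plain dicts are illustrative.

```python
# Sketch of the read/write flow: write bypasses cache, read is
# cache-first with a DB fallback that populates the cache.

database = {}
cache = {}

def write(key, value):
    database[key] = value   # write goes straight to the database;
                            # the cache is deliberately NOT touched

def read(key):
    if key in cache:        # CACHE HIT: trust the cache blindly,
        return cache[key]   # even if the value is stale
    value = database[key]   # CACHE MISS: fall back to the database
    cache[key] = value      # populate so the next read is a hit
    return value

write("A", 1)
print(read("A"))   # 1 — miss, fetched from DB, cache populated
write("A", 2)      # DB now holds 2, but the cache still holds 1
print(read("A"))   # 1 — hit, stale value served, DB never consulted
```

The second read demonstrates the critical point: a cache hit short-circuits the database entirely, which is exactly where staleness comes from.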

Why Caches Should Be Dumb 🚫

Question: When the cache gets a request and realizes it doesn't have the data, why can't it fetch from the database itself?

Answer: This approach is used by CDNs, but not recommended for backend caches.

The Problem: Caches Become Too Smart

If the cache automatically fetches from the database, the cache needs:

  1. SQL query logic: how to construct database queries
  2. Database knowledge: which tables exist, what schema they use
  3. Post-processing logic: how to transform raw data
  4. Authentication/authorization: which users can access what data
  5. Validation logic: data integrity checks
  6. Complex joins: multi-table aggregations

Result: You now have business logic in two places (app server AND cache server).

Why Dual Logic Is Dangerous

Scenario:

  • Developer updates data-fetching logic in app server
  • Developer forgets to update cache server logic
  • System breaks: the app server and cache server now behave differently

Problems:

  • Code duplication: you maintain two codebases
  • Synchronization bugs: logic diverges between the two systems
  • Violates the Single Responsibility Principle: a cache should cache, not process business logic

The Correct Approach: Dumb Cache

Principle: Cache should be dumb (simple key-value storage). App server should be smart (business logic).

Division of responsibility:

  • Cache: Store and retrieve data (no logic)
  • App server: Handle all business logic, data processing, validation, authorization

Benefits:

  • Single source of truth for business logic
  • Easier to maintain and debug
  • Clear separation of concerns
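The division of responsibility can be sketched as two classes: a cache that exposes only get/set/delete, and an app server that owns all business logic. `DumbCache`, `AppServer`, and `get_profile` are illustrative names for this sketch, not a specific product's API.

```python
# Sketch of the "dumb cache" split: the cache is plain key-value
# storage; the app server is where all the smarts live.

class DumbCache:
    """No queries, no schema knowledge, no validation. Just storage."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

class AppServer:
    """The smart side: authorization, DB access, post-processing."""
    def __init__(self, cache, database):
        self.cache = cache
        self.database = database

    def get_profile(self, user_id, requester_id):
        if requester_id != user_id:             # authorization lives here,
            raise PermissionError("forbidden")  # never in the cache
        cached = self.cache.get(user_id)
        if cached is not None:
            return cached
        profile = self.database[user_id]        # DB knowledge lives here too
        self.cache.set(user_id, profile)
        return profile

app = AppServer(DumbCache(), {"u1": {"name": "Asha"}})
print(app.get_profile("u1", "u1"))  # {'name': 'Asha'}
```

If the data-fetching rules change, only `AppServer` is edited; the cache's three-method interface never needs to know.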

Next: Write-back cache strategy for high-throughput systems like YouTube and gaming platforms.