Backend Caching: Global vs. Local Cache Architectures

March 5, 2026 · 3 min read
Tags: system design · high level design · HLD · distributed systems · scalability · microservices · load balancing · caching · database design · API design · software architecture

Where Backend Caches Fit ⚙️


Architecture layers:

```
Client
  → DNS
  → CDN (for static content)
  → Load Balancer
  → Application Servers (stateless)
  → [CACHE LAYER]        ← Backend cache sits here
  → Database Servers
```

Why backend caching matters:

Application servers are stateless (they don't store data). Every data request goes to the database, which is slow. Adding a cache layer between application servers and databases dramatically improves performance.

Type 1: Global Cache 🌐

Architecture:

  • Cache exists as a separate microservice
  • Sits between application servers and database
  • Shared across all application servers (hence "global")

Example: Redis cluster

All app servers can read/write to the same cache instance. This provides consistency — if Server A caches a value, Server B can retrieve it immediately.

Code Example: Global Cache (Redis)

```python
def get_user_preferences(request):
    user_id = request.user_id

    # Check global cache (Redis)
    preferences = RedisClient.get_key(f"preferences:{user_id}")
    if preferences is None:
        # Cache miss - fetch from database
        preferences = SQLClient.get_preferences(user_id)
        # Update global cache
        RedisClient.set_key(f"preferences:{user_id}", preferences)
    return preferences
```

How it works: All app servers access the same Redis cluster. If Server A caches data, Server B can retrieve it immediately.

Type 2: Local Cache 💾

Architecture:

  • Cache exists inside each application server
  • Each server has its own private cache (RAM or disk)
  • Not shared between servers

Storage location:

  • In-memory (RAM) — fastest
  • On-disk (hard drive) — slower but persistent

Key characteristic: Each app server has a private cache. Server A cannot access Server B's cache. This can make servers somewhat stateful (storing temporary data).

Code Example: Local Cache (In-Memory Dictionary)

```python
# Local cache stored in app server RAM
user_preferences_cache = {}

def get_user_preferences(request):
    user_id = request.user_id

    # Check local cache (dictionary in RAM)
    if user_id in user_preferences_cache:
        return user_preferences_cache[user_id]

    # Cache miss - fetch from database
    preferences = SQLClient.get_preferences(user_id)

    # Update local cache
    user_preferences_cache[user_id] = preferences
    return preferences
```

How it works: Each app server maintains its own dictionary in RAM. Data cached by Server A is not accessible to Server B.
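One caveat with the plain dictionary above: it grows without bound. A common refinement is to cap the cache size and evict the least-recently-used entry when full. A minimal sketch using Python's `OrderedDict` (the `LRUCache` class and keys here are illustrative, not from the original example):

```python
from collections import OrderedDict

class LRUCache:
    """Bounded local cache: evicts the least-recently-used entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None          # cache miss
        self._data.move_to_end(key)   # mark as recently used
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the coldest entry

# Usage: same cache-aside pattern as the dictionary example above
cache = LRUCache(capacity=2)
cache.set("user:1", {"theme": "dark"})
cache.set("user:2", {"theme": "light"})
cache.get("user:1")                      # touch user:1 so it stays warm
cache.set("user:3", {"theme": "auto"})   # evicts user:2, the coldest entry
print(cache.get("user:2"))  # → None (evicted)
```

In production you would more likely reach for `functools.lru_cache` or a library like `cachetools`, but the eviction logic is the same idea.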

What Data Gets Cached? 📊

Two criteria for caching:

  1. Frequently accessed data — Data requested often (e.g., user profiles, product catalogs)
  2. Expensive-to-compute data — Data requiring complex joins or heavy computation

Example: LinkedIn Profile Page

A single profile page aggregates data from multiple database tables:

  • user_education — University, degrees
  • user_location — Geographic data
  • user_connections — Network size
  • user_profile_views — View count
  • user_posts — Recent activity
  • user_skills — Endorsements

Loading this page requires joining 50+ tables, which is computationally expensive.

Solution: Cache the entire aggregated profile data to avoid repeated expensive queries.
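Expensive aggregated data like this is usually cached with a time-to-live (TTL), trading a few minutes of staleness for skipping the join. A self-contained sketch of that idea (the `aggregate_profile_from_db` stub and the 5-minute TTL are hypothetical stand-ins, not the real LinkedIn implementation):

```python
import time

def aggregate_profile_from_db(user_id):
    """Stand-in for the expensive multi-table join."""
    return {"user_id": user_id, "education": [], "skills": []}

profile_cache = {}           # user_id -> (expires_at, profile)
PROFILE_TTL_SECONDS = 300    # tolerate up to 5 minutes of staleness

def get_profile(user_id):
    entry = profile_cache.get(user_id)
    if entry is not None:
        expires_at, profile = entry
        if time.time() < expires_at:
            return profile   # cache hit: the expensive join is skipped

    # Cache miss or expired entry: run the expensive aggregation once
    profile = aggregate_profile_from_db(user_id)
    profile_cache[user_id] = (time.time() + PROFILE_TTL_SECONDS, profile)
    return profile
```

With a real global cache you would store the serialized profile in Redis with an expiry instead of a local dict, but the cache-aside-with-TTL pattern is identical.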

Performance Comparison: Local vs. Global Cache ⚡

Which is faster?

Local cache wins because:

  • RAM access: ~100 nanoseconds
  • SSD access: ~100 microseconds
  • Network round trip (to Redis): ~1 millisecond

Accessing local RAM/disk is orders of magnitude faster than fetching from a network-based cache like Redis.

Single vs. Distributed Caches 🏗️

Single Global Cache

Use case: Small datasets or low request volume

Architecture:

App Servers → Single Redis Server → Database

One Redis instance handles all cache requests. Sufficient when:

  • Cached data fits in one server's memory
  • Request volume doesn't overwhelm a single server
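A quick back-of-envelope check tells you whether a single cache server is enough. The numbers below are hypothetical; plug in your own entry count and entry size:

```python
# Does the cached dataset fit in one Redis server's memory?
num_entries = 10_000_000        # e.g., 10 M cached user profiles (assumed)
bytes_per_entry = 2 * 1024      # ~2 KB serialized value + key overhead (assumed)

total_bytes = num_entries * bytes_per_entry
total_gb = total_bytes / (1024 ** 3)
print(f"Estimated cache size: {total_gb:.1f} GB")  # → Estimated cache size: 19.1 GB
```

Under these assumptions the working set fits comfortably on one 32 GB server; if your estimate exceeds a single server's RAM, you need the distributed setup described next.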

Distributed Global Cache

Use case: Large datasets or high request volume

Architecture:

App Servers → Load Balancer → Redis Cluster (multiple servers) → Database

Multiple Redis servers behind a load balancer. Required when:

  • Cached data exceeds single server capacity
  • Request volume requires horizontal scaling
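To spread keys across multiple cache servers, each key must map deterministically to one node. The simplest scheme is hash-mod sharding; a minimal sketch (the node names are hypothetical placeholders):

```python
import hashlib

# Hypothetical node list -- in production these would be your Redis endpoints.
CACHE_NODES = ["redis-0:6379", "redis-1:6379", "redis-2:6379"]

def node_for_key(key: str) -> str:
    """Hash-mod sharding: every key maps to exactly one cache node."""
    digest = hashlib.md5(key.encode()).hexdigest()
    index = int(digest, 16) % len(CACHE_NODES)
    return CACHE_NODES[index]

print(node_for_key("preferences:42"))  # always routes to the same node
```

The drawback: adding or removing a node changes `len(CACHE_NODES)`, remapping most keys at once and causing a flood of cache misses. That problem is what consistent hashing solves.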

Local Cache Distribution

Question: Can local caches be single or distributed?

Answer: Local caches are always distributed by default.

Why?

  • Production systems always run multiple app servers (for redundancy)
  • If you only have one app server and it crashes, the entire service goes down
  • Each app server has its own local cache
  • Therefore, local caches are inherently distributed (though each cache is private)

Next: Load balancing algorithms for distributed caches — consistent hashing vs. round-robin.