Load Balancing Architecture and Service Discovery
Load Balancing Architecture ⚖️
The IP Address Problem
Single server scenario:
- DNS maps domain → single IP
- User connects directly to server
500 server scenario:
- Each server has unique IP address
- Which IP does DNS return?
- If DNS returns one IP, that server gets all traffic (defeats purpose of horizontal scaling)
Load Balancer Solution 🎯
Architecture:
[Client] → [Load Balancer] → [Backend Server Pool (499 servers)]

Load balancer implementation:
- Designate one machine from pool as load balancer
- Install load balancing software (Nginx, Kong, HAProxy)
- All client traffic routes through load balancer
- Load balancer distributes requests to backend servers
DNS configuration:
- DNS maps domain → load balancer IP only
- Backend server IPs remain internal/hidden
- Clients only know load balancer's address
Load Balancer Responsibilities 📜
1. Abstraction (Unified View)
- Present single system interface to clients
- Hide distributed architecture complexity
- Clients unaware of multiple backend servers
2. Load Distribution
- Distribute requests evenly across servers
- Prevent individual server overload
- Maintain balanced utilization across pool
Key distinction: A router forwards traffic blindly; a load balancer distributes it intelligently based on server capacity and load.
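The simplest distribution strategy is round-robin: cycle through the pool so each server receives every Nth request. A minimal sketch (the class name and server IPs are illustrative, not from any real balancer):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hand out backend servers in strict rotation."""

    def __init__(self, servers):
        self._pool = cycle(servers)  # endless iterator over the pool

    def pick(self):
        """Return the backend that should handle the next request."""
        return next(self._pool)

lb = RoundRobinBalancer(["192.168.1.5", "192.168.1.6", "192.168.1.7"])
assignments = [lb.pick() for _ in range(6)]
# With 3 servers, each one receives every third request.
```

Real balancers (Nginx, HAProxy) offer smarter policies too, such as least-connections or weighted distribution, but round-robin is the default baseline.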
Service Discovery: Tracking Available Servers 🕵️
The Discovery Problem
Load balancer must know:
- ✅ Which servers exist
- ✅ Which servers are healthy (operational)
- ✅ Which servers are failed/crashed
Challenge: Servers can fail at any time. Load balancer requires real-time awareness of backend pool health.
Solution #1: Heartbeat Mechanism (Push) 💓
Concept: Servers actively report their status to load balancer.
Implementation:
- Load balancer exposes endpoint: /heartbeat
- Each backend server periodically sends status (e.g., every 5 seconds):
POST /heartbeat
{
  "server_ip": "192.168.1.5",
  "status": "alive"
}
- Load balancer maintains list of active servers
- If server misses multiple consecutive heartbeats → marked as failed
- Failed servers removed from routing pool
Advantages:
- ✅ Real-time health awareness
- ✅ Failed servers automatically removed
- ✅ No traffic routed to dead servers
Pattern: Push mechanism (servers push status to load balancer)
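The load-balancer side of the heartbeat scheme reduces to bookkeeping: record when each server last reported, and drop any server silent for too many intervals. A minimal in-memory sketch (the class name and the explicit `now` parameter for testability are my own additions; the 5-second interval follows the notes):

```python
import time

HEARTBEAT_INTERVAL = 5   # seconds between expected heartbeats
MISSED_LIMIT = 3         # consecutive misses before a server is marked failed

class HeartbeatRegistry:
    """Load-balancer side: record incoming heartbeats, expire silent servers."""

    def __init__(self):
        self._last_seen = {}  # server_ip -> timestamp of last heartbeat

    def heartbeat(self, server_ip, now=None):
        """Handle a POST /heartbeat from a backend server."""
        self._last_seen[server_ip] = time.time() if now is None else now

    def alive_servers(self, now=None):
        """Servers that have reported within MISSED_LIMIT intervals."""
        now = time.time() if now is None else now
        deadline = HEARTBEAT_INTERVAL * MISSED_LIMIT
        return [ip for ip, seen in self._last_seen.items()
                if now - seen <= deadline]
```

A server that misses three consecutive 5-second heartbeats (15 seconds of silence) drops out of `alive_servers()` and stops receiving traffic.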
Solution #2: Health Check Mechanism (Pull) 🩺
Concept: Load balancer actively queries server health (pull approach).
Implementation:
- Load balancer periodically polls each server
- Servers expose health check endpoint: /health
GET http://192.168.1.5/health
- Server responds: 200 OK (healthy)
- Timeout or error after multiple attempts → server marked as failed
Pattern: Pull mechanism (load balancer pulls status from servers)
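The pull side can be sketched with nothing but the standard library: issue GET /health with a timeout, and treat any error or timeout as a failed check. This is a simplified model, not a production health checker; `prune_pool` and its retry policy are my own illustration:

```python
from urllib.request import urlopen
from urllib.error import URLError

def check_health(base_url, timeout=2.0):
    """Return True if GET {base_url}/health answers HTTP 200 within timeout."""
    try:
        with urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Connection refused, DNS failure, or timeout: treat as unhealthy.
        return False

def prune_pool(pool, attempts=3):
    """Keep only servers that pass at least one of `attempts` checks."""
    return [server for server in pool
            if any(check_health(server) for _ in range(attempts))]
```

Running `prune_pool` on a schedule (e.g., every few seconds) gives the load balancer its current routing pool.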
Push vs Pull: Service Discovery Approaches 🤔
Heartbeat (Push):
- Servers actively report status to load balancer
- Real-time notification when server starts/fails
- Distributed responsibility across all servers
Health Check (Pull):
- Load balancer queries server status
- Centralized responsibility in load balancer
- Slightly delayed failure detection (polling interval)
Industry preference: Health checks are more common due to centralized maintenance and a simpler server-side implementation.
New Server Registration 🆕
Problem: How does load balancer discover newly added servers?
Solution: New server self-registers with load balancer:
POST http://load-balancer/register
{
  "server_ip": "192.168.1.250",
  "status": "ready"
}

Principle: Registration is the server's responsibility. The load balancer cannot autonomously detect new infrastructure; servers must announce their presence.
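From the new server's point of view, registration is one POST at startup. A sketch using only the standard library (the `/register` endpoint and JSON fields follow the notes; the load-balancer URL and function names are hypothetical):

```python
import json
from urllib.request import Request, urlopen

def registration_payload(server_ip):
    """Build the JSON body the load balancer expects."""
    return json.dumps({"server_ip": server_ip, "status": "ready"})

def register_with_balancer(lb_url, my_ip, timeout=5):
    """POST this server's details to {lb_url}/register; True on HTTP 200."""
    req = Request(f"{lb_url}/register",
                  data=registration_payload(my_ip).encode(),
                  headers={"Content-Type": "application/json"},
                  method="POST")
    with urlopen(req, timeout=timeout) as resp:
        return resp.status == 200
```

After a successful registration, the server enters the routing pool and is kept there (or removed) by the heartbeat or health-check mechanism.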
Load Balancer Performance: The 100× Advantage 💪
Application Server Workload
Responsibilities:
- Receive request
- Process through OSI layers (7 layers)
- Deserialize request
- Decrypt (if encrypted)
- Authorization checks
- Database queries
- Business logic execution
- Generate response
- Serialize response
- Encrypt response
- Send through OSI layers
Throughput: 100-1,000 requests/second
Load Balancer Workload
Responsibilities:
- Examine incoming request
- Select backend server
- Forward request
What load balancers DON'T do:
- ❌ Deserialization
- ❌ Decryption
- ❌ Authentication
- ❌ Database access
- ❌ Business logic
- ❌ Response generation
Throughput: 100,000+ requests/second
Performance ratio: Load balancers handle 100× more traffic than application servers due to minimal processing overhead.
OSI Model Context 🌐
7 Layers (bottom to top):
- Physical: Electrical signals, hardware
- Data Link: MAC addresses, frames
- Network: IP addresses, routing
- Transport: TCP/UDP, ports
- Session: Connection management
- Presentation: Encryption, data formatting
- Application: HTTP, web services
Load balancers can operate at the Transport layer or the Application layer; the lower the layer, the less per-request processing overhead.
Load Balancer OSI Layers
Layer 4 (Transport Layer):
- TCP/UDP routing
- Works with IP addresses and ports
- No application protocol awareness
Layer 7 (Application Layer):
- HTTP/HTTPS routing
- Application protocol awareness
- Can route based on URL paths, headers, cookies
Use case determines layer: Different operational layers for different requirements.
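What "application protocol awareness" buys at Layer 7 can be shown with a path-based routing sketch: the balancer inspects the URL path and selects a backend pool by longest-prefix match. The route table and IPs below are hypothetical:

```python
# Layer-7 routing: pick a backend pool based on the request path.
ROUTES = {
    "/api/":    ["10.0.1.1", "10.0.1.2"],  # API servers
    "/static/": ["10.0.2.1"],              # static-content servers
}
DEFAULT_POOL = ["10.0.3.1"]                # everything else

def route(path):
    """Return the backend pool for a request path (longest prefix wins)."""
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path.startswith(prefix):
            return ROUTES[prefix]
    return DEFAULT_POOL
```

A Layer-4 balancer cannot do this: it sees only IPs and ports, never the HTTP path, so it trades flexibility for lower overhead.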
Key Takeaways 💡
- Load balancers are essential for distributed architectures.
- Service discovery is critical.
- Push (heartbeat) and pull (health check) both solve service discovery; health checks are more common in practice.
- New servers must self-register.
- Load balancers handle 100× more traffic than application servers.
- OSI layer choice matters.