Load Balancing Architecture and Service Discovery

March 5, 2026 · 4 min read

Tags: system design, high level design, HLD, distributed systems, scalability, microservices, load balancing, caching, database design, API design, software architecture

Load Balancing Architecture ⚖️

The IP Address Problem

Single server scenario:

  • DNS maps domain → single IP
  • User connects directly to server

500 server scenario:

  • Each server has unique IP address
  • Which IP does DNS return?
  • If DNS returns one IP, that server gets all traffic (defeats purpose of horizontal scaling)

Load Balancer Solution 🎯

Architecture:

[Client] → [Load Balancer] → [Backend Server Pool (499 servers)]

Load balancer implementation:

  1. Designate one machine from pool as load balancer
  2. Install load balancing software (Nginx, Kong, HAProxy)
  3. All client traffic routes through load balancer
  4. Load balancer distributes requests to backend servers

DNS configuration:

  • DNS maps domain → load balancer IP only
  • Backend server IPs remain internal/hidden
  • Clients only know load balancer's address

Load Balancer Responsibilities 📜

1. Abstraction (Unified View)

  • Present single system interface to clients
  • Hide distributed architecture complexity
  • Clients unaware of multiple backend servers

2. Load Distribution

  • Distribute requests evenly across servers
  • Prevent individual server overload
  • Maintain balanced utilization across pool

Key distinction: Router forwards traffic blindly. Load balancer intelligently distributes based on server capacity and load.
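The distribution step above can be sketched as a simple round-robin selector. This is a minimal illustration, not a production balancer; `BACKENDS` and the IPs in it are hypothetical placeholders for the backend pool:

```python
from itertools import cycle

# Hypothetical pool of backend server IPs (stands in for the 499 backends).
BACKENDS = ["192.168.1.5", "192.168.1.6", "192.168.1.7"]

class RoundRobinBalancer:
    """Cycles through backends so each gets an equal share of requests."""

    def __init__(self, backends):
        self._pool = cycle(backends)

    def pick(self):
        # Select the next backend in rotation; a real balancer would also
        # consult health and load data before forwarding the request.
        return next(self._pool)

lb = RoundRobinBalancer(BACKENDS)
print([lb.pick() for _ in range(4)])  # wraps back to the first backend
```

Round-robin is the simplest "intelligent" strategy; real load balancers layer weighting, least-connections, and health data on top of this rotation.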


Service Discovery: Tracking Available Servers 🕵️

The Discovery Problem

Load balancer must know:

  1. ✅ Which servers exist
  2. ✅ Which servers are healthy (operational)
  3. ✅ Which servers are failed/crashed

Challenge: Servers can fail at any time. Load balancer requires real-time awareness of backend pool health.

Solution #1: Heartbeat Mechanism (Push) 💓

Concept: Servers actively report their status to load balancer.

Implementation:

  1. Load balancer exposes endpoint: /heartbeat
  2. Each backend server periodically sends status (e.g., every 5 seconds):
POST /heartbeat { "server_ip": "192.168.1.5", "status": "alive" }
  3. Load balancer maintains list of active servers
  4. If server misses multiple consecutive heartbeats → marked as failed
  5. Failed servers removed from routing pool

Advantages:

  • ✅ Real-time health awareness
  • ✅ Failed servers automatically removed
  • ✅ No traffic routed to dead servers

Pattern: Push mechanism (servers push status to load balancer)
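The load-balancer side of the heartbeat mechanism can be sketched as a registry keyed by last-seen time. The interval and miss limit below are assumptions matching the example above, and the injectable clock is only there to keep the sketch testable:

```python
import time

HEARTBEAT_INTERVAL = 5   # seconds between heartbeats (assumed, per example above)
MISSED_LIMIT = 3         # consecutive missed heartbeats before marking failed

class HeartbeatRegistry:
    """Load-balancer side: tracks the last heartbeat time per server."""

    def __init__(self, now=time.monotonic):
        self._last_seen = {}
        self._now = now  # injectable clock (assumption, for testability)

    def heartbeat(self, server_ip):
        # Called whenever a POST /heartbeat arrives from a backend.
        self._last_seen[server_ip] = self._now()

    def active_servers(self):
        # A server that missed MISSED_LIMIT consecutive intervals is
        # considered failed and excluded from the routing pool.
        cutoff = self._now() - HEARTBEAT_INTERVAL * MISSED_LIMIT
        return [ip for ip, t in self._last_seen.items() if t >= cutoff]
```

Note the push property: the registry never contacts backends; it only expires entries that stop reporting.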

Solution #2: Health Check Mechanism (Pull) 🩺

Concept: Load balancer actively queries server health (pull approach).

Implementation:

  1. Load balancer periodically polls each server
  2. Servers expose health check endpoint: /health
GET http://192.168.1.5/health
  3. Server responds: 200 OK (healthy)
  4. Timeout or error after multiple attempts → server marked as failed

Pattern: Pull mechanism (load balancer pulls status from servers)
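The pull loop can be sketched as below. Here `probe` is a stand-in for the real GET /health HTTP call, and the retry count is an assumption:

```python
MAX_ATTEMPTS = 3  # assumed retry budget before declaring a server failed

def check_server(probe, attempts=MAX_ATTEMPTS):
    """Return True if any of `attempts` probes returns HTTP 200."""
    for _ in range(attempts):
        try:
            if probe() == 200:
                return True
        except TimeoutError:
            pass  # treat a timeout like a failed attempt and retry
    return False

def prune_pool(pool, probes):
    """Keep only the servers whose health check passes."""
    return [ip for ip in pool if check_server(probes[ip])]
```

The centralization is visible in the code: all detection logic lives in the balancer, and backends only need a trivial /health handler.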

Push vs Pull: Service Discovery Approaches 🤔

Heartbeat (Push):

  • Servers actively report status to load balancer
  • Real-time notification when server starts/fails
  • Distributed responsibility across all servers

Health Check (Pull):

  • Load balancer queries server status
  • Centralized responsibility in load balancer
  • Slightly delayed failure detection (polling interval)

Industry preference: Health checks more common due to centralized maintenance and simpler server implementation.

✅ Conclusion: Both are comparably efficient; the choice depends on where you want monitoring responsibility to reside.

New Server Registration 🆕

Problem: How does load balancer discover newly added servers?

Solution: New server self-registers with load balancer:

POST http://load-balancer/register { "server_ip": "192.168.1.250", "status": "ready" }

Principle: Registration is the server's responsibility. The load balancer cannot autonomously detect new infrastructure; servers must announce their presence.
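The load-balancer side of the registration endpoint can be sketched as follows. `handle_register` and the `registry` set are hypothetical names; the payload fields mirror the POST /register example above:

```python
import json

def handle_register(registry, body):
    """Handle a POST /register payload from a newly started backend.

    `registry` stands in for the balancer's set of known servers; only
    servers that report "ready" are added to the routing pool.
    """
    payload = json.loads(body)
    if payload.get("status") == "ready":
        registry.add(payload["server_ip"])
    return registry

registry = set()
handle_register(registry, '{"server_ip": "192.168.1.250", "status": "ready"}')
```

After registration, the new server is monitored like any other pool member via heartbeats or health checks.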


Load Balancer Performance: The 100× Advantage 💪

Application Server Workload

Responsibilities:

  1. Receive request
  2. Process through OSI layers (7 layers)
  3. Deserialize request
  4. Decrypt (if encrypted)
  5. Authorization checks
  6. Database queries
  7. Business logic execution
  8. Generate response
  9. Serialize response
  10. Encrypt response
  11. Send through OSI layers

Throughput: 100-1,000 requests/second

Load Balancer Workload

Responsibilities:

  1. Examine incoming request
  2. Select backend server
  3. Forward request

What load balancers DON'T do:

  1. ❌ Deserialization
  2. ❌ Decryption
  3. ❌ Authentication
  4. ❌ Database access
  5. ❌ Business logic
  6. ❌ Response generation

Throughput: 100,000+ requests/second

Performance ratio: Load balancers handle 100× more traffic than application servers due to minimal processing overhead.

OSI Model Context 🌐

7 Layers (bottom to top):

  1. Physical: Electrical signals, hardware
  2. Data Link: MAC addresses, frames
  3. Network: IP addresses, routing
  4. Transport: TCP/UDP, ports
  5. Session: Connection management
  6. Presentation: Encryption, data formatting
  7. Application: HTTP, web services

Load balancers operate at either the Transport layer (Layer 4) or the Application layer (Layer 7); Layer 4 processing carries lower overhead because it never parses application protocols.

Load Balancer OSI Layers

Layer 4 (Transport Layer):

  • TCP/UDP routing
  • Works with IP addresses and ports
  • No application protocol awareness

Layer 7 (Application Layer):

  • HTTP/HTTPS routing
  • Application protocol awareness
  • Can route based on URL paths, headers, cookies

Use case determines the layer: Layer 4 for maximum throughput and protocol-agnostic routing, Layer 7 for content-aware routing on paths, headers, or cookies.
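Layer 7 routing can be sketched as a prefix-matching routing table. The route table, pools, and IPs below are hypothetical; a real L7 balancer (e.g., Nginx) expresses the same idea in its configuration language:

```python
# Hypothetical Layer 7 routing table: the balancer inspects the URL path
# and forwards to the backend pool chosen for that path prefix.
ROUTES = {
    "/api/":    ["10.0.1.1", "10.0.1.2"],  # API servers (assumed IPs)
    "/static/": ["10.0.2.1"],              # static-asset servers
}
DEFAULT_POOL = ["10.0.3.1"]  # fallback pool for unmatched paths

def route(path):
    """Pick the backend pool by longest matching path prefix."""
    best, best_len = DEFAULT_POOL, -1
    for prefix, pool in ROUTES.items():
        if path.startswith(prefix) and len(prefix) > best_len:
            best, best_len = pool, len(prefix)
    return best
```

A Layer 4 balancer could not do this: it sees only IPs and ports, never the HTTP path.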


Key Takeaways 💡

  1. Load balancers are essential for distributed architectures.
  2. Service discovery is critical.
  3. Push (heartbeat) and pull (health check) discovery are comparably efficient; choose based on where monitoring responsibility should live.
  4. New servers must self-register.
  5. Load balancers handle 100× more traffic than application servers.
  6. OSI layer choice matters.