# Load Balancing

### ⚖️ Load Balancing Strategies

#### Load Balancing Algorithms

**Load balancing algorithms** determine how traffic is distributed across server instances.

````yaml
# Load Balancing Algorithms

## Round Robin
description: "Distribute requests sequentially to servers"
implementation:
  - Maintain circular list of servers
  - Move to next server for each request
  - Wrap around to first server when reaching end

advantages:
  - Simple implementation
  - Equal distribution
  - No server state required

disadvantages:
  - Ignores server capacity differences
  - Doesn't consider server load
  - May send requests to overloaded servers

use_case: "Servers with similar capacity and request processing time"

## Weighted Round Robin
description: "Distribute requests based on server capacity weights"
implementation:
  - Assign weight to each server based on capacity
  - Distribute requests proportionally to weights
  - Higher capacity servers get more requests

advantages:
  - Considers server capacity differences
  - Predictable distribution
  - Resource optimization

disadvantages:
  - Manual weight configuration
  - Static weight assignment
  - Doesn't adapt to changing loads

use_case: "Heterogeneous server infrastructure"

## Least Connections
description: "Route requests to server with fewest active connections"
implementation:
  - Track active connections per server
  - Select server with minimum connections
  - Update connection count dynamically

advantages:
  - Considers current server load
  - Better for long-running requests
  - Dynamic load balancing

disadvantages:
  - Requires connection tracking
  - More complex implementation
  - May not work well with short connections

use_case: "Applications with variable request duration"

## IP Hash
description: "Use client IP hash for consistent server routing"
implementation:
  - Calculate hash of client IP address
  - Map hash to server index
  - Consistent routing for same client

advantages:
  - Session persistence
  - Cache-friendly
  - Consistent user experience

disadvantages:
  - Uneven distribution
  - Server failures affect sessions
  - Not suitable for dynamic scaling

use_case: "Stateful applications requiring session persistence"

## Least Response Time
description: "Route to server with lowest response time"
implementation:
  - Monitor response time for each server
  - Select fastest responding server
  - Update response time metrics continuously

advantages:
  - Optimal performance
  - Adaptive to changing conditions
  - User experience focused

disadvantages:
  - Complex monitoring required
  - May cause oscillation
  - Overhead of response time tracking

use_case: "Performance-critical applications"

## Implementation Example
```javascript
// Load Balancer Implementation
class LoadBalancer {
  constructor(algorithm = 'round_robin') {
    this.servers = [];
    this.algorithm = algorithm;
    this.currentIndex = 0;
    this.connections = new Map(); // Track connections per server
    this.responseTimes = new Map(); // Track response times
  }

  addServer(server, weight = 1) {
    this.servers.push({
      ...server,
      weight,
      healthy: true,
      lastHealthCheck: null
    });
    this.connections.set(server.id, 0);
    this.responseTimes.set(server.id, 0);
  }

  removeServer(serverId) {
    this.servers = this.servers.filter(s => s.id !== serverId);
    this.connections.delete(serverId);
    this.responseTimes.delete(serverId);
  }

  selectServer(clientIp = null) {
    const healthyServers = this.servers.filter(s => s.healthy);

    if (healthyServers.length === 0) {
      throw new Error('No healthy servers available');
    }

    switch (this.algorithm) {
      case 'round_robin':
        return this.roundRobin(healthyServers);
      case 'weighted_round_robin':
        return this.weightedRoundRobin(healthyServers);
      case 'least_connections':
        return this.leastConnections(healthyServers);
      case 'ip_hash':
        return this.ipHash(healthyServers, clientIp);
      case 'least_response_time':
        return this.leastResponseTime(healthyServers);
      default:
        return this.roundRobin(healthyServers);
    }
  }

  roundRobin(servers) {
    const server = servers[this.currentIndex % servers.length];
    this.currentIndex++;
    return server;
  }

  weightedRoundRobin(servers) {
    // Step a counter through the cumulative weight range so each server
    // receives requests in proportion to its weight. This is deterministic,
    // unlike picking a random point in the weight range (weighted random).
    const totalWeight = servers.reduce((sum, s) => sum + s.weight, 0);
    let slot = this.currentIndex % totalWeight;
    this.currentIndex++;

    for (const server of servers) {
      slot -= server.weight;
      if (slot < 0) {
        return server;
      }
    }

    return servers[servers.length - 1];
  }

  leastConnections(servers) {
    return servers.reduce((min, server) => {
      const connections = this.connections.get(server.id) || 0;
      const minConnections = this.connections.get(min.id) || 0;
      return connections < minConnections ? server : min;
    });
  }

  ipHash(servers, clientIp) {
    if (!clientIp) {
      return this.roundRobin(servers);
    }

    const hash = this.hashCode(clientIp);
    const index = Math.abs(hash) % servers.length;
    return servers[index];
  }

  leastResponseTime(servers) {
    // Use ?? rather than || so a legitimate 0 ms metric isn't treated as missing
    return servers.reduce((best, server) => {
      const responseTime = this.responseTimes.get(server.id) ?? Infinity;
      const bestResponseTime = this.responseTimes.get(best.id) ?? Infinity;
      return responseTime < bestResponseTime ? server : best;
    });
  }

  // Simulate request handling
  async handleRequest(request, handler) {
    const server = this.selectServer(request.clientIp);

    try {
      // Increment connection count
      this.connections.set(server.id, (this.connections.get(server.id) || 0) + 1);

      const startTime = Date.now();
      const response = await handler(server, request);
      const responseTime = Date.now() - startTime;

      // Update response time metrics
      this.responseTimes.set(server.id, responseTime);

      return response;
    } finally {
      // Decrement connection count
      this.connections.set(server.id, Math.max(0, this.connections.get(server.id) - 1));
    }
  }

  // Health checking
  async healthCheck() {
    for (const server of this.servers) {
      try {
        // fetch has no `timeout` option; use an AbortSignal to bound the request
        const response = await fetch(`${server.url}/health`, {
          signal: AbortSignal.timeout(5000)
        });
        server.healthy = response.ok;
      } catch (error) {
        server.healthy = false;
      } finally {
        server.lastHealthCheck = new Date();
      }
    }
  }

  hashCode(str) {
    let hash = 0;
    for (let i = 0; i < str.length; i++) {
      const char = str.charCodeAt(i);
      hash = ((hash << 5) - hash) + char;
      hash = hash & hash; // Convert to 32-bit integer
    }
    return hash;
  }
}
````
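The weighted strategy above can be exercised on its own. The sketch below is a standalone, minimal version of weighted round robin (server names and weights are illustrative, not from the text) that demonstrates the proportional distribution the algorithm promises: with weights 3 and 1, six of every eight requests go to the heavier server.

```javascript
// Standalone sketch of weighted round robin: a counter steps through the
// cumulative weight range, so distribution is proportional and deterministic.
function makeWeightedRR(servers) {
  const total = servers.reduce((sum, s) => sum + s.weight, 0);
  let counter = 0;
  return function next() {
    let slot = counter % total;
    counter++;
    for (const s of servers) {
      slot -= s.weight;
      if (slot < 0) return s;
    }
    return servers[servers.length - 1];
  };
}

// Hypothetical pool: server 'a' has 3x the capacity of server 'b'
const next = makeWeightedRR([
  { id: 'a', weight: 3 },
  { id: 'b', weight: 1 },
]);

const counts = { a: 0, b: 0 };
for (let i = 0; i < 8; i++) counts[next().id]++;
console.log(counts); // { a: 6, b: 2 }
```

A closure is used here instead of a class to keep the sketch minimal; in the `LoadBalancer` class above, the same counter role is played by `currentIndex`.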

#### Load Balancer Types

```yaml
# Load Balancer Types and Use Cases

## Layer 4 Load Balancer (Transport Layer)
protocols:
  - TCP
  - UDP
  - SCTP

characteristics:
  - Forward packets based on IP addresses and ports
  - No packet inspection at application level
  - High performance and low latency
  - Limited routing capabilities

use_cases:
  - General TCP/UDP traffic
  - High-performance requirements
  - Simple routing needs
  - Database load balancing

examples:
  - HAProxy
  - NGINX (stream module)
  - AWS Network Load Balancer
  - Google Cloud Network Load Balancing

## Layer 7 Load Balancer (Application Layer)
protocols:
  - HTTP
  - HTTPS
  - WebSocket
  - gRPC

characteristics:
  - Inspect application-level data
  - Advanced routing capabilities
  - SSL/TLS termination
  - Content-based routing

use_cases:
  - Web applications
  - API gateways
  - Microservices
  - Content-based routing

examples:
  - NGINX
  - Apache HTTP Server
  - AWS Application Load Balancer
  - Google Cloud HTTP(S) Load Balancing

## Global Load Balancer
characteristics:
  - Geographic traffic distribution
  - DNS-based routing
  - Health monitoring across regions
  - Latency-based routing

use_cases:
  - Global applications
  - Multi-region deployments
  - Disaster recovery
  - Content delivery networks

examples:
  - Cloudflare Load Balancing
  - AWS Route 53
  - Google Cloud Load Balancing
  - Azure Front Door

## Implementation Comparison
performance:
  - L4: Highest performance (millions of RPS)
  - L7: Lower performance (hundreds of thousands RPS)
  - Global: Variable based on DNS propagation

features:
  - L4: Basic protocol support
  - L7: Advanced routing and security
  - Global: Geographic optimization

complexity:
  - L4: Simple configuration
  - L7: Complex rules and policies
  - Global: Multi-region coordination

cost:
  - L4: Lower cost
  - L7: Medium cost
  - Global: Higher cost
```

### 🗄️ Database Scaling

#### Horizontal Database Scaling

**Horizontal scaling** splits a database into multiple partitions to handle growth.

```yaml
# Database Scaling Strategies

## Sharding (Horizontal Partitioning)
definition: "Split database into smaller partitions called shards"
strategies:
  range_based:
    - Partition data by value ranges
    - Example: Users 1-1000 in shard 1, 1001-2000 in shard 2
    - Easy to implement
    - Uneven data distribution possible

  hash_based:
    - Partition data using hash function
    - Example: Hash(user_id) % number_of_shards
    - Even distribution
    - Complex re-sharding

  directory_based:
    - Central lookup service for shard location
    - Flexible shard assignment
    - Additional lookup overhead
    - Single point of failure concern

advantages:
  - Linear scalability
  - Better performance for large datasets
  - Parallel query processing
  - Geographic distribution

challenges:
  - Cross-shard queries
  - Data consistency
  - Rebalancing complexity
  - Increased operational overhead

## Replication
definition: "Create copies of database for read scaling and availability"
types:
  master_slave:
    - One master for writes
    - Multiple slaves for reads
    - Asynchronous replication
    - Eventual consistency

  master_master:
    - Multiple masters for writes
    - Conflict resolution required
    - Higher availability
    - Complex consistency management

  read_replicas:
    - Dedicated read-only replicas
    - Reporting and analytics
    - Reduced load on primary
    - Slight replication lag

```
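The hash-based sharding formula above (`Hash(user_id) % number_of_shards`) can be sketched directly. The shard count and key format below are illustrative; the string hash mirrors the one in the load balancer example.

```javascript
// Minimal sketch of hash-based sharding: the same key always maps to the
// same shard, and keys spread roughly evenly across shards.
const NUM_SHARDS = 4; // illustrative shard count

function hashCode(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    hash = ((hash << 5) - hash) + str.charCodeAt(i);
    hash |= 0; // keep the value a 32-bit integer
  }
  return hash;
}

function shardFor(userId) {
  return Math.abs(hashCode(String(userId))) % NUM_SHARDS;
}

// Routing is deterministic: the same key always lands on the same shard
console.log(shardFor('user-42') === shardFor('user-42')); // true
```

Note the re-sharding challenge listed above: changing `NUM_SHARDS` remaps almost every key, which is why schemes like consistent hashing are often preferred when the shard count must grow.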
