Scalability
Purpose: Design systems that handle growth in users, data, and traffic.
Approaches: Horizontal scaling, load balancing, caching, async processing, stateless services.
Horizontal vs Vertical Scaling
| Approach | Description | When to Use |
|---|---|---|
| Vertical | Bigger server (more CPU/RAM) | Quick fix, limited by hardware |
| Horizontal | More servers | Long-term growth; scales past single-machine limits |
Prefer horizontal scaling - Add more instances rather than bigger servers.
Stateless Services
// ❌ Stateful (doesn't scale)
public class OrderController : ControllerBase
{
    private static Dictionary<int, Order> _orders = new(); // In-process state!

    [HttpPost]
    public IActionResult CreateOrder(Order order)
    {
        _orders[order.Id] = order; // Lost on restart; not visible to other instances
        return Ok();
    }
}
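// ✅ Stateless (scales horizontally)
public class OrderController : ControllerBase
{
    private readonly IOrderRepository _repository;

    // Repository is injected; the controller holds no data of its own
    public OrderController(IOrderRepository repository) => _repository = repository;

    [HttpPost]
    public async Task<IActionResult> CreateOrder(Order order)
    {
        await _repository.SaveAsync(order); // Persisted to database
        return Ok();
    }
}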
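
The difference shows up as soon as two instances run side by side. The following is a minimal sketch, not production code: each object stands in for one running instance, and a plain dictionary stands in for the shared database. `StatefulInstance` and `StatelessInstance` are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Each object below stands in for one running instance of the service.
var statefulA = new StatefulInstance();
var statefulB = new StatefulInstance();
statefulA.Create(1, "book");
Console.WriteLine(statefulB.Exists(1));   // False: B never saw the order

// Stands in for a shared database that every instance connects to.
var sharedStore = new Dictionary<int, string>();
var statelessA = new StatelessInstance(sharedStore);
var statelessB = new StatelessInstance(sharedStore);
statelessA.Create(1, "book");
Console.WriteLine(statelessB.Exists(1));  // True: any instance can serve the read

// Hypothetical type: keeps orders in its own memory.
class StatefulInstance
{
    private readonly Dictionary<int, string> _orders = new(); // per-instance memory
    public void Create(int id, string item) => _orders[id] = item;
    public bool Exists(int id) => _orders.ContainsKey(id);
}

// Hypothetical type: holds only a reference to shared storage.
class StatelessInstance
{
    private readonly Dictionary<int, string> _store;
    public StatelessInstance(Dictionary<int, string> store) => _store = store;
    public void Create(int id, string item) => _store[id] = item;
    public bool Exists(int id) => _store.ContainsKey(id);
}
```

With the stateful version, a load balancer that sends the follow-up request to the other instance produces a "not found" for an order that was just created.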
Load Balancing
# NGINX load balancer config
upstream api_servers {
    least_conn; # Route to server with fewest connections
    server api1:5000;
    server api2:5000;
    server api3:5000;
}
server {
    listen 80;
    location / {
        proxy_pass http://api_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
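Strategies:
- Round Robin - Rotate through servers in order (NGINX default)
- Least Connections - Route to the server with the fewest active connections
- IP Hash - Hash the client IP so the same client reaches the same server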
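
The three strategies differ only in how the next server is picked. Below is a minimal sketch of that selection logic. It is illustrative and not how NGINX implements it: the `Backend` record, the `Picker` class, and the hard-coded connection counts are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

var backends = new List<Backend>
{
    new("api1:5000", ActiveConnections: 4),
    new("api2:5000", ActiveConnections: 1),
    new("api3:5000", ActiveConnections: 7),
};
var picker = new Picker(backends);

Console.WriteLine(picker.RoundRobin().Address);        // api1:5000
Console.WriteLine(picker.RoundRobin().Address);        // api2:5000
Console.WriteLine(picker.LeastConnections().Address);  // api2:5000
Console.WriteLine(picker.IpHash("203.0.113.7").Address
               == picker.IpHash("203.0.113.7").Address); // True

// Hypothetical types for illustration only.
record Backend(string Address, int ActiveConnections);

class Picker
{
    private readonly IReadOnlyList<Backend> _backends;
    private int _next = -1;

    public Picker(IReadOnlyList<Backend> backends) => _backends = backends;

    // Round robin: rotate through the servers in order.
    public Backend RoundRobin()
    {
        var n = (uint)Interlocked.Increment(ref _next);
        return _backends[(int)(n % (uint)_backends.Count)];
    }

    // Least connections: pick the server with the fewest active connections.
    public Backend LeastConnections() => _backends.MinBy(b => b.ActiveConnections)!;

    // IP hash: a stable hash of the client IP selects the server.
    // string.GetHashCode is randomized per process, so FNV-1a is used instead.
    public Backend IpHash(string clientIp)
    {
        uint hash = 2166136261u;
        foreach (var c in clientIp)
            hash = (hash ^ (uint)c) * 16777619u;
        return _backends[(int)(hash % (uint)_backends.Count)];
    }
}
```

IP hash keeps a client on one server only while the server list is unchanged; adding or removing a server changes the modulus and remaps clients.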
Caching Strategy
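using System.Text.Json;
using Microsoft.Extensions.Caching.Distributed;

// Cache-aside: check the cache first, fall back to the database on a miss
public class ProductService
{
    private readonly IDistributedCache _cache;
    private readonly IProductRepository _repo;

    public ProductService(IDistributedCache cache, IProductRepository repo)
    {
        _cache = cache;
        _repo = repo;
    }

    public async Task<Product?> GetProductAsync(int id)
    {
        var cacheKey = $"product:{id}";
        var cached = await _cache.GetStringAsync(cacheKey);
        if (cached != null)
            return JsonSerializer.Deserialize<Product>(cached);

        var product = await _repo.GetByIdAsync(id);
        if (product == null)
            return null; // Don't cache lookups for products that don't exist

        await _cache.SetStringAsync(cacheKey,
            JsonSerializer.Serialize(product),
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10)
            });
        return product;
    }
}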
Message Queues (Async Processing)
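using Microsoft.Extensions.Hosting;

// IMessageQueue is an app-defined abstraction (hypothetical here) over a broker
// client such as RabbitMQ.Client; PublishAsync/SubscribeAsync are not RabbitMQ APIs.
public record OrderCreated(int OrderId, string CustomerEmail);

// Producer - Queue heavy operations
public class OrderService
{
    private readonly IOrderRepository _repository;
    private readonly IMessageQueue _queue;

    public OrderService(IOrderRepository repository, IMessageQueue queue)
    {
        _repository = repository;
        _queue = queue;
    }

    public async Task<Order> CreateOrderAsync(OrderDto orderDto)
    {
        var order = await _repository.CreateAsync(orderDto);
        // Queue email and inventory update (don't block the request)
        await _queue.PublishAsync("order.created",
            new OrderCreated(order.Id, order.CustomerEmail));
        return order;
    }
}

// Consumer - Process in background
public class OrderProcessor : BackgroundService
{
    private readonly IMessageQueue _queue;
    private readonly IEmailService _emailService;
    private readonly IInventoryService _inventoryService;

    public OrderProcessor(IMessageQueue queue, IEmailService emailService,
        IInventoryService inventoryService)
    {
        _queue = queue;
        _emailService = emailService;
        _inventoryService = inventoryService;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        await _queue.SubscribeAsync<OrderCreated>("order.created", async message =>
        {
            await _emailService.SendOrderConfirmationAsync(message.OrderId);
            await _inventoryService.UpdateStockAsync(message.OrderId);
        }, stoppingToken);
    }
}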
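
To see the producer/consumer hand-off in isolation, here is a minimal in-process sketch that swaps the broker for `System.Threading.Channels`. It is not a substitute for a real message queue, since messages never leave the process and are lost on restart. The `OrderCreated` record, the example email addresses, and the `Task.Delay` standing in for real work are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// In-process stand-in for a broker queue.
var queue = Channel.CreateUnbounded<OrderCreated>();
var processed = new List<int>();

// Consumer: drains the queue in the background until the writer completes.
var consumer = Task.Run(async () =>
{
    await foreach (var message in queue.Reader.ReadAllAsync())
    {
        await Task.Delay(10);            // stands in for email + inventory work
        processed.Add(message.OrderId);
    }
});

// Producer: enqueue the event and move on without waiting for the work.
for (var id = 1; id <= 3; id++)
{
    await queue.Writer.WriteAsync(new OrderCreated(id, $"customer{id}@example.com"));
    Console.WriteLine($"Order {id} accepted");
}

queue.Writer.Complete();   // signal that no more messages will arrive
await consumer;            // wait for the backlog to drain
Console.WriteLine($"Processed: {string.Join(", ", processed)}"); // Processed: 1, 2, 3

// Hypothetical message type for illustration only.
record OrderCreated(int OrderId, string CustomerEmail);
```

The producer loop finishes without waiting for the simulated email and inventory work, which is the property that keeps request latency low.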
Database Scaling
Read Replicas
// Write to primary
builder.Services.AddDbContext<AppDbContext>(options =>
    options.UseNpgsql(builder.Configuration.GetConnectionString("Primary")));
// Read from replicas
builder.Services.AddDbContext<ReadDbContext>(options =>
    options.UseNpgsql(builder.Configuration.GetConnectionString("ReadReplica")));

public class UserService
{
    private readonly AppDbContext _writeDb;
    private readonly ReadDbContext _readDb;

    public UserService(AppDbContext writeDb, ReadDbContext readDb)
    {
        _writeDb = writeDb;
        _readDb = readDb;
    }

    // Read from replica; results may briefly lag the primary after a write
    public async Task<User?> GetUserAsync(int id) =>
        await _readDb.Users.FindAsync(id);

    public async Task CreateUserAsync(User user)
    {
        _writeDb.Users.Add(user);
        await _writeDb.SaveChangesAsync(); // Write to primary
    }
}
Database Sharding
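// Shard by user ID
public class ShardedUserRepository
{
    private readonly List<AppDbContext> _shards;

    public ShardedUserRepository(List<AppDbContext> shards) => _shards = shards;

    private AppDbContext GetShard(int userId)
    {
        // C# % keeps the sign of the dividend, so normalize to a non-negative index
        var shardIndex = ((userId % _shards.Count) + _shards.Count) % _shards.Count;
        return _shards[shardIndex];
    }

    public async Task<User?> GetUserAsync(int userId)
    {
        var shard = GetShard(userId);
        return await shard.Users.FindAsync(userId);
    }
}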
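
Modulo sharding is simple, but the mapping depends on the shard count. The sketch below uses a hypothetical `ShardFor` helper (not part of the repository above) to show the routing arithmetic and to count how many keys change shard when a fifth shard is added to four.

```csharp
using System;
using System.Linq;

// Hypothetical helper: maps a user ID to a shard index in [0, shardCount).
static int ShardFor(int userId, int shardCount) =>
    ((userId % shardCount) + shardCount) % shardCount; // non-negative even for negative IDs

Console.WriteLine(ShardFor(42, 4));   // 2
Console.WriteLine(ShardFor(-1, 4));   // 3

// How many of the first 1000 user IDs land on a different shard
// after growing from 4 shards to 5?
var moved = Enumerable.Range(0, 1000).Count(id => ShardFor(id, 4) != ShardFor(id, 5));
Console.WriteLine($"{moved} of 1000 keys move"); // 800 of 1000 keys move
```

Because most keys move when the count changes, adding a shard under this scheme implies a data migration, so it is worth choosing the shard count with headroom.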
CDN for Static Assets
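<!-- Use CDN for images, CSS, JS -->
<img src="https://cdn.myapp.com/images/logo.png" />
<link href="https://cdn.myapp.com/css/styles.css" rel="stylesheet" />
// Set long-lived cache headers on the origin so the CDN and browsers can cache
// static files (safe only when file names change whenever their content changes)
builder.Services.Configure<StaticFileOptions>(options =>
{
    options.OnPrepareResponse = ctx =>
    {
        // Indexer assignment overwrites any existing value instead of throwing
        ctx.Context.Response.Headers["Cache-Control"] = "public,max-age=31536000";
    };
});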
Rate Limiting
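using AspNetCoreRateLimit;

// Services the middleware depends on (in-memory counter and policy stores)
builder.Services.AddMemoryCache();
builder.Services.Configure<IpRateLimitOptions>(options =>
{
    options.GeneralRules = new List<RateLimitRule>
    {
        new RateLimitRule
        {
            Endpoint = "*",   // All endpoints
            Period = "1m",    // Per minute
            Limit = 100       // Requests per client IP
        }
    };
});
builder.Services.AddInMemoryRateLimiting();
builder.Services.AddSingleton<IRateLimitConfiguration, RateLimitConfiguration>();

// In-memory counters are per instance; with N instances behind a load balancer
// a client can reach up to N x Limit unless counters live in a shared store
app.UseIpRateLimiting();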
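
To make the `Period`/`Limit` pair concrete, here is a minimal sketch of one common algorithm, a fixed-window counter keyed by client. This is an assumption about how such a rule can be enforced, not a description of AspNetCoreRateLimit's internals; `FixedWindowLimiter` is a hypothetical class, and the limit is set to 3 so the output stays short.

```csharp
using System;
using System.Collections.Generic;

var limiter = new FixedWindowLimiter(limit: 3, window: TimeSpan.FromMinutes(1));
var t0 = new DateTime(2026, 1, 1, 12, 0, 0, DateTimeKind.Utc);

Console.WriteLine(limiter.TryAcquire("203.0.113.7", t0));                // True
Console.WriteLine(limiter.TryAcquire("203.0.113.7", t0.AddSeconds(1)));  // True
Console.WriteLine(limiter.TryAcquire("203.0.113.7", t0.AddSeconds(2)));  // True
Console.WriteLine(limiter.TryAcquire("203.0.113.7", t0.AddSeconds(3)));  // False
Console.WriteLine(limiter.TryAcquire("203.0.113.7", t0.AddSeconds(61))); // True

// Hypothetical limiter for illustration only.
class FixedWindowLimiter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly Dictionary<string, (DateTime WindowStart, int Count)> _counters = new();
    private readonly object _lock = new();

    public FixedWindowLimiter(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    // Returns true if the request is allowed, false if the client is over the limit.
    public bool TryAcquire(string clientKey, DateTime nowUtc)
    {
        lock (_lock)
        {
            // Start a fresh window for new clients or when the old window has elapsed.
            if (!_counters.TryGetValue(clientKey, out var entry)
                || nowUtc - entry.WindowStart >= _window)
            {
                entry = (nowUtc, 0);
            }

            if (entry.Count >= _limit)
                return false;

            _counters[clientKey] = (entry.WindowStart, entry.Count + 1);
            return true;
        }
    }
}
```

The clock is passed in as a parameter so the behavior can be checked without waiting a real minute.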
Autoscaling
# Kubernetes HPA (Horizontal Pod Autoscaler)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
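
The Kubernetes documentation gives the core scaling rule as `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, bounded by `minReplicas` and `maxReplicas`. The sketch below applies that rule to the values in the manifest above. `DesiredReplicas` is a hypothetical helper, and it deliberately ignores details such as the tolerance band and stabilization windows.

```csharp
using System;

// Hypothetical helper: the basic HPA rule, clamped to the configured bounds.
static int DesiredReplicas(int current, double currentUtilization,
                           double targetUtilization, int min, int max)
{
    var desired = (int)Math.Ceiling(current * (currentUtilization / targetUtilization));
    return Math.Clamp(desired, min, max);
}

// Target is 70% CPU, with 2 to 10 replicas, as in the manifest.
Console.WriteLine(DesiredReplicas(4, 90, 70, 2, 10));   // 6  (scale up)
Console.WriteLine(DesiredReplicas(4, 35, 70, 2, 10));   // 2  (scale down)
Console.WriteLine(DesiredReplicas(8, 140, 70, 2, 10));  // 10 (capped at maxReplicas)
```

The third case shows why `maxReplicas` needs headroom: the rule asks for 16 replicas, but the cap holds the deployment at 10.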
Best Practices
✅ DO
- Design stateless - Store state in database/cache
- Use load balancers - Distribute traffic
- Cache aggressively - Reduce database load
- Queue heavy operations - Process asynchronously
- Use read replicas - Separate read/write load
- Enable autoscaling - Handle traffic spikes
- Use CDN - Serve static assets globally
- Implement rate limiting - Prevent abuse
- Monitor metrics - CPU, memory, request rate
- Plan for failure - Circuit breakers, retries
❌ DON'T
- Store state in process memory - Breaks horizontal scaling
- Rely on a single database instance - Becomes the bottleneck
- Run heavy operations synchronously - Use queues
- Skip caching - Database overload
- Ignore connection pooling - Connection exhaustion
- Rely on manual scaling only - Use autoscaling
- Serve static files from the app - Use CDN
- Omit rate limiting - Vulnerable to abuse
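
The "Plan for failure" item in the DO list mentions retries. A minimal sketch of retry with exponential backoff follows, assuming the wrapped operation is safe to repeat. `RetryAsync` is a hypothetical helper written for this example; in a real service a resilience library such as Polly would usually be a better fit.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: retries a failing operation, doubling the delay each time.
static async Task<T> RetryAsync<T>(Func<Task<T>> operation, int maxAttempts, TimeSpan baseDelay)
{
    for (var attempt = 1; ; attempt++)
    {
        try
        {
            return await operation();
        }
        catch (Exception) when (attempt < maxAttempts)
        {
            // Exponential backoff: baseDelay, 2x, 4x, ...
            // On the final attempt the filter is false and the exception propagates.
            var delay = TimeSpan.FromMilliseconds(
                baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1));
            await Task.Delay(delay);
        }
    }
}

// Simulated dependency that fails twice, then succeeds.
var calls = 0;
var result = await RetryAsync(async () =>
{
    calls++;
    await Task.Yield();
    if (calls < 3) throw new TimeoutException("transient failure");
    return "ok";
}, maxAttempts: 5, baseDelay: TimeSpan.FromMilliseconds(10));

Console.WriteLine($"{result} after {calls} attempts"); // ok after 3 attempts
```

Retries multiply load on a struggling dependency, so the attempt cap and the growing delay matter as much as the retry itself.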
Scalability Checklist
- Services are stateless
- Load balancer configured
- Caching implemented (Redis/Memory)
- Message queue for async processing
- Database read replicas configured
- CDN for static assets
- Rate limiting enabled
- Autoscaling configured
- Connection pooling enabled
- Monitoring and alerting set up
See Also: 05-performance.md • 06-database.md • 15-logging-monitoring.md
Last Updated: January 13, 2026
