askill
scalability

scalabilitySafety 85Repository

Design scalable systems with horizontal scaling, load balancing, caching, message queues, and stateless service architecture.

0 stars
1.2k downloads
Updated 2/3/2026

Package Files

Loading files...
SKILL.md

Scalability

Purpose: Design systems that handle growth in users, data, and traffic.
Approaches: Horizontal scaling, load balancing, caching, async processing, stateless services.


Horizontal vs Vertical Scaling

ApproachDescriptionWhen to Use
VerticalBigger server (more CPU/RAM)Quick fix, limited by hardware
HorizontalMore serversLong-term, unlimited growth

Prefer horizontal scaling - Add more instances rather than bigger servers.


Stateless Services

// ❌ Stateful (doesn't scale)
public class OrderController : ControllerBase
{
    private static Dictionary<int, Order> _orders = new(); // Shared state!
    
    [HttpPost]
    public IActionResult CreateOrder(Order order)
    {
        _orders[order.Id] = order; // Lost on restart or different instance
        return Ok();
    }
}

// ✅ Stateless (scales horizontally)
public class OrderController : ControllerBase
{
    private readonly IOrderRepository _repository;
    
    [HttpPost]
    public async Task<IActionResult> CreateOrder(Order order)
    {
        await _repository.SaveAsync(order); // Persisted to database
        return Ok();
    }
}

Load Balancing

# NGINX load balancer config
upstream api_servers {
    least_conn;  # Route to server with fewest connections
    server api1:5000;
    server api2:5000;
    server api3:5000;
}

server {
    listen 80;
    location / {
        proxy_pass http://api_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

Strategies:

  • Round Robin - Distribute evenly
  • Least Connections - Route to least busy
  • IP Hash - Same client → same server

Caching Strategy

// Cache frequently accessed data
public class ProductService
{
    private readonly IDistributedCache _cache;
    private readonly IProductRepository _repo;
    
    public async Task<Product> GetProductAsync(int id)
    {
        var cacheKey = $"product:{id}";
        var cached = await _cache.GetStringAsync(cacheKey);
        
        if (cached != null)
            return JsonSerializer.Deserialize<Product>(cached);
        
        var product = await _repo.GetByIdAsync(id);
        await _cache.SetStringAsync(cacheKey, 
            JsonSerializer.Serialize(product),
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10)
            });
        
        return product;
    }
}

Message Queues (Async Processing)

using RabbitMQ.Client;

// Producer - Queue heavy operations
public class OrderService
{
    private readonly IConnection _connection;
    
    public async Task<Order> CreateOrderAsync(OrderDto orderDto)
    {
        var order = await _repository.CreateAsync(orderDto);
        
        // Queue email and inventory update (don't block)
        await _queue.PublishAsync("order.created", new
        {
            OrderId = order.Id,
            CustomerEmail = order.CustomerEmail
        });
        
        return order;
    }
}

// Consumer - Process in background
public class OrderProcessor : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        await _queue.SubscribeAsync("order.created", async message =>
        {
            await _emailService.SendOrderConfirmationAsync(message.OrderId);
            await _inventoryService.UpdateStockAsync(message.OrderId);
        });
    }
}

Database Scaling

Read Replicas

// Write to primary
builder.Services.AddDbContext<AppDbContext>(options =>
    options.UseNpgsql(builder.Configuration.GetConnectionString("Primary")));

// Read from replicas
builder.Services.AddDbContext<ReadDbContext>(options =>
    options.UseNpgsql(builder.Configuration.GetConnectionString("ReadReplica")));

public class UserService
{
    private readonly AppDbContext _writeDb;
    private readonly ReadDbContext _readDb;
    
    public async Task<User> GetUserAsync(int id) =>
        await _readDb.Users.FindAsync(id); // Read from replica
    
    public async Task CreateUserAsync(User user)
    {
        _writeDb.Users.Add(user);
        await _writeDb.SaveChangesAsync(); // Write to primary
    }
}

Database Sharding

// Shard by user ID
public class ShardedUserRepository
{
    private readonly List<AppDbContext> _shards;
    
    private AppDbContext GetShard(int userId)
    {
        var shardIndex = userId % _shards.Count;
        return _shards[shardIndex];
    }
    
    public async Task<User> GetUserAsync(int userId)
    {
        var shard = GetShard(userId);
        return await shard.Users.FindAsync(userId);
    }
}

CDN for Static Assets

// Use CDN for images, CSS, JS
<img src="https://cdn.myapp.com/images/logo.png" />
<link href="https://cdn.myapp.com/css/styles.css" rel="stylesheet" />

// Configure CDN
builder.Services.Configure<StaticFileOptions>(options =>
{
    options.OnPrepareResponse = ctx =>
    {
        ctx.Context.Response.Headers.Add("Cache-Control", "public,max-age=31536000");
    };
});

Rate Limiting

using AspNetCoreRateLimit;

builder.Services.Configure<IpRateLimitOptions>(options =>
{
    options.GeneralRules = new List<RateLimitRule>
    {
        new RateLimitRule
        {
            Endpoint = "*",
            Period = "1m",
            Limit = 100
        }
    };
});

app.UseIpRateLimiting();

Autoscaling

# Kubernetes HPA (Horizontal Pod Autoscaler)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Best Practices

✅ DO

  • Design stateless - Store state in database/cache
  • Use load balancers - Distribute traffic
  • Cache aggressively - Reduce database load
  • Queue heavy operations - Process asynchronously
  • Use read replicas - Separate read/write load
  • Enable autoscaling - Handle traffic spikes
  • Use CDN - Serve static assets globally
  • Implement rate limiting - Prevent abuse
  • Monitor metrics - CPU, memory, request rate
  • Plan for failure - Circuit breakers, retries

❌ DON'T

  • Store state in memory - Breaks horizontal scaling
  • Single database instance - Bottleneck
  • Synchronous heavy operations - Use queues
  • Skip caching - Database overload
  • Ignore connection pooling - Connection exhaustion
  • Manual scaling only - Use autoscaling
  • Serve static files from app - Use CDN
  • No rate limiting - Vulnerable to abuse

Scalability Checklist

  • Services are stateless
  • Load balancer configured
  • Caching implemented (Redis/Memory)
  • Message queue for async processing
  • Database read replicas configured
  • CDN for static assets
  • Rate limiting enabled
  • Autoscaling configured
  • Connection pooling enabled
  • Monitoring and alerting set up

See Also: 05-performance.md06-database.md15-logging-monitoring.md

Last Updated: January 13, 2026

Install

Download ZIP
Requires askill CLI v1.0+

AI Quality Score

95/100Analyzed 2/10/2026

A comprehensive and highly actionable guide to system scalability, featuring clear C# and YAML examples across multiple architectural patterns including load balancing, caching, and database sharding.

85
100
90
95
95

Metadata

Licenseunknown
Version-
Updated2/3/2026
PublisherjnPiyush

Tags

apidatabaseobservabilitysecurity