Top mistakes from Neo Kim

Mike's Notes

This is a compilation of posts taken from different issues of Neo Kim's excellent System Design Newsletter. It's worth subscribing to.

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library > Subscriptions > System Design Newsletter
  • Home > Handbook > 

Last Updated

11/01/2026

Top mistakes from Neo Kim

By: Neo Kim
System Design: 24/12/2025


39 mistakes YOU make in scaling a system (21/12/2025)

Here are the biggest mistakes I see 99% of YOU making in scaling a system:

  1. Scaling vertically instead of horizontally (and hitting hard limits)
  2. Adding “microservices” too early (plus unnecessary complexity)
  3. Ignoring load balancing
  4. Not using caching at all… and increasing system load linearly with traffic
  5. Caching ‘everything’ blindly (causing stale data, memory pressure, complexity)
  6. Forgetting CDNs for static assets
  7. Keeping the server STATEFUL… (and limiting horizontal scalability + recovery)
  8. Scaling compute before data (databases are usually the first bottleneck)
  9. Treating the database as “infinite” storage
  10. Not using read replicas
  11. Sharding BEFORE understanding access patterns
  12. Never indexing ‘critical’ queries
  13. Allowing SLOW queries to reach production… and amplify under load
  14. Blocking requests with “synchronous” processing
  15. Not using QUEUES for background jobs
  16. Ignoring retries and back-off… transient failures are typical in distributed systems
  17. Not setting “timeouts” (and causing thread exhaustion + cascading failures)
  18. Forgetting RATE LIMITS
  19. Letting failures ‘cascade’ by not using circuit breakers
  20. Ignoring backpressure
  21. Deploying ONLY to a single zone/region (and failing during zone/region outages)
  22. No global traffic routing
  23. Manual scaling instead of auto scaling (and moving slowly on traffic spikes)
  24. Shipping without ‘load testing’
  25. Never doing “capacity planning”
  26. Storing big files in databases (instead of object storage)
  27. Sending uncompressed payloads
  28. Making “excessive” network calls
  29. Not BATCHING for writes
  30. No ‘service discovery’
  31. No failover strategy
  32. No graceful degradation… and disrupting core functionality under load
  33. Retries without ‘idempotency’ (and causing data corruption)
  34. No observability… you cannot scale what you cannot measure
  35. No monitoring alerts
  36. No “tracing” across services in a distributed system
  37. Scaling features instead of fixing BOTTLENECKS
  38. ‘Blindly’ copying big tech architectures
  39. Believing scale is about tools… not tradeoffs
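Mistakes 16, 17, and 33 above often travel together: retries without back-off hammer a struggling dependency, and retries without idempotency corrupt data. A minimal Python sketch of capped exponential back-off with full jitter follows; the `flaky` operation and all parameter values are illustrative assumptions, not code from the newsletter.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1, max_delay=5.0):
    """Retry a transiently failing operation with capped exponential back-off.

    Full jitter (a random sleep up to the capped exponential delay) spreads
    retries out so many clients do not retry in lockstep and amplify load.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # budget exhausted: surface the failure to the caller
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))


# Hypothetical dependency that fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(retry_with_backoff(flaky, base_delay=0.01))  # prints "ok" after two retries
```

Note the sketch only makes retrying safe against overload; making it safe against duplication (mistake 33) additionally requires the operation itself to be idempotent, e.g. keyed writes rather than blind appends.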

23 latency mistakes YOU make when building distributed systems (24/12/2025)

Here are the biggest latency mistakes I see 99% of YOU making:

  1. Not indexing CRITICAL database queries (and causing full table scans + slow reads at scale)
  2. Hitting the database ‘repeatedly’ instead of caching hot data
  3. Not using a CDN for static assets and cacheable responses
  4. Deploying everything in a single region… and ignoring geographic latency
  5. Designing architectures with too many network hops… and unnecessary microservice chains
  6. Not using load balancers (and overloading individual servers)
  7. Scaling vertically instead of horizontally under CONCURRENT load
  8. Running “inefficient” code and algorithms on the critical path
  9. Executing ‘independent’ work sequentially instead of batching + parallelizing it
  10. Blocking requests with synchronous heavy processing (instead of async queues)
  11. Sending large payloads (instead of efficient compression & serialization)
  12. Sticking to HTTP/1.1 and missing out on multiplexing benefits from HTTP/2 & HTTP/3
  13. Opening new connections per request… instead of connection pooling
  14. Blocking threads with locks or synchronous I/O (and destroying parallelism)
  15. Running systems at “full capacity” with no headroom for traffic bursts
  16. Not measuring latency percentiles or profiling bottlenecks
  17. Allowing cold starts to affect user-facing requests
  18. Separating data and compute unnecessarily (and forcing extra network calls)
  19. Letting SLOW external dependencies sit on the critical path
  20. Not using ‘timeouts or circuit breakers’ for downstream calls
  21. Relying on slow disks or poorly tuned infrastructure for latency-sensitive paths
  22. Choosing runtimes & languages without tuning for latency
  23. Believing low latency comes from a single optimization instead of disciplined system design & tradeoffs

Latency is rarely caused by one big mistake… It’s the accumulation of many small, avoidable decisions in architecture, code, and execution.
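Mistake 16 above is the one that hides all the others: an average looks healthy while the tail is on fire. The sketch below, with made-up latency numbers chosen for illustration, computes nearest-rank percentiles to show how p50 and p99 can tell completely different stories about the same traffic.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the ceil(p/100 * N)-th smallest sample."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# 100 simulated request latencies in ms: mostly fast, with a slow tail.
latencies = [10] * 90 + [50] * 9 + [900]

print(percentile(latencies, 50))   # p50: 10 ms -- looks healthy
print(percentile(latencies, 99))   # p99: 50 ms
print(percentile(latencies, 100))  # worst case: 900 ms
```

The mean here is 22.5 ms, yet one request in a hundred takes 900 ms, which is why latency targets are usually stated as p95/p99 rather than averages.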
