Mike's Notes
This is a compilation of posts taken from different issues of Neo Kim's excellent System Design Newsletter. It's worth subscribing to.
Last Updated
11/01/2026
Top mistakes from Neo Kim
By: Neo Kim
System Design: 24/12/2025
39 mistakes YOU make in scaling a system (21/12/2025)
Here are the biggest mistakes I see 99% of YOU making in scaling a system:
- Scaling vertically instead of horizontally (and hitting hard limits)
- Adding “microservices” too early (plus unnecessary complexity)
- Ignoring load balancing
- Not using caching at all… and increasing system load linearly with traffic
- Caching ‘everything’ blindly (causing stale data, memory pressure, and complexity)
- Forgetting CDNs for static assets
- Keeping the server STATEFUL… (and limiting horizontal scalability + recovery)
- Scaling compute before data (databases are usually the first bottleneck)
- Treating the database as “infinite” storage
- Not using read replicas
- Sharding BEFORE understanding access patterns
- Never indexing ‘critical’ queries
- Allowing SLOW queries to reach production… where they amplify under load
- Blocking requests with “synchronous” processing
- Not using QUEUES for background jobs
- Ignoring retries and back-off… transient failures are typical in distributed systems
- Not setting “timeouts” (and causing thread exhaustion + cascading failures)
- Forgetting RATE LIMITS
- Letting failures ‘cascade’ by not using circuit breakers
- Ignoring backpressure
- Deploying ONLY to a single zone/region (and failing during zone/region outages)
- No global traffic routing
- Manual scaling instead of auto scaling (and moving slowly on traffic spikes)
- Shipping without ‘load testing’
- Never doing “capacity planning”
- Storing big files in databases (instead of object storage)
- Sending uncompressed payloads
- Making “excessive” network calls
- Not BATCHING for writes
- No ‘service discovery’
- No failover strategy
- No graceful degradation… and disrupting core functionality under load
- Retries without ‘idempotency’ (and causing data corruption)
- No observability… you cannot scale what you cannot measure
- No monitoring alerts
- No “tracing” across services in a distributed system
- Scaling features instead of fixing BOTTLENECKS
- ‘Blindly’ copying big tech architectures
- Believing scale is about tools… not tradeoffs
23 latency mistakes YOU make when building distributed systems (24/12/2025)
Here are the biggest latency mistakes I see 99% of YOU making:
- Not indexing CRITICAL database queries (and causing full table scans + slow reads at scale)
- Hitting the database ‘repeatedly’ instead of caching hot data
- Not using a CDN for static assets and cacheable responses
- Deploying everything in a single region… and ignoring geographic latency
- Designing architectures with too many network hops… and unnecessary microservice chains
- Not using load balancers (and overloading individual servers)
- Scaling vertically instead of horizontally under CONCURRENT load
- Running “inefficient” code and algorithms on the critical path
- Executing ‘independent’ work sequentially instead of batching + parallelizing it
- Blocking requests with synchronous heavy processing (instead of async queues)
- Sending large payloads (instead of efficient compression & serialization)
- Sticking to HTTP/1.1 and missing out on multiplexing benefits from HTTP/2 & HTTP/3
- Opening new connections per request… instead of connection pooling
- Blocking threads with locks or synchronous I/O (and destroying parallelism)
- Running systems at “full capacity” with no headroom for traffic bursts
- Not measuring latency percentiles or profiling bottlenecks
- Allowing cold starts to affect user-facing requests
- Separating data and compute unnecessarily (and forcing extra network calls)
- Letting SLOW external dependencies sit on the critical path
- Not using ‘timeouts or circuit breakers’ for downstream calls
- Relying on slow disks or poorly tuned infrastructure for latency-sensitive paths
- Choosing runtimes & languages without tuning for latency
- Believing low latency comes from a single optimization instead of disciplined system design & tradeoffs
Latency is rarely caused by one big mistake… It’s the accumulation of many small, avoidable decisions in architecture, code, and execution.
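The percentile point above is worth making concrete: averages hide the tail that users actually feel. A minimal sketch (nearest-rank percentile over synthetic latency samples; the numbers are illustrative only):

```python
import statistics

def percentile(samples, p):
    """Nearest-rank percentile; p in (0, 100]."""
    ordered = sorted(samples)
    k = max(0, round(p / 100 * len(ordered)) - 1)
    return ordered[k]

# Nine fast requests and one slow outlier (e.g. a cold start or lock stall).
latencies_ms = [12, 15, 14, 13, 250, 16, 14, 13, 15, 14]

print(f"mean={statistics.mean(latencies_ms):.1f}ms")  # 37.6ms: looks mediocre
print(f"p50={percentile(latencies_ms, 50)}ms")        # 14ms: typical request is fine
print(f"p99={percentile(latencies_ms, 99)}ms")        # 250ms: the tail users notice
```

One slow request barely moves the mean but dominates p99, which is why latency-sensitive systems track percentiles, not averages.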