Mike's Notes
This is a compilation of posts taken from different issues of Neo Kim's excellent System Design Newsletter. It's worth subscribing to.
Last Updated
11/01/2026
Top mistakes from Neo Kim
By: Neo Kim
System Design: 24/12/2025
39 mistakes YOU make in scaling a system (21/12/2025)
Here are the biggest mistakes I see 99% of YOU making in scaling a system:
- Scaling vertically instead of horizontally (and hitting hard limits)
- Adding “microservices” too early (plus unnecessary complexity)
- Ignoring load balancing
- Not using caching at all… and increasing system load linearly with traffic
- Caching ‘everything’ blindly (causing stale data, memory pressure, and complexity)
- Forgetting CDNs for static assets
- Keeping the server STATEFUL… (and limiting horizontal scalability + recovery)
- Scaling compute before data (databases are usually the first bottleneck)
- Treating the database as “infinite” storage
- Not using read replicas
- Sharding BEFORE understanding access patterns
- Never indexing ‘critical’ queries
- Allowing SLOW queries to reach production… where they amplify under load
- Blocking requests with “synchronous” processing
- Not using QUEUES for background jobs
- Ignoring retries and back-off… transient failures are typical in distributed systems
- Not setting “timeouts” (and causing thread exhaustion + cascading failures)
- Forgetting RATE LIMITS
- Letting failures ‘cascade’ by not using circuit breakers
- Ignoring backpressure
- Deploying ONLY to a single zone/region (and failing during zone/region outages)
- No global traffic routing
- Manual scaling instead of auto scaling (and moving slowly on traffic spikes)
- Shipping without ‘load testing’
- Never doing “capacity planning”
- Storing big files in databases (instead of object storage)
- Sending uncompressed payloads
- Making “excessive” network calls
- Not BATCHING for writes
- No ‘service discovery’
- No failover strategy
- No graceful degradation… and disrupting core functionality under load
- Retries without ‘idempotency’ (and causing data corruption)
- No observability… you cannot scale what you cannot measure
- No monitoring alerts
- No “tracing” across services in a distributed system
- Scaling features instead of fixing BOTTLENECKS
- ‘Blindly’ copying big tech architectures
- Believing scale is about tools… not tradeoffs
23 latency mistakes YOU make when building distributed systems (24/12/2025)
Here are the biggest latency mistakes I see 99% of YOU making:
- Not indexing CRITICAL database queries (and causing full table scans + slow reads at scale)
- Hitting the database ‘repeatedly’ instead of caching hot data
- Not using a CDN for static assets and cacheable responses
- Deploying everything in a single region… and ignoring geographic latency
- Designing architectures with too many network hops… and unnecessary microservice chains
- Not using load balancers (and overloading individual servers)
- Scaling vertically instead of horizontally under CONCURRENT load
- Running “inefficient” code and algorithms on the critical path
- Executing ‘independent’ work sequentially instead of batching + parallelizing it
- Blocking requests with synchronous heavy processing (instead of async queues)
- Sending large payloads (instead of efficient compression & serialization)
- Sticking to HTTP/1.1 and missing out on multiplexing benefits from HTTP/2 & HTTP/3
- Opening new connections per request… instead of connection pooling
- Blocking threads with locks or synchronous I/O (and destroying parallelism)
- Running systems at “full capacity” with no headroom for traffic bursts
- Not measuring latency percentiles or profiling bottlenecks
- Allowing cold starts to affect user-facing requests
- Separating data and compute unnecessarily (and forcing extra network calls)
- Letting SLOW external dependencies sit on the critical path
- Not using ‘timeouts or circuit breakers’ for downstream calls
- Relying on slow disks or poorly tuned infrastructure for latency-sensitive paths
- Choosing runtimes & languages without tuning for latency
- Believing low latency comes from a single optimization instead of disciplined system design & tradeoffs
Latency is rarely caused by one big mistake… It’s the accumulation of many small, avoidable decisions in architecture, code, and execution.
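The percentile point above is worth making concrete: averages hide the tail that users actually feel. A minimal sketch (nearest-rank percentile over synthetic latency samples; the numbers are illustrative only):

```python
import statistics

def percentile(samples, p):
    """Nearest-rank percentile; p in (0, 100]."""
    ordered = sorted(samples)
    k = max(0, round(p / 100 * len(ordered)) - 1)
    return ordered[k]

# Nine fast requests and one slow outlier (e.g. a cold start or lock stall).
latencies_ms = [12, 15, 14, 13, 250, 16, 14, 13, 15, 14]

print(f"mean={statistics.mean(latencies_ms):.1f}ms")  # 37.6ms: looks mediocre
print(f"p50={percentile(latencies_ms, 50)}ms")        # 14ms: typical request is fine
print(f"p99={percentile(latencies_ms, 99)}ms")        # 250ms: the tail users notice
```

One slow request barely moves the mean but dominates p99, which is why latency-sensitive systems track percentiles, not averages.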