These numbers are approximate and will vary by hardware generation, cloud provider, and configuration. The point is not precision — it is the ratio between things. Knowing that a network round trip is roughly 100x slower than a memory access is far more useful than knowing either number exactly. When designing, reason in orders of magnitude first.
## Latency Numbers
These are the building blocks of every performance argument. When someone says "we can't afford a network call in this hot path," this table is why.
| Operation | Latency | Notes |
|---|---|---|
| **CPU & Memory** | | |
| L1 cache reference | ~0.5 ns | The fastest you can read anything |
| L2 cache reference | ~7 ns | 14x L1 |
| L3 cache reference | ~40 ns | Still sub-microsecond |
| Main memory (RAM) access | ~100 ns | 200x L1. The first "slow" thing. |
| Mutex lock/unlock | ~25 ns | Contention makes this much worse |
| **Storage** | | |
| NVMe SSD random read | ~100 µs | 1,000x RAM. Fast for storage, slow vs. memory. |
| NVMe SSD sequential read (1 MB) | ~250 µs | 1 MB at ~4 GB/s. Per byte, sequential is far cheaper than random. |
| SATA SSD random read | ~300 µs | 3x slower than NVMe |
| HDD seek + read | ~10 ms | 100x slower than NVMe SSD. Near-dead for hot paths. |
| HDD sequential read (1 MB) | ~5 ms | At ~200 MB/s, sequential recovers much of HDD's deficit |
| **Network** | | |
| Same datacenter round trip | ~0.5 ms | The baseline for same-region service calls |
| Cross-AZ round trip (same region) | ~1–2 ms | Why cross-AZ replication has a cost |
| Cross-region round trip (US) | ~40–80 ms | Speed of light across ~3,000 miles |
| Cross-continent round trip (US → EU) | ~80–120 ms | Why CDNs exist. The speed of light sets a hard floor. |
| Cross-continent (US → Asia) | ~150–200 ms | Anything interactive needs edge presence |
| TCP handshake overhead | 1 RTT | Why connection pools matter at scale |
| TLS handshake overhead | 1–2 RTT | TLS 1.3 reduced to 1 RTT (0 RTT resumption possible) |
| **Operations at Scale** | | |
| Compress 1 KB with snappy | ~3 µs | Fast enough to almost always be worth it for network |
| Send 1 MB over 1 Gbps network | ~8 ms | Bandwidth ≠ latency. Still feels slow. |
| Read 1 MB sequentially from memory | ~250 µs | 32x faster than network |
- RAM is ~1,000x faster than SSD. Caching works because this ratio is enormous.
- SSD is ~100x faster than HDD. If you are still using HDD for a random-read workload, this is why your tail latencies look the way they do.
- Same-DC network is ~5,000x slower than RAM. Every remote call pays this cost. Design accordingly.
- Cross-region network adds 40–200 ms. No amount of optimization beats physics. Put data near users.
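These ratios turn directly into back-of-envelope checks. A minimal sketch using the approximate numbers from the table above; the request shape (three sequential service calls, two SSD reads) is a hypothetical example:

```python
# Approximate latencies from the table, in microseconds.
RAM_ACCESS = 0.1            # ~100 ns
NVME_RANDOM_READ = 100      # ~100 us
SAME_DC_RTT = 500           # ~0.5 ms

def request_latency_us(network_calls, disk_reads, ram_accesses=0, rtt=SAME_DC_RTT):
    """Rough latency of a request that performs its steps sequentially."""
    return network_calls * rtt + disk_reads * NVME_RANDOM_READ + ram_accesses * RAM_ACCESS

# A hot path with 3 sequential same-DC calls and 2 SSD reads:
print(request_latency_us(3, 2))     # 1700 us; the network dominates
# The same data served from an in-memory cache behind a single call:
print(request_latency_us(1, 0, 2))  # ~500 us; the one RTT is now the floor
```

The point of the sketch: once a request crosses the network even once, memory-level optimizations stop mattering until you remove round trips.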
## Throughput and Capacity Numbers
Throughput is different from latency — it tells you how many operations you can sustain over time, not how fast a single one is. Both matter, and they often pull in opposite directions.
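One standard bridge between the two is Little's law: average concurrency = throughput × latency. A short sketch (the 10K QPS / 50 ms figures are illustrative, not from the tables):

```python
def required_concurrency(throughput_per_s, latency_s):
    """Little's law: average in-flight requests = arrival rate x mean latency."""
    return throughput_per_s * latency_s

# Sustaining 10,000 QPS at 50 ms mean latency needs ~500 requests in flight:
print(required_concurrency(10_000, 0.050))  # 500.0
```

This is why latency regressions show up as capacity problems: if mean latency doubles, you need twice the in-flight requests (threads, connections, memory) to hold the same throughput.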
| Resource | Typical Throughput | Notes |
|---|---|---|
| **Storage I/O** | | |
| NVMe SSD sequential read/write | 3–7 GB/s | Modern gen4/gen5 NVMe. Saturates PCIe lanes before most workloads. |
| NVMe SSD random IOPS (4K) | 500K–1M IOPS | Random I/O is the bottleneck for most databases, not sequential |
| SATA SSD random IOPS (4K) | ~100K IOPS | 5–10x below NVMe for random |
| HDD random IOPS | ~100–200 IOPS | The spinning platter bottleneck. 1,000x below NVMe. |
| HDD sequential throughput | ~150–200 MB/s | HDD is still fine for sequential-heavy analytics workloads |
| **Network** | | |
| 1 GbE NIC (typical VM) | ~125 MB/s | Often the bottleneck on older cloud instance types |
| 10 GbE NIC (modern cloud) | ~1.25 GB/s | Standard for compute-optimized cloud instances |
| 25/40 GbE (high-perf cloud) | 3–5 GB/s | Large instances, network-optimized tiers |
| Single TCP connection | often ~1 Gbps in practice | Limited by congestion control and RTT. Multiple connections are often needed for full bandwidth. |
| **Databases (rough baselines)** | | |
| PostgreSQL simple reads (single node) | ~10K–50K QPS | Highly dependent on query complexity, indexes, RAM vs. disk |
| PostgreSQL writes (with fsync) | ~1K–5K TPS | Durable writes are expensive. Batching helps significantly. |
| Redis (single node, simple ops) | ~100K–1M ops/s | In-memory; pipelining pushes toward the higher end |
| Kafka (single broker, produce) | ~100K–500K msg/s | Throughput vs. latency trade-off in producer config |
| S3 / object storage (single prefix) | ~5,500 GET/s, 3,500 PUT/s | Per prefix. Sharding by prefix breaks this limit. |
| **Memory & CPU** | | |
| Memory bandwidth (modern server) | ~50–200 GB/s | DDR5, multi-channel. Rarely the bottleneck unless doing heavy analytics. |
| Simple HTTP request (Go/Java, no DB) | ~50K–200K req/s | Single core. Network becomes the limit well before CPU for most apps. |
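The storage rows translate directly into scan-time estimates. A sketch for a hypothetical 1 TB table, using mid-range figures from the table above:

```python
TB = 10**12  # decimal units, as in the tables

def scan_seconds(size_bytes, throughput_bytes_per_s):
    """Time to move size_bytes at a sustained sequential throughput."""
    return size_bytes / throughput_bytes_per_s

# Sequential full scan of 1 TB:
print(scan_seconds(TB, 5 * 10**9))    # NVMe at ~5 GB/s: 200 s
print(scan_seconds(TB, 200 * 10**6))  # HDD at ~200 MB/s: 5000 s (~83 min)

# The same 1 TB fetched as random 4 KB reads on an HDD at ~150 IOPS:
reads = TB // 4096
print(reads / 150 / 86_400)           # ~19 days; random I/O on HDD is a non-starter
```

This is the arithmetic behind the claim that random I/O, not sequential throughput, is the bottleneck for most databases.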
## Availability and Downtime
"Nines" are how reliability is measured in practice. Knowing what they translate to in minutes per year changes every SLO conversation.
| Availability | Downtime / Year | Downtime / Month | Downtime / Week | Typical Use Case |
|---|---|---|---|---|
| 90% (1 nine) | 36.5 days | 73 hours | 16.8 hours | Internal batch jobs, dev tools |
| 95% | 18.25 days | 36.5 hours | 8.4 hours | Non-critical internal services |
| 99% (2 nines) | 3.65 days | 7.3 hours | 1.68 hours | Internal dashboards, low-stakes APIs |
| 99.5% | 1.83 days | 3.65 hours | 50 min | Consumer apps where downtime is noticeable |
| 99.9% (3 nines) | 8.77 hours | 43.8 min | 10.1 min | Standard SaaS, internal critical services |
| 99.95% | 4.38 hours | 21.9 min | 5 min | High-value consumer products |
| 99.99% (4 nines) | 52.6 min | 4.38 min | 1 min | Payments, auth, core API infrastructure |
| 99.999% (5 nines) | 5.26 min | 26.3 sec | 6 sec | Telecom, financial clearing, life-critical systems |
If service A calls service B calls service C, and each has 99.9% availability, the combined availability is 99.9% × 99.9% × 99.9% = 99.7% — nearly three times more downtime than any individual service. Long synchronous call chains destroy your effective availability. This is why timeouts, circuit breakers, and graceful degradation are not optional.
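The compounding math above, as a sketch you can extend to your own call graph:

```python
def chain_availability(*availabilities):
    """Serial call chain: every hop must succeed, so availabilities multiply."""
    a = 1.0
    for x in availabilities:
        a *= x
    return a

def downtime_minutes_per_year(availability):
    return (1 - availability) * 365 * 24 * 60

three_hops = chain_availability(0.999, 0.999, 0.999)
print(f"{three_hops:.4%}")                    # 99.7003%
print(downtime_minutes_per_year(0.999))       # ~526 min/year for one service
print(downtime_minutes_per_year(three_hops))  # ~1,575 min/year for the chain
```

Note this models fully serial, fully dependent calls with no fallbacks; caching, retries against replicas, and graceful degradation exist precisely to break this multiplication.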
## Storage Scale Reference
| Unit | Size | Relatable Example |
|---|---|---|
| 1 KB | ~10³ bytes | A short plain-text email |
| 1 MB | ~10⁶ bytes | A compressed photo; ~200 database rows at ~5 KB each |
| 1 GB | ~10⁹ bytes | ~200,000 average database rows; a small relational DB |
| 1 TB | ~10¹² bytes | ~200M database rows; a medium-sized production DB |
| 1 PB | ~10¹⁵ bytes | Large-scale data warehouse; ~200B rows; all photos on a mid-size social platform |
| 1 EB | ~10¹⁸ bytes | Hyperscaler territory. The oft-cited "2.5 quintillion bytes created per day" is ~2.5 EB. |
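A small helper matching the table's decimal units, using the same rough ~5 KB average row (an assumption to adjust for your own schema):

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def human(size_bytes):
    """Render a byte count in decimal units, as the table does."""
    n = float(size_bytes)
    for unit in UNITS:
        if n < 1000 or unit == UNITS[-1]:
            return f"{n:.1f} {unit}"
        n /= 1000

# ~200M rows at ~5 KB each, matching the table's 1 TB row:
print(human(200_000_000 * 5_000))  # 1.0 TB
```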
## Cloud Cost Ratios (Approximate)
Absolute prices change constantly and vary by provider. These ratios are more stable and more useful for design decisions.
| What You're Paying For | Relative Cost | Design Implication |
|---|---|---|
| Intra-AZ data transfer | Free | Keep hot paths in the same AZ when you can |
| Cross-AZ data transfer (same region) | ~$0.01/GB | Small but non-zero. Replication and read replicas have a cost. |
| Egress to internet (cloud → user) | ~$0.08–0.12/GB | 8–12x cross-AZ. CDNs reduce this significantly. |
| Cross-region data transfer | ~$0.02–0.08/GB | Multi-region architectures have a real data transfer line item |
| Object storage (S3/GCS) | ~$0.02/GB/month | Cheapest durable storage. Archive tiers go 10x lower. |
| Block storage (EBS gp3) | ~$0.08/GB/month | 4x object storage. But low latency, so worth it for databases. |
| Managed database storage | ~$0.12–0.25/GB/month | 6–12x object storage. Includes replication cost. |
| In-memory (ElastiCache/Redis) | ~$0.10–0.50/GB/month | 5–25x object storage. Only cache what needs it. |
Data transfer (especially egress) is frequently the largest and most surprising cloud bill line item for data-heavy systems. A system that moves 100 TB of data per month to end users pays ~$8,000–12,000/month in egress alone, before compute or storage. Design your data locality with this in mind: keep processing close to storage, use CDNs for user-facing content, and question any architecture that requires large cross-region data movement.
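The egress arithmetic above, as a sketch (the rates are the approximate ranges from the table, not a quote from any provider):

```python
def monthly_egress_cost(tb_per_month, dollars_per_gb):
    # Cloud billing uses decimal units: 1 TB = 1,000 GB.
    return tb_per_month * 1_000 * dollars_per_gb

low = monthly_egress_cost(100, 0.08)
high = monthly_egress_cost(100, 0.12)
print(f"${low:,.0f} - ${high:,.0f} per month")  # $8,000 - $12,000 per month
```

Running this for your own projected traffic, before committing to an architecture, is one of the cheapest design reviews available.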