EBS performance insights for AWS users

EBS can be blisteringly fast—or surprisingly sluggish. The difference is usually configuration, not hardware. Here’s how to get consistent low latency and high throughput while cutting storage spend.

What you’ll learn:

  • How gp3, io2, st1, and sc1 actually perform, and where they fit
  • IOPS, throughput, latency, queue depth, and burst behavior—without the fluff
  • Practical tuning and troubleshooting that fixes “it’s slow” for good
  • Cost-performance trade-offs, failure-rate expectations, and sizing tips
  • Proven steps to improve EBS performance while reducing AWS costs

EBS in simple terms

Amazon EBS is block storage for EC2. It behaves like a fast virtual disk you can snapshot, resize, and attach/detach. You choose a volume type with specific performance traits and pay per GiB (plus provisioned IOPS where applicable). For most applications, start with gp3; for latency-critical databases, use io2; for large sequential analytics, consider st1; for cold, infrequent sequential access, sc1. See the official guide to EBS performance concepts and limits.

The performance primitives that matter

IOPS, throughput, and latency form the foundation of EBS performance. IOPS represents how many read/write operations per second your volume can sustain, with small, random I/O operations (typical in databases) being IOPS-bound. Throughput measures how many MiB/s you can push, making large, sequential workloads (ETL jobs, full table scans) throughput-bound. Latency—the time to complete a single I/O—is what database users feel first.

Queue depth, the number of outstanding I/O operations, requires careful tuning. Too low leaves performance unused, while too high increases latency. Some volume types use credits for burst behavior, and when these credits deplete, performance drops until they recharge.

AWS now exposes EBS volume health with per-second metrics and I/O latency histograms to speed diagnosis—monitor these in CloudWatch and set up alarms for early warnings (Amazon EBS performance statistics announcement).
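
A minimal sketch of pulling those numbers with the AWS CLI (volume ID and time window are placeholders): standard EBS CloudWatch metrics arrive at one-minute resolution, so dividing the Sum of VolumeReadOps by the period gives average read IOPS.

# Average read IOPS = Sum of VolumeReadOps / period (60s here)
aws cloudwatch get-metric-statistics \
  --namespace AWS/EBS \
  --metric-name VolumeReadOps \
  --dimensions Name=VolumeId,Value=vol-0123456789abcdef0 \
  --start-time 2024-01-01T00:00:00Z --end-time 2024-01-01T01:00:00Z \
  --period 60 --statistics Sum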

EBS volume types and how “fast” they are

[Diagram: EBS volume family comparison: gp3 (3,000 IOPS / 125 MiB/s baseline), io2 (high IOPS, low latency), st1 (high sequential throughput), sc1 (low-cost cold data)]

General Purpose SSD (gp3)

  • Baseline: 3,000 IOPS and 125 MiB/s regardless of size; no burst credits
  • Scale up independently: provision up to 16,000 IOPS and 1,000 MiB/s without increasing GiB (see the sketch after this list)
  • Best for: most app servers, microservices, dev/test, general RDS
  • Why engineers prefer it to gp2: predictable performance, roughly 20% lower $/GiB at common price points, and no credit starvation (AWS EBS performance guide; gp3 overview)
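
As a sketch, creating a gp3 volume with IOPS and throughput provisioned independently of size might look like this (all values illustrative):

# 500 GiB gp3 volume with 6,000 IOPS and 500 MiB/s; no extra GiB required
aws ec2 create-volume \
  --availability-zone us-east-1a \
  --volume-type gp3 \
  --size 500 \
  --iops 6000 \
  --throughput 500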

Provisioned IOPS SSD (io2, including Block Express)

  • Consistent, low-latency I/O with high ceilings—up to 256,000 IOPS and 4,000 MiB/s per volume on Block Express with 99.999% durability for mission-critical workloads (AWS EBS docs; Lucidity comparison)
  • Best for: transaction-heavy databases (MySQL, PostgreSQL, Oracle), high-traffic OLTP, and systems sensitive to tail latency
  • Cost model: $/GiB plus $/provisioned IOPS; provision only what you need

Throughput-Optimized HDD (st1)

  • Designed for large, sequential I/O; cost-efficient at scale
  • Typical caps: up to 500 MiB/s and lower IOPS ceilings—great for log processing and big scans (N2WS EBS comparison)
  • Best for: data warehousing, streaming ingestion, and ETL that is tolerant of higher latency

Cold HDD (sc1)

  • Lowest cost per GiB; lowest performance
  • Typical caps: up to 250 MiB/s with very limited IOPS (N2WS comparison)
  • Best for: infrequently accessed, large sequential datasets where throughput and latency are secondary

Legacy gp2

gp2 ties performance to capacity (3 IOPS per GiB, burstable via credits), so small volumes can throttle under sustained load. It remains supported, but an in-place migration to gp3 is usually cheaper and more predictable; see the bursting section below.

Which is the highest performance?

io2 (Block Express) delivers the highest IOPS/throughput and the tightest latency guarantees for single volumes. Multi-volume striping can exceed single-volume limits but adds complexity.

Is EBS faster than S3?

They’re different tools for different jobs. EBS is block storage attached to EC2 with low-latency I/O; S3 is object storage accessed over HTTP, with per-request overhead and very different access patterns. For databases and filesystem workloads, EBS wins on latency; for static objects and massive parallel reads, S3 scales differently.

Bursting and credits explained (why gp3 fixed the pain)

The gp2 volume type earns burst credits at a rate tied to volume size (a baseline of 3 IOPS per GiB) and spends them to burst, so sustained load, especially on small volumes, depletes credits and throttles IOPS. Similarly, st1/sc1 use throughput credits: sustained reads/writes beyond baseline drain credits and reduce throughput.

The newer gp3 volumes eliminate burst credits entirely, offering fixed baselines and independent provisioning of IOPS and throughput. This creates predictable performance and easier capacity planning (Lucidity gp3 summary).
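
The migration itself is an online operation: aws ec2 modify-volume converts gp2 to gp3 in place, with no detach or downtime (volume ID is a placeholder):

aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --volume-type gp3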

Cost-performance trade-offs that actually save money

Making smart choices between volume types can significantly impact both performance and cost:

  • gp3 vs gp2: gp3 is commonly ~20% cheaper per GiB and delivers consistent baseline performance; add IOPS/throughput only when required (Lucidity; CloudZero)
  • io2: Pricier due to provisioned IOPS, but required for consistent sub-10ms latency at high IOPS. Use for production OLTP; keep dev/test on gp3
  • st1/sc1: Excellent $/GiB for large sequential workloads; not suitable for random I/O

With disciplined right-sizing and migration from gp2 to gp3, it’s common to reduce EBS costs by 20–30% with no performance loss. Many teams see up to 40% when storage rightsizing is paired with instance/placement tuning. If you want this on autopilot, Hykell’s EBS optimization delivers savings without engineering toil.

Don’t forget the EC2 side: EBS bandwidth and latency are instance-bound

Your volume can only go as fast as the instance’s EBS-optimized bandwidth and the network path allow. Check:

  • Instance EBS bandwidth caps and max IOPS/throughput in the instance family docs (queryable from the CLI, as shown after this list)
  • Enable EBS-optimized (most Nitro instances are by default)
  • Use larger instances or Nitro families when you need higher EBS throughput/IOPS
  • Co-locate volumes and instances in the same AZ; consider placement groups for ultra-low latency
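
To see what an instance type can actually deliver, query the instance-type metadata; the output includes baseline and maximum EBS bandwidth, IOPS, and throughput (instance type chosen for illustration):

aws ec2 describe-instance-types \
  --instance-types m5.2xlarge \
  --query 'InstanceTypes[].EbsInfo.EbsOptimizedInfo'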

Storage often bottlenecks EC2 apps—pair storage tuning with AWS EC2 performance tuning for best results.

Practical tuning checklist

1) Match the volume to the workload

  • Random, latency-sensitive (OLTP): io2 (right-size IOPS), or gp3 with provisioned IOPS if moderate
  • Mixed/general purpose: gp3 baseline first; provision IOPS/throughput only if metrics demand it
  • Large sequential: st1; cold sequential: sc1

2) Right-size capacity, IOPS, and throughput

  • Start with gp3 baseline and observe latency/queue depth; then add IOPS or throughput independently
  • For io2, provision just enough IOPS for P95/P99 needs; avoid overprovisioning

3) Tune I/O size and queue depth in the OS/application

  • Databases: increase parallelism, adjust innodb_io_capacity and flush settings (sketch after this list)
  • Filesystems: align block size with workload and EBS optimal I/O size (commonly 256 KiB or 1 MiB for sequential)
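
A quick sketch of inspecting the OS-side queue and nudging MySQL’s I/O budget (device name and values are examples; persist the MySQL settings in my.cnf so they survive restarts):

# Current block-device queue length and scheduler for the EBS device
cat /sys/block/nvme1n1/queue/nr_requests
cat /sys/block/nvme1n1/queue/scheduler

# Point InnoDB's background I/O budget at the volume's provisioned IOPS
mysql -e "SET GLOBAL innodb_io_capacity = 4000;"
mysql -e "SET GLOBAL innodb_io_capacity_max = 8000;"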

4) Align the instance

  • Ensure instance EBS bandwidth > sum of attached volumes’ needs
  • Prefer Nitro instances and high base-clock CPUs for latency-critical work
  • Separate OS and data volumes for isolation and performance (best practices overview)

5) Use multi-volume striping when needed

  • RAID 0 across multiple volumes to exceed single-volume ceilings; use snapshot-aware backup configurations and understand the failure-domain implication: losing any one volume loses the array (see the sketch after this list)
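
A minimal RAID 0 sketch with mdadm (device names and mount point are examples):

# Stripe two EBS volumes; a failure of either device loses the whole array
sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme1n1 /dev/nvme2n1
sudo mkfs.xfs /dev/md0
sudo mount -o noatime /dev/md0 /data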

6) Monitor and auto-adjust

  • Use CloudWatch per-second metrics and latency histograms; alert on approaching provisioned limits and rising queue depth (AWS performance statistics); see the alarm sketch after this list
  • Automate changes: schedule ModifyVolume to increase IOPS/throughput ahead of peak windows; modifications take time to complete, and performance can sit between the source and target configuration while the volume is optimizing
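
One way to wire the early warning, sketched as a queue-depth alarm (volume ID, threshold, and SNS topic are placeholders to adapt to your own baselines):

aws cloudwatch put-metric-alarm \
  --alarm-name ebs-queue-depth-high \
  --namespace AWS/EBS \
  --metric-name VolumeQueueLength \
  --dimensions Name=VolumeId,Value=vol-0123456789abcdef0 \
  --statistic Average --period 60 \
  --evaluation-periods 5 \
  --threshold 32 --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:storage-alerts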

7) Lifecycle cold data

  • Move inactive, large files to S3, or swap in st1/sc1 where access patterns fit (example below)
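
Tiering can be as simple as a scheduled sync to S3 (paths, bucket name, and storage class are illustrative):

aws s3 sync /data/archive s3://example-archive-bucket/archive/ --storage-class STANDARD_IA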

If your workload is a database, pair storage and engine tuning—see AWS RDS MySQL performance tuning.

Troubleshooting slow EBS like an SRE

[Diagram: troubleshooting sketch relating queue depth, rising latency, and burst-credit depletion when diagnosing slow EBS]

Symptom: High latency and stalled requests

  • Check CloudWatch VolumeReadOps/VolumeWriteOps (Sum per period ÷ period = average IOPS) against provisioned IOPS; if you’re at the ceiling, increase IOPS (gp3/io2) or distribute load across more volumes
  • Compare VolumeReadBytes/VolumeWriteBytes against provisioned throughput; if at the ceiling, provision more throughput (gp3) or move to io2/striping
  • Validate EC2 EBS bandwidth; if instance-bound, upsize the instance

Symptom: Performance degrades over time

  • For gp2/st1/sc1, confirm the credit balance (BurstBalance; query after this list); if it’s draining, migrate to gp3 or switch workload class
  • Check noisy neighbor on the instance (CPU steal, IRQ saturation); pin interrupts and adjust NIC/EBS IRQ affinity
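
Checking the credit balance is one CloudWatch query away; a BurstBalance minimum near zero confirms throttling (volume ID and time window are placeholders):

aws cloudwatch get-metric-statistics \
  --namespace AWS/EBS --metric-name BurstBalance \
  --dimensions Name=VolumeId,Value=vol-0123456789abcdef0 \
  --start-time 2024-01-01T00:00:00Z --end-time 2024-01-01T06:00:00Z \
  --period 300 --statistics Minimum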

Symptom: Random I/O is slow but throughput is fine

  • Increase queue depth (e.g., fio iodepth 16–64 for SSD); for databases, tune parallelism and flush settings
  • Ensure filesystem mount options are optimized (noatime, correct scheduler); example after this list
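
A small sketch of the mount-side fixes (device, filesystem, and mount point are examples; the fstab entry persists the option across reboots):

sudo mount -o remount,noatime /data
echo '/dev/nvme1n1 /data xfs noatime 0 0' | sudo tee -a /etc/fstab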

Measure with fio

Example random read test:

fio --name=randread --filename=/dev/nvme1n1 --rw=randread --bs=4k --iodepth=32 --numjobs=4 --direct=1 --time_based --runtime=120 --group_reporting

Expect results to approach provisioned IOPS when EC2 bandwidth and queue depth are sufficient. Compare to AWS stated limits in the EBS performance guide.
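
For throughput-bound workloads, a sequential variant isolates MiB/s instead of IOPS (same device placeholder; this test only reads, but still point it at a non-production volume):

fio --name=seqread --filename=/dev/nvme1n1 --rw=read --bs=1M --iodepth=8 --numjobs=1 --direct=1 --time_based --runtime=120 --group_reporting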

Apply changes safely

Use aws ec2 modify-volume to adjust IOPS/throughput; monitor until the modification reaches the “optimizing” then “completed” state (see AWS CLI docs referenced in the EBS performance guide).
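
Progress is visible from the CLI while the change applies (volume ID is a placeholder):

aws ec2 describe-volumes-modifications --volume-ids vol-0123456789abcdef0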

For broader telemetry, adopt cloud native application monitoring so storage, compute, and app signals correlate in one view.

Reliability and failure rates you can plan around

EBS durability is achieved by replicating data within an Availability Zone; typical annual failure rates are reported in the 0.1%–0.2% range across volume types (N2WS comparison).

io2 advertises 99.999% durability, the highest among EBS SSD classes (Lucidity volume types guide).

For availability beyond a single AZ, architect with multi-AZ replication at the application or database layer (e.g., RDS Multi-AZ, storage replication), and understand how this ties to your AWS performance SLA.

Snapshots are not a substitute for cross-AZ/region redundancy; use them for backup, DR, and volume cloning.

Speed planning: quick sizing scenarios

OLTP database needing 20k IOPS at sub-10ms latency

Choose io2, provision ~20k IOPS; validate EC2 EBS bandwidth supports the target IOPS and latency. Consider Block Express for headroom and tail-latency control (EBS performance guide).

Analytics scan sustaining 400–500 MiB/s sequential reads

Use st1 sized large enough to sustain the needed throughput, or gp3 with provisioned throughput; ensure instance can deliver >500 MiB/s EBS bandwidth (N2WS caps).

General-purpose app server with spiky load, 2–3 TB data

Migrate gp2 → gp3. Start with baseline 3,000 IOPS/125 MiB/s; add IOPS/throughput only if CloudWatch shows queuing or throttling (Lucidity: gp3 benefits).

Proven ways to improve EBS performance and reduce AWS costs

  • Standardize on gp3 for most volumes; migrate off gp2 to eliminate credit risk and save ~20% $/GiB
  • Right-size IOPS/throughput based on observed P95/P99 latency and queue depth, not on peak guesswork
  • Align EC2 instance EBS bandwidth with aggregate volume needs; upgrade instances when storage-bound
  • Use io2 only where latency SLAs demand it; keep non-critical and dev/test on gp3
  • Stripe when you must exceed single-volume ceilings—but account for failure domains and backups
  • Tier cold/sequential data to st1/sc1 or S3 to cut $/GiB dramatically
  • Automate monitoring and resizing with CloudWatch alarms and scheduled modifications; integrate with CI/CD

Want this handled automatically? Hykell’s platform continuously right-sizes capacity, IOPS, and throughput, and flags instance storage bottlenecks—delivering up to 40% savings with performance intact. Explore optimized EBS volume sizing and performance tuning and broader AWS EBS optimization. For end-to-end savings across compute and storage, see Hykell and pair with best practices for AWS EC2 performance management and cloud optimization trends.

Ready to balance speed and spend? Let Hykell put your EBS and EC2 optimization on autopilot. If you don’t save, you don’t pay.