EBS Performance Benchmarking and Optimization: A Practical Guide for AWS Engineers

Ott Salmar

Co-Founder | Hykell

Ever wondered why your AWS cloud workloads crawl despite paying premium rates for EBS volumes? You’re not alone. For many DevOps teams, achieving the right balance between Amazon EBS performance and cost efficiency feels like chasing a moving target. This guide cuts through the complexity with actionable benchmarking methods, troubleshooting approaches, and optimization strategies that deliver measurable results.

Understanding EBS Performance Fundamentals

Before launching any benchmark, you need to understand what metrics actually matter for EBS performance:

[Figure: EBS performance metrics (IOPS, throughput, latency, queue depth, and burst credits)]

IOPS (Input/Output Operations Per Second): The number of read/write operations per second, critical for transactional workloads like databases
Throughput: Measured in MiB/s, representing data transfer volume, important for large sequential workloads
Latency: Time to complete an I/O operation, typically in milliseconds
Queue Depth: Number of pending I/O requests to a volume
Burst Credits: Applicable to gp2, st1, and sc1 volumes, allowing temporary performance above baseline
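These metrics are tied together by simple arithmetic, which helps when reading benchmark output. The sketch below is only an illustration: the function names are made up for this guide, throughput is taken as IOPS × block size, and the queue-depth estimate uses Little's law (in-flight I/Os ≈ IOPS × latency).

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names) relating the metrics above.

# throughput_mib IOPS BLOCK_KIB -> MiB/s moved at that I/O rate and block size
throughput_mib() {
  awk -v iops="$1" -v kib="$2" 'BEGIN { printf "%.1f\n", iops * kib / 1024 }'
}

# avg_queue_depth IOPS LATENCY_MS -> average in-flight I/Os (Little's law)
avg_queue_depth() {
  awk -v iops="$1" -v ms="$2" 'BEGIN { printf "%.1f\n", iops * ms / 1000 }'
}

throughput_mib 3000 4     # small random I/O: many operations, little data
throughput_mib 3000 64    # the same I/O rate with 64 KiB blocks
avg_queue_depth 3000 1    # in-flight I/Os needed for 3,000 IOPS at 1 ms each
```

At 64 KiB per operation, 3,000 IOPS would need 187.5 MiB/s, so a workload like that runs into a 125 MiB/s throughput limit before it reaches the IOPS limit.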

Your actual performance is largely determined by configuration choices rather than hardware limitations. As Hykell’s research shows, different volume types serve different purposes:

gp3: Baseline 3,000 IOPS and 125 MiB/s regardless of size with no burst credits
io2/io2 Block Express: For consistent, low-latency I/O up to 256,000 IOPS
st1: Designed for sequential workloads with up to 500 MiB/s throughput
sc1: Lowest cost option for infrequently accessed data

Setting Up Your EBS Benchmark Environment

Recommended Benchmark Procedure

[Figure: EBS benchmarking flow (EC2 instance, attach volume, run fio, review results)]

Launch an EBS-optimized instance in your target Availability Zone
Create new EBS volumes specifically for testing (never benchmark production volumes)
Attach volumes to your instance
Configure and mount the block device
Install benchmarking tools
Run your benchmarks
Delete test volumes and terminate the instance when done
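The procedure above can be sketched with the AWS CLI. This is a dry-run outline rather than a finished script: it prints each command and only executes when EXECUTE=1 is set. The volume and instance IDs, the /dev/nvme1n1 device name, the /mnt/ebs-bench mount point, and the dnf package manager are all placeholders or assumptions to adapt to your environment.

```shell
#!/bin/sh
# Dry-run sketch of the benchmark procedure. Commands are printed, and run
# only when EXECUTE=1. All IDs, device names and paths are placeholders.

run() {
  echo "+ $*"
  if [ "${EXECUTE:-0}" = "1" ]; then "$@"; fi
}

# create_test_volume AZ SIZE_GIB: dedicated gp3 test volume at baseline settings
create_test_volume() {
  run aws ec2 create-volume --availability-zone "$1" --volume-type gp3 \
    --size "$2" --iops 3000 --throughput 125
}

# attach_and_mount VOLUME_ID INSTANCE_ID DEVICE MOUNT_POINT
attach_and_mount() {
  run aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device /dev/sdf
  run aws ec2 wait volume-in-use --volume-ids "$1"
  run sudo mkfs -t xfs "$3"        # DEVICE as seen by the OS (assumed NVMe name)
  run sudo mkdir -p "$4"
  run sudo mount "$3" "$4"
}

# cleanup VOLUME_ID MOUNT_POINT: remove everything created for the test
cleanup() {
  run sudo umount "$2"
  run aws ec2 detach-volume --volume-id "$1"
  run aws ec2 wait volume-available --volume-ids "$1"
  run aws ec2 delete-volume --volume-id "$1"
}

create_test_volume us-east-1a 100
attach_and_mount vol-0123456789abcdef0 i-0123456789abcdef0 /dev/nvme1n1 /mnt/ebs-bench
run sudo dnf install -y fio        # package manager is an assumption
cleanup vol-0123456789abcdef0 /mnt/ebs-bench
```

In a real run the volume ID comes from the create-volume output, so the attach step would use that value instead of the placeholder.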

Using FIO for Benchmarking

Flexible I/O Tester (fio) is a widely used tool for EBS benchmarking, and AWS's own benchmarking documentation relies on it. Here’s a basic set of commands to get started:

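# All tests use direct I/O (--direct=1) so results reflect the volume rather than
# the page cache, and libaio with an explicit queue depth (--iodepth).
# Replace /path/to/test with a directory on the mounted test volume.

# Random read test (4 KiB blocks, measures IOPS)
fio --name=random-read --directory=/path/to/test --rw=randread --bs=4k --size=4g --numjobs=1 --ioengine=libaio --direct=1 --iodepth=32 --time_based --runtime=180 --group_reporting

# Random write test (4 KiB blocks, measures IOPS)
fio --name=random-write --directory=/path/to/test --rw=randwrite --bs=4k --size=4g --numjobs=1 --ioengine=libaio --direct=1 --iodepth=32 --time_based --runtime=180 --group_reporting

# Sequential read test (1 MiB blocks, measures throughput)
fio --name=sequential-read --directory=/path/to/test --rw=read --bs=1m --size=4g --numjobs=1 --ioengine=libaio --direct=1 --iodepth=16 --time_based --runtime=180 --group_reporting

# Sequential write test (1 MiB blocks, measures throughput)
fio --name=sequential-write --directory=/path/to/test --rw=write --bs=1m --size=4g --numjobs=1 --ioengine=libaio --direct=1 --iodepth=16 --time_based --runtime=180 --group_reporting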
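Because queue depth has such a large effect on measured IOPS and latency, it can help to repeat the random read test at several depths. The loop below is one possible way to do that, not a prescribed method: the depth values, the 60-second runtime, and the output file names are arbitrary choices, and the commands are printed unless EXECUTE=1 is set.

```shell
#!/bin/sh
# Sketch: sweep fio's --iodepth to see where IOPS stop scaling and latency
# starts to climb. Prints commands by default; EXECUTE=1 runs them (needs fio).

TEST_DIR="${TEST_DIR:-/path/to/test}"   # directory on the mounted test volume

run() {
  echo "+ $*"
  if [ "${EXECUTE:-0}" = "1" ]; then "$@"; fi
}

sweep_iodepth() {
  for depth in 1 4 16 32 64; do
    run fio --name="randread-qd${depth}" --directory="$TEST_DIR" \
      --rw=randread --bs=4k --size=4g --numjobs=1 \
      --ioengine=libaio --direct=1 --iodepth="$depth" \
      --time_based --runtime=60 --group_reporting \
      --output-format=json --output="randread-qd${depth}.json"
  done
}

sweep_iodepth
```

Each run writes its results to a separate JSON file, which makes the depths easy to compare afterwards.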

When benchmarking, follow these best practices identified in Hykell’s cloud performance benchmarking guide:

Establish clear baselines before optimization
Use consistent testing environments (same instance types, same AZ)
Control for variables like time of day and concurrent workloads
Adopt industry standards like TPC-DS for big data workloads
Schedule regular benchmarking cycles (quarterly reviews are common)

Interpreting Benchmark Results

After running benchmarks, focus on these key insights:

IOPS Achieved vs. Provisioned: Are you getting what you’re paying for?
Latency Distributions: Examine P95/P99 latencies, not just averages
Queue Depth: A queue length that stays high while IOPS or throughput sit at the provisioned limit indicates a saturated volume
Throughput Consistency: Check for variations in throughput over time
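The first two checks can be pulled out of fio's JSON output. The helpers below are a sketch that assumes fio 3.x was run with --output-format=json and that jq is installed; the function names are hypothetical, and the field layout (completion latency under clat_ns, in nanoseconds) should be confirmed against your fio version.

```shell
#!/bin/sh
# pct_of_provisioned ACHIEVED PROVISIONED -> achieved IOPS as % of provisioned
pct_of_provisioned() {
  awk -v a="$1" -v p="$2" 'BEGIN { printf "%.1f\n", a / p * 100 }'
}

# fio_read_summary FILE -> "<iops><TAB><p99 latency in ms>" for the first job.
# Assumes fio 3.x JSON; requires jq.
fio_read_summary() {
  jq -r '.jobs[0].read
         | [.iops, (.clat_ns.percentile["99.000000"] / 1000000)]
         | @tsv' "$1"
}

pct_of_provisioned 2850 3000   # e.g. measured 2,850 IOPS on a 3,000 IOPS volume
```

A result well below 100% on a correctly sized test is a prompt to look at the pitfalls listed next before blaming the volume.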

Common benchmark interpretation pitfalls include:

Ignoring the initialization effect (first-access penalty on volumes restored from snapshots; new empty volumes are not affected)
Not accounting for burst credits being depleted
Overlooking instance-level bandwidth limitations
Testing with inappropriate I/O patterns for your workload
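The burst-credit pitfall is easy to quantify for gp2. The sketch below uses the figures AWS documents for gp2 (3 IOPS per GiB with a floor of 100 and a cap of 16,000, bursting to 3,000 IOPS from a bucket of 5.4 million I/O credits) and assumes the bucket starts full and the test drives a steady 3,000 IOPS. The function names are made up for illustration.

```shell
#!/bin/sh
# gp2_baseline_iops SIZE_GIB -> baseline IOPS (3 per GiB, min 100, max 16000)
gp2_baseline_iops() {
  iops=$(( $1 * 3 ))
  if [ "$iops" -lt 100 ]; then iops=100; fi
  if [ "$iops" -gt 16000 ]; then iops=16000; fi
  echo "$iops"
}

# gp2_burst_seconds SIZE_GIB -> seconds a full credit bucket lasts at 3,000 IOPS
gp2_burst_seconds() {
  base=$(gp2_baseline_iops "$1")
  if [ "$base" -ge 3000 ]; then
    echo "no-burst"   # baseline already meets or exceeds the burst rate
    return
  fi
  echo $(( 5400000 / (3000 - base) ))
}

gp2_baseline_iops 100   # 300
gp2_burst_seconds 100   # 2000
```

A 100 GiB gp2 volume can therefore burst for about 2,000 seconds, so a 180-second test never sees its 300 IOPS baseline. To measure baseline behavior, run long enough to drain the credits first.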

Troubleshooting EBS Performance Issues

When faced with poor EBS performance, follow this systematic approach:

Check CloudWatch Metrics: Review VolumeReadOps, VolumeWriteOps, VolumeQueueLength and other AWS EBS performance metrics
Analyze Bottlenecks:
- For high latency: Check if you’re exceeding provisioned IOPS/throughput
- For time-based degradation: Verify if burst credits are depleted
- For intermittent performance issues: Check for “noisy neighbor” effects
Verify Instance Settings:
- Confirm you’re using an EBS-optimized instance
- Check if instance bandwidth limits are restricting EBS performance
- Verify the instance type supports your EBS performance requirements
Review Volume Configuration:
- Verify initialization status (volumes restored from snapshots may need every block read once before they deliver full performance)
- Check if you’re hitting single-volume performance limits
- Ensure proper alignment with workload patterns
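CloudWatch does not report latency or IOPS for EBS directly, but both can be derived from the Sum statistics of the metrics named above. The helpers below sketch that arithmetic. The function names are hypothetical, the 300-second period is an arbitrary choice, and the sample numbers are invented for illustration.

```shell
#!/bin/sh
# avg_latency_ms TOTAL_TIME_SECONDS OPS -> average milliseconds per operation
# (e.g. Sum of VolumeTotalReadTime divided by Sum of VolumeReadOps)
avg_latency_ms() {
  awk -v t="$1" -v ops="$2" \
    'BEGIN { if (ops == 0) print "n/a"; else printf "%.2f\n", t / ops * 1000 }'
}

# avg_iops OPS PERIOD_SECONDS -> average operations per second over the period
avg_iops() {
  awk -v ops="$1" -v p="$2" 'BEGIN { printf "%.0f\n", ops / p }'
}

# metric_sum_cmd VOLUME_ID METRIC START END -> prints the CLI call that
# fetches 5-minute Sum datapoints for one EBS metric
metric_sum_cmd() {
  echo aws cloudwatch get-metric-statistics --namespace AWS/EBS \
    --metric-name "$2" --dimensions "Name=VolumeId,Value=$1" \
    --start-time "$3" --end-time "$4" --period 300 --statistics Sum
}

metric_sum_cmd vol-0123456789abcdef0 VolumeReadOps 2024-01-01T09:00:00Z 2024-01-01T10:00:00Z
avg_latency_ms 450 900000   # invented sample: 450 s of read time, 900,000 reads
avg_iops 900000 300
```

If the derived IOPS sit at the provisioned figure while latency climbs, the volume limit is the likely bottleneck. If IOPS are well below it, look at the instance settings instead.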

Optimizing EBS Performance

Based on benchmark results, apply these optimization strategies:

Volume Type Selection

Standardize on gp3 for most volumes and migrate off gp2 to eliminate credit risk and save approximately 20% per GiB, as recommended by Hykell’s optimization team.
Use io2 only where latency SLAs demand it; keep non-critical and dev/test environments on gp3.
Tier cold/sequential data to st1/sc1 or S3 to cut costs per GiB dramatically.
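A gp2-to-gp3 migration can be sketched as follows. The per-GiB prices are assumptions (roughly the us-east-1 list prices, overridable through environment variables) and should be checked against current pricing for your region. The volume ID is a placeholder, and commands are printed unless EXECUTE=1 is set.

```shell
#!/bin/sh
# Assumed prices in USD per GiB-month; override via environment to match your region.
GP2_PRICE="${GP2_PRICE:-0.10}"
GP3_PRICE="${GP3_PRICE:-0.08}"

run() {
  echo "+ $*"
  if [ "${EXECUTE:-0}" = "1" ]; then "$@"; fi
}

# monthly_saving SIZE_GIB -> USD saved per month on capacity alone
monthly_saving() {
  awk -v s="$1" -v a="$GP2_PRICE" -v b="$GP3_PRICE" \
    'BEGIN { printf "%.2f\n", s * (a - b) }'
}

# list_gp2_volumes: find migration candidates with their size and IOPS
list_gp2_volumes() {
  run aws ec2 describe-volumes --filters Name=volume-type,Values=gp2 \
    --query 'Volumes[].[VolumeId,Size,Iops]' --output text
}

# migrate_to_gp3 VOLUME_ID: in-place type change, no detach required
migrate_to_gp3() {
  run aws ec2 modify-volume --volume-id "$1" --volume-type gp3
}

list_gp2_volumes
migrate_to_gp3 vol-0123456789abcdef0
monthly_saving 500
```

The saving covers capacity only. A gp2 volume larger than 1,000 GiB has a baseline above 3,000 IOPS, so matching it on gp3 means provisioning (and paying for) extra IOPS, and possibly extra throughput.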

Instance and EBS Bandwidth Alignment

Align EC2 instance EBS bandwidth with aggregate volume needs and upgrade instances when storage-bound.
Stripe volumes when exceeding single-volume ceilings, but account for failure domains and backups.
Enable EBS optimization on your instances to ensure dedicated bandwidth for EBS traffic. As noted in AWS EC2 performance tuning, this is critical for consistent performance.
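The interaction between striping and instance bandwidth comes down to a minimum: the stripe set can deliver no more than the instance's EBS limit. The sketch below illustrates this with made-up numbers. The function name, instance type, device names, and limits are all placeholders, both figures must use the same unit, and commands are printed unless EXECUTE=1 is set.

```shell
#!/bin/sh
run() {
  echo "+ $*"
  if [ "${EXECUTE:-0}" = "1" ]; then "$@"; fi
}

# effective_throughput INSTANCE_LIMIT VOL1 [VOL2 ...]
# -> the smaller of the instance EBS limit and the summed volume throughput
effective_throughput() {
  limit=$1
  shift
  total=0
  for v in "$@"; do total=$(( total + v )); done
  if [ "$total" -gt "$limit" ]; then echo "$limit"; else echo "$total"; fi
}

# Look up the instance type's EBS bandwidth, IOPS and throughput limits
run aws ec2 describe-instance-types --instance-types m5.large \
  --query 'InstanceTypes[0].EbsInfo.EbsOptimizedInfo'

# Stripe two attached volumes into one RAID 0 device (placeholder device names)
run sudo mdadm --create --verbose /dev/md0 --level=0 --raid-devices=2 \
  /dev/nvme1n1 /dev/nvme2n1

effective_throughput 600 500 500    # instance-bound: striping gains little
effective_throughput 2000 500 500   # volume-bound: striping doubles throughput
```

RAID 0 offers no redundancy, so losing either volume loses the array. That is why failure domains and backups need attention when striping.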

Performance Tuning Checklist

Match volume type to workload pattern:
- OLTP → io2 or gp3 with high IOPS
- Data warehousing → st1 or gp3 with high throughput
- Mixed workloads → gp3 with balanced settings
Right-size capacity, IOPS, and throughput based on observed metrics, not guesswork.
Tune OS and application I/O patterns:
- Adjust read-ahead settings for sequential workloads
- Optimize I/O queue depths at the application level
- Use modern Linux kernels with NVMe optimizations
Monitor with CloudWatch and set alerts for queue depth and latency issues.
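As an example of the read-ahead item, Linux exposes the setting through blockdev, which counts in 512-byte sectors. The sketch below assumes a placeholder device name and a 1 MiB read-ahead, a value commonly suggested for sequential workloads on st1 and sc1. Test it against your own workload before adopting it. Commands are printed unless EXECUTE=1 is set.

```shell
#!/bin/sh
run() {
  echo "+ $*"
  if [ "${EXECUTE:-0}" = "1" ]; then "$@"; fi
}

# kib_to_sectors KIB -> 512-byte sectors, the unit blockdev expects
kib_to_sectors() {
  echo $(( $1 * 2 ))
}

# show_readahead DEVICE: print the current read-ahead in sectors
show_readahead() {
  run sudo blockdev --getra "$1"
}

# set_readahead DEVICE KIB: set read-ahead (not persistent across reboots)
set_readahead() {
  run sudo blockdev --setra "$(kib_to_sectors "$2")" "$1"
}

show_readahead /dev/nvme1n1
set_readahead /dev/nvme1n1 1024   # 1 MiB read-ahead
```

A larger read-ahead helps large sequential reads but can hurt random I/O on small blocks, so apply it per device rather than system-wide.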

Cost-Effective Performance Strategies

The goal isn’t just performance—it’s optimal performance at the right cost:

Right-sizing alone can reduce costs by up to 50% in some cases, especially for organizations that initially over-provisioned.
A 1 TB volume holding only 500 GB of data is 50% over-provisioned, a cost that right-sizing can eliminate.
Automated platforms can deliver up to 40% savings with performance intact through continuous right-sizing of capacity, IOPS, and throughput.
Monitor key CloudWatch EBS metrics regularly:
- VolumeReadOps/VolumeWriteOps
- VolumeReadBytes/VolumeWriteBytes
- VolumeTotalReadTime/VolumeTotalWriteTime
- BurstBalance (for gp2, st1, and sc1)
- VolumeQueueLength
- VolumeThroughputPercentage (Provisioned IOPS volumes only)
Use AWS Compute Optimizer to identify optimization opportunities. It examines historical usage data and often identifies 20-30% of volumes as candidates for optimization.
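The right-sizing arithmetic behind these figures is straightforward. The helpers below are a hypothetical sketch using whole GiB values, and the 20% headroom is an arbitrary example rather than a recommendation.

```shell
#!/bin/sh
# unused_pct PROVISIONED_GIB USED_GIB -> share of capacity paid for but unused
unused_pct() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

# target_size_gib USED_GIB HEADROOM_PCT -> used capacity plus headroom, rounded up
target_size_gib() {
  echo $(( ($1 * (100 + $2) + 99) / 100 ))
}

unused_pct 1000 500      # 50
target_size_gib 500 20   # 600
```

EBS volumes cannot be shrunk in place, so moving to the smaller target size means creating a new volume and migrating the data.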

Monitoring and Continuous Optimization

Set up a monitoring strategy that includes:

CloudWatch Dashboards with key EBS metrics
CloudWatch Alarms for performance thresholds
Regular benchmark cycles to validate performance over time
Automated remediation for common issues
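Two alarms that cover the most common problems can be sketched with the AWS CLI. The thresholds (burst balance below 20%, a caller-supplied queue length), the alarm names, the volume ID, and the SNS topic ARN are placeholders to tune for your workload. Commands are printed unless EXECUTE=1 is set.

```shell
#!/bin/sh
run() {
  echo "+ $*"
  if [ "${EXECUTE:-0}" = "1" ]; then "$@"; fi
}

# burst_balance_alarm VOLUME_ID SNS_TOPIC_ARN
# Fires when the average burst balance stays below 20% for 15 minutes.
burst_balance_alarm() {
  run aws cloudwatch put-metric-alarm \
    --alarm-name "ebs-burst-balance-low-$1" \
    --namespace AWS/EBS --metric-name BurstBalance \
    --dimensions "Name=VolumeId,Value=$1" \
    --statistic Average --period 300 --evaluation-periods 3 \
    --threshold 20 --comparison-operator LessThanThreshold \
    --alarm-actions "$2"
}

# queue_length_alarm VOLUME_ID THRESHOLD SNS_TOPIC_ARN
# Fires when the average queue length stays above THRESHOLD for 15 minutes.
queue_length_alarm() {
  run aws cloudwatch put-metric-alarm \
    --alarm-name "ebs-queue-length-high-$1" \
    --namespace AWS/EBS --metric-name VolumeQueueLength \
    --dimensions "Name=VolumeId,Value=$1" \
    --statistic Average --period 300 --evaluation-periods 3 \
    --threshold "$2" --comparison-operator GreaterThanThreshold \
    --alarm-actions "$3"
}

burst_balance_alarm vol-0123456789abcdef0 arn:aws:sns:us-east-1:123456789012:ebs-alerts
queue_length_alarm vol-0123456789abcdef0 32 arn:aws:sns:us-east-1:123456789012:ebs-alerts
```

The queue-length threshold is best derived from your own benchmarks, since a healthy value depends on the volume's IOPS and the workload's latency target.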

For optimal results, consider implementing cloud native application monitoring with tools that can continuously adjust resources based on actual usage patterns.

Practical Application: Benchmark-Driven Optimization

Consider this real-world scenario:

A financial services company was experiencing latency spikes during market opening hours. Benchmarks revealed their gp2 volumes were depleting burst credits, causing unpredictable performance. The team made three changes:

Migrating to gp3 volumes with consistently provisioned IOPS
Adjusting instance types to ensure sufficient EBS bandwidth
Implementing volume striping for databases exceeding single-volume limits

The result was 40% better performance at 15% lower cost, with the unpredictable latency spikes eliminated.

Taking Action on Your EBS Performance

Understanding and optimizing EBS performance is a continuous process that requires regular benchmarking, monitoring, and adjustment. By following the strategies outlined in this guide, you can ensure your AWS workloads achieve optimal performance while controlling costs.

Ready to automate this process? Hykell can help you identify the perfect balance between performance and cost, delivering up to 40% savings on your AWS storage costs while maintaining or improving performance. Our automated platform continuously monitors and adjusts your EBS configuration based on actual usage patterns, ensuring you never overpay for performance you don’t need.