Analyzing I/O Performance in Linux
Introduction
Before diving into I/O performance analysis, first read our guide on analyzing CPU performance in Linux. This article builds on that foundation to help you pinpoint which disks are underperforming.
Using the iostat Command
The iostat command, part of the sysstat package, reports CPU utilization and per-device I/O statistics. If iostat is not present, install sysstat:
- On RHEL/CentOS:
sudo yum install sysstat
- On Ubuntu/Debian:
sudo apt-get install sysstat
Here’s an example output of iostat -x 15:
iostat -x 15
Linux 5.15.0-204.147.6.2.el8uek.x86_64 (techinfobest)  05/27/2024  _x86_64_  (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.03    0.20    0.33    0.01    0.16   98.27

Device     r/s      w/s      rkB/s       wkB/s  rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm   %util
sda      30.00    80.00   15000.00    40000.00    1.00    2.00   5.00  10.00     1.50     3.00    1.00    500.00    500.00   1.50   50.00
sdb      25.00    60.00   12500.00    30000.00    0.50    1.00   2.00   5.00     1.70     2.80    0.90    500.00    500.00   1.70   45.00
sdc     500.00  1000.00  500000.00  1000000.00    5.00   10.00   5.00  10.00    15.00    25.00   15.00    500.00    500.00   5.00  100.00
Key Metrics to Monitor
- r/s and w/s: Read and write requests per second.
- rkB/s and wkB/s: Kilobytes read and written per second.
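- r_await and w_await: Average time in milliseconds for read and write requests to complete, including time spent waiting in the queue and time being serviced by the device.
- %util: Percentage of elapsed time during which the device had at least one request in flight. A value close to 100% indicates saturation for devices that serve requests one at a time. Devices that serve requests in parallel (RAID arrays, SSDs, NVMe) can show 100% and still have headroom, so read %util together with the await columns and the queue size (aqu-sz).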
Interpreting the Data
High r_await and w_await values mean requests are taking longer to complete, which suggests a potential bottleneck. High %util means the disk is busy most of the time. If sdc shows 100% utilization together with elevated wait times, it is likely operating at full capacity and may be a bottleneck. Compare the metrics of sda, sdb, and sdc to identify which disk has the highest read/write wait times and utilization.
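The comparison described above can be automated. The following is a minimal sketch, not part of the original workflow: `parse_iostat_devices` is a hypothetical helper name, and it assumes the device table has the layout shown in the sample output (a header row starting with "Device", then one row per disk, with "." as the decimal separator).

```python
from __future__ import annotations


def parse_iostat_devices(report: str) -> dict[str, dict[str, float]]:
    """Parse the device table of one `iostat -x` report into {device: {column: value}}."""
    devices: dict[str, dict[str, float]] = {}
    columns: list[str] = []
    for line in report.splitlines():
        fields = line.split()
        if not fields:
            columns = []          # a blank line ends the current table
            continue
        if fields[0].rstrip(":") == "Device":
            columns = fields[1:]  # metric names follow the "Device" label
            continue
        if columns and len(fields) == len(columns) + 1:
            devices[fields[0]] = dict(zip(columns, map(float, fields[1:])))
    return devices


# Device table copied from the sample output in this article.
SAMPLE = """\
Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
sda 30.00 80.00 15000.00 40000.00 1.00 2.00 5.00 10.00 1.50 3.00 1.00 500.00 500.00 1.50 50.00
sdb 25.00 60.00 12500.00 30000.00 0.50 1.00 2.00 5.00 1.70 2.80 0.90 500.00 500.00 1.70 45.00
sdc 500.00 1000.00 500000.00 1000000.00 5.00 10.00 5.00 10.00 15.00 25.00 15.00 500.00 500.00 5.00 100.00
"""

stats = parse_iostat_devices(SAMPLE)
busiest = max(stats, key=lambda dev: stats[dev]["%util"])
print(busiest, stats[busiest]["r_await"], stats[busiest]["w_await"])  # prints: sdc 15.0 25.0
```

Column names vary between sysstat versions, so the parser reads them from the header row instead of hard-coding positions.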
Example Analysis
For instance, in the output above:
- sda: Moderate load (30 reads/s, 80 writes/s) with low r_await (1.50 ms) and w_await (3.00 ms) at 50% utilization, so it is performing well.
- sdb: Slightly lighter load (25 reads/s, 60 writes/s) with similar wait times (1.70 ms and 2.80 ms) at 45% utilization, so it also performs adequately.
- sdc: Highest read/write operations (500 reads/s, 1000 writes/s), the longest wait times (15.00 ms and 25.00 ms), and 100% utilization, indicating it is the most heavily used disk and has reached its performance limit.
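The same reading can be expressed as a small rule set. This is only a sketch: the two thresholds (10 ms await, 90% utilization) are illustrative assumptions, not values from this article or from sysstat, and they should be tuned to the storage type. `assess` is a hypothetical helper name.

```python
# Per-disk values copied from the sample output in this article.
SAMPLE = {
    "sda": {"r/s": 30.0, "w/s": 80.0, "r_await": 1.5, "w_await": 3.0, "%util": 50.0},
    "sdb": {"r/s": 25.0, "w/s": 60.0, "r_await": 1.7, "w_await": 2.8, "%util": 45.0},
    "sdc": {"r/s": 500.0, "w/s": 1000.0, "r_await": 15.0, "w_await": 25.0, "%util": 100.0},
}

AWAIT_MS_LIMIT = 10.0  # assumed threshold, adjust for your hardware
UTIL_PCT_LIMIT = 90.0  # assumed threshold, adjust for your hardware


def assess(metrics: dict) -> str:
    """Classify one disk from its wait times and utilization."""
    slow = max(metrics["r_await"], metrics["w_await"]) > AWAIT_MS_LIMIT
    busy = metrics["%util"] >= UTIL_PCT_LIMIT
    if slow and busy:
        return "likely bottleneck"
    if slow:
        return "high latency"
    if busy:
        return "busy, latency still acceptable"
    return "healthy"


for dev, metrics in SAMPLE.items():
    iops = metrics["r/s"] + metrics["w/s"]
    print(f"{dev}: {iops:.0f} IOPS, {assess(metrics)}")
# prints:
# sda: 110 IOPS, healthy
# sdb: 85 IOPS, healthy
# sdc: 1500 IOPS, likely bottleneck
```

Requiring both high latency and high utilization before flagging a disk avoids false alarms on parallel devices that report 100% utilization while still responding quickly.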
Scenario with I/O Issues
Here is the same sample output again. sdc shows 100% utilization, indicating an I/O bottleneck:
iostat -x 15
Linux 5.15.0-204.147.6.2.el8uek.x86_64 (techinfobest)  05/27/2024  _x86_64_  (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.03    0.20    0.33    0.01    0.16   98.27

Device     r/s      w/s      rkB/s       wkB/s  rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm   %util
sda      30.00    80.00   15000.00    40000.00    1.00    2.00   5.00  10.00     1.50     3.00    1.00    500.00    500.00   1.50   50.00
sdb      25.00    60.00   12500.00    30000.00    0.50    1.00   2.00   5.00     1.70     2.80    0.90    500.00    500.00   1.70   45.00
sdc     500.00  1000.00  500000.00  1000000.00    5.00   10.00   5.00  10.00    15.00    25.00   15.00    500.00    500.00   5.00  100.00
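In this scenario, sdc shows 100% utilization (%util) together with the highest wait times (r_await 15.00 ms, w_await 25.00 ms) and an average queue size (aqu-sz) of 15.00, while sda and sdb stay at or below 50% utilization with waits of 3 ms or less. Taken together, these values indicate that sdc has reached its limit for read/write throughput and IOPS. This disk is a clear bottleneck and is likely causing performance issues.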
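The Value of 15 in iostat -x 15
The 15 in the command iostat -x 15 is the reporting interval in seconds. The first report shows averages since the system booted, and each later report covers the preceding 15 seconds. Without a count argument, iostat keeps reporting until interrupted. This helps in monitoring performance over time and identifying spikes in I/O activity.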
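Because the first report covers the time since boot, scripts that post-process interval output usually want to set it aside. The sketch below assumes each report starts with an "avg-cpu:" line, as in the sample output. `split_reports` is a hypothetical helper, and the two-report text is shortened, made-up input used only to exercise it.

```python
def split_reports(output: str) -> list[str]:
    """Split multi-report iostat output into one string per report."""
    reports: list[str] = []
    current: list[str] = []
    for line in output.splitlines():
        if line.startswith("avg-cpu:") and current:
            reports.append("\n".join(current))  # the previous report is complete
            current = []
        if line.startswith("avg-cpu:") or current:
            current.append(line)                # skip the banner before the first report
    if current:
        reports.append("\n".join(current))
    return reports


# Shortened, made-up output with two reports (illustration only).
OUTPUT = """\
Linux 5.15.0 (host) 05/27/2024 _x86_64_ (2 CPU)

avg-cpu:  %user %nice %system %iowait %steal %idle
           1.03  0.20    0.33    0.01   0.16 98.27

Device r/s w/s %util
sda 30.00 80.00 50.00

avg-cpu:  %user %nice %system %iowait %steal %idle
           2.00  0.00    0.50    0.10   0.00 97.40

Device r/s w/s %util
sda 35.00 90.00 55.00
"""

reports = split_reports(OUTPUT)
interval_reports = reports[1:]  # drop the since-boot report
print(len(reports), len(interval_reports))  # prints: 2 1
```

Recent sysstat releases also document a -y option that omits the since-boot report when an interval is given, which may remove the need for this step on your system.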
Checking Swap Usage and Its Impact on I/O
Sometimes system administrators notice high swap usage and assume it’s a sign of I/O performance issues. However, this isn’t always the case, especially on Linux, where the kernel might swap out idle memory pages and leave them in swap even when there is plenty of free RAM.
You can check memory and swap usage using:
free -m
Example output:
              total        used        free      shared  buff/cache   available
Mem:         402370      271356      111370        2882       19643      124685
Swap:          8191        8105          86
Here, even though swap usage is very high (8105 MB out of 8191 MB, about 99%), roughly 111,000 MB (about 109 GiB) of RAM is still free, and roughly 124,700 MB (about 122 GiB) is available. This might look concerning at first, but let’s confirm if it’s actually affecting performance.
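To put these numbers side by side, the percentages can be computed from the free -m output. This is a minimal sketch that assumes the column layout shown above. `parse_free` is a hypothetical helper name.

```python
# Output copied from the free -m example in this article.
FREE_OUTPUT = """\
              total        used        free      shared  buff/cache   available
Mem:         402370      271356      111370        2882       19643      124685
Swap:          8191        8105          86
"""


def parse_free(text: str) -> dict:
    """Turn `free -m` output into {"Mem": {...}, "Swap": {...}} with integer values."""
    lines = text.strip().splitlines()
    columns = lines[0].split()
    table = {}
    for line in lines[1:]:
        label, *values = line.split()
        # zip stops at the shorter sequence, so the Swap row only gets total/used/free
        table[label.rstrip(":")] = dict(zip(columns, map(int, values)))
    return table


mem = parse_free(FREE_OUTPUT)
swap_used_pct = 100 * mem["Swap"]["used"] / mem["Swap"]["total"]
avail_pct = 100 * mem["Mem"]["available"] / mem["Mem"]["total"]
print(f"swap used: {swap_used_pct:.1f}%, RAM available: {avail_pct:.1f}%")
```

A nearly full swap area next to a large share of available RAM is the pattern described here: pages were swapped out at some point and have simply not been needed since.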
Use the vmstat command to observe swap activity over time:
vmstat 1 5
Sample output:
procs ----------------memory---------------- --swap- -----io---- ---system---- ------cpu-----
 r  b     swpd       free     buff     cache  si  so    bi    bo     in     cs us sy id wa st
 2  0  8205960  113117808  2445388  17811172   0   1  5357   790      1      1 13  4 83  1  0
 1  0  8205960  113101784  2445396  17811168   0   0   344   309  18896  13551  5  3 92  0  0
 1  0  8205960  113108472  2445396  17811192   0   0  2112  2538  15889  11272  9  2 88  0  0
 1  0  8205960  113093040  2445396  17811224   0   0  1440   644  20056  14599 10  3 87  0  0
11  0  8205960  113096736  2445412  17811376   0   0   184  1191  33471  22185 20 13 67  0  0
Key indicators:
- si (swap in) and so (swap out) are zero in every interval sample, meaning no active swapping is happening. The first row reports averages since boot, which is why it shows so = 1.
- CPU wait time (wa) is near zero, indicating no I/O bottlenecks.
- I/O columns (bi, bo) are stable and not peaking.
Despite high swap usage, there is no I/O impact here. The kernel likely swapped out idle memory pages (e.g., from Oracle DB background processes) earlier, and since they’re not being accessed, they remain in swap.
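The check on the key indicators can be scripted. The sketch below assumes the vmstat layout shown in the sample (a "procs" banner, a column row starting with "r", then integer rows) and skips the first data row because it reports averages since boot. `parse_vmstat` is a hypothetical helper name.

```python
# Output copied from the vmstat 1 5 example in this article.
VMSTAT_OUTPUT = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r b swpd free buff cache si so bi bo in cs us sy id wa st
 2 0 8205960 113117808 2445388 17811172 0 1 5357 790 1 1 13 4 83 1 0
 1 0 8205960 113101784 2445396 17811168 0 0 344 309 18896 13551 5 3 92 0 0
 1 0 8205960 113108472 2445396 17811192 0 0 2112 2538 15889 11272 9 2 88 0 0
 1 0 8205960 113093040 2445396 17811224 0 0 1440 644 20056 14599 10 3 87 0 0
 11 0 8205960 113096736 2445412 17811376 0 0 184 1191 33471 22185 20 13 67 0 0
"""


def parse_vmstat(text: str) -> list:
    """Return one {column: value} dict per vmstat data row."""
    rows = []
    columns = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "procs":
            continue                      # blank line or group banner
        if fields[0] == "r":
            columns = fields              # column-name row
            continue
        if columns and len(fields) == len(columns):
            rows.append(dict(zip(columns, map(int, fields))))
    return rows


samples = parse_vmstat(VMSTAT_OUTPUT)[1:]   # drop the since-boot row
swapping = any(s["si"] or s["so"] for s in samples)
max_wa = max(s["wa"] for s in samples)
print(f"swapping: {swapping}, max wa: {max_wa}%")  # prints: swapping: False, max wa: 0%
```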
When Should You Be Concerned About Swap Usage?
Swap becomes a concern only if:
- You see non-zero si and so values consistently in vmstat.
- There’s increased CPU wait time (wa).
- There’s high disk I/O due to swap (bi, bo spikes).
- Applications are slowing down or crashing due to memory pressure.
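The first two criteria lend themselves to a simple check over a series of vmstat samples. This is a sketch under stated assumptions: "consistently" is interpreted as more than half of the samples, and the 10% limit for wa is an arbitrary illustrative value. Neither number comes from this article. The bi/bo and application-level criteria are not covered, and the "thrashing" samples are made up for illustration.

```python
def swap_concerns(samples, wa_limit=10, busy_fraction=0.5):
    """List the swap-related warning signs found in vmstat samples.

    wa_limit and busy_fraction are assumed thresholds, not standard values.
    """
    if not samples:
        return []
    swapping = sum(1 for s in samples if s["si"] > 0 or s["so"] > 0)
    concerns = []
    if swapping / len(samples) > busy_fraction:
        concerns.append("sustained swap activity (si/so)")
    if max(s["wa"] for s in samples) > wa_limit:
        concerns.append("elevated I/O wait (wa)")
    return concerns


# si, so and wa from the four interval rows of the vmstat example in this article.
quiet = [{"si": 0, "so": 0, "wa": 0} for _ in range(4)]

# Hypothetical samples from a system that is actively swapping.
thrashing = [
    {"si": 850, "so": 1200, "wa": 35},
    {"si": 640, "so": 900, "wa": 28},
    {"si": 0, "so": 0, "wa": 3},
]

print(swap_concerns(quiet))      # prints: []
print(swap_concerns(thrashing))  # prints both warning strings
```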
In those cases, you can:
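- Reduce vm.swappiness (e.g., set it to 10) using sysctl.
- Reclaim swap usage with swapoff -a && swapon -a (as root, during low activity, and only when available RAM comfortably exceeds the swap in use, because every swapped-out page must be read back into memory).
- Consider upgrading RAM if usage is always near full.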
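Before draining swap with swapoff, it is worth confirming that the swapped-out pages will fit in RAM. The sketch below parses text in the format of /proc/meminfo, using the MemAvailable, SwapTotal, and SwapFree fields. The sample figures are the free -m values from this article multiplied by 1024 to get kB, and the 20% headroom is an assumed safety margin. `parse_meminfo` and `can_drain_swap` are hypothetical helper names.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def can_drain_swap(info: dict, headroom: float = 1.2) -> bool:
    """True if available RAM covers the swap in use with some margin.

    headroom is an assumed safety factor, not a kernel requirement.
    """
    swap_used = info["SwapTotal"] - info["SwapFree"]
    return info["MemAvailable"] >= swap_used * headroom


# Figures derived from the free -m example in this article (MB values x 1024).
MEMINFO_SAMPLE = """\
MemTotal:       412026880 kB
MemAvailable:   127677440 kB
SwapTotal:        8387584 kB
SwapFree:           88064 kB
"""

info = parse_meminfo(MEMINFO_SAMPLE)
print(can_drain_swap(info))  # prints: True
```

On a live system, the same functions can be fed the contents of /proc/meminfo, and the current swappiness setting can be read from /proc/sys/vm/swappiness.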
This kind of analysis complements disk-level tools like iostat, giving you a holistic view of system health without misinterpreting memory usage signals.
Conclusion
By analyzing iostat -x output, you can identify specific disks with performance issues. High r_await, w_await, and %util values are indicators of potential I/O bottlenecks. Address these by optimizing disk usage, upgrading hardware, or redistributing the load across multiple disks.
For further insights on CPU performance and related I/O issues, visit TechInfoBest.