Budget Fair Queueing (BFQ) Storage-I/O Scheduler

On this page we report a selection of our test results with BFQ-v8r5, CFQ, DEADLINE and NOOP, under Linux 4.8.0, on the following devices:

Results with many more devices, but with previous versions of BFQ and Linux, can be found here.

For each device, we report only our throughput, application-responsiveness and video-playing (frame-drop-rate) results.

To get all our results we used our ad hoc benchmark suite.

In what follows, we call reader/writer a program (we used either dd or fio) that just reads/writes a large file. In addition, we say that a reader/writer is sequential or random depending on whether it reads/writes the file sequentially or at random positions. For brevity, we report only our results with synthetic and heavy workloads. In particular, we show application start-up times in rather extreme conditions, i.e., with very heavy background workloads. We consider lighter workloads only with the last three, slower devices.
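As an illustration only, the difference between a sequential and a random reader can be sketched in Python as follows. This is not the code used in the benchmark suite (which relies on dd or fio); the function name run_reader and its parameters are hypothetical, and the sketch reads every block exactly once in both modes.

```python
import os
import random


def run_reader(path, block_size=4096, random_access=False, seed=0):
    """Read every block of `path` once and return the number of bytes read.

    Sequential mode visits blocks in file order; random mode visits the same
    blocks in a shuffled order, seeking before each read.
    """
    size = os.path.getsize(path)
    n_blocks = (size + block_size - 1) // block_size
    order = list(range(n_blocks))
    if random_access:
        # Hypothetical choice: a seeded shuffle, so runs are repeatable.
        random.Random(seed).shuffle(order)

    total = 0
    # buffering=0 disables Python-level buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        for block in order:
            f.seek(block * block_size)
            total += len(f.read(block_size))
    return total
```

Both modes transfer the same amount of data; only the order of the requests changes, which is what makes the two workloads behave so differently on a rotational disk.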

HITACHI HDD

The next figure shows the throughput achieved by each scheduler while one of the following four heavy workloads is being executed: 10 parallel sequential or random readers (10r-seq, 10r-rand), or 5 parallel sequential or random readers plus 5 parallel sequential or random writers (5r5w-seq, 5r5w-rand). The symbol X means that, for that workload and with that scheduler, the script failed to terminate within 10 seconds from its due termination time (which implies that the system, and thus the results, are not reliable).

HITACHI HDD throughput
Figure 1. Throughput on the HITACHI HDD (higher is better).

BFQ outperforms CFQ with 10r-seq, while, with this workload, the benchmark just fails with the other two schedulers. The reason is that, with DEADLINE and NOOP, the system ends up in a livelock state, and the benchmark script never terminates. The same bad outcome occurs with all schedulers but BFQ for 5r5w-seq, because writes are much harder to control than reads. With 10r-rand, instead, all the schedulers achieve about the same performance. Finally, results differ considerably with 5r5w-rand, for the following reasons. First, DEADLINE and NOOP serve almost only writes during the test. This boosts the throughput, but causes critical system starvation. At the opposite end, CFQ serves almost only reads, which leads to very poor throughput. BFQ balances reads and writes.
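The workload labels used above follow a regular pattern: a number of readers, an optional number of writers, and the access pattern. As a minimal sketch, a label can be decoded as shown below. The Workload class and the parse_workload function are hypothetical names introduced here for illustration; they are not part of the benchmark suite.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    readers: int
    writers: int
    pattern: str  # "seq" or "rand"


# Matches labels such as "10r-seq" or "5r5w-rand".
_LABEL = re.compile(r"^(\d+)r(?:(\d+)w)?-(seq|rand)$")


def parse_workload(label):
    """Decode a workload label into reader/writer counts and access pattern."""
    m = _LABEL.match(label)
    if m is None:
        raise ValueError(f"unrecognized workload label: {label!r}")
    readers, writers, pattern = m.groups()
    # A label without a writer part (e.g. "10r-seq") means zero writers.
    return Workload(int(readers), int(writers or 0), pattern)
```

For example, parse_workload("5r5w-rand") yields 5 readers, 5 writers and the "rand" pattern, while parse_workload("10r-seq") yields 10 readers and no writers.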

The next figure shows the cold-cache start-up time of gnome-terminal while one of the above heavy background workloads is being executed. The symbol X means that, for that workload and with that scheduler, the application failed to start within 60 seconds.

HITACHI HDD gnome-terminal start-up time
Figure 2. gnome-terminal start-up time on the HITACHI HDD (lower is better).

As can be seen, with any workload BFQ guarantees about the same start-up time as if the disk were idle. With the other schedulers, the application either takes a very long time to start or practically does not start at all. We also ran tests with lighter background workloads and, in those cases too, the responsiveness guaranteed by these schedulers was noticeably worse than that guaranteed by BFQ (results available on demand). Results with both smaller and larger applications can be found in this extra result page.
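A cold-cache start-up measurement with a timeout, like the one described above, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the benchmark suite's code: the names drop_caches and timed_start are hypothetical, the start-up time is approximated by the wall-clock time of a command that starts the application and exits, and dropping caches through /proc/sys/vm/drop_caches requires root on Linux.

```python
import subprocess
import time


def drop_caches():
    """Flush dirty data, then ask Linux to drop page cache, dentries and inodes.

    Needs root privileges; call it before each measurement for a cold cache.
    """
    subprocess.run(["sync"], check=True)
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")


def timed_start(cmd, timeout_s=60.0):
    """Return the wall-clock seconds `cmd` took, or None on timeout.

    A None result corresponds to the X symbol: the application did not
    start within the timeout.
    """
    start = time.monotonic()
    try:
        subprocess.run(cmd, timeout=timeout_s, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        return None
    return time.monotonic() - start
```

A measurement would then call drop_caches() and pass timed_start() a command that launches the application and makes it exit immediately; the exact command line depends on the application and is left out here.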


Finally, video-playing results are shown in the next figure. In this benchmark, the same workloads as for the responsiveness tests are executed and, to make the background workload even more demanding for such a time-sensitive application, a bash shell is also started and terminated repeatedly. This time the symbol X means that the playback of the video did not terminate within a 60-second timeout after its actual duration, and hence the test was aborted. In most of these failure cases, the playback of the video did not start at all.

Video-playing frame-drop rate on the HITACHI HDD
Figure 3. Video-playing frame-drop rate on the HITACHI HDD (lower is better).

As can be seen, BFQ performs so much better than the other schedulers that the results are not even comparable.

SEAGATE HDD

We abandoned the tests on this device, because the relative performance of the schedulers is the same as on the HITACHI HDD.

PLEXTOR SSD

With the SSD we consider only raw readers, i.e., processes reading directly from the device, to avoid writing large files repeatedly, and hence wearing out a costly SSD :)

SSD throughput
Figure 7. Throughput on the Plextor SSD (higher is better).

With sequential readers, BFQ loses about 0.05 percent of throughput with respect to the other schedulers. This is the price it pays to achieve the start-up times shown in the next two figures. With random readers, instead, the throughput loss is at most 20%, for the following additional reason. With random readers, the number of IOPS becomes so high that the execution time of the schedulers becomes relevant as well, and BFQ is slightly more complex than the other schedulers.
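For clarity, a relative throughput loss of this kind can be computed as in the sketch below. The function name throughput_loss_percent is hypothetical, and any numbers passed to it in an example are purely illustrative; they are not measurements from these tests.

```python
def throughput_loss_percent(reference, measured):
    """Percentage of throughput lost by `measured` relative to `reference`.

    Both arguments must use the same unit (e.g. MB/s). A positive result
    means `measured` is lower than `reference`.
    """
    if reference <= 0:
        raise ValueError("reference throughput must be positive")
    return 100.0 * (reference - measured) / reference
```

With illustrative values, a scheduler delivering 400 units of throughput against a reference of 500 loses 20 percent.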

As for responsiveness, for both applications BFQ achieves the lowest-possible start-up time with both workloads.

SSD gnome-terminal start-up time
Figure 8. gnome-terminal start-up time on the Plextor SSD (lower is better).

The high start-up times with the other schedulers in the presence of sequential readers are also a consequence of the fact that, to maximize throughput, the device prefetches requests and, among internally-queued requests, prioritizes sequential ones. BFQ prevents the device from prefetching requests when that would hurt responsiveness. The cost of this behavior is the above 0.05 percent loss of throughput with sequential readers. Results with both smaller and larger applications can be found in this extra result page.

Finally, the next figure shows our video-playing results.

Video-playing frame-drop rate on the Plextor SSD
Figure 9. Video-playing frame-drop rate on the Plextor SSD (lower is better).

Results are good with all schedulers. However, the figure does not show that the player takes a long time to start up with all schedulers but BFQ.

 
Last updated: October 25 2016.
Paolo Valente (paolo DOT valente AT unimore DOT it)