Due to rapid improvements in processor technology, the gap between processor
speeds and I/O latency continues to widen. This will increase the number of
cycles per I/O stall, and therefore the progress that speculative execution
can make during a single stall. To predict the impact of this trend on the
effectiveness of our approach, we modified the striping pseudodevice to
delay notification of completed I/O requests. For example, to simulate the
effect of doubling the gap between processor and disk speeds, we doubled the
time before the system was notified that each I/O request had completed, then
halved our resulting measurements. Because disk positioning times and data
rates improve at different rates (data rates have lately been improving at
40% per year), uniformly delaying the whole request simulates an
artificially slow transfer rate. However, since the disks perform
track-buffer read-ahead while the pseudodevice is delaying completion,
physically sequential accesses will appear to have a faster-than-modelled
transfer rate.
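The delay-and-scale arithmetic described above can be sketched with a toy model. This is only an illustration under stated assumptions, not the pseudodevice code: the function name and its parameters are hypothetical, processing and I/O are assumed not to overlap, and the input times are made-up values rather than measurements.

```python
def scaled_elapsed_time(cpu_time, io_time, speed_ratio):
    """Toy model of the delay-and-scale method.

    Assumes (hypothetically) that processing and I/O never overlap, so the
    elapsed time is simply their sum.
    """
    if speed_ratio <= 0:
        raise ValueError("speed_ratio must be positive")
    # Step 1: delay every I/O completion by the ratio; processing is unchanged.
    measured = cpu_time + speed_ratio * io_time
    # Step 2: scale the measurement back down by the same ratio.
    return measured / speed_ratio


if __name__ == "__main__":
    # Illustrative inputs only: 2 time units of processing, 8 of I/O.
    for ratio in (1, 2, 4):
        print(ratio, scaled_elapsed_time(2.0, 8.0, ratio))
```

In this model the result equals cpu_time / speed_ratio + io_time: the processing component shrinks with the ratio while the I/O component stays fixed, which is the effect the delayed notifications are meant to approximate.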
Our simulation results are shown in Figure 6. The improvements obtained by
the manually modified applications increase steadily but only slightly.
This is unsurprising since their performance is limited by the available I/O
bandwidth and their processing times are already only a small percentage of
their execution times. The curves for the speculating applications are
similar to those for the manually modified applications, although offset
in Gnuld's case. For Agrep and XDataSlice, speculative execution already
generates enough hints to keep the disks busy at all times. For Gnuld, data dependencies, which are independent of
processor speed, prevent speculative execution from using the additional
cycles during I/O stalls to hint more read calls.
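One way to picture why additional cycles do not help a dependency-bound application is the following minimal sketch. It is a hypothetical model, not the system's hinting logic: it assumes speculation can issue one hint per fixed number of cycles until it needs a value that has not yet been read, and all names and numbers are illustrative.

```python
def hints_per_stall(stall_cycles, cycles_per_hint, reads_before_dependency):
    """Hypothetical count of hints issued during one I/O stall.

    The count is bounded both by the cycles available in the stall and by
    how many reads can be predicted before speculation needs data that is
    still on disk.
    """
    if cycles_per_hint <= 0:
        raise ValueError("cycles_per_hint must be positive")
    limit_from_cycles = stall_cycles // cycles_per_hint
    # The dependency limit does not depend on processor speed.
    return min(limit_from_cycles, reads_before_dependency)


if __name__ == "__main__":
    # Doubling the stall length helps only while cycles are the bottleneck.
    for stall in (1000, 2000):
        print(stall,
              hints_per_stall(stall, 100, 50),  # limited by cycles
              hints_per_stall(stall, 100, 3))   # limited by dependencies
```

When the dependency limit is the smaller bound, lengthening the stall leaves the hint count unchanged, which matches the behaviour described for Gnuld.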
For some applications, a more sophisticated design may be able to take
advantage of these additional cycles. For example, it may prove useful
to loosen our current definition of what it means for speculative execution
to be on track. In general, however, applications dependent on recently
read values may not be able to derive additional benefit from faster
processors (unless they are rewritten to allow newly read data to affect
future reads only after more intervening disk requests have been issued).

Figure 6: Results from simulating a widening of the gap
between processor and disk speeds. A processor/disk speed ratio of 1
indicates results in our current experimental environment.