All of these issues essentially boil down to the problem of estimating highly variable latency and using it as an indicator of array overload. We may need to distinguish between latency changes caused by the workload and those caused by overload at the array. Some of the variation in IO latency can be absorbed by long-term averaging and by considering latency per fixed IO size instead of per IO request. Also, a sufficiently high baseline latency (the desired operating point for the control algorithm) will be insensitive to workload-based variations in under-utilized cases.
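
As an illustration of the two smoothing ideas mentioned here, the following is a minimal sketch, not the estimator used in this work. It assumes an exponentially weighted moving average as the form of long-term averaging, and it assumes that latency scales roughly linearly with request size above a reference size. The class name `LatencyEstimator`, the parameters `alpha` and `unit_bytes`, and the threshold check are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LatencyEstimator:
    """Hypothetical smoothed, size-normalized latency estimator."""
    alpha: float = 0.1        # weight of the newest sample (assumed value)
    unit_bytes: int = 8192    # reference IO size for normalization (assumed value)
    avg_ms: Optional[float] = None

    def observe(self, latency_ms: float, io_bytes: int) -> float:
        # Normalize to latency per fixed IO size. Assumption: latency grows
        # roughly linearly with size; requests smaller than the reference
        # size count as one unit so that they are not inflated.
        units = max(io_bytes / self.unit_bytes, 1.0)
        per_unit_ms = latency_ms / units

        # Long-term averaging via an exponentially weighted moving average.
        if self.avg_ms is None:
            self.avg_ms = per_unit_ms
        else:
            self.avg_ms = (1 - self.alpha) * self.avg_ms + self.alpha * per_unit_ms
        return self.avg_ms

    def overloaded(self, threshold_ms: float) -> bool:
        # Compare the smoothed estimate against the chosen operating point.
        return self.avg_ms is not None and self.avg_ms > threshold_ms


est = LatencyEstimator(alpha=0.5, unit_bytes=8192)
est.observe(10.0, 8192)    # one unit at 10 ms
est.observe(40.0, 16384)   # two units, so 20 ms per unit
```

A small `alpha` gives a longer effective averaging window, which absorbs more workload-induced variation at the cost of reacting more slowly to genuine overload.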