The input to LocalAdjust is a candidate allotment vector and request arrival rate, together with specified constraints on each resource. The output of LocalAdjust is an adjusted vector that achieves a predicted average-case response time as close as possible to the target while conforming to the resource constraints. Since this paper focuses primarily on memory and storage resources, we ignore CPU constraints. Specifically, we assume that the expected CPU response time RP is fixed and achievable. CPU allotments are relatively straightforward because memory and storage allotments affect per-request CPU demand only minimally. For example, if the CPU is the bottleneck, then the allotments for other resources are easy to determine: set the request arrival rate to the saturation request throughput rate, and provision other resources as before.
If the storage constraint falls below the candidate storage allotment, then LocalAdjust assigns the maximum value to the storage allotment and rebalances the system by expanding the memory allotment M to meet the response time target given the lower allowable request rate for storage. The algorithm is as follows: determine the allowable storage request rate at the preconfigured storage utilization target, then determine the hit ratio H needed to achieve this rate using Equation (2), and finally determine the memory allotment M needed to achieve H using Equation (1).
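The storage-constrained adjustment can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation. Equations (1) and (2) are not reproduced in this section, so `hit_ratio` is a stand-in logarithmic model, the storage request rate is assumed to be the arrival rate times the miss ratio, and utilization is assumed to be the storage request rate divided by the storage allotment. All function and parameter names are hypothetical.

```python
import math


def hit_ratio(memory, total_objects):
    """Stand-in for Equation (1): H grows logarithmically with M (assumed form)."""
    if memory <= 1:
        return 0.0
    return min(1.0, math.log(memory) / math.log(total_objects))


def memory_for_hit_ratio(h, total_objects):
    """Invert the stand-in model: smallest M that yields hit ratio h."""
    return total_objects ** h


def adjust_for_storage_cap(arrival_rate, storage_max, rho_target, total_objects):
    """Hold storage at its cap and grow memory to absorb the excess load."""
    # Assumption: utilization = storage request rate / storage allotment.
    allowable_storage_rate = rho_target * storage_max
    # Assumption standing in for Equation (2): storage rate = arrival_rate * (1 - H).
    required_h = max(0.0, 1.0 - allowable_storage_rate / arrival_rate)
    memory = memory_for_hit_ratio(required_h, total_objects)
    return storage_max, memory, required_h
```

Under these assumptions, the required hit ratio rises toward 1 as the arrival rate grows against a fixed storage cap, and the stand-in model then demands increasingly large memory allotments for each further gain in H.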
Figure 5 illustrates the effect of LocalAdjust on the candidate vector under a storage constraint specified in IOPS. As load increases, LocalAdjust meets the response time target by holding the storage allotment at the maximum and growing M instead. The candidate M varies in a (slightly) nonlinear fashion because H grows as M increases, so larger shares of the increases in load are absorbed by the cache. This effect is partially offset by the dynamics of Web content caching captured in Equation (1): due to the nature of Zipf distributions, H grows logarithmically with M, requiring larger marginal increases to M to effect the same improvement in H.
If memory is constrained, LocalAdjust sets M to the maximum and rebalances the system by expanding the storage allotment (if possible). The algorithm is as follows: determine H and the storage request rate at M using Equations (1) and (2), and use that rate to determine the adjusted storage allotment. Then compensate for the reduced H by increasing the storage allotment further to reduce storage utilization below the preconfigured target, improving the storage response time RS.
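The first step of the memory-constrained adjustment can be sketched in the same spirit. This is an illustration only: `hit_ratio` is again a stand-in for Equation (1), the storage request rate is assumed to be the arrival rate times the miss ratio in place of Equation (2), and the storage allotment is assumed to be the request rate divided by the utilization target. The names and the optional `storage_max` parameter are hypothetical.

```python
import math


def hit_ratio(memory, total_objects):
    """Stand-in for Equation (1): H grows logarithmically with M (assumed form)."""
    if memory <= 1:
        return 0.0
    return min(1.0, math.log(memory) / math.log(total_objects))


def adjust_for_memory_cap(arrival_rate, memory_max, rho_target,
                          total_objects, storage_max=None):
    """Hold memory at its cap and size storage for the resulting miss traffic."""
    h = hit_ratio(memory_max, total_objects)
    # Assumption standing in for Equation (2): storage rate = arrival_rate * (1 - H).
    storage_rate = arrival_rate * (1.0 - h)
    # Assumption: allotment that keeps utilization (rate / allotment) at the target.
    storage = storage_rate / rho_target
    feasible = storage_max is None or storage <= storage_max
    if not feasible:
        storage = storage_max  # both resources are now pinned at their limits
    return memory_max, storage, h, feasible
```

The sketch stops at the baseline storage allotment. The further increase that compensates for the reduced H depends on the storage response-time model, which is not reproduced here.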
If both the storage allotment and M are constrained, assign both to their maximum values and report the predicted response time using the models in the obvious fashion. LocalAdjust adjusts allotments to consume a local surplus in the same way. This may be useful if the service is the only load component assigned to some server or set of servers. Surplus assignment is more interesting when the system must distribute resources among multiple competing services, as described below.
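The case analysis in this section can be summarized by a small dispatch routine. This is a hypothetical sketch: the function name, the parameter names, and the returned labels are illustrative and do not come from the paper.

```python
def select_adjustment(candidate_memory, candidate_storage, memory_max, storage_max):
    """Return which branch of the case analysis applies to a candidate vector."""
    memory_over = candidate_memory > memory_max
    storage_over = candidate_storage > storage_max
    if memory_over and storage_over:
        return "clamp_both"
    if storage_over:
        return "cap_storage_grow_memory"
    if memory_over:
        return "cap_memory_grow_storage"
    return "accept_candidate"
```

A fuller version would re-check the constraints after rebalancing, since growing one resource to compensate for the other may push it past its own limit, and would also handle the surplus case.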