The input to LocalAdjust is a candidate allotment
vector and a request arrival rate,
together with specified constraints on each resource.
The output of LocalAdjust is an adjusted
vector that achieves a predicted average-case
response time as close as possible to the target,
while conforming to the resource constraints.
Since this paper focuses primarily on memory and storage resources,
we ignore CPU constraints. Specifically,
we assume that the expected
CPU response time RP for a given request arrival rate
is fixed and achievable.
CPU allotments are
relatively straightforward because memory and
storage allotments affect per-request CPU demand only minimally.
For example, if the CPU is the bottleneck, then the allotments for other
resources are easy to determine: set the request arrival rate
to the saturation request
throughput rate, and provision other resources as before.
If the storage constraint falls below the candidate storage allotment,
then LocalAdjust assigns the maximum allowable value to the
storage allotment
and rebalances the system by expanding the memory allotment
M to meet the response time target given the lower allowable
request rate for storage.
First, determine the allowable storage request rate at the
preconfigured target storage utilization.
Then determine the hit ratio
H needed to achieve this storage request rate
using Equation (2),
and the memory allotment M needed to achieve H using
Equation (1).
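
The storage-constrained steps above can be sketched as follows. This is a
minimal illustration under stated assumptions, not the paper's implementation.
Equations (1) and (2) are not reproduced in this section, so the sketch
substitutes hypothetical stand-ins: a logarithmic hit-ratio model
H = ln(M) / ln(T) over T total objects, and a miss-stream model in which
storage sees the fraction (1 - H) of arriving requests. The function names,
parameter names, and the interpretation of the allowable storage rate as
target utilization times the storage cap are likewise assumptions.

```python
def memory_for_hit_ratio(hit_ratio, total_objects):
    """Inverse of the ASSUMED stand-in for Equation (1): M = T ** H."""
    return total_objects ** hit_ratio


def adjust_for_storage_limit(arrival_rate, storage_max, rho_target, total_objects):
    """Pin storage at its cap and grow memory to compensate.

    arrival_rate  -- request arrival rate (requests/s)
    storage_max   -- maximum storage allotment (IOPS)
    rho_target    -- preconfigured target storage utilization, in (0, 1]
    total_objects -- T in the stand-in hit-ratio model (an assumption)
    """
    # Step 1: allowable storage request rate at the target utilization.
    storage_rate = rho_target * storage_max
    # Step 2: hit ratio needed so the miss stream fits within that rate
    # (ASSUMED stand-in for Equation (2): storage_rate = arrival_rate * (1 - H)).
    needed_hit_ratio = max(0.0, 1.0 - storage_rate / arrival_rate)
    # Step 3: memory allotment that yields the needed hit ratio.
    memory = memory_for_hit_ratio(needed_hit_ratio, total_objects)
    return memory, storage_max, needed_hit_ratio
```

For instance, with an arrival rate of 400 requests/s, a storage cap of 200
IOPS, a target utilization of 0.5, and T = 10000, the sketch allows 100
storage requests/s, which under the stand-in models calls for a hit ratio of
0.75.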
Figure 5 illustrates the effect of LocalAdjust
on the candidate vector under a fixed storage constraint,
expressed in IOPS.
As load
increases, LocalAdjust meets the response time
target by holding the storage allotment
to the maximum and growing M instead. The candidate
M varies in a (slightly) nonlinear fashion because
H grows as M increases, so larger shares of the increases in
request arrival rate are absorbed by the cache. This effect is partially offset
by the dynamics of Web content caching captured in Equation (1):
due to the nature of Zipf distributions, H grows logarithmically
with M, requiring larger marginal increases to M to effect
the same improvement in H.
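
The diminishing return described above can be made concrete with a small
sketch. Equation (1) is not reproduced in this section, so the code assumes a
purely logarithmic stand-in, H = ln(M) / ln(T) for T total objects, whose
inverse is M = T ** H. The function names and this model form are
illustrative assumptions only.

```python
def memory_for_hit_ratio(hit_ratio, total_objects):
    """Inverse of the ASSUMED model H = ln(M) / ln(T): M = T ** H."""
    return total_objects ** hit_ratio


def marginal_memory(h_from, h_to, total_objects):
    """Extra memory needed to raise the hit ratio from h_from to h_to."""
    return (memory_for_hit_ratio(h_to, total_objects)
            - memory_for_hit_ratio(h_from, total_objects))
```

Under this stand-in with T = 10000, raising H from 0.5 to 0.75 costs roughly
900 additional objects of cache, while the same 0.25 improvement from 0.75 to
1.0 costs roughly 9000.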
If memory is constrained,
LocalAdjust sets M to the maximum
and rebalances the system by expanding
the storage allotment (if possible). The algorithm
is as follows:
determine H and the storage request rate
at M using Equations (1) and (2),
and use that rate
to determine the adjusted
storage allotment at the target storage utilization.
Then compensate for the reduced H
by increasing the storage allotment
further, reducing storage utilization
below the target level and
improving the storage
response time RS.
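
The memory-constrained steps above can be sketched in the same spirit. This is
an illustration under stated assumptions, not the paper's implementation: the
hit-ratio model H = ln(M) / ln(T) and the miss-stream model
storage_rate = arrival_rate * (1 - H) are hypothetical stand-ins for Equations
(1) and (2), and the `headroom` parameter is an invented knob that represents
the final compensation step, whose actual magnitude would come from the
response time model.

```python
import math


def adjust_for_memory_limit(arrival_rate, memory_max, rho_target,
                            total_objects, headroom=0.0):
    """Pin memory at its cap and grow storage to compensate.

    arrival_rate  -- request arrival rate (requests/s)
    memory_max    -- maximum memory allotment, in cached objects (assumption)
    rho_target    -- target storage utilization, in (0, 1]
    total_objects -- T in the stand-in hit-ratio model (assumption)
    headroom      -- hypothetical fractional increase in storage beyond the
                     target-utilization allotment
    """
    # Hit ratio at the memory cap (ASSUMED stand-in for Equation (1)).
    if memory_max > 1:
        hit_ratio = min(1.0, math.log(memory_max) / math.log(total_objects))
    else:
        hit_ratio = 0.0
    # Storage request rate: the miss stream (ASSUMED stand-in for Equation (2)).
    storage_rate = arrival_rate * (1.0 - hit_ratio)
    # Storage allotment that holds utilization at the target.
    storage = storage_rate / rho_target
    # Extra capacity pushes utilization below the target, which lowers the
    # storage response time and offsets the reduced hit ratio.
    storage *= 1.0 + headroom
    return memory_max, storage, hit_ratio
```

With an arrival rate of 400 requests/s, a memory cap of 100 objects,
T = 10000, and a target utilization of 0.5, the stand-in models give a hit
ratio of about 0.5 and a storage allotment of about 400 IOPS before any
headroom is applied.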
If both the storage allotment
and M are constrained, assign both to
their maximum values and report the response
time that the models predict for those values.
LocalAdjust adjusts allotments to consume a local
surplus in the same way.
This may be useful if the service
is the only load component assigned to some server or set of
servers. Surplus
assignment is more interesting when the system must distribute
resources among multiple competing services, as described below.