Emerging federated computing environments offer attractive platforms
to test and deploy global-scale distributed applications.
When nodes in these platforms are time-shared among competing
applications, available resources vary
across nodes and over time. Thus,
one open architectural question in such systems is how to map
applications to available nodes, that is, how to
discover and select resources. Using a six-month trace of
PlanetLab resource utilization data and of resource demands from
three long-running PlanetLab services, we quantitatively characterize
resource availability and application usage behavior
across nodes and over time, and investigate
the potential to mitigate the impact of resource variability on applications through intelligent
service placement and migration.
We find that usage of CPU and network resources is heavy and
highly variable. We argue that this variability calls for intelligently
mapping applications to available nodes. Further, we find that node placement
decisions can become ill-suited after about 30 minutes, suggesting
that some applications can benefit from migration at that timescale,
and that placement and migration decisions can be safely based on
data collected at roughly that timescale.
We find that inter-node latency is stable and is a good predictor of
available bandwidth; this
observation argues for collecting latency data at relatively coarse
timescales and bandwidth data at even coarser timescales, using the
former to predict the latter between measurements.
Finally, we find that although the utilization of a particular resource on a
particular node is a good predictor of that node's utilization of that resource
in the near future, no correlations exist to support
predicting one resource's availability based on availability of other
resources on the same node at the same time, on availability of the same resource on other
nodes at the same site, or on time-series forecasts that
assume a daily or weekly regression to the mean.
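
As an illustration of the last comparison, the following minimal sketch contrasts a last-value predictor (a node's current utilization predicts its near-future utilization) with a forecast that assumes regression to a periodic mean. It is not the analysis code used in this study. The function names, the `period` and `horizon` parameters, and the sample series are hypothetical, and the series is made-up data rather than PlanetLab measurements.

```python
from statistics import mean


def last_value_errors(series, horizon=1):
    """Absolute errors when series[t] is used to predict series[t + horizon]."""
    return [abs(series[t + horizon] - series[t])
            for t in range(len(series) - horizon)]


def periodic_mean_errors(series, period):
    """Absolute errors when series[t] is predicted by the mean of all earlier
    samples taken at the same phase of the period (t mod period)."""
    errors = []
    for t in range(period, len(series)):
        history = series[t % period:t:period]  # earlier samples, same phase
        errors.append(abs(series[t] - mean(history)))
    return errors


def compare_predictors(series, period, horizon=1):
    """Mean absolute error of both predictors on one utilization series."""
    return {
        "last_value_mae": mean(last_value_errors(series, horizon)),
        "periodic_mean_mae": mean(periodic_mean_errors(series, period)),
    }


if __name__ == "__main__":
    # Hypothetical CPU utilization samples (percent), for illustration only.
    cpu_utilization = [20, 22, 25, 60, 62, 58, 30, 28, 27, 26, 70, 72]
    print(compare_predictors(cpu_utilization, period=4))
```

Applied to a real trace, `period` would be the number of samples per day or per week, and `horizon` would match the timescale at which placement decisions are revisited.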