If one knew the probability of attack for each vulnerability, then
given a target system resilience one could enumerate cores
with that target resilience. In our case, it is not clear how one
would determine such probabilities. Instead, we define a core
for a host
to be a minimal set of hosts with the following additional
properties: 1)
; 2) for every attribute a
A,
either there is a host in
that differs from
in the value of
or there is no host in the system that differs from
in the value of
. Such a subset of hosts is a core for a host
if we assume that,
in any Internet catastrophe, an attack targets a single attribute value.
Although it is not hard to generalize this definition to allow for attacks
targeted against multiple attribute values, in the rest of this paper we
focus on attacks against a single attribute value.
Smaller cores means less replication, which is desirable for reducing
storage overhead. A core will contain between 2 and
hosts. If the hosts' attributes are well distributed, then the
cores will be small on average: for any host
, it is likely that
there is a host
that has different values of each of the
attributes, and so
and
constitute an orthogonal core. That is,
a fair number of orthogonal cores are likely to exist. If there is
less diversity, though, then the smallest cores may not be orthogonal
for many hosts, thus increasing storage overhead.
A lack of diversity, especially when trying to keep core sizes small,
can lead to a more severe problem. Suppose there are hosts
and an attribute
such that all have the
same value for
. Moreover, there is only one host
that differs in the
value of
. A core for each host
hence contains
, meaning that
will maintain copies for all of the
. Since the amount of disk space
donates for storing backup data is fixed, each
can only use
of
this space. In other words, if
donates
bytes for common storage to
the system, then each
can back up only
bytes. Note
that
can be as large as the number of hosts, and so
can
be minuscule. In Example 3.1, host
is the only
one to have a different value for attribute ``Operating System'', and
hence has to store copies for all the other hosts.
Characterizing the diversity of a set of hosts is a challenging
task. In particular, considering all possible distributions
for attribute configurations is not feasible. Instead, we define a
measure that condenses the diversity of a system into a single
number. According to our definition, a system with diversity
is one in which a share
of the servers is characterized by a
share
of the combinations of attributes. Although this metric
is coarse and does not capture all possible scenarios, it is
expressive enough to enable one to observe how the behavior of a backup
system is affected by skewed diversity. Note that
is in the interval
. The value
corresponds to a uniform distribution,
and a value of
close to 1 indicates a highly skewed diversity.
We use this metric to study how storage overhead, storage load, and resilience
vary with skew in diversity. We define the storage overhead of a host as the
size of the core that
uses to backup its data. Host
maintains copies for
other hosts. We define the storage load of
to be such a value
.
Note that storage load may vary among the hosts. Thus, we define the
storage load of the system as the maximum value of
across all the hosts.
In the remainder of this paper, we refer to the storage load of the system as
just storage load. Finally, resilience depends on the number of attributes covered
in a core, and it decreases as the number of non-covered attributes increases. We then
define the resilience of the system for a host
as the percentage
of attributes covered by the core
that
uses to backup its data.
The problem of finding a smallest core given a set of hosts and an attribute
configuration for each host, however, is NP-hard (reduction from SET-COVER). For
this reason, we used a randomized heuristic to find cores. This heuristic finds a
core for a host as follows:
This heuristic may not be the best; We have not yet done a thorough study of heuristics for finding cores. The results we present below, however, indicates that it is efficient in terms of storage overhead and resilience.