Role Classification
The role classification problem is easy to solve in ideal situations,
such as the network shown in Figure 1, in which any two nodes that
share the same logical role communicate with an identical set of
machines. Such a situation does not reflect the connection patterns
in typical enterprise networks. The three major challenges of the
role classification problem are:
- Two hosts that share the same logical role may communicate with
drastically different sets of machines.
- A host may potentially be classified into more than one role.
- The grouping results that network administrators want may vary
from network to network, so the role classification algorithm must
let them control its mechanics in order to obtain meaningful
groupings.
In a typical network setting for a technology company, each lab or
test machine may be dedicated to a single engineer. Thus, each of
these lab machines, despite sharing the same role, can have a
connection pattern that is very different from the rest of the lab
machines. To group such machines together correctly, the grouping
algorithm must take into account the potential roles of neighboring
hosts rather than only comparing the neighbor sets directly.
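The lab-machine case can be illustrated with a minimal sketch. The host names, the role labels, and the Jaccard-style `overlap` helper are assumptions made for illustration; they are not the similarity measure defined in Section 3.

```python
# Hypothetical connection sets: each lab machine is used by one engineer.
neighbors = {
    "lab1": {"eng1"},
    "lab2": {"eng2"},
    "lab3": {"eng3"},
}

# Assumed role labels for the neighboring hosts (illustrative only).
role_of = {"eng1": "engineering", "eng2": "engineering", "eng3": "engineering"}


def overlap(a, b):
    """Jaccard overlap of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbor_roles(host):
    """Replace each neighbor by its role, so hosts are compared by role."""
    return {role_of[n] for n in neighbors[host]}


# Comparing raw neighbor sets: the two lab machines share nothing.
raw = overlap(neighbors["lab1"], neighbors["lab2"])

# Comparing the roles of the neighbors: the two lab machines look identical.
by_role = overlap(neighbor_roles("lab1"), neighbor_roles("lab2"))

print(raw, by_role)
```

Under these assumptions the raw neighbor sets do not overlap at all, while the role-level view treats the two lab machines as equivalent, which is the behavior the grouping algorithm needs.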
Furthermore, some hosts may potentially be classified into more than
one role. For instance, the network in Figure 1 could contain a
machine that communicates both with the set of machines that many
engineering machines talk to and with the set that many sales
machines talk to. In such cases, the connection patterns of hosts
must be evaluated carefully to ensure that each host is grouped with
the hosts with which it has the strongest similarity in connection
habits.
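One way to picture "strongest similarity" for a host that straddles two roles is sketched below. The server names, the per-group neighbor sets, and the use of Jaccard overlap as the score are all hypothetical choices for this example, not the measure used by the algorithm.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_group(host_neighbors, group_neighbors):
    """Return the name of the group whose neighbor set is most similar."""
    return max(
        group_neighbors,
        key=lambda name: jaccard(host_neighbors, group_neighbors[name]),
    )


# Hypothetical neighbor sets that characterize each existing group.
group_neighbors = {
    "engineering": {"mail", "web", "src", "build"},
    "sales": {"mail", "web", "crm"},
}

# A host that talks to the servers of both groups.
ambiguous_host = {"mail", "web", "src", "build", "crm"}

choice = best_group(ambiguous_host, group_neighbors)
print(choice)
```

Here the host overlaps with both groups, but the overlap with the engineering set is larger, so it is placed there.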
For these reasons, the role classification problem is not trivial.
How the similarity measure is computed matters, and so does the
process by which nodes are grouped based on the similarity values
among node pairs.
The grouping algorithm consists of two phases: i) the group formation phase and
ii) the group merging phase. The group formation phase identifies each
group of hosts that have similar sets of neighbors using a simple
similarity measure such as the one described in
Section 3. The purpose of the group formation phase is
two-fold: i) to efficiently identify various groups of hosts, each of
which has drastically different overall connection patterns, and ii)
to prepare for the second phase of the algorithm. The formation phase
of the algorithm can efficiently find the desired partitioning for the
example network in Figure 1 but may fail for many
networks since it does not take into account the potential roles of
neighboring hosts as explained earlier. In general, the group
formation phase may generate a partitioning that contains more groups
than desired.
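The behavior described above can be sketched with a toy formation pass. This is not the formation algorithm itself: the greedy single pass, the `min_shared` threshold, and the use of a shared-neighbor count as the "simple similarity measure" are assumptions made only to show how neighbor-set comparison can produce more groups than desired.

```python
def common_neighbors(a, b):
    """Assumed simple similarity: the number of shared neighbors."""
    return len(a & b)


def form_groups(neighbors, min_shared):
    """Greedy pass: join the first group whose first member is similar enough."""
    groups = []
    for host in sorted(neighbors):
        for group in groups:
            representative = group[0]
            shared = common_neighbors(neighbors[host], neighbors[representative])
            if shared >= min_shared:
                group.append(host)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([host])
    return groups


# Hypothetical network: engineers and sales share some servers,
# while each lab machine talks only to its own engineer.
neighbors = {
    "eng1": {"mail", "web", "src"},
    "eng2": {"mail", "web", "src"},
    "sales1": {"mail", "web", "crm"},
    "sales2": {"mail", "web", "crm"},
    "lab1": {"eng1"},
    "lab2": {"eng2"},
}

groups = form_groups(neighbors, min_shared=3)
print(groups)
```

In this toy network the engineering and sales machines are grouped as expected, but the two lab machines end up in separate singleton groups because their neighbor sets do not intersect.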
The group merging phase decides whether the groups produced by the
formation phase can be merged further, using a much more sophisticated
similarity measure. This phase provides network administrators with
fine-grained control over the merging process so that the grouping
results reflect their intuition of the network structure.
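A rough sketch of the merging idea follows, under stated assumptions. The group-level "profile" (the set of formation-phase groups that a group talks to), the Jaccard score, and the single `threshold` knob standing in for administrator control are all hypothetical; the actual merging measure is more sophisticated than this.

```python
from collections import defaultdict

# Hypothetical undirected connections; host names are illustrative only.
edges = [
    ("eng1", "mail"), ("eng1", "src"), ("eng2", "mail"), ("eng2", "src"),
    ("sales1", "mail"), ("sales1", "crm"), ("sales2", "mail"), ("sales2", "crm"),
    ("lab1", "eng1"), ("lab2", "eng2"),
    ("mail", "backup"), ("src", "backup"), ("crm", "backup"),
]
neighbors = defaultdict(set)
for a, b in edges:
    neighbors[a].add(b)
    neighbors[b].add(a)

# Assumed output of a formation phase: the lab machines are still apart.
groups = [["eng1", "eng2"], ["sales1", "sales2"], ["lab1"], ["lab2"],
          ["mail"], ["src"], ["crm"], ["backup"]]
group_index = {host: i for i, members in enumerate(groups) for host in members}


def profile(members):
    """Indices of the formation-phase groups that these hosts talk to."""
    return {group_index[n] for host in members for n in neighbors[host]}


def merge_groups(groups, threshold):
    """Single pass: merge a group into the first kept group with a similar profile."""
    merged = []  # list of (members, profile) pairs
    for members in groups:
        prof = profile(members)
        for kept_members, kept_prof in merged:
            union = kept_prof | prof
            score = len(kept_prof & prof) / len(union) if union else 0.0
            if score >= threshold:
                kept_members.extend(members)
                kept_prof.update(prof)
                break
        else:
            merged.append((list(members), prof))
    return [members for members, _ in merged]


merged = merge_groups(groups, threshold=0.9)
print(merged)
```

With a strict threshold only the two lab machines are merged, because both talk exclusively to the engineering group. Lowering the threshold in this sketch merges groups more aggressively, which is the kind of trade-off an administrator would tune.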
Godfrey Tan
2003-04-01