Next: System Overview
Up: Role Classification of Hosts
Previous: Abstract
Introduction
Today, many enterprises have internal networks (intranets) that are as
complicated as, or more complicated than, the entire Internet of a few
years ago. Managing these networks is increasingly costly, and the
business cost of network problems is increasingly high.
Managing an enterprise network involves a number of inter-related
activities, including:
- Establishing a topology. A network's topology has a
significant impact on its cost, security, and performance. An
increasingly important aspect of topology design is network
segmentation. In an effort to provide fault isolation and mitigate
the spread of worms like Nimda [3] and Code
Red [2], enterprises segment their networks using
firewalls [4], routers, VLANs [7], and
other technologies.
- Establishing policies. Different users of a network have
different privileges. Some users may have unlimited access to
external networks while others may have restricted access. Some users
may be limited in the amount of bandwidth they may consume, and so on.
The number of policies is open-ended.
- Monitoring network performance. Almost every complex network
suffers from various localized performance problems. Network managers
must detect these problems and take action to correct them.
- Detecting and responding to security violations. Increasingly,
networks are coming under attack. Sometimes the targets are chosen at
random, as in most virus-based attacks, and in other cases they are
picked intentionally, as with most denial-of-service attacks. These
attacks often involve compromised computers within the enterprise
network. Early detection of attacks plays a critical role in reducing
the damage.
Conducting these activities on a host-by-host basis is not feasible
for large networks. Network managers need to extract structure from
their networks so that they can think about them and make decisions at
larger levels of granularity. Today, this structuring is most often
done in an ad hoc manner that relies on administrators' best guesses
about the computers, services, and users on the network. Such
guesswork does not scale as networks grow.
This paper presents two algorithms that, used together, partition the
hosts on an enterprise network into groups in a way that exposes the
logical structure of a network. The grouping algorithm classifies
hosts into groups, or "roles," based on their connection
habits. The correlation algorithm correlates groups produced by
different runs of the grouping algorithm.
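To make the idea of grouping by connection habits concrete, the following is a minimal Python sketch, not the grouping algorithm developed later in this paper. It assumes that a host's "connection habits" can be approximated by the set of hosts it communicates with, and it merges hosts that share at least a threshold number of neighbors. The names `neighbor_sets`, `group_hosts`, and `min_common` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def neighbor_sets(connections):
    """Map each host to the set of hosts it exchanged traffic with."""
    neighbors = {}
    for a, b in connections:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors


def group_hosts(connections, min_common=2):
    """Group hosts that share at least `min_common` neighbors.

    Assumption: similarity is the number of common neighbors, and
    grouping is transitive (a simple union-find merge). This is an
    illustrative stand-in, not the paper's algorithm.
    """
    neighbors = neighbor_sets(connections)
    parent = {host: host for host in neighbors}

    def find(host):
        # Follow parent links to the group representative,
        # shortening the path along the way.
        while parent[host] != host:
            parent[host] = parent[parent[host]]
            host = parent[host]
        return host

    # Compare every pair of hosts; this pairwise pass is quadratic
    # in the number of hosts.
    for a, b in combinations(sorted(neighbors), 2):
        if len(neighbors[a] & neighbors[b]) >= min_common:
            parent[find(a)] = find(b)

    groups = {}
    for host in neighbors:
        groups.setdefault(find(host), set()).add(host)
    return sorted(sorted(members) for members in groups.values())
```

In this sketch, two engineering desktops that both talk to the same mail, web, and source-control servers would land in one group, while sales desktops that use a database server instead would land in another. Raising `min_common` makes the grouping stricter.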
The two algorithms together provide the following properties:
- They guarantee that a host is grouped only with other hosts that
have the strongest degree of similarity in connection habits.
- They provide a mechanism to merge groups, and give network
administrators fine-grained control over the merging process, so
that meaningful results can be achieved.
- They deal with transient changes in connection patterns by
analyzing the profiled data over long periods.
- They respond to non-transient changes in connection patterns
by producing a new partitioning and describing the differences
between the new partitioning and the previous partitioning.
- Their run time grows quadratically with
the number of hosts in the enterprise network.
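The idea of relating a new partitioning to a previous one and describing the differences can be sketched as follows. This is an assumption-laden illustration, not the correlation algorithm presented later: it matches each new group to the old group with the highest Jaccard overlap in host membership only, and reports which hosts joined or left. The names `jaccard` and `correlate` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two host sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def correlate(old_groups, new_groups):
    """Match each new group to its most similar old group.

    Both arguments map a group name to a set of hosts. The result maps
    each new group name to a tuple:
        (matched old group name or None, hosts that joined, hosts that left)
    Assumption: membership overlap alone decides the match.
    """
    result = {}
    for new_name, members in new_groups.items():
        best, best_score = None, 0.0
        for old_name, old_members in old_groups.items():
            score = jaccard(members, old_members)
            if score > best_score:
                best, best_score = old_name, score
        # A new group with no overlapping old group is matched to None.
        previous = old_groups.get(best, set())
        result[new_name] = (best, members - previous, previous - members)
    return result
```

A report built from such a mapping would let an administrator see, for example, that a group of sales hosts gained one machine, rather than presenting an unrelated set of freshly numbered groups after every run.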
As we demonstrate in Section 6, the algorithms reduce the
number of logical units that a network administrator must deal with by
one or two orders of magnitude. The algorithms are implemented
as part of an enterprise monitoring and analysis system that is in
production use at several large enterprises.
Section 2 outlines the system in which the algorithms
operate, and introduces an example scenario that will be used
throughout the paper. Section 3 describes the models used
to develop practical solutions. Section 4 and
Section 5 explain the two practical algorithms for solving
the role classification problem. Section 6 presents
preliminary results, and Section 7 discusses related work.
We conclude with a discussion of our current and future work in
Section 8.
Godfrey Tan
2003-04-01