Summaries

 

Workshop on Intrusion Detection and Network Monitoring

Santa Clara, California
April 9-12, 1999

Keynote
Session: Analysis and Large Networks
Session: Software and Processes
Session: IDS Systems
Session: Network Data Processing and Storage
Session: Statistics and Anomalies
Invited Talks

KEYNOTE ADDRESS

Dr. Peter Neumann, SRI

Summary by Rik Farrow

Dr. Neumann has a long history in both computers and security — his first programming job was in 1953. He was the co-author of the paper that described the file system for MULTICS (an ancestor of UNIX), which included much stronger provisions for security than most operating systems before or after. He worked with Dorothy Denning on IDES, an intrusion-detection project that had its roots in work beginning in 1983. Neumann was also the co-author, with Phil Porras, of the paper that won the workshop's Best Paper award, about EMERALD (Event Monitoring Enabling Responses to Anomalous Live Disturbances).

Neumann spoke from notes instead of giving a prepared speech. His premise is that our problem with security today is a structural one, and we won't have secure networks or computers unless we have secure operating systems, applications, and even programming languages that have support for security. One person, an ex-CERT team member, muttered that this was "the same old stuff." Sure, but until we get this right, we really cannot move on.

"We don't start with secure systems," proclaimed Neumann. Today's intrusion-detection (ID) systems have chosen to attack the easy problem, detecting known patterns of misuse. Today's ID products ignore insider abuse, which—like penetrations—would better be prevented than detected. The need for ID would be greatly reduced if operating systems provided better authentication, stronger cryptographic algorithms and implementations, stronger differential access controls, and, in some cases, multilevel security.

With today's operating systems, there is little hope of maintaining good security. Operating systems are already too unwieldy to administer properly, and some systems can be totally impossible to manage. As an example, he mentioned Microsoft and alluded to the reported 48 million lines of code that will become Windows 2000.

Also, with today's systems it is possible for a single node to contaminate an entire network. As examples, he mentioned the 1980 ARPAnet collapse, the 1990 AT&T long-distance collapse, the AT&T Frame Relay Network outage, and the Galaxy IV satellite outage. (Remember when many pagers suddenly stopped working?)

EMERALD is designed so that it can be integrated with other tools. Correlation of data from many dispersed sensors represents a serious problem. Another paper given in Oakland in May provides more details about how EMERALD copes with this problem and also characterizes the rule-based component. (See <http://www2.csl.sri.com/emerald/pbest-sp99-cr.pdf>.) The use of further reasoning can weed out the false positives found in anomaly-based ID systems.

Neumann concluded by saying that he is trying to convince the US Department of Defense to fund the robustification of an open-source version of UNIX and other good open-source software. "Jails are not the answer to our security problem; robust systems and networks are required as a basis for real security," stated Neumann. So-called software engineers need to be taught principles of software engineering and security. Data abstraction, encapsulation, and language features can help to make it easier to write secure software.

Dr. Neumann left ample time for questions at the end of his remarks. Someone asked if authentication will solve problems, to which he answered, "Cryptographic authentication can help solve the problem." Fixed passwords outlived their usefulness many years ago. The challenge here is to build a much more robust structure coupling authentication with identification. He'd like to see it done sooner rather than later.

The next person asked about problems with differential ACLs (access control lists), and did MULTICS solve the problem? Neumann replied, "No, and the Orange book didn't fix it either." Roles don't do it, and today's OSes have serious vulnerabilities. We should put trustworthiness on trusted servers. We need open source because it can be analyzed.

Dan Geer tied in examples from bacteria defense systems and asked if these are truly relevant to ID. Neumann's answer was that while the biological model is not really applicable, there are some lessons from biodiversity. Cyberdiversity can help with defense; the Melissa virus shows what can happen in a monoculture environment, which is another violation of the Einsteinian dictum that everything should be as simple as possible but no simpler.

Tom Limoncelli asked about best practice. Neumann answered that you can do simple things, but that alone is not enough. The open-source paradigm has an enormous community working to find problems, but open source does not solve the problem by itself unless the systems are adequately robust. Someone else asked about electronic commerce and whether the rush to insure risk would help. Neumann replied that the risks at the moment are relatively low. He quipped that maybe we need a Chernobyl in the computer world.

You can find out more about Peter Neumann's life, interests, projects, and papers by visiting <http://www.csl.sri.com/neumann/>.

REFEREED PAPERS

Session: Analysis and Large Networks

Summary by David Klotz

Analysis Techniques for Detecting Coordinated Attacks and Probes
John Green and David Marchette, Naval Surface Warfare Center; Stephen Northcutt, Ballistic Missile Defense Organization; Bill Ralph, ATR Corporation

One of the hot topics in intrusion detection is how a coordinated attack can be recognized. David Marchette gave a statistician's viewpoint on this question. By his own admission, the talk gave some clues for how to detect coordinated attacks but offered no magic bullet. He went on to say that as attackers get more sophisticated, we must rely more and more on coincidence to detect coordinated attacks.

A coordinated attack is a single attack launched from two or more computers. It is not a requirement, however, that you be able to tell for sure that multiple machines are involved; in practice this is often impossible. Essentially, coordination is a symptom, the appearance of a pattern. Marchette gave several examples of coordinated attacks he has seen:

  • Coordinated traceroute, which can be used as part of a mechanism for denial of service by taking down a system upstream from yours.
  • NetBIOS scans, in which he found that only IP addresses that didn't exist were being probed. One explanation for this was that attackers might have been interested only in new machines, which presumably haven't been secured yet.
  • Reset scans, which he was not even sure were attacks. They might have been a naturally occurring event, but they could also be used to do inverse mapping by recording host-unreachable messages generated by the RST packet. TCP hijacking was another possible explanation.

Detection of these types of attacks can sometimes be done by recognizing patterns in audit data, either by looking for anomalies or seeing signatures of known attacks. Unfortunately, very slow attacks that fall outside the window of detection really cannot be recognized.

Someone asked how long a window Marchette uses. His response was that though his team has over half a terabyte of packet header data, it is unlikely they would look at all of it. He estimated his window to be around a couple of days, maybe a little longer.

Someone else mentioned detection of spoofed-address attacks, saying that if you clustered IP addresses, TTLs, or sequence numbers of scans, then the spoofed ones would stand out. Unfortunately, attack tools that spoof have already become sophisticated enough to escape most detection using this method, and the trend is toward even better tools.

Intrusion Detection and Intrusion Prevention on a Large Network: A Case Study
Tom Dunigan and Greg Hinkel, Oak Ridge National Laboratory

In many situations, intrusion detection can be done in a locked-down system where access is limited and abnormal behavior can be specified rather than inferred. In many scientific centers, however, such restriction is very unpopular and is strongly resisted. Scientific staffs, including offsite collaborators, require easy access to accounts and data, often over unencrypted channels. A further complication is the fact that many users don't see security as an issue until they themselves are victimized. Greg Hinkel presented a paper on detecting intrusions in such an open environment, at Oak Ridge National Lab.

Hinkel described a layered approach taken at ORNL:

  1. Firewall: to place some limitations on access.
  2. External monitoring: to monitor traffic from outside.
  3. Internal monitoring: Honeypots are placed around the network, and some scanning of internal systems is done to look for known vulnerabilities.
  4. System administration: System administrators are taught how to set up systems correctly and what things to watch out for.
  5. End users: Users are educated about security issues, such as the danger of cleartext passwords and the importance of their accounts.

The hardware configuration involves extensive monitoring of traffic, both internal and external. For example, all external sessions are keystroke-logged, providing a complete record of what happened. Honeypots are used internally to allow the tracing of intrusions when they do happen. Realtime detection is also complemented by scripts that comb through logs looking for "interesting" things, such as IRC traffic on dedicated scientific computing machines.

Three specific cases from the paper were described. The first involved a Russian attacker who was able to gain root access on one of the SGIs at the lab. They were able to detect the attack by noticing a port scan and then capture the attacker's session with the keystroke logger. A similar attack from Brazil was detected by the same means. A third attack, which involved a cracked password, was detected when the TCP connection logger noticed an unusually high number of connections coming from one machine.

Most of the interest from the audience was in the keystroke logger. One question dealt with how the logs are kept safe from tampering. Hinkel responded that the logs are kept on highly protected machines that can't be reached from the outside and are well hidden. Someone else wondered how ORNL plans to deal with encrypted sessions; this issue hasn't been addressed yet.

An Eye on Network Intruder-Administrator Shootouts
Luc Girardin, UBS, Ubilab

Luc Girardin presented a method to analyze network traffic statistics visually, allowing detection of possible attacks by inspection. Unlike most current IDSs, this system relies on human monitoring and takes advantage of human ability to understand complexity.

Girardin sees current IDSs as systems that rely on the implementation of ever more complicated locks to respond to ever more advanced attacks. To break out of this loop, a new approach is needed. Rather than combing through network packets looking for anomalies or misuse, traffic data is mapped topographically. In this way similar events are represented as being close together, and dissimilar ones farther apart.

By mapping network traffic, this approach takes advantage of the almost universal ability of human beings to comprehend geographic maps. Relationships such as proximity or optimal path apply to network maps in a way similar to how they apply to geographical maps. The actual mapping is done using an unsupervised neural net, which has no a priori knowledge about network traffic patterns.
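The mapping idea can be sketched with a tiny one-dimensional self-organizing map, an unsupervised neural net in which similar inputs come to be represented by nearby units. This toy is illustrative only and reproduces none of Girardin's actual traffic features or visualization:

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_som(data, n_units=10, epochs=100, lr=0.3, radius=3.0):
    """Train a 1-D self-organizing map on a list of feature vectors."""
    dim = len(data[0])
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    units = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # shrink learning rate and radius over time
        for x in data:
            # best-matching unit: the unit closest to this input
            bmu = min(range(n_units), key=lambda i: dist2(units[i], x))
            for i, u in enumerate(units):
                # neighbors of the BMU are pulled toward the input too
                influence = max(0.0, 1.0 - abs(i - bmu) / (radius * decay + 1e-9))
                for d in range(dim):
                    u[d] += lr * decay * influence * (x[d] - u[d])
    return units

def map_event(x, units):
    """Map an event to its nearest unit on the 1-D map."""
    return min(range(len(units)), key=lambda i: dist2(units[i], x))

# Two clusters of "events": after training, they land on different units.
data = [[0.1, 0.1], [0.12, 0.08], [0.9, 0.9], [0.88, 0.92]]
units = train_som(data)
a = map_event([0.1, 0.1], units)
c = map_event([0.9, 0.9], units)
assert a != c  # dissimilar events end up far apart on the map
```

An operator would then watch the map positions of live events rather than raw packets, spotting outliers by eye.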

Session: Software and Processes

Summary by David Parter

On Preventing Intrusions by Process Behavior Monitoring
R. Sekar, Iowa State University; T. Bowen and M. Segal, Bellcore

This paper presented an active approach to preventing intrusions. According to the authors, damage can happen only as a result of system calls by modified programs and network packets delivered to the target host. Damage occurs when programs deviate from the intended behavior. The authors' damage-prevention technique is to model the correct behavior of a program in terms of system calls and detect any deviation from the model. They did a paper-and-pencil analysis of 96 CERT advisories and determined that most of the program deviations were detectable with their method.

There are two parts to their system: an offline preparation stage and a runtime monitor. The offline stage generates the detection engine on the basis of the behavioral specification of a program. The runtime system uses a kernel-level system call interpreter for Linux, which intercepts system calls just before and immediately after the kernel implementation of the system call. The detection engine compares the system-call sequence with the expected sequence. Any deviations are detected, and execution is prevented.
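The core comparison can be sketched as a transition check over system-call pairs. This is an illustrative simplification, not the authors' kernel-level interpreter; their behavioral specifications are richer than adjacent-pair matching:

```python
def build_model(training_traces):
    """Collect every (previous, next) system-call pair seen in normal runs."""
    allowed = set()
    for trace in training_traces:
        allowed.update(zip(trace, trace[1:]))
    return allowed

def check(trace, allowed):
    """Return the first disallowed transition, or None if the trace conforms."""
    for pair in zip(trace, trace[1:]):
        if pair not in allowed:
            return pair
    return None

# Hypothetical normal runs of a program, as system-call sequences.
normal_runs = [["open", "read", "write", "close"],
               ["open", "read", "close"]]
model = build_model(normal_runs)
assert check(["open", "read", "write", "close"], model) is None
assert check(["open", "exec", "close"], model) == ("open", "exec")
```

In the authors' system the check happens inside the kernel before the call executes, so a deviating call can be blocked rather than merely reported.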

In response to a question about code availability, it was stated that their work is currently a prototype of the concept.

Intrusion Detection Through Dynamic Software Measurement
Sebastian Elbaum and John C. Munson, University of Idaho

Sebastian Elbaum, a student at the University of Idaho, presented his work on intrusion detection "from the inside out" of a program. (He received the Best Student Paper award.)

Most tools use standard audit trails, which include information only about points for which audit records are generated. By looking at the software "from the inside out," the authors expect more low-level detailed information, which will make it easier to detect abnormalities.

Their method is to model the expected behavior of the program as a series of function calls. In practice this is an extended call graph and execution profile of the program, and the probability distributions for sequences of operations.

The implementation is to add software instrumentation to the program source code; in this case, the Linux kernel. At this point, they have successfully instrumented the kernel and produced a nominal profile. Changes in user behavior (for example, students on spring break instead of in class) have led to false positives, and more work is needed on developing the nominal profile.

Learning Program Behavior Profiles for Intrusion Detection
Anup K. Ghosh, Aaron Schwartzbard, and Michael Schatz, Reliable Software Technologies Corp.

Anup Ghosh observed that looking at process behavior, instead of user behavior, is a shift in intrusion-detection methods, and that it was nice to see several papers on this at the workshop.

The advantage of anomaly detection, as opposed to attack-signature detection, is the possibility of detecting new (as yet unknown) attacks. This is an important goal of the authors' work. Their premise is that abnormal program behavior is a primary indication of program or system misuse.

Like the previous presenters in this session, they construct profiles of expected program behavior. Their work builds on previous "computational immunology" work at UNM, which looked at system-call sequences, and at Columbia, which used data-mining techniques on the UNM data.

Their implementation uses available technology such as the Sun Basic Security Module (BSM) auditing facility and the Linux strace program. With Sun's BSM, programs typically create 10-20 different BSM events (out of about 200 possible events).

He discussed three algorithms for using the profile data: equality matching, like that done by UNM; back-propagation neural networks; and recurrent networks. Each algorithm was tested using known attack traces. A "Receiver Operating Characteristic" (ROC) curve is used to plot the effectiveness of the method with different settings. The Y-axis represents the probability of detecting an attack, the X-axis the probability of reporting a false positive. At (0,0) nothing is reported. At (1,1) all attacks are reported; however, 100 percent false positives are also reported. (0,1) is considered an oracle: 100 percent correct positives reported, with no false positives.
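The ROC construction described above can be sketched by sweeping a detection threshold over anomaly scores; the scores and labels below are made up for illustration:

```python
def roc_points(scores, labels):
    """One (false-positive rate, detection rate) point per threshold.
    scores: higher means more anomalous; labels: 1 = attack, 0 = normal."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    # Start above all scores (nothing reported), then lower the threshold.
    for t in [float("inf")] + sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        points.append((fp / neg, tp / pos))
    return points

scores = [0.9, 0.8, 0.3, 0.2]   # anomaly scores for four sessions
labels = [1, 0, 1, 0]           # ground truth for each session
pts = roc_points(scores, labels)
assert pts[0] == (0.0, 0.0)     # highest threshold: nothing reported
assert pts[-1] == (1.0, 1.0)    # lowest threshold: everything reported
```

Plotting these points traces the curve from (0,0) to (1,1); the closer it bows toward the oracle point (0,1), the better the detector.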

Table Lookup

Table lookup (equality matching) is the baseline for comparing the methods, since it is the easiest to implement and is known to be fairly good at anomaly detection. The profile consists of fixed-sized "windows" of events and their frequencies. It is then analyzed at various granularities (individual windows, clusters of windows of various sizes, and over the entire session), counting the number of anomalies found. For the detection phase, tunable parameters include how many anomalies at each level will trigger a report, size of the windows, and amount of training the system undergoes.

The table-lookup method is simple to implement and good at detecting new attacks, but the false-positive rate proved to be too high. And this method does not generalize—it is based entirely on memorizing expected patterns.
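Equality matching over fixed-size windows can be sketched as follows; the window size and anomaly threshold are the tunable parameters mentioned above, and the names are illustrative, not the authors' code:

```python
def window_set(events, n):
    """All length-n sliding windows of an event sequence, as a profile."""
    return {tuple(events[i:i + n]) for i in range(len(events) - n + 1)}

def count_anomalies(trace, profile, n):
    """Number of windows in the trace not present in the normal profile."""
    return sum(1 for i in range(len(trace) - n + 1)
               if tuple(trace[i:i + n]) not in profile)

# Hypothetical normal BSM-style event stream for a program.
normal = ["open", "read", "read", "write", "close"]
profile = window_set(normal, 3)

assert count_anomalies(["open", "read", "read", "write", "close"], profile, 3) == 0
assert count_anomalies(["open", "exec", "read", "write", "close"], profile, 3) > 0
```

A report would be triggered only when the anomaly count at some granularity (window, cluster, or session) exceeds its threshold.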

Neural Networks for Intrusion Detection

Neural networks learn normal behavior by observation, and they provide the ability to generalize from past behavior. They can detect deviations from normal behavior. The authors implemented a "back-propagation" neural network with supervision. It proved workable, but it did not do as well as equality matching. They concluded that this is because both training and tuning the network are difficult. They also noted that overtraining the network leads to a pure memorization approach.

Elman Networks

Their third approach used an Elman network, which is similar to the network used in the previous approach but includes state information. The performance was remarkably better: 0 percent false positives at detection rates approaching 80 percent, though so far only on the one set of data that they have tested. They are continuing to investigate this approach for real-time intrusion-detection systems.

Session: IDS Systems

Summary by David Klotz

Automated Intrusion Detection Methods Using NFR: Methods and Experiences
Wenke Lee, Christopher T. Park, and Salvatore J. Stolfo, Columbia University

As attackers and attacks have grown more sophisticated, intrusion-detection systems have had to be brought up to speed. Often the rules used to detect attacks are hand-encoded, and it can be a laborious process to keep these systems current. Christopher Park presented work that represents a new approach to discovering and encoding rules into an IDS.

After giving the obligatory description of anomaly and misuse detection, Park went on to describe his group's system, which uses data mining to discover frequent patterns in connection data. These patterns are used to come up with machine-learned guidelines, using RIPPER (a rule-learning system), and then encoded into Network Flight Recorder (NFR). As with most other machine-learning systems, a training period is required.
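The first stage of this pipeline, mining frequent patterns from connection records, can be sketched like this; the field names and support threshold are illustrative assumptions, and the actual system used RIPPER to turn labeled patterns into rules:

```python
from collections import Counter

# Hypothetical connection records, one dict per connection.
records = [
    {"service": "http", "flag": "SF",  "label": "normal"},
    {"service": "http", "flag": "SF",  "label": "normal"},
    {"service": "smtp", "flag": "SF",  "label": "normal"},
    {"service": "ftp",  "flag": "REJ", "label": "probe"},
    {"service": "ftp",  "flag": "REJ", "label": "probe"},
]

def frequent_patterns(records, min_support=0.3):
    """Feature combinations occurring in at least min_support of records."""
    counts = Counter((r["service"], r["flag"]) for r in records)
    n = len(records)
    return {pattern for pattern, c in counts.items() if c / n >= min_support}

pats = frequent_patterns(records)
assert ("http", "SF") in pats and ("ftp", "REJ") in pats
assert ("smtp", "SF") not in pats   # support 0.2, below the 0.3 threshold
```

A rule learner would then generalize labeled patterns like these into conditions that can be compiled into NFR's filter language.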

Several questions followed the talk. One attendee wanted to know if they had looked at other sniffer or detection systems besides NFR. Park, who earlier in the talk had indicated that NFR had been chosen because of its extensibility, realtime alert capability, and noninterference with network traffic, replied that they had looked at several software packages and had all agreed that NFR best met their needs. He reiterated that its extensibility was what made it the most useful for their research.

Someone else asked what features are looked at during rule creation. Currently only features that can be gleaned from packet headers are used. Do they plan to look at data rather than just header information? Park responded no, they look only at headers. A response to another question indicated that as bandwidth increased, the rules discovered also changed, specifically those that deal with fragmented packets; however, no data had been taken that would show whether accuracy went up or down with increased traffic.

Experience with EMERALD to Date
Phillip A. Porras and Peter G. Neumann, SRI International

This paper received the Best Paper Award. Phil Porras presented an overview of the EMERALD (Event Monitoring Enabling Responses to Anomalous Live Disturbances) intrusion-detection system. The EMERALD approach has been to shift analysis from centralized, monolithic intrusion detection to distributed, lightweight sensors that can be deployed strategically. Though each module would gather data locally, they would feed information to one another in order to get a global view. There is also a standard API that allows integration of third-party components into the EMERALD system.

EMERALD uses both anomaly detection and an expert system to recognize known attacks. The anomaly detection builds a profile of normal activity and compares both short- and long-term patterns. It monitors current activity and sets off alarms when it departs from the profile of what is normal. The expert system fully integrates a P-BEST shell. Rules are translated and compiled directly into code modules. Hence, no interpretation is done, so the rules can be run very quickly. Porras emphasized that creating EMERALD modules was not difficult and mentioned a graduate-school class where the students built their own modules, using P-BEST, in about two weeks. He also mentioned that the EMERALD team is currently working on a Web-based development environment.

EMERALD also attempts to correlate activity across modules. Equivalence recognition is done to determine when two reports refer to the same attack. Commonalities between reports are also looked for, as well as vulnerability interdependencies across physically distributed components. Commonly seen sequences across domains are also noted. One attendee wanted to know how EMERALD handles system calls. Porras responded that it traps Solaris system calls, and that they are working on something similar for FreeBSD, but that they had nothing for Linux. In response to another question, Porras mentioned that he does not believe there is any IDS on the market that can be compared to EMERALD.

Defending Against the Wily Surfer—Web-based Attacks and Defenses
Daniel V. Klein, Cybertainment, Inc.

Dan Klein started his talk by answering the question of the hour: yes, his mom—and in fact most of his relatives—knows what he does for a living. For those neither related to him nor present at his talk, Klein is the technical person at Cybertainment, Inc., one of the larger Internet adult-entertainment providers. From this vantage point, he has a good view of a large variety of Web-focused network attacks.

Klein started his talk with a prediction that the nature of the Web will change tremendously in the near future. Now almost everything is free, except for porn. He believes this will change when people realize that the costs associated with going on the Web are not covered by ad revenues. He also pointed out that the adult-entertainment industry is what drives the Web, since it is that rare commodity that people are willing to pay for. People are also willing to steal it.

Klein broke Web-based attacks down into three categories: simple theft, breaking and entering, and felonious assault. Simple theft refers to actions like registering common misspellings of a popular domain name (like netscpae.com), or registering variations on top-level domains (such as whitehouse.com). While these actions aren't in themselves illegal, these alter-sites could easily be used to fool users into divulging credit-card or personal information.

Breaking and entering refers to actions such as domain-name stealing or password cracking. Several domain-name controversies have been mentioned in the media recently, which only highlights how easy that particular attack is. Felonious assault covers such things as DNS-cache poisoning or JavaScript frame spoofing.

One of the most common problems that adult sites face is password sharing. One person will get a hold of a password, by legitimate means or otherwise, and then post it to a public place. There are in fact whole Web sites devoted to this. Of course this makes password sharing fairly easy to detect, since huge traffic spikes occur shortly after one has been posted. Klein uses software to check for these spikes twice a day and disables the account when it's found.
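A spike check of this sort can be sketched as a comparison of recent login counts against per-account baselines; the thresholds and names here are hypothetical, not Klein's actual software:

```python
def shared_password_suspects(recent_counts, baseline, factor=10, floor=20):
    """Flag accounts whose recent login count far exceeds their baseline.
    factor: how many times the baseline counts as suspicious.
    floor:  minimum absolute count, so quiet accounts aren't flagged by noise."""
    return [acct for acct, n in recent_counts.items()
            if n >= max(floor, factor * baseline.get(acct, 1))]

# Typical logins per half-day for each account.
baseline = {"alice": 2, "bob": 3}
# Counts observed since the last check.
recent = {"alice": 2, "bob": 250}

assert shared_password_suspects(recent, baseline) == ["bob"]
```

An account that trips the check would then be disabled pending review, as described above.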

One novel approach that some sites use to turn this attack to their advantage is to post fake usernames and passwords, which redirect to a click-through ad that generates revenue. Someone raised the question of whether or not this kind of activity is itself illegal. Klein argued that it is not, but simply a legal means of taking advantage of illegal behavior.

Another tricky legal area Klein covered was bandwidth theft. If a site downloads images from another site and posts them, this is clearly prosecutable theft, but if a site simply puts links to images owned by others, surrounded by their own advertising, then they have done nothing illegal. Meanwhile the other site is the one paying the bandwidth hit for serving the images, while the one using the links is able to generate the revenue.

Clickbots represent another form of illegal activity. They are designed to generate hits on click-through ads that produce revenue on a per-click basis. These can be very hard to detect if done well. One person wanted to know when these were considered illegal rather than just immoral. Klein responded that in all cases he considered them immoral, but clickbots that generate fake hits that actually create real revenue could be legally classified as fraudulent. A trickier situation is one where the clickbot is generating hits that move the site up on a top-ten list. Here there is no direct revenue generated, and in fact it can be argued that the top-ten list benefits too.

The talk ended with Klein giving some advice on how to go about protecting yourself: try to think like a bad guy, and most of all have fun; remember that we too can be devious.

Session: Network Data Processing and Storage

Summary by David Parter

Preprocessor Algorithm for Network Management Codebook
Minaxi Gupta and Mani Subramanian, Georgia Institute of Technology

Minaxi Gupta presented her work on preprocessing a "codebook" for network management. The codebook approach is to try to identify all possible causes for every possible failure in the network. The symptoms and problems are put into a matrix, whereby one can easily determine the problem for a given symptom. A well-formed matrix eliminates redundant entries so that each symptom leads to a specific root cause of the problem.

Once a codebook has been created, an automated system can use the codebook to identify problems based on the symptoms (observed failures). Gupta's work is a technique to ensure a minimal codebook, which is crucial to an efficient runtime system. In her presentation, she detailed the mathematical aspects of the preprocessor technique.
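The runtime lookup that a codebook enables can be sketched as a nearest-match over symptom sets; the problems and symptoms below are invented for illustration, and the paper's actual contribution, the preprocessor that minimizes the matrix, is omitted here:

```python
# Each root-cause problem maps to the set of symptoms it produces.
codebook = {
    "link_down":    {"ping_fail", "route_flap"},
    "dns_failure":  {"name_lookup_fail"},
    "server_crash": {"ping_fail", "service_timeout"},
}

def diagnose(observed):
    """Return the problem whose symptom set best matches the observation."""
    def mismatch(item):
        problem, symptoms = item
        # Symmetric difference penalizes missing and extra symptoms alike.
        return len(symptoms ^ observed)
    return min(codebook.items(), key=mismatch)[0]

assert diagnose({"ping_fail", "service_timeout"}) == "server_crash"
assert diagnose({"name_lookup_fail"}) == "dns_failure"
```

A minimal, well-formed codebook keeps this lookup fast and ensures each symptom pattern leads to a single root cause.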

A question from the audience challenged her assumptions that an optimal codebook could be produced and that it would in fact be useful, since most systems have multiple simultaneous errors and it is impossible to build a perfect system in which everything functions exactly according to plan. It was strongly recommended that she consider adding redundancy to the codebook. Gupta answered that this was a good observation, and that they would be adding known probabilistic symptoms, which should address this concern.

The Packet Vault: Secure Storage of Network Data
C. J. Antonelli, M. Undy, and P. Honeyman, University of Michigan

C. J. Antonelli described work on secure storage of captured network packets. The premise is that recording only the headers or some other subset of the total data will cause some information to be lost that could later prove valuable in analysis, in intrusion detection, or as evidence. If this data is to be kept, it must be secured in a manner that allows release of selected traffic while continuing to protect the other traffic.

The architectural goals of the project include: use of commodity hardware and software, completeness of the packet capture, permanency of the record, and security of the record.

The packet vault utilizes two (commodity) Pentium workstations, one running OpenBSD, the other running Linux. The collector workstation runs OpenBSD and accumulates packets from the network, using a modified Berkeley Packet Filter (BPF) to write directly to a memory-based file system. Each packet is encrypted and then transferred via RPC to the archive station (running Linux). The archive station creates a filesystem image of the encrypted packets and metadata, then records a CD of the image.

The archived data is organized as follows: The source and destination of each packet is obscured by means of a translation table, which is encrypted using a "translation table key" that is changed for each CD. The payload of each packet is encrypted using a "conversation key," which is unique for each (source, destination) pair on the CD. The "conversation key" is derived from the volume (CD) master key and the packet headers. All the keys for a given volume are stored on the CD, using PGP encryption, where the private master key is held in escrow (and is not on the archiver system). In this case, the master key would be held by the University of Michigan Regents.

The encryptions are done using DESX, a DES variant with key whitening that is believed to provide an effective key strength of about 95 bits for this application.
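The per-conversation keying idea can be sketched as deterministic key derivation from the per-volume master key and the address pair, so that selected conversations can be released without exposing others. This sketch substitutes a SHA-256 hash for illustration; the vault itself used DESX, and the derivation details below are assumptions, not the paper's construction:

```python
import hashlib

def conversation_key(master_key: bytes, src: str, dst: str) -> bytes:
    """Derive a per-(source, destination) key from the volume master key.
    Releasing one conversation key exposes only that conversation."""
    material = master_key + src.encode() + b"|" + dst.encode()
    return hashlib.sha256(material).digest()

master = b"per-volume-master-key"  # hypothetical escrowed master key
k1 = conversation_key(master, "10.0.0.1", "10.0.0.2")
k2 = conversation_key(master, "10.0.0.1", "10.0.0.3")

assert k1 != k2                                            # each pair keyed separately
assert k1 == conversation_key(master, "10.0.0.1", "10.0.0.2")  # reproducible from master
```

Because every conversation key is recoverable from the escrowed master key, the key holder can release any subset of the archived traffic on demand.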

While the system worked as a prototype, they have only limited experience with it. They were able to use it for two weeks on a 10Mbit research Ethernet, storing 8GB on 15 CDs. One bottleneck was the CD-ROM recorder, which has to be run at no more than 2X recording speed. The biggest challenges, however, are administrative.

The talk identified the following outstanding issues:

  • Limits of DES: With newer CPUs, 3DES should be feasible.
  • Limits of passive analysis: There are several known limits to passive analysis for intrusion detection. The packet vault is immune to time-to-live (TTL) tricks, since it does not do packet reassembly, but passes on all packets to the destination undisturbed. An intrusion-detection system using the data from the packet vault would have to address this issue.
  • Evidence handling: The goal of the packet vault is to "freeze" the scene of the crime, but steps must be taken to assure continuity of the evidence. Partial solutions include adding digital signatures to the CDs (not using the escrowed private key) and assuring that the procedures are auditable. However, this just pushes the problem back a level to trusting the initial packet-collection system. In court cases, it is often a "competition of gray areas" between the prosecution and the defense; rarely is the evidence black and white. It is believed that adding digital signatures and audits will improve the chances of a jury accepting the evidence.
  • Legal issues: There are a lot of potential legal issues with the packet vault, especially for a university. Among them are: carrier-transport / ECPA (Electronic Communications Privacy Act); student information / FERPA (Family Educational Rights and Privacy Act); privacy and First Amendment concerns; human subject guidelines; ownership and copyright issues; right to know / Freedom of Information Act; discovery and evidence rules; search-and-seizure rules; civil liability.

In order to better understand the legal issues, the authors commissioned a six-month study by a law student. He concluded that there is little case law to provide guidance on these issues, and that "fishing trips" for data are possible under the Freedom of Information Act.

For these reasons, the University of Michigan will not be using the packet vault at any time soon. The authors believe that there are fewer such issues in corporate private network environments.

Future work is needed on the intrusion-detection system using the captured packets, better cryptography, digital signatures, the administrative interfaces, and keeping up with faster networks.

Real-time Intrusion Detection and Suppression in ATM Networks
R. Bettati, W. Zhao, and D. Teodor, Texas A&M University

R. Bettati presented work on a real-time intrusion-detection and suppression system. It is targeted to a distributed mission-critical system using a high-speed switched LAN in a closed environment with high operational knowledge. (An example of such a system is a naval battleship-control system.)

These systems have stringent real-time requirements, such as timing and guaranteed bandwidth and delay. They are vulnerable to denial-of-service attacks such as network topology changes, rogue applications, and "high-level jamming" by unexpected traffic. It is important in these situations to detect an intrusion before a denial-of-service attack happens.

In this closed environment, a naive intrusion is easy to detect. The authors propose building low-level ATM security devices that suppress unknown connections (blocking creation of new ATM VPIs and VCIs) to handle naive intrusions. They can also use these low-level devices to detect violations of a given application's traffic signature. The real-time nature of the applications helps, since the traffic characteristics are well specified and well known.

Session: Statistics and Anomalies

Summary by David Klotz

A Statistical Method for Profiling Network Traffic
David Marchette, Naval Surface Warfare Center

The task of monitoring a large network is complicated by the sheer volume of packets that travel across it. Most of this traffic is uninteresting, but in order to find the packets that are unusual and that may be part of an attack, you have to wade through vast quantities of normal packets. David Marchette's second talk of the workshop presented a statistical method for filtering out normal traffic so that attention can focus on the abnormal.

The two questions that were addressed were:

  • Can normal packets be filtered out while still retaining the attack?
  • Do machines cluster according to activity, and can that then be called normal?

Though other packet fields could be analyzed, this work looked only at destination ports. Counts of port accesses were used to look for unusual behavior. Obviously, counts here have meaning only in relation to historical data, which is assumed to be normal. The counts are then looked at on a per-time-period basis and are used to create probability vectors for each port. From this, improbable traffic can be discovered and further analyzed. At this point someone in the audience asked whether the system could detect unusually low amounts of traffic, and whether they would consider this to be suspicious. Marchette responded that he didn't feel this could be an attack in and of itself, but that it might be an indication. Either way, their system didn't deal with unusually low amounts of traffic.

Unfortunately, this method does not scale well by itself. Port numbers range into the tens of thousands, so a simplification is used: ports 1 to 1024 are counted individually, while higher-numbered ports are lumped together into one "big port" category. (An alternative would allow selected groups within the big-port range to be counted individually and lump the rest together.) Clustering is employed to take care of networks with large numbers of machines.

In order to cluster machines, a port-access probability vector is created for each individual machine. Euclidean distance is then computed between all vectors, and either the k-means or the ADC clustering algorithm is applied. The resultant vector of each cluster is then used for filtering. In experiments involving 1.7 million packets and 27 identified attacks, ADC clustering identified all 27 while filtering out 91 percent of all packets. K-means performed slightly worse, filtering out about 76 percent before recognizing all 27 attacks.
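
The profiling scheme can be sketched in a few lines of Python. This is an illustrative reconstruction, not code from Marchette's paper; the thresholds, data, and function names are invented for the example.

```python
# Sketch of per-machine port-access profiling: build a probability vector
# over destination-port counts, with ports above 1024 lumped into one
# "big port" bin, then flag machines whose current behavior strays far
# (in Euclidean distance) from their historical profile.

from collections import Counter
import math

BIG_PORT = 1025  # bin index that collects all ports above 1024

def port_vector(packets, dims=1026):
    """Turn a list of destination ports into a probability vector."""
    counts = Counter(min(p, BIG_PORT) for p in packets)
    total = sum(counts.values())
    return [counts.get(i, 0) / total for i in range(dims)]

def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def filter_unusual(history_vec, current_packets, threshold=0.5):
    """Flag a machine whose current traffic strays from its history."""
    return distance(history_vec, port_vector(current_packets)) > threshold

# A server that historically serves port 80 (with a little ssh) ...
history = port_vector([80] * 95 + [22] * 5)
# ... suddenly sees a scan across high-numbered ports.
scan = list(range(2000, 2100))
print(filter_unusual(history, scan))   # large distance -> True (flagged)
```

A full implementation would compute these vectors per time period and use the cluster centroid, rather than a single machine's history, as the filter baseline.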

Transaction-based Anomaly Detection
Roland Büschkes and Mark Borning, Aachen University of Technology; Dogan Kesdogan, o.tel.o communications GmbH & Co.

In one of the more novel talks of the workshop, Roland Büschkes presented work that applied concepts from database theory to anomaly detection. Like specification-based anomaly detection, transaction-based anomaly detection formally specifies acceptable behavior, representing each protocol as a finite-state machine whose transitions define the valid operations. Each transaction is then checked against the four "ACID" principles:

  • Atomicity: All operations of a transaction must be completed.
  • Consistency: A transaction takes the system from one consistent state to another.
  • Isolation: Each transaction performs without interference from other transactions.
  • Durability: After a transaction finishes, a permanent record is stored.

During implementation, an audit stream of TCP packets is used as input. The stream is sent to a splitter that distributes the packets to the appropriate deterministic finite-state machine. These DFSMs test the atomicity of the transaction. From here the stream is sent to a consistency checker to make sure the transaction leaves the system in a consistent state. Finally, all transactions are sent to an isolation checker which ensures that no transaction interferes with another. The issue of durability, whose meaning is not immediately clear in the context of intrusion detection, was left mostly undiscussed.

A few examples were given to show that network attacks do in fact map to database transactions. SYN flood was shown to violate atomicity and the ping-of-death attack to violate consistency. The talk finished with the promise of an implementation of the system in late June and further investigation of analogies that can be made between database theory and intrusion detection.
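
As a rough illustration of the atomicity check (an assumed sketch, not the authors' implementation), a simplified TCP-connection state machine can flag a SYN flood as an incomplete transaction:

```python
# Toy deterministic finite-state machine for a TCP "transaction".
# The transition table is deliberately simplified. A transaction is
# atomic only if it runs to completion (reaches CLOSED); a SYN flood
# leaves connections stuck half-open and so violates atomicity.

TRANSITIONS = {
    ("LISTEN",      "SYN"): "SYN_RCVD",
    ("SYN_RCVD",    "ACK"): "ESTABLISHED",
    ("ESTABLISHED", "FIN"): "CLOSED",
    ("ESTABLISHED", "RST"): "CLOSED",
}

def atomic(events):
    """Return True if the event sequence forms a complete transaction."""
    state = "LISTEN"
    for ev in events:
        state = TRANSITIONS.get((state, ev))
        if state is None:          # invalid transition: not a valid transaction
            return False
    return state == "CLOSED"       # atomicity: transaction ran to completion

print(atomic(["SYN", "ACK", "FIN"]))  # normal connection -> True
print(atomic(["SYN"]))                # half-open (SYN flood) -> False
```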

INVITED TALKS

Why Monitoring Mobile Code Is Harder Than It Sounds
Gary McGraw, Reliable Software Technologies

Summary by David Klotz

Mobile code has been in the news quite a bit over the past few years. First portrayed as "the next big thing" in the form of Java and, later, JavaScript, it soon became clear, as malicious applets and macro viruses began to make the rounds, that it has a downside too. Gary McGraw addressed some of these issues and what is being done to combat them.

Mobile code has grown in importance with the explosion of the Internet. In trying to build a world where light switches and shoes all have IP addresses, mobile code becomes desirable, since you don't want to have to code each device individually. However, McGraw pointed out, the problem of malicious mobile code is not new. In the 1980s, downloading executables was declared "a bad idea." Given that mobile code is both useful and dangerous, how can we use mobile code without selling the farm?

To answer the question, McGraw began by going through examples of malicious activity that can be carried out by several popular forms of mobile code. He looked at JavaScript first.

JavaScript, which has nothing to do with Java and was originally called LiveScript, allows code to be embedded directly in Web pages and run inside the browser. JavaScript can be used to track the Web locations a user visits, steal files from the user's machine, or create a denial of service by redirecting the browser. It can also construct Java applet tags on the fly. This creates a way to sneak applets past a firewall, since most firewalls deny Java applets by filtering on the <applet> tag before it reaches the browser.

A more sinister attack is Web spoofing. This classic man-in-the-middle attack can be used to steal control of a user's view of the Web. As pages from the intended site pass back to the user, the links in them can be rewritten to point to another site. In this way a user's credit card or personal data might end up going somewhere very different from the intended destination. Even secure sockets provide no guarantee of security.

As McGraw pointed out, just because the little lock on the browser lights up doesn't mean that the secure connection is going where you expect it to. He urged people to actually check out the certificates on secure sites, saying they might be very surprised sometimes. Attackers can get certificates too.

Another class of malicious mobile code is the macro virus. The Word macro virus was the first of these to become well known and widely distributed. So widely distributed, in fact, that when McGraw's first book on Java security came back from the publisher, it arrived as an infected Word template. While some macro viruses can be thwarted by simply turning off macro execution, at least one—the Russian New Year virus—used an auto-execute feature that could not be disabled. To make matters worse, we have now reached a point where programming skill isn't even necessary to create malicious mobile code. This point was brought home when, not two weeks before the workshop, Melissa—a macro created mostly by cut and paste—became the fastest-spreading virus ever.

McGraw discussed ActiveX next. ActiveX uses Microsoft's Authenticode method for guaranteeing safety. This strategy is faulty on at least two fronts. First, when an ActiveX module is downloaded, the signer's identity is presented to the user, who is given the opportunity to deny the module permission to run. Once the OK is given, though, the module is free to do whatever it wants. McGraw used the analogy of allowing somebody into your office under the pretense of carrying out some task, then giving them free and easy access to all of your records regardless of whether or not they are relevant to the task at hand. Since no sandboxing is done in ActiveX, modules have access to the entire machine. Second, the very foundation of the "signed is OK" philosophy is faulty. The Exploder Control, a signed and verified ActiveX module, performed a clean shutdown of Windows 95 when it was downloaded.

Finally, McGraw looked at malicious Java. Though Java has been built with security in mind, attack applets can be built. Responding to the point that some malicious applets have been found in the wild, but no attack applets have, McGraw chalked this up to pure luck and presented a long list of attack applets that have been created in labs. Further, while the latest version of Java has several new security features, problems always crop up with new code.

The philosophy behind the Java architecture has been "add as much as you can while managing the risks." The Java Virtual Machine is used as a guardian, protecting operating-system calls by controlling the entry points. Programs that are trusted are given access to these calls, while untrusted code is denied. In JDK 1.0.2 a black-and-white security model was used, where code was either trusted or untrusted. Code verification was done using type checking, but since this couldn't be done statically, it led to vulnerabilities. The most prevalent attack in 1.0.2 involved throwing an exception in a certain state that caused the virtual machine to confuse types. JDK 1.1 added a sandbox and a signing mechanism that would allow code out of the sandbox. The crypto API was also introduced. 1.1, however, still used the black-and-white security model. With Java 2, a new "shades-of-gray" security model has been introduced whereby individual classes can be assigned different levels of trust.

The talk finished with a look at how to protect against malicious code. Three places for stopping mobile code, in order from worst to best, are the firewall, the client, and the browser. Stopping mobile code at the firewall allows for centralized control but is useless when cryptography is employed. Using the client gives you more control over the environment but is difficult to manage, since often there are a huge number of client machines. The browser seems to be the most logical place, since that is where the running and sandboxing actually take place, but is also tough to manage for the same reasons clients are. He also pointed out that finding mobile code is a nontrivial problem. With cryptography and multiple points of entry for mobile code, it can be nearly impossible to track it down until it is too late.

One method of stopping mobile code before it arrives is blacklisting. Though cheap to implement, it is fairly easy to thwart. Further, if it is done dynamically, it can easily lead to a denial-of-service attack as the bad code database is filled up. Another method, stopping errant applets once they start, is much harder in practice than it sounds. In fact, it can be impossible to stop a malicious thread. A thread is stopped by throwing a ThreadDeath exception. If the applet catches and handles this, there is no way to kill the thread. A more devious way to keep the malicious applet running is to put the rerun instruction in the finalizer, which means it will be restarted as a garbage-collector thread. The best approach to stopping a hostile applet, then, is to stop the virtual machine. Finally, policy management can be used, but creating and enforcing a fine-grained policy is very difficult.

 

In summary, McGraw stated some lessons he's learned from dealing with malicious code in the "trenches":

  • Type safety is not enough.
  • Real security is more difficult than it sounds.
  • It is impossible to separate implementation errors from design problems.
  • New features add new holes.
  • Humans are an essential element to consider.

During the question-and-answer period, the topic of remote method invocation came up: how do access checking and transitive trust fit in with RMI? McGraw feels that RMI is problematic, since it's not clear whose access policy should apply. Java servlets also pose a problem, because you can declare yourself a certificate authority. When asked his opinion on the Pentium III processor-ID issue, he said that not only was it a bad idea from a privacy standpoint, but because it didn't use crypto it was essentially useless. Another question was whether net security was being reinvented with the virtual machine, and whether we might not see intrusion-detection systems for VMs. McGraw feels that this might be the case and said that stack inspection, which is used in code verification, is similar to an IDS. He also criticized the "penetrate-and-patch" mentality so prevalent among software vendors today, suggesting that they should do things right the first time.

Design and Integration Principles for Large-Scale Infrastructure Protection
Edward Amoroso, AT&T Labs

Summary by Rik Farrow

Edward Amoroso gave a cynical, interesting, yet rambling talk about the futility of protecting infrastructure. This is not so surprising from a man who worked on the Strategic Defense Initiative ("Star Wars"), which he called a "silly idea, to think you can shoot down ballistic missiles." SDI wasn't possible in the 1980s, and attempts to knock down even the occasional missile have mostly failed in the 1990s as well. In fact, AT&T does better than that.

For example, AT&T profiles the 230 million calls its communications network sees each day. 30,000 people make their first international call each day as well, which creates a "dot" on your account. If there are a number of these calls, you will get a call asking you if you are making these calls, in case someone has misappropriated your phone or calling card. Amoroso noted that the more money you pay, the sooner you get the call.

ID does provide more information to the network manager. But does this really aid in making a network more secure? To start with, you must be able to protect one little node first. If you can protect a single node, does protecting many nodes provide additive protection? Perhaps, said Amoroso. Maybe one police officer on the corner doesn't deter crime, but a whole bunch standing there might.

Then there is the question of whether to use passive network monitoring, as seen in many popular ID products, or active monitoring, where you touch the operating system by installing OS modifications to monitor for intrusions. Amoroso suggested that you put your ID as close to your most important servers/services as possible. He called this the "Corleone Principle," from The Godfather. (You watch those nearest to you very closely indeed.)

A malicious outsider could use any protocol, primarily IP, but also SS7 and C7 (telephone signaling protocols). Gateways (firewalls) include weaknesses in both design and configuration. ID at least gives you a chance to see what is happening in your network. But the firewall is often the least of your problems.

Always keep your war dialer in your back pocket if you do penetration testing, said Amoroso. If you can't find any other way into the target network, modems will always get you in. On the other hand, put a modem on your honeypot so you can detect war dialing. Create some interesting environment for the call, such as the login: prompt.

You can also analyze the output of call-detail logs if you have almost any PBX system and look for war-dialing patterns, such as sequential, short calls from the same originating number. And you probably will not find any commercial product that will do this for you.
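
A hedged sketch of the kind of call-detail-record analysis Amoroso describes might look like the following; the record format, field names, and thresholds here are invented for illustration.

```python
# Flag originating numbers that make runs of short calls to sequential
# destination numbers -- the classic war-dialing signature in PBX
# call-detail logs.

from collections import defaultdict

def war_dial_suspects(records, min_run=5, max_secs=10):
    """records: (origin, destination_number, duration_seconds) tuples."""
    by_origin = defaultdict(list)
    for origin, dest, secs in records:
        if secs <= max_secs:                 # only short calls matter
            by_origin[origin].append(dest)
    suspects = []
    for origin, dests in by_origin.items():
        dests.sort()
        run = 1
        for a, b in zip(dests, dests[1:]):
            run = run + 1 if b == a + 1 else 1
            if run >= min_run:               # sequential destinations found
                suspects.append(origin)
                break
    return suspects

cdr = [("5551000", 5552000 + i, 4) for i in range(8)]   # scanner's calls
cdr += [("5559999", 5550123, 300)]                      # an ordinary call
print(war_dial_suspects(cdr))   # ['5551000']
```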

NetRanger, built by the WheelGroup and now owned by Cisco, was wonderful at one point; now most of the people there have quit. Someone within AT&T bought a copy of NetRanger but did not install it for several months. When they got around to it, they discovered modules missing from the CD that prevented it from working. When they contacted Cisco, Cisco did not know about the problem either—which implies that a lot of purchased copies must never have been installed.

Infrastructure may be defined as a lot of stuff that you don't control. So when you are deploying for infrastructure, you are working with stuff you do not control.

Hierarchical handling of events/alerts makes much more sense than isolated, peer processing. And the only tool Amoroso has ever seen that addresses hierarchical processing is EMERALD. EMERALD handles a lot of sensors of different types. NFR does provide you with a toolkit, so you could potentially do this with it as well.

What about response to an ID event? Suppose you have 80,000 operators that can respond to trouble tickets. Can these operators do stuff related to ID on the basis of the alarms generated? Amoroso's answer is no, unless it is to dial someone else's pager.

Remember that IDSes will not address the one thing that most hackers will do to get into your network—use a modem. Most IDSes address only IP network attacks. So, instead of having one IDS, put them everyplace. If an IDS has fundamental limitations in one place, does putting them everyplace really make things work better? Perhaps the way to find out is to deploy them and see.

Then you wind up with another problem: disparate information, large volumes of information, difficulty in correlation, responses that may be confused, and (currently) no standards for logging. One solution is to centralize all logs in one place. (The Air Force Information Warfare Center has crews looking at logs.)

Good IDSes can correlate events, which requires a cache and a knowledge base (which most IDSes have today). GUIs hide a basic truth: doing ID or network management is hard, and we do not need someone telling us what we need to see. That is not to say visualizing data is useless—scatter plots, for example, can pick out event patterns. GUIs are good for continuity of funding. NFR is the only product with real flexibility; NetRanger only provides the ability to search for string patterns.

Presidential Directive PDD-63, "Protecting America's Critical Infrastructures," calls for important assets to be protected at a macro level. But within the infrastructure, what do you protect? Network control points are essential points in telecom (and thus are natural points to defend). A side note: AT&T handles 60 percent of the telecommunications traffic in the US. To decide what you need to protect, you must identify assets and come up with a coherent architecture.

For ID systems to succeed, they must address some major unsolved problems:

  • connecting alarms to realistic response systems
  • profiling networks and systems reasonably (network managers do more profiling than ID researchers do)
  • correlating information (in-band and out-of-band)
  • integrating incompatible audit logs
  • filtering massive false-alarm streams
  • visualizations for demonstration and analysis

After the talk, someone asked how people get in with modems. Amoroso said that PC Anywhere without passwords makes this simple: Peter Shipley dialed every number in the Bay Area. Emmanuel Goldstein and Phiber Optik have radio shows in NYC with RealAudio archives; listen to what they have to say about modems.

Jim Duncan suggested looking for slopes in scatter plots, or using Cheswick's algorithm to do your scatter plot along with color to help with the discrimination. Amoroso replied that you can try any combination of IP packet values and look for patterns in the scatter plot—if you can build a scatter plot.

Amoroso ended his talk with yet another story. The cable company televising a heavyweight title fight put up a scrambled banner that only people pirating the channel could see, saying to call an 800 number to get a free T-shirt. They got hundreds of calls. An interesting idea for a honeypot, although somewhat useless for most of our networks.

Experiences Learned from Bro
Vern Paxson, Lawrence Berkeley National Labs

Summary by David Klotz

Vern Paxson's talk was essentially a presentation of his paper on Bro, the intrusion-detection system. He first stated the design goals: high-speed and large-volume monitoring capability, no packets dropped during filtering, real-time notification, separation of system design and monitoring policy, extensibility of the system, and ability to handle attacks on the IDS itself. The eventual design was a layered system, with lower layers handling the greatest amount of data and the higher ones doing the most processing.

The heart of Bro is the event engine, which does generic (nonpolicy) analysis. Events are generated by the traffic the event engine sees and are queued to be handled by event handlers. Extending the engine is relatively simple, since it is implemented as a C++ class hierarchy; new classes can be written for new event types. Event handlers, written in the Bro language, are used to implement a security policy. If no handler is defined for a given event type, that event is ignored.
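
The mechanism/policy split can be illustrated with a toy event engine in Python. This is an assumed shape for the sake of the example, not Bro's actual API: the engine only generates and dispatches events, policy lives entirely in the handlers, and events with no registered handler are silently dropped, as in Bro.

```python
# Toy event engine: traffic analysis (mechanism) emits named events;
# security policy is expressed solely as handlers registered for those
# events. Unhandled events are ignored.

class EventEngine:
    def __init__(self):
        self.handlers = {}
        self.queue = []

    def on(self, event_name, handler):
        """Policy layer: register a handler for an event type."""
        self.handlers.setdefault(event_name, []).append(handler)

    def emit(self, event_name, **fields):
        """Mechanism layer: queue an event generated from traffic."""
        self.queue.append((event_name, fields))

    def run(self):
        while self.queue:
            name, fields = self.queue.pop(0)
            for h in self.handlers.get(name, []):   # no handler -> ignored
                h(**fields)

alerts = []
engine = EventEngine()
# Policy: only connection attempts to the telnet port are interesting.
engine.on("connection_attempt",
          lambda src, dport: alerts.append(src) if dport == 23 else None)

engine.emit("connection_attempt", src="10.0.0.1", dport=23)
engine.emit("connection_attempt", src="10.0.0.2", dport=80)
engine.emit("ftp_request", cmd="USER")   # no handler registered: dropped
engine.run()
print(alerts)   # ['10.0.0.1']
```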

The Bro language was designed with the goal in mind of "avoiding simple mistakes." The language is strongly typed, allowing type inconsistencies to be discovered at compile time. One interesting characteristic of the language is the absence of loops. This was done because of the need for speed in processing and the desire to avoid possibly unending procedures. Recursion, however, is allowed.

Two responses were added to Bro recently. One is a reset tool that terminates the local end of a TCP connection. This can be somewhat tricky on TCP-stack implementations that insist on exact sequence numbers; it is done by alternating data and RST packets using a brute-force approach. The second is a drop-connectivity script that talks to a border router and tells it to drop remote traffic from a given site. This script has minimized scans considerably without causing a denial of service up to this point.

After going over Bro, Paxson spent some time explaining why he feels the whole concept of intrusion detection is doomed to failure. Attacks on the monitor, such as overloading it with too much traffic or using software faults to bring the monitor down, can be defended against, but that still leaves the problem of "crud" that looks like an attack but isn't one. Some odd-looking but legitimate traffic includes:

  • storms of FIN or RST packets
  • fragmented packets with the don't-fragment flag set
  • legitimate tiny fragments
  • data that is different when retransmitted

As attack tools get more sophisticated, they will begin to take advantage of the fact that it is often impossible to tell attacks from "crud." Paxson ended his talk by saying that he feels network intrusion detection is a dinosaur, and that host-based intrusion detection is where it's at.

Last changed: 18 Nov. 1999 jr