USENIX ;login: - hackers

a hacker's approach to id

Mudge is a consultant and programmer. His paper, co-written with Bruce Schneier, about vulnerabilities in PPTP resulted in a free dinner from Microsoft.

This article is a response to Rik Farrow's question about the L0pht's work on intrusion-detection packages for Network Flight Recorder. In particular he asked how we chose which packets to look at. So I shall attempt to give a brief overview of how a group of hackers — and I use the term in the good sense — goes about approaching network intrusion detection, given the current state of tools and environments.

A little background first (but don't worry, I'll try to make it as painless as possible). For those not familiar with the L0pht, I recommend checking out the Web site, <https://www.L0pht.com>. (Note: that is L-Zero-p-h-t.) What we are, in a nutshell, resembles a marriage between Consumer Reports and public television . . . gone high-tech-security happy. One of the many products/tools/technologies that we played with happened to be Network Flight Recorder's tool of the same name. NFR (<https://www.nfr.net>) was designed to be a black-box recorder of packets going across a network. If you imagine an RMON probe on steroids you have NFR. Learning what types of traffic cross the net, who the heavy talkers are, and where different servers are located, and logging invalid packets (such as those with invalid checksums), are all functions that NFR was designed to be capable of. Some astute readers might notice that nowhere have I mentioned that NFR was designed explicitly to be an intrusion-detection system; it was simply a versatile sniffer with an extensible programming language, called "N-Code," for handling packets. I had downloaded a copy from NFR's Web site and, along with fellow L0pht member Silicosis, whipped up a few simple N-Code modules that would handle some trivial intrusion scenarios and posted them on our Web site for everyone to access free of charge. NFR's president and CEO, Marcus Ranum, saw them and approached us to write a more complete set of filters (we are currently at over 200 checks and less than a third done) under contract for NFR. Since we had toyed around with the notion of writing them for free, we jumped at the opportunity to put some food on the table. Gee, I hope Marcus does not read this magazine. Hi Marcus . . . d'oh!

A Deep Dark Secret

So now you have your crash course background on the L0pht and the tool NFR. Let's move on to current network intrusion-detection systems (IDSes). You have been exposed to all of the sales pitches, marketing literature, and preacher sermons on them. The CEO of the company wants to practice due diligence and comply with industry standards and deploy ID systems to catch all of those malicious crackers that Big Blue shows you in TV commercials daily. Now, as the curtain is slowly being withdrawn, I will tell you the deep dark secret about IDSes: they don't work.

What? Say it ain't so, Mudge! Unfortunately, it is indeed true that network ID systems do not work. At least not as advertised. Maybe it is the overzealous marketing and sales forces. Maybe it is the public's huge desire to have this mythical beast. Whatever it is, do not place all of the blame on the software. Sure, some are better than others (and often price has nothing to do with that) but, if you think about it, we are asking for a solution to an almost impossible problem. At least with today's technology and a realistic budget.

A Tall Order

It would be prudent for me to elaborate on my definition of a network intrusion-detection system. Then I can explain why I, and many other people, believe that the current systems simply cannot work as well as the vendors would like you to believe.

My definition is:

A device that passively monitors all of the traffic on a network, noticing and logging malicious patterns with extremely high accuracy and an extremely low false-positive rate. The device does not degrade network performance, will not miss real attacks, cannot be tricked into logging incorrect data (false alarms), and cannot be disabled through a denial-of-service attack.

This might seem like a tall order to fill, but take a look at the next advertisement you see for one of these devices, or maybe even the recommendations from an auditing company on what is required for due diligence, and see if this is not alluded to.

Now, this is not to say that I do not feel that network IDSes are valuable, just that they are nowhere near the panacea some would have you believe. I firmly believe that the proper use of these systems can help raise the sophistication level or difficulty required to compromise systems on your network. A huge component of proper use is understanding the systems' limitations.

In a nutshell, what one is asking of the system is for it to do the following:

Never drop a packet.
See all of the network traffic.
Understand what the TCP window is for a particular system.
Handle packets out of order.
Handle all of the bizarre aspects of fragmented packets (short fragments, overlapping fragments).
Correctly handle "duplicate" packets. (Which packet did the end system really process? Was the checksum correct? Does that matter?)
Understand exactly how many hops away every destination is to prevent TTL attacks. (Did the end system even see the packet that the IDS just logged, or did it expire in transit?)
Have infinite resources. (If state is being held for a session, what happens when it is kept open indefinitely? If there are 3,000 sessions open and the signature that one is attempting to match is extremely processor intensive, what happens?)
Never "fail open" (which is impossible, since it is a passive device).
Understand how applications handle data. (In a telnet session does the IDS correctly filter out telnet options, or does their inclusion cause the pattern matching to fail? If a client is in character mode, how does the end node handle backspaces? Is rb^Hoot really the same as root?)
Understand IP options.
Etc., etc.
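
To make the telnet item concrete, here is a minimal Python sketch of the normalization an IDS would need before pattern matching. It is an illustration only, not how NFR or any other product does it. The function names are made up for this example, the option handling is simplified (a subnegotiation that contains an escaped IAC can confuse it), and the erase rule assumes the end host treats backspace and DEL as "delete the previous character," which is exactly the kind of assumption an attacker will test.

```python
# Telnet command bytes from RFC 854.
IAC, SB, SE = 255, 250, 240
WILL, WONT, DO, DONT = 251, 252, 253, 254


def strip_telnet_options(data: bytes) -> bytes:
    """Remove IAC command sequences, leaving only the user data."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b != IAC:
            out.append(b)
            i += 1
        elif i + 1 >= len(data):
            break                      # truncated IAC at the end of the buffer
        elif data[i + 1] == IAC:
            out.append(IAC)            # escaped literal 0xFF
            i += 2
        elif data[i + 1] in (WILL, WONT, DO, DONT):
            i += 3                     # IAC <verb> <option>
        elif data[i + 1] == SB:
            # Skip to the IAC SE that closes the subnegotiation (simplified).
            end = data.find(bytes([IAC, SE]), i + 2)
            i = len(data) if end == -1 else end + 2
        else:
            i += 2                     # other two-byte commands
    return bytes(out)


def apply_erase(data: bytes) -> bytes:
    """Apply backspace (0x08) and DEL (0x7f) as a simple line editor would."""
    out = bytearray()
    for b in data:
        if b in (0x08, 0x7F):
            if out:
                out.pop()
        else:
            out.append(b)
    return bytes(out)


def normalize(data: bytes) -> bytes:
    """Hypothetical helper: what the end host plausibly 'saw' the user type."""
    return apply_erase(strip_telnet_options(data))
```

With this in place, both `b"rb\x08oot"` and a `root` split by an embedded `IAC DO ECHO` normalize to `b"root"`, so a signature for `root` is matched against what the end node processed rather than against the raw bytes on the wire.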

A Bare Minimum

None of the network ID systems or environments that I am currently aware of do all of the above. Still, in a fit of insanity we decided that with only a few of the above requirements we could use some common sense and create what we felt was a due-diligence-industry-standard solution. Ewww — don't you just hate "market-ese"! Here is our list of requirements. Feel free to disagree with them, use them as your own, add to them, or whatever.

The system must:

Have an extensible language to allow custom handling, analysis, and manipulation of packets. This must give direct access to any portion of the frame that is requested. In addition, the code must execute quickly, since all processing of packets in a stream is basically an inner loop.
Be able to handle packets arriving out of order (nonsequentially) and order them for the IDS programmer.
Not ignore fragmented packets.
Have a notion of state for stateful sessions such as TCP.
Be able to handle at least a 10Mb/s Ethernet segment running at full bandwidth without dropping packets. (We won't go into the fact that this requires only two systems talking, since heavily populated networks will peak out at roughly 33% due to collisions.)
Validate the checksums of the packets it sees.
Run on a system that is capable of being secured and remotely managed in an encrypted fashion.
Be extensible, programmable, extensible, programmable, ad infinitum.
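
As one small illustration of the checksum requirement, the sketch below verifies an IPv4 header checksum using the one's-complement sum described in RFC 1071. The function names are my own for this example and the code looks only at the IP header; a real engine would also have to validate TCP and UDP checksums (which involve a pseudo-header) and decide what to do with packets that fail.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, complemented (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header_ok(packet: bytes) -> bool:
    """Return True if the IPv4 header checksum verifies."""
    if len(packet) < 20:
        return False
    ihl = (packet[0] & 0x0F) * 4             # header length in bytes
    if ihl < 20 or len(packet) < ihl:
        return False
    # Summing a valid header with its checksum field included yields 0.
    return internet_checksum(packet[:ihl]) == 0
```

A packet that fails this test will most likely be discarded by the end system, so an IDS that matches signatures against it anyway is logging something the target never processed.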

I strongly believe that the above is a bare minimum that all customers need to demand from their network IDS vendors. If your current product does not do the above, beat on your vendor for them to add it. You are not only helping yourself but enabling your vendor to supply a better product.

Our choice ended up being NFR running on OpenBSD managed through SSH tunnels or SSL'd Web connections. One of the best things that we felt NFR had going for it, other than providing access to the source code (which was a big plus), was that the main engine seemed much more like an IP stack than some of the competitors at the time. This was most likely due to the product not initially being designed explicitly for IDS but as a programmable network monitoring and logging tool. All of this by no means implies that this is the only decent solution or even the best solution. There are far too many variables in people's needs for me to assume that. It fits some of our needs; maybe it will fit yours too — YMMV.

Two of our most rewarding common-sense approaches to optimizing and writing the filters are based on the following:

Model as many things as possible into state engines.
Flag anomalies according to RFC specs.

By modeling sessions into state engines, we gained twice: increased performance and minimized false positives. Take SMTP (sendmail) or NNTP as examples. Both have very similar state engines that can be overlaid on them. There are command, header, and data sections. Command sections would be VRFY, MAIL, HELO, POST, IHAVE, etc. Here is where you would look for the infamous 8.6.12 syslog overflow attacks, WIZ/DEBUG attacks (it is a shame that because of marketing, IDSes still spend cycles looking for this), mailing to programs, and on and on. One should not waste precious cycles scanning the data portion of a message for keywords that matter only in the command section. Likewise, one should not flag a false positive every time the data content of some mail from a programming-language-related mailing list contains the word DEBUG.

   [ command ]
   data
   [ header ]
   <blank>
   [ data section ]
   .
   [ command state again ]

By following the above flow between states, an optimized filter can be created that spends cycles looking for particular attacks only where they will actually be found.
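
The flow above can be sketched in a few lines of Python. This is an illustration of the idea, not the N-Code we wrote for NFR: the two signatures are deliberately tiny stand-ins, the names are hypothetical, and the sketch watches only the client side of the conversation, assuming the server accepted the DATA command.

```python
import re
from enum import Enum, auto


class State(Enum):
    COMMAND = auto()
    HEADER = auto()
    BODY = auto()


# Hypothetical, deliberately small signature sets keyed by state.
SIGNATURES = {
    State.COMMAND: [
        ("WIZ/DEBUG command", re.compile(rb"^(WIZ|DEBUG)\b", re.I)),
        ("mail to program", re.compile(rb"^RCPT TO:\s*<?\s*\|", re.I)),
    ],
    State.HEADER: [],
    State.BODY: [],
}


def scan_smtp(lines):
    """Return (line_number, state_name, alert) for client-to-server lines."""
    state = State.COMMAND
    alerts = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip(b"\r\n")
        # Only the signatures that belong to the current state are tried.
        for name, pattern in SIGNATURES[state]:
            if pattern.search(line):
                alerts.append((number, state.name, name))
        # State transitions: command -> header -> body -> command.
        if state is State.COMMAND:
            if line.upper() == b"DATA":
                state = State.HEADER
        elif line == b".":
            state = State.COMMAND
        elif state is State.HEADER and line == b"":
            state = State.BODY
    return alerts
```

Fed a session in which the word DEBUG appears in a Subject header, in the message body, and then as a command after the terminating dot, this reports only the last one. The header and body lines cost nothing beyond the state check, which is where both the speed and the drop in false positives come from.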

With flagging anomalies according to RFC specifications, I am alluding largely to your friend and mine, the common buffer overflow. In addition to having a list of known buffer overflows and the signatures for them, it is prudent to look for new ones that you are not aware of yet. This is accomplished in many cases by watching for lengths that exceed protocol specifications. If a particular protocol states that in command state, no line will be more than 256 bytes including the command and whitespace — does it not seem wise to log the 4096-character-long IHAVE command that just went by? There are a surprising number of places where this works quite well.
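
A length check like this takes very little code. The Python sketch below flags any line longer than a configurable limit; the default of 256 simply mirrors the hypothetical figure in the paragraph above, so substitute whatever the RFC for your protocol actually specifies. The function name is made up for this example, the length is counted without the line terminator, and in practice the check would run only while the session is in command state.

```python
def overlong_lines(stream: bytes, limit: int = 256):
    """Return (offset, length, first_token) for each line longer than limit."""
    findings = []
    offset = 0
    for raw in stream.split(b"\n"):
        line = raw.rstrip(b"\r")
        if len(line) > limit:
            # Keep a short prefix of the command so the log entry is readable.
            token = line.split(None, 1)[0][:16] if line.strip() else b""
            findings.append((offset, len(line), token))
        offset += len(raw) + 1               # +1 for the newline we split on
    return findings
```

No signature for a specific exploit is involved, which is the point: the 4096-character IHAVE gets logged whether or not anyone has seen that particular overflow before.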

So while we still have our doubts as to the future of network IDSes, we have no questions about how to get around them. But if you can drop the noise level down and catch the common script kids and crackers, that leaves more time to play the game with the truly brilliant ones. Chances are you will learn a lot more this way, and isn't that so much more interesting?

The excellent paper by Tim Newsham and Thomas Ptacek, "Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection" <https://www.nai.com/products/security/advisory/papers/ids-html/doc000.asp>, is absolutely required reading for those wishing to explore the shortcomings of network IDSes.