
Performance on New Executables

Table 1 displays the results. The data mining algorithm had the higher detection rate, 97.76%, compared with the signature-based method's detection rate of 33.96%. Along with the higher detection rate, the data mining method had a higher overall accuracy, 96.88% vs. 49.31%. Its false positive rate of 6.01%, however, was higher than the signature-based method's 0%.
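The three measures reported above can be derived from confusion-matrix counts. The following is a minimal Python sketch, assuming the usual definitions: detection rate = TP / (TP + FN), false positive rate = FP / (FP + TN), and overall accuracy = (TP + TN) / total. The function name and the example counts are hypothetical and are not the counts behind Table 1.

```python
def summarize(tp, fn, fp, tn):
    """Compute detection rate, false positive rate, and overall accuracy.

    tp: malicious executables flagged as malicious
    fn: malicious executables missed
    fp: benign executables flagged as malicious
    tn: benign executables correctly passed
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one malicious and one benign example")
    detection_rate = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    overall_accuracy = (tp + tn) / (tp + fn + fp + tn)
    return detection_rate, false_positive_rate, overall_accuracy


# Hypothetical counts, chosen only to illustrate the arithmetic.
dr, fpr, acc = summarize(tp=90, fn=10, fp=5, tn=95)
print(f"detection={dr:.2%} false_positive={fpr:.2%} accuracy={acc:.2%}")
```

A method that never flags anything as malicious without a matching signature has FP = 0, which gives a 0% false positive rate regardless of how many malicious programs it misses. This is why detection rate and false positive rate have to be read together.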

Figure 2 plots the detection rate against the false positive rate using Receiver Operating Characteristic (ROC) curves [13]. ROC curves are a way of visualizing the trade-offs between detection and false positive rates. In this instance, the ROC curve shows how the data mining method can be configured for different environments. For a false positive rate less than or equal to 1% the detection rate would be greater than 70%, and for a false positive rate greater than 8% the detection rate would be greater than 99%.
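To illustrate how such a curve yields operating points, the sketch below sweeps a decision threshold over classifier scores and records the resulting (false positive rate, detection rate) pairs. It assumes the classifier emits a numeric score per executable, with higher scores meaning "more likely malicious". The function names, scores, and labels are hypothetical toy values, not data from the experiments.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, detection_rate) pairs, one per threshold.

    labels use 1 for malicious and 0 for benign; an example is flagged
    as malicious when its score is >= the current threshold.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one malicious and one benign example")
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def best_detection_at(points, max_fpr):
    """Highest detection rate among operating points with FPR <= max_fpr."""
    return max(tpr for fpr, tpr in points if fpr <= max_fpr)


# Hypothetical toy scores and labels (1 = malicious, 0 = benign).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
print(best_detection_at(points, max_fpr=0.0))
print(best_detection_at(points, max_fpr=0.5))
```

Choosing a tolerated false positive rate and reading off the best achievable detection rate is the kind of configuration the paragraph above describes: a stricter environment accepts a lower detection rate in exchange for fewer false alarms.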


 
Table 1: The results of testing the algorithms over new examples. Note that the data mining method had a higher detection rate and accuracy, while the signature-based method had the lower false positive rate.

Profile Type          Detection Rate   False Positive Rate   Overall Accuracy
Signature Method      33.96%           0%                    49.31%
Data Mining Method    97.76%           6.01%                 96.88%

 



Matthew G. Schultz
2001-05-01