
Performance Measurements


Query Description Records/sec KBytes/sec
dd The Unix dd program reading the disk file. N/A 5754
frame1 Count all the records in the frame relay trace and sum the length fields of each record. 19130 5423
frame2 As frame1 but qualify on the message type of the record (the message type is the high four bits of a byte in the header). 19992 5667
frame3 As frame2 but further qualify on two character fields (the control and nlpid fields of the frame relay header). 19992 5667
frame4 As frame3 but qualify on two more single-byte fields, the T1 interface and channel (the recorder has multiple interface boards, each board has multiple T1 lines and each line can have multiple channels). 20021 5675
frame5 Qualify to select only IP packets, then demux by source and destination port and count and sum lengths. 19968 5660
frame6 Qualify to select only IP, then demux by T1 board, interface and channel. Count the packets and sum the lengths on each group. 20021 5675
frame7 Qualify by message type and protocol to get two streams of IP packets in slightly different formats (frame relay can carry IP in several different formats). Project the IP protocol field and packet length from each stream and multiplex together to count the total number of IP packets and bytes (regardless of the format used). 19764 5602
frame8 As frame7 but include the board and channel in the projection and then demux by board and channel to count IP packets on each channel. 19816 5617
atm1 Demux by VCI and count cells in each. 50137 3133
atm2 Demux by VCI/VPI and count cells in each. 45098 2818
atm5 Qualify to select a particular VCI and count cells. 76746 4796
atm6 Qualify to select a particular VCI/VPI and count cells. 77425 4839

Table 1: Performance Results for Queries run on Disk
 

We measured Tribeca on a Sun Sparc 10 performing queries on two data sets. One data set consisted of frame relay traffic (carrying mostly IP), and the other consisted of classical IP-over-ATM [7] traffic. The data were stored on disk (using the standard SunOS UFS filesystem) and on ID-1 tape; the measurements were run on 260 megabyte disk files and a 10 gigabyte tape file. All tests were run in single-user mode to prevent other user activity from affecting the results.

We ran a variety of queries of increasing complexity to measure Tribeca's performance as its tasks became more compute-intensive. The queries exercised most of Tribeca's features. Table 1 describes the queries and gives the measurement results for Tribeca running on 260 megabyte disk files; results are reported as records per second and kilobytes per second. The ATM records are 64 bytes long, while the frame relay records are variable length. As a baseline, we also present the speed at which the Unix dd program can read the disk file. The dd program simply reads the file in blocks whose size is specified by the user (we used the same block size for dd, the stand-alone programs, and Tribeca) and writes the data to the Unix null device (/dev/null).

Tribeca compares favorably to the baseline for all of the frame relay measurements, running only 1.3 to 5.8 percent slower. The slowdown occurs because Tribeca must perform some processing on each record in the file, while dd does not touch the data at all. When processing ATM cells, Tribeca is considerably slower than the baseline: the records in the ATM data set are much smaller than the frame relay records, and the ATM cell record format includes several bit fields that must be reconstructed to perform the query, so Tribeca spends correspondingly more time processing each record. In these tests Tribeca used 70-75% of the workstation's CPU, while dd used about 68%.
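To illustrate the kind of per-cell work this involves, the sketch below decodes the bit fields of a standard ATM UNI cell header in C. The record layout (a UNI header at the start of each 64-byte capture record) and the structure and function names are assumptions made for illustration; they are not taken from Tribeca's actual record definitions.

/*
 * Sketch of the bit-field reconstruction needed per ATM cell record.
 * A standard UNI cell header packs GFC, VPI, VCI, PT and CLP across
 * four bytes, so none of the fields the atm1/atm2 queries group on
 * fall on byte boundaries.  The offset of the header inside the
 * 64-byte capture record is an assumption for illustration.
 */
#include <stdint.h>

struct atm_fields {
    uint8_t  gfc;   /* 4 bits */
    uint16_t vpi;   /* 8 bits at the UNI */
    uint16_t vci;   /* 16 bits */
    uint8_t  pt;    /* 3 bits */
    uint8_t  clp;   /* 1 bit */
};

/* hdr points at the first byte of the 5-byte UNI cell header. */
static struct atm_fields decode_uni_header(const uint8_t *hdr)
{
    struct atm_fields f;

    f.gfc = hdr[0] >> 4;
    f.vpi = (uint16_t)(((hdr[0] & 0x0F) << 4) | (hdr[1] >> 4));
    f.vci = (uint16_t)(((hdr[1] & 0x0F) << 12) |
                       (hdr[2] << 4) |
                       (hdr[3] >> 4));
    f.pt  = (hdr[3] >> 1) & 0x07;
    f.clp = hdr[3] & 0x01;
    return f;
}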


Query   Tribeca Records/sec   Tribeca KBytes/sec   Stand-alone Records/sec   Stand-alone KBytes/sec   Ratio
frame1 19130 5423 20151 5712 0.95
frame2 19992 5667 20312 5757 0.98
frame3 19992 5667 20256 5742 0.99
frame4 20021 5675 20363 5772 0.98

Table 2: Tribeca Compared to Stand-alone Programs. The queries are as described in Table 1. The ``Ratio'' column is the ratio of Tribeca's performance to that of the stand-alone program.
 


Query   Tribeca Records/sec   Tribeca KBytes/sec   Stand-alone Records/sec   Stand-alone KBytes/sec   Ratio
frame1 27638 6427 30296 7045 0.91
frame4 28944 6730 30412 7072 0.95

Table 3: Tribeca Compared to Stand-alone Program on ID-1 Tape. The queries are as described in Table 1. The ``Ratio'' column is the ratio of Tribeca's performance to that of the stand-alone program.
 

We estimated the overhead introduced by Tribeca by comparing its performance on several sample queries (frame1 through frame4) to that of a simple stand-alone C program that performs the same function. Each stand-alone program is hard-coded to perform only the tested query, so it carries none of the overhead required to execute a general-purpose query. Table 2 gives the comparison between Tribeca and the stand-alone programs. The table shows that Tribeca introduces no more than 5% overhead relative to the stand-alone program on the test queries. The stand-alone program used about 5% less CPU time than Tribeca.
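For concreteness, the following is a minimal sketch of what such a hard-coded stand-alone program might look like for a frame1-style query (count every record and sum the length fields). The trace record layout assumed here, a 16-bit big-endian length field at the start of each record followed by that many bytes of captured frame, is hypothetical; the actual trace format is not described in this paper.

/*
 * Minimal sketch of a hard-coded "frame1"-style stand-alone program:
 * read a trace file, walk the variable-length records, count them,
 * and sum their length fields.
 *
 * The record layout (16-bit big-endian length field, then that many
 * bytes of captured frame) is an assumption for illustration only.
 */
#include <stdio.h>

#define BLOCK_SIZE (64 * 1024)   /* illustrative; the paper's block size is not specified here */

int main(int argc, char **argv)
{
    FILE *fp;
    unsigned char hdr[2];
    unsigned long records = 0;
    unsigned long long bytes = 0;

    if (argc != 2) {
        fprintf(stderr, "usage: %s tracefile\n", argv[0]);
        return 1;
    }
    fp = fopen(argv[1], "rb");
    if (fp == NULL) {
        perror(argv[1]);
        return 1;
    }
    setvbuf(fp, NULL, _IOFBF, BLOCK_SIZE);   /* buffered, block-sized reads */

    /* One pass over the trace: read each record header, skip the payload. */
    while (fread(hdr, 1, 2, fp) == 2) {
        unsigned int len = ((unsigned int)hdr[0] << 8) | hdr[1];  /* assumed length field */
        records++;
        bytes += len;
        if (fseek(fp, (long)len, SEEK_CUR) != 0)
            break;
    }
    printf("%lu records, %llu bytes\n", records, bytes);
    fclose(fp);
    return 0;
}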

Finally, Table 3 compares the performance of Tribeca and a stand-alone program on a relatively large (10 gigabyte) data set stored on ID-1 tape. We compared Tribeca to a stand-alone program on only two queries because of the time required to run the tests. In this case Tribeca ran between 5 and 9 percent slower than the stand-alone program. Both Tribeca and the stand-alone programs used about 98% of the CPU in these tests. The CPU utilization is much higher in the tape tests because the HiPPI interface used to connect to the ID-1 tape drive relies on programmed I/O: the device driver must copy each word of data coming from the tape. In these tests, over 95% of the CPU time is system time.

Our measurements demonstrate that Tribeca adds, in the worst case, 9% more processing overhead than a special-purpose program tailored to perform the same query; on most of the tested queries it did even better. This small cost is far outweighed by the flexibility and convenience of changing small, simple queries rather than rewriting C code to perform different analyses.

