Table 1: Query descriptions and Tribeca performance on 260 megabyte disk files.
Query | Description | Records/sec | KBytes/sec
dd | The Unix dd program reading the disk file. | N/A | 5754 |
frame1 | Count all the records in the frame relay trace and sum the length fields of each record. | 19130 | 5423 |
frame2 | As frame1 but qualify on the message type of the record (the message type is the high four bits of a byte in the header). | 19992 | 5667 |
frame3 | As frame2 but further qualify on two character fields (the control and nlpid fields of the frame relay header). | 19992 | 5667 |
frame4 | As frame3 but qualify on two more single-byte fields, the T1 interface and channel (the recorder has multiple interface boards, each board has multiple T1 lines and each line can have multiple channels). | 20021 | 5675 |
frame5 | Qualify to select only IP packets, then demux by source and destination port and count and sum lengths. | 19968 | 5660 |
frame6 | Qualify to select only IP, then demux by T1 board, interface and channel. Count the packets and sum the lengths on each group. | 20021 | 5675 |
frame7 | Qualify by message type and protocol to get two streams of IP packets in slightly different formats (frame relay can carry IP in several different formats). Project the IP protocol field and packet length from each stream and multiplex together to count the total number of IP packets and bytes (regardless of the format used). | 19764 | 5602 |
frame8 | As frame7 but include the board and channel in the projection and then demux by board and channel to count IP packets on each channel. | 19816 | 5617 |
atm1 | Demux by VCI and count cells in each. | 50137 | 3133 |
atm2 | Demux by VCI/VPI and count cells in each. | 45098 | 2818 |
atm5 | Qualify to select a particular VCI and count cells. | 76746 | 4796 |
atm6 | Qualify to select a particular VCI/VPI and count cells. | 77425 | 4839 |
We measured Tribeca on a Sun Sparc 10 performing queries on two datasets. One data set consisted of frame relay traffic (carrying mostly IP), and the other was classical IP-over-ATM [7] traffic. We used data stored on disk (using the standard SunOS UFS filesystem) and ID-1 tape to perform our measurements. The measurements were run on 260 megabyte disk files and a 10 gigabyte tape file. All tests were run in single-user mode to prevent other user activity from affecting the test results.
We ran a variety of queries of increasing complexity to measure Tribeca's performance as its tasks became more compute-intensive. The queries exercised most of Tribeca's features. Table 1 describes the queries and gives the measurement results for Tribeca running on 260 megabyte disk files. In the table the results are given as records per second and kilobytes per second. The ATM records are 64 bytes long, while the frame relay records are variable length. As a baseline, we also present the speed at which the Unix dd program can read the disk file. The dd program simply reads the file in blocks whose size is specified by the user (we used the same block size for dd, the stand-alone programs, and Tribeca) and writes the data to the Unix null device (/dev/null).
Tribeca compares favorably to the baseline on all of the frame relay measurements, running between 1.3 and 5.8 percent slower. The slowdown arises because Tribeca must perform some processing on each record in the file, while dd does not touch the data at all. When processing ATM cells, Tribeca is considerably slower than the baseline: the records in the ATM data set are much smaller than the frame relay records, and the ATM cell record format includes several bit fields that must be reconstructed to perform the query, so Tribeca spends correspondingly more time on each record. In these tests Tribeca used 70-75% of the workstation's CPU while dd used about 68%.
Table 2: Tribeca versus stand-alone C programs on 260 megabyte disk files.
Query | Tribeca Records/sec | Tribeca KBytes/sec | Stand-alone Records/sec | Stand-alone KBytes/sec | Ratio
frame1 | 19130 | 5423 | 20151 | 5712 | 0.95 |
frame2 | 19992 | 5667 | 20312 | 5757 | 0.98 |
frame3 | 19992 | 5667 | 20256 | 5742 | 0.99 |
frame4 | 20021 | 5675 | 20363 | 5772 | 0.98 |
Table 3: Tribeca versus stand-alone C programs on a 10 gigabyte ID-1 tape file.
Query | Tribeca Records/sec | Tribeca KBytes/sec | Stand-alone Records/sec | Stand-alone KBytes/sec | Ratio
frame1 | 27638 | 6427 | 30296 | 7045 | 0.91 |
frame4 | 28944 | 6730 | 30412 | 7072 | 0.95 |
We estimated the overhead introduced by Tribeca by comparing its performance on several sample queries (frame1 through frame4) to that of a simple stand-alone C program performing the same function. The stand-alone program is hard-coded to perform only the tested query, so it carries none of the overhead required to execute a general-purpose query. Table 2 compares Tribeca and the stand-alone programs; it shows that Tribeca introduces no more than 5% overhead relative to the stand-alone program on the test queries. The stand-alone program also used about 5% less CPU time than Tribeca.
Finally, Table 3 compares the performance of Tribeca and a stand-alone program reading a relatively large (10 gigabyte) dataset from ID-1 tape. Because of the time required to run the tests, we compared Tribeca to a stand-alone program on only two queries. In this case Tribeca ran between 5 and 9 percent slower than the stand-alone program. Both Tribeca and the stand-alone programs used about 98% of the CPU in these tests. CPU utilization is much higher in the tape tests because the HiPPI interface used to connect to the ID-1 tape drive uses programmed I/O: the device driver must copy each word of data coming from the tape. For these tests over 95% of the CPU time is system time.
Our measurements demonstrate that, in the worst case, Tribeca adds 9% more processing overhead than a special-purpose program tailored to perform the same query; in most of the tested queries the overhead was considerably lower. This small cost is far outweighed by the flexibility and convenience of editing short, simple queries rather than rewriting C code to perform each new analysis.