Before describing what can be done to make appliance performance and configuration problems easier to debug, it is important to understand the nature of the field problems that appliance systems encounter. In this section, we present an overview of the common causes of these problems and try to give the reader a sense of why they are hard to debug.
As mentioned earlier, for concrete illustration we use the example of a file server (filer) appliance. A filer gives client systems access to network-attached disk storage via a variety of distributed file system protocols, such as NFS [27] and CIFS [15]. A useful model is to think of a filer's OS as two high-performance pipes between a system of disks and a system of network interfaces. One pipe carries data from the disks to the network; the other carries the reverse flow. Field problems usually arise when something in the filer or in its environment causes one (or both) of these pipes to perform below expected levels.
The taxonomy of common field problems that we describe below was obtained from a detailed study of the call records in Network Appliance's customer service database. We examined customer cases that were handled from February 1994 through August 1999. From this data, it appears that the three most important causes of field problems are system misconfiguration, inadequate system capacity, and hardware and software faults. The relative proportions of these three problem types are hard to quantify because a large number of customer cases involve more than one subproblem of each type and because the specific mix has varied from month to month and from year to year. Taken together, however, these three problem types account for about 98% of all field problems.