For 64-byte files, AFPA on Linux was 21% faster (12,522 requests per second) than AFPA on Windows 2000 (10,321 requests per second). In addition to measuring requests per second, we used the Intel processor performance counters to collect several metrics under the same workload. Two metrics differed significantly between AFPA on Linux and AFPA on Windows 2000: the number of instructions executed and the number of instruction TLB misses. Because the average cycles per instruction were nearly identical in both cases, we conclude that instruction count is a useful metric for comparing the two implementations. In addition, both AFPA implementations used exactly the same source code for the HTTP and caching logic. This implies that any differences between the two must be limited to the interfaces used to integrate AFPA into the TCP/IP stack, the TCP/IP stack itself, and the network driver.

AFPA on Linux executed 19% fewer instructions than AFPA on Windows 2000 (26,000 versus 31,000 instructions per request). We also found that the number of instruction TLB misses was ten per request on Windows 2000 versus zero per request on Linux. The Linux kernel, TCP/IP stack, and kernel modules are stored entirely in non-pageable 4 MB pages, so this code does not incur any instruction TLB misses. On Windows 2000, only the kernel is mapped using 4 MB pages; the TCP/IP stack is not.
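The comparison above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes the workload is CPU-bound at the same clock rate on both systems, which the text does not state explicitly. Under that assumption, and with nearly identical cycles per instruction, throughput should scale roughly with the inverse of the instructions executed per request. Note that the per-request counts are the rounded figures quoted above; from them the reduction works out to about 16% (equivalently, Windows 2000 executing about 19% more), so the quoted 19% presumably reflects unrounded counts.

```python
def percent_more(a: float, b: float) -> float:
    """How much larger a is than b, as a percentage of b."""
    return (a / b - 1.0) * 100.0


def percent_fewer(a: float, b: float) -> float:
    """How much smaller a is than b, as a percentage of b."""
    return (1.0 - a / b) * 100.0


# Figures quoted in the text (instruction counts are rounded).
linux_rps = 12_522      # requests per second, AFPA on Linux
windows_rps = 10_321    # requests per second, AFPA on Windows 2000
linux_ipr = 26_000      # instructions per request, Linux
windows_ipr = 31_000    # instructions per request, Windows 2000

# Measured throughput advantage of Linux: roughly 21%.
measured_gain = percent_more(linux_rps, windows_rps)

# Instruction reduction relative to Windows 2000: roughly 16%
# with the rounded counts.
instr_reduction = percent_fewer(linux_ipr, windows_ipr)

# Assumption: CPU-bound, equal CPI and clock rate. Then the
# predicted throughput gain is the instruction ratio: roughly 19%.
predicted_gain = percent_more(windows_ipr, linux_ipr)

print(f"measured throughput gain:  {measured_gain:.1f}%")
print(f"instruction reduction:     {instr_reduction:.1f}%")
print(f"predicted throughput gain: {predicted_gain:.1f}%")
```

The predicted gain from instruction count alone is within a few percentage points of the measured gain, which is consistent with treating instruction count as the primary comparison metric; the remaining gap may plausibly be related to effects such as the instruction TLB misses, though this sketch does not model them.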