A wide variety of distributed applications rely on the Berkeley Unix stream sockets model for interprocess communication [18]. The stream socket interface provides a connection-oriented, bidirectional byte-stream abstraction, with well-defined mechanisms for creating and destroying connections and for detecting errors. There are about 20 calls in the API, including send(), recv(), accept(), connect(), bind(), select(), socket(), and close(). Traditionally, stream sockets are implemented on top of TCP so that applications can run across any network that uses the TCP/IP protocols.
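As an illustration of the interface described above, the following minimal sketch exercises socket(), bind(), accept(), connect(), send(), recv(), and close() through Python's standard `socket` module, which mirrors the Berkeley calls. It runs over the kernel's TCP sockets on the loopback interface, not over VMMC; the helper names `echo_once` and `round_trip` are chosen here for illustration only.

```python
import socket
import threading


def echo_once(server_sock):
    """Accept one connection and echo its bytes back until the peer is done."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # empty read: the peer shut down its sending side
                break
            conn.sendall(chunk)


def round_trip(payload: bytes) -> bytes:
    """Send a small payload through a loopback echo server and return the reply.

    Intended for small payloads: the client sends everything before reading.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    worker = threading.Thread(target=echo_once, args=(server,))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))
    client.sendall(payload)
    client.shutdown(socket.SHUT_WR)  # mark the end of the outgoing byte stream

    received = bytearray()
    while True:
        chunk = client.recv(4096)
        if not chunk:
            break
        received.extend(chunk)

    client.close()
    worker.join()
    server.close()
    return bytes(received)
```

The loop around recv() reflects the byte-stream abstraction: the interface preserves byte order but not message boundaries, so a receiver reads until it has the amount it expects or until the peer closes its side.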
Our stream sockets implementation targets system area networks and is built on top of VMMC for PC clusters running Linux.
Our stream sockets implementation makes heavy use of VMMC's transfer redirection mechanism. The basic idea of redirection is to use a default, redirectable buffer when the final receive buffer address is not known by the sender. Redirection is a local operation affecting only the receiving process. The sender is not aware of redirection and always sends data to the default buffer. When the data arrives at the receiving side, the redirection mechanism checks whether a redirection address has been posted. If no redirection address has been posted, the data is placed in the default buffer. Later, when the receiver posts the receive buffer address, the data is copied from the default buffer to the receive buffer. If the receiver posts its buffer address before the message arrives, the message is put into the user buffer directly from the network without any copying.
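The two cases above can be sketched with a toy model of the receiving side. This is a simulation under simplifying assumptions, not the VMMC interface: the class `RedirectingReceiver` and its methods are hypothetical names, each message is assumed to fit in the posted buffer, and an ordinary memory write stands in for the network interface depositing data.

```python
class RedirectingReceiver:
    """Toy model of receive-side redirection (hypothetical names, not the VMMC API)."""

    def __init__(self, default_size: int = 4096):
        self.default_buffer = bytearray(default_size)
        self.buffered = 0    # bytes currently waiting in the default buffer
        self.posted = None   # user buffer posted ahead of arrival, if any
        self.copies = 0      # default-to-user copies performed so far

    def post_receive(self, user_buffer: bytearray) -> int:
        """Post the final receive buffer; return the bytes delivered immediately."""
        if self.buffered:
            # Data arrived first: one extra copy out of the default buffer.
            n = min(self.buffered, len(user_buffer))
            user_buffer[:n] = self.default_buffer[:n]
            rest = self.buffered - n
            # Slide any undelivered bytes to the front of the default buffer.
            self.default_buffer[:rest] = self.default_buffer[n:self.buffered]
            self.buffered = rest
            self.copies += 1
            return n
        # Nothing has arrived yet: remember the address for redirection.
        self.posted = user_buffer
        return 0

    def on_arrival(self, data: bytes) -> str:
        """Model a message arriving from the network; report where it landed."""
        if self.posted is not None:
            if len(data) > len(self.posted):
                raise ValueError("message larger than posted buffer (not modeled)")
            # Redirection address posted: data lands in the user buffer directly.
            self.posted[:len(data)] = data
            self.posted = None
            return "redirected"
        if self.buffered + len(data) > len(self.default_buffer):
            raise BufferError("default buffer overflow")
        # No address posted: data lands in the default buffer.
        end = self.buffered + len(data)
        self.default_buffer[self.buffered:end] = data
        self.buffered = end
        return "default"
```

Note that the sender appears nowhere in the model: only the receiver's state decides where arriving data lands, which is what makes redirection a purely local operation. The `copies` counter increases only when data arrives before the receive buffer is posted.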
VMMC stream sockets performs quite well, with a one-way latency of 20 µs and a peak bandwidth of over 84 Mbytes/s. The bandwidth performance is due, in large part, to redirection. Without redirection, bandwidth would be limited by the system copy bandwidth, because the receiver would need one copy to move incoming data to its final destination.
Because our stream sockets is a user-level library, it does not allow open sockets to be preserved across fork() and exec() calls. With fork(), the problem is arbitrating socket access between the two resulting processes. exec() is difficult because VMMC communicates through memory and exec() allocates a new memory space for the process. This limitation is not due to the sockets implementation; rather, it is a fundamental problem with user-level communication in general.