Assuming that an external client interacts with a primary site and the primary site implements some higher-level remote mirroring protocol, network-sync enhances that remote mirroring protocol as follows. First, a host located at the primary site submits a write request to a local storage system such as a disk array (e.g. [12]), storage area network (e.g. [19]), or file server (e.g. [31]). The local storage system simultaneously applies the requested operation to its local storage image and uses a reliable transport protocol such as TCP to forward the request to a storage system located at the remote mirror. To implement the network-sync option, an egress router located at the primary site forwards the IP packets associated with the request, sends additional error-correcting packets to an ingress router located at the remote site, and then performs a callback, notifying the local storage system which of the pending updates are now safely in transit. (Egress and ingress routers operate as gateway routers between datacenter and wide-area networks: egress routers send packets from local datacenter networks to the wide-area network, and ingress routers receive packets from the wide-area network and forward them to local datacenter networks. Generally, egress routers also function as ingress routers and vice versa, since they handle duplex traffic.) The local storage system then replies to the requesting host, which can advance to any subsequent dependent operations. We assume that ingress and egress routers are under the control of site operators and can thus be modified to implement network-sync functionality.
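The primary-side sequence described above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not the actual router or storage implementation: the class names `EgressRouter` and `PrimaryStorage`, the `wire` list standing in for the wide-area link, the fixed packet size, and the choice of one repair packet per update are all hypothetical. The sketch also runs the local apply and the forward sequentially, whereas the text describes them as simultaneous.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class EgressRouter:
    """Hypothetical egress router: forwards data, adds repair packets, calls back."""
    repair_per_update: int = 1                      # assumed redundancy level
    wire: List[str] = field(default_factory=list)   # stands in for the WAN link

    def send(self, update_id: int, packets: List[bytes],
             on_in_transit: Callable[[int], None]) -> None:
        # Forward the packets that carry the mirrored request.
        for i, _ in enumerate(packets):
            self.wire.append(f"data:{update_id}:{i}")
        # Send the additional error-correcting packets.
        for r in range(self.repair_per_update):
            self.wire.append(f"repair:{update_id}:{r}")
        # Callback: this update is now "safely in transit".
        on_in_transit(update_id)


@dataclass
class PrimaryStorage:
    """Hypothetical local storage system at the primary site."""
    router: EgressRouter
    image: Dict[str, bytes] = field(default_factory=dict)
    pending: Set[int] = field(default_factory=set)       # awaiting mirror ack
    replied_to_host: List[int] = field(default_factory=list)
    next_id: int = 0

    def write(self, key: str, value: bytes, packet_size: int = 4) -> int:
        update_id = self.next_id
        self.next_id += 1
        self.image[key] = value          # apply to the local storage image
        self.pending.add(update_id)      # keep state until the mirror acknowledges
        packets = [value[i:i + packet_size]
                   for i in range(0, len(value), packet_size)]
        self.router.send(update_id, packets, self._in_transit)
        return update_id

    def _in_transit(self, update_id: int) -> None:
        # The host is unblocked here, before the remote mirror has responded.
        self.replied_to_host.append(update_id)

    def on_mirror_ack(self, update_id: int) -> None:
        # Storage-level acknowledgment arrived: remaining state can be collected.
        self.pending.discard(update_id)
```

In this sketch the host reply is triggered by the router callback rather than by the mirror's acknowledgment, which is the point at which network-sync differs from a fully synchronous scheme; the `pending` set is only cleared once `on_mirror_ack` is called.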
Some time later, perhaps 50ms or so, the remote mirror storage system receives the mirrored request, possibly after the network-sync layer has reconstructed one or more lost packets from the data packets and error-recovery packets that did arrive. It applies the request to its local storage image, generates a storage-level acknowledgment, and sends a response. Finally, when the primary storage system receives the response, perhaps 100ms later, it knows with certainty that the request has been mirrored and can garbage collect any remaining state (e.g. [19]). Notice that if a client requires the stronger assurances of a true remote-sync, that guarantee can be offered selectively, on a per-operation basis. Figure 3 illustrates the network-sync mirroring option and Table 1 contrasts it with existing solutions.
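To make the reconstruction step concrete, the following sketch rebuilds a single lost packet from the surviving data packets and one repair packet. It uses plain XOR parity as a deliberately simplified stand-in; the text does not specify the error-correcting code used by the network-sync layer, and a real deployment would likely use a stronger scheme. The function names and the assumption that all packets in a group have equal length are illustrative only.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_repair(packets: List[bytes]) -> bytes:
    """Sender side: one repair packet = XOR of all data packets in the group.

    Assumes a non-empty group of equal-length packets.
    """
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], repair: bytes) -> List[bytes]:
    """Receiver side: rebuild at most one missing packet (marked as None)."""
    missing = [i for i, p in enumerate(received) if p is None]
    if not missing:
        return list(received)            # nothing was lost
    if len(missing) > 1:
        raise ValueError("single XOR parity can rebuild at most one lost packet")
    present = [p for p in received if p is not None]
    # XOR of the repair packet with every surviving packet yields the lost one.
    rebuilt = reduce(xor_bytes, present, repair)
    out = list(received)
    out[missing[0]] = rebuilt
    return out
```

For example, with `group = [b"AAAA", b"BBBB", b"CCCC"]` and `repair = make_repair(group)`, calling `recover([b"AAAA", None, b"CCCC"], repair)` returns the original group without any retransmission from the primary site.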