
Converting Fault Resilience to Fault Tolerance

Given the definitions above, the client that the user employs to contact the service also forms part of the overall experience. If the client sees the observable failure (for example, an error on transaction commit) but then simply retries the complete transaction itself, which requires it to track the entire transaction, and receives a success message because the service has by then fully recovered, the user's experience is once again seamless.

The moral is that if you control the construction of the client, you can take steps outside the server's high availability environment that drastically improve the user's experience, converting it from one of Fault Resilience (user observes a failure) to Fault Tolerance (user observes no failure).
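
The client-side retry described above can be sketched as follows. This is a minimal illustration, not code from any particular client: the names `TransientServiceError`, `run_with_retry` and `make_flaky_transaction` are hypothetical, and the sketch assumes that a transaction whose commit failed left no partial effects on the service, so that replaying it from the start is safe.

```python
import time


class TransientServiceError(Exception):
    """Hypothetical error: the service failed while the transaction was in flight."""


def run_with_retry(transaction, attempts=3, delay=0.0):
    """Run the whole transaction, replaying it from the start on failure.

    `transaction` is a callable holding every step of the transaction, so
    the client is tracking the entire transaction rather than one request.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return transaction()  # success: the user never sees the earlier failure
        except TransientServiceError as err:
            last_error = err
            if attempt < attempts - 1:
                time.sleep(delay)  # give the service time to recover
    # Retries exhausted: the failure becomes visible to the user after all.
    raise last_error


def make_flaky_transaction(failures):
    """Build a stand-in transaction that fails `failures` times, then commits."""
    calls = {"count": 0}

    def transaction():
        calls["count"] += 1
        if calls["count"] <= failures:
            raise TransientServiceError("commit failed")
        return "committed"

    return transaction, calls


if __name__ == "__main__":
    txn, calls = make_flaky_transaction(failures=1)
    print(run_with_retry(txn), "after", calls["count"], "attempts")
```

The retry budget (`attempts`, `delay`) has to cover the time the service takes to recover; if it does not, the user observes the failure and the experience falls back to Fault Resilience.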



James Bottomley 2004-05-12