Availability Benchmarking Environment
Fault workload
- Must accurately reflect failure modes of real-world Internet service environments
  - plus random tests to increase coverage and simulate Heisenbugs
- But no public failure dataset exists
  - we have to collect this data ourselves
  - a challenge, because such data is proprietary
  - a major contribution will be to collect, anonymize, and publish a modern set of failure data
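
The fault workload described above (failure modes drawn from observed data, plus random tests for coverage) could be sketched as follows. This is only an illustration under stated assumptions: the function name, the `random_fraction` parameter, and the fault-type labels are hypothetical, and the weights in the usage note are placeholders, not measured failure data.

```python
import random


def build_fault_workload(observed_weights, all_fault_types, n_faults,
                         random_fraction=0.2, seed=0):
    """Return a list of (source, fault_type) pairs.

    observed_weights: dict mapping fault type -> relative frequency
        (assumed to come from a collected failure dataset).
    all_fault_types: every fault the harness can inject; sampled uniformly
        for the random portion of the workload.
    random_fraction: hypothetical knob for the share of purely random faults.
    """
    rng = random.Random(seed)  # seeded so a benchmark run is repeatable
    modes = list(observed_weights)
    weights = [observed_weights[m] for m in modes]
    workload = []
    for _ in range(n_faults):
        if rng.random() < random_fraction:
            # Random test: widens coverage beyond the observed failure modes.
            workload.append(("random", rng.choice(all_fault_types)))
        else:
            # Realistic fault: drawn in proportion to its observed frequency.
            fault = rng.choices(modes, weights=weights, k=1)[0]
            workload.append(("observed", fault))
    return workload
```

For example, `build_fault_workload({"disk_error": 3, "operator_error": 5}, ["disk_error", "operator_error", "net_partition"], 100)` would yield 100 faults, with the two weights standing in for frequencies that the collected dataset would supply.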
Fault injection harness
- Build it into the system: it is needed anyway for online verification
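
One way to picture a harness built into the system is a set of named injection points that do nothing unless a fault is armed. The sketch below is an assumption about what such a design might look like, not the project's actual harness; `FaultInjector`, `read_record`, and the point name `"storage.read"` are all hypothetical.

```python
class FaultInjector:
    """Registry of named injection points; each is a no-op unless armed."""

    def __init__(self):
        self._armed = {}  # injection point name -> exception to raise

    def arm(self, point, exc):
        """Make the named point raise `exc` every time it is reached."""
        self._armed[point] = exc

    def disarm(self, point):
        """Return the named point to normal (no-op) behavior."""
        self._armed.pop(point, None)

    def check(self, point):
        """Called by system code at an injection point."""
        exc = self._armed.get(point)
        if exc is not None:
            raise exc


injector = FaultInjector()


def read_record(store, key):
    # Hypothetical system code path with a built-in injection point.
    injector.check("storage.read")
    return store[key]
```

A benchmark driver, or an online verification run, would call `injector.arm("storage.read", IOError("injected"))`, exercise the system, and then call `injector.disarm("storage.read")`. Because the hook lives in the production code path, the same mechanism serves both uses.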