Because components are redundant and interchangeable, recovery is fast, simple, and unintrusive: a brick is recovered by rebooting it, with no need to preserve its pre-crash state or to coordinate with other bricks.
As a result, the monitoring system that detects failures can afford to
make mistakes: a false positive merely triggers an unnecessary reboot.
In other systems, by contrast, false positives usually reduce performance,
lower throughput, or cause incorrect behavior. Since false positives are not
a problem in SSM, generic methods such as statistical-anomaly-based failure
detection can be made quite aggressive, to avoid missing real faults.
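
The trade-off can be illustrated with a minimal sketch. The text above does not specify which statistic the detector uses, so this example assumes one possible choice: each brick's request latency is compared with the median of its peers, and the deviation is scaled by the median absolute deviation (MAD). The names `suspect_bricks` and `monitor_step`, the `reboot` callback, and the latency figures are all hypothetical and are not taken from SSM.

```python
from statistics import median


def suspect_bricks(latencies_ms, threshold=3.0):
    """Return the ids of bricks whose latency deviates from the peer median
    by more than `threshold` MADs (hypothetical detector, not SSM's own)."""
    values = list(latencies_ms.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Peers show no spread at all: flag any brick that differs.
        return sorted(b for b, v in latencies_ms.items() if v != med)
    return sorted(
        b for b, v in latencies_ms.items() if abs(v - med) / mad > threshold
    )


def monitor_step(latencies_ms, reboot, threshold=3.0):
    """Reboot every suspect brick. No state is saved and no other brick
    is consulted, so a wrong guess only causes an unneeded reboot."""
    suspects = suspect_bricks(latencies_ms, threshold)
    for brick in suspects:
        reboot(brick)
    return suspects


# Made-up sample: b5 is clearly slow, b3 is only slightly off the median.
sample = {"b1": 10, "b2": 11, "b3": 9, "b4": 12, "b5": 40}

rebooted = []
conservative = monitor_step(sample, rebooted.append, threshold=3.0)
aggressive = monitor_step(sample, rebooted.append, threshold=1.5)
```

With the conservative threshold only `b5` is flagged. Lowering the threshold to 1.5 also flags `b3`, which is probably healthy. Under the recovery model described above, that extra reboot is an acceptable price for a lower chance of missing a real fault.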