11th Systems Administration Conference (LISA '97)
Extensible, Scalable Monitoring for Clusters of Computers
Eric Anderson and Dave Patterson
U. C. Berkeley
Abstract
We describe the CARD (Cluster Administration using Relational
Databases) system for monitoring large clusters of cooperating
computers. CARD scales both in capacity and in visualization to at
least 150 machines, and can in principle scale far beyond that. The
architecture is easily extensible to monitor new cluster software and
hardware. CARD detects and automatically recovers from common
faults. CARD uses a Java applet as its primary interface, allowing
users anywhere in the world to monitor the cluster through their
browser.