Bootstrapping an Infrastructure
Steve Traugott - Sterling Software and NASA Ames
Research Center
Joel Huddleston - Level 3 Communications
Abstract
When deploying and administering systems infrastructures, it is
still common to think in terms of individual machines rather than to
view an entire infrastructure as a combined whole. This standard
practice creates many problems, including labor-intensive
administration, high cost of ownership, and a shortage of generally
available knowledge or code for administering large infrastructures.
The model we describe treats an infrastructure as a single large distributed
virtual machine. We found that this model allowed us to approach the
problems of large infrastructures more effectively. This model was
developed during the course of four years of mission-critical rollouts
and administration of global financial trading floors. The typical
infrastructure size was 300-1000 machines, but the principles apply
equally well to much smaller environments. Added together, these
infrastructures totaled about 15,000 hosts. Further refinements have
been added since then, based on experiences at NASA Ames.
The methodologies described here use UNIX and its variants as the example
operating system. We have found that the principles apply equally
well, and are as sorely needed, in managing infrastructures based on
other operating systems.
This paper is a living document: revisions and additions are
expected and are available at https://www.infrastructures.org. We also
maintain a mailing list for discussion of infrastructure design and
implementation issues; details are available on the web site.