Very Large Scale Cooperative Experiments in Emulab-Derived Systems
Keith Sklower Anthony D. Joseph
University of California, Berkeley
{sklower, adj}@cs.berkeley.edu
Abstract
Cyber-defense research has been severely limited by the lack of an
experimental infrastructure for testing new theories and new
technologies in realistic scenarios. Most current testbeds are
small-scale, supporting only small numbers of machines. The cyber
DEfense Technology Experimental Research (DETER) testbed provides a
medium-scale test environment with more than 300 nodes. However, there
is increasing interest in running experiments at very large scale with
more than 1,000 nodes.
This paper describes how experiments can be federated across existing
small- and medium-scale testbeds that run the University of Utah's
Emulab software, such as the DETER testbed, to enable the running of
massive-scale experiments. We describe the Emulab software and the
DETER testbed and we detail the necessary steps for running a
federated experiment. We provide a status update on our progress and
discuss how a manually configured proof-of-concept experiment could be
performed.
1 Introduction
Cyber-defense research has been severely limited by the lack of an
experimental infrastructure for testing new theories and new
technologies in realistic scenarios. It is both unclear and unproven
whether technologies tested on small, subnet-sized topologies modeled by a
few machines will scale up to realistic Internet-scale environments.
To perform detailed emulation and analysis of the behaviors of large
systems under attack (e.g., the Internet or large enterprise
networks), significant numbers of computers are required. As a step
towards addressing this need, the cyber DEfense Technology
Experimental Research (DETER) testbed [1,2],
which currently contains more than 300 nodes, provides an intermediate
point in this scaling range that has turned out to be a very useful
scale for many experiments.
The DETER testbed is open, free, shared infrastructure designed to
support research and education in cybersecurity. The testbed supports
medium-scale repeatable experiments in computer security, especially
those experiments that involve malicious code or cannot be performed
in the Internet because of traffic volumes or the risk of escape. The
DETER testbed provides a unique experimentation facility where
academic, industrial, and government researchers can safely analyze
attacks and develop mitigation and confinement strategies in areas
such as Distributed Denial of Service defense, virus
propagation, and routing security. In addition, the testbed provides
tools and resources to enable repeatable scientific experiment
methodologies, allowing researchers to validate their own theories,
simulations, and emulations, while also enabling different researchers
to reproduce and analyze the same experiments.
The DETER testbed is controlled by a version of Utah's Emulab
software [3] configured and extended to provide stronger
assurances for isolation and containment. With its strong security,
containment, and usage policies, the testbed fills a role that is
currently not met by any of the other large-scale testbeds, such as
PlanetLab and Emulab. Remote experimenters can allocate large numbers
of nodes in arbitrary combinations, link them with nearly-arbitrary
topologies, load arbitrary code for routing, traffic generation and
shaping, defense mechanisms, and measurement tools, and execute their
experiments. The Emulab software provides sharing of testbed
resources among multiple concurrent experiments when enough nodes are
available.
Even though the DETER testbed is already capable of enabling
researchers to run medium-scale experiments, more nodes are needed,
both to enable larger experiments and to handle more simultaneous
users. For example, an early DETER experiment on worm propagation
dynamics could (just) be squeezed into the then-available 72 nodes,
but 100 nodes would have simplified the experiment and increased its
generality. Not all of the demand comes from researchers performing
very large-scale experiments, however. One group of researchers used
all of the testbed's nodes to perform fine-grained analysis of
enterprise networks, complete with actual machines on individual
subnets. Having additional testbed nodes available would have enabled
them to analyze a larger enterprise network.
Given the significant researcher interest in being able to run
large-scale experiments, our goal is to build a large-scale testbed
facility capable of running these experiments. However,
we face significant limitations in available power and cooling
resources, and on the maximum floor weight loading in our machine
rooms. Thus, our initial solution has been to build the testbed as two
tightly coupled clusters running as a single logical administrative
domain and interconnected via the California Research and Education
Network's High Performance Research network (CALREN HPR). One cluster
is located at the University of California, Berkeley (UCB) and the
other is located at the University of Southern California's
Information Sciences Institute (USC/ISI) (see Figure 2).
Figure 1: The Emulab Architecture.
Using this solution, one potential way of building a larger-scale
testbed would be to tightly couple together additional testbeds.
However, this approach is only a partial solution, as we face the
additional limitation that the current instantiation of Emulab on
DETER has problems with loading experiments using 200 or more physical
nodes.
Instead, we propose an alternate approach as a step toward the
federation of multiple, independent Emulab testbeds. We define
federation as the cross-granting of experimenter access and usage
rights between separately administered testbed facilities. Effectively,
federation enables experimenters to run experiments that span multiple
separate testbeds, without the testbeds having to operate under a
single administrative domain. Each participating testbed can make its
own operational policy decisions and choices and decide whether to
admit a new federated experiment or not.
We plan to combine existing Emulab mechanisms for resource reservation
and delegation with extensions that we have developed, to tightly
couple simultaneous experiments without immediately having to
solve the (difficult) issues of tightly coupling the testbeds themselves.
The control information that is passed using these mechanisms, along
with the rights it confers and its limitations, is more than a
mere implementation detail. Careful examination of this approach
will provide us with useful insights about how to dynamically federate
multiple testbeds on an ongoing basis.
The rest of the paper is organized as follows: in Section 2, we
provide a background discussion of the Emulab and DETER architectures
and firewalled experiments; in Section 3, we present our idea for a
prototype federation model and explain the challenges and potential
solutions; in Section 4, we discuss our experiences with building the
prototype solution; and in Sections 5 and 6, we acknowledge our
sponsors and discuss our conclusions.
2 Background
2.1 Emulab Architecture
Figure 2: DETER Testbed Architecture.
The basic Emulab architecture consists of a set of experiment nodes, a
set of switches that interconnect the nodes, and two control nodes,
Boss and Users (see Figure 1). The switch interconnections are
physically separated into a dedicated control network and an experiment network
for user-specified topologies. Experiment nodes may be servers,
personal computers, sensor motes, routers, or other devices, such as
Intrusion Detection Systems, Field Programmable Gate Arrays, etc. Each
experiment node has two or more network interfaces, one of which is
connected to the dedicated control network. The other interfaces are
connected to the experiment network.
The Boss server controls the testbed's operation including the ability
to power cycle individual experiment nodes, while researchers log into
the Users server to create and manage experiments and to store the
data required or generated by their experiments. The testbed's
switches are controlled using snmpit, a program that provides a
high-level object interface to the individual SNMP MIBs of testbed
switches. Other programs talk to the power controllers to power cycle
nodes, load operating systems onto experiment nodes when requested,
and interact with the database to reserve and assign nodes to
experiments.
An Emulab experiment consists of a collection of nodes, software to
run on the nodes, and an interconnection topology. An experiment is
specified using a combination of a .ns file and a web
interface. The Emulab control software on the Boss server enables
multiple, separate experiments to be simultaneously run on the
testbed. The software isolates experiments by assigning each
experiment to one or more unique Virtual Local Area Networks (VLANs)
that connect together the experimental interfaces on each experiment
node, using either bandwidth-limited, lossy emulated links or LANs.
The use of separate VLANs isolates each experiment's traffic from the
others'. To prevent one experiment's network
traffic from interfering with that of other experiments because of
insufficient internal switch or inter-switch bandwidth, the
assign program is responsible for mapping an experiment's link
bandwidth requirements onto the available switch resources in a manner
that ensures that the experiment's bandwidth demands match available
inter- and intra-switch bandwidths. Note that unless an experiment is
firewalled (as described in Section 2.3), all of the control network
interfaces are on the same VLAN.
The Emulab process of swapping in a new experiment consists of several
steps: mapping the researcher's desired network topology onto
available nodes and switch resources, configuring VLANs on the
switches to connect the experiment nodes into the researcher's desired
network topology, installing an initial minifs kernel and root
filesystem onto the experiment nodes, and then loading and running the
desired operating system and software.
In the Emulab trust and privilege hierarchy model, each researcher is
a separate user of the testbed. Users working together are
grouped into groups, and a project consists of a
collection of related groups. Users may also belong to more than one
project. Each testbed has its own complete (and independent)
trust and privilege hierarchy.
2.2 The DETER testbed
To address some of the challenges of building a large testbed, we
created the DETER testbed by grafting together two Emulab testbeds
(clusters) in a tightly coupled manner (see
Figure 2)1. The cluster interconnection consists of three network
switches connected together by two IPsec tunnels, each carrying entire
Ethernet frames including IEEE 802.1Q VLAN tags. The interconnection
of the two clusters' control "planes" is provided by a virtual wire
tunneling Ethernet frames between ports on the two switches Bfoundry1
(at UCB) and Foundry4 (at USC/ISI). The interconnection between the
two experimental "planes" is between ports on Bfoundry1 and Cisco4.
Each of those switches is connected to other switches, so the large
yellow rectangles at the bottom of Figure 2 are not
single switches but collections of them.
The two clusters share a common trust structure, with periodic (daily)
replication of the Boss and Users filesystems from the USC/ISI cluster
to the UCB cluster.
The two clusters use a quasi-static assignment
policy for allocating nodes and other resources between them. The
serial servers can each only be connected to one of the Boss servers
at a time. Both Boss servers are connected to all the switches, but
only one of them is responsible for creating and managing VLANs at any
given time.
The Emulab process of installing a 3 MByte minifs kernel and root
filesystem for a new experiment requires a TFTP transfer and takes
approximately six minutes when swapping in across the link between the
two clusters. By using a local TFTP server, we are able to reduce this
time to two seconds.
2.3 Firewalled Experiments
Figure 3: Firewalled Experiment.
Figure 4: Firewalled Emulab-in-Emulab Experiment.
Because DETER experiments may involve risky code, such as
self-propagating worms and viruses, experiments must be isolated from
external networks, such as the Internet. To provide strong isolation,
our approach to enabling federated experiments leverages Emulab's
support for firewalled experiments, which enables an experiment to be
wrapped inside a controlled boundary (see
Figure 3). The firewall is implemented by a smart (layer 2)
bridge between the testbed's control VLAN and a newly created control
VLAN containing the control network interfaces of the PCs in the
experiment. Firewalled experiments are created using a .ns file
option and ipfw rules. It is possible to model enterprise
networks with multiple firewalls by creating multiple firewalled
experiments on a testbed.
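As a sketch of how such an experiment is declared, the fragment below
adds a Firewall object to a .ns file; the directive names and the
ipfw rule are drawn from our recollection of the Emulab firewall
interface and should be treated as illustrative rather than
authoritative:

    # Sketch: wrap the experiment's control network behind an ipfw firewall.
    set fw [new Firewall $ns]
    $fw set-type ipfw2-vlan          ;# layer-2 bridging firewall on the control VLAN
    $fw set-style closed             ;# deny by default
    # Illustrative rule: allow inbound SSH to the experiment nodes.
    $fw add-rule "allow tcp from any to any 22"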
2.4 Firewalled Emulab-in-Emulab Experiments
At a high level, the Emulab-in-Emulab mechanism lets a researcher
reserve a group of experiment nodes and grants the researcher the right to
dynamically change the nodes' network topology. More specifically, the
mechanism works by extracting a subset of the Emulab database and
instantiating it on inner Emulab Boss and Users servers created out
of two experiment nodes (see Figure 4). The remaining
nodes are available for use by the experiment. The researcher is
granted administrator rights on the inner Emulab testbed and a login
on the inner Boss (i.e., they can become root). The
researcher's SSL certificate is used for XML-RPC calls from the inner Boss
to the external (real) Boss to request VLAN (re)configuration of any
node's experimental interfaces and power cycling. The inner Users
and Boss servers insulate the external (host) testbed from the trust
and privilege structure in the inner testbed, and to a certain extent
from the exact version of Emulab running inside (to the extent that one
can devise scripts to upgrade/downgrade the schema of the database
subset transmitted from the outside testbed to the inner one). As long
as the SSH keys are the same, it will still be possible to run
experiments, even if we have different users, groups, and projects in
the inner and outer testbeds.
Figure 5: "Half" of a Prototype Federation Experiment.
From a federation standpoint, a significant advantage of being able
to support different versions of Emulab inside and outside is that it
would not be necessary to run the same version of Emulab on different
federated testbeds. Finally, the existing firewall mechanisms should
provide the same isolation for risky experiments as is currently
provided in the DETER testbed when it is connected to the Internet.
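For reference, a .ns fragment requesting a firewalled Emulab-in-Emulab
experiment might look like the sketch below; the tb-elab-in-elab
directive name is an assumption from our recollection of the Emulab NS
extensions and should be checked against the software version running
at a given site:

    # Sketch only: the directive name below is an assumption from memory.
    set ns [new Simulator]
    source tb_compat.tcl

    # Request an inner Emulab (Boss and Users built from experiment nodes),
    # firewalled from the hosting testbed as described in Section 2.3.
    tb-elab-in-elab 1

    $ns run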
3 A Federation Prototype
Our federated experiment prototype is based on the idea of connecting
together independent Emulab testbeds by using a modified version of
firewalled Emulab-in-Emulab functionality to instantiate subsets of
the experiment within each testbed (see Figure 5).
This model of operation loosely couples the testbeds together
for the purpose of running a large-scale experiment.
In the rest of this section, we first describe how a federated
experiment would ideally be performed, and then explore several
challenges and potential solutions. We also discuss several hard
problems that we have not yet addressed in our federation prototype.
The process of executing a federated experiment proceeds as follows:
- First, instantiate simultaneous firewalled Emulab-in-Emulab
(elab-in-elab) experiments at multiple testbed facilities.
- Next, co-opt the inner Users and Boss nodes:
- Designate one set of nodes as the master nodes for the
experiment.
- Ignore User ID assignments and permissions at all nodes,
except for the master nodes.
- Next, "implode" the (inner) databases to extract a description of
the nodes.
- Run Emulab's assign process on the entire assemblage of
nodes, then separate out all of the resulting database state,
distribute it to each local site's Boss server, and have each Boss
server merge it back in.
- Have each inner Boss server request instantiation of the
topology at that site.
- Have each site report back the assigned VLAN tag numbers.
- Distribute the necessary disk images from the master to each
site's Boss server via scp and then have each local Boss
server load the operating systems on its local nodes.
- Construct IPsec tunnels between the firewalled experiments that
translate the VLAN tags appropriately (the mapping involved is
sketched after this list). Kevin Lahey at ISI has implemented two
independent techniques for doing this: one using the Click router (at
our suggestion) and the other using the netgraph mechanism in FreeBSD 6.
- Finally, the experiment runs...
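To illustrate the bookkeeping behind the tag-rewriting step, the
sketch below shows the kind of mapping a tunnel endpoint must apply:
the same logical link receives a different, locally chosen VLAN tag at
each site, and frames crossing the inter-site tunnel have their tags
translated. The link names and tag values are hypothetical; in
practice the rewriting happens in the data path, in Click or FreeBSD
netgraph elements.

    # Hypothetical per-site tag assignments for two federated links.
    array set ucb_tag {lan0 110 lan1 111}    ;# tags chosen by the UCB cluster
    array set isi_tag {lan0 347 lan1 348}    ;# tags chosen by the USC/ISI cluster

    # Translation applied at the UCB tunnel endpoint for frames headed to USC/ISI.
    proc rewrite_to_isi {link} {
        global ucb_tag isi_tag
        return "rewrite tag $ucb_tag($link) -> $isi_tag($link)"
    }

    puts [rewrite_to_isi lan0]    ;# rewrite tag 110 -> 347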
3.1 Challenges and Potential Solutions
Running federated experiments in the wide area introduces several new
challenges, some of which we have already encountered in connecting
the USC/ISI and UCB clusters. We describe some of these challenges
and potential solutions below.
- The UDP-based services that Emulab depends on for its operation
(e.g., DHCP, bootinfo, TFTP, and NFS) might not work in the wide
area; even multicast in the wide area has already been problematic
for us. The solution to this problem is straightforward: each local
Boss and Users server provides these services locally. Our idea for
DNS is that the /etc/resolv.conf file has a search directive
listing all the federated experiment suffixes, and each local Boss
has an "NS" reference to the master site for all the other
experiment suffixes. For example, it is already the case that
boss.elabelab.DETER.emulab.net is a legitimate, resolvable domain
name, and thus we could replace "boss" with any other virtual node
name in its portion of a federated experiment.
- Collisions in the IP space for unroutable control interfaces could
occur. As long as the local (inner) Boss server has sole
responsibility for DHCP responses to its nodes and it can reach its
outer Boss server, there should be no problems with temporarily
renumbering the control net. Note that we have not yet encountered this
problem.
- Collisions in the name space of nodes, node_types, OS IDs, and
image IDs could occur. One solution would be to append the testbed's
domain name to each identifier (e.g.,
pc3000@isi.deterlab.net); however, the length of the resulting names
might be an issue. An alternative would be to have a table in the
database that maps from short prepended identifiers to testbeds (see
the sketch after this list). For example,
- ut<anything> maps to emulab.net
- wi<anything> maps to wail.wisc.edu
- cu<anything> maps to cornell.edu
- vb<anything> maps to vanderbilt.edu
- isi<anything> maps to isi.deterlab.net
- ucb<anything> maps to ucb.deterlab.net
- Operating system images for nodes at different sites might not
be compatible. This is already an open issue for the existing Emulab
testbeds as new types of nodes are added. One potential solution
would be to create universal system images that include drivers for
a broad set of hardware types. However, differences between nodes
may still be a problem (e.g., different mappings from the BIOS
to COM ports, an issue we previously encountered).
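Returning to the naming question above, a hypothetical sketch of the
prefix-table approach is shown below; the prefixes, helper name, and
qualification rule are made up for illustration:

    # Hypothetical table mapping short site prefixes to testbed domains.
    array set site_of {
        ut   emulab.net
        wi   wail.wisc.edu
        isi  isi.deterlab.net
        ucb  ucb.deterlab.net
    }

    # Expand a prefixed identifier such as "isi-pc3000" into a qualified name.
    proc qualify {ident} {
        global site_of
        foreach prefix [array names site_of] {
            if {[string match ${prefix}* $ident]} {
                return "${ident}@$site_of($prefix)"
            }
        }
        return $ident    ;# no known prefix: a purely local identifier
    }

    puts [qualify isi-pc3000]    ;# isi-pc3000@isi.deterlab.net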
3.2 Hard Problems Not Addressed in an Initial Prototype
While we are confident that we have viable solutions for the problems
discussed in the previous section, there are several hard problems
that we have not yet addressed in our initial prototype, including
synchronizing the swap-in of multiple experiments at multiple sites,
the requirement for accounts at all sites, and complex permissions and
trust management requirements.
The first major problem is the simple requirement to schedule a
major fraction of the nodes at the participating sites so that they
are all simultaneously available. This is, in and of itself, quite a
challenge given the competition for resources at key participating
sites.
The second problem, swapping in a single experiment of a thousand
nodes among several federated sites in a truly automated way, would
require the synchronization of VLAN assignments across all the
sites. Given the current Emulab software, one process must survey all
of the VLAN tags in use at that moment in all of the switches, and
then compute which VLAN numbers are available. Furthermore, it cannot
allow any other swap-in at any other site to construct any other VLAN
until all of the VLANs are instantiated in every switch at all of the
sites. Obviously, this requirement introduces the potential for
deadlock or significant delays if sites are slow in responding or
fail.
Our strategy of rewriting the VLAN tags allows each site to
construct its VLANs separately and mitigates the problem, but there is
still a requirement for synchronization after that is done.
The Utah Emulab staff has proposed altering the snmpit software
so that all VLAN assignments are stored in the database and the tags
are computed from it at swap-in time, which would additionally permit
VLAN construction to proceed simultaneously in all switches.
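A minimal sketch of the proposed computation, assuming the set of
reserved tags can simply be read from the database rather than
discovered by an SNMP sweep of every switch (the function name and tag
range are our own illustration):

    # Hypothetical: pick the lowest VLAN tag not present in the reserved list,
    # which under the proposal would come from the testbed database.
    proc next_free_tag {reserved {low 256} {high 999}} {
        for {set tag $low} {$tag <= $high} {incr tag} {
            if {[lsearch -exact $reserved $tag] < 0} {
                return $tag
            }
        }
        error "no free VLAN tags in range $low-$high"
    }

    puts [next_free_tag {256 257 300}]    ;# 258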
The current federation model requires that a researcher have
accounts at all of the participating sites. To address this and the
related problem of permissions and trust management, the developers at
Utah have suggested that permissions and trust management could be
pushed down a level. For example, suppose there is a DETER project at Utah's
Emulab (www.emulab.net). Then, projects at www.isi.deterlab.net might
turn into groups within the DETER project at www.emulab.net. Thus, we
could both avoid the problem of requiring accounts at all sites by
using a single account, and address the permissions and trust
management issue through delegation back to the originating site (and
that site's account on the federated testbed). There is still the
policy issue of defining which remote testbed's users would be allowed
to access a local testbed's resources.
4 Our Experiences and Status
In this section we provide an update on our efforts to build a working
federation prototype and discuss some of the experiences and lessons
learned.
We have implemented support for steps 1 through 3 (see Section 3):
site prefixing, inner database implosion, the running of assign
on the assemblage of nodes, redistribution of the resulting database
state to the remote sites, identification of cross-campus links, and
mediation of differing software levels and trust structures between
campuses (e.g., running an instance of Emulab-in-Emulab at
DETER where the inner testbed software is within a couple of weeks of
what is currently running at Utah, and the outer testbed structure is
10 months older than that). These changes amounted to about 700
lines distributed among a dozen files.
We have succeeded in getting assign to process a single
.ns file describing nodes on two campuses, and have verified
that we have a sufficiently complete list of the tables to be
subsetted from the combined database and sent back to each federated
site to reflect its share of the experiment (step 4). We have already
made the modifications to the swap-in process to enable the rest of
the activity that occurs after the assignment process, and tested it
with a manual swap-in of two halves of an experiment at Berkeley and
USC/ISI.
An issue that needs to be addressed in the future is that assign
uses statically allocated arrays for some characteristics, such as
node_type. The limits are unlikely to be reached in federating two or
three large sites (e.g., DETER, Utah, and Vanderbilt). A
bigger question is the computational complexity of the assignment
algorithm and whether it will succeed for 1,000 nodes.
For steps 5 and 6, the Utah Emulab staff has already adopted earlier
minor changes we proposed to the VLAN control privileges granted to
elab-in-elab experiments so that the inner testbed can request two
additional services from the outer Emulab: placing experimental
interfaces in trunked mode (something that non-elab-in-elab
experiments can already do), and retrieving the list of actual VLAN
tags in use at each site so that the tags can be rewritten by the
inter-site firewalls. The earlier changes required about 500 lines of
new or changed code in 5 files.
The processes of assigning VLANs, loading operating systems (step 7),
and replacing the trust structure at the satellite sites are all
working now. We are, at the time of publication of this paper,
continuing to resolve some minor details in the conjoining of the
inner control networks (step 8) and the running of the Emulab
"events" system.
4.1 A Manual Federation Experiment
While we have made significant progress towards the ultimate goal of
automated federation of experiments, we are currently at the stage
where manual intervention is still required. The necessary technology
is in place to allow us to use manual configuration and commands to
demonstrate 1,000 experiment nodes interacting on a distributed
security simulation.
More specifically, at each of the participating sites we would
instantiate separate experiments with separate .ns files and then
manually configure tunneling of the constructed VLANs. The tunneling would require
rewriting of the actual VLAN tags using one of the solutions we
described above.
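For one site's half, the .ns file might look like the following
sketch, in which an ordinary node (here called gw, a name of our
choosing) sits on the local LAN and is later connected, by hand, to
the tunneled VLANs carrying the other site's traffic:

    # Sketch of one site's share of a manually federated experiment.
    set ns [new Simulator]
    source tb_compat.tcl

    # Local worker nodes for this site's portion of the topology.
    set n0 [$ns node]
    set n1 [$ns node]
    set n2 [$ns node]
    tb-set-node-os $n0 FBSD-STD    ;# illustrative OS identifier
    tb-set-node-os $n1 FBSD-STD
    tb-set-node-os $n2 FBSD-STD

    # gw is an ordinary experiment node here; operators later trunk its
    # interface and add the tunneled VLANs to reach the peer site.
    set gw [$ns node]

    set lan0 [$ns make-lan "$n0 $n1 $n2 $gw" 100Mb 0ms]

    $ns rtproto Static
    $ns run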
If the experiment also requires that the control interfaces in each
testbed talk with each other, then it would be necessary to tunnel the
control networks together (since the control network addresses are
private and unroutable), and it would be prudent to place each
participating group in a firewalled experiment to contain the control
network traffic. Placing each federated group in a firewalled
elab-in-elab experiment would allow very stringent firewall rules,
such as allowing only SSH and XML-RPC traffic from the outside to the
federated experiment.
This manual process should be possible now; however, it would
require operators at each site to use the snmpit command to
place a normally experimental network interface on a node with
external Internet access into trunked mode, and then add all the VLANs
to be tunneled onto that interface. It would also require punching a
hole in the firewall rules to permit UDP traffic between the
participating sites.
Kevin Lahey at ISI is currently pursuing a small scale demonstration
of this manual approach with the WAIL group at Wisconsin; however, he
has encountered two problems: Wisconsin is currently running an older
version of Emulab that does not have the necessary Emulab-in-Emulab
features, and the Wisconsin firewall is blocking the UDP ports
necessary for tunneling.
5 Acknowledgments
This research was supported by funding from the United States National
Science Foundation (NSF), the United States Department of Homeland
Security (DHS), and Space and Naval Warfare Systems Center, San Diego,
under contract numbers ANI-0335298 (DETER), CNS-0454381 (DECCOR), and
N66001-07-C-2001 (DIPLOMAT). Juniper Networks and Hewlett-Packard (HP)
donated equipment used by the DETER testbed. Donations were also
received from Sun Microsystems and Dell through their University
Discount programs.
Opinions, findings, conclusions and recommendations expressed in this
paper are those of the authors and do not necessarily reflect the
views of the NSF, DHS, the Space and Naval Warfare Systems Center, San
Diego, Juniper Networks, HP, Sun Microsystems, or Dell. Figures and
descriptions are provided by the authors and are used with permission.
6 Conclusion
The growing interest in large-scale testing of cybersecurity
applications is leading to increasing demand for large
testbeds. However, a large testbed requires substantial power and
cooling resources from a site and imposes a significant amount of
weight loading.
As an alternative to a single large testbed, we have presented
techniques for running massive experiments across cooperating
Emulab-derived testbed facilities. The experience gained will help us
understand the operational and administrative issues with federating
testbeds.
We discussed the specific steps, several challenges with known
solutions, and some open challenges. We also provided a status update
on our progress, and outlined a proof-of-concept experiment that uses
manual configuration to demonstrate the feasibility of our approach.
References
- [1]
-
Bajcsy, R., et al.
Cyber defense technology networking and evaluation.
Communications of the ACM 47, 3 (March 2004), 58-61.
- [2]
-
Benzel, T., Braden, B., Kim, D., Neuman, C., Joseph, A. D., Sklower, K.,
Ostrenga, R., and Schwab, S.
Experience with DETER: A testbed for security research.
In 2nd International IEEE/Create-Net Conference on Testbeds and
Research Infrastructures for the Development of Networks and Communities
(March 2006), Barcelona, Spain.
- [3]
-
White, B., Lepreau, J., Stoller, L., Ricci, R., Guruprasad, S., Newbold,
M., Hibler, M., Barb, C., and Joglekar, A.
An integrated experimental environment for distributed systems and
networks.
In 5th Symposium on Operating System Design and Implementation
(OSDI 2002) (December 2002).
Footnotes:
1. A more detailed description of the
DETER testbed can be found in another paper accepted to this
workshop.