LISA '06 Paper
Managing Large Networks of Virtual Machines
Kyrre M Begnum - Oslo University College, Norway
Pp. 205-214 of the Proceedings of LISA '06:
20th Large Installation System Administration Conference (Washington, DC:
USENIX Association, December 3-8, 2006).
Abstract
As the number of available virtualization tools and their
popularity continues to grow, the way in which virtual machines can be
managed in a data center is becoming more and more important. A few
commercial tools such as VMware ESX and XenEnterprise exist, but they
are limited to a certain virtual machine technology and offer no way
to expand the tool's capabilities to local needs. This paper discusses
an open source management tool, MLN, for large virtual networks that works
independently of their virtualization platform. The current version
supports the two popular open source virtual machine packages: Xen and
User Mode Linux. MLN uses an extensible configuration language for the
design of the virtual machines and their internal configuration. Large
groups of virtual machines can be managed as logical groups. We
present a web-server hosting scenario and an on-demand render farm as
case studies to show the usefulness of our tool. The text concludes
with a short discussion on the difficulties of offering abstraction to
virtualization platforms.
Introduction
With the growing number of virtualization platforms, the system
administrator faces significant challenges in getting them to work
together. Most of the platforms that are popular today have their own
areas of application. For example, User-Mode Linux does not require
root access to install and can mount folders on the physical host as
partitions. Xen has impressive performance and is easy to connect to
the LAN. VMware offers graphical front-ends for VM creation and
management.
Consider a data center that hosts virtual machines for third
parties or a university IT department that provides virtual
laboratories for their students.
Ideally, one should be able to choose the virtualization platform
that suits the task best. However, the management interface of the
platform is currently so tightly connected to the given product that
it is impossible to make a decision about the platform without also
considering how the virtual machine should be managed.
Although virtualization platforms usually offer a simple
management interface, they are often difficult for non-system
administrators, such as teachers, to use. Further, these tools offer
limited support for specifying attributes inside the virtual machine,
such as users or network setup. Once the virtual machine is running,
the system still needs to be configured further by hand, a process that
is known to be error prone.
Furthermore, if a user wants to manage many virtual machines, she
needs to be able to group virtual machines together in a meaningful
way and to manage virtual machines spread out on several servers. The
free management tools scale badly because they are aimed at running a
few virtual machines on a single host. To retain control over multiple
virtual machine hosting servers, she must buy a far more expensive and
advanced tool. The question of how to manage virtual
machines on a large scale is therefore a matter of cost and tool design
rather than of platform and capability.
Managing VMs also inevitably overlaps with specifying and
implementing their configuration. For example, a system administrator
is given the task to design and configure a cluster of 50 nodes in a
render farm running entirely on virtual machines. A minimal version of
the same cluster also has to be built from the same specification for
testing. Similarly, a web-hosting company wants to consolidate several
customers onto the same physical network using groups of virtual
machines that belong to each customer. How can we minimize the drain
on the system administrator's time resulting from VM administration and
configuration? How easy is it to migrate virtual machines between
different platforms and/or to migrate them between servers? In
general, how can they focus entirely on using the virtual machine
rather than on its implementation and configuration?
To summarize, these are the challenges virtual machine
administrators face today:
-
Virtual machine management software supports only a single virtual
machine platform.
-
Free management tools are often intended for running only a few virtual
machines, with no support for grouping or for designing larger networks.
-
Commercial tools offer better support for larger networks but are
proprietary and impossible to modify.
-
Host configuration is usually not a part of the management software.
From the field of configuration management we know many tools that
group unrelated configuration properties into an abstract configuration
language thereby enabling the user to address the whole computer
system or network from a standard interface [1, 2]. They provide an
abstraction layer so that the user can focus on the intended policy
without knowing the details of the platform or the necessary steps
required to achieve it. Based on the popularity and effectiveness of
this approach, we present an open source tool, MLN (Manage Large
Networks), that takes a similar tactic to virtual machine
configuration and management.
MLN allows the administrator to design, create and manage whole
networks of virtual machines and their internal configuration in a
template-based fashion. Two popular open source virtualization
platforms, Xen and User-Mode Linux, are currently supported. MLN
supports an expandable configuration language using plug-ins that
allows the data center administrator to include local configurations
easily. An MLN network daemon allows virtual networks to be spread
and managed across several physical servers.
MLN is freely available today and has been used successfully as a
tool for virtual student laboratories. In this paper we show how the
tool can offer significant improvements to the management of virtual
machines, regardless of their underlying platform, in real-life data
centers.
This paper is organized as follows: the next section provides some
brief background information. We then outline the main features of the
configuration language. The two subsequent sections showcase the tool
in real-world contexts: a web-hosting scenario and as a
management interface for an on-demand render farm. We then present the
interaction and control commands of MLN. Finally, we
evaluate the tool's current capabilities and present directions for
future work.
Background
The diversity of today's virtualization platforms makes them
suitable for a wide range of tasks. We see growing interest in
virtual machines in many areas:
-
Consolidation and Commercialization. Virtualized hosting
and service encapsulation are perhaps the most attractive uses today, as
they are commonly associated with cost savings, flexibility, uptime and
security.
-
Research. A virtual machine is more adaptable in terms
of assigned hardware resources and fits well into self-management
scenarios [3, 4].
-
Testing. Creating test-beds for services and software is
also becoming more common.
-
Education. Advanced student labs can be implemented with
less cost and more flexibility using virtual laboratories [5].
User-Mode Linux [6] is a version of the GNU/Linux kernel that can
be run as an application on a running Linux system. The User-Mode
Linux kernel is started on the command line, and host parameters such
as memory size and filesystem image are supplied as arguments. Folders
can be mounted as partitions, and one does not require root access to
start virtual machine instances. User-Mode Linux is considered to be
lightweight in terms of resources and easy to install. A switch
emulator (uml_switch) is supplied as a tool and
enables the user to create network topologies entirely in user space.
User-Mode Linux does not offer any higher level configuration tools,
although several third-party software projects exist today (with
varying degrees of progress) [7].
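As an illustration, the following is only a sketch of how a switch and a single User-Mode Linux instance might be started by hand; the filesystem image name and socket path are placeholders:

# start a user-space switch listening on a UNIX socket
uml_switch -unix /tmp/switch0.sock &

# boot a User-Mode Linux kernel as a normal process, giving it a root
# filesystem image, 128 MB of memory and a network interface attached
# to the switch above
./linux ubd0=root_fs.img mem=128M eth0=daemon,,unix,/tmp/switch0.sock

MLN's generated start scripts take care of such command lines on the user's behalf.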
Xen is a virtual machine monitor that uses the concept of
paravirtualization [8, 9] to enable several operating system
instances to run concurrently on a thin management layer called a
``hypervisor.'' Xen virtual machines (called domains) have low
overhead and are considered to be almost as fast as if the operating
system were running directly on the hardware. Xen installation and
management requires root access. An attractive feature of Xen is the
ability to migrate running virtual instances seamlessly across
physical servers without down-time.
Connecting Xen virtual machines together in networks is done using
bridge devices on the physical server. A bridge device functions the
same way as an Ethernet switch and can either provide isolated
internal networks on the server or bridge the physical network, making
the virtual machines appear to be regular hosts on the LAN. A Xen
domain is defined in a configuration file that addresses virtual
machine features such as memory, disk image and simple network
parameters. A Xen daemon (xend) is responsible for
managing the domains. A tool called xm will create a
single virtual machine based on the supplied domain configuration
file. A commercial tool called XenEnterprise is available for purchase,
offering additional server management and resource control features [9].
From the available information on the XenSource site at the time of
this writing it is difficult to assess the management and design
capabilities of XenEnterprise.
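By way of comparison, a single Xen domain is normally described by hand in a small configuration file and created with xm. The following is only an illustrative sketch; the kernel path, image location and bridge name are assumptions:

# /etc/xen/vm1.cfg -- a minimal domain description
kernel = "/boot/vmlinuz-2.6-xenU"
memory = 128
name   = "vm1"
disk   = ['file:/var/xen/vm1.img,sda1,w']
root   = "/dev/sda1 ro"
vif    = ['bridge=xenbr0']

The domain would then be started with xm create /etc/xen/vm1.cfg. As the later examples show, MLN hides this level of detail behind its own project descriptions.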
VMware [10] is a well-known player in the virtual machine industry.
For brevity, we will consider the freely available tools currently
offered by VMware and how they fit into our approach. VMware offers a
free product called ``VMware Server,'' which has both a web and
graphical application interface for managing virtual machines even on
remote servers. The software offers an easy way to create a single new
virtual machine but has no way to define groups of virtual machines.
Also, since every new virtual machine is created graphically, it
becomes cumbersome if a user wants to design a large network of, say, 50
virtual machines spread out over 15 physical servers and make sure
they have a consistent configuration. A simpler tool, VMware
Player, offers a quick way for users to run single pre-configured
virtual machines. VMware also has a group of products aimed at hosting
scenarios, but since they are not freely available, they are not
considered in this text.
MLN: A Management Tool for Virtual Machines
MLN (Manage Large Networks) [11] was first used in 2004 as a tool
for providing a virtual firewall lab running User-Mode Linux for
students [5]. It has since then been expanded to support Xen as a
virtualization platform and to include a plug-in framework. MLN can be
downloaded from https://mln.sourceforge.net and has its own installer,
which also downloads and installs a version of User-Mode Linux. Xen
must be installed separately.
The MLN configuration language contains both system variables and
grouping mechanisms. In MLN, a logical group of virtual machines is
defined as a project. A file in the MLN configuration language
will typically define one project.
Defining Projects
The structure of the language is a hierarchical sequence of blocks
containing keyword/value pairs. A block is enclosed in curly brackets
({ }). A keyword is generally expressed in the form keyword
value, but is not bound to that form; some keywords take
several parameters on one line. It is often natural to place one keyword/value pair per
line, but semicolons can be used to place several pairs on the same
line.
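As a small illustration (the keywords shown all appear in later examples), the following two host blocks are equivalent; the first places one keyword/value pair per line, the second uses semicolons:

host node1 {
   memory 128M
   free_space 1000M
   term screen
}
host node2 {
   memory 128M; free_space 1000M; term screen
}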
Each host and network switch has its own block. Not all hosts in a
project have to be connected to the same network.
A project has no restrictions regarding the number of hosts or
switches. The only mandatory part of a project description is a block
of global definitions with at least the name of the project. Project
names must be unique for each user. Definitions of one or more hosts
and perhaps network switches constitute the network topology. Hosts
can have several network interfaces that can be assigned to switches
in arbitrary topologies.
Here is an example of a ring-topology:
global {
project ring
}
host router1 {
network eth0 {
switch A
}
network eth1 {
switch B
}
}
host router2 {
network eth0 {
switch B
}
network eth1 {
switch C
}
}
host router3 {
network eth0 {
switch C
}
network eth1 {
switch A
}
}
switch A { }
switch B { }
switch C { }
This is a simple but complete MLN project. In later examples we will
show how configurations such as network addresses, users and startup
commands are included.
Features for Larger Projects
Language features such as superclasses and variables are helpful
when the project is big. We will review these two features next.
Superclasses are a concept from object-oriented programming: a
superclass is a class of which another class is a subclass (i.e., a
parent). In MLN, a superclass is a description of a virtual machine
from which other virtual machines can inherit. A superclass
virtual machine will not itself be built by MLN. Its most common use is to
define a configuration that is to be kept constant and let a group of
hosts point to it.
In the example below, the virtual machine node1
inherits all the keywords from the superclass common.
It also specifies additional keywords, such as the network interface
address. Notice that hosts are free to override keywords from a
superclass.
superclass common {
memory 128M
free_space 1000M
xen
network eth0 {
netmask 255.255.255.0
}
}
host node1 {
superclass common
network eth0 {
address 10.0.0.1
}
}
Hierarchies of superclasses can be constructed. Hosts only inherit
from a single superclass (or superclass hierarchy).
MLN supports string variables in its syntax. This enables the user
to keep information consistent across keywords. Consider the following
example:
global {
project example1
$password = 2mf9fmcaioa8w
}
host node {
root_password $password
users {
jack $password
}
}
The variable $password is defined in the global block
and used later inside a host. Variables have scope: when a
variable is used, MLN looks for its value upwards through the block
structure and lastly inside the global block. Variables can also be
expanded inside strings if the variable name is enclosed in brackets,
as in $[name].
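For example, a variable can be expanded inside a path string in a files block; this sketch follows the pattern used in the hosting case study later in the paper:

global {
   project example2
   $cust = acme
}
host web {
   files {
      /customers/$[cust]/www
      /var/www
   }
}

Here, $[cust] expands to acme inside the first path.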
Virtual Appliances
Every virtual machine has its own filesystem. One approach to
virtual machine management is to create new machines that boot into an
installer the first time and install a new version of an operating
system. Another approach is to use ready-made filesystems that already have
software installed and configured.
MLN supports a repository of such filesystems, called
templates, from which the user can choose. Templates vary
in size based on the amount of installed software they contain. A
virtual machine based on such a template is the basis for what
is called a virtual appliance [12], which can be specialized to
perform specific tasks (as the later examples will show). The
encapsulation of software components in this way has the benefit that
experts can put together and properly configure software, and enables
users to hit the ground running with a working virtual machine. For
example, in educational contexts, this allows students to focus on
using a software tool without having to learn how to install and
configure it first.
A variety of templates can be downloaded from the MLN web-site.
Users and system administrators can also modify existing virtual
appliances as well as create new ones.
Plug-ins
It is not the intent of this tool to re-invent configuration
management paradigms in its own language. The plug-in architecture is
a way to allow other configuration management tools to be integrated
as easily as possible with MLN.
An executed plug-in can do two kinds of things: access the entire
MLN data tree and change the project before it is built, or configure
virtual machine filesystems during the build process. Plug-ins do not
need to write their own parsing code.
In the following example, we want to build a project where the
virtual machines use the configuration management tool
cfengine [1] for internal maintenance. The template
used in this project already has the cfengine software installed, but
for flexibility, we want to be able to define the cfagent policy
inside the MLN project as a block inside a host or superclass. The
cfagent policy should be written to a file /cfengine/inputs/
cfagent.conf when the project is built. Here is a MLN project
with an embedded cfengine policy:
superclass common {
template ubuntu-server-cfengine.ext3
cfagent {
control:
any::
actionsequence =
(
shellcommands
processes
)
}
}
host agent {
cfagent {
shellcommands:
"/usr/bin/updatedb"
processes:
"cron" signal=hup
}
}
A plug-in, written in the Perl programming language, that writes the
cfagent policy specified above into a file does not have to be more
than the following code:
sub cfenginePlugin_configure {
    # called with the name of the host being built
    my $hostname = $_[0];
    # only act if the host has a cfagent block defined
    if ( getScalar("/host/$hostname/cfagent") ) {
        my @cfagent_policy =
            getArray("/host/$hostname/cfagent");
        # write the policy into the virtual machine's filesystem
        writeToFile($hostname,
            "/cfengine/inputs/cfagent.conf",
            @cfagent_policy);
    }
}
1;
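For the host agent above, and assuming that the superclass's cfagent block is merged into the host's data tree before the plug-in runs, the resulting /cfengine/inputs/cfagent.conf would contain roughly the following:

control:
   any::
      actionsequence = ( shellcommands processes )
shellcommands:
   "/usr/bin/updatedb"
processes:
   "cron" signal=hup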
The benefit of this approach is that it is easy to combine MLN
with tools that the community is experienced with and that can handle
long-term management of the host while it is running. Many system
administrators have established policies which can be integrated this way
using a small amount of code, removing the task of adding the policy
manually to each virtual machine. We will see an example later where a
plug-in is used to modify the data structure and not the filesystem.
Distributed Projects
Until now, the examples have all been on the same server. MLN also
provides a network daemon for distribution of projects among several
physical servers. In MLN, a physical server is called a
service_host because it provides a hosting service to
the virtual machines. A physical host must be told that it is a
service host in its local MLN configuration files.
A virtual machine is assigned to a service host in the following way:
host startfish {
service_host huldra.iu.hio.no
memory 96M
network eth0 {
address dhcp
}
}
Project Organization
All the files belonging to a project are stored in a dedicated
project folder. The contents of each project folder are the start and
stop scripts for the network switches and each VM, together with its
filesystem image (unless it is placed in an LVM partition). Starting
and stopping a project will result in the corresponding scripts being
called. At the time of writing, UML offers no way to interact with it
other than through the command line. Xen, on the other
hand, is working on an RPC-based approach to VM management, which MLN
might in time be able to interact with.
Case Study 1: Virtual Web Hosting
In the following example, we show a configuration for a data center
that provides virtualized hosting in the form of virtual
sites. Customers can deploy a gateway and a set of servers on a
back-net for their services. A typical example would be a web service
with redundant load balanced servers. For simplicity, we omit another
tier of database servers.
The physical layout is set up with a single gateway server and a
back-net of hosting nodes. The gateway server will host all the
virtualized gateway machines. It is possible to physically mirror the
gateway server also. A single customer may span one or several of the
back-end nodes. Back-end servers may contain one or more virtualized
machines from several customers. The customers can choose the number of
web-servers they want to deploy based on the expected load on their
web-sites. See Figure 1 for an example setup.
Figure 1: A web-hosting scenario where virtual machines are
encapsulations for customer services. Two customers are accommodated
in this setup: ``abc toys'' with three web-servers and a gateway and
``kafe on-the-corner'' with two web-servers and a gateway. In MLN they
are represented as the two projects abc and
kafe.
Every server runs the MLN daemon. Each virtual site is represented
as one MLN project. The virtual machines in the project are spread
across the servers using the service_host keyword. The
project is built across all the servers that host a node from that
project.
global {
$cust_name = kafe
$default_gateway = 10.0.0.141
project $cust_name
}
# general settings
superclass common {
xen
lvm
root_password *********
free_space 1000M
memory 128M
term screen
template ubuntu-server.ext3
network eth0 {
bridge back-net
netmask 255.255.255.0
}
files {
/customers/$[cust_name]/www
/var/www
}
}
host gw {
superclass common
service_host gateway1
memory 256M
network eth1 {
address 128.39.73.101
netmask 255.255.255.0
gateway 128.39.73.1
}
network eth0 {
address $default_gateway
}
startup {
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
}
}
host www1 {
superclass common
service_host backend3
network eth0 {
address 10.0.0.142
gateway $default_gateway
}
}
host www2 {
superclass common
service_host backend4
network eth0 {
address 10.0.0.143
gateway $default_gateway
}
}
... continues to node N ...
The example above shows three virtual machines connected together
where the host gw has an extra network interface to
the outside. The superclass common defines common
keywords for all the virtual machines. In one case, the memory keyword
is overridden locally by the gateway host. A variable is used to keep
track of the gateway address on the back-net. This way we make sure
all the back-end nodes point to the correct address and that the
gateway actually has that address. The keywords xen
and lvm enable the virtual machines to run on the Xen
platform and to put their filesystems in LVM partitions for maximum
performance. The files block defines what files should
be copied into the filesystems at build time. In this case, it is the
source files for the web-servers.
The project is spread over three physical servers. The following
command can be used to build the project:
mln build -f kafe.mln
MLN will attempt to contact the daemons on all the involved servers
except the one where the build command is launched. Once the project is built,
it can be started using the following command:
mln start -p kafe
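Once started, the status command described later in this paper can be used to verify that the switches and virtual machines of the project are actually running:
mln status -p kafe -u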
A remaining task is to configure the virtual site into its intended
product. Generally, this might be done by the owner of the project.
Auto-configuration using promise theory and roles on virtualized sites
like this was explored in a separate paper [13]. The hosting company
can also offer dynamic configurations, where the number of web-servers
is adjusted to the real-time loads on the web-site.
Case Study 2: An On-demand Render Farm
Managing a cluster for rendering can be expensive for small
companies, as one needs to provide space and hardware for it. In
addition, hardware performance increases each year, quickly making a local
render farm outdated unless one has extra room for expansion.
In this scenario, we consider a small animation company that does
not support their own render farm. Instead, they contract with a data
center to provide virtual machine hosting for a render farm of virtual
machines that the animation company manages themselves. The cost model
is pay-per-use, so the animation company only has costs when they are
doing actual work, something they consider an advantage. The data
center can rent out their servers to other customers as well and may
even run several customer networks at the same time. Moreover, since
the data-center is likely to upgrade their servers on a regular basis,
they are more likely to attract customers of this kind who will get
better performance over time with no extra costs.
The way this is realized with MLN is that the administrator at the
animation company has a small, local test-bed on a single machine
running a light-weight virtualization platform. There, the template
for the render nodes and the master is maintained. Once the animation
company has a new contract, a render farm is deployed from these
templates. A contract with the data center is made with regard to the
number of virtual machines to deploy and their resources.
To ease the design of the MLN project for the render farm we
introduce a plug-in, autoenum, that enumerates the render nodes
for us. Here is a project for one master and 50 render nodes:
global {
project renderfarm_customerX
# the following block contains the
# configuration for the autoenum
# plug-in
autoenum {
superclass render_node
addresses enum
addresses_begin 2
numhosts 50
network 10.0.0.0
service_hosts {
#include /tmp/servers.txt
}
}
$gateway_address = 10.0.0.1
}
superclass common {
term screen
xen
lvm
}
superclass render_node {
superclass common
template renderNode.ext3
free_space 1500M
memory 256M
network eth0 {
netmask 255.255.255.0
gateway $gateway_address
}
}
host master {
superclass common
template renderMaster.ext3
free_space 5GB
memory 512M
network eth0 {
netmask 255.255.255.0
address $gateway_address
bridge cluster-network
}
network eth1 {
netmask 255.255.255.0
address 128.39.73.102
gateway 128.39.73.1
}
}
The entire render farm of 51 virtual machines is specified in only
45 lines of code. Actually, the render farm could be increased to 254
without increasing the complexity of the project. We see the use of
two superclasses, common and
render_node. The platform specific details are all in
the first superclass. Xen is chosen as the virtualization platform
with LVM partitions for their hard-disks. A simple test bed of only a
few nodes using User-Mode Linux that runs on a laptop could be
realized with only minor changes to the file above. This way, the
administrator from the animation company could make sure the software
on the two templates works as he intends before the full-blown cluster
is created and charges start accumulating.
The autoenum block in the beginning of the project
sets flags for the plug-in. This plug-in has a different purpose
from the one shown earlier in the Plug-ins section. Upon parsing, the plug-in
will fetch the information from the autoenum block and use it to
create the rest of the virtual machines that make up the cluster. This is
done by adding them to the data structure before the project is built.
This plug-in is therefore not something that expands the configuration
of each virtual machine filesystem, but adds design features and logic
to MLN based on local needs.
The list of servers where the nodes are spread out is written in a
separate file. The #include statement is used to read
in the contents of that file during the MLN parsing process. The
autoenum plug-in will assign the render nodes to the servers.
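The format of that file is not shown here; a plausible sketch of /tmp/servers.txt is simply one physical server name per line (the names are hypothetical), which the #include statement pastes into the service_hosts block:

render-server01
render-server02
render-server03
render-server04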
Control Commands and Monitoring
The MLN command builds and starts the virtual machines and
networks defined as projects. Here are some examples:
-
mln build -f example.mln
Build the project from the file example.mln.
-
mln -P /my/important/projects start -a
Start all the projects in the folder /my/important/projects.
-
mln status -p mysql-servers -u
List all the
switches and virtual machines belonging to the project mysql-servers
that are currently running.
-
mln stop -p web-services -w 120
Stop all the
virtual machines belonging to the project web-services.
Wait 120 seconds for the hosts to shut down. After
that time has elapsed, destroy the remaining ones of the project.
The last example is useful when the physical host is to shut down
and has limited time to wait for all of the virtual machines to shut
down properly.
Regular users can use MLN without root privileges while using
User-Mode Linux as a virtualization platform. Only users with
administrator access can start projects that are based on Xen.
The building of a distributed project is started the same way as
for other projects. MLN will contact the service hosts for the virtual
machines not intended for the local machine and will send them the
project for them to do their part.
Upon receiving the build request, the daemons start to build the
project in the background and await a subsequent request from the same
client asking for the output. Starting and stopping a project will
also result in the attempt to contact the other service hosts so that
the entire project is managed simultaneously.
MLN will always start the network switches before the virtual
machines. The boot order and time to wait between each virtual machine
can be specified. Best practice is to avoid straining the system by
simply letting MLN sleep a few seconds between host starts. An
example of this, introducing a three-second pause, is:
mln start -p example -s 3
A project is often part of a bigger context on the network or the
physical server. Oftentimes one needs to run specific commands on the
physical server before or after the virtual machines have started, such
as adding firewall rules or modifying routing information. MLN
provides blocks for additions of shell commands so that they are run
by MLN at specified points during the starting or stopping of a
project.
Modifying Existing Projects
It is not always possible to initially design a project to be
optimal for its task. Once the project is running, certain design-time
decisions, like memory or disk-space, might be re-evaluated and have
to be adjusted. The problem is often that the project already is in
use and cannot be rebuilt from scratch. This problem was encountered
several times when running virtual student labs over the course of a
semester (five months).
Server    | # Projs | # VMs | Mem Used (MB) | Mem Avail (MB) | Groups
gateway1  |       2 |     2 |           512 |            768 | gateways,xen
backend1  |       1 |     1 |           128 |            896 | backends,xen
backend2  |       1 |     1 |           128 |            896 | backends,xen
backend3  |       2 |     2 |           256 |            768 | backends,xen
backend4  |       1 |     1 |           128 |            896 | backends,xen
Total     |       7 |     7 |          1152 |           4224 |
Table 1: Status information.
MLN's approach to this problem is to provide an upgrade command,
which will read in a new and modified version of the project and try
to upgrade it accordingly. Typical modifications are to change the
amount of memory, increase the disk size or even add/remove virtual
machines from the project. System specific changes, such as adding
users, can also be performed this way. For networks which can scale
from a software point of view, like web-servers and computing
clusters, the upgrade feature can be used to manage the number of
nodes that participate in the cluster at any given time. The
modification of a project can be done manually by the system
administrator, but recent literature suggests a range of applications
for this within self-managing and adaptive systems [13, 4].
A more fundamental change to the virtual machine would be to
change its virtualization platform, like going from User-Mode Linux to
Xen. Changing the service host will result in a migration of the virtual
machine between two service hosts. This means that one can
start with a lightweight User-Mode Linux virtual machine on a regular
laptop or workstation, and the virtual machine could later be moved to
a more powerful server using MLN where it would be running on the Xen
platform with perhaps more memory assigned to it too.
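As a sketch of such a change, the host from the earlier distributed example could be moved to a more powerful Xen server by editing its definition and re-reading the project with the upgrade command. The new service host name and the project file name are hypothetical, and the -f flag is assumed to mirror the build command:

host startfish {
   service_host xenserver1.iu.hio.no
   xen
   lvm
   memory 256M
   network eth0 {
      address dhcp
   }
}

mln upgrade -f startfish.mln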
Monitoring
The user can use MLN to collect the status information from all
the servers that run the MLN daemon. The information displayed
includes how many projects are running, the number of virtual
machines, the amount of used memory and how much memory is left from
the allowed maximum for that server. This information is useful for
monitoring and planning of new projects.
Table 1 shows the result of the mln daemon_status
command on the network discussed in Case Study 1. Note that the total
number of projects can be misleading, as several servers can each host a
part of the same project, and every part will count as a separate project
in the summary.
Servers can be put into groups and status can be queried on a per
group basis, thereby giving more specialized feedback. One example is
to only show the resources on the servers assigned for testing or the
ones used in production. Checking whether a project or a certain host is
up is also possible through MLN.
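The exact options for group queries are not given in this paper; a hypothetical invocation in the style of the commands above might be:
mln daemon_status -g backends
where the -g flag (an assumption) selects the server group to report on.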
Discussion
Successes
Consolidation of several virtual machine technologies into one
tool is a new and challenging task. Until now, it seems, the focus for
development of virtual machine monitors has been on performance and on
carving out a niche. The authors do not see any direct competition
between the virtual machine platforms used in this project; in fact,
combined they offer a greater gain. The user should have the ability to
choose which one to use without it affecting the choice of management
interface. Through MLN we have provided one way to design large
virtual networks before thinking about the platform they will run on.
MLN creates start and stop scripts for each virtual machine and
switch. As a result, any virtual machine technology that is
controllable from the command line would be relatively easy to
integrate into MLN. There is no common API to virtualization today. A
stronger effort to provide a common API to all the virtual machine
technologies would greatly improve the result for projects like this
and enable MLN to support even more virtualization platforms by
talking to the API directly.
MLN has been tested and used as a commercial hosting tool for over
a year, during which it has provided us with much feedback on needed
features and on the limitations of current tools. Many features,
such as the plug-in framework, have sprung from this exchange. The
plug-in framework allows for administrators to add features both in
configuration scope as well as logic without re-inventing the wheel.
MLN has become the standard tool for virtual machine management at
Oslo University College. It provides the means for massive virtual
student laboratories in security classes as well as a virtual
appliance tool for student projects. Other institutions, such as
University of Linköping in Sweden and Oregon State University in the US
have also benefited from it in educational contexts. A Norwegian ISP
uses MLN today in their R&D department to rapidly create virtual test-beds.
MLN is one of the few freely available tools that offer ``cold
migration,'' where the virtual machine is shut down first and the
filesystem is compressed and copied to the new service host. Live
migration is supported in Xen but requires the two servers to be on
the same LAN and to have the same CPU architecture and concurrent
access to the virtual machine's disk. This is hard to realize
transparently to the user as it is bound to a certain platform. Cold
migration works in many scenarios where live migration would fail
because the servers are of a different architecture and have no
concurrent access to the filesystems. Another benefit of this approach
is that virtual machines can change other aspects in the migration
process. A User-Mode Linux host can migrate into a Xen host with more
memory and a different network setup. This is practical for moving
test-beds onto more powerful servers of different architecture and to
completely separate locations, changing network parameters in the
process. All of this is realized using the mln upgrade
command.
Current Limitations
MLN's configuration language addresses both hardware attributes of
the virtual machine and system configurations. It is therefore not
possible to avoid the challenges of host configuration management.
Currently, Debian-based templates, such as Ubuntu Linux, are best
supported. MLN should ideally be able to support several operating
systems, not to mention different Linux distributions. However, such
concerns are part of the ongoing effort of a large systems
configuration management community. The plug-in infrastructure of MLN
is one way to invite seasoned configuration management systems and
third parties to handle the lower-level tasks. However, some languages
might fit better than others into this framework, and certain new
requirements might surface. This research is in progress and will be
discussed in a later publication.
Sufficient monitoring of the virtual machines is a critical
feature for data centers. MLN supports status on projects, hosts and
globally. Memory usage, the number of virtual machines and projects on
each server can be collected as well. This works well for monitoring a
project's status and to see the level of remaining resources on a
physical server. However, a usage indicator as to how much CPU and
network traffic is related to each virtual machine might further help
capacity planning. Xen has tools like xentop and
xenmon [14] that can monitor network and CPU usage and I/O
operations of its virtual machines. One improvement to MLN would be
to expand the plug-in framework to also enable monitoring and
management. This would allow the local data center to develop
specializations that assist in capacity planning or fault detection,
such as a plug-in that finds free IP addresses or logs operations such
as starting and stopping.
Another question is whether or not MLN should provide better
encapsulation of each project in order to protect them from each
other. In User-Mode Linux, this is possible as the virtual machines
run as processes and are assigned to users. In Xen, all virtual
machines exist in the same ``pool'' and have no direct ownership.
Although the Xen domains are considered to be securely encapsulated,
they might still have network access to other virtual machines. One
solution is to create virtual switches on each physical server and to
connect them with virtual tunnels. This implementation is in progress
at the time of writing and will be presented as a plug-in.
Future Directions
The MLN daemon uses IP-based access control for management access.
Added features, such as user support for the daemon and finer access
control, would indeed be a benefit. This way, one could separate the ability
to build a project from the ability to start and stop it.
Interaction with MLN is currently in the form of a configuration
language and shell commands. Although the language features improve
design and control over large virtual networks, one can investigate
other approaches such as graphical design and control tools. Also,
adding support for well-known document formats such as XML may enable
MLN to play the role of a back-end for higher level tools.
Future work will also look at the improvement of the distributed
management aspects of MLN. Scenarios such as management of large and
distributed virtual hosting platforms and how to introduce closer
monitoring and fail-over are of particular interest.
Conclusion
We have presented an approach to virtual machine administration
that lets the user describe the desired configuration in an
understandable declarative language and then build the virtual hosts
and networks from it. The virtualization platform is secondary to the
configuration interface. A concept of logical groups of virtual
machines enables the user to issue management commands to all virtual
machines that belong together. Language features such as inheritance
from machine superclasses and variable expansion make it possible to
consistently describe large networks in just a few lines and to avoid
redundant information.
A plug-in architecture lets the user transparently expand the
configuration domain of the language to solve their specialized needs.
Part of the toolkit is a daemon that allows management of virtual
networks that span several physical servers. All of these features
have been harnessed to provide a flexible and powerful way to define,
create and manage scenarios for data-centers. Two case studies show
the usefulness of our approach: a web-hosting facility and an on-demand
render farm are realized using simple configurations and local
additions to the MLN language.
Author Biography
Kyrre earned his M.Sc. in Computer Science from the University of
Oslo. Apart from his studies, Kyrre has worked as a course instructor
at a Linux company, where he has written and taught courses in system
administration and Linux. Kyrre started as a full-time Ph.D. student
in 2003 at Oslo University College. His main research areas are
anomaly detection, formal modelling of distributed systems and
configuration management.
Acknowledgments
The author would like to thank Professor Mark Burgess and John
Sechrest for helpful discussions and pointers throughout this work.
Bibliography
[1] Burgess, M., ``Cfengine: a site configuration engine,'' USENIX
Computing Systems, Vol 8, 1995.
[2] Desai, N., A. Lusk, R. Bradshaw, and R. Evard, ``Bcfg: A
configuration management tool for heterogeneous environments,''
IEEE International Conference on Cluster Computing
(CLUSTER'03), 2003.
[3] Liu, X., J. Heo, L. Sha, and X. Zhu, ``Adaptive control of
multi-tiered web application using queueing predictor,'' 10th
IEEE/IFIP Network Operations and Management Symposium (NOMS 2006),
2006.
[4] Xu, W., X. Zhu, S. Singhal, and Z. Wang, ``Predictive control
for dynamic resource allocation in enterprise data centers,'' 10th
IEEE/IFIP Network Operations and Management Symposium (NOMS 2006),
2006.
[5] Begnum, K., K. Koymans, A. Krap, and J. Sechrest, ``Using
virtual machines in system and network administration education,''
Proceedings of the System Administration and Network Engineering
Conference (SANE), 2004.
[6] Dike, J., ``A user-mode port of the Linux kernel,''
Proceedings of the 4th Annual Linux Showcase & Conference,
Atlanta, 2000.
[7] The UMLwiki tools page, 2006,
https://uml.harlowhill.com/index.php/tools.
[8] Barham, P., et al., ``Xen and the art of virtualization,''
SOSP 03, 2003.
[9] The XenSource homepage, 2006, https://www.xensource.com.
[10] The VMware website, 2006, https://www.vmware.com.
[11] The MLN project homepage, 2006, https://mln.sourceforge.net/.
[12] Sapuntzakis, C., D. Brumley, R. Chandra, N. Zeldovich, J.
Chow, M. S. Lam, and M. Rosenblum, ``Virtual appliances for deploying
and maintaining software,'' Proceedings of the 17th Large
Installation Systems Administration Conference, (LISA '03),
October, 2003.
[13] Begnum, K., M. Burgess, and J. Sechrest, ``Adaptive
provisioning using virtual machines and autonomous role-based
management,'' SELF - Self-adaptability and self-management of
context-aware systems, SELF'06, 2006.
[14] Gupta, Diwaker, Rob Gardner, and Ludmila Cherkasova,
``Xenmon: QoS monitoring and performance profiling tool,'' 2005.