Paper - 1999 USENIX Annual Technical Conference, June 6-11, 1999, Monterey, California, USA   
SBOX: Put CGI Scripts in a Box
Lincoln D. Stein
Cold Spring Harbor Laboratory
One Bungtown Road, Cold Spring Harbor, NY 11764, USA
lstein@cshl.org
Abstract
sbox is a CGI wrapper script that allows Web sites to safely
grant CGI authoring privileges to untrusted or naive authors. The
script increases security in several ways. It changes the process
privileges of CGI scripts to match their owners, preventing one
script from interfering with another's data files or operations. It
establishes configurable ceilings on script resource usage, avoiding
intentional or unintentional denial of service attacks. Most
importantly, sbox can also be used to run untrusted CGI scripts
within a chroot()-ed directory, thereby preventing CGI scripts from
accessing sensitive portions of the file system.
sbox is written in ANSI C and compiles on multiple
flavors of Unix. It can be used and redistributed freely. The
complete package is available for download at https://stein.cshl.org/WWW/software/sbox/
The Problem with CGI Scripts
Common Gateway Interface (CGI) scripts were among the first
techniques for creating interactive Web pages and probably remain the
most popular [Stein, 1997]. Perhaps the main reason for the enduring
popularity of CGI scripts is their simplicity. To create a dynamic
Web page, a Web author writes a program that prints a short HTTP
header followed by the contents of the desired Web page. The author
then moves the program into a specially designated "CGI
directory" on the Web server host. When the program's URL is
requested, its output is displayed on the Web page.
A CGI script can be written in any language, compiled or
interpreted. A fully functional CGI script can be written in just
three lines of Bourne shell scripting code (including the #!
line):
The full CGI protocol [Coar, 1998] provides mechanisms for scripts to accept input from web forms, get information about the current operation of the server, learn the name and IP address of the remote browser, and pass back status information to the Web server. Communication between script and server is accomplished via environment variables and standard input/output. Essentially, the CGI protocol is a transient coshell system [Fowler, 1993] in which the Web server delegates the responsibility of producing the page content to an external program, the script.
The simplicity and ease with which CGI scripts can be created are
also the protocol's Achilles' heel [Garfinkle 1997, Rubin 1997, Stein
1998]. It is so simple to write CGI scripts that programming novices
who have no prior experience in network server software development
can readily create interactive Web pages. And here's where the
problem lies. Novices, and sometimes even experienced programmers,
are prone to errors that expose the Web server host to attack by
unscrupulous individuals.
As an example of the problems beginners run into, consider the
following CGI script written in Perl. Its intent is to recover an
e-mail address from a submitted fill-out form and then mail a message
to that address using the mail program.
This script, which is intended to be typical of a beginner's program rather than illustrative of good style, begins by using the param() function of the Perl CGI module [Stein 1998] to recover the contents of three fill-out form fields named "mailto," "subject," and "contents." The values of "mailto" and "subject" are used to open up a pipe to the Unix mail command. The value of "contents" is then printed to the mail process, which is then closed. The script ends by printing out a short confirmation message.
This script has a number of problems, including a reliance on
the PATH environment variable to resolve the mail command, a failure
to examine the "contents" field for a line beginning with a
dot, which would terminate the mail message prematurely, and a
failure to check for errors after the open() and
close() calls. However, the most egregious flaw is in the call
to open(), where the programmer passes the contents of
"subject" and "mailto" to the shell without having
first checked them for metacharacters. Consider what happens when the
wily hacker provides the following text as the value of the
"mailto" field:
The piped open() transforms this into the following call:
with the result that the system password file is inadvertently mailed out to a potential attacker.
Other common problems in CGI scripts include the failure to
check the length of strings before copying them into static buffers,
failure to check for the existence of temporary files before
clobbering them, and the failure to check user-provided pathnames for
the ".." characters before opening files. Possible CGI
script exploits include a variety of denial of service attacks. For
example, a CGI script that reads user-provided input and spools it to
a disk file is vulnerable to the mischievous hacker who uses a Web
robot to transmit an endless stream of random bits to the script.
Eventually the server's file system will fill up, causing the host
system to stall.
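The input checks whose absence is described here can be sketched in C. This is an illustrative sketch only: the function names and the metacharacter list are assumptions made for the example, the list is deliberately conservative rather than exhaustive, and a script that avoids the shell altogether (for instance by calling exec directly) is safer still.

```c
#include <string.h>

/* Characters treated as unsafe in a value that will reach the shell.
 * The list is illustrative and deliberately conservative. */
static const char SHELL_META[] = "&;`'\"|*?~<>^()[]{}$\\\n\r \t!";

/* Return 1 if s is non-empty and free of the characters above, else 0. */
int is_shell_safe(const char *s)
{
    return s != NULL && *s != '\0' && strpbrk(s, SHELL_META) == NULL;
}

/* Return 1 if any '/'-separated component of path is exactly "..", else 0. */
int has_dotdot_component(const char *path)
{
    const char *p = path;

    while (*p != '\0') {
        size_t len = strcspn(p, "/");      /* length of this component */
        if (len == 2 && p[0] == '.' && p[1] == '.')
            return 1;
        p += len;
        while (*p == '/')                  /* skip the separator(s) */
            p++;
    }
    return 0;
}
```

A script would reject the request outright when is_shell_safe() returns 0 for a field destined for a command line, or when has_dotdot_component() returns 1 for a user-provided pathname.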
Experience has shown that CGI scripts are a major source of
vulnerability on the Web. Over the past five years, dozens of well-known
and widely distributed CGI scripts have been found to contain
exploitable security holes [Stein 1998, also see CERT 1996-1998].
Even experienced developers get burned from time to time. Offenders
include freeware/public domain scripts, such as count.cgi, as well as
commercial products from such respected developers as Microsoft and
Silicon Graphics. Understandably, most Webmasters are extremely
cautious about installing new and untested CGI scripts on their
servers.
One way to limit the harm that poorly-written CGI scripts can do
is to run the Web server with as few privileges as possible. Most Web
servers run as an unprivileged user without login privileges, such as
"nobody." CGI script processes spawned by the server will
ordinarily run under the same privileges as their parent, and by
carefully controlling file and directory permissions, the Webmaster
can limit the scope of any potential damage that errant CGI scripts
can inflict on the server host. To increase safety even further, the
Webmaster could place the entire server into a restricted directory
using the chroot command. Now any CGI scripts it spawns will
be limited to the portion of the file system that the server runs
in.
User-Maintained CGI Scripts
Now consider a Web server run in an academic environment or by
an Internet service provider (ISP). Such a system generally supports
multiple Web authors of varying levels of experience and aptitude. In
an academic environment, the authors are students, faculty members,
and support staff who are granted personal Web pages. In the case of
an ISP, the users are customers who have paid for Web space, and can
range from individuals who maintain personal "vanity" pages
to large co-hosted corporations. If users are allowed to write and
install their own CGI scripts, then the risk from user-maintained
scripts is magnified several fold.
First of all, a malicious author might seek to break into the
Web server host by writing a CGI script that deliberately probes the
host for holes. In many Web hosting environments, authors are not
given a login shell. Instead they are constrained to uploading new
and modified HTML pages via FTP or a Web publishing package such as
FrontPage [Microsoft 1998]. If authors are allowed to upload Perl
scripts and compiled binaries for use as CGI scripts, this policy is
easily circumvented.
Second, even if the host is protected by running the Web server
as an unprivileged user and in a change-root directory, there is
nothing to protect authors from each other's CGI scripts. Because all
CGI scripts run under the same user account and execute in the same
change-root directory, there is nothing protecting one author's data
from another author's script. A student could write a CGI script to
peek at the answers to a faculty member's online quiz, kill other
students' CGI processes, or fill a user's guestbook file with obscene
messages. In an ISP environment, one corporate customer could write a
CGI script to spy on another customer's order entry system and client
database.
Third, even if there is no active intent to do evil on the part
of an author, a single poorly-written CGI script can still be used by
Internet intruders to compromise the security of all authors on the
system. For example, a guestbook script that doesn't check for the
presence of ".." directories in the path to its data file
can easily be exploited to view or overwrite files maintained by
other authors.
Fourth, user-maintained CGI scripts are an invitation to denial
of service (DoS) attacks. A malicious script writer can launch a DoS
attack on the Web server host with a Perl script like the one shown
below, which forks itself forever until the server host runs out of
slots in its process table. It is possible that the system
administrator will be unable to log in to kill the runaway process
and may be forced to reboot the machine:
Finally, it is difficult to trace an attack from a user-maintained CGI script back to its owner. Since all scripts execute with the identity and privileges of the Web server, there is no easy way to determine whose script is, for example, leaving 40 megabyte scratch files in /tmp.
Wrappers
There are a number of approaches to the problem of
user-maintained CGI scripts. One approach is to outlaw them
completely. The site's administrators can preinstall a number of
standard CGI scripts for users to link to and configure the server so
that no additional scripts can be added. Another solution is to
submit all user-written scripts to an exacting code review.
Neither of these approaches is particularly appealing. The first
solution is unlikely to be popular in the competitive Web hosting
market where customers migrate to the service that offers the most
features for the least cost. The second solution is only practical
for sites that have unusually generous administrative resources or an
unusually small number of users who want to install custom CGI
scripts.
A more practical solution is to use a wrapper script. Instead of
invoking user-maintained CGI scripts directly, the web server runs
them indirectly via a wrapper program. The wrapper modifies the
environment in some way that makes the execution of the
user-maintained script safer. The wrapper is also a good place to
enforce security policy decisions. For example, the wrapper can keep
a log of the scripts it has run and can refuse to run scripts whose
permissions are insecure.
The first and still most widely-used wrapper program was
cgiwrap, written by Nathan Neulinger [Neulinger 1996]. cgiwrap
performs several useful functions. Its main feature is that it uses
the Unix setuid() call to run user-maintained CGI scripts under the
user and group ID of the script's owner rather than the shared Web
server account. This prevents one user's scripts from writing to data
files maintained by another, and makes it easier to track down
problems caused by poorly written scripts. cgiwrap also allows the
Webmaster to place resource limitations on user-maintained scripts
using the Berkeley setrlimit() call. This prevents a number of
deliberate and inadvertent DoS attacks.
The cgiwrap program is straightforward to use. Once cgiwrap is
installed in the system CGI directory, URLs used to invoke
user-maintained scripts like this one:
are replaced by URLs that invoke cgiwrap. For example:
More recently, the popular Apache Web server has shipped with a
built-in wrapper program called suEXEC [Apache Group 1998]. The
operation of suEXEC is similar to cgiwrap, but it is more tightly
integrated into the Web server, making it unnecessary to change any
URLs in order to use it. In addition to changing its user ID to match
that of the owner of the script, suEXEC logs each script it executes
along with the user and group ID that it runs under. It also performs
a series of consistency checks in order to detect unsafe practices.
For example, suEXEC will refuse to run a script that is world
writable or which is contained within a world writable
directory.
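A permission check of the kind just described can be sketched with stat(). This is not suEXEC's actual code; the function names are hypothetical and the sketch covers only the world-writable test mentioned in the text.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Return 1 if path is world writable, 0 if it is not,
 * and -1 if it cannot be examined at all. */
int is_world_writable(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (st.st_mode & S_IWOTH) ? 1 : 0;
}

/* Return 1 only if neither the script nor its directory is world writable.
 * Anything that cannot be examined is treated as unsafe. */
int permissions_look_safe(const char *script, const char *dir)
{
    return is_world_writable(script) == 0 && is_world_writable(dir) == 0;
}
```

Treating an unreadable path as unsafe keeps the wrapper on the side of refusing to run a script when it is in doubt.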
The main limitation of both cgiwrap and suEXEC is that neither
truly insulates scripts written by one user from those written by
another. Naive users who store confidential information in world
readable files and directories can still be attacked when another
user's CGI script is used to peek at that data. In fact, although
these wrappers increase the security of the Web hosting service as a
whole, they decrease the security of the individual user. Because the
wrapped script runs with the same privileges as the user, it has free
access to all the user's files. A poorly written script can be
tricked into changing the user's HTML documents or recursively
deleting his home directory. It can also impersonate the user, for
example by sending e-mail from the user's account.
The sbox Wrapper
The sbox program is a CGI wrapper that goes beyond cgiwrap and
suEXEC to offer the following features:
These features can be used together, or can be switched on and
off selectively to implement a variety of security policies.
Once installed, sbox is straightforward to use. To run an
untrusted CGI script, create a composite URL consisting of the path
to sbox followed by the path to the target CGI script. A typical URL
for invoking a user-supported script looks like this:
https://www.site.com/cgi-bin/sbox/~fred/guestbook.cgi
sbox can also be used in conjunction with the virtual hosts
feature provided by Apache and other servers. With some servers, it
is even possible to make sbox transparent, so that its name doesn't
appear in the path. A scheme to do this using the Apache mod_rewrite
module is presented later in this paper.
The next sections describe each of sbox's features in more
detail and show how they can be used to increase the security of the
Web site.
suid()/sgid() Features
Before sbox launches a user-supported CGI script, it can be
configured to change its UID and/or GID to match the script's owner.
There are two possible variants of this feature. In the first
variant, sbox uses the script file to determine which user and group
to run as. This functionality is similar to the scheme implemented by
cgiwrap. In the second variant, the ownership of the script is
ignored; instead the ownership of the directory that contains it is
used to determine the user and/or group.
Allowing sbox to take on the identity of the enclosing directory
might seem a bit obscure, but the rationale is that it gives the
Webmaster more flexibility than just using the script ownership does.
For example, the Webmaster could use this technique to create a
common cgi-bin directory for use by a particular group of developers.
The directory would be owned by a pseudo-user and be group writable
by each of the developers, allowing any user in the group to create
and edit CGI scripts. When the script runs, it executes under the
permissions of the common pseudo-user account, preventing it from
modifying any of the author's files or databases unless he explicitly
gives it permission to do so by setting the group writable bit.
Another strategy that the Webmaster might want to adopt is to
configure sbox so that it performs an sgid() only. This will cause
the target script to be executed with the group permissions of the
script or enclosing directory, but with the user permissions of the
Web server. By adopting a system-wide user-private group strategy in
which each user is assigned a unique primary group, the script's
author can exactly control what resources the script does and does
not have access to. This strategy also makes it possible to create
scripts that cannot modify their own source code file or binary, a
risk that cgiwrap and suEXEC are both subject to.
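The identity switch itself comes down to a few system calls whose order matters. The sketch below is an assumption about how such a switch can be written safely, not a copy of sbox's code; switch_identity is a hypothetical name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes setgroups() on glibc */
#endif
#include <grp.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Permanently switch the process to the given user and group.
 * The group is changed first: once the UID is no longer root, the
 * process has lost the right to change its GID.
 * Returns 0 on success, -1 on failure.
 */
int switch_identity(uid_t uid, gid_t gid)
{
    /* Only root may replace the supplementary group list. */
    if (geteuid() == 0 && setgroups(1, &gid) != 0)
        return -1;
    if (setgid(gid) != 0)
        return -1;
    if (setuid(uid) != 0)
        return -1;
    /* Confirm that the change really took effect. */
    if (geteuid() != uid || getegid() != gid)
        return -1;
    return 0;
}
```

For the sgid()-only strategy, a set-user-id root wrapper could pass getuid(), the real UID of the Web server that invoked it, as the uid argument together with the GID of the script or its directory.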
Consistency Checks
When sbox launches, it checks its environment for signs that it
has been tampered with or that it is being run in an unsafe fashion.
If any of the checks fail, sbox aborts with an error message.
The following checks are performed:
These checks, along with the environment sanitization performed
later in the launch process, go a long way toward preventing many of
the loopholes and configuration errors that are frequently exploited
by intruders.
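The environment sanitization mentioned here amounts to an allow-list filter. The sketch below assumes the policy this paper describes under Environment Cleansing: CGI/1.1 variables, HTTP_ request header fields, and the DOCUMENT_ROOT and SERVER_ADMIN extensions pass through, while PATH is replaced by the default safe path. The function names are hypothetical and the code is not taken from sbox.

```c
#include <string.h>

/* CGI/1.1 variables plus the two extensions named in the text. */
static const char *const ALLOWED[] = {
    "AUTH_TYPE", "CONTENT_LENGTH", "CONTENT_TYPE", "GATEWAY_INTERFACE",
    "PATH_INFO", "PATH_TRANSLATED", "QUERY_STRING", "REMOTE_ADDR",
    "REMOTE_HOST", "REMOTE_IDENT", "REMOTE_USER", "REQUEST_METHOD",
    "SCRIPT_NAME", "SERVER_NAME", "SERVER_PORT", "SERVER_PROTOCOL",
    "SERVER_SOFTWARE", "DOCUMENT_ROOT", "SERVER_ADMIN", NULL
};

/* Return 1 if a "NAME=value" entry may be passed to the script, else 0. */
int env_entry_allowed(const char *entry)
{
    const char *eq = strchr(entry, '=');
    size_t len, i;

    if (eq == NULL || eq == entry)
        return 0;
    len = (size_t)(eq - entry);
    if (len > 5 && strncmp(entry, "HTTP_", 5) == 0)  /* request header fields */
        return 1;
    for (i = 0; ALLOWED[i] != NULL; i++)
        if (strlen(ALLOWED[i]) == len && strncmp(entry, ALLOWED[i], len) == 0)
            return 1;
    return 0;
}

/*
 * Copy the permitted entries of src into dst, then append the fixed safe
 * path. cap is the capacity of dst, counting the terminating NULL.
 * Returns the number of entries written, not counting the NULL.
 */
size_t clean_environment(char *const src[], const char *dst[], size_t cap)
{
    size_t n = 0, i;

    if (cap < 2)
        return 0;
    for (i = 0; src[i] != NULL && n + 2 < cap; i++)
        if (env_entry_allowed(src[i]))
            dst[n++] = src[i];
    dst[n++] = "PATH=/bin:/usr/bin:/usr/local/bin";
    dst[n] = NULL;
    return n;
}
```

The resulting array has the shape expected by the envp argument of execve(), so the target script never sees the variables that were filtered out.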
Resource Controls
After applying its consistency checks, sbox applies resource
limitations to the current process using the BSD-derived setrlimit()
system call. Limits include the size of the CGI process, its resident
(virtual) size, the number of file descriptors it can open, the size
of the largest single file it can create, and the number of
subprocesses it can spawn.
sbox uses both "hard" resource limits and
"soft" ones. The soft limits, which can be adjusted upwards
by the CGI script simply by calling setrlimit() itself, are set at
low, stringent values by default. The hard limits, which once set
cannot be increased during the lifetime of the process, use more
liberal values. For example, the maximum size of a file that the
user-supported CGI script can create has a soft limit of 100K and a hard limit
of 2 megabytes. These values can be adjusted at sbox compile time.
The exception to this rule is the hard ceiling on core dumps, which
is set to size zero. This prevents the user's CGI script from
creating core files and closes various exploits that make use of core
dumps to recover confidential information or to overwrite other
files.
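The soft and hard limits quoted in this section map directly onto the two fields of struct rlimit. The following sketch applies the file size and core dump limits given in the text; the function names are hypothetical, and the clamping to the existing hard limit is an assumption added so that the example also runs without superuser privileges.

```c
#include <sys/resource.h>

/*
 * Set a soft and a hard limit for one resource. Both values are clamped
 * to the current hard limit, which an unprivileged process cannot exceed.
 * Returns 0 on success, -1 on failure.
 */
int apply_limit(int resource, rlim_t soft, rlim_t hard)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && hard > rl.rlim_max)
        hard = rl.rlim_max;
    if (soft > hard)
        soft = hard;
    rl.rlim_cur = soft;   /* the script may raise this later */
    rl.rlim_max = hard;   /* this ceiling can only come down */
    return setrlimit(resource, &rl);
}

/* Limits quoted in the text: files 100K soft and 2 MB hard, no core dumps. */
int apply_example_limits(void)
{
    if (apply_limit(RLIMIT_FSIZE, 100 * 1024, 2 * 1024 * 1024) != 0)
        return -1;
    return apply_limit(RLIMIT_CORE, 0, 0);
}
```

Because limits are inherited across fork() and exec(), calling apply_example_limits() in the wrapper before it executes the target script is enough to bind the script and any subprocesses it creates.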
The net result of this design is that user-supported CGI scripts
will, by default, be executed in an environment with strict resource
controls. If a CGI script requires more of a particular resource than
the soft limits provide, it can increase the resource up to the
preset hard limit by calling setrlimit() itself. This design limits
problems caused by resource hogging scripts written by naive users
without unduly restricting the options of sophisticated users who
need more resources than the soft limits allow.
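From the script's side, asking for more of a resource looks roughly as follows. The sketch assumes a CGI program written in C; raise_soft_limit is a hypothetical helper, not part of sbox.

```c
#include <sys/resource.h>

/*
 * Raise the soft limit for a resource to `wanted`, or to the hard limit
 * if `wanted` exceeds it. The soft limit now in force is stored in
 * *granted. Returns 0 on success, -1 on failure.
 */
int raise_soft_limit(int resource, rlim_t wanted, rlim_t *granted)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;          /* cannot go past the hard ceiling */
    if (rl.rlim_cur != RLIM_INFINITY && wanted > rl.rlim_cur) {
        rl.rlim_cur = wanted;
        if (setrlimit(resource, &rl) != 0)
            return -1;
    }
    *granted = rl.rlim_cur;
    return 0;
}
```

A script that expects to write a large file would call raise_soft_limit(RLIMIT_FSIZE, ...) early and check the granted value, since the hard limit may be lower than what it asked for.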
In addition to setting resource limits, sbox also nices its own
process to a priority of 10. This helps keep CGI scripts from
becoming too much of a drain on a loaded system. Unlike soft setrlimit()
limits, a nice value, once raised, cannot be lowered again by an
unprivileged process.
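Lowering the scheduling priority takes a single call. The sketch below uses setpriority(), which sets an absolute nice value; whether sbox itself uses nice() or setpriority() is not stated in the text, and be_nice is a hypothetical name.

```c
#include <errno.h>
#include <sys/resource.h>

/*
 * Make sure the calling process runs at a nice value of at least `level`.
 * Returns 0 on success, -1 on failure.
 */
int be_nice(int level)
{
    int current;

    errno = 0;
    current = getpriority(PRIO_PROCESS, 0);  /* -1 is a legal value: check errno */
    if (current == -1 && errno != 0)
        return -1;
    if (current >= level)                    /* already at least this nice */
        return 0;
    return setpriority(PRIO_PROCESS, 0, level);
}
```

With the default described in the text, the wrapper would call be_nice(10) before executing the target script.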
The priority level and the soft and hard limits on all system
resources are set at sbox compile time. The system administrator can
change the default values, or choose not to set a particular limit at
all.
Changing the Root Directory
The crux of sbox security is its change-root function. If
configured to do so, sbox will use the chroot() system call to change
its root directory to some subdirectory enclosing the target CGI
script. When the target CGI script runs, it will be unable to access
parts of the filesystem outside the new root directory. This closes a
large number of CGI exploits, including unauthorized access to the
system password file, the modification of users' .rhosts files, the
creation of hard links to system files in /tmp, and many more. It
also provides a way to control exactly which system binaries and
other resources user-maintained CGI scripts have access to.
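The change-root step can be sketched as a short sequence of calls. This is the conventional idiom rather than code taken from sbox, and enter_new_root is a hypothetical name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes chroot() on glibc */
#endif
#include <unistd.h>

/*
 * Confine the calling process to new_root. Requires superuser privileges.
 * Returns 0 on success, -1 on failure with errno set.
 */
int enter_new_root(const char *new_root)
{
    if (chdir(new_root) != 0)   /* move inside the directory first */
        return -1;
    if (chroot(".") != 0)       /* make the current directory the root */
        return -1;
    return chdir("/");          /* leave no working directory outside it */
}
```

The order of operations in the wrapper matters: chroot() needs root privileges, so it has to happen before the UID is changed, and the UID has to be changed afterwards because a process that is still root can break out of a change-root directory.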
Administrator-configurable options determine how sbox chooses
which directory to make the new root. In order for the target CGI
script to be executed, it must live within the subdirectory selected
for the new root. However, most CGI scripts will also need access to
copies of system files such as interpreters and shared libraries in
order to function correctly. Because it is inconvenient for the user
to intermix his CGI scripts with system files, these files are
usually stored in directories parallel to the directory that contains
the target script. Another consideration is the user's "document
root", the directory that contains his static HTML files. A
number of popular CGI scripts, including guestbook scripts and page
counters, require access to the user's HTML pages. In order for these
scripts to work under the sbox system, the user's document root, or a
portion of it at least, must also be located within the new root
directory.
The locations of the new root directory and the target CGI
script itself are controlled by the configuration variables ROOT and
CGI_BIN respectively. Both variables are pathnames relative to the
user's document root. A typical configuration will use the following
values:
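The original listing of values is not reproduced in this copy. Values consistent with the description that follows would be ".." for ROOT and "../cgi-bin" for CGI_BIN; both are assumptions inferred from the prose, as are the document root path in the usage note and the helper name resolve_path. The sketch shows how two such relative settings resolve against a user's document root.

```c
#include <stdio.h>
#include <string.h>

#define ROOT    ".."           /* assumed value, inferred from the prose */
#define CGI_BIN "../cgi-bin"   /* assumed value, inferred from the prose */

/*
 * Lexically resolve `rel` against the absolute path `base`, collapsing
 * "." and ".." components, and write the result to out.
 * Returns 0 on success, -1 if the result does not fit.
 */
int resolve_path(const char *base, const char *rel, char *out, size_t outlen)
{
    char joined[1024];
    const char *p;
    size_t n = 0;
    int w = snprintf(joined, sizeof joined, "%s/%s", base, rel);

    if (outlen < 2 || w < 0 || (size_t)w >= sizeof joined)
        return -1;
    for (p = joined; *p != '\0'; ) {
        size_t len = strcspn(p, "/");
        if (len == 2 && p[0] == '.' && p[1] == '.') {
            while (n > 0 && out[--n] != '/')   /* drop the last component */
                ;
        } else if (len > 0 && !(len == 1 && p[0] == '.')) {
            if (n + len + 2 > outlen)
                return -1;
            out[n++] = '/';
            memcpy(out + n, p, len);
            n += len;
        }
        p += len;
        while (*p == '/')
            p++;
    }
    if (n == 0)
        out[n++] = '/';
    out[n] = '\0';
    return 0;
}
```

With a document root of /u/fred/pub/html, resolve_path() maps ROOT to /u/fred/pub and CGI_BIN to /u/fred/pub/cgi-bin.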
This configuration tells sbox to look for the target CGI script inside a directory named cgi-bin on the same level as the user's document root directory. The new root directory will be the parent of both the cgi-bin directory and the user's document root.
To see how this works in practice, consider a Web site in which user-supported directories are located in /u/username/pub/html, where "username" is substituted with the login name of the user. In Apache, this setup could be accomplished using the configuration directive UserDir pub/html. A listing of /u/username/pub might look something like this:
A drawback to this scheme is that it makes the user's entire document tree visible to his CGI scripts, which might not always be desirable. However, a slight modification improves the scheme by making only a selected portion of the user's document tree visible. In this improved scheme, the Web server is configured so that the user's document tree is found, for example, in /u/username/public_html, and sbox is configured to change its root to a directory named sbox that is completely outside the public_html document tree:
while sbox-controlled CGI scripts are accessed with a URL like this one:
and CGI scripts that need to read or manipulate static HTML files are passed the additional path information in URLs like this one:
If the Apache web server is being used, these URLs can be simplified significantly with URL rewriting rules. An example of this is shown below.
Environment Cleansing
Before executing the target CGI script, sbox sets up a clean environment to run the target in. Depending on how the Web server was launched, there may be residual information in the environment that is not germane to the CGI protocol or may in fact divulge sensitive information, such as database authentication information or private PATH directories. sbox filters the current environment, allowing through only those environment variables that are specified by the CGI/1.1 protocol, such as REMOTE_ADDR, or which contain fields from the incoming HTTP request header, such as HTTP_USER_AGENT. In addition, sbox recognizes and permits a small number of common extensions to the CGI/1.1 protocol, such as the DOCUMENT_ROOT and SERVER_ADMIN variables.
Other variables are not automatically copied into the target script's environment. In particular, the PATH environment variable, because of its history of exploitation, is not passed through. Instead, PATH is set up using a constant "safe path" set at compile time. By default, the safe path is "/bin:/usr/bin:/usr/local/bin". Because the target script will be running in a change-root directory, it is likely that only /bin will be available to the target script.
When possible, sbox adjusts path-related environment variables so that they correctly reflect the change-rooted filesystem seen by the user's CGI scripts. Among the environment variables that are adjusted are the DOCUMENT_ROOT variable, which should point to the top of the user's document tree, and PATH_TRANSLATED, which points to the file passed to the user's CGI script as additional path information.
Logging
Before passing control to the user's CGI script, sbox logs its actions.
It prints out a timestamp, the name of the CGI script being executed, and the UID and GID of the process that it will execute the script as. Diagnostic information is also logged when sbox's consistency checks fail, or when an error occurs during the processing or execution of the target CGI script.
By default, sbox sends its log entries to standard error, which on most web servers becomes incorporated into the shared server error log file. However, sbox can instead be configured to write entries into a private log file. There is a performance penalty in keeping a private log file, since sbox must open the file for appending every time it runs.
The main rationale for having a log entry for each CGI script executed is that it provides an audit trail in the case of a CGI-based attack. The time of the attack can be correlated with the sbox log, and possibly lead to the identification of the script that was exploited. The sbox log could also be used to monitor CGI script usage for patterns suggestive of probing activity.
Practical Considerations
Configuring the sbox executable and preparing user-supported directories are the most tedious parts of using the sbox system. In order to reduce dependencies on the external environment, sbox does not use a configuration file. Instead, all its operational parameters are determined at compile time via a series of preprocessor #defines. About three dozen defines are contained in a single include file, sbox.h, which the system administrator must edit before compiling the executable. Fortunately, the vast majority of the defines are boilerplate values which will not need to be changed by most sites. Only about a half dozen are truly site-specific.
System administrators used to modern configuration scripts will probably be disappointed by this primitive configuration process, even though it is simple and straightforward. For this reason, a GNU configure style configuration script [Freisenhalm 1997] is currently in preparation.
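To give a feel for compile-time configuration of this kind, the fragment below shows what a handful of such settings might look like. It is not an excerpt from sbox.h: the file name and every macro name are invented for illustration, and only the values (the default safe path, the priority of 10, and the file size and core dump limits) come from the text.

```c
/* site_config.h: a hypothetical header in the style described above.
 * All macro names here are illustrative and are not the real sbox.h names. */
#ifndef SITE_CONFIG_H
#define SITE_CONFIG_H

#define SAFE_PATH         "/bin:/usr/bin:/usr/local/bin"  /* replaces PATH */
#define NICE_LEVEL        10                      /* scheduling priority */
#define FSIZE_SOFT_LIMIT  (100 * 1024)            /* 100K */
#define FSIZE_HARD_LIMIT  (2 * 1024 * 1024)       /* 2 megabytes */
#define CORE_HARD_LIMIT   0                       /* no core dumps */

#endif /* SITE_CONFIG_H */
```

Fixing such values in the binary means there is no configuration file for an attacker to tamper with, at the cost of a recompile whenever a setting changes.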
A more onerous task is setting up user-supported directories so that their CGI scripts run correctly in a change-root environment. On most modern Unices, compiled programs need one or more shared libraries in order to execute. Either the user's CGI scripts must be compiled statically, or the new root directory must contain a /lib subdirectory (or the dialect's equivalent) containing the shared libraries the user needs.
Other system support files may be needed as well. CGI scripts that require access to the DNS system for hostname resolution will need an /etc subdirectory containing resolv.conf. Scripts that perform time calculations may need access to the compiled timezone file, /usr/lib/zoneinfo/localtime. Programs that need access to device special files, such as /dev/null and /dev/zero, will need the appropriate files created with the mknod program. Scripts written in interpreted languages such as Perl will require a /bin directory containing the interpreter executable, and any support files that the interpreter needs, such as code libraries.
Clearly there are drawbacks to replicating a good chunk of the root filesystem for each user-supported web directory. For one thing, the disk storage requirements may become prohibitive on a system with many users. One solution is to limit the type of CGI scripts that users can write to a particular development system, such as Perl. Then only those files needed to support the Perl interpreter will have to be copied into the user's scripting directory.
Another solution to this problem is to use NFS to mount a trimmed set of /lib, /bin, and /etc directories in each user-supported directory. Even after the chroot() operation, the contents of these directories will continue to remain available to the user's CGI scripts. Although this technique creates a lot of mount points, the overhead for unused NFS mounts is minimal [Stern 1991], and an automount daemon can be further used to reduce the load [Crosby 1997].
However, if this technique is used, care must be taken not to mount directories that contain sensitive information, such as an /etc directory that contains a live passwd file. This would defeat the purpose of the change-root system.
A minor drawback to using sbox is that it is not completely transparent to the user. Instead of writing natural-looking CGI URLs, users have to be trained to interpose /cgi-bin/sbox in front of any URL that points to a CGI script. On Apache servers, an elegant solution to this problem is to use the mod_rewrite URL rewriting module to automatically add the /cgi-bin/sbox prefix to users' CGI URLs. For example, one could use a mod_rewrite URL rewrite rule to transform URLs of the form:
into URLs of the form:
by adding these directives to Apache's configuration file:
In order to perform its suid(), sgid(), and chroot() functions, sbox must run with superuser privileges. This means that, like cgiwrap and suEXEC, it must be installed set-user-id to root. This fact should give any cautious Unix system administrator pause. However, sbox consists of only 700 lines of C code, all of which are available for public scrutiny. sbox is careful to avoid using static buffers and string copy operations that could cause a buffer overflow. It also checks its environment at startup time to confirm that it was invoked by the web server and not some other local user.
Conclusions
The sbox wrapper increases the security of web sites that need to run untrusted CGI scripts. It prevents different users' CGI scripts from interfering with each other by running each user's program under distinct user and group IDs. It prevents user-maintained scripts from accessing sensitive parts of the file system by running each script in a change-root directory. It lessens the impact of denial of service attacks by establishing per-process resource limits, and it avoids certain common misconfigurations by checking the environment for consistency before it launches the target CGI script. Lastly, it creates an audit trail that can be used to track down malicious or poorly implemented CGI scripts.
sbox is not a panacea for CGI woes. There are a variety of CGI-based attacks that sbox cannot prevent. Chief among these are network-based attacks. For example, if a CGI script can be tricked into probing a firewall system from within the protected network, there is nothing that sbox can do to prevent this type of attack. To completely insulate the user's environment from that of the host, you need to step out of the Unix domain and use a partitioned operating system, such as Hewlett Packard's VirtualVault technology [Hewlett Packard 1998].
Finally, it is important to remember that the sbox wrapper alone won't make a Web site secure.
CGI script precautions are just one component of a carefully considered site security policy that includes attention to operating system security, web server configuration, operating and backup procedures, and user education. While nothing is ever going to completely eliminate the risk of running untrusted CGI scripts on a Web server, the sbox wrapper does go a long way towards limiting the potential damage that poorly-written or malicious scripts can inflict.
Bibliography
This paper was originally published in the
Proceedings of the 1999 USENIX Annual Technical Conference, June 6-11, 1999, Monterey, California, USA
Last changed: 1 Mar 2002 ml