|
Perl Practicum: Network Wiles
(Part III)
by Hal Pomeranz
This is the last of three articles dealing with network programming in Perl.
Strictly speaking, this article is not about network programming, per se, but
rather deals with a pair of system calls that are often used in network servers.
This article starts from where I left off in
Part II,
so a quick reread is in order if you are hazy on any of the concepts presented there.
Doing Several Things At Once
Last issue presented the typical loop a network server uses for handling pending
network connections:
|
for ( ;; ) {
$remote_host = accept(NEWSOCK, SOCK);
die "accept() error: $!\n" unless ($remote_host);
# do some work here
close(NEWSOCK);
}
|
Unfortunately, during the "do some work here" phase, the network server is not
handling requests that are queueing. Heavily loaded Web servers can get
hundreds of requests per second, so handling them all in a serial fashion
is unworkable. In a perfect world, the server would spend all of its time doing
accept() s and let one or more other processes handle all of the
specific requests for information in parallel.
UNIX-like systems support the fork() system call for creating new processes.
The fork() call causes the program to create an exact duplicate of itself - all
data structures and file handles (and buffered output) are copied to the new
process and both processes continue execution from the point in the program at
which fork() was called. The only difference between the two processes is that
fork() returns zero to the new process (referred to as the "child") while the
original process (the "parent") gets the process id number of the child process.
The usual application of fork() in network daemons is to have the parent
process call accept() and then immediately call fork() .
The parent goes back to the next accept() while the child handles
the information request from the remote machine and then exits once the request
has been completed. In Perl:
|
for ( ;; ) {
$remote_host = accept(NEWSOCK, SOCK);
die "accept() error: $!\n" unless ($remote_host);
# We're the parent if fork() returns non-zero
last unless (fork());
close(NEWSOCK);
}
# We've fallen out of the loop, we're the child.
# Do work here...
|
It is the child process that falls out of the for loop
(i.e., when fork() returns
zero). Otherwise, the parent closes the socket that was created for the pending
request and jumps back up to the top of the loop and waits for the next
accept() . It is safe for the parent to close NEWSOCK
since the child has its own copy, which is still open.
The child continues executing the same code that used to live inside the for loop
as shown in the last article (in ;login:
Vol. 21 No. 5, October 1996) - a complete
copy of the code for the mini Web server is reproduced at the end of this article.
Once the request has been serviced, the child exits, but it does not die completely
silently. When a child process exits, the parent is notified and must acknowledge
the child before the child process can be terminated. If the parent does not
acknowledge the child, then the child process stays around in limbo until the
parent process exits (such stuck child processes are referred to as "zombies").
The parent receives its notification through the UNIX signal(3) interface.
Specifically, every time a child exits, the parent receives a SIGCHLD signal and must
respond in some fashion. Perl uses the special variable %SIG to define
how a process will respond to a given signal: the keys of %SIG are
the signal names, and the values are the name of a subroutine to call when the signal is received.
The easiest response is to simply ignore the signal. Using the keyword
"IGNORE" instead of a subroutine name as a value in the %SIG array causes the
process to acknowledge but discard any occurrences of the given signal:
|
$SIG{"CHLD"} = "IGNORE";
|
at the top of your program will cause child processes to exit silently.
Protecting Your Data
In the last issue, our server responded to requests with:
|
if (open(FILE, "< $docroot$path")) {
@lines = <FILE>;
print NEWSOCK @lines;
close(FILE);
}
|
where $docroot was defined at the top of the script, and $path
was the document pathname in the HTTP request. We noted that users could request
|
../../../../../../../etc/passwd
|
and get a copy of the system password file. Ideally, the server should allow only
remote users to get files that are located under $docroot .
Most UNIX-like systems support the chroot() system call, which enables a
process to restrict the directories that can be accessed. The chroot() call takes a
directory name as an argument and effectively makes that directory the root of
the filesystem for the process. If our Web server were to
|
chroot($docroot);
|
and then get the request shown above, the remote user would get only
$docroot/etc/passwd .
There are, however, a couple of problems with using chroot() . First, only
processes running as the superuser can call chroot() . Although working in a
chroot -ed environment is very secure, running all of your network servers as
root is not. The usual workaround is for the process to give up superuser privileges
as soon as it has performed the chroot() call - usually becoming a system
user with no privileges like the "nobody" user. This is easily accomplished in
Perl:
|
$user = "nobody";
unless ($uid = (getpwnam($user))[2]) {
die "Attempt to run server as non-existent or superuser\n";
}
# [...] stuff happens [...]
# chroot() to docroot and then change our effective userid
#
chroot($docroot) || die "chroot() failed: $!\n";
$> = $uid;
|
The special variable $> is the effective userid of the process. Perl programs that
are running as the superuser may change this variable to change their privilege
level.
The second problem with chroot() is that, once the process has performed the
chroot() , it can no longer access system configuration files or system devices
that live outside the chroot -ed filesystem. This is why anonymous FTP servers
require the administrator to create scaled-down copies of various system files in
the anonymous FTP area: the FTP server does a chroot() before giving
anonymous users access.
This really is not a big problem for our Web server, because the files we need to
access are under $docroot . However, the server wants to try and look up the
hostname of the remote host for logging purposes, and this step must be per
formed before the chroot() or the lookup will fail.
|
$user = "nobody";
unless ($uid = (getpwnam($user))[2]) {
die "Attempt to run server as non-existent or superuser\n";
}
# [...] stuff happens [...]
# Resolve hostname of remote machine before calling chroot()
#
$raw_addr = (unpack("S n a4 x8", $remote_host))[2];
$dot_addr = join(".", unpack("C4", $raw_addr));
$name = (gethostbyaddr($raw_addr, AF_INET))[0];
# chroot() to docroot and then change our effective userid
#
chroot($docroot) || die "chroot() failed: $!\n";
$> = $uid;
|
Not all network servers are amenable to running in a chroot -ed environment,
but consider the option when developing your own servers.
That's It!
Network programming is fun once you get the hang of it. Feel free to take this
basic framework and adapt it for your own needs. One improvement that could be
made is to use syslog() instead of print() for outputting informational and
error messages. Typically, nothing is watching the standard output of network
daemons, so syslog is a more appropriate mechanism. See Syslog.pm in the
Perl lib directory for more details.
|
#!/usr/local/bin/perl
use Socket;
$docroot = "/home/hal/public_html";
$this_host = "my-server.netmarket.com";
$port = 80;
$user = "nobody";
# Let children perish
#
$SIG{"CHLD"} = "IGNORE";
# Get userid for $user. Abort if userid is zero (superuser) or
# non-existent.
#
unless ($uid = (getpwnam($user))[2]) {
die "Attempt to run server as non-existent or superuser\n"; }
# Initialize C structure
#
$server_addr = (gethostbyname($this_host))[4];
$server_struct = pack("S n a4 x8", AF_INET, $port,
$server_addr);
# Set up socket
#
$proto = (getprotobyname("tcp"))[2];
socket(SOCK, PF_INET, SOCK_STREAM, $proto) ||
die "Failed to initialize socket: $!\n";
# Bind to address/port and set up pending queue
#
setsockopt(SOCK, SOL_SOCKET, SO_REUSEADDR, 1) ||
die "setsockopt() failed: $!\n";
bind(SOCK, $server_struct) || die "bind() failed: $!\n";
listen(SOCK, SOMAXCONN) || die "listen() failed: $!\n";
# Deal with requests
#
for ( ;; ) {
# Grab next pending request
#
$remote_host = accept(NEWSOCK, SOCK);
die "accept() error: $!\n" unless ($remote_host);
# We're the parent if fork() returns non-zero
#
last unless (fork());
close(NEWSOCK);
}
# *** If we've fallen out of the loop, then we're the child. ***
# Close master socket
#
close(SOCK);
# Resolve hostname of remote machine before calling chroot()
#
$raw_addr = (unpack("S n a4 x8", $remote_host))[2];
$dot_addr = join(".", unpack("C4", $raw_addr));
$name = (gethostbyaddr($raw_addr, AF_INET))[0];
# chroot() to docroot and then change our effective userid
#
chroot($docroot) || die "chroot() failed: $!\n";
$> = $uid;
# Read client request and get $path
#
while (<NEWSOCK>) {
last if (/^\s*$/);
next unless (/^GET /);
$path = (split(/\s+/))[1];
}
# Print a line of logging info to STDOUT
#
print "$dot_addr\t$name\t$path\n";
# Respond with info or error message
#
if (open(FILE, "< $path")) {
@lines = <FILE>;
print NEWSOCK @lines;
close(FILE);
}
else {
print NEWSOCK <<"EOErrMsg";
<Network Wiles (Part III)>Error</TITLE><H2>Error</H2>
The following error occurred while trying to retrieve your
information:
$! EOErrMsg
}
# All done
#
close(NEWSOCK);
|
Reproduced from ;login: Vol. 22 No. 1, February 1997.
|