

Perl Practicum: You Say `rsh' and I Say `remsh'

by Hal Pomeranz

Last time I showed how Perl can emulate many of the more common UNIX filters and information gathering tools. While you spend some time "reinventing the wheel," the payback is a much more portable script. At times though, you simply have to invoke an operating system command. This is where you start running into real portability problems. What directory does the command live in and what arguments does it take? In some cases, even the name of the command is different - on some System V machines, rsh is the restricted shell and remsh executes a command on a remote machine. This column is dedicated to helping you navigate the morass of UNIX dialects and end up with a Perl script that will run on most of them. This column is a little short on Perl, so if you are not interested in maintaining scripts across multiple architectures, then it may not be the article for you.

Where Am I?

The first trick is finding out your machine's architecture. Many systems implement a /bin/arch command which gives some sort of unique identifier to indicate what manufacturer and what architecture type your script is running on. The arch command does not give operating system revision information (sometimes useful), however, and is far from universal. Your best bet is to try /bin/uname.

You can get a machine's hostname, OS release, and hardware architecture with the following command:

     ($host, $os, $arch) = split(/\s+/, `/bin/uname -nrm`);

This works on every UNIX machine I have ever used except Convex machines, which for some strange reason simply do not implement uname. For machines with no uname command, you will just have to build up a list of special cases based on /bin/hostname. If you have a lot of special case machines, you could build up a static associative array by hostname:

     $ENV{'PATH'} = "/bin:/usr/bin";
     %machines = ("convex1", "ConvexOS: Convex", ...);
     ($host, $os, $arch) = split(/\s+/, `/bin/uname -nrm`);
     unless ($host) {
          chop($host = `hostname`);
          ($os, $arch) = split(/:\s*/, $machines{$host});
          die "Unknown: $host\n" unless ($os);
     }

Note that I left the call to the hostname command as a relative path (hostname can live in /bin or /usr/bin depending upon the flavor of UNIX you are using, but I have never found it anywhere else). If you are going to use relative paths in your script, make sure you set the $PATH variable in your environment to a list of known "safe" directories or you will be susceptible to Trojan Horse programs. Never have the current directory (".") or a user home directory in $PATH.
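
The same lookup-with-fallback logic can be wrapped in a subroutine so that it can be exercised without running any commands. What follows is only a minimal sketch in Perl 5 syntax, not code from the original column; the name identify_host and the contents of %machines are hypothetical.

```perl
use strict;
use warnings;

# Restrict PATH to known-safe directories before running any command
# by a relative name.
$ENV{'PATH'} = '/bin:/usr/bin';

# Hypothetical table for machines without uname: hostname => "os:arch".
my %machines = ('convex1' => 'ConvexOS:Convex');

# Takes the raw output of "uname -nrm" and of "hostname" (either may be
# an empty string) and returns ($host, $os, $arch), or an empty list if
# the machine cannot be identified.
sub identify_host {
    my ($uname_out, $hostname_out) = @_;
    my ($host, $os, $arch) = split ' ', $uname_out;
    return ($host, $os, $arch) if defined $arch;

    # uname gave nothing usable: fall back to the static table.
    ($host) = split ' ', $hostname_out;
    return () unless defined $host && exists $machines{$host};
    ($os, $arch) = split /:/, $machines{$host};
    return ($host, $os, $arch);
}

# In a real script the arguments would come from the commands themselves:
#   my @id = identify_host(scalar `/bin/uname -nrm`, scalar `hostname`);
#   die "Unknown machine\n" unless @id;
```

Keeping the command invocations outside the subroutine means the parsing and the table lookup can be checked with canned strings on any machine.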

The problem now is that the $os and particularly the $arch values are strange text strings that were meaningful to the vendor, but are not necessarily intuitive to humans. For example, on SGI machines $arch will be something like "IP\d+" while Amdahls return numeric codes like "580." You will just have to survey all your machines to know exactly what values to expect.

Once you have identified your machine type, you can choose appropriate defaults and then modify them per architecture and OS release:

     $bigwords = 0;
     $gooduucp = 1;
     $confdir = "/etc";
     $ps = "ps -e";

     if ($arch =~ /^sun/) {
          $ps = "ps -ax";
          $gooduucp = 0 unless ($os =~ /^4\./);
     }
     elsif ($arch =~ /^IP\d+$/) {
          $confdir = "/usr/etc";
     }
     elsif ($arch =~ /^CRAY/) {
          $bigwords = 1;
     }
          :
          :
     else {
          die "$host: unknown arch $arch\n";
     }

Suns use the Berkeley-style ps command (unless you are running Solaris 2.x - check $os). Older Suns use a brain-damaged UUCP. SGI machines put some of their configuration files in /usr/etc instead of /etc. Crays have big words, so we need to be careful with bit-shifting operations. It is a good idea to trap unrecognized architectures.
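
One way to make the conditional above reusable is to return the settings from a subroutine instead of assigning to globals. This is a sketch only, written in Perl 5 syntax; the name settings_for and the hash keys are hypothetical, and the per-architecture values simply mirror the example above.

```perl
use strict;
use warnings;

# Returns a hash of settings for the given uname values, or dies on an
# unrecognized architecture.
sub settings_for {
    my ($host, $os, $arch) = @_;

    # Defaults, overridden per architecture and OS release below.
    my %conf = (
        bigwords => 0,
        gooduucp => 1,
        confdir  => '/etc',
        ps       => 'ps -e',
    );

    if ($arch =~ /^sun/) {
        $conf{ps} = 'ps -ax';
        $conf{gooduucp} = 0 unless $os =~ /^4\./;
    }
    elsif ($arch =~ /^IP\d+$/) {
        $conf{confdir} = '/usr/etc';
    }
    elsif ($arch =~ /^CRAY/) {
        $conf{bigwords} = 1;
    }
    else {
        die "$host: unknown arch $arch\n";
    }
    return %conf;
}

# Example call with made-up uname values.
my %conf = settings_for('myhost', '4.1.3', 'sun4c');
print "ps command: $conf{ps}\n";
```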

Doing it Once

If you have a large number of Perl scripts, it may become cumbersome to repeat this same conditional over and over again. There are several ways to approach this problem.

One choice is to implement a "universal" configuration by creating a giant conditional which properly sets defaults for all of your Perl scripts. Place this file in the same location on all of your machines, and your scripts can use the file either with require or

     eval { do "$configfile"; };
     die "Error in $configfile:\n$@" if ($@);

Remember that if you use require, the last statement in the file must evaluate to true. Most packages simply put
     1;

as the last line of the file.
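
To make the require variant concrete, here is a small self-contained sketch in Perl 5 syntax. It writes a stand-in configuration file to a temporary location (using the File::Temp module) purely so the example can run anywhere; the variable names $PS and $CONFDIR and the file contents are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a stand-in configuration file. Note the final "1;" so that
# require sees a true value.
my ($fh, $configfile) = tempfile(UNLINK => 1);
print $fh <<'END';
$PS      = "ps -e";
$CONFDIR = "/etc";
1;
END
close $fh or die "close $configfile: $!\n";

# The file assigns to package globals, so declare them for strict.
our ($PS, $CONFDIR);

# require dies if the file is missing, fails to compile, or does not
# end with a true value; the eval traps that so it can be reported.
eval { require $configfile; 1 } or die "Error in $configfile:\n$@";

print "ps command: $PS\n";
```

Unlike do, require loads a given file only once per program run, which is usually what you want for configuration.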

If you have many architectures and many Perl scripts, the conditional can become quite large. On the other hand, you only have to maintain a single file, and it is quite straightforward to bring in a new architecture and port all of your scripts in one fell swoop.

A second alternative is to have a configuration file per individual machine located someplace like /etc. You can then use simple assignments rather than having a large conditional. While this may seem like a great deal of effort, chances are you will only have one file per architecture, or perhaps a few per architecture if you have wildly varying OS releases installed. You can distribute the "master" files from a central location to individual machines using something like rdist. You might even consider writing a "meta-configurer" script which would run out of cron and automatically build configuration files for each machine (a similar program for Bourne shell scripts was presented by Bob Arnold at LISA V [1]).
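
The core of such a "meta-configurer" could be a routine that turns a set of settings into a file of simple assignments. The following is a rough sketch in Perl 5 syntax, not a description of any existing tool; the name render_config and the choice to emit uppercase variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Renders name/value pairs as Perl assignments, one per line, followed
# by "1;" so the result can be loaded with require.
sub render_config {
    my (%conf) = @_;
    my $text = '';
    foreach my $name (sort keys %conf) {
        # Escape single quotes and backslashes so the value survives
        # inside a single-quoted Perl string.
        (my $value = $conf{$name}) =~ s/(['\\])/\\$1/g;
        $text .= sprintf("\$%s = '%s';\n", uc $name, $value);
    }
    return $text . "1;\n";
}

# Example with made-up settings; a cron job would write this text to
# the per-machine file instead of printing it.
print render_config(ps => 'ps -e', confdir => '/etc');
```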

A third approach is really just an amalgam of previous ideas. Place architecture/OS specific information in separate files, but in a single location available to all machines. By naming the files appropriately, it is easy for your scripts to grab the right one:

     $configdir = "/usr/local/configs";
     ($host, $os, $arch) = split(/\s+/, `/bin/uname -nrm`);
     die "$host: no config file $arch.$os\n" unless (-f "$configdir/$arch.$os");
     eval { do "$configdir/$arch.$os"; };
     die "$host: config error:\n$@" if ($@);

In this case, all config files are located in /usr/local/configs/ and are named by the strings returned as $arch and $os by the uname command.
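
The file selection and loading steps can also be split into two small subroutines. This is a sketch in Perl 5 syntax under a few assumptions: the names config_path and load_config are hypothetical, the path is absolute, and each configuration file is expected to end with a true value such as "1;".

```perl
use strict;
use warnings;

# Builds the per-platform file name, for example
# "/usr/local/configs/sun4c.4.1.3".
sub config_path {
    my ($configdir, $arch, $os) = @_;
    return "$configdir/$arch.$os";
}

# Loads the file with do, which reports failure through its return
# value, $@ (compile or run-time error) and $! (file could not be read).
sub load_config {
    my ($path) = @_;
    die "no config file $path\n" unless -f $path;
    my $result = do $path;
    die "config error in $path:\n$@" if $@;
    die "cannot read $path: $!\n" unless defined $result;
    return $result;
}

# Typical use (values would come from uname as shown earlier):
#   load_config(config_path('/usr/local/configs', $arch, $os));
```

Checking the return value of do as well as $@ catches the case where the file exists but cannot be read.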

Whatever method you choose, you must be extremely careful to avoid name collisions with variables in the scripts which pull in the configuration files. I tend to use lowercase variable names in the scripts and reserve all uppercase variables for configuration information.
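
Another way to avoid collisions, beyond a naming convention, is to give the configuration its own package. The sketch below uses Perl 5 syntax and stands in for a configuration file by declaring the package inline; the package name SiteConfig and its variables are hypothetical.

```perl
use strict;
use warnings;

# Stand-in for the contents of a configuration file: everything it
# defines lives in the SiteConfig package.
package SiteConfig;
our $PS      = 'ps -e';
our $CONFDIR = '/etc';

package main;

# The script's own variable of a similar name is untouched.
my $ps = 'my own value';
print "script: $ps, config: $SiteConfig::PS\n";
```

The cost is slightly longer names in every script; the benefit is that a new configuration variable can never silently clobber a script variable.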

Conclusion

This probably seems like a great deal of wasted effort if you are not a system administrator at a large site or only maintain one or two architectures, and you are absolutely right (but I did try to warn you way back in the first paragraph). If you are a large site administrator contending with a wealth of Perl code, these techniques can simplify your life immeasurably.


1. Arnold, Bob, "If You've Seen One UNIX, You've Seen Them All", LISA V Conference Proceedings, 1991.

Reproduced from ;login: Vol. 19 No. 1, February 1994.


Last changed: May 24, 1997 pc