Perl Practicum: Network Wiles
by Hal Pomeranz
    use Socket;

    $server = "www.netmarket.com";
    $port = 80;

    $server_addr = (gethostbyname($server))[4];
    $server_struct = pack("S n a4 x8", AF_INET, $port, $server_addr);

    $proto = (getprotobyname('tcp'))[2];
    socket(MYSOCK, PF_INET, SOCK_STREAM, $proto) ||
        die "Failed to initialize socket: $!\n";
    connect(MYSOCK, $server_struct) ||
        die "Failed to connect() to server: $!\n";
The first line of this example simply pulls in the Perl sockets
module. This module defines a number of useful constants that are
employed later in the program. Next come the name of the server that
this client will contact and the network port on which to talk to the
server. Port 80 happens to be the port that Web servers listen on, by
default. You might actually get these values passed into your program
as command line arguments, or this code might become part of a
function that gets these values as function arguments.
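As a rough sketch of the command line variant, the values could be pulled out of an argument list as shown here. The helper name parse_target, the usage message, and the fallback to www.netmarket.com when no arguments are given are assumptions made for illustration, not part of the original program.

```perl
use strict;
use warnings;

# parse_target is a hypothetical helper: it takes an argument list and
# returns (server, port), defaulting the port to 80.
sub parse_target {
    my @args = @_;
    my $server = shift @args;
    die "usage: $0 server [port]\n" unless defined $server;
    my $port = @args ? shift @args : 80;
    die "port must be a number\n" unless $port =~ /^\d+$/;
    return ($server, $port);
}

# Fall back to the article's server when no arguments are supplied.
my ($server, $port) = parse_target(@ARGV ? @ARGV : ("www.netmarket.com"));
print "would contact $server on port $port\n";
```

The same subroutine serves the "function arguments" case: call it with whatever list of values the surrounding code already has.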
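In order to be able to connect to the server, the program has to
translate the server's human readable name
(www.netmarket.com) into a network address. The
gethostbyname() function does the lookup, and element 4 of the list
it returns is the address in packed binary form.
The C structure that connect() expects is created by using this address and the
port number as arguments to pack().
With that messy structure built, the program calls socket() to create
its end of the connection. The first argument, MYSOCK, becomes the
file handle for the socket.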
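The layout of the address structure can differ between systems, so a hand-written pack() template is fragile. The Socket module also provides pack_sockaddr_in() and unpack_sockaddr_in(), which build and take apart the structure for you. The sketch below uses the loopback address 127.0.0.1 as a stand-in (an assumption, chosen so that no name lookup is needed); the four bytes that inet_aton() returns have the same form as the address taken from gethostbyname().

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa pack_sockaddr_in unpack_sockaddr_in);

my $port = 80;
my $addr = inet_aton("127.0.0.1");    # 4 packed bytes, no DNS lookup

# Build the structure that connect() expects, without a pack() template.
my $server_struct = pack_sockaddr_in($port, $addr);

# Take it apart again to confirm what went in.
my ($got_port, $got_addr) = unpack_sockaddr_in($server_struct);
printf "%s:%d\n", inet_ntoa($got_addr), $got_port;    # prints 127.0.0.1:80
```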
Finally, the transmission protocol is specified: the discussion of TCP
versus UDP is beyond the scope of this article, but TCP is always the
right thing to use unless you are very sure that it isn't.
With one end of the socket firmly in hand (again, as the file handle
MYSOCK), the program calls connect() to attach it to the server's address.
Using It
MYSOCK can now be treated just like any Perl file handle,
except that you can both read from and write to the same socket. In order
to save network and system resources, it is particularly important to
remember to close() sockets when you are done with them.
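To see two-way traffic on a single handle without needing a remote server, one option is a pair of connected local sockets. This is only an illustration under stated assumptions: it relies on socketpair() and AF_UNIX, so it needs a UNIX-like system, and the handle names $left and $right and the ping/pong messages are made up for the example.

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;

# Create two connected sockets in the same process.
socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!\n";
$left->autoflush(1);     # send each message immediately
$right->autoflush(1);

print $left "ping\n";    # write on one end ...
my $got = <$right>;      # ... read it on the other
print $right "pong\n";   # the same handle can also write back
my $reply = <$left>;

print "right end read: $got";
print "left end read: $reply";

# Release both ends when finished.
close($left);
close($right);
```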
Because this client has connected to the Web server (port 80, remember?) on www.netmarket.com, the client program can request an HTML document using the HTTP protocol:
    select(MYSOCK);
    $| = 1;
    select(STDOUT);
    print MYSOCK "GET /\n\n";
    while (<MYSOCK>) {
        print;
    }
    close(MYSOCK);
The first three lines turn off the standard I/O buffering on the
socket. When reading and writing from a file, it is usually most
efficient to do large reads or writes (read more data than needed or
save up a lot of small writes and do them all at once), and most UNIX
systems take care of doing this automatically. This behavior can,
however, be disabled - for example, on a network socket where the client
and server are passing short messages back and forth. The Perl
mechanism for turning off buffering is to set the $|
variable to be nonzero (it's zero by default). Setting this variable
affects only the currently selected file handle (STDOUT
is selected by default), so you have to select(MYSOCK),
set the variable, and then go back to the default of
STDOUT.
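The select-and-set dance can be wrapped in a small subroutine so that it is harder to forget the final step. The name unbuffer is an assumption for this sketch, and a pipe stands in for the network socket so the effect can be tried locally. Calling autoflush(1) on the handle, after loading IO::Handle, should have the same effect.

```perl
use strict;
use warnings;

# unbuffer is a hypothetical helper: it turns off output buffering on
# one handle and restores whichever handle was selected before.
sub unbuffer {
    my ($fh) = @_;
    my $previous = select($fh);   # select() returns the old handle
    $| = 1;                       # applies to $fh only
    select($previous);
    return $fh;
}

# A pipe stands in for the socket here.
pipe(my $reader, my $writer) or die "pipe failed: $!\n";
unbuffer($writer);

print $writer "hello\n";          # goes out at once, no close() needed
my $line = <$reader>;
print "read back: $line";
```

Without the call to unbuffer(), the short message would sit in the writer's buffer and the read would wait for data that never arrives.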
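That done, the client requests a file from the Web server using the
GET command. The example asks for the server's top-level document (/),
but the client could just as easily ask for any other path
(for example, /some/other/file.html).
The GET request is followed by two newlines.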
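A request string of this shape can be built for any path with a one-line subroutine. The name build_request and its default of "/" are assumptions made for this sketch; the format is simply the one used in the example program.

```perl
use strict;
use warnings;

# build_request is a hypothetical helper: it returns the request text
# for a path, in the same form the example program sends.
sub build_request {
    my ($path) = @_;
    $path = "/" unless defined $path && length $path;
    return "GET $path\n\n";       # request line plus the second newline
}

my $request = build_request("/some/other/file.html");
print $request;
```

In the client this would be used as print MYSOCK build_request($path), with $path taken from wherever the program gets its file name.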
Once the client makes its request, the server sends the contents of the requested file back down the socket (or an error message if the file was not found or some other error occurred). The standard HTTP protocol defines that when the server finishes sending the file, it hangs up its end of the connection - this causes the entire socket to be torn down. A client reading from a socket interprets this event just as if it had been reading from a file and reached the end-of-file marker. In the program above, the HTML document is simply being printed to the standard output.
Practicing It
The above example covers the basics of writing a network client program. There is a good deal of additional lore surrounding this subject, but there are a lot of people out there earning huge salaries who don't know anything more than what you have seen here. In the next article I will explore server programming by writing a simple Web server.
In the meantime, practice these concepts by taking the example above and writing a program that will take the server name, port number (default to port 80), and file name as command line arguments and fetch that file from the remote Web server. Impress your friends (and increase your productivity) by building a Web robot that surfs the Web for you by looking for HREF tags in the documents you download and then fetches those documents as well (making sure that you don't download the same document twice!). Now make sure the robot stops at some point, or you'll download the entire Web.
Reproduced from ;login: Vol. 21 No. 4, August 1996.
Last changed: May 24, 1997