Chapter 15 System Functions

CONTENTS

System Library Emulation Functions
Socket-Manipulation Functions
The UNIX System V IPC Functions
Summary
Q&A
Workshop
- Quiz
- Exercises

Today's lesson describes the built-in Perl functions that perform various system-level operations. These functions are divided into three groups:

The functions that emulate system library functions
The functions that work with Berkeley UNIX sockets
The functions that perform UNIX System V IPC operations

Many of the functions described in today's lesson use features of the UNIX operating system. If you are using Perl on a machine that is not running UNIX, some of these functions might not be defined or might behave differently.

Check the documentation supplied with your version of Perl for details on which functions are supported or emulated on your machine.

System Library Emulation Functions

Several built-in Perl functions enable you to execute various system library calls from within your Perl program. Each one corresponds to a UNIX system library function.

The following sections briefly describe these system library functions. For more information on a particular system library function, refer to the on-line manual page for that function. For example, to find out more about the getnetent function, refer to your UNIX system's getnetent manual page.

The `getgrent` Function

In the UNIX environment, each user belongs to a user group. Being in a user group enables you to define files that only certain users (the people in your user group) can read from or write to.

On UNIX systems, the file /etc/group lists the user groups defined for your machine. Each entry in the user group file consists of four components:

The user group name
The user group password, if one exists
The group ID, which is a unique integer that the system uses to identify this particular user group
A list of the user IDs that belong to this group

The Perl function getgrent enables you to retrieve an item from the user group file.

The syntax for the getgrent function is

(gname, gpasswd, gid, gmembers) = getgrent;

This function returns a four-element list consisting of the four components of a group line entry, as just described. gname contains the user group name, gpasswd contains the user group password, gid is the group ID, and gmembers is a character string consisting of a list of the user IDs belonging to this group. The user IDs listed in gmembers are separated by spaces.

Each call to getgrent returns another line from the /etc/group file. Therefore, you can put getgrent inside a while loop.

while (($gname, $gpasswd, $gid, $gmembers) = getgrent) {
        # do stuff here
}

When the /etc/group file is exhausted, getgrent returns the empty list.

Listing 15.1 is an example of a program that uses getgrent to list all the user IDs associated with each group on your system.

Listing 15.1. A program that uses getgrent.

1:  #!/usr/local/bin/perl
2:
3:  while (($gname, $gpasswd, $gid, $gmembers) = getgrent) {
4:          $garray{$gname} = $gmembers;
5:  }
6:  foreach $gname (sort keys (%garray)) {
7:          print ("Userids belonging to group $gname:\n");
8:          $gmembers = $garray{$gname};
9:          $userids = 0;
10:         while (1) {
11:                 last if ($gmembers eq "");
12:                 ($userid, $gmembers) =
13:                      split (/\s+/, $gmembers, 2);
14:                 printf ("  %-20s", $userid);
15:                 $userids++;
16:                 if ($userids % 3 == 0) {
17:                         print ("\n");
18:                 }
19:         }
20:         if ($userids % 3 != 0) {
21:                 print ("\n");
22:         }
23: }

$ program15_1
Userids belonging to group adm:
  adm                   daemon
Userids belonging to group develop:
  dave                  jqpublic              kilroy
  mpython               ralomar               xyzzy
Userids belonging to group root:
  root
$

Line 3 of this program calls getgrent. This function returns a four-element list whose elements are the components of a group entry stored in the /etc/group file. If /etc/group is exhausted, getgrent returns the empty list.

Line 4 takes the list of group members in $gmembers and stores it in an associative array named %garray. The subscript for this array element is the name of the group, which is contained in $gname.

Lines 6-23 print the list of user IDs for each group. The loop iterates once for each group name, and the call to sort in line 6 ensures that the group names appear in alphabetical order. First, line 7 prints the name of the group. Then, line 8 retrieves the list of user IDs in the group by accessing the associative array %garray. This list is stored, once again, in $gmembers.

Lines 12 and 13 call split to extract the next user ID from the list. split breaks the string into two parts when it sees the first white space. The first part, the substring before the first space, contains one user ID and is assigned to $userid; the rest of the string is reassigned to $gmembers.

The rest of the loop prints the extracted user ID. User IDs are printed three per line to save space.
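The splitting loop in Listing 15.1 can also be written more compactly. The following is only a sketch, assuming Perl 5; the hash name %members is hypothetical and is not part of the original listing. It splits each member string into a list in one step and prints three user IDs per line.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Map each group name to a reference to its list of member user IDs.
# (%members is a hypothetical name chosen for this sketch.)
my %members;
setgrent ();                    # start from the first entry
while (my ($gname, $gpasswd, $gid, $gmembers) = getgrent ()) {
    # split on ' ' skips leading white space and splits on runs of spaces
    $members{$gname} = [ split (' ', $gmembers) ];
}
endgrent ();

foreach my $gname (sort keys (%members)) {
    my @ids = @{ $members{$gname} };
    print ("Userids belonging to group $gname:\n");
    # remove up to three user IDs at a time and print them on one line
    while (my @row = splice (@ids, 0, 3)) {
        print (map { sprintf ("  %-20s", $_) } @row);
        print ("\n");
    }
}
```

Storing the members as a list avoids the repeated two-way split, at the cost of holding one array per group in memory.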

The `setgrent` and `endgrent` Functions

The setgrent function affects the behavior of getgrent: it tells the Perl interpreter to rewind the /etc/group file. After setgrent is called, the next call to getgrent retrieves the first element of the /etc/group file.

The endgrent function tells the Perl interpreter that you no longer need to access the /etc/group file. It frees the memory used to store group information.

Neither setgrent nor endgrent accepts any arguments or returns any values.

The syntax for these functions is

setgrent();
endgrent();
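To see the effect of rewinding, consider the following minimal sketch (assuming Perl 5 and a readable group file). It reads the first group entry, skips ahead, rewinds with setgrent, and reads again. The variable names are illustrative only.

```perl
use strict;
use warnings;

setgrent ();
my ($first)  = getgrent ();     # name of the first group, if any
my @skipped  = getgrent ();     # advance past one more entry
setgrent ();                    # rewind to the start of the group file
my ($again)  = getgrent ();     # should be the first group once more
endgrent ();                    # finished with the group file

if (defined ($first)) {
    print ("First group: $first; after rewinding: $again\n");
} else {
    print ("No group entries are available on this machine.\n");
}
```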

The `getgrnam` Function

The getgrnam function enables you to retrieve the group file entry corresponding to a particular group name.

The syntax for the getgrnam function is

(gname, gpasswd, gid, gmembers) = getgrnam (name);

Here, name is the group name to search for. getgrnam returns the same four-element list that getgrent returns: gname is the group name (which is the same as name), gpasswd is the group password, gid is the group ID, and gmembers is the list of user IDs in the group. If getgrnam does not find a group entry matching name, it returns the empty list.

Listing 15.2 is a modification of Listing 15.1. It asks you for a group name and then prints the user IDs in that group.

Listing 15.2. A program that uses getgrnam.

1:  #!/usr/local/bin/perl
2:
3:  print ("Enter the group name to list:\n");
4:  $name = <STDIN>;
5:  chop ($name);
6:  if (!(($gname, $gpasswd, $gid, $gmembers) = getgrnam ($name))) {
7:          die ("Group $name does not exist.\n");
8:  }
9:  $userids = 0;
10: while (1) {
11:         last if ($gmembers eq "");
12:         ($userid, $gmembers) = split (/\s+/, $gmembers, 2);
13:         printf ("  %-20s", $userid);
14:         $userids++;
15:         if ($userids % 3 == 0) {
16:                 print ("\n");
17:         }
18: }
19: if ($userids % 3 != 0) {
20:         print ("\n");
21: }

$ program15_2
Enter the group name to list:
develop
  dave                  jqpublic              kilroy
  mpython               ralomar               xyzzy
$

Line 6 takes the group name stored in $name and passes it to getgrnam. If a group corresponding to that name exists, getgrnam returns the name, password, group ID, and members. If no such group exists, getgrnam returns the empty list, the conditional expression in line 6 fails, and line 7 calls die to terminate the program.

The rest of the program is taken verbatim from Listing 15.1: the while loop in lines 10-18 extracts a user ID from the list of user IDs in $gmembers and prints it, continuing until the list is exhausted.

The `getgrgid` Function

The getgrgid function is similar to getgrnam, except that it retrieves the group file entry corresponding to a given group ID.

The syntax for the getgrgid function is

(gname, gpasswd, gid, gmembers) = getgrgid (id);

Like getgrnam, getgrgid returns a four-element list consisting of the group name, password, ID, and member list. If the group specified by id does not exist, getgrgid returns the empty list.

This function often is used to retrieve the associated group name:

($gname) = getgrgid (11);

This line retrieves the group name associated with group ID 11. (The other elements of the list are thrown away.)

The parentheses around $gname denote that getgrgid is assigning to a list. The statement

$gname = getgrgid (11);

calls getgrgid in a scalar context instead. In Perl 5, getgrgid called in a scalar context returns just the group name, so this statement also assigns the name to $gname; it does not assign the length of the list. To retrieve any other element, such as the member list, you must assign to a list.
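The difference between the two calling contexts is easy to check. Perl's built-in function for looking up a group by ID is spelled getgrgid. The sketch below assumes Perl 5 and uses the POSIX module's getgid function to obtain a group ID that is likely to exist (the real group ID of the running program). The variable names are hypothetical.

```perl
use strict;
use warnings;
use POSIX ();

my $gid = POSIX::getgid ();          # real group ID of this process

my ($from_list) = getgrgid ($gid);   # list context: first element is the name
my $from_scalar = getgrgid ($gid);   # scalar context: the name, in Perl 5

if (defined ($from_list)) {
    print ("Group ID $gid is named $from_list\n");
    # Looking the name up with getgrnam should lead back to a group ID.
    my $gid_again = (getgrnam ($from_list))[2];
    print ("getgrnam reports group ID $gid_again for $from_list\n");
} else {
    print ("No group entry exists for group ID $gid\n");
}
```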

The `getnetent` Function

The getnetent function enables you to step through the file /etc/networks, which lists the names and addresses of the networks your machine is on.

The syntax for the getnetent function is

(name, altnames, addrtype, net) = getnetent();

name is the name of a network. altnames is a list of alternative names for the network; this list of names is returned as a character string, with spaces separating the individual names. addrtype is the address type; at present, this is always whatever value is defined for the system constant AF_INET, which indicates that the address is an Internet address.

NOTE

To get the value of AF_INET on your machine, refer to the header file /usr/include/netdb.h or /usr/include/bsd/netdb.h, and look for a statement similar to the following:

#define AF_INET 2

The number that appears after AF_INET is the one you want.
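If you are running Perl 5, you may not need to search the header files at all: the standard Socket module exports AF_INET as a constant. The following sketch, which assumes that the Socket module is installed with your copy of Perl, prints the value.

```perl
use strict;
use warnings;
use Socket;                  # exports AF_INET, among other constants

my $af_inet = AF_INET;
print ("AF_INET is $af_inet on this machine\n");
```

Using the constant by name also keeps a program portable to machines on which AF_INET is not 2.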

net is the Internet address of this network. This address is represented as a string of four bytes, which can be unpacked into Perl scalar values using the unpack function.

Listing 15.3 shows how you can use getnetent to list the network names and addresses at your site.

Listing 15.3. A program that uses getnetent.

1:  #!/usr/local/bin/perl
2:
3:  print ("Networks this machine is connected to:\n");
4:  while (($name, $altnames, $addrtype, $rawaddr) = getnetent()) {
5:          @addrbytes = unpack ("C4", $rawaddr);
6:          $address = join (".", @addrbytes);
7:          print ("$name, at address $address\n");
8:  }

$ program15_3
Networks this machine is connected to:
silver, at address 192.75.236.168
$

Line 4 calls getnetent, which reads from the file /etc/networks. If the file has been exhausted, getnetent returns the empty list, and the while loop terminates. If /etc/networks still contains an unread entry, getnetent retrieves it and assigns its components to $name, $altnames, $addrtype, and $rawaddr.

$rawaddr contains the Internet address for a particular network. This address is stored as a four-byte integer; each byte contains one component of the address. (This method works because each number in an Internet address has a maximum value of 255, which is the largest value that can fit in a byte.) Line 5 converts this four-byte integer into a list of integers by calling unpack, and it stores the list in @addrbytes.

Line 6 calls join to convert the list of integers into a character string that contains the readable address. Line 7 then prints the network name and the readable address of the network.

The `getnetbyaddr` Function

The getnetbyaddr function enables you to retrieve the entry in /etc/networks that matches a particular network number.

The syntax for the getnetbyaddr function is

(name, altnames, addrtype, addr) = getnetbyaddr (inaddr, inaddrtype);

Here, inaddr is the network number or address for which you want to search. This address must be a packed four-byte integer whose four bytes are the four components of the address. (An example of a network address is 192.75.236.168, which is the address of the machine on which I work.) To build a packed address, use the pack function:

@addrbytes = (192, 75, 236, 168);
$packedaddr = pack ("C4", @addrbytes);

The packed address in $packedaddr can now be passed to getnetbyaddr.
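The packing step can be checked on its own. The sketch below (assuming Perl 5 with the standard Socket module) packs the four components, unpacks them again, and compares the result with the packed form that the Socket function inet_aton builds from a dotted string. The variable names $readable and $same are illustrative.

```perl
use strict;
use warnings;
use Socket;

my @addrbytes  = (192, 75, 236, 168);
my $packedaddr = pack ("C4", @addrbytes);     # four bytes, one per component

# unpack reverses the conversion
my $readable = join (".", unpack ("C4", $packedaddr));
print ("$readable\n");                        # prints 192.75.236.168

# inet_aton builds the same packed form directly from the dotted string
my $same = (inet_aton ("192.75.236.168") eq $packedaddr);
print ($same ? "identical\n" : "different\n");
```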

inaddrtype is the address type, which is always AF_INET (whose value is located in the file /usr/include/netdb.h or /usr/include/bsd/netdb.h).

The getnetbyaddr function returns the same four-element list as getnetent: the name of the network, the list of alternative names, the address type, and the packed address.

The `getnetbyname` Function

The getnetbyname function is similar to getnetbyaddr, except that it enables you to search in the /etc/networks file for a network of a particular name.

The syntax for the getnetbyname function is

(name, altnames, addrtype, net) = getnetbyname (inname);

Here, inname is the network name to search for. Like getnetbyaddr and getnetent, getnetbyname returns a four-element list consisting of the network name, alternative name list, address type, and packed address.

NOTE

You can pass getnetbyname either the principal network name or one of its aliases.

The `setnetent` and `endnetent` Functions

The setnetent function rewinds the /etc/networks file; after setnetent has been called, a call to getnetent returns the first entry in the /etc/networks file.

The syntax for the setnetent function is

setnetent (keepopen);

keepopen is a scalar value. If keepopen is not zero, the /etc/networks file is not closed after getnetbyname or getnetbyaddr is called; therefore, you can efficiently call these functions repeatedly. If keepopen is zero, the file is closed.

The endnetent function tells the Perl interpreter that your program is finished with the /etc/networks file. It closes the file and frees any memory used by your program to store related information.

The syntax for the endnetent function is

endnetent;

It accepts no arguments and returns no values.

The `gethostbyaddr` Function

The gethostbyaddr function searches the file /etc/hosts (or the equivalent name server) for the host name corresponding to a particular Internet address.

The syntax for the gethostbyaddr function is

(name, altnames, addrtype, len, addrs) = gethostbyaddr (inaddr, inaddrtype);

This function requires two arguments. The first, inaddr, is the Internet address to search for, stored in packed four-byte format (identical to that used by getnetbyaddr). The second argument, inaddrtype, is the address type; at present, only Internet address types are understood, and inaddrtype is always AF_INET. (The value of AF_INET can be found in /usr/include/netdb.h or /usr/include/sys/netdb.h.)

gethostbyaddr returns a five-element list. The first element, name, is the host name corresponding to the Internet address specified by inaddr. altnames is the list of aliases or alternative names by which the host can be referred. addrtype, like inaddrtype, is always AF_INET.

addrs is a list of addresses (main address and alternatives) corresponding to the host node named name. Each address is stored as a four-byte integer. len is the length, in bytes, of each address in addrs; for AF_INET addresses, this length is always four.

Listing 15.4 shows how you can use gethostbyaddr to retrieve the machine name corresponding to a particular Internet address.

Listing 15.4. A program that uses gethostbyaddr.

1:  #!/usr/local/bin/perl
2:
3:  print ("Enter an Internet address:\n");
4:  $machine = <STDIN>;
5:  $machine =~ s/^\s+|\s+$//g;
6:  @bytes = split (/\./, $machine);
7:  $packaddr = pack ("C4", @bytes);
8:  if (!(($name, $altnames, $addrtype, $len, @addrlist) =
9:          gethostbyaddr ($packaddr, 2))) {
10:         die ("Address $machine not found.\n");
11: }
12: print ("Principal name: $name\n");
13: if ($altnames ne "") {
14:         print ("Alternative names:\n");
15:         @altlist = split (/\s+/, $altnames);
16:         for ($i = 0; $i < @altlist; $i++) {
17:                 print ("\t$altlist[$i]\n");
18:         }
19: }

$ program15_4
Enter an Internet address:
128.174.5.59
Principal name: ux1.cso.uiuc.edu
$

The program starts by prompting you for an Internet address. (In this example, the Internet address specified is 128.174.5.59, which is the location of a popular public access Gopher site.) Lines 5-7 then convert the address into a four-byte packed integer, which is stored in $packaddr.

Lines 8 and 9 call gethostbyaddr. This function searches the /etc/hosts file for an entry matching the specified Internet address. If the entry is not found, the conditional expression becomes false, and line 10 calls die to terminate the program.

NOTE

Line 9 uses the value 2 as the address type to pass to gethostbyaddr. If your machine defines a different value of AF_INET, as defined in the files /usr/include/netdb.h or /usr/include/bsd/netdb.h, replace 2 with that value.

If the entry is found, line 12 prints the principal machine name, which was returned by gethostbyaddr and is now stored in the scalar variable $name. Line 13 then checks whether the returned entry lists any alternative machine names corresponding to this Internet address.

If alternative machine names exist, lines 14-18 split the alternative name list into individual names and print each name on a separate line.

NOTE

gethostbyaddr and the other functions that access /etc/hosts expect the following format for a host entry:

address mainname altname1 altname2 ...

Here, address is an Internet address; mainname is the name associated with the address; and altname1, altname2, and so on are the (optional) alternative names for the host.

If your /etc/hosts file is in a different format, gethostbyaddr might not work properly.
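The address look-up can be wrapped in a small subroutine. The following is a minimal sketch, assuming Perl 5 with the standard Socket module, which supplies both inet_aton and the AF_INET constant. The subroutine name look_up_address is hypothetical, and the address 127.0.0.1 is used only because most machines have an entry for it.

```perl
use strict;
use warnings;
use Socket;

# look_up_address is a hypothetical helper: it returns the principal host
# name for a dotted address, or undef if the address cannot be packed or
# no host entry matches it.
sub look_up_address {
    my ($dotted) = @_;
    my $packed = inet_aton ($dotted);
    return undef unless (defined ($packed));
    my ($name) = gethostbyaddr ($packed, AF_INET);
    return $name;
}

my $name = look_up_address ("127.0.0.1");
print (defined ($name) ? "127.0.0.1 is $name\n"
                       : "127.0.0.1 has no host entry\n");
```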

The `gethostbyname` Function

The gethostbyname function is similar to gethostbyaddr, except that it searches for an /etc/hosts entry that matches a specified machine name or Internet site name.

The syntax for the gethostbyname function is

(name, altnames, addrtype, len, addrs) = gethostbyname (inname);

Here, inname is the machine name or Internet site name to search for. gethostbyname, like gethostbyaddr, returns a five-element list consisting of the machine name, a character string containing a list of alternative names, the address type, the address length, and the address list.

Listing 15.5 is a simple program that searches for an Internet address when given the name of a site.

Listing 15.5. A program that uses gethostbyname.

1:  #!/usr/local/bin/perl
2:
3:  print ("Enter a machine name or Internet site name:\n");
4:  $machine = <STDIN>;
5:  $machine =~ s/^\s+|\s+$//g;
6:  if (!(($name, $altnames, $addrtype, $len, @addrlist) =
7:          gethostbyname ($machine))) {
8:          die ("Machine name $machine not found.\n");
9:  }
10: print ("Equivalent addresses:\n");
11: for ($i = 0; $i < @addrlist; $i++) {
12:         @addrbytes = unpack("C4", $addrlist[$i]);
13:         $realaddr = join (".", @addrbytes);
14:         print ("\t$realaddr\n");
15: }

$ program15_5
Enter a machine name or Internet site name:
ux1.cso.uiuc.edu
Equivalent addresses:
128.174.5.59
$

This program prompts for a machine name and then removes the leading and trailing white space from it. After the machine name has been prepared, lines 6 and 7 call gethostbyname, which searches for the /etc/hosts entry matching the specified machine name. If gethostbyname does not find the entry, it returns the empty list, the conditional expression becomes false, and line 8 calls die to terminate the program.

If gethostbyname finds the entry, the loop in lines 11-15 examines the list of addresses in @addrlist, assembling and printing one address at a time. Line 12 assembles an address by unpacking one element of @addrlist and storing the individual bytes in @addrbytes. Line 13 joins the bytes into a character string, placing a period between each pair of bytes. The resulting string is a readable Internet address, which line 14 prints.

NOTE

The machine name passed to gethostbyname can be either the principal machine name (as specified in the first element of the returned list) or one of the alternative names (aliases).
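The unpack and join steps can be replaced by a single call if the Socket module is available. The sketch below assumes Perl 5 and assumes that the name localhost can be resolved on your machine. It prints the value returned in len alongside each address, using the Socket function inet_ntoa to produce the readable form.

```perl
use strict;
use warnings;
use Socket;

my $machine = "localhost";      # assumed to be resolvable on this machine
my ($name, $altnames, $addrtype, $len, @addrlist) = gethostbyname ($machine);

if (defined ($name)) {
    print ("$name has ", scalar (@addrlist), " address(es), ",
           "each $len bytes long:\n");
    foreach my $packed (@addrlist) {
        next unless (length ($packed) == 4);       # inet_ntoa needs 4 bytes
        print ("\t", inet_ntoa ($packed), "\n");   # same as unpack plus join
    }
} else {
    print ("Machine name $machine not found.\n");
}
```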

The `gethostent`, `sethostent`, and `endhostent` Functions

The gethostent function enables you to read each item of the /etc/hosts file in turn.

The syntax for the gethostent function is

(name, altnames, addrtype, len, addrs) = gethostent();

The first call to gethostent returns the first element in the /etc/hosts file; subsequent calls to gethostent return successive elements. Each call to gethostent returns a five-element list identical to the list returned by gethostbyaddr or gethostbyname. This list consists of a machine name, a character string listing the alternative machine names, the address type (always AF_INET), the length of the address field, and the address field itself.

Many machines simulate an /etc/hosts file using a name server. When a program that is running on a machine using a name server attempts to access /etc/hosts, the server queries various Internet sites for machine names, addresses, and other information.

If a Perl program running on such a machine calls gethostent repeatedly, the program might try to access many Internet sites to obtain machine information. This takes a lot of time and is a strain on Internet resources; do not do it unless you absolutely must, and do it during off-peak hours if possible.

The sethostent function rewinds the /etc/hosts file, which means that the next call to gethostent will return the first entry in the file.

The syntax for the sethostent function is

sethostent (keepopen);

keepopen is a scalar value. If keepopen is nonzero, the Perl program keeps /etc/hosts information in memory, which ensures that subsequent calls to gethostent are performed as efficiently as possible. If keepopen is zero, no information is retained after sethostent finishes executing.

The endhostent function closes the /etc/hosts file and indicates that the program is to free any internal memory retaining host-related information.

The endhostent function expects no arguments and returns no values:

endhostent();

The `getlogin` Function

The getlogin function returns the user ID under which you are logged in. The user ID is retrieved from the file /etc/utmp.

The syntax for the getlogin function is

logname = getlogin();

logname is the returned user ID.

The following is a simple example using getlogin:

$logname = getlogin();
if ($logname eq "dave") {
        print ("Hello, dave! How are you?\n");
}
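getlogin can return an empty value when no login record exists for the program (for example, when it is started by a background scheduler rather than from a terminal). A more defensive version, sketched here under the assumption that you are using Perl 5, falls back on the password file entry for the real user ID. The placeholder name "unknown" is arbitrary.

```perl
use strict;
use warnings;

# Try the login record first, then the password file entry for the real
# user ID ($<), and finally an arbitrary placeholder.
my $logname = getlogin () || getpwuid ($<) || "unknown";

if ($logname eq "dave") {       # eq compares strings; == compares numbers
    print ("Hello, dave! How are you?\n");
} else {
    print ("Logged in as $logname\n");
}
```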

The `getpgrp` and `setpgrp` Functions

In the UNIX environment, processes are organized into collections of processes known as process groups. Each process group is identified by a unique integer known as a process group ID.

The getpgrp function retrieves the process group ID for a particular process.

The syntax of the getpgrp function is

pgroup = getpgrp (pid);

pid is the process ID whose group you want to retrieve, and pgroup is the returned process group ID, which is a scalar value.

If pid is not specified or is zero, getpgrp assumes that you want the process group ID for the current process (the program you are running).

Listing 15.6 is an example of a program that retrieves its own process group ID.

Listing 15.6. A program that uses getpgrp.

1:  #!/usr/local/bin/perl
2:
3:  $pgroup = getpgrp (0);
4:  print ("The process group for this program is $pgroup.\n");

$ program15_6
The process group for this program is 3313.
$

Line 3 calls getpgrp with the argument 0, which indicates the current process (the current program). The process group ID for this process is returned in $pgroup and then printed.

The setpgrp function enables you to set the process group ID for a process.

The syntax of the setpgrp function is

setpgrp (pid, groupid);

pid is the ID of the process whose group you want to change, and groupid is the process group ID you want your process to be part of. (This group ID is usually returned by a call to getpgrp.)

Not all machines support setpgrp, and some machines impose limitations on how you can use it. If your program uses setpgrp, you should call getpgrp immediately afterward to ensure that the process group ID has been set properly.
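One use of the value returned by getpgrp is to find out whether your program is the leader of its process group: the leader is the process whose process ID equals the process group ID. The following is a small sketch, assuming Perl 5 on a UNIX machine; the variable $leader is illustrative.

```perl
use strict;
use warnings;

my $pgroup = getpgrp (0);            # 0 means the current process
my $leader = ($pgroup == $$);        # $$ is this program's own process ID

print ("Process $$ is in process group $pgroup\n");
print ($leader ? "This process leads its process group\n"
               : "Another process leads this process group\n");
```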

The `getppid` Function

On UNIX machines, as you have seen, every running program or other executing process has its own unique process ID. Each program and process also is associated with a parent process, which is the process that started it. For example, when you execute a command that starts a Perl program, the parent process of the Perl program is the shell program from which you entered the command.

To retrieve the process ID for the parent process for your program, call the function getppid.

The syntax of the getppid function is

parentid = getppid();

Here, parentid is the process ID of the parent process of your program.

You can use getppid with fork to ensure that each of the two processes produced by fork knows the process ID of the other.

Listing 15.7 shows how to do this.

Listing 15.7. A program that calls fork and getppid.

1:  #!/usr/local/bin/perl
2:
3:  $otherid = fork();
4:  if ($otherid == 0) {
5:          # this is the child; retrieve parent ID
6:          $otherid = getppid();
7:  } else {
8:          # this is the parent
9:  }

This program requires no input and generates no output.

When line 3 calls fork, the program splits into two separate processes (or running programs, if you want to think of them that way). fork returns 0 to the child process and returns the process ID of the child process to the parent process. At this point, the parent process knows the process ID of the child, but the child does not know the process ID of the parent.

Line 6, which is executed only by the child process, fixes this imbalance by calling getppid and returning the process ID of the parent (the other process created by fork). After the child process executes line 6, both the parent and the child process have stored the process ID of the other process in the scalar variable $otherid.

After each process has the ID of the other, the processes can send signals to one another using the kill function (which is discussed on Day 13, "Process, String, and Mathematical Functions").
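To confirm that the child really does obtain the ID of the parent, the parent can record its own process ID before calling fork and let the child compare. The following sketch assumes Perl 5 on a machine that supports fork; the variable $parentid and the use of the exit status to report the result are choices made for this example only.

```perl
use strict;
use warnings;

my $parentid = $$;                   # remember our own ID before forking
my $otherid  = fork ();
die ("fork failed: $!\n") unless (defined ($otherid));

if ($otherid == 0) {
    # child: getppid should report the process that called fork
    $otherid = getppid ();
    exit ($otherid == $parentid ? 0 : 1);
}

# parent: $otherid is the child's process ID; wait for the child to finish
waitpid ($otherid, 0);
my $status = $? >> 8;                # exit status reported by the child
print ($status == 0 ? "child saw the right parent\n"
                    : "child saw a different parent\n");
```

Checking that fork returned a defined value guards against the case in which no child process could be created.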

The `getpwnam` Function

On UNIX machines, the /etc/passwd file (also known as the password file) contains information on each of the users who are authorized to use the machine. The getpwnam function enables you to retrieve the password file entry for a particular user.

The syntax of the getpwnam function is

(username, password, userid, groupid, quota, comment, infofield,
        homedir, shell) = getpwnam (name);

name is the login user ID of the user whose information you want to retrieve. If an entry in the /etc/passwd file corresponds to this name, getpwnam returns a nine-element list containing the contents of the entry. These contents are

username, which is identical to name
password, which is the user's encrypted password
userid, which is the unique numerical ID that represents this user
groupid, which is the ID of the group to which this user belongs
quota and comment, which mean different things on different machines (check your local getpwnam manual page for details)
infofield, which is a character string containing personal information about the user (such as the room number of the user's office, or the user's phone number)
homedir, which is the user's home directory (the directory that becomes the current directory when the user logs in)
shell, which is the command shell that is started when the user logs in

getpwnam returns the empty list if no password file entry for name exists.

You can use getpwnam in various ways. The most common way is to retrieve the user ID or group ID corresponding to a particular user name. Listing 15.8 is a program that retrieves and prints the user ID for a particular user.

Listing 15.8. A program that retrieves the user ID for a user.

1:  #!/usr/local/bin/perl
2:
3:  print ("Enter a username:\n");
4:  $username = <STDIN>;
5:  $username =~ s/^\s+|\s+$//g;
6:  if (($username, $passwd, $userid) = getpwnam ($username)) {
7:          print ("Username $username has user id $userid.\n");
8:  } else {
9:          print ("Username not found.\n");
10: }

$ program15_8
Enter a username:
dave
Username dave has user id 127.
$

After lines 4 and 5 have retrieved the user name and removed any extraneous white space, line 6 passes the user name to getpwnam. If a password file entry exists for this user name, the nine-element entry is returned, and the first three elements are assigned to $username, $passwd, and $userid. (The remaining elements are thrown away.) The third element, the user ID, is stored in $userid and is printed by line 7.
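A version that needs no input can start from the user who is running the program. The sketch below assumes Perl 5, in which the special variable $< holds the real user ID and getpwuid called in a scalar context returns the user name. It then passes that name to getpwnam. The variable $myname is illustrative.

```perl
use strict;
use warnings;

# Find the name that belongs to the real user ID of this process,
# then look that name up again with getpwnam.
my $myname = getpwuid ($<);          # scalar context returns the user name

my ($username, $passwd, $userid, $groupid);
($username, $passwd, $userid, $groupid) = getpwnam ($myname)
    if (defined ($myname));

if (defined ($username)) {
    print ("Username $username has user id $userid and group id $groupid.\n");
} else {
    print ("No password file entry exists for user id $<.\n");
}
```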

The `getpwuid` Function

The getpwuid function is similar to the getpwnam function because it also accesses the /etc/passwd file. getpwuid, however, searches for the password file entry that matches a particular user ID.

The syntax of the function is

(username, password, userid, groupid, quota, comment, infofield,
        homedir, shell) = getpwuid (inputuid);

inputuid is the user ID that is to be searched for; it must be an integer. (User ID 0, which normally belongs to root, is a valid value.) The nine-element list returned by getpwuid is identical to that returned by getpwnam.

NOTE

The userid field in the nine-element list returned by getpwuid is always identical to the inputuid field that is passed as an argument.
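When only a few of the nine elements are needed, a list slice saves declaring variables for the rest. The following sketch assumes Perl 5 and uses the element positions given in the syntax above (0 for username, 7 for homedir, and 8 for shell). It also looks up user ID 0, which on most UNIX machines belongs to root.

```perl
use strict;
use warnings;

# Pick out username, homedir, and shell for the real user ID ($<)
my ($username, $homedir, $shell) = (getpwuid ($<))[0, 7, 8];

if (defined ($username)) {
    print ("$username: home directory $homedir, shell $shell\n");
} else {
    print ("No password file entry exists for user id $<.\n");
}

# In a scalar context, getpwuid returns just the user name
my $rootname = getpwuid (0);
print ("User id 0 is named $rootname\n") if (defined ($rootname));
```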

The `getpwent` Function

The getpwnam and getpwuid functions enable you to retrieve a single entry from the password file. To access each entry of the password file in turn, call getpwent.

The syntax for the getpwent function is

(username, password, userid, groupid, quota, comment, infofield,
        homedir, shell) = getpwent();

When a program calls getpwent for the first time, it retrieves the first entry in the /etc/passwd file. Subsequent calls retrieve further entries; if no more entries remain, the empty list is returned.

The components of the nine-element list returned by getpwent are the same as those in the lists returned by getpwnam and getpwuid.

Listing 15.9 is an example of a program that uses getpwent. It lists the user names known by the machine as well as their user IDs.

Listing 15.9. A program that uses getpwent.

1:  #!/usr/local/bin/perl
2:
3:  while (1) {
4:          last unless (($username, $password, $userid)
5:                        = getpwent());
6:          $userlist{$username} = $userid;
7:  }
8:  print ("Users known to this machine:\n");
9:  foreach $user (sort keys (%userlist)) {
10:         printf ("%-20s %d\n", $user, $userlist{$user});
11: }

$ program15_9
Users known to this machine:
adm                 4
daemon              1
dave                127
ftp                 8
jimmy               711
root                0
$

The while loop in lines 3-7 uses getpwent to read every entry in the password file. Only the first three elements of the returned list are saved, in the scalar variables $username, $password, and $userid; the rest are thrown away. After the /etc/passwd file has been completely read, line 4 terminates the while loop.

Line 6 creates an associative array element for each user. The subscript for the array element is the user name, and the value of the element is the user ID.

Lines 9-11 print the list of users, sorting them in order by user name and printing the name and user ID for each.

The `setpwent` and `endpwent` Functions

Like getpwent, setpwent and endpwent manipulate the /etc/passwd file.

The setpwent function rewinds the /etc/passwd file.

The syntax of the setpwent function is

setpwent();

Unlike setnetent and sethostent, setpwent accepts no arguments in Perl 5; a call such as setpwent (1) is rejected when the program is compiled. After setpwent is called, the next call to getpwent retrieves the first entry in the password file.

The endpwent function closes the password file and tells the program to throw away any internal memory related to it.

The endpwent function accepts no arguments and returns no values.

endpwent();
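The following is a minimal sketch of how these three functions fit together. The subroutine name read_users is made up for this example, and setpwent is called here without arguments. Because each call rewinds the password file before reading it, the second pass sees the same entries as the first.

```perl
use strict;
use warnings;

# Read every password-file entry and return a hash that maps
# each user name to its user ID. (read_users is a hypothetical name.)
sub read_users {
    my %users;
    setpwent ();                       # rewind to the first entry
    while (my ($name, $passwd, $uid) = getpwent ()) {
        $users{$name} = $uid;
    }
    endpwent ();                       # finished with the password file
    return %users;
}

my %first  = read_users ();
my %second = read_users ();            # the rewind makes a second pass possible
printf ("%d users on the first pass, %d on the second\n",
        scalar (keys (%first)), scalar (keys (%second)));
```

Without the call to setpwent, a second loop over getpwent in the same program might find that the password file has already been exhausted.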

The `getpriority` and `setpriority` Functions

In the UNIX environment, each process has a priority, which tells the system which processes are important and which are not. Priorities are integer values that vary from system to system: a typical range is from -20 (most important) to 20 (least important), with a default value of 0.

NOTE

Although priority ranges might vary from system to system, the general rule under UNIX is always this: the higher the priority number associated with a process, the less important the process is

To change the priority for your program, process, process group, or user ID, call the setpriority function.

The syntax of the setpriority function is

setpriority (category, id, priority);

category is a scalar value that indicates what processes are to have their priorities altered. To find the value to use, take the following actions:

Examine the header file /usr/include/sys/resource.h
In this file, look up and note the values of the constants PRIO_PROCESS, PRIO_PGRP, and PRIO_USER
Pick the appropriate value to use, as described in the remainder of this section

NOTE

If you are not familiar with the C programming language, the value of a constant is specified by a statement of the following form:

#define constant value

Here, constant is a constant such as PRIO_PROCESS, and value is its defined value

If category is the value associated with PRIO_PROCESS, only one process has its priority altered. If category is the value of PRIO_PGRP, every process in a process group has its priority altered. If category is the value of PRIO_USER, every process belonging to a particular user has its priority altered.

The value of id depends on category. If category is the value of PRIO_PROCESS, id is the process ID for the process whose priority is to be altered. If category is the value of PRIO_PGRP, id is the process group ID for the group whose priority is to be altered. If category is the value of PRIO_USER, id is the user ID of the user whose processes are to have their priorities altered.

NOTE

If category is the value of PRIO_PROCESS or PRIO_PGRP and id is 0, id is assumed to be the ID of the current process or process group

priority is the new priority for the process, group, or user. You can specify a lower priority value for your process or processes (in other words, specify that your processes are "more important") only if you are a privileged user (the superuser).

The function getpriority retrieves the current priority for a process, process group, or user.

The syntax of the getpriority function is

priority = getpriority (category, id);

Here, category and id are identical to the equivalent arguments in setpriority. priority is the returned current priority.

Listing 15.10 is a program that lowers the priority of every process you are currently running. It uses several of the functions you have seen in today's lesson.

Listing 15.10. A program that uses setpriority and getpriority.

1:  #!/usr/local/bin/perl
2:
3:  print ("You're not in a hurry today, are you?\n");
4:  $username = getlogin();
5:  ($username, $password, $userid) = getpwnam ($username);
6:  $oldpriority = getpriority (2, $userid);
7:  setpriority (2, $userid, $oldpriority + 1);

$ program15_10
You're not in a hurry today, are you?
$

Line 4 of this program calls getlogin, which retrieves the user's login name. Then, line 5 passes this name to getpwnam, which retrieves the user ID from the /etc/passwd file.

Line 6 calls getpriority. Because the first argument to getpriority is 2 (the value of the system constant PRIO_USER), the current priority for all processes owned by the user specified by the user ID stored in $userid is returned.

Line 7 calls setpriority, adding one to the current priority for the user to obtain the new priority. As in line 6, the first argument to setpriority is 2 (PRIO_USER), which indicates that the current priority for all processes belonging to this user is to be changed.
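If you want to affect only the program that is running, rather than every process you own, you can use the PRIO_PROCESS category with an id of 0. The following sketch assumes that PRIO_PROCESS has the value 0, which is common but should be confirmed in /usr/include/sys/resource.h on your machine. The subroutine name be_nice is made up for this example.

```perl
use strict;
use warnings;

# Assumed value of the system constant PRIO_PROCESS; check
# /usr/include/sys/resource.h before relying on it.
my $PRIO_PROCESS = 0;

# Add $step to the priority number of the current process and
# return the old and new priorities. (be_nice is a hypothetical name.)
sub be_nice {
    my ($step) = @_;
    my $old = getpriority ($PRIO_PROCESS, 0);      # id 0 = this process
    setpriority ($PRIO_PROCESS, 0, $old + $step)
        || warn ("setpriority failed: $!\n");
    my $new = getpriority ($PRIO_PROCESS, 0);
    return ($old, $new);
}

my ($old, $new) = be_nice (1);
print ("Priority changed from $old to $new\n");
```

Because a larger priority number means a less important process, this call does not require superuser privileges. If the process is already at the largest value the system allows, the priority stays where it is.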

The `getprotoent` Function

The getprotoent function enables you to search through the system protocol database, which is stored in the file /etc/protocols.

The syntax of the getprotoent function is

(name, aliases, number) = getprotoent();

name is the name associated with a particular system protocol. aliases is a scalar value consisting of a list of alternative names for this system protocol, with names being separated by a space. number is the number associated with this particular system protocol.

The first call to getprotoent returns the first element in /etc/protocols. Further calls return subsequent entries; when /etc/protocols is exhausted, getprotoent returns the empty list.

The `getprotobyname` and `getprotobynumber` Functions

The getprotobyname and getprotobynumber functions provide ways of searching in the /etc/protocols file.

The getprotobyname function enables you to search for a particular protocol entry in the /etc/protocols file.

The syntax of the getprotobyname function is

(name, aliases, number) = getprotobyname (searchname);

Here, searchname is the protocol name you are looking for. name, aliases, and number are the same as in getprotoent.

Similarly, getprotobynumber searches for a protocol entry in /etc/protocols that matches a particular protocol number.

The syntax of the getprotobynumber function is

(name, aliases, number) = getprotobynumber (searchnum);

searchnum is the protocol number to search for. name, aliases, and number are the same as in getprotoent.

Both functions return the empty list if no matching protocol database entry is found.
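The two searches can be sketched as a pair of small subroutines. The names proto_number and proto_name are made up for this example; each returns the undefined value when the protocol database has no matching entry, which can happen on a machine that has no /etc/protocols file.

```perl
use strict;
use warnings;

# Return the protocol number for a protocol name, or undef if the
# name is not in the protocol database. (Hypothetical helper.)
sub proto_number {
    my ($searchname) = @_;
    my ($name, $aliases, $number) = getprotobyname ($searchname);
    return $number;
}

# Return the protocol name for a protocol number, or undef if the
# number is not in the protocol database. (Hypothetical helper.)
sub proto_name {
    my ($searchnum) = @_;
    my ($name, $aliases, $number) = getprotobynumber ($searchnum);
    return $name;
}

my $tcp = proto_number ("tcp");
if (defined ($tcp)) {
    printf ("tcp is protocol %d, known as %s\n", $tcp, proto_name ($tcp));
} else {
    print ("No tcp entry in the protocol database\n");
}
```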

The `setprotoent` and `endprotoent` Functions

The setprotoent and endprotoent functions provide other ways of manipulating the /etc/protocols file.

The setprotoent function rewinds the /etc/protocols file.

The syntax of the setprotoent function is

setprotoent (keepopen);

If keepopen is nonzero, the program keeps /etc/protocols open because it intends to continue accessing the system protocol database. After setprotoent has been called, the next call to getprotoent reads (or rereads) the first element of the database.

The endprotoent function closes the /etc/protocols file and indicates that the program no longer wants to read any system protocols from the database.

The endprotoent function requires no arguments and returns no values:

endprotoent();

NOTE

For more information on system protocols, refer to the getprotoent manual page on your system

The `getservent` Function

The getservent function enables you to search through the system services database, which is stored in the file /etc/services.

The syntax of the getservent function is

(name, aliases, portnum, protoname) = getservent();

name is the name associated with a particular system service. aliases is a scalar value consisting of a list of alternative names for this system service; the names are separated by a space.

portnum is the port number associated with this particular system service, which indicates the port at which the service resides. Perl returns this port number as an ordinary integer, already converted from network order, so it does not need to be unpacked.

protoname is a protocol name (such as tcp).

The first call to getservent returns the first element in /etc/services. Further calls return subsequent entries; when /etc/services is exhausted, getservent returns the empty list.

The `getservbyname` and `getservbyport` Functions

The getservbyname and getservbyport functions provide ways of searching in the /etc/services file.

The getservbyname function enables you to search the /etc/services file for a particular service name.

The syntax of the getservbyname function is

(name, aliases, portnum, protoname) = getservbyname (searchname, searchproto);

Here, searchname and searchproto are the service name and service protocol type to be matched. If the name and type are matched, getservbyname returns the system service database entry corresponding to this name and type. This entry is the same four-element list as is returned by getservent. (The empty list is returned if the name and type are not matched.)

Similarly, the getservbyport function searches for a service name that matches a particular service port number.

The syntax of the getservbyport function is

(name, aliases, portnum, protoname) = getservbyport (searchportnum, searchproto);

searchportnum and searchproto are the port number and protocol type to search for. name, aliases, portnum, and protoname are the same as in getservbyname and getservent.
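The following sketch wraps both searches in small subroutines. The names service_port and service_name are made up for this example, and the smtp service is used only as an illustration; on a machine with no /etc/services entry for it, both subroutines return the undefined value. In current Perl releases, the port number comes back as an ordinary integer.

```perl
use strict;
use warnings;

# Return the port number for a service name and protocol, or undef
# if there is no such entry. (Hypothetical helper.)
sub service_port {
    my ($searchname, $searchproto) = @_;
    my ($name, $aliases, $portnum, $protoname)
        = getservbyname ($searchname, $searchproto);
    return $portnum;
}

# Return the service name for a port number and protocol, or undef
# if there is no such entry. (Hypothetical helper.)
sub service_name {
    my ($searchportnum, $searchproto) = @_;
    my ($name) = getservbyport ($searchportnum, $searchproto);
    return $name;
}

my $port = service_port ("smtp", "tcp");
if (defined ($port)) {
    printf ("smtp/tcp uses port %d\n", $port);
} else {
    print ("No smtp/tcp entry in the services database\n");
}
```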

The `setservent` and `endservent` Functions

The setservent and endservent functions provide other ways of manipulating the /etc/services file.

The setservent function rewinds the /etc/services file.

The syntax of the setservent function is

setservent (keepopen);

After setservent has been called, the next call to getservent retrieves the first element of the /etc/services file. keepopen, if nonzero, specifies that the /etc/services file is still in use and is to remain open.

The function endservent indicates that the /etc/services file can be closed, because it is no longer needed.

The endservent function requires no arguments and returns no values:

endservent();

The `chroot` Function

The chroot function enables you to specify the root directory for your program and any subprocesses that it creates.

The syntax of the chroot function is

chroot (dirname);

dirname is the name of the directory to serve as the root. After chroot has been called, the directory name specified by dirname is prepended to every absolute pathname specified by your program and its subprocesses. For example, the statement

chroot ("/pub");

adds /pub to the front of every directory name. For example, when your program or a subprocess tries to access the directory /u/jqpublic, the directory it accesses is actually /pub/u/jqpublic.

chroot often is used to restrict access to a particular portion of a file system. It can be called only if you have superuser privileges on your machine and execute permission on the specified root directory.

The `ioctl` Function

The ioctl function enables you to set system-dependent file attributes (such as the special character definitions for your keyboard).

The syntax of the ioctl function is

ioctl (filevar, attribute, value);

filevar is a file variable representing a previously opened file. attribute is a value representing the operation to be performed. Incorporated as part of the attribute value is a number indicating whether the operation is retrieving the value of an attribute or setting the value of an attribute.

value holds the attribute value associated with the operation specified by attribute. If the operation is setting an attribute, value contains the new value of the attribute. If the operation is retrieving the current value of the attribute, value is assigned this current value.

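ioctl returns a true value if the operation is performed successfully, or the undefined value if the operation fails. (When the underlying system call succeeds and returns zero, ioctl returns the string "0 but true", which is treated as true in a condition and as zero in arithmetic.)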

NOTE

For details on what operations can be performed on your machine, refer to the file /usr/include/sys/ioctl.h on your machine. This file is a header file written in the C programming language that contains information on the available ioctl operations

Different machines (and devices) support different ioctl operations. Thus, a program that requests an ioctl operation is not portable if you move it from one machine to another. You therefore should use ioctl operations only when you must

The `alarm` Function

The alarm function arranges for a special "alarm" signal, SIGALRM, to be sent to your program after a specified number of seconds.

The syntax of the alarm function is

alarm (value);

value is an expression indicating how many seconds are to pass before the alarm goes off.

For more information on signals and their relationship to processes, refer to the description of the kill function on Day 13.
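One common use of alarm is to give up on an operation that takes too long. The following is a minimal sketch of that idea, assuming a UNIX-like system that delivers SIGALRM. The subroutine name run_with_timeout is made up for this example; it installs a temporary handler in the %SIG array, and the handler calls die to break out of the eval block.

```perl
use strict;
use warnings;

# Run $code, giving up after $seconds seconds. Returns (value, 0) if
# $code finishes in time, or (undef, 1) if the alarm goes off first.
# (run_with_timeout is a hypothetical name.)
sub run_with_timeout {
    my ($seconds, $code) = @_;
    my $value;
    my $finished = eval {
        local $SIG{ALRM} = sub { die ("alarm\n") };   # SIGALRM handler
        alarm ($seconds);              # schedule the signal
        $value = $code->();
        alarm (0);                     # cancel the pending alarm
        1;
    };
    alarm (0);
    return ($value, 0) if ($finished);
    die ($@) unless ($@ eq "alarm\n"); # some other error: pass it on
    return (undef, 1);
}

my ($answer, $timed_out) = run_with_timeout (1, sub {
    select (undef, undef, undef, 5);   # pause for five seconds
    return "done";
});
print ($timed_out ? "Timed out\n" : "Got $answer\n");
```

Calling alarm with an argument of zero cancels any alarm that is still pending, which is why the sketch does so as soon as the operation finishes.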

Calling the System `select` Function

Perl enables you to call the UNIX select function from within your Perl program.

The syntax for a call to the UNIX select function is

select (rmask, wmask, emask, timeout);

rmask, wmask and emask are bit masks, and timeout is a timeout value in seconds.

For more information on select, refer to the UNIX manual page.

NOTE

The UNIX select function is different from the Perl select function that you've seen in earlier lessons.

The Perl interpreter determines whether a program is calling the Perl select function or the UNIX select function by counting the number of arguments: the Perl function expects only one, and the UNIX function expects four
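The bit masks passed to the four-argument select are usually built with the vec function, using the number returned by fileno to choose the bit for each file. The following sketch tests whether a file has data waiting to be read; the subroutine name can_read is made up for this example, and a pipe stands in for whatever file or socket you want to watch.

```perl
use strict;
use warnings;

# Return true if $handle has data waiting to be read within
# $timeout seconds. (can_read is a hypothetical name.)
sub can_read {
    my ($handle, $timeout) = @_;
    my $rmask = "";
    vec ($rmask, fileno ($handle), 1) = 1;    # set the bit for this file
    my $nfound = select ($rmask, undef, undef, $timeout);
    return ($nfound > 0);
}

pipe (my $reader, my $writer) || die ("Can't create pipe: $!");
print ("Before writing: ",
       (can_read ($reader, 0) ? "ready" : "not ready"), "\n");
syswrite ($writer, "x");
print ("After writing:  ",
       (can_read ($reader, 1) ? "ready" : "not ready"), "\n");
```

Passing the undefined value in place of a mask tells select to ignore that category, and a timeout of zero makes the call return immediately.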

The `dump` Function

The dump function, which is defined only in Perl 5, enables you to generate a UNIX core dump from within your Perl program.

The syntax for the dump function is

dump(label);

label is an optional label, specifying where execution is to restart if the UNIX undump command is executed.

If a core dump file created by dump is restarted by the UNIX undump command, files that were open when the program was executing will no longer be open. This means they cannot be read from or written to

Socket-Manipulation Functions

In Berkeley UNIX environments (version 4.3BSD) and some other environments, processes can communicate with one another using a connection device known as a socket. When a socket has been created, one process can write data which can then be read by another process.

Perl supports various functions that create sockets and set up connections with them. The following sections describe these functions.

The `socket` Function

To create a socket, call the socket function. This function defines a socket and associates it with a Perl file variable.

The syntax of the socket function is

socket (socket, domain, type, format);

socket is a file variable that is to be associated with the new socket.

domain is the protocol family to use. The legal values for domain are listed in the system header file /usr/include/sys/socket.h; these values are represented by the constants PF_UNIX, PF_INET, PF_IMPLINK, and PF_NS.

type is the type of socket to create. The legal values for type are also listed in the file /usr/include/sys/socket.h. These legal values are represented by the five constants SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, SOCK_SEQPACKET, and SOCK_RDM.

format is the number of the protocol to be used with the socket. This protocol is normally retrieved by calling getprotobyname. (See the manual page for getprotobyname for details on what protocols are supported on your machine.)

The socket function returns a nonzero value if the socket has been created and zero if an error occurs.

The `bind` Function

After you create a socket using socket, the next step is to bind the socket to a particular network address. To do this, use the bind function.

The syntax of the bind function is

bind (socket, address);

Here, socket is the file variable corresponding to the socket created by socket.

address is the network address to be associated with the socket. This address consists of the following elements:

The address type, which is an unsigned short integer and is always AF_INET (defined in /usr/include/netdb.h or /usr/include/bsd/netdb.h).
The number of the port to use when connecting, which is a short integer in network order
The packed four-byte representation of the Internet address of the machine to which the socket is to be bound

This function returns a nonzero value if the bind operation succeeds and zero if an error occurs.

To create an address suitable for passing to bind, call pack.

$address = pack ("Sna4x8", 2, $portnum, $intaddress);

Here, the pack format specifier Sna4x8 indicates an unsigned short integer, followed by a short integer in network order (the port number), a four-byte ASCII string (which is the packed address), and eight null bytes. This is the format that bind expects when binding an address to a socket.
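A hand-written pack format assumes a particular layout for the address, and that layout is not the same on every system. As a hedged alternative, the Socket module supplied with Perl 5 provides functions that build and take apart these addresses for you. The subroutine name make_address and the address 127.0.0.1 are used here only for illustration.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa pack_sockaddr_in unpack_sockaddr_in);

# Build a packed socket address from a host and a port number.
# (make_address is a hypothetical name.)
sub make_address {
    my ($host, $portnum) = @_;
    my $intaddress = inet_aton ($host);       # four packed bytes
    defined ($intaddress) || die ("Can't resolve $host");
    return pack_sockaddr_in ($portnum, $intaddress);
}

my $address = make_address ("127.0.0.1", 2000);

# Take the address apart again to show what it contains.
my ($portnum, $intaddress) = unpack_sockaddr_in ($address);
printf ("port %d at %s\n", $portnum, inet_ntoa ($intaddress));
```

The value returned by make_address can be passed to bind in the same way as the value built by pack.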

The `listen` Function

After an address has been bound to the socket associated with each of the machines that are to communicate, the next step is to define a process that is to be the "listening" process. This process waits for connections to be established with it. (In a client-server architecture, this process corresponds to the server.) To define this listening process, call the listen function.

The syntax of the listen function is

listen (socket, number);

socket is the socket created using the socket function. number is the maximum number of processes that can be queued up to connect to this process.

listen returns a nonzero value if it executes successfully, zero if it does not.

The maximum number of processes that can be queued using listen is 5. This limitation is imposed by the Berkeley UNIX operating system

The `accept` Function

After a process that has been established as the listening process calls listen, the next step is to have this process call the accept function. accept waits until a process wants to connect with it, and then it returns the address of the connecting process.

The syntax of the accept function is

accept (procsocket, socket);

procsocket is a previously undefined file variable that is to represent the newly created connection. The listening process can then send to or receive from the other process using the file variable specified in procsocket. This file variable can be treated like any other file variable: the program can send data through the socket by calling write or print, or can read data using the <> operator.

socket is the socket created by socket and bound to an address by bind.

Listing 15.11 is an example of a program that uses listen and accept to create a simple server. This server just sends the message Hello, world! to any process that connects to it. (A client program that receives this message is listed in the next section, "The connect Function.")

Listing 15.11. A simple server program.

1:  #!/usr/local/bin/perl
2:
3:  $line = "Hello, world!\n";
4:
5:  $port = 2000;
6:  while (getservbyport ($port, "tcp")) {
7:          $port++;
8:  }
9:  ($d1, $d2, $prototype) = getprotobyname ("tcp");
10: ($d1, $d2, $d3, $d4, $rawserver) = gethostbyname ("silver");
11: $serveraddr = pack ("Sna4x8", 2, $port, $rawserver);
12: socket (SSOCKET, 2, 1, $prototype) || die ("No socket");
13: bind (SSOCKET, $serveraddr) || die ("Can't bind");
14: listen (SSOCKET, 1) || die ("Can't listen");
15: ($clientaddr = accept (SOCKET, SSOCKET)) ||
16:         die ("Can't accept");
17: select (SOCKET);
18: $| = 1;
19: print SOCKET ($line);
20: close (SOCKET);
21: close (SSOCKET);

This program requires no input and generates no output.

The first task this server program performs is to search for a port to use when establishing a socket connection. To be on the safe side, the program first checks that the port it is going to use, port 2000, is not reserved for use by another program. If it is reserved, the program checks port 2001, then port 2002, and so on until it finds an unused port.

To do this checking, line 6 calls getservbyport. If getservbyport returns a non-empty list, the port being checked is listed in the /etc/services file, which means that it is reserved for some other service. In this case, the port number is increased by one, and getservbyport is called again. This process continues until getservbyport returns an empty list, which indicates that the port being checked is not reserved. (A port that is not listed can still be in use by another process; if it is, the call to bind in line 13 fails.) When lines 5-8 are no longer executing, the scalar variable $port contains the number of the port to be used.

Line 9 calls getprotobyname to retrieve the /etc/protocols entry associated with the TCP protocol. The protocol number associated with the TCP protocol is retrieved from this /etc/protocols entry and is stored in the scalar variable $prototype. (The other elements of the list are ignored; the convention used by this program is to store element entries that are not going to be used in variables named $d1, $d2, and so on; the d stands for dummy.)

Line 10 calls gethostbyname to retrieve the network address of the machine on which this server is running. This program assumes that the server is running on a local machine named silver. To run this program on your own machine, replace silver with your machine name.

TIP

You can modify this program to run on any machine. To do so, replace line 10 with the following statements:

chop ($hostname = `hostname`);
($d1, $d2, $d3, $d4, $rawserver) = gethostbyname ($hostname);

The string in backquotes, `hostname`, tells the Perl interpreter to call the hostname program and return its output as a scalar value. The hostname program prints the name of the machine on which it is running, followed by a newline character, which the call to chop removes. Therefore, the call to gethostbyname retrieves the address of the machine on which you are running regardless of what the machine is.

This capability enables you to move this program from one machine to another without having to modify it.

Note that enclosing a command in backquotes works for any UNIX command that returns output. For example, the statement

$userid = `whoami`;

assigns the current login name, followed by a newline character, to the scalar variable $userid (because the UNIX command whoami displays the current login name).

After gethostbyname has been called, the scalar variable $rawserver contains the Internet address of your machine. Line 11 calls pack to convert the address type, the port number, and this address into the form understood by the operating system. (The address type parameter, 2, is the local value of AF_INET, which is the only address type supported.) This information is stored in the scalar variable $serveraddr.

After pack is called to build the server address, the program is ready to create a socket. Line 12 does this by calling socket. This call to socket passes it the file variable SSOCKET, the socket domain, the socket type, and the protocol number. After socket is called, the file variable SSOCKET represents the "master socket" that is to listen for connections. (Note that the values 2 and 1 passed to socket are, respectively, the local values of the constants PF_INET and SOCK_STREAM. PF_INET indicates Internet-style protocol, and SOCK_STREAM indicates that transmission will be in the form of a stream of bytes. You likely will not need to use any other values for these arguments.)

After the socket has been created, the next step is line 13, which associates the socket with your machine by calling bind. This call to bind is passed the file variable SSOCKET associated with the socket and the server address created by the call to pack in line 11.

After the socket is bound to your machine address, you are ready to listen for clients that want to connect to your server. Line 14 does this by calling listen. This call to listen is passed the file variable SSOCKET and the value 1; the latter indicates that at most one client can be queued up waiting to connect at any particular time.

Line 15 calls accept, which waits until a client process wants to connect to this server. When a connection is established, accept creates a new socket associated with this connection and uses the file variable SOCKET to represent it. (The address of the client connection is returned in $clientaddr; if you want to, you can use unpack to obtain the address, and then call gethostbyaddr to retrieve the name of the machine on which the client process is running.)

When the connection has been established and the file variable SOCKET has been associated with it, you can treat SOCKET like any other file variable: you can read data from it or write data to it. Lines 17 and 18 turn off buffering for SOCKET, which ensures that data sent through the socket is sent right away. (If buffering is left on, the program won't send data until the special internal buffer is full, which means that the client process won't receive the data right away.) After buffering is turned off, line 19 writes the line of data to SOCKET, which sends it to the client process. (For more information on buffering and how it works, refer to "Redirecting One File to Another" on Day 12, "Working with the File System.")

Although you can both send and receive data through the same socket, doing so is dangerous, because you run the risk of deadlock. Deadlock occurs when the client and server processes think that the other is going to send data. Neither can proceed until the other does.

The only way to get out of a deadlock is to send signals to the processes (such as KILL).

To avoid a deadlock, make sure that you understand how data flows between the processes you are running

The `connect` Function

As you have seen, when two processes communicate using a socket, one process is designated as the listening process. This process calls listen to indicate that it is the listening process, and then it calls accept to wait for a connection from another process. (Listening processes are called servers, because they provide service to the processes that connect to them. The processes that connect to servers are called clients.)

To connect to a process that has called accept and is now waiting for a connection, use the connect function.

The syntax of the connect function is

connect (socket, address);

socket is a file variable representing a socket created using socket and bound using bind. address is the internal representation of the Internet address to which you want to connect. In the process to which this process is connecting, this address must have been passed to bind to bind it to a socket, and the socket, in turn, must have been specified in calls to listen and accept.

After connect has been called, the program that calls it can send data to or receive data from the other process by means of the file variable specified in socket.

Listing 15.12 is an example of a program that uses connect to obtain data from another process. (The process that sends the data is displayed in Listing 15.11.)

Listing 15.12. A simple client program.

1:  #!/usr/local/bin/perl
2:
3:  $port = 2000;
4:  while (getservbyport ($port, "tcp")) {
5:          $port++;
6:  }
7:  ($d1, $d2, $prototype) = getprotobyname ("tcp");
8:  ($d1, $d2, $d3, $d4, $rawclient) = gethostbyname ("mercury");
9:  ($d1, $d2, $d3, $d4, $rawserver) = gethostbyname ("silver");
10: $clientaddr = pack ("Sna4x8", 2, 0, $rawclient);
11: $serveraddr = pack ("Sna4x8", 2, $port, $rawserver);
12: socket (SOCKET, 2, 1, $prototype) || die ("No socket");
13: bind (SOCKET, $clientaddr) || die ("Can't bind");
14: connect (SOCKET, $serveraddr) || die ("Can't connect");
15:
16: $line = <SOCKET>;
17: print ($line);
18: close (SOCKET);

$ program15_12
Hello, world!
$

Lines 3-6 obtain the port to use when receiving data by means of a socket connection. As in Listing 15.11, the port number is compared with the list of ports stored in /etc/services by calling getservbyport. The first unlisted port number greater than or equal to 2000 becomes the number of the port to use. (This program and Listing 15.11 assume that the same /etc/services file is being examined in both cases. If the /etc/services files are different, you will need to choose a port number yourself and specify this port number in both your client program and your server program; in other words, assign a prespecified value to the variable $port.)

Line 7 calls getprotobyname to retrieve the protocol number associated with the TCP protocol. This protocol number is eventually passed to socket.

Lines 8 and 9 retrieve the Internet addresses of the client (this program) and the server (the process to connect to). $rawclient is assigned the Internet address of the client, and $rawserver is assigned the Internet address of the server; each of these addresses is a four-byte scalar value.

Lines 10 and 11 take the addresses stored in $rawclient and $rawserver and convert them to the form used by the socket processing functions. In both cases, the 2 passed to pack is the local value for AF_INET (the only type of address supported in the UNIX environment). Note that line 10 doesn't bother specifying a port value to pass to pack; this is because the connection uses the port specified in the server address in line 11.

Line 12 now calls socket to create a socket for the current program (the client). As in the call to socket in Listing 15.11, the values 2 and 1 passed to socket are the local values of the constants PF_INET and SOCK_STREAM; if these values are different on your machine, you need to replace the values shown here with the ones defined for your machine. The call to socket in line 12 associates the file variable SOCKET with the newly created socket.

After the socket has been created, line 13 calls bind to associate the socket with the client program. bind requires two arguments: the file variable associated with the socket that has just been created, and the address of the client machine as packed by line 10.

Line 14 now tries to connect to the server process by calling connect and passing it the server address created by line 11. If the connection is successful, you can send and receive data through the socket using the SOCKET file variable.

The SOCKET file variable behaves just like any other file variable. This means that line 16 reads a line of data from the server process. Because the server process is sending the character string Hello, world! (followed by a newline character), this is the string that is assigned to $line. Line 17 then prints $line, which means that the following appears on your screen:

Hello, world!

After the client process is finished with the socket, line 18 calls close. This call indicates that the program is finished with the socket. (After the socket is closed by both the server and the client programs, the server program can accept a connection from another client process, if desired.)
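If you want to try the server and client steps on a single machine, the following sketch runs both ends in one program. It is not a replacement for Listings 15.11 and 15.12; it assumes that the Socket module supplied with Perl 5 is available and that the loopback address can be used. The subroutine name hello_over_loopback is made up for this example. Binding to port 0 asks the system to choose an unused port, and getsockname reports which port was chosen.

```perl
use strict;
use warnings;
use POSIX ();
use Socket qw(PF_INET SOCK_STREAM IPPROTO_TCP INADDR_LOOPBACK
              pack_sockaddr_in unpack_sockaddr_in);

# Send one line from a child process (the server) to the parent
# process (the client) and return the line that was received.
# (hello_over_loopback is a hypothetical name.)
sub hello_over_loopback {
    # Server side: create, bind, and listen before forking.
    socket (my $server, PF_INET, SOCK_STREAM, IPPROTO_TCP)
        || die ("No socket: $!");
    bind ($server, pack_sockaddr_in (0, INADDR_LOOPBACK))
        || die ("Can't bind: $!");
    listen ($server, 1) || die ("Can't listen: $!");
    my ($port) = unpack_sockaddr_in (getsockname ($server));

    my $pid = fork ();
    defined ($pid) || die ("Can't fork: $!");
    if ($pid == 0) {
        # Child: accept one connection and send the message.
        accept (my $connection, $server) || die ("Can't accept: $!");
        syswrite ($connection, "Hello, world!\n");
        close ($connection);
        POSIX::_exit (0);              # leave without running cleanup code
    }

    # Parent: connect to the port the system chose and read a line.
    socket (my $client, PF_INET, SOCK_STREAM, IPPROTO_TCP)
        || die ("No socket: $!");
    connect ($client, pack_sockaddr_in ($port, INADDR_LOOPBACK))
        || die ("Can't connect: $!");
    my $line = <$client>;
    close ($client);
    close ($server);
    waitpid ($pid, 0);
    return $line;
}

print (hello_over_loopback ());
```

Because the socket is already listening when fork is called, the client's connection request is queued even if the child has not yet reached its call to accept.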

The `shutdown` Function

When two processes are communicating using a socket, data can be sent in either direction: the client can receive data from the server, or vice versa. The shutdown function enables you to indicate that traffic in one or both directions is no longer needed.

The syntax for the shutdown function is

shutdown (socket, direction);

Here, socket is the file variable associated with the socket whose traffic is to be restricted. direction is one of the following values:

0 indicates that the program can send through the socket but can no longer receive data.
1 indicates that the program can receive data from the socket but can no longer send.
2 indicates that both sending and receiving are disallowed.

NOTE

To terminate communication through a socket, call close and pass it the file variable associated with the socket:

close (SOCKET);

This line closes the socket represented by SOCKET

The `socketpair` Function

The socketpair function is similar to socket, but it creates a pair of sockets rather than just one socket.

The syntax of the socketpair function is

socketpair (socket1, socket2, domain, type, format);

socket1 is the file variable to be associated with the first newly created socket, and socket2 is the file variable to be associated with the second socket.

As in socket, domain is the protocol family to use, type is the type of socket to create, and format is the number of the protocol to be used with the socket.

socketpair often is used to create a bidirectional communication channel between a parent and a child process.

Some machines that support sockets do not support socketpair
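The following sketch shows one way a parent and a child process might use such a pair, assuming a machine that supports socketpair and the Socket module supplied with Perl 5. The subroutine name ask_child and the message text are made up for this example. Each process closes the end it does not need and then reads from and writes to its own end.

```perl
use strict;
use warnings;
use POSIX ();
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# Send $message to a child process over a socket pair and return
# the child's reply. (ask_child is a hypothetical name.)
sub ask_child {
    my ($message) = @_;
    socketpair (my $parent_end, my $child_end,
                AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        || die ("Can't create socket pair: $!");

    my $pid = fork ();
    defined ($pid) || die ("Can't fork: $!");
    if ($pid == 0) {
        # Child: read one line and send back an acknowledgment.
        close ($parent_end);
        my $request = <$child_end>;
        syswrite ($child_end, "child received: $request");
        POSIX::_exit (0);              # leave without running cleanup code
    }

    # Parent: send the message, then wait for the reply.
    close ($child_end);
    syswrite ($parent_end, "$message\n");
    my $reply = <$parent_end>;
    close ($parent_end);
    waitpid ($pid, 0);
    return $reply;
}

print (ask_child ("ping"));
```

Note that the parent writes before it reads and the child reads before it writes. Keeping the order of the exchange this explicit is one way to avoid the deadlock described earlier.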

The `getsockopt` and `setsockopt` Functions

The getsockopt and setsockopt functions enable you to obtain and set socket options.

To obtain the current value of a socket option in your environment, call the getsockopt function.

The syntax of the getsockopt function is

retval = getsockopt (socket, opttype, optname);

socket is the file variable associated with the socket whose option you want to retrieve.

opttype is the type of option (or option level). The value of the system constant SOL_SOCKET specifies a "socket level" option. To find out the other possible values for opttype, refer to the system header file /usr/include/sys/socket.h.

optname is the name of the option whose value is to be retrieved; retval is the value of this option.

To set a socket option, call setsockopt.

The syntax of the setsockopt function is

setsockopt (socket, opttype, optname, value);

Here, socket, opttype, and optname are the same as in getsockopt, and value is the new value of the optname option.

NOTE

Socket options are system dependent (and a full treatment of them is beyond the scope of this book). For more information on socket options, refer to the getsockopt and setsockopt manual pages on your machine or to the /usr/include/sys/socket.h header file.
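As a rough illustration, the following sketch sets one common socket-level option and reads it back. It assumes that the Socket module supplies the constants SOL_SOCKET, SO_REUSEADDR, and IPPROTO_TCP on your machine, and that the option value is passed as a packed native integer, which is the usual convention but is system dependent.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM SOL_SOCKET SO_REUSEADDR IPPROTO_TCP);

socket (my $sock, PF_INET, SOCK_STREAM, IPPROTO_TCP)
    or die ("socket failed: $!\n");

# Turn the option on; the value is packed as a native integer.
setsockopt ($sock, SOL_SOCKET, SO_REUSEADDR, pack ("i", 1))
    or die ("setsockopt failed: $!\n");

# getsockopt returns a packed string, or the undefined value on error.
my $packed = getsockopt ($sock, SOL_SOCKET, SO_REUSEADDR);
die ("getsockopt failed: $!\n") unless defined ($packed);

my $enabled = unpack ("i", $packed) ? 1 : 0;
print ("SO_REUSEADDR enabled: $enabled\n");

close ($sock);
```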

The `getsockname` and `getpeername` Functions

The getsockname and getpeername functions enable you to retrieve the addresses of the two ends of a socket connection.

The getsockname function returns the address of this end of a socket connection (the end created by the currently running program).

The syntax of the getsockname function is

retval = getsockname (socket);

As in the other socket functions, socket is the file variable associated with a particular socket. retval is the returned address.

The returned address is in packed format as built by the calls to pack in Listing 15.11 and Listing 15.12.

The following code retrieves a socket address and converts it into readable form:

$rawaddr = getsockname (SOCKET);

($d1, $d2, @addrbytes) = unpack ("SnC4x8", $rawaddr);

$readable = join (".", @addrbytes);

NOTE

Normally, you already have the address returned by getsockname because you need to pass it to bind to associate the socket with your machine.

To retrieve the address of the other end of the socket connection, call getpeername.

The syntax of the getpeername function is

retval = getpeername (socket);

As in getsockname, socket is the file variable associated with the socket, and retval is the returned address.

NOTE

The address returned by getpeername is normally identical to the address returned by accept.
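The relationship between getsockname, getpeername, and accept can be checked with a small experiment. This sketch is an assumption-laden illustration rather than one of the chapter's listings: it uses the Socket module's pack_sockaddr_in, unpack_sockaddr_in, and inet_ntoa helpers in place of hand-written pack templates, and it connects a client to a server inside a single process over the loopback address.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM SOMAXCONN INADDR_LOOPBACK IPPROTO_TCP
              pack_sockaddr_in unpack_sockaddr_in inet_ntoa);

# Server socket on the loopback address; port 0 lets the system choose a port.
socket (my $server, PF_INET, SOCK_STREAM, IPPROTO_TCP) or die ("socket: $!\n");
bind ($server, pack_sockaddr_in (0, INADDR_LOOPBACK)) or die ("bind: $!\n");
listen ($server, SOMAXCONN) or die ("listen: $!\n");

# getsockname reveals which port the system picked.
my ($port, $ip) = unpack_sockaddr_in (getsockname ($server));
print ("listening on ", inet_ntoa ($ip), ":$port\n");

# The connection waits in the listen queue until accept is called.
socket (my $client, PF_INET, SOCK_STREAM, IPPROTO_TCP) or die ("socket: $!\n");
connect ($client, pack_sockaddr_in ($port, INADDR_LOOPBACK))
    or die ("connect: $!\n");
my $from_accept = accept (my $conn, $server) or die ("accept: $!\n");

# Compare the address returned by accept with the one from getpeername.
my ($aport, $aip) = unpack_sockaddr_in ($from_accept);
my ($pport, $pip) = unpack_sockaddr_in (getpeername ($conn));
my $same = ($aport == $pport && $aip eq $pip) ? "yes" : "no";
print ("getpeername matches accept: $same\n");

# The client's own address is the address the server sees as its peer.
my ($cport) = unpack_sockaddr_in (getsockname ($client));
print ("client port $cport, peer port seen by server $pport\n");

close ($conn);
close ($client);
close ($server);
```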

The UNIX System V IPC Functions

The functions you've just seen perform interprocess communication using sockets. Sockets are supported on machines running the 4.3BSD (Berkeley UNIX) operating system and on some other UNIX operating systems as well.

Some machines that do not support sockets support a set of UNIX System V interprocess communication (IPC) functions. These functions consist of the following:

Functions that send messages from one process to another by means of a message queue
Functions that create and manipulate shared memory
Functions that create and manipulate semaphores

Perl enables you to use these IPC functions by defining Perl functions with the same names as the IPC functions. The following sections provide a brief description of these functions.

For more information on any IPC function, refer to the manual page for that function.

IPC Functions and the `require` Statement

Before you can use any System V IPC functions, you first must give the program the information it needs to use them.

To do this, add the following statements to your program, immediately following the #!/usr/local/bin/perl header line:

require "ipc.ph";

require "msg.ph";

require "sem.ph";

require "shm.ph";

The require statement is like the #include directive in the C preprocessor: it takes the contents of the specified file and includes them as part of your program.

The syntax for the require statement is

require "name";

Here, name is the name of the file to be added to your program.

For example, the following statement includes the file ipc.ph as part of your program:

require "ipc.ph";

NOTE

If the Perl interpreter complains that it cannot find a file that you are trying to include using require, one of two things is wrong:

The built-in array variable @INC is not defined properly.
The file does not exist.

See the description of @INC on Day 17, "System Variables," for more details.
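One way to diagnose the problem is to print the directories in @INC and test each file in turn. The following is a minimal sketch; it relies on eval to trap the fatal error that require raises when a file cannot be loaded, and the hash name %found is illustrative.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# List the directories that require searches.
print ("Directories searched by require:\n");
print ("  $_\n") foreach (@INC);

# Try each header file; eval traps the error that require raises on failure.
my %found;
foreach my $file ("ipc.ph", "msg.ph", "sem.ph", "shm.ph") {
    $found{$file} = eval { require $file; 1 } ? "found" : "missing";
    print ("$file: $found{$file}\n");
}
```

A file reported as missing either does not exist on your machine or lives in a directory that is not listed in @INC.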

The `msgget` Function

To use the System V message-passing facility, the first step is to create a message queue ID to represent a particular message queue. To do this, call the msgget function.

The syntax of the msgget function is

msgid = msgget (key, flag);

Here, key is either IPC_PRIVATE or an arbitrary constant. If key is IPC_PRIVATE or flag has IPC_CREAT set, the message queue is created, and its queue ID is returned in msgid.

If msgget is unable to create the message queue, msgid is set to the undefined value.

The `msgsnd` Function

To send a message to a message queue, call the msgsnd function.

The syntax of the msgsnd function is

msgsnd (msgid, message, flags);

msgid is the message queue ID returned by msgget. message is the text of the message, and flags specifies options that affect the message.

msgsnd returns a nonzero value if the send operation succeeds, zero if an error occurs.

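The message passed to msgsnd must begin with the message type, stored as a native long integer, followed by the text of the message. For more information on the format of the message sent by msgsnd, refer to your msgsnd manual page.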

The `msgrcv` Function

To obtain a message from a message queue, call the msgrcv function.

The syntax of the msgrcv function is

msgrcv (msgid, message, size, mesgtype, flags);

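Here, msgid is the ID of the message queue, as returned by msgget. message is a scalar variable (or array element) in which the message is to be stored; the stored value begins with the message type (a native long integer), followed by the message text. size is the maximum size of the message text, in bytes. mesgtype selects the type of message to receive (zero requests the first message in the queue, whatever its type). flags specifies options that affect the receive operation.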

msgrcv returns a nonzero value if the receive operation succeeds, zero if an error occurs.
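The following sketch sends one message to a queue and reads it back. It makes two assumptions that are not part of the chapter: the constants come from the IPC::SysV module rather than from the .ph files, and the message is built with pack, using a native long integer for the message type. The type value 7 and the message text are arbitrary.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID IPC_NOWAIT);

# Create a private queue that only its owner can read and write.
my $msgid = msgget (IPC_PRIVATE, IPC_CREAT | 0600);
die ("msgget failed: $!\n") unless defined ($msgid);

# A message is a native long message type followed by the message text.
my $type_sent = 7;    # arbitrary positive message type (illustrative)
msgsnd ($msgid, pack ("l! a*", $type_sent, "hello, queue"), IPC_NOWAIT)
    or die ("msgsnd failed: $!\n");

# Receive up to 100 bytes of text; type 0 requests the first message queued.
my $buffer;
msgrcv ($msgid, $buffer, 100, 0, IPC_NOWAIT)
    or die ("msgrcv failed: $!\n");
my ($type_received, $text) = unpack ("l! a*", $buffer);
print ("type $type_received: $text\n");

# Remove the queue so that it does not outlive the program.
msgctl ($msgid, IPC_RMID, 0);
```

IPC_NOWAIT is used here so that the sketch fails immediately, instead of waiting, if the queue is full or empty.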

The `msgctl` Function

The msgctl function enables you to set options for message queues and send commands that affect them.

The syntax of the msgctl function is

msgctl (msgid, msgcmd, msgarg);

msgid is the message queue ID. msgcmd is the command to be sent to the message queue; the list of available commands is defined in the file /usr/include/sys/ipc.h.

Some of the commands that can be specified by msgcmd set the values of message queue options. If one of these commands is specified, the new value of the option is specified in msgarg.

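If an error occurs, msgctl returns the undefined value. Otherwise, it returns the string "0 but true" if the underlying system call returns zero, or the value returned by the system call if that value is nonzero.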
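The different return values can be observed by removing the same queue twice. This is a sketch under the assumption that the IPC::SysV module supplies the constants; the describe subroutine is a hypothetical helper written for this example.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID);

my $msgid = msgget (IPC_PRIVATE, IPC_CREAT | 0600);
die ("msgget failed: $!\n") unless defined ($msgid);

# Translate a msgctl return value into words (helper name is illustrative).
sub describe {
    my ($result) = @_;
    return "error" unless defined ($result);
    return "success, system call returned " . ($result + 0);
}

my $first  = msgctl ($msgid, IPC_RMID, 0);   # removes the queue
my $second = msgctl ($msgid, IPC_RMID, 0);   # the queue no longer exists

print ("first call:  ", describe ($first), "\n");
print ("second call: ", describe ($second), "\n");
```

Testing the result with defined, rather than testing it for truth or comparing it with zero, distinguishes a failure from a successful call.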

The `shmget` Function

To use the System V shared memory capability, you must first create the shared memory. To do this, call the shmget function.

The syntax of the shmget function is

shmid = shmget (key, size, flag);

Here, key is either IPC_PRIVATE or an arbitrary constant. If key is IPC_PRIVATE or flag has IPC_CREAT set, the shared memory segment is created, and its ID is returned in shmid. size is the size of the created shared memory (in bytes). If shmget is unable to create the shared memory segment, shmid is set to the undefined value.

The `shmwrite` Function

To send data to a particular segment of shared memory, call the shmwrite function.

The syntax of the shmwrite function is

shmwrite (shmid, text, pos, size);

shmid is the shared memory ID returned by shmget. text is the character string to write to the shared memory, pos is the number of bytes to skip over in the shared memory before writing to it, and size is the number of bytes to write.

This function returns a nonzero value if the write operation succeeds; it returns zero if an error occurs.

NOTE

If the character string specified by text is longer than the value specified by size, only the first size bytes of text are written to the shared memory.

If the character string specified by text is shorter than the value specified by size, shmwrite fills the leftover space with null characters.

The `shmread` Function

To obtain data from a segment of shared memory, call the shmread function.

The syntax of the shmread function is

shmread (shmid, retval, pos, size);

Here, shmid is the shared memory ID returned by shmget. retval is a scalar variable (or array element) in which the returned data is to be stored. pos is the number of bytes to skip in the shared memory segment before copying to retval, and size is the number of bytes to copy.

This function returns a nonzero value if the read operation succeeds, and it returns zero if an error occurs.
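To see shmwrite and shmread working together, consider the following sketch. It assumes that the IPC::SysV module supplies the constants (instead of the .ph files) and uses a 64-byte segment; the strings and offsets are arbitrary. The first write shows padding with null characters, and the second shows truncation.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID);

# Create a private 64-byte segment that only its owner can read and write.
my $shmid = shmget (IPC_PRIVATE, 64, IPC_CREAT | 0600);
die ("shmget failed: $!\n") unless defined ($shmid);

# Write 10 bytes at position 0: "hello" is padded with five null characters.
shmwrite ($shmid, "hello", 0, 10) or die ("shmwrite failed: $!\n");

# Write 3 bytes at position 10: only "wor" of "world" is stored.
shmwrite ($shmid, "world", 10, 3) or die ("shmwrite failed: $!\n");

# Read back the 13 bytes just written.
my $data;
shmread ($shmid, $data, 0, 13) or die ("shmread failed: $!\n");

(my $visible = $data) =~ s/\0/./g;    # show null characters as periods
print ("$visible\n");                 # prints hello.....wor

# Remove the segment so that it does not outlive the program.
shmctl ($shmid, IPC_RMID, 0);
```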

The `shmctl` Function

The shmctl function enables you to set options for shared memory segments and send commands that affect them.

The syntax of the shmctl function is

shmctl (shmid, shmcmd, shmarg);

shmid is the shared memory ID returned by shmget. shmcmd is the command that affects the shared memory; the list of available commands is defined in the header file named /usr/include/sys/ipc.h.

Some of the commands that can be specified by shmcmd set the values of shared memory options. If one of these commands is specified, the new value of the option is specified in shmarg.

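If an error occurs, shmctl returns the undefined value. Otherwise, it returns the string "0 but true" if the underlying system call returns zero, or the value returned by the system call if that value is nonzero.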

The `semget` Function

To use the System V semaphore facility, you must first create the semaphore. To do this, call the semget function.

The syntax of the semget function is

semid = semget (key, num, flag);

Here, key is either IPC_PRIVATE or an arbitrary constant. If key is IPC_PRIVATE or flag has IPC_CREAT set, the semaphore set is created, and its ID is returned in semid. num is the number of semaphores created. If semget is unable to create the semaphore set, semid is set to the undefined value.

The `semop` Function

To perform a semaphore operation, call the semop function.

The syntax of the semop function is

semop (semid, semstructs);

Here, semid is the semaphore ID returned by semget, and semstructs is a character string consisting of an array of semaphore structures. Each semaphore structure consists of the following components, each of which is a short integer (as created by the s format character in pack):

The semaphore number, which identifies one semaphore within the set
The semaphore operation
The semaphore flags, if any

This function returns a nonzero value if the semaphore operation is successful, zero if an error occurs.

NOTE

For more information on semaphore operations and the semaphore structure, refer to the semop manual page.
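The semaphore structure can be built with pack, as in the following sketch. The example assumes that the IPC::SysV module supplies the constants and that Perl's built-in semctl is called with four arguments (ID, semaphore number, command, and argument). It sets the semaphore to zero first so that the outcome does not depend on the initial value chosen by the system.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID IPC_NOWAIT SETVAL);

# Create a private set containing a single semaphore.
my $semid = semget (IPC_PRIVATE, 1, IPC_CREAT | 0600);
die ("semget failed: $!\n") unless defined ($semid);

# Start from a known value of zero.
semctl ($semid, 0, SETVAL, 0) or die ("semctl failed: $!\n");

# Each structure holds three native shorts: number, operation, and flags.
my $release = pack ("s!3", 0,  1, 0);            # add 1 to semaphore 0
my $acquire = pack ("s!3", 0, -1, IPC_NOWAIT);   # subtract 1; never wait

my @results;
push (@results, semop ($semid, $release) ? "ok" : "failed");   # 0 becomes 1
push (@results, semop ($semid, $acquire) ? "ok" : "failed");   # 1 becomes 0
push (@results, semop ($semid, $acquire) ? "ok" : "failed");   # would wait
print ("@results\n");    # prints ok ok failed

# Remove the semaphore set so that it does not outlive the program.
semctl ($semid, 0, IPC_RMID, 0);
```

Without IPC_NOWAIT, the third operation would suspend the program until another process raised the semaphore's value.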

The `semctl` Function

The semctl function enables you to set options for semaphores and send commands that affect them.

The syntax of the semctl function is

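semctl (semid, semnum, semcmd, semarg);

semid is the semaphore ID returned by semget, and semnum is the number of the semaphore within the set that the command affects. semcmd is the command that affects the semaphore; the available commands are defined in the header files /usr/include/sys/ipc.h and /usr/include/sys/sem.h.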

Some of the commands that can be specified by semcmd set the values of semaphore options. If one of these commands is specified, the new value of the option is specified in semarg.

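If an error occurs, semctl returns the undefined value. Otherwise, it returns the string "0 but true" if the underlying system call returns zero, or the value returned by the system call if that value is nonzero.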
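As an illustration, the following sketch sets the value of a semaphore and reads it back. It assumes that the IPC::SysV module supplies the SETVAL, GETVAL, and IPC_RMID constants, and it calls Perl's built-in semctl with four arguments: the semaphore ID, the number of the semaphore within the set, the command, and the argument. The value 3 is arbitrary.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID SETVAL GETVAL);

# Create a private set containing a single semaphore.
my $semid = semget (IPC_PRIVATE, 1, IPC_CREAT | 0600);
die ("semget failed: $!\n") unless defined ($semid);

# Set semaphore 0 to the value 3, then read the value back.
semctl ($semid, 0, SETVAL, 3) or die ("semctl SETVAL failed: $!\n");
my $value = semctl ($semid, 0, GETVAL, 0);
die ("semctl GETVAL failed: $!\n") unless defined ($value);
print ("semaphore value: ", $value + 0, "\n");    # prints 3

# Remove the semaphore set; success is reported as "0 but true".
my $removed = semctl ($semid, 0, IPC_RMID, 0);
print ("removed\n") if defined ($removed);
```

Adding zero to the result of GETVAL converts the string "0 but true" to the number 0 when the semaphore's value is zero.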

Summary

Today you learned about Perl functions that emulate system library functions, perform Berkeley UNIX socket operations, and perform System V IPC operations.

Perl functions that emulate system library functions perform the following tasks, among others:

Read the /etc/group file, which lists the user groups for your machine
Read the /etc/networks file, which lists networks to which your machine is connected
Read the /etc/hosts file, which lists the remote machines accessible from your local network
Obtain the current login user ID
Retrieve the current process group and parent process ID
Read the /etc/passwd file, which lists information about the users who have access to your machine
Obtain the current priority for your program and set it to another value
Read the /etc/protocols file, which lists the types of protocols available for interprocess communication
Read the /etc/services file, which lists the port numbers associated with system services on your machine

Today you also learned about the Berkeley UNIX socket mechanism, which provides interprocess communication using a client-server model, and took a brief look at the System V IPC message queue, shared memory, and semaphore capabilities.

Q&A

Q:	What is the difference between `getnetent` and `gethostent`, and which one accesses `/etc/networks`?
A:	On most systems, `getnetent` accesses the contents of `/etc/networks`, which lists the names and numbers of the networks for your machine. `gethostent`, on the other hand, accesses the contents of `/etc/hosts`, which lists the names and addresses of other machines on local and remote networks.
Q:	What will happen if I establish a socket connection using a port number listed in `/etc/services`?
A:	If the system service is always active, the system probably will not allow you to establish a socket connection using this port. If the system service runs intermittently, you run the risk of disrupting it. In your programs, it is always best to use a port never used by any other system service.
Q:	How did sockets get their name?
A:	A server process that is listening for clients is like an electrical socket on your wall: any client process with the appropriate protocol can "plug into" it.
Q:	What is the purpose of a semaphore?
A:	A semaphore is a method of ensuring that only one process can run a particular segment of code or access a particular chunk of shared memory storage at any given time. A full description of how semaphores work is beyond the scope of this book. Many books on operating systems can give you an introduction to the concepts used in semaphores. Also, the UNIX System V manual pages for the semaphore functions listed in today's lesson provide a brief description of how semaphores work.
Q:	The machine name I retrieved using `gethostbyaddr` has a lot of funny characters in it. Why?
A:	The address you've retrieved is an Internet domain address, which is a list of names separated by periods (`.`). These domain names ensure that each Internet user and machine can be distinguished from the millions of other users and machines around the world. For more details on the Internet and how to use it, refer to a book on the subject (many are available).

Workshop

The Workshop provides quiz questions to help you solidify your understanding of the material covered, and exercises to give you experience in using what you've learned. Try to understand the quiz and exercise answers before you go on to tomorrow's lesson.

Quiz

Which functions manipulate the following files?
a. /etc/passwd
b. /etc/hosts
c. /etc/networks
d. /etc/services
Which of the following functions are called by client processes and which by server processes when performing socket operations? In what order should these functions be called?
a. bind
b. listen
c. socket
d. accept
e. connect
What do the following functions do?
a. getpwuid
b. setprotoent
c. gethostbyaddr
d. getgrent
e. getservbyport
How do you send information using a socket?
Describe how to list all the (numeric) user IDs on your machine.

Exercises

Write a program that lists (by name) all the groups into which user IDs are sorted on your machine. List all the user names in each group. Sort the groups, and the user names in each group, in alphabetical order.
Write a program that lists every user name on your machine and prints the home directory for each.
Write a program that lists the shells used by users on your machine. List the number of users of each shell, and sort the list in descending order of use.
Write a program that splits into two identical processes, and have each process print the process ID of the other.
Write a program that sends a specific file, /u/jqpublic/testfile, to clients who request it. The program should send the file by creating a copy of itself using fork, and it should be able to send to five clients at once.
BUG BUSTER: What is wrong with the following program?
#!/usr/local/bin/perl

print ("Network names and numbers at your site:\n");
while (($name, $d1, $d2, $address) = getnetent()) {
        print ("$name, at address $address\n");
}