Introduction

The purpose of this lab is to introduce you to some features of I/O in an operating system, and one mechanism for doing interprocess communication. You will be asked to extend the shell you wrote in the previous lab, but in case you didn't finish, a sample solution will be made available before the start of the week for this lab.

The lab is divided into sections, and in each section you will be asked to develop your program a little further. It is important you keep copies of the state of your code from the end of each section, as your demonstrator will mark every stage, not just the final version.

You must complete this lab to pass it.

Nevertheless, your program does not have to be bug free or even work particularly well; it just needs to cover all of the features mentioned in the lab. In fact, you should expect and ignore bugs in your program that result from tricky command lines, since the aim of this lab is not to write a good command line parser.

For all of the function calls you use in this lab, you should make a habit of printing error information using strerror() and errno that was introduced in the previous lab. Debugging is very difficult without knowing what has gone wrong.

cd - Doing work in the `parent'

Every process has a current working directory that is used to complete relative paths. In a shell this is particularly important, and the `present'(current) working directory can be seen with the command pwd. You probably already know that the command cd is used to change this directory.

Add the ability to change directory to your shell when it is given the standard command cd. This is done with the chdir() function which will need a directory name that can be extracted from the command line in the usual fashion (probably with strtok()). The only trick is that this must be done without forking, since it is the shell itself that needs to change its working directory, and a new process would not have that effect on its parent.

I/O redirection - Some special command line parsing

Every UNIX shell offers I/O redirection with the characters `<' (which redirects standard input to come from a file) and `>' (which redirects standard output to go to a file). The programming of this feature should be done in three stages.

Parsing a command line with redirection

The main thing to keep in mind is that the redirection is an instruction to the shell, and is not something that is part of the command line to be executed by the child. For example, most shells allow you to write something like
ls > listing -al
which actually executes the command ls -al, as if the part that says > listing wasn't there at all. Nevertheless, the conventional way to write this would have been ls -al > listing and it is acceptable if this is all your shell can cope with.

To make the command line parsing simple, search the commandline for an occurance of the character `>' using strchr(), and if found replace it, along with exactly one word following (which is the filename), with spaces. That would turn
ls > listing -al
into

ls           -al
Do the same thing for the character `<'.

Giving the child alternative I/O

Since redirection means the child may no longer be reading from the keyboard or writing to the screen, the first thing that must be introduced is the ability of the child to set its I/O to whatever it is told.

In UNIX, all I/O control is done with file descriptors, which are integers that can be set to refer to different things such as different files. The operating system maintains a separate list for each process of what the file descriptors refer to. By convention the value 0 is standard input, 1 is standard output, and 2 is standard error, though we won't be concerned with that last one.

Since fork() creates a child process as a copy of the parent process, it gets copies of all the file descriptor references also. This is why standard input and standard output worked automatically in the first lab. We can take advantage of this by preparing in the shell any file descriptors the child will need to use, and then using them appropriately after the fork.

To begin with, open a new file before doing the fork that will be written to by the child instead of the screen. Since we are using file descriptors, you should use open() for this instead of fopen(). To keep this stage simple, hardcode the name of the file in your program. The flags that you will need are for writing, creating (since the file might not yet exist), and truncating (since writing less than the current contents of the file would leave the rest). Also, in case the file is created, you should use a mode of at least read and write permissions for the file owner, which is given as the same kind of number used by the command chmod. This is an octal number, and so can be given in C by making the first digit 0. If the open() is successful, a file descriptor will be returned.

Since the aim of this is to make the child write its output to this file instead of the screen, we need a way of making its standard output file descriptor, which is 1, point to this file instead of the screen. The way to `move' a particular file descriptor is with dup2(), and this should be used at the start of the child (i.e. after the fork) to make 1 point to the same thing that the file descriptor returned by the open() points to, which is the file. Once this has been done, any standard output produced by the child will go to that file.

Finally, since only the child is using the file, the parent must close it immediately after the fork(). close() needs to be used and not fclose() for this. Make sure you do not close file descriptor 0 or 1 for the shell itself or it will stop being able to input or output. It will be particularly important for this closing to be correct in the last section of the lab, so make sure you have it in place here. If you want to make sure you are closing properly, then print out the file descriptor each time you open a file. If your shell is holding files open then the descriptors will continue to increase. If not, then the descriptors used by a particular command line should always start at 3.

Specifying the redirection on the command line

With the first two stages working, this should not be difficult. Place your call to open() inside the part of your program that has detected the character `>', and instead of using a hardcoded name for the file, use the word following the `>'. Repeat this idea for the code dealing with the character `<', except you will need to store this file descriptor, which is for input, separately. Also put a second dup2() call in the child in the same place as the first to redirect file descriptor 0, which is for standard input to this other file, and add the corresponding second call to close() in the shell.

Make sure you test this new feature of your shell. Ask your demonstrator for help if you have trouble doing so.

The `;' separator - Multiple commands on the command line

UNIX shells conventionally use the semicolon, `;', as a command separator in the same manner as languages like C, to allow command lines such as
ls > listing ; echo to screen ; ls
or even
ls>listing;echo to screen;ls
If your program ends up requiring the extra spaces, that is fine.

To do this, you will need to introduce a loop inside the loop you already have, after the command line has been read in, that will run until the entire command line has been processed. The start of this loop will have to separate out one part of the command line at a time according to where the semicolons are. It is tempting to use strtok() to separate the command line into separate strings, each of which is exactly one command, but there is a problem. strtok() remembers where it is up to in a string, but can only do this one string at a time. The code you have for breaking up a command into arguments and for breaking up the PATH into directories probably already uses strtok(), and this will cause the strtok() for the semicolons to forget where it was up to.

Normally the solution for this is to use a variant of strtok() called strtok_r() which will allow you to specify a variable for remembering where the tokenising is up to, and will not get confused with the uses of strtok() already in your code. Unfortunately, later in the lab you will be tokenising with another delimiter in addition to the semicolon and you will need to know which one was found. Since strtok() and strtok_r() overwrite the delimiters they find, the easiest overall solution is to write some code yourself which does the same job as strtok_r().

Code which mimics strtok_r() should scan for a semicolon using strchr(), remember where it is, and replace it with a '\0'. You can then give the code you already have for the rest of your shell a pointer to the start of the command line and that code will think the command line ends where the semicolon used to be. Next time around, the start of this new loop should begin the scan for the next semicolon at the place just after the previous semicolon was found, and give this as the start of the command line to the rest of the code.

A common bug when this feature is introduced is for the strtok_r() code or equivalent to produce a string like " " or "" which is not a command at all. This may cause the rest of your code to crash, so the easiest way to fix it is to test for such empty commands after the first strtok() for separating out the arguments. If your code is about to try executing an empty command, just skip it.

The `&' separator - Backgrounded commands

Running a command in the background means the shell does not wait for the command to finish before prompting for the next command. It is typical to run graphical programs in the background as in
emacs &
or
xv &
or
netscape &
since you normally want to keep using the shell for other things even while the graphical program is running.

What is not as typical, and perhaps not well known, is that the ampersand, `&', is a command separator just like the semicolon, and can be used like this
emacs & xv & netscape &
if you wanted to run all three programs in the background. Note that you still need the ampersand on the end to indicate the last command is backgrounded, but the end result is the same as if you had typed the three lines above separately.

It will be easiest to add this feature in three stages.

Parsing the ampersand

Change the strchr() call in your semicolon code to strpbrk() which will do the same thing for finding any of a number of characters rather than just one, and have it look for both semicolons and ampersands. Nothing else should need changing to make the ampersand work as a separator, so test this before continuing.

Not waiting for backgrounded commands

Of course, a command is not backgrounded if the shell is still waiting for it, but this is also only a small modification. Declare a flag variable in your code and set it to indicate whether the command should be backgrounded or not based on whether a semicolon or an ampersand was found by strpbrk(). Then after the fork() skip the waitpid() for the parent if the flag variable indicates it is a backgrounded command that is being done.

Redirecting default input for backgrounded commands

Since standard input is automatically copied for child processes, backgrounded commands still have access to the keyboard by default. This presents a problem when the shell prompts for the next command as the shell and the backgrounded process will end up fighting for each key that is pressed, quite literally. If you are game, you can try running cat & to see what happens. Normally this command, when run in the foreground, would just echo back anything that is typed. When it is backgrounded and fights with the shell, the behaviour is unpredictable.

The solution is to redirect the standard input to a special file in UNIX which gives no input, and this file is called "/dev/null". The redirection should be done in the child using dup2() as before if the flag variable indicates a backgrounded command is being executed, but should be done before the other dup2() calls so that a user-specified redirect will override it.

(This odd method of using /dev/null is needed because if standard input was simply closed the backgrounded command would get an error for the input instead of no input.)

Revenge of the zombies

Run a few backgrounded commands such as ls & in your shell. The output of them will mix with the shell prompt, but this is supposed to happen. Next, do a ps and you will see that these backgrounded commands all turned into zombies when they completed, since the parent never waited for them. As described in the previous lab, they are zombies because they are waiting for the parent to wait for them. In this section you will better understand why this is done.

At the moment your code should be calling waitpid() with a specific PID, being the PID of the child most recently created. For commands already backgrounded, however, we would have to maintain a list of PIDs that we didn't wait for to do something similar. While some shells do exactly that, we will use the simpler method of giving -1 as the PID to wait for any lingering child processes.

Leave the current call to waitpid() as it is. The code needed to wait for backgrounded commands is in addition to the waitpid() which is already there, since that is used to wait for foregrounded commands and is still necessary.

The wait for backgrounded commands will output information about any processes it found that had finished, so pick a part of your shell's loop where you're happy for this to happen. The spot you pick is significant because it might mean that you have to type another command before the wait for previous backgrounded commands occurs, but this is fine.

Since the call to waitpid() is a `check' of the backgrounded processes rather than making the shell genuinely wait for them, it needs to be a `non-blocking' call, i.e. a call which does not block(stop) the process doing the call. This is achieved with an option for waitpid(). Furthermore, each time waitpid() is called with -1, it does a wait for either zero or one processes, never more. Since there may be a few backgrounded commands, this waitpid() use should be placed inside a loop which terminates when the return value of waitpid() indicates there are no more processes to wait for.

Inside the loop, for each of the PIDs waitpid() returns, you should check the status of that process with one of the macros described in the waitpid() man page (remember the status ends up in one of the arguments to waitpid()). If the status is that the process has exited, print out what its exit status was along with its PID. The exit status is the number that the process gave to the exit() function call, and is conventionally used to indicate if the program was successful or not - kind of like a return value. An exit status of 0 means success, anything else is a failure.

To see what this output should look like, try a backgrounded command as before in a real shell and see what happens.

The reporting of exit status is one of the real reasons that processes go into a zombie state. If they didn't hang around like that, we would never be able to tell what the exit status was. If you like, you can also print out the exit status of foregrounded processes, as this will be given by the waitpid() from the previous lab. Although most shells don't show this information for foregrounded processes, it can be useful to see for getting used to the idea of exit statuses and the need for waiting and zombies.

Test everything you've done so far with
ls & ls ghjk &
Assuming you don't have a file called "ghjk" in your directory, you should see two very different exit statuses for these commands.

Piping using `|' - Putting everything together

The pipe symbol, `|', redirects standard output for the command to the left of it, redirects standard input for the command to the right, serves as a separator, and backgrounds the commands on the left. If you have everything in the lab working so far, most of the code you need to do it will already be in your program.

Piping is used to send the output of one command to the input of another and, if you like, to another and then to another and so on. For instance, you can view the listing of a gzipped tarfile with something like
gzip -d < buggy.tar.gz | tar tvf - | less
which has gzip reading a file from standard input and uncompressing to standard output, tar picking up the output of gzip and outputting a listing of the tarfile, and less picking up that output and displaying it page by page.

The code for this will be added in similar stages as for previous parts.

Parsing the pipe

Add the character `|' to the strpbrk() list with the other two separators.

Backgrounding piped commands

When the separator is found to be `|', set the backgrounding flag just as you do when the separator is `&'.

Creating the pipe

All that remains now is to correctly prepare the file descriptors before the fork(). The dup2() calls in the child will not need any modification.

The function pipe() creates a pipe. It takes an array of two ints (which will be called pipefd[] for this discussion) as its argument and, if the pipe creation is successful, puts the file descriptor that refers to each end in each element. Even though it deals in file descriptors, the pipe is not a file. Instead, it is a FIFO queue, a buffer, in the operating system's memory, that can only be written to with pipefd[1] and read from with pipefd[0]. This means that the operation of the pipe looks this way

command1              |              command2
process1 > pipefd[1]=====pipefd[0] > process2
It may seem like the numbering in the pipefd[] array is backwards, but the reason it is done like this is that process1 is writing into the pipe, which means it must redirect file descriptor 1 (standard output) to pipefd[1]. process2, on the other hand, is reading, which means it must redirect file descriptor 0 (standard input) to pipefd[0]. So the numbers clearly match up when viewed this way.

The most significant thing about this situation is that the pipe is created just before executing command1, but the file descriptor in pipefd[0] must be remembered for use with command2. Moreover, if command2 pipes into another command, command3, then another pipe must be created, but not confused with the previous pipe.

So the logic at the start of the loop must go like this:

  1. Set the input file descriptor of the up and coming child to pipefd[0], the output of the pipe from the previous command.
  2. Set pipefd[0]=0 in case the current command does not create another pipe. This way the command after that will be reading from the keyboard again.
  3. Check to see if the separator at the end of the current command is a pipe. If it is, create a new pipe.
  4. Set the output file descriptor of the up and coming child to pipefd[1], regardless of whether a new pipe was created.
  5. Set pipefd[1]=1 in case the next command does not create another pipe, so that it will be writing to the screen again.

This creates the handovers needed to make piping work, but is still missing one small detail. pipefd[] is only set back to default values after it is used since it must preserve the file descriptors of the most recent pipe, but what if there is no recent pipe? The very first command on the line, which might be the only command, has no pipes created for it. As such, pipefd[] should be initialised to 0 and 1 even before the parsing of the command line begins.

Denouement

Congratulations, you now have a very usable shell not unlike the basic functionality of professional shells such as sh, csh, tcsh, bash, and zsh. Of course, these shells support complete scripting languages with if statements and loops, but apart from that the only real differences are in job control, variables, globbing, history, and file completion.

Job control is the most interesting of these as it provides ways of pausing processes with ^Z, resuming them in the foreground with fg, or in the background with bg, and even bringing a background process back into the foreground. A great deal of this is achieved with something called a signal (there is a man page), which is the standard way in UNIX of doing `unpredictable' interprocess communication, as opposed to pipes which are specified and set up in advance. Nevertheless, even for signals some advance work is required and this takes the form of establishing some functions, using function pointers, which will be called if the process receives a particular signal. Such functions are signal handlers - similar to the idea of an interrupt handler.

Variables are the same idea as exists in a programming language, and in fact are really only used in shell scripting. Globbing is what allows the use of characters like `~' for home directory and `*' for filename matching. The history is, as it sounds, a list of commands you've used in the past, making it easy to repeat them. Finally file completion is, in most shells, achieved by pressing the tab key after having typed most of a file name, and the shell will fill in the rest if it can. The addition of any of these to your shell is a worthwhile programming exercise, but they do not require any more understanding of the operating system in which the shell is running.