Implementation

The new command line environment was implemented in three stages, one for each of the three categories of programs. First, a couple of data generating programs were developed. Then a few data filtering programs were written. Finally, a shell for interacting with the environment was created.

Presented here is a selection of the programs implemented for the project.

Data generating programs

Two of the data generating programs created for the new command line environment are ls and ps.

ls

The ls program is a short Perl script that calls stat on each file listed on the command line. Here is some sample output from the ls program:

<files xmlns="urn:ns:clm.xml-unix.ls" long="true">
  <file name="." path="." device="2102" inode="2392620" nlink="2" uid="1000"
    gid="1000" rdev="0" size="4096" blksize="4096" blocks="8" type="directory"
    mode="0755" atime="1036936080" mtime="1036448944" ctime="1036448944"
    username="cameron" group="cameron" modechars="drwxr-xr-x"
    human-mtime="Nov  5 09:29"/>
  <file name=".." path="." device="2102" inode="2392619" nlink="15"
    uid="1000" gid="1000" rdev="0" size="4096" blksize="4096" blocks="8"
    type="directory" mode="0755" atime="1036935786" mtime="1036645021"
    ctime="1036645021" username="cameron" group="cameron" modechars="drwxr-xr-x"
    human-mtime="Nov  7 15:57"/>
  <file name="one.xml" path="." device="2102" inode="2392135" nlink="1"
    uid="1000" gid="1000" rdev="0" size="1151" blksize="4096" blocks="8"
    type="regular" mode="0644" atime="1036645028" mtime="1027316581"
    ctime="1027316581" username="cameron" group="cameron" modechars="-rw-r--r--"
    human-mtime="Jul 22 15:43"/>
  <file name="two.xml" path="." device="2102" inode="2392137" nlink="1"
    uid="1000" gid="1000" rdev="0" size="5912" blksize="4096" blocks="16"
    type="regular" mode="0644" atime="1036645028" mtime="1027316588"
    ctime="1027316588" username="cameron" group="cameron" modechars="-rw-r--r--"
    human-mtime="Jul 22 15:43"/>
</files>
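
The stat-to-XML step can be sketched roughly as follows. This is a Python illustration, not the project's Perl script: the function names (stat_files, render, file_type, owner_name) are hypothetical, the type names other than "directory" and "regular" are guesses, and the device number is written in decimal, which may not match the original's formatting. Only the namespace and attribute names are taken from the sample output above.

```python
import grp
import os
import pwd
import stat
import time
import xml.etree.ElementTree as ET

LS_NS = "urn:ns:clm.xml-unix.ls"  # namespace copied from the sample output


def file_type(mode):
    """Map a stat mode to a type name. Only 'directory' and 'regular'
    appear in the sample; the other names are assumptions."""
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISREG(mode):
        return "regular"
    if stat.S_ISLNK(mode):
        return "symlink"
    return "other"


def owner_name(lookup, ident, field):
    """Resolve a uid or gid to a name, falling back to the number."""
    try:
        return getattr(lookup(ident), field)
    except KeyError:
        return str(ident)


def stat_files(paths):
    """Build a <files> element with one <file> child per path."""
    root = ET.Element(f"{{{LS_NS}}}files", {"long": "true"})
    for path in paths:
        st = os.lstat(path)
        mtime = time.localtime(st.st_mtime)
        ET.SubElement(root, f"{{{LS_NS}}}file", {
            "name": os.path.basename(path) or path,
            "path": os.path.dirname(path) or ".",
            "device": str(st.st_dev),
            "inode": str(st.st_ino),
            "nlink": str(st.st_nlink),
            "uid": str(st.st_uid),
            "gid": str(st.st_gid),
            "rdev": str(st.st_rdev),
            "size": str(st.st_size),
            "blksize": str(st.st_blksize),
            "blocks": str(st.st_blocks),
            "type": file_type(st.st_mode),
            "mode": format(stat.S_IMODE(st.st_mode), "04o"),
            "atime": str(int(st.st_atime)),
            "mtime": str(int(st.st_mtime)),
            "ctime": str(int(st.st_ctime)),
            "username": owner_name(pwd.getpwuid, st.st_uid, "pw_name"),
            "group": owner_name(grp.getgrgid, st.st_gid, "gr_name"),
            "modechars": stat.filemode(st.st_mode),
            # Day of month is space-padded, as in "Nov  5 09:29".
            "human-mtime": "{} {:2d} {}".format(
                time.strftime("%b", mtime), mtime.tm_mday,
                time.strftime("%H:%M", mtime)),
        })
    return root


def render(root):
    """Serialise with the ls namespace as the default namespace."""
    ET.register_namespace("", LS_NS)
    ET.indent(root)  # requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")
```

Calling render(stat_files([".", "..", "one.xml"])) on a Unix system would produce a document of the same general shape as the sample, with values taken from the local file system.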

All of the relevant information about each file is stored as an attribute in the element corresponding to that file. If this program is run from the new shell, the output XML document is transformed into the familiar ls output:

-rw-r--r--    1 cameron  cameron      1151 Jul 22 15:43 one.xml
-rw-r--r--    1 cameron  cameron      5912 Jul 22 15:43 two.xml
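
How the shell performs this transformation is not described here. As a minimal sketch, assuming the listing is built from the modechars, nlink, username, group, size, human-mtime and name attributes, and assuming entries whose names start with a dot are hidden as in traditional ls, the rendering could look like this in Python. The function name render_long, the show_hidden parameter and the column widths are illustrative choices; SAMPLE is an abridged copy of the output shown earlier.

```python
import xml.etree.ElementTree as ET

LS_NS = "{urn:ns:clm.xml-unix.ls}"  # namespace from the sample, in ElementTree form

# Abridged copy of the sample document: only the attributes used below.
SAMPLE = """<files xmlns="urn:ns:clm.xml-unix.ls" long="true">
  <file name="." nlink="2" size="4096" username="cameron" group="cameron"
    modechars="drwxr-xr-x" human-mtime="Nov  5 09:29"/>
  <file name="one.xml" nlink="1" size="1151" username="cameron" group="cameron"
    modechars="-rw-r--r--" human-mtime="Jul 22 15:43"/>
  <file name="two.xml" nlink="1" size="5912" username="cameron" group="cameron"
    modechars="-rw-r--r--" human-mtime="Jul 22 15:43"/>
</files>"""


def render_long(xml_text, show_hidden=False):
    """Turn a <files> document into ls-style long-format lines."""
    root = ET.fromstring(xml_text)
    lines = []
    for entry in root.iter(LS_NS + "file"):
        name = entry.get("name")
        if name.startswith(".") and not show_hidden:
            continue  # assumption: dot entries are hidden by default
        lines.append(
            f"{entry.get('modechars')}{entry.get('nlink'):>5} "
            f"{entry.get('username'):<8} {entry.get('group'):<8} "
            f"{entry.get('size'):>8} {entry.get('human-mtime')} {name}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(render_long(SAMPLE)))
```

Because every field is a named attribute, the formatter never has to parse columns; a different view of the same data only needs a different format string.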

ps

The ps program is another Perl script. It outputs an XML document whose elements describe the processes running on the system. It is a wrapper around the GNU ps program. Here is some example XML output from ps:

<!-- many attributes elided from this sample output -->
<processes xmlns="urn:ns:clm.xml-unix.ps" sid="9703">
  <process pid="1" lstart='Sun Nov 10 13:07:49 2002' etime='12:03:53'
    command='ini ' sid='0' wchan='select'/>
  <process pid="239" lstart='Sun Nov 10 13:08:08 2002' etime='12:03:34'
    command='/sbin/dhclient-2.2.x -q eth0' sid='239' wchan='select'/>
  <process pid="243" lstart='Sun Nov 10 13:08:08 2002' etime='12:03:34'
    command='/sbin/portmap' sid='243' wchan='poll'/>
  <process pid="3702" lstart='Sun Nov 10 13:08:19 2002' etime='12:03:23'
    command='/sbin/syslogd' wchan='select'/>
  <process pid="3705" lstart='Sun Nov 10 13:08:19 2002' etime='12:03:23'
    command='/sbin/klogd' wchan='syslog'/>
  <!-- ... -->
</processes>
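
The wrapper idea can be illustrated with a short Python sketch. It is not the project's Perl script, and the choice of columns is an assumption: it requests only pid, sid, wchan, etime and the command line, leaving out lstart because that column contains spaces and would need fixed-width parsing. The names parse_ps, run_ps and FIELDS are hypothetical.

```python
import subprocess
import xml.etree.ElementTree as ET

PS_NS = "urn:ns:clm.xml-unix.ps"  # namespace copied from the sample output
FIELDS = ["pid", "sid", "wchan", "etime"]  # columns that are single tokens


def parse_ps(text):
    """Convert header-less ps output into a <processes> element."""
    root = ET.Element(f"{{{PS_NS}}}processes")
    for line in text.splitlines():
        # Split off the fixed columns; the remainder is the command line.
        parts = line.split(None, len(FIELDS))
        if len(parts) <= len(FIELDS):
            continue  # blank line, or a process with no command text
        attrs = dict(zip(FIELDS, parts))
        attrs["command"] = parts[-1].rstrip()
        ET.SubElement(root, f"{{{PS_NS}}}process", attrs)
    return root


def run_ps():
    """Run the system ps and wrap its output (not exercised in tests)."""
    cmd = ["ps", "-e"]
    for field in FIELDS + ["args"]:
        cmd += ["-o", field + "="]  # trailing '=' suppresses the header
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_ps(result.stdout)
```

Keeping the parsing separate from the subprocess call means the conversion can be checked against canned text without depending on what happens to be running.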

Data filtering programs

Two of the data filtering programs written for the new command line environment are cat and sort.

cat

As in the standard UNIX command line environment, the cat program exists to concatenate files. The traditional cat can simply output the files one after another, because concatenated flat text is still flat text. The XML cat program cannot do this: writing one document element after another produces a document with more than one root element, which is not well formed.

Instead, the cat program here takes the elements from inside the document element of the source documents and inserts them at the end of the document element of the destination document. For example, if the file doc1.xml contains:

<data>
  <item id="1"/>
  <item id="2"/>
</data>

and doc2.xml contains:

<data>
  <item id="3"/>
  <item id="4"/>
</data>

then running the command "cat doc1.xml doc2.xml" will result in the following output document:

<data>
  <item id="1"/>
  <item id="2"/>
  <item id="3"/>
  <item id="4"/>
</data>
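
The merging behaviour described above can be sketched in Python with the standard xml.etree.ElementTree module. This is an illustration only: the function name xml_cat is hypothetical, the first document is assumed to be the destination, and attributes on the document elements of later inputs are simply dropped, which may differ from what the real program does.

```python
import xml.etree.ElementTree as ET


def xml_cat(documents):
    """Append the children of each later document element to the first.

    `documents` is a list of XML strings; the result is the (modified)
    document element of the first one.
    """
    if not documents:
        raise ValueError("at least one document is required")
    roots = [ET.fromstring(doc) for doc in documents]
    target = roots[0]
    for source in roots[1:]:
        # Copy the child list first so the source is not mutated mid-loop.
        target.extend(list(source))
    ET.indent(target)  # tidy whitespace; requires Python 3.9 or later
    return target


DOC1 = '<data><item id="1"/><item id="2"/></data>'
DOC2 = '<data><item id="3"/><item id="4"/></data>'

if __name__ == "__main__":
    print(ET.tostring(xml_cat([DOC1, DOC2]), encoding="unicode"))
```

The result has a single data element containing the four items in input order, so it remains well formed.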

sort

The sort program in the traditional UNIX command line environment sorts the lines of a file based on a given field (or character range). The XML sort program can instead be given an arbitrary XPath location path to use as the sort key. For example, if doc3.xml contains:

<foods>
  <staple id="3">Rice</staple>
  <fruit id="02">Starfruit</fruit>
  <fruit id="1">Apple</fruit>
  <staple id="4">Bread</staple>
</foods>

then the elements can be sorted based on their textual content with just "sort doc3.xml":

<foods>
  <fruit id="1">Apple</fruit>
  <staple id="4">Bread</staple>
  <staple id="3">Rice</staple>
  <fruit id="02">Starfruit</fruit>
</foods>

If the file is to be sorted numerically on the id attribute, the command "sort --numeric --node @id doc3.xml" can be used:

<foods>
  <fruit id="1">Apple</fruit>
  <fruit id="02">Starfruit</fruit>
  <staple id="3">Rice</staple>
  <staple id="4">Bread</staple>
</foods>
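
Both sort examples can be reproduced with a small Python sketch. It is a simplification, not the project's implementation: instead of arbitrary XPath it understands only two kinds of key, "." for an element's text content and "@name" for an attribute, because the standard xml.etree.ElementTree module does not evaluate attribute-selecting XPath expressions. The names xml_sort and sort_key, and the rule that non-numeric keys sort last, are assumptions.

```python
import xml.etree.ElementTree as ET

DOC3 = """<foods>
  <staple id="3">Rice</staple>
  <fruit id="02">Starfruit</fruit>
  <fruit id="1">Apple</fruit>
  <staple id="4">Bread</staple>
</foods>"""


def sort_key(elem, node, numeric):
    """Extract the sort key for one child element."""
    if node.startswith("@"):
        value = elem.get(node[1:], "")  # attribute key, e.g. "@id"
    else:
        value = "".join(elem.itertext())  # textual content
    if not numeric:
        return value
    try:
        return float(value)
    except ValueError:
        return float("inf")  # assumption: unparsable keys sort last


def xml_sort(document, node=".", numeric=False):
    """Reorder the children of the document element by the given key."""
    root = ET.fromstring(document)
    root[:] = sorted(root, key=lambda e: sort_key(e, node, numeric))
    ET.indent(root)  # restore tidy indentation; Python 3.9 or later
    return root


if __name__ == "__main__":
    print(ET.tostring(xml_sort(DOC3), encoding="unicode"))
    print(ET.tostring(xml_sort(DOC3, node="@id", numeric=True),
                      encoding="unicode"))
```

Numeric comparison is what places id="02" between 1 and 3; a plain string comparison on the same attribute would put "02" first.
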
Cameron McCormack <clm@csse.monash.edu.au>