Structured Process Input/Output
Cameron McCormack
clm@csse.monash.edu.au
School of Computer Science and Software Engineering
http://www.csse.monash.edu.au/~clm/home/uni/honours/

What's the project about?
- Moving UNIX process I/O away from flat text (line-based records) to structured text (XML documents)

The UNIX Command Line Environment
- Why is it popular?
  - Idea of the "software tool" (Kernighan 1976)
    - Many simple, specific programs
    - Composition of these simple programs to make complex ones
  - Focus is on flat text processing
    - Line-based records
    - Fields separated by whitespace
- Is this a problem?

A problem with flat text
- Not all data conform to this record/field model

An example
- A hierarchical directory listing
[Figure: hier2.png, a hierarchical directory listing]

An example
- Output of "ls -R"
.:
documents
./documents:
personal
shared
./documents/personal:
phonebook
resume
./documents/shared:
housekeeping
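
To make the difficulty concrete, here is a minimal Python sketch that finds the entries two directories deep in the listing above. The function name `entries_two_deep` is hypothetical, and the sketch assumes that any line ending in a colon is a directory header, which fails for file names that end in a colon.

```python
# Sketch: find entries two directories deep in `ls -R` output.
# Assumption: a line ending in ":" is a directory header. A file whose
# name ends in ":" would be misread, which is part of the problem.

LS_R_OUTPUT = """\
.:
documents
./documents:
personal
shared
./documents/personal:
phonebook
resume
./documents/shared:
housekeeping
"""


def entries_two_deep(listing):
    """Return paths of entries whose containing directory is two levels below '.'."""
    results = []
    current_dir = None
    for line in listing.splitlines():
        if not line:
            continue  # real `ls -R` output separates sections with blank lines
        if line.endswith(":"):
            current_dir = line[:-1]  # remember which directory we are in
            continue
        if current_dir is None:
            continue
        # Depth is the number of path components below "."
        depth = len([p for p in current_dir.split("/") if p not in ("", ".")])
        if depth == 2:
            results.append(current_dir + "/" + line)
    return results


print(entries_two_deep(LS_R_OUTPUT))
```

The parser has to carry state from one line to the next, because no single line of the listing says where an entry lives.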
- The hierarchical information doesn't fit into records/fields
- Using this output, how do you find all files two directories deep?

Another problem
- Many programs format output for human reading
- More difficult for programs to parse

Another example
- Output of ls long format
$ ls -l
total 0
-rw-r--r-- 1 cameron cameron 253 May 22 23:40 phonebook
-rw-r--r-- 1 cameron cameron 763 Sep 5 2001 resume
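
As a rough illustration of how fragile it is to parse this listing, the Python sketch below pulls the modification time out of each line by field position. The function name `modified_times` is hypothetical, and the field positions (6 to 8 for the time, 9 onwards for the name) are an assumption about this particular output format.

```python
# Sketch: extract the modification time from `ls -l` output by position.
# Assumption: whitespace-separated fields, with the time in fields 6-8
# and the file name from field 9 onwards.

LS_L_OUTPUT = """\
total 0
-rw-r--r-- 1 cameron cameron 253 May 22 23:40 phonebook
-rw-r--r-- 1 cameron cameron 763 Sep 5 2001 resume
"""


def modified_times(listing):
    """Map each file name to its modification time as printed by ls."""
    times = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 9:
            continue  # skips the "total 0" summary line
        name = " ".join(fields[8:])  # runs of spaces in a name are lost here
        times[name] = " ".join(fields[5:8])
    return times


print(modified_times(LS_L_OUTPUT))
```

The two results have different shapes: one is a month, day and time, the other a month, day and year. A consumer still has to guess which form it has been given.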
- How do you extract the modified time?
  - No clear field delimiter
  - Normally use "cut"
  - But this needs knowledge of output format
  - Modified time format can also change
- Need to separate information from presentation
- So, what can we do?

The solution
- Incorporate a well-defined structure into process input/output
- XML as the data format
  - Standardised
  - Parsers already exist
  - Still human readable
  - Hierarchical
  - Added benefit: Unicode

The updated ls
- The output generated by our new ls program:
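
A minimal sketch of what XML output from ls might look like, written in Python with the standard library. The element names `directory` and `file` and the `name` attribute are assumptions made for illustration, not the project's actual schema. The directory contents are taken from the earlier "ls -R" example.

```python
# Sketch: build a hierarchical XML listing and query it.
# The <directory>/<file> element names and the "name" attribute are
# hypothetical; the project's real output format may differ.
import xml.etree.ElementTree as ET

# Directory contents from the earlier "ls -R" example.
# A value of None marks a file; a dict marks a subdirectory.
TREE = {
    "documents": {
        "personal": {"phonebook": None, "resume": None},
        "shared": {"housekeeping": None},
    }
}


def build(name, children):
    """Recursively turn the nested dict into <directory> and <file> elements."""
    node = ET.Element("directory", name=name)
    for child_name, grandchildren in children.items():
        if grandchildren is None:
            ET.SubElement(node, "file", name=child_name)
        else:
            node.append(build(child_name, grandchildren))
    return node


root = build(".", TREE)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Files two directories deep are now a single path expression away.
two_deep = [f.get("name") for f in root.findall("./directory/directory/file")]
print(two_deep)
```

Because the nesting is explicit in the document, the depth query needs no line-by-line state tracking.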
- Now how easy is it to
  - Find files two directories deep
  - Extract the last modified time

What else do we need?
- Without more XML-aware programs, it is still difficult to extract information from this output
- We need some filters
  - An equivalent for cut, grep, etc.
  - Some of these filters can utilise XPath
- We can now say
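
As a rough illustration of an XPath-based filter, the Python sketch below selects an attribute from every element matching a path. The function name `xselect`, the `file` element and the `name` and `modified` attributes are all hypothetical. Python's `xml.etree.ElementTree` supports only a subset of XPath, so this stands in for a full XPath filter.

```python
# Sketch: a cut/grep-like filter over XML using a path expression.
# The schema (<file name="..." modified="..."/>) is an assumption;
# ElementTree implements only a limited subset of XPath.
import xml.etree.ElementTree as ET

LS_XML = """\
<directory name=".">
  <file name="phonebook" modified="May 22 23:40"/>
  <file name="resume" modified="Sep 5 2001"/>
</directory>
"""


def xselect(xml_text, path, attribute):
    """Return the given attribute of every element matching the path."""
    root = ET.fromstring(xml_text)
    return [node.get(attribute) for node in root.findall(path)]


# Like cut: pull one "field" from every record.
print(xselect(LS_XML, "./file", "modified"))

# Like grep followed by cut: pick a record by name, then one field.
print(xselect(LS_XML, "./file[@name='resume']", "modified"))
```

The filter refers to the data by name, not by column position, so it keeps working if the presentation of the listing changes.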

What about output for human consumption?
- This XML output is not suitable for presenting to the user
- Need to transform it to some format for the terminal
- Can use XSLT to do the transformation
- But it is unwieldy to manually transform every command's output
  - Transformation must happen automatically
- We need support from the shell

The new shell
- Handles composition of programs just like Bourne shell
- But also detects output type and transforms it appropriately
  - Issues of metadata

My project
- Split into three stages

Stage 1
- Identify standard UNIX text generating programs
  - ls, ps, netstat, etc.
- Modify or wrap them to generate XML

Stage 2
- Identify standard UNIX text filtering programs
  - grep, cut, awk, etc.
- Modify them to filter XML documents

Stage 3
- Write (simplistic) shell similar to Bourne shell
- The shell will handle presentation issues
  - Will transform output for human reading on the terminal

Project status
- At this stage, implemented two text generating programs from Stage 1
  - ls and ps
- Just started looking at Stage 2

Concluding remarks
- For more information on my project, see http://www.csse.monash.edu.au/~clm/home/uni/honours/.
- Thanks for listening!