Documentation Definition In XML
by: Susanti


1. Introduction

This thesis involves the creation of an editor that enables users to view the graphical representation of a document. In general, if there is no graphical interface for editing a document, a text-based editor is used. For computer scientists this is not a problem, as they are quite happy to work with a text-based system. However, less computer-literate users prefer software with a graphical user interface (GUI) to a text-based system. The rationale behind this project is therefore to provide users with a more user-friendly graphical representation of a document.

More specifically, this project aims to build an editor for the AXE (Ajh's XML Engine) style sheet or translation document, namely AXEsse (AXE style sheet editor). AXE is an XML tool that allows users to translate an XML (Extensible Markup Language) document into different types of document [8], such as HTML, RTF (Rich Text Format) and LaTeX, based on its style sheet. AXE and its style sheet will be discussed further in Sections 4.6 and 3.3, respectively. XML, including its background, translation process and the XML document itself, will be discussed in more detail in Chapter 2.

To achieve the project aim, the Perl (Practical Extraction and Report Language) programming language and the GTK (GIMP Toolkit) Perl binding were chosen to build the editor. Perl was chosen because it works well with markup languages such as HTML and XML and because it is good at text processing, both of which this project depends on. GTK was selected as the library for creating the graphical user interface because of its "look and feel" infrastructure [15]. The details of Perl and GTK will be discussed in Sections 5.1 and 5.2 respectively.

As stated before, the XML background, its translation process and the XML document will be discussed in Chapter 2, followed by Chapter 3 on the style sheets available for XML. Chapter 4 describes several available XML tools and compares them. The design of the editor is covered in Chapter 5, while Chapter 6 explains its implementation. The result is discussed in Chapter 7. The conclusions and future work will be discussed in Chapter 8. The User Manual, the AXEsse program interface and the program source code are included in the appendices.



2. XML: An Overview

2.1. Background

In the current Internet world, the most popular web markup language is the Hypertext Markup Language (HTML). HTML has been a powerful asset to Web development, but it lacks the capability of specialization [11]. HTML is designed to describe how web page data is presented, not what the data represents. XML (Extensible Markup Language), on the other hand, is a markup language designed to describe what the data represents. Like HTML, XML has an interoperability feature, yet it differs from HTML in two significant ways: it is extensible, and it separates document structure from document representation.

According to Kiely [16], the best part of XML is the 'X' in the acronym, which stands for extensible. This means that users have the ability to create their own vocabularies to describe the information. With this ability, an XML document can be designed to fit specific purposes. However, this is not possible with HTML. For instance, in building a document that stores address book information, XML allows users to create their own tags (tags are those that start with the 'less than' symbol, < and end with the 'greater than' symbol, >) such as <Name>, <Address>, <Phone> and <Email>. These XML tags are more meaningful and easier to understand compared to tags used in HTML. Tags will be discussed further in Section 2.3.6.

Different from HTML, XML separates document structure from document representation. This allows users to use the same XML document to produce different document representations. Moreover, the relevant information in XML can be retrieved more easily by search engines compared to HTML. For example, a user is searching for the names of all students at Monash University using a search engine. If an XML based search engine is used, the user is more likely to be able to retrieve all the names of Monash students. However, if an HTML based search engine is used, only those documents which contain the word "name" will be displayed. This is often not relevant to what the user wanted.

Like HTML, XML has an interoperability feature: it gives users the ability to exchange documents across different platforms and applications. This is essentially because an XML document is a simple text document [17]. In this sense, XML can be said to stand for exchangeable markup language rather than extensible markup language [18].

Both HTML and XML originate from the Standard Generalized Markup Language (SGML). SGML is an international standard for the semantic tagging of documents [7], which was issued in 1986. It is particularly popular in large industries such as aircraft, aerospace, power and telecommunication, because it deals with large quantities of highly structured data that need to be presented in the form of documents. It assists computers in cataloging and indexing. However, SGML was very complex and expensive.

A simpler and cheaper alternative, XML was developed in 1996 by an XML Working Group chaired by Jon Bosak of Sun Microsystems, under the auspices of the World Wide Web Consortium (W3C). As a subset of SGML, XML retained the structural power and flexibility of SGML but eliminated much of the syntactic complexity of SGML [29].

2.2. XML Translation Process

An XML document can be translated into different types of documents, such as HTML, RTF or TeX.



Figure 1: Translation Process

In the translation process, the parser is used to check the well-formedness and/or validity of an XML document. If the document is well-formed and/or valid, it can then be transformed by the Transformer or Code Generator into a different document type (see Figure 1). Well-formedness will be discussed in Section 2.3.6, while validity will be discussed in Section 2.3.3. To transform an XML document into any other document, a translation file or style sheet is also needed. The style sheet will be discussed in Chapter 3.

2.3. XML Document

The XML document is made up of character data and markup. Character data is the basic information of the document. Markup, on the other hand, describes the properties of the document. Markup properties, to be described below, include entity, CDATA, declarations, document type definitions, elements, comments, character references, and processing instructions.

2.3.1. Entity

An entity is a storage unit that contains particular parts of an XML document [6]. It may be a file, a database record or a network resource [19]. Before an entity can be referred to through an entity reference, it must be declared with an entity declaration. For example, an entity named xml is declared with the content "Extensible Markup Language". The XML processor will then replace each instance of the entity reference, i.e. &xml;, with Extensible Markup Language. An example is shown in Figure 2.

Entities are of the following types: internal, external, parsed and unparsed. Examples are shown in Figure 3. Internal entities are defined completely within the XML document, while external entities acquire their content from another source located via a URL (Uniform Resource Locator) [6]. Parsed entities are entities whose replacement text will be parsed as part of the document in which a reference to them occurs [19]. Unparsed entities are external entities that the XML processor should not parse as XML in the current document [19].



Figure 2: Example of entity declaration and entity reference

Figure 3: Example of internal, external, parsed and unparsed entity declarations
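The entity declaration and entity reference described in this section can be sketched as follows. This is a minimal illustration, not a reproduction of Figure 2: the Note element and the sentence around the reference are assumptions, and Python's standard xml.etree.ElementTree module is used purely for illustration (AXEsse itself is written in Perl).

```python
import xml.etree.ElementTree as ET

# An internal entity named "xml" is declared inside the document type
# declaration and then referenced in the content as &xml;.
# The <Note> element name is a hypothetical choice for this sketch.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE Note [
  <!ENTITY xml "Extensible Markup Language">
]>
<Note>XML stands for &xml;.</Note>"""


def expand_entities(xml_text):
    """Parse the document and return the root element's text,
    in which the parser has already replaced the entity reference."""
    return ET.fromstring(xml_text).text


print(expand_entities(DOCUMENT))
```

The parser performs the replacement, so the application only ever sees the expanded text.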

2.3.2. CDATA

CDATA, which stands for character data, is used as a verbatim quote to escape blocks of text containing characters that would otherwise be recognized as markup. A CDATA section should not be confused with the character data of the document in general. CDATA is usually used to write markup code as pure text in XML. A CDATA section begins with the string "<![CDATA[" and ends with the string "]]>". The content between these strings will not be interpreted by the XML parser. An example of CDATA is given in Figure 4.



Figure 4: Example of CDATA
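As a small sketch of the behaviour described above, the following parses a CDATA section whose content looks like markup. The Example and Greeting names are assumptions made for illustration and are not taken from Figure 4.

```python
import xml.etree.ElementTree as ET

# The text inside the CDATA section looks like markup, but the parser
# hands it to the application as plain character data.
SNIPPET = "<Example><![CDATA[<Greeting>Hello & welcome</Greeting>]]></Example>"


def cdata_text(xml_text):
    """Return the text of the root element, CDATA content included."""
    return ET.fromstring(xml_text).text


def child_count(xml_text):
    """Return how many child elements the parser found under the root."""
    return len(ET.fromstring(xml_text))


print(cdata_text(SNIPPET))
print(child_count(SNIPPET))
```

The root element ends up with no child elements, because the quoted <Greeting> tag was never interpreted as a tag.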

2.3.3. Document Type Declaration

The document type declaration specifies the document type definition a document uses [6]. The document type declaration is not to be confused with the document type definition (DTD). The DTD is the set of rules that specifies the structure of a document. It defines the legal building blocks of an XML document [20]. An example is given in Figure 5, where <!DOCTYPE Document [ … ]> is the document type declaration and the <!ELEMENT … (…)> declarations make up the DTD.



Figure 5: Example of an XML document with DTD

An XML document is valid if it matches the constraints listed in the DTD. From the example shown in Figure 5, the constraints are:
¤ A Document needs to have a Greeting part and a Body part. It cannot have more than one of each.
¤ The Greeting and Body have to be parsed character data (PCDATA).
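The two constraints above can be checked by hand, as in the sketch below. This is not DTD validation: Python's standard library does not validate against a DTD, so the function simply hard-codes the two rules listed above. The function name and the sample documents are assumptions.

```python
import xml.etree.ElementTree as ET


def satisfies_constraints(xml_text):
    """Hand-written check of the two constraints listed above.
    This only mimics what a validating parser would enforce."""
    root = ET.fromstring(xml_text)
    if root.tag != "Document" or len(root) != 2:
        return False
    for name in ("Greeting", "Body"):
        matches = root.findall(name)
        # Exactly one Greeting and exactly one Body are allowed.
        if len(matches) != 1:
            return False
        # Parsed character data only: no nested child elements.
        if len(matches[0]) != 0:
            return False
    return True


print(satisfies_constraints(
    "<Document><Greeting>Hello</Greeting><Body>Text</Body></Document>"))
```

A validating parser would derive these rules from the DTD itself rather than from code.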

2.3.4. Declaration

There are three parts to the XML declaration: the version information, the encoding declaration and the standalone document declaration.
¤ The version information part declares the version of XML that is in use. This part is required in every XML declaration (Figure 6).


Figure 6: Example of XML declaration with version information



Figure 7: Example of XML declaration with version information, encoding declaration and standalone document declaration

The XML declaration must precede the document type declaration if both are provided.

2.3.5. Comment

Comments are inserted into an XML document to improve the readability of that document. They are ignored by the processing software. A comment must begin with the string "<!--", followed by any text, and end with the string "-->". For compatibility, the string "--" must not occur within comments. An example is given in Figure 8.



Figure 8: Example of comments

2.3.6. Element

Each XML document contains one or more elements, which fall into two categories: elements with content and empty elements. An element with content begins with a start-tag and finishes with an end-tag. The start-tag consists of the name of the element type, known as the generic identifier (GI), enclosed by a 'less than' symbol < and a 'greater than' symbol >. An end-tag consists of the string "</", the same GI as the start-tag and a 'greater than' symbol >. An example is shown in Figure 9.


Figure 9: Example of an element with content

The second category is the empty element, which is an element without any content. There are two ways to denote an empty element: either by simply leaving out the content between the start-tag and end-tag, or by using an empty tag. An empty tag consists of a 'less than' symbol <, followed by the GI and closed by the string "/>". To provide a better understanding of empty elements, two examples are shown below.



Figure 10: Example of an empty element with a start-tag and an end-tag

Figure 11: Example of an empty element with an empty tag

An XML document is said to be well-formed if all of its elements' tags are paired and properly nested, that is, for each start-tag there is a matching end-tag, except for tags denoting empty elements.
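A well-formedness check of this kind can be sketched with a non-validating parser, which rejects any document whose tags are unpaired or badly nested. The function name and the sample documents below are assumptions.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<Greeting>Hello</Greeting>"))  # paired tags
print(is_well_formed("<Break/>"))                    # empty tag
print(is_well_formed("<Greeting>Hello"))             # missing end-tag
```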

In addition to content, elements may have attributes which will be discussed in Section 2.3.8.

2.3.7. Character Reference

A character reference refers to a specific character in the Unicode character set [20]. Unicode is the native character set of XML, which can be displayed by the XML browser [6]. Every Unicode character is identified by a number, and the most commonly used characters lie between 0 and 65,535. A character reference consists of the string "&#", the character's decimal code and a semicolon (;). If the hexadecimal code is used, the character reference starts with the string "&#x", followed by the hexadecimal code and a semicolon (;). For example, the Greek pi symbol has the Unicode decimal value 960, so it can be inserted into an XML document as &#960; or &#x3C0;.
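The pi example can be worked through in a few lines. The sketch below builds both forms of the character reference from a character and then lets the parser resolve them again. The helper name and the Symbol element are assumptions.

```python
import xml.etree.ElementTree as ET


def build_references(character):
    """Return the decimal and hexadecimal character references
    for a single character."""
    code = ord(character)
    return "&#%d;" % code, "&#x%X;" % code


decimal_ref, hex_ref = build_references("\u03c0")  # Greek small letter pi
print(decimal_ref, hex_ref)

# Both references resolve to the same character when parsed.
resolved = ET.fromstring("<Symbol>%s%s</Symbol>" % (decimal_ref, hex_ref)).text
print(resolved == "\u03c0\u03c0")
```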

2.3.8. Attribute


Attributes are a way of attaching characteristics or properties to the elements of a document. Each attribute is a name and value pair. For instance, a person has height and weight as properties, so these properties can be expressed as attributes of a person element (see Figure 12). Note that the attribute name is not written in quotes, while the attribute value is.


Figure 12: Example of attribute
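A sketch of how an application might read such attributes is given below. The element name, the attribute values and the person's name are invented for illustration and are not taken from Figure 12.

```python
import xml.etree.ElementTree as ET

# Attribute names are unquoted, attribute values are quoted.
# All values here are hypothetical.
person = ET.fromstring('<Person height="170cm" weight="65kg">Alex</Person>')

print(person.attrib)          # all name and value pairs as a dictionary
print(person.get("height"))   # a single attribute value
print(person.text)            # the element content is separate from attributes
```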

2.3.9. Processing Instruction

A Processing Instruction (PI) is an explicit mechanism for embedding information in a document that is intended for a particular application rather than for the XML parser or browser [6]. It allows an XML document to contain instructions for applications. The XML parser passes the instructions to the application, and the application decides what to do with them. If the application does not recognize an instruction, the instruction is ignored.
A PI starts with the string "<?", followed by strings of text, and ends with the string "?>". An example of a PI is shown in Figure 13, where the PI passes gcc HelloWorld.c to the application.



Figure 13: Example of a Processing Instruction
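The way a parser hands a PI to the application can be sketched as follows. The instruction text gcc HelloWorld.c comes from the description above, but the target name "compile" and the Document element are assumptions, since Figure 13 is not reproduced here.

```python
from xml.dom import minidom

# "compile" is a hypothetical PI target; everything after it is the
# data that the parser passes on to the application untouched.
DOCUMENT = "<Document><?compile gcc HelloWorld.c?></Document>"


def processing_instructions(xml_text):
    """Return (target, data) pairs for each PI directly under the root."""
    dom = minidom.parseString(xml_text)
    found = []
    for node in dom.documentElement.childNodes:
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE:
            found.append((node.target, node.data))
    return found


print(processing_instructions(DOCUMENT))
```

An application that does not recognize the target would simply skip the pair.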



3. XML Style Sheet

An XML document only specifies the content of the document; it does not say anything about how the content should look. Information about an XML document's appearance is stored in a style sheet. Different style sheets can be applied to a single document to produce different appearances. A few examples of XML style sheets are CSS, XSL and the AXE style sheet. These will be discussed in Sections 3.1, 3.2 and 3.3 respectively. Section 3.4 compares these style sheets.

3.1. CSS

Cascading Style Sheets (CSS) were introduced in 1996 as a standard for adding information about style properties to HTML documents. CSS is a simple declarative language that allows stylistic information, such as font, spacing and colour, to be applied to structured documents written in HTML or even XML [23]. It allows elements to be rendered by associating them with properties (e.g. font-size, font-weight, color) and values (e.g. 24pt, bold, blue). For instance, in Figure 14, the Greeting element is rendered as a block-level element in 24-point bold blue text.



Figure 14: Example of CSS

3.2. XSL

Extensible Stylesheet Language (XSL) is a specification under development within the W3C for applying formatting to XML documents [21]. It is a language for expressing style sheets [5]. XSL itself is an XML application. It contains two major parts: a transformation language and a formatting language.

In Figure 15, XSL will transform Greeting element to 24-point bold blue text.



Figure 15: Example of XSL

3.3. AXE Style Sheet

AXE (Ajh's XML Engine) was developed by John Hurst at Monash University. After experimenting with several XML tools that did not perform to expectations, he built his own [8]. AXE was built to translate an XML document into other document types such as HTML and LaTeX [8]. All information below relating to the details of AXE and its style sheet is derived from Hurst's work [8].

AXE has its own style sheet mechanism, which allows a style sheet to contain comments, "include commands" and translation commands. A comment is denoted by the character '#', which must not be preceded by any other character, followed by text. The "include commands" allow users to include other style sheets in the current style sheet, so that users can reuse style sheets they have already created. An "include command" begins with the string "include", followed by the filename. An example of two comments and an "include command" can be seen in Figure 16.



Figure 16: Example of AXE comments and an include command

The translation commands define the translations to be applied upon recognizing each element's start and end tag in the document. A translation may consist of text or Perl code fragments. Furthermore, the translation for each element may be divided into three parts: prefix, infix and postfix (see Figure 17). The prefix translation is applied before the element content is translated. The element content, or infix, is indicated either by a variable name or by the string "^^". The postfix translation is applied after the element content is translated.



Figure 17: Example of AXE Style Sheet
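The prefix, infix and postfix idea can be sketched as below. This is not AXE and does not use AXE's style sheet syntax: the dictionary format, the function names and the sample translations are all assumptions. Only the "^^" marker for the element content is taken from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical translations, one per element name. Text before "^^" is
# the prefix, "^^" stands for the element content (infix), and text
# after it is the postfix.
STYLE = {
    "Document": "<html><body>^^</body></html>",
    "Greeting": "<h1>^^</h1>",
    "Body": "<p>^^</p>",
}


def split_translation(translation):
    """Split a translation into its prefix and postfix around '^^'."""
    prefix, _marker, postfix = translation.partition("^^")
    return prefix, postfix


def translate(element, style):
    """Emit the prefix, the translated content, then the postfix.
    Elements without a translation pass their content through."""
    prefix, postfix = split_translation(style.get(element.tag, "^^"))
    content = element.text or ""
    for child in element:
        content += translate(child, style) + (child.tail or "")
    return prefix + content + postfix


source = "<Document><Greeting>Hello</Greeting><Body>World</Body></Document>"
print(translate(ET.fromstring(source), STYLE))
```

The sketch does no escaping of the output and ignores attributes and embedded Perl code, all of which a real translation engine would have to handle.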

More detailed information about the AXE translation can be found on John Hurst's homepage (http://www.csse.monash.edu.au/~ajh/research/doctech/index.html).

3.4. Comparison of The Style Sheets

CSS allows elements to be rendered by associating them with the properties and the values. The CSS syntax is much simpler compared to XSL and it has the advantage of broader browser support [6]. CSS does not allow the user to change or reorder the content of an XML document or add extra information like a signature block [6]. In other words, it cannot provide a display structure that deviates from the structure of the XML document [24].

XSL and AXE, on the other hand, allow the user to rearrange and reorder the elements. XSL is more flexible and powerful, and better suited to XML documents, than CSS [6]. Furthermore, XSL and AXE allow the user to access and display the content of attributes easily, which cannot be done in CSS. CSS can only be applied to the content of elements, not to their attributes. Therefore, with CSS, any data that the user wants to display must be part of an element's content rather than one of its attributes. Both XSL and AXE are transformation languages that enable the user to convert an XML document into different types of document, such as HTML and RTF. Like CSS, XSL is also a formatting language. AXE is only a transformation language, not a formatting language.



4. XML Software

Many software packages are available that enable the user to view and modify XML documents and style sheets, as well as to translate an XML document based on a style sheet. This report provides information about some of them, namely XML Spy, XMLwriter, UltraXML, XED, XML Notepad and AXE. A comparison of the software is provided in Section 4.7.

4.1. XML Spy

XML Spy is developed by the Altova company in Austria. As a member of W3C, Altova has been actively involved in XML software technologies. XML Spy is software that provides users with major aspects of XML in one powerful and easy-to-use product [13]. It allows XML editing & validation, DTD editing & validation and XSL editing & transformation. The screenshot is shown in Figure 18.



Figure 18: XML Spy Interface – taken from http://new.xmlspy.com/features_intro.html on 19/7/2000

To suit users' preferences, XML Spy provides four advanced views of XML documents: the enhanced grid view for structured editing, the database/table view that shows repeated elements in a tabular fashion, a text view with syntax colouring for low-level work and an integrated browser view that supports both CSS and XSL style sheets [13].

Other than the four advanced views, XML Spy also has several important features such as [13]:

Other information includes:

For more information about XML Spy, please visit the XML Spy homepage (http://www.xmlspy.com).

4.2. XMLwriter

XMLwriter is developed by Wattle Software, a company based in Sydney, Australia, whose aim is to produce high quality, user-friendly applications with comprehensive online help and support. As an XML editor, XMLwriter is designed to help users take advantage of the latest XML and XML-related technologies such as XSL and XQL [1]. It provides a range of XML functionalities, such as validation of XML documents against a DTD or XML Schema and the ability to convert XML to HTML using XSL style sheets [1]. The screenshot can be seen in Figure 19.



Figure 19: XMLwriter interface – taken from http://www.xmlwriter.net/images/fullscreen.gif on 19/7/2000

Furthermore, it provides users with three different windows to make the job easier, namely the Workspace, Document and Preview windows [2].

Other information includes:

For more information about XMLwriter, please visit its homepage (http://www.XMLwriter.com).

4.3. UltraXML

According to WebX System Ltd., UltraXML is the very first true native WYSIWYG integrated XML editor solution available [9]. UltraXML allows users to see the appearance of the XML document directly as it is created. UltraXML's important features are [9]:

Other information includes:

For more information about UltraXML, please visit its homepage (http://www.webxsystems.com/UltraXML.htm).

4.4. XED

XED is an XML editor created by Henry S. Thompson from the University of Edinburgh. It only supports hand-authoring of small-to-medium size XML documents [12]. An outstanding feature of XED is that it ensures only a well-formed document is produced. Moreover, XED can be used on different platforms, such as Windows 95/98/NT, Linux, FreeBSD, and Solaris 2.5 [12]. The screenshot is shown in Figure 20.

XED is indeed a very simple editor that supports some simple functionalities [12], such as:



Figure 20: XED Interface - taken from http://www.ltg.ed.ac.uk/~ht/xed.html on 19/7/2000

Other information includes:

For more information about XED, please visit the XED homepage (http://www.ltg.ed.ac.uk/~ht/xed.html).

4.5. XML Notepad

XML Notepad is a product of Microsoft Corporation. It is a simple application that enables the rapid building and editing of an XML document [10]. It provides a simple user interface that graphically represents XML data in a tree structure.



Figure 21: XML Notepad Interface – taken from http://msdn.microsoft.com/xml/notepad/run.asp on 19/7/2000

As shown in Figure 21, the structure of the document is represented in the left column while the values of the nodes are displayed in the right column [14]. Elements in XML Notepad are represented either by folder icons if they have dependent structures such as other elements, attributes, etc, or leaf icons if they contain no substructures [14]. Attributes, text and comments are represented by 3-D blocks, text icons and exclamation mark icons, respectively [14]. Besides element, attribute, comment and character data, other properties are not supported by XML Notepad.

Other information includes:

For more information, please visit XML Notepad's homepage (http://msdn.microsoft.com/xml/NOTEPAD/intro.asp).

4.6. AXE

As explained in Section 3.3, AXE is an XML tool that allows a general-purpose XML document, which only needs to be well-formed, to be translated into other document types [8]. The unique features of AXE are that it has its own style sheet and translation mechanism and that it allows Perl code to be included as part of the translation. However, AXE does not yet have a graphical interface for its style sheet.

Other information includes:

For more information about AXE and its style sheet, please visit the AXE homepage (http://www.csse.monash.edu.au/~ajh/research/doctech/index.html).

4.7. Comparison of XML Software

Having described the XML software available on the market in the previous sections, this section analyzes the differences among them.

In terms of speed and quality, XML Spy, XMLwriter and UltraXML are powerful editors that rank at the higher end. Editors such as XED and XML Notepad, on the other hand, offer simple features and functionalities.

Different XML software is developed for different groups of users, so the ease of use and user-friendliness of the software vary. XML Spy, XMLwriter and XML Notepad are some of the easier-to-use editors on the market.

Pricing also varies with the features and quality of the products. At the higher end of the price range is UltraXML, costing a hefty $3000, while XMLwriter and XML Spy cost less than $150. XED, AXE and XML Notepad are available free of charge from their websites.

XML Notepad has a simple user interface, while AXE does not have a graphical interface. UltraXML allows users to see the appearance of the XML document directly as it is being created.

The six editors vary in the range of features and functionalities they offer, such as their editing and validation capabilities, their window displays and their translation mechanisms.

With regard to product availability, all six editors can be used on Microsoft Windows, that is, Windows 95/98/NT. AXE can also be used on Unix and Linux, XED on Solaris 2.5, FreeBSD and Linux, and XML Notepad can be used with Internet Explorer 4.01.



5. System Design

This project is about building an editor for the AXE style sheet, namely AXEsse (AXE style sheet editor). The editor should enable users to see the content of the file in a user-friendly way. Thus, the graphical user interface for the AXE style sheet is designed to have the following features:
¤ An AXE style sheet displayed in a graphical way. This can be achieved by building an editor that has a graphical front end. GTK was chosen for this project because it is a library for building graphical user interfaces. Furthermore, the design could be improved by using images to display tags.

To achieve the design specified above, the program was written in Perl with GTK. Perl and GTK will be discussed in more detail in Sections 5.1 and 5.2 respectively.

5.1. Perl Language

Perl, which stands for Practical Extraction and Report Language, was originally developed in 1986 by Larry Wall as a glue language for the UNIX operating system [25]. It was developed to produce reports from many files with many cross-references between them. For this reason, Perl is good at text processing, such as scanning arbitrary text files, extracting information from those files and processing that information with various built-in functions.

There are many reasons for the success of Perl. These include:

Since this project deals with the XML style sheet, Perl's capabilities of working with XML and text processing make it a good choice for building AXEsse software. Furthermore, the fact that it does not impose arbitrary limitations on data makes the choice even stronger.

In addition to the reasons given above, using GTK through its Perl binding is much simpler than using it from C. With Perl, programmers do not need to worry about the casting that is required in C, as it has already been taken care of. Examples of GTK with C and with Perl are given in Figures 22 and 23 respectively.


Figure 22: Example of GTK with C binding


Figure 23: Example of GTK with Perl binding

5.2. GTK

GTK (GIMP Toolkit) is an Open Source Free Software GUI toolkit written in the C programming language by its primary authors, Peter Mattis, Spencer Kimball and Josh MacDonald [15]. Although it was primarily developed for use with the X Window System, GTK is now also used in building many different software projects. Though it is written in C, GTK is essentially an object-oriented application programming interface (API) because it is implemented using the idea of classes and callback functions (pointers to functions) [15].

GTK, which is built on top of GDK (GIMP Drawing Kit), is a library for creating graphical user interfaces with a "look and feel" infrastructure. Designed to be small and efficient, it is still flexible enough to allow the programmer freedom in the interfaces created. It allows the programmer to use a variety of standard user interface widgets such as push, radio and check buttons, menus, lists and frames. It also provides several container widgets which can be used to control the layout of the user interface elements.

Being written in C, GTK can be used in C, but there are also GTK bindings for many other languages including C++, Guile, Perl, Python, TOM, Ada95, Objective C, Free Pascal, and Eiffel [15]. Since AXEsse is the graphical interface for an editor which is written in Perl, GTK with Perl binding became the natural choice.



6. System Implementation

The following section describes the development and the implementation of the AXEsse interface and functionalities. The obstacles faced in the process of completing this project will be discussed as well as the successes.

The initial design specified a screen containing a "tree view" at the left-hand side and a text box at the right hand side, separated by a GTK widget named Hpaned which enabled the user to resize the screen horizontally. The expandable "tree view" structure was used to provide a brief description of each component of the style sheet based on the DTD structure. The text box was used to display the content of each component in the style sheet. However, this "tree view" structure was not retained because there was no DTD in the style sheet.

Another factor which made the use of a "tree view" structure desirable was that it was able to display the content of the included files (using include commands) in the same window used for the style sheet. Nonetheless, this structure was not finally implemented as the user would not have seen the content of the current file and included files simultaneously.

In the final design, the list structure was chosen to replace this "tree view" structure to display the content of the style sheet. To display the content of any included file, another window is displayed when the user selects the included file component in this list structure. The list structure will be further discussed in Section 6.6.

The pictures used to distinguish the components of the style sheet were created as image files using the Icon Editor tool written by Thomas Tanghus, which is available in Linux. To display these image files in AXEsse, the GDK pixmap widget was used. The icons used in AXEsse were also created using this tool.

The right-hand side of the interface contains a text box which displays the content of the file. This interface was further improved by using four different text boxes to display the content of each component of the style sheet, enabling the user to see the contents clearly. The first text box is used for displaying the content of the component as a whole. The second one is used to display the prefix of the component. The third one is used to display the infix of the component. The last one is used to display the postfix of the component. These text boxes were retained and used in the final interface of AXEsse.

In order to distinguish the prefix, the infix and the postfix of the content, a pattern matching mechanism was required. Perl provides regular expressions for pattern matching, so this was easily achieved. One obstacle was finding a tag component whose start tag and end tag were not written on one line. One possible solution was to delete the newline ('\n') at the end of each line, combine the lines, and then apply the regular expression to find the start tag and end tag. However, this solution also changes the structure of the content. It was not used in the final implementation of AXEsse because a pattern matching modifier, 's', was found that makes the regular expression treat the string as a single line. Using this modifier, the structure of the content could be retained while the start tag and end tag could still be found easily. This mechanism was implemented in AXEsse.
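The effect of the 's' modifier can be illustrated with a short sketch. Python is used here purely for illustration: its re.DOTALL flag plays the same role as Perl's 's' modifier, letting "." match a newline as well. The pattern and the sample text are assumptions and are not the expressions used in AXEsse.

```python
import re

# A tag whose start tag and end tag are on different lines.
TEXT = "<Note>\nline one\nline two\n</Note>\n"

# Hypothetical pattern: a start tag, any content, and the matching end tag.
PATTERN = r"<(\w+)>(.*?)</\1>"


def find_components(text, single_line):
    """Return (tag name, content) pairs. With single_line=True the
    pattern is compiled with re.DOTALL, so '.' also matches newlines."""
    flags = re.DOTALL if single_line else 0
    return re.findall(PATTERN, text, flags)


print(find_components(TEXT, single_line=False))  # nothing found
print(find_components(TEXT, single_line=True))   # newlines kept in content
```

Because the newlines stay inside the captured content, the original line structure is preserved, which is the advantage over joining the lines first.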

To store a component of the file, the following attributes are needed: Type (to distinguish whether the component is a ‘tag’, ‘comment’, ‘include file’, ‘new line’ or ‘element’), Tag name, Content, Prefix, Infix, Postfix and Comment (the text that will appear in the status bar). Originally, a 2D array was to be used to store these attributes. For example, the type of the first component would be accessed as $array[0][0]. However, an array of hashes was implemented in the final design instead, because it makes the code more understandable. In the example above, the type of the first component is accessed as $array[0]{'type'}, which is easier to understand than $array[0][0].
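The difference between the two layouts can be sketched as follows. Python is used for illustration, where a list of dictionaries corresponds to Perl's array of hashes. The key names follow the attributes listed above, but the stored values are invented.

```python
# 2D layout: the meaning of each column has to be remembered.
components_2d = [
    ["comment", "", "# heading style", "", "", "", "A comment line"],
    ["tag", "Greeting", "<h1>^^</h1>", "<h1>", "^^", "</h1>", "Greeting tag"],
]

# List of dictionaries: each attribute is addressed by name.
components = [
    {"type": "comment", "tag_name": "", "content": "# heading style",
     "prefix": "", "infix": "", "postfix": "", "comment": "A comment line"},
    {"type": "tag", "tag_name": "Greeting", "content": "<h1>^^</h1>",
     "prefix": "<h1>", "infix": "^^", "postfix": "</h1>",
     "comment": "Greeting tag"},
]

print(components_2d[0][0])      # which attribute is column 0?
print(components[0]["type"])    # self-describing access
print(components[1]["prefix"], components[1]["postfix"])
```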

As stated before, when the user selects an included file component in the list structure, another window pops up to display the content of the included file. To prevent several windows popping up for the same file, a temporary file, in this case ".AXEsse.tb", is created to store the filenames of the files that are open in AXEsse. Whenever a file is closed, its filename is removed from ".AXEsse.tb". When there are no more filenames in ".AXEsse.tb", it is removed from the directory. Thus, when the user selects an included file component in the list structure and the file is already displayed in another window, the status bar displays a message to inform the user that the file is already open. This does not mean the user cannot open the same file in different windows; the user can still open the same file by using the Open menu item or toolbar item. If there is an inconsistency in this temporary file, for example because the user accidentally or deliberately deleted ".AXEsse.tb", then when another included file component in the list structure is selected, the system informs the user that an inconsistency has occurred and suggests that the user reload the editor.
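A minimal sketch of such a registry file is given below. The function names, the one-filename-per-line format and the example filename common.axe are assumptions; the sketch omits the inconsistency check and any file locking, and it writes the registry into a temporary directory so that it can be run safely.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Read the list of open filenames; a missing registry means nothing is open.
sub read_registry {
    my ($registry) = @_;
    return () unless -e $registry;
    open my $in, '<', $registry or die "Cannot read $registry: $!";
    chomp(my @names = <$in>);
    close $in;
    return @names;
}

# Write the list back, removing the registry file when the list is empty.
sub write_registry {
    my ($registry, @names) = @_;
    if (!@names) {
        unlink $registry if -e $registry;
        return;
    }
    open my $out, '>', $registry or die "Cannot write $registry: $!";
    print {$out} "$_\n" for @names;
    close $out;
}

# Returns 1 if the file was newly registered, 0 if it was already open.
sub register_open {
    my ($registry, $filename) = @_;
    my @names = read_registry($registry);
    return 0 if grep { $_ eq $filename } @names;
    write_registry($registry, @names, $filename);
    return 1;
}

# Remove a filename from the registry when its window is closed.
sub register_close {
    my ($registry, $filename) = @_;
    write_registry($registry, grep { $_ ne $filename } read_registry($registry));
}

my $registry = tempdir(CLEANUP => 1) . '/.AXEsse.tb';
my $first    = register_open($registry, 'common.axe');   # newly opened
my $second   = register_open($registry, 'common.axe');   # already open
register_close($registry, 'common.axe');
my $registry_removed = (-e $registry) ? 0 : 1;

print "first=$first second=$second removed=$registry_removed\n";
```

In this sketch, a return value of 0 from register_open corresponds to the case where the status bar would report that the file is already open.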

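The present interface of AXEsse (see Figure 24) consists of the menu bar, the toolbar, the status bar, the file selection dialog box, the message box, the Component List screen, the Content screen, and the Insert, Append and Delete buttons.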

Each of them is explained in the sections below.

6.1. Menu Bar

The Menu bar consists of the File, Edit, Preference and Help menus. To create the menu bar, GTK provides a built-in widget called Accel Group, which provides a mechanism for attaching shortcut keys to each menu item. Thus, by using the Accel Group widget, design features such as a basic menu bar and the provision of shortcut keys were achieved. Unfortunately, a minor problem was encountered while using the Accel Group widget: the ‘F1’ key could not be added as a shortcut key. Thus, the ‘F1’ shortcut that is usually used to call the Help function was not implemented in this application. Instead, the user clicks on the Help menu and then on Content to call the Help function.


Figure 24: Final AXEsse editor screen

The File menu bar contains New Window, New, Open, Save, Save As and Exit menu items. The Edit menu bar item has Undo, Redo, Update, Insert, Append, Delete, Cut, Copy and Paste functionalities. The Preference menu bar consists of Icon only, Text only and Icon & Text menu items. Finally, the Help menu bar consists of Content and About AXEsse menu items.

By clicking on the New Window menu item in the File menu, a new window is displayed for the user. This was done by calling the Perl function system, which executes any program on the system, in this case "perl AXEsse.pl". The New menu item allows the user to create a new style sheet. To ensure no data is lost, if the current file has been modified and has not been saved, a message box is displayed to inform the user that the file has been modified. It then asks whether the file should be saved before the new style sheet is created.

The Open menu item allows the user to open an existing style sheet. This was implemented using a file selection dialog box, where the user can choose either to select the file from the file list field or to type in the filename in the provided text box. The selected file is then opened for the user. The file selection dialog box is discussed in Section 6.4. Similar to the New menu item, if the current file has been modified and has not been saved, the message box is popped up to warn the user.

The Save and Save As menu items allow the user to save the current file. If the Save menu item is selected and the user has not yet specified a filename, a file selection dialog box is displayed for the user to provide the filename. After a filename is provided, the data is saved to that file. The Save As menu item, like its counterpart in word processors, always displays a file selection dialog box to allow the user to select or type in the name of the file to which the data is saved. If the file already exists, a message box pops up to inform the user of this and asks whether the user wants to overwrite it.

The Exit menu item allows the user to quit the application. If the current file has been modified and has not been saved, a message box pops up to inform the user of this fact and asks whether the file should be saved before quitting the application.

The Undo menu item provides the user with the ability to undo changes made previously, such as insert, append, update and delete. The Redo menu item allows the user to redo the actions that have been undone by Undo. To provide the unlimited undo and redo features specified in the design, two arrays, which are treated as stacks, are used to store each action that the user performs. Whenever an action is performed, the undo array is updated with details of the action, such as the action type, the column number that was deleted and the content of that column number. The redo array, on the other hand, is updated only when the user clicks on the Undo menu item or the undo icon. However, the undo and redo functions do not support the cut and paste functionalities. This is because the content that was cut or pasted could not be obtained from the buffer used in GTK, and the GTK tutorial and technical report did not provide useful information on this topic. Thus, the undo and redo features can be used for the update, insert, append and delete capabilities only.
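The two-stack approach can be sketched as follows for the delete action only. The record layout (action, index, content), the function names and the plain array standing in for the component list are assumptions made for illustration, not the AXEsse source.

```perl
use strict;
use warnings;

my @components = ('a', 'b', 'c');   # stand-in for the component list
my (@undo, @redo);                  # two arrays used as stacks

# Delete a component and record enough detail to reverse the action.
sub delete_component {
    my ($index) = @_;
    my ($removed) = splice @components, $index, 1;
    push @undo, { action => 'delete', index => $index, content => $removed };
}

# Undo the most recent action and move its record to the redo stack.
sub undo_last {
    my $record = pop @undo or return;
    splice @components, $record->{index}, 0, $record->{content};
    push @redo, $record;
}

# Redo the most recently undone action and move its record back.
sub redo_last {
    my $record = pop @redo or return;
    splice @components, $record->{index}, 1;
    push @undo, $record;
}

delete_component(1);
undo_last();
my @after_undo = @components;
redo_last();
my @after_redo = @components;

print "after undo: @after_undo\n";
print "after redo: @after_redo\n";
```

Because each stack is an ordinary Perl array, the number of stored actions is limited only by available memory. A fuller version would also have to decide what happens to the redo stack when a new action follows an undo, which is not described here.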

The Update, Insert, Append and Delete menu items provide the capability to modify the content of the current style sheet. The functionality of the Insert, Append and Delete menu items will be discussed in Section 6.8. The Update menu item is used to update the current component of the style sheet. After modifying the content of the Content text box, the user needs to click on either the Update menu item or the Update toolbar item. If the user clicks on another list item in the Component List screen without clicking on Update, the modification is lost. This is done to prevent the content from becoming inconsistent after a modification. For example, a modification to a 'tag' component may introduce an error, in which case the component should be categorized as an error rather than as a 'tag' component.

Specifically for the 'tag' component, the user can also modify the content of the Prefix, Infix and Postfix text boxes. However, when the user modifies these three text boxes, the user must not modify the content of the Content text box, because if both the Content text box and one or more of the other text boxes are modified, the system cannot know which modification should be applied. For example, the content of the component is <tag><B><I>.^^.</B></I></tag>. After the modification of the Content and Prefix text box, the content becomes:

Notice that the modifications make the contents of the text boxes inconsistent with each other, which causes confusion as to which modification should be applied. Therefore, for a modification to the Prefix, Infix and Postfix text boxes to be applied, the content of the Content text box must not be changed. If it is changed, the update is based on the modification made in the Content text box.

The last three Edit menu items are Cut, Copy and Paste. These menu items can be applied to all the text boxes available in AXEsse. These functionalities were easily implemented using the callback functions provided for the text box widget in GTK.

The Preference menu includes only the Icon only, Text only and Text & Icon functionalities. They enable the user to view the items in the toolbar as icons, as text or as both text and icons. These functionalities were easily accomplished through the built-in functions provided by GTK.

Finally, Content and About AXEsse in the Help menu item provide the user manual of AXEsse (see Appendix A) and information about AXEsse to the user. When the user clicks on the Content menu item, the application will open Netscape to display the Help content of AXEsse, written in HTML. If the About AXEsse menu item is clicked, a message box which provides information about the version of AXEsse, the copyright, the author and the author’s email address, is displayed (see Figure 25).



Figure 25: The About AXEsse message box

When the message box or the file selection dialog box is displayed, the main window is dimmed or de-activated to prevent the user from doing other actions until an option provided in the message box or file selection dialog box is chosen.

6.2. Toolbar

The toolbar items provided in AXEsse are New, Open, Save, Undo, Redo, Cut, Copy, Paste and Update. The functionality of all of these toolbar items has been discussed in the previous section. As with the menu bar, GTK provides a built-in toolbar widget which allows each toolbar item to be created easily. Furthermore, the toolbar widget in GTK provides the capability to show a tip about each toolbar item, which is normally done using the Tooltips widget. If the user leaves the pointer over a toolbar item for a few seconds, the tip for that toolbar item is displayed.

6.3. Status Bar

The Status bar implemented in this application is essentially a label. A message is displayed when the user clicks on a list item in the Component List screen, providing information about the currently selected list item. For an include command component, if the file cannot be opened, a message is displayed in the status bar to inform the user of this. If the file is already displayed in another window, the status bar informs the user that the file is currently open. The GTK status bar widget keeps a stack of messages, so changing the displayed message involves pushing messages onto and popping them off this stack [28]. The GTK status bar widget was therefore not used for this application, as it was considered complicated and cumbersome for this purpose.

6.4. File Selection Dialog Box

The file selection dialog box allows the user to select an existing filename. It was implemented using the built-in GTK file selection widget, which reduced the programming time. As shown in Figure 26, it provides several functionalities: a Create Dir button, Delete File button, Rename File button, Directory drop down list box, Directory list screen, File list screen, Selection text box, OK button and Cancel button.

The Create Dir button is used to create a new directory in the current selected directory. As the names suggest, the Delete File and Rename File buttons provide the user with the ability to remove the selected file from that directory and to alter the name of the selected file. The Directory List screen enables the user to change the directory to search for a certain file. The user can select the file by clicking on the filename in the File list screen. Finally, the OK button allows the user to proceed to the open or save actions, while the Cancel button allows the user to cancel the actions that are to be carried out.



Figure 26: File Selection dialog box

6.5. Message Box

The message box, which is used to provide some important information to the user, was created by using the GTK dialog widget. This simple dialog widget is basically a window with two boxes and a separator packed into it. The first box provides the ability to include pictures and text, while the second box, which is called the action area [28], allows the addition of several buttons, such as OK, Cancel, Yes and No. The message box interface that asks the user to save the file before the next action is carried out, is shown in Figure 27.



Figure 27: A Message Box interface

6.6. Component List screen

The Component List screen was implemented using the GTK Clist widget to briefly display the content of the style sheet. The Clist widget provides easy access to the list items. The scrolledwindow widget was also used, so that a scrollbar is provided automatically when it is needed.

With the Clist widget, the item can either be text only or picture only or both picture and text. A feature that allowed picture and text to be included together was considered necessary for the interface. The pictures could be used to distinguish the components of the style sheet such as ‘tags’, ‘comments’, ‘include files’, ‘new lines’ and ‘elements’ clearly. The text could be used to briefly describe the content of each component.

Compared to Figure 24, Figure 28 includes a scrollbar, which appears automatically, and list elements that are displayed as pictures and text. Because this report is printed in black and white, the actual colours of the pictures are shown in black. The original background colour of the pictures is yellow, and the selected item is highlighted in blue. The picture for each component is:

Originally, comments in the style sheet that were adjacent to each other were displayed as separate list items. This specification was altered to display all adjoining comments in a single list item, because adjoining comments usually describe the same fragment of code. On the other hand, if two comments are separated by a new line, they are displayed as different list items.
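The grouping rule can be sketched as follows. The record layout, the function name and the choice to join merged comments with a newline are assumptions made for illustration.

```perl
use strict;
use warnings;

# Merge each run of adjacent 'comment' components into one list item.
sub merge_adjacent_comments {
    my @merged;
    for my $component (@_) {
        my $previous = $merged[-1];
        if ($previous
            && $previous->{type} eq 'comment'
            && $component->{type} eq 'comment') {
            # Assumption: merged comments are joined with a newline.
            $previous->{content} .= "\n" . $component->{content};
        }
        else {
            push @merged, { %$component };   # copy, leaving the input intact
        }
    }
    return @merged;
}

my @items = merge_adjacent_comments(
    { type => 'comment',  content => 'first note' },
    { type => 'comment',  content => 'second note' },
    { type => 'new line', content => '' },
    { type => 'comment',  content => 'separate note' },
);

print scalar(@items), " list items\n";
```

Here the first two comments share one list item, while the comment after the new line remains a separate item.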

6.7. Content Screen

The Content screen consists of four text boxes. The first text box displays the content of the component selected in the Component List screen. The Prefix, Infix and Postfix text boxes display the prefix, the infix and the postfix of that content. The GTK Vpaned widget is used to allow the size of these text boxes to be adjusted vertically as the user desires. If the whole content cannot fit in a text box, a scroll bar automatically appears to allow the user to scroll through it. As shown in Figure 28, only two text boxes have scroll bars, because only their content cannot be seen in full. This mechanism is implemented using the GTK scrolledwindow widget, which automatically displays the scroll bar when it is needed. Notice that there is a curved arrow symbol in two of the text boxes. This symbol indicates that the line of text is too long to fit onto a single line of the display window, so the text is wrapped onto the next line.


Figure 28: AXEsse interface with file content

6.8. Insert, Append and Delete Buttons

To allow the user to modify the content of the style sheet, three GTK command buttons, Insert, Append and Delete, are provided. The Insert button allows the user to insert a component of the style sheet before the currently selected component in the list. The Append button allows the user to append a component of the style sheet at the end of the component list. Both the Insert and Append buttons insert or append the component that the user has typed in the Insert/Append text box. The Delete button is used to delete a single component from the list.
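In terms of the component array, the three operations map naturally onto Perl's splice and push functions. The following is a minimal sketch with plain strings standing in for components; it is not the AXEsse source, and the variable names are assumptions.

```perl
use strict;
use warnings;

my @components = ('first', 'second', 'third');   # stand-in component list
my $selected   = 1;                              # index of the selected item

# Insert: place the new component before the selected one.
splice @components, $selected, 0, 'inserted';

# Append: add the new component at the end of the list.
push @components, 'appended';

# Delete: remove a single component, here the originally selected 'second',
# which moved to index 2 after the insert.
splice @components, $selected + 1, 1;

print join(', ', @components), "\n";
```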



7. Result

AXEsse was built to achieve the aim of this project, which is to provide an AXE style sheet editor. AXEsse, an editor built with a graphical user interface, provides the user with several functionalities for viewing and modifying the style sheet.

In viewing the style sheet, AXEsse gives the user the following benefits:

In modifying the style sheet, the user also has the following benefits:

Although more features could be added to AXEsse, it enables the user to view and modify the style sheet in a graphical way, and it can therefore be said that AXEsse fulfils the requirements asked of it.



8. Conclusions and Future Works

This project has successfully built and tested an editor, namely AXEsse (AXE style sheet editor), for the AXE style sheet. AXEsse enables the user to view and modify the content of the style sheet in a graphical way. The user can easily distinguish the different components of the style sheet, such as 'tag', 'comment', 'include file', 'element' or error, because AXEsse provides a different icon for each component. For 'tag' components, AXEsse displays their prefix, infix and postfix. The user can also modify the content of the style sheet, as AXEsse provides update, insert, append and delete functionalities. Furthermore, AXEsse provides users with unlimited undo and redo capabilities, so that major corruption of the style sheet can be prevented.

The current version of AXEsse could be further improved in various ways to provide an even better editing mechanism to the user. Firstly, a graphical representation of the tags could be provided. Secondly, the Undo and Redo capabilities could be extended to the Cut and Paste functionalities. Thirdly, Print and Find capabilities could be added to enable the user to print the style sheet and to search for a specific component in it. Fourthly, the capability to change the font size could be added, enabling the user to view the content in the preferred font size. Finally, the capability to display the rendered document would enable the user to view what the XML document will look like. With these additions, AXEsse could be considered a complete editor.



Appendix A - AXEsse User Manual

Please go to AXE User Manual



Appendix B - Program Interface


Interface of AXEsse



Last updated by Susanti on 16/11/2000