CSSE / Monash Honours Research Projects 2004
About CSSE Courses Our People Research Student Information Community Links Internal Information

Computer Science, Digital Systems

(NB: This list may change up until mid-February 2004. Supervisors may also offer projects that are not explicitely mentioned in this list. It pays to talk to potential supervisors who research in areras that you are interested in.)

These are the projects being offered for honours research projects for Honours students in 2004.

Projects that are marked as "(12 points or 20 points)" are also available to BSE students in a scaled down version.

Projects marked as "(12 points)" are only available to BSE students.

Projects that are unmarked are only available as 20 point projects and not suitable for BSE.

Notes

Active Supervisors in 2004

 


David Albrecht

 

Investigation of Machine Learning techniques for handling Missing Data (12pt or 20pt)

DWA1

David Albrecht and Kevin Korb

Machine learning techniques have been applied very successfully to a great range of problems. A recurring difficulty in applying machine learning to real-world data is when attribute values for some data is missing. We shall perform a comparative experimental analysis of different proposed methods for coping with missing data values and look at ways of improving upon them using Bayesian models.

 

Investigation of Support Vector Machines (12pt or 20pt)

DWA2

David Albrecht and Peter Tischer

Support Vector Machines (SVM) have been used in various machine learning problems, including supervised classification, regression analysis and
novelty detection. They process interesting properties and have been demonstrated to give excellent performance in a number of practical
applications. There are many different aspects of Support Vector Machines which could be investigated in this project.

These include:
* sensitivity to anomalous data,
* effect of overlapping of classes on the performance of SVM algorithms,
* methods of improving SVM algorithms,
* application to image processing, and
* application to novelty detection


Students are not expected to have any special knowledge, but a background in mathematics will be an advantage. Also, students will not be necessary
expected to implement SVM algorithms as there are several public domain software packages available for computing SVMs.


Nandita Bhattacharjee

 

Development of FPGA based digital beamformer for ultrasonic imaging system

NB1

Nandita Bhattacharjee and Andrew P. Paplinski

Ultrasonic imaging systems are used in two main areas of application, namely, in non-destructive testing of materials and in a non-invasive medical examinations. Current research concentrates on methods of increasing accuracy, flexibility and functionality of the imaging system. An ideal system should deliver 3-dimensional views of objects using a compact or hand-held device. A typical ultrasonic imaging system consists of the following main parts: 1. Array of transducers: typical medical probe contains 192x8 = 1536 transmitters/receivers (transducers), operating at 2-5MHz. 2. A front-end: which converts signals from transducers into a digital form. 3. Beamformer: electronic steering subsystem that sends and receives ultrasonic pulses to/from selected direction. In addition the focal depth of the beam can be dynamically adjusted. In addition, spectral Doppler information can be also recorded and processed. In general, the beamformer is a complex digital signal processing subsystem 4. Imaging subsystem: that converts data from the beamformer into an image and related digital information The main objective of this research work is to investigate theoretical and design aspects of a new generation ultrasonic imaging system In particular, we will investigate 1. Application of the CORDIC signal rotation algorithm in a multi-beam real-time beamformer. The system, operating at 2-5 MHz, requires that approximately every 100ns, up to 1536 signal samples must be processed by the beamforming subsystem. One of the typical processing operation is rotation of two-dimensional vectors equivalent to signal samples. In order to achieve the required throughput with real-time processing, a significant number of rotating processors must be embedded in the beamforming subsystem. We will investigate the effectiveness of the following implementations of the signal rotation algorithm: a) word-parallel b) bit-serial For the front-end of the beamformer we will consider synthetic data obtained using the simulation software Field II and also display the image using the simulation software Field II Familiarity with Mentor Graphics CAD tools and VHDL (Hardware description language) is essential for this project.

Modelling Ultrasonic Imaging systems using cluster computing

NB2

Nandita Bhattacharjee and Charles Greif

This project is related to our ongoing research into the design aspects of a high resolution ultrasonic imaging system.The main objective of this project is to investigate the modelling of the physical aspects of the formation of the ultrasonic images using cluster computing methods. The essence of the model is the calculation of the spatial impulse responses, which can readily be implemented in parallel on the cluster. The mathematical modelling of an ultrasonic imaging system requires the simulation of the three main parts: 1. Sensor system: An array of transducers, which produces the ultrasonic pressure fields and receives the scattered and reflected signals. 2. Signal processing system: A beamformer, an electronic steering and focussing subsystem that sends and receives ultrasonic pulses to/from selected azimuth and elevation angles, 3. Imaging system: that converts data from a beamformer into an image. In order to achieve parallelisation of the simulated ultrasonic imaging system the following aspects will be investigated: 1. Design techniques for fast simulation of the linear acoustic models for the physical aspects of the formation of ultrasonic images. 2. Evaluation and selection of a suitable partitioning of the simulation models for concurrent execution on the cluster. 3.Design of an extendable frame for parallel processing of ultrasound data, which can be called and used under the MATLAB program.

 

Reconfigurable Sensor Networks using Berkeley Smart Dust motes

NB3

Nandita Bhattacharjee and Charles Greif

People have been talking about combining Microelectromechanical System (MEMS) sensors with wireless communication for about a decade. The Smart Dust effort at Berkeley is one of the many projects in this area. As a part of Smart Dust, they have implemented wireless sensor nodes using common off-the-shelf(COTS) components in custom boards. The nodes, known as macro-motes or just motes, are about a cubic inch in volume and contain sensors, a microprocessor, RF communications circuit, and a battery or solar cell for power. This project will focus on the remote reprogramming of the hardware and software on the motes over their on-board RF-network, via interaction with TinyOS. TinyOS is the onboard embedded operating system which uses a simple event-driven, active-message model, in which an application is comprised of modules with defined interfaces that facilitate the reuse and fine-tuning process.

References:

Kris Pister's(Smart Dust guru@ucb) current projects page: http://robotics.eecs.berkeley.edu/~pister/SmartDust/ Mobile Motes with strong military flavour: http://robotics.eecs.berkeley.edu/~pister/29Palms0103/

An easy to read article by Tom Cantrell in Circuit Cellar March 2002: http://www.chipcenter.com/circuitcellar/march02/c0302su1.htm


John Crossley

Generic interactive theorem proving

JNC1

John Crossley with Iman Poernomo

This project involves the development of an environment for (visual) interactive theorem proving. As a basic requirement, the environment should permit the user to write proofs in traditional predicate logic. However, logicians study many other logics. The design of our environment should be generic enough to represent proofs of some of these other logical systems. This will require approaching the project with a meta-level view. Thus it will require NOT the implementing of well-known logical rules but a general framework for taking (almost) anything that could be regarded as a logical rule and automatically encoding all instances of it. The student will be expected to

The student has a choice of implementation language: .NET, C# or Java. It is assumed that, on completion of the project, the student will have a good working knowledge of XML and reflection. However, no prior knowledge of these concepts is assumed. A familiarity with at least the formal logic discussed in Formal Methods I is essential.


Trevor Dix

 

Data Mining Web Server (12 points)

TD1

Trevor Dix

Snob is a classication program developed at Monash. We have a datamining web server that is to be used to provide a service for external users. This project is to interface Snob to the web server www.datamining.monash.edu.au and develop a user interface as a Java applet.

Data Mining Classification in CDMS

TD2

Trevor Dix

A software platform for data mining, CDMS for Core Data Mining Software, is being developed. This project is to put the Sonb classification program into CDMS as a plug in. This includes operating within the existing Java base level for the user interface. There is even scope to rewrite Sob in Java for the right student.

Extending Artemis's annotation capabilities (12 point)

TD3

Trevor Dix, Dieter Bulach and Torsten Seemann

Artemis is a GPL open source Java application for browsing DNA sequences and their associated annotated features. A typical feature is a gene, which would be annotated (commented on, labelled) with various tags/fields such as its name, biological function, or anything else of relevance. 1. Artemis's annotation facility is primitive - it provides only a single editable textbox. The aim would be to re-implement this as a full featured editor which understands all the GenBank/EMBL annotation tags and verify all the fields. 2. Artemis is designed to be used in a standalone manner. It only reads and writes simple flat files in standard formats. However annotation is usually done by a group of people simultaneously. The aim would be to allow Artemis to read and write features from a remote SQL-compliant database. All improvements would be fed back to the Artemis community. The project is offered jointly by CSSE and the Victorian Bioinformatics Consortium.


Alan Dorin

Geometric Modelling of Radiolarian-like forms

AD1

Alan Dorin

For some 600 million years, since the Paleozoic era, Radiolaria have been manufacturing exceedingly delicate skeletons and depositing them in the fossil record. These tiny organisms are protozoa whose soft tissues they encapsulate in an opaline skeleton which posesses radial symmetry - hence their name. These tiny skeletons are typically not visible to the naked eye, but through the lens of a microscope their wonderful shapes have amazed many... some (see for instance Ernst Haeckel) have devoted large portions of their lives to studying and recording their forms.

This honours project involves the dedication of one year of the life of an exceedingly keen honours student (with an eye for aesthetics) to the study and modelling of the structures produced by these creatures.

The models will be produced using interactive generative techniques. This requires the implementation of algorithms to describe the process of production of the structures which allow for realtime input by a human. Models will initially be displayed using OpenGL but it is hoped that by the end of the year they may be exported to be rendered using off-the-shelf packages.

Students must have a solid background in C/C++, OpenGL and graphics programming to third year level and find the prospect of writing code to generate images and 3D models exceedingly exciting. If, after looking at the links in the text above you don't immediately think "this sounds amazing" then this project is probably not for you! (Sorry.)

Students will need to complete the course CSE450 to assist in covering some of the groundwork.

Recommended Reading:
Watt, Watt, Advanced Animation and Rendering Techniques Addison-Wesley, 1992 or later editions (Scan for relevant info. on procedural modelling & animation)

Thompson, D'Arcy W., On Growth and Form (the abridged edition edited by John Tyler Bonner and published by Cambridge University Press, 1992 is fine).


Maria Garcia de la Banda

3D Java Visualization of Constraint Solving

MGDB1

Maria Garcia de la Banda, Kim Marriott, Mark Wallace


Constraint programming can help solve, flexibly and efficiently, a large class of combinatorial problems. This is typically done by first modeling the problem using constraints, and then applying an appropriate search algorithm to find an adequate solution. The particular choice of model and search algorithm determines the efficiency of the approach for a given problem. In order to make that choice, the programmer must have a deep understanding of complex issues such as constraint interaction, propagation, etc. Visualizing the program execution often helps to gain such knowledge. Unfortunately, the high number of constraints usually required to solve real-size problems makes meaningful visualization a difficult task.

Recently, new algorithms capable of very efficiently projecting higher-dimensional graphs into 3D have been proposed. We have modified an implementation of such algorithm to support weighted graphs, and to display and manipulate the resulting 3D graph in Java. We would like to use this an other visual methods to develop meaningful ways of displaying profiling information of constraint programming execution.


References
K. Marriott and P.J. Stuckey. Programming with Constraints: An Introduction. MIT Press, Cambridge, MA, 1998
D. Harel and Y. Koren. Graph Drawing by High-Dimensional Embedding. Proceedings Graph Drawing, 2002 (GD'02). Lecture Notes in Computer Science, Springer Verlag.


 

David G. Green

 

Testing the Building Block Hypothesis

David Green

It has been long held that Genetic algorithms (and other forms of evolutionary computation) solve complex problems by discovering, and manipulating short high-power solutions known as building blocks. Recent studies suggest that this is not the case. This project will investigate the ability of genetic algorithms to assemble and manipulate building blocks, through the use of a co-evolution. Results of this study will provide insights into how to use GA's to solve hierarchical decomposable problems, an understanding of what problems are hard for GA's to solve, insights into the optimal strategies to use for solving hard problems, and ultimately the validity of building block hypothesis.

Evolving Robust Complex Networks

David Green

The Longford Esso gas explosion in 1998, (which cut gas supplies to Victoria for a fortnight), the Los Angeles blackout of 1996 (that cut power to 15 states in the US), and more recently the New York blackout, show that apparently robust systems can be brought down by the failure of a single entity. The goal of this research project is to identify how complex networks can be made more robust to the failure of key nodes. Essentially this project will use an evolutionary algorithm to aid in the optimization of resilience of complex networks. Comparative analysis with naturally occurring complex networks (such as ecosystems and gene-regulatory networks) may provide into the evolution of these systems. This research has wide-reaching implications for a number of discipline areas, for example: provide strategies for building affordable redundancy into fragile infrastructure; provide strategies for limiting the spread of diseases and computer viruses; and provide planning tools for the development of robust networks.

Self-organisation in networks

David Green

Networks of agents are extremely common. Examples range from massively parallel computation to human society. In traditional networks, order is imposed in constraining interactions in hierarchical fashion (eg internet domains), but in many contexts this is not possible. An important question therefore is to understand how large scale structure of a system emerges out of hosts of local interactions between agents. Certain patterns, such as hierarchies and scale-free networks (eg "six degrees of separation"), are well known, but the processes are poorly understood. These issues can be explored in many interesting contexts and lead to some important questions, such as What kinds of processes lead to particular structures? How do clusters and modules form spontaneously? Under what conditions will all agents adopt the same state? and when will the network fragment into cliques?


John Hurst

The Prefix-Content-Suffix Model Of XML Translation

JH1

John Hurst

AXE (http://www.csse.monash.edu.au/~ajh/research/doctech/index.html) is an example of an XML translator. It is written in Perl, and accepts a translation specification that converts XML documents into other quite arbitrary document formats. For example, it has been used to build all the web pages rooted at http://www.csse.monash.edu.au/~ajh. It uses the paradigm of specifying translations for each element as a prefix component, a nested content component, and a suffix component. It can be used as a declarative model, but becomes far more powerful when each of these components may themselves be procedural.

However, it is still a somewhat prototypical implementation, and relies upon escaping into Perl code within the translation file to perform some of the more arcane translations needed, and it uses the restrictive SAX left-to-right translation model. This project has two main goals: to rewrite AXE using Python and the DOM tree-based evaluator, and to design translation specifications that do not rely upon escaping into embedded code.

This project and the previous one may be undertaken conjointly or separately, and are not dependent upon each other, although there are obvious areas of overlap.


Kevin Korb

also see joint project with Charles Twardy

Modeling Cardiovascular Health (12pt or 20pt)

KK1

Kevin Korb and Ann Nicholson

This is a knowledge engineering project to build models of cardiovascular health combining both expert input (from the Department of Epidemiology, Monash Medical Ctr) and the results of data analysis from one or more large health studies. The result will be a Bayesian network useful for predicting the health and dollar outcomes of possible drug and lifestyle interventions in the Australian population.


Carlo Kopp

Hypergame Modelling (12pt or 20pt)

CK1

Carlo Kopp

Higher order hypergames provide a powerful tool for modelling real life situations, and also provide powereful models for analysing conflict and information warfare scenarios. A modelling tool was developed in 2004 (see below) This project will extend that work, with the scope and focus of the project still to be determined.

C, X11/Xt, Qt or Motif programming skills desirable, a basic understanding of game theory would be helpful.

Fraser, Hipel, Conflict Analysis, North-Holland, 1984; Kopp C., Shannon,
Hypergames and Information Warfare, Proceedings of IWC3, 2002.

"HYPANT: A Hypergame Analysis Tool" Lachlan Brumley's 2004 BSE project.


Chris Ling

Deriving Agent Behaviour from Interaction Protocols

CL1

Chris Ling

Interaction protocols for multiagent systems proposed by the Foundation of Inteligent Physical Agents (FIPA) have been described in an extension of UML called AUML (Agent UML). Examples of interaction protocols are the Contract Net protocol, the English Auction protocol and the Dutch Auction protocol. Agents intending to participate in any of these activities must adhere to the AUML specification. The purpose of this project is to develop an algorithm which will derive a minimal Petri net model for each participating agent based on the AUML specification.

Analyzing Extra-Functional Properties of Itinerant Agents

CL2

Chris Ling

It is one thing to design an application involving agents, including agents that are mobile or agents that communicate via message-passing, or both. But it is another thing to grasp the extra-functional properties of such designs such as performance and security. This project will tackle this problem by defining a performance model and a security model for agents expressed in the ITAG Scripting Language [2]. These two models will help in the analysis and specification of performance and security properties of applications involving mobile agents. The performance and security model is based on work in [1]. An implementation of a tool in which performance and security estimates can be computed will also be carried out. If time permits, we will extend the ITAG language to include communication primitives, and extend the performance and security models to include these primitives.

[1] P. Fradet, V. Issarny, and S. Rouvrais. Analyzing non-functional properties of mobile agents. Proc. of Fundamental Approaches to Software Engineering, FASE'00, LNCS, March 2000. http://www.irisa.fr/lande/fradet/PDFs/FASE00.pdf

[2] S.W. Loke, H. Schmidt, and A. Zaslavsky. Programming the Mobility Behaviour of Agents by Composing Itineraries. Proceedings of the 5th Asian Computing Science Conference (ASIAN '99), Phuket, Thailand, LNCS 1742, P.S. Thiagarajan and R.Yap (eds), pages 214- 226, December 1999. http://goanna.cs.rmit.edu.au/~swloke/writings/asian99.ps.gz


Gordon Lowe

Virtual Reality: task planning and control of robotic tasks

GL1

Gordon Lowe

The research is focused on the study of techniques for telerobotics. The project also involves exploring various techniques to describe tasks in a virtual reality environment (i.e. the user is immersed in a virtual reality environment for task specification). Students will be provided access to robots, telerobotics system and force sensor.

Modular Robot Construction

GL2

Gordon Lowe

Pre/Co-requisite: CSE3133, and an interest in robot construction

This project focuses on development of a basic module from which all robots are created. The project entails analysing robot motion to determine the basic unit of design and determination of software issues. Modules will be added together to produce a hand, or insect robot. The hardware issue must include evaluation of motion control and co-ordination.

Laser Diagnostic Measurement

GL3

Gordon Lowe, Julio Soria

One project deals with real time control of a laser diagnostics measurement method. The equipment that needs to be controlled are a 2 high energy pulsed lasers and high resolution 12 bit digital cameras, the measurement may be free-running at a user selectable frequency or triggered by an external event. The control signals are fairly simple TTL - we use the parallel ports on a PC. The project would involve the development of a GUI program based on Reat time linux (RTAI) with KDE/Qt graphics.

Autonomous flying robots

GL4

Gordon Lowe, B. Shirinzadeh, D. Honnery,G. Alici, J. Soria

Pre/Co-requisite: MEC4456, MEC4457

Various projects in this area are available for students interested in mechatronics and autonomous systems.

Six Legged robot

GL5

Gordon Lowe, B. Shirinzadeh, G. Alici

Pre/Co-requisite: MEC3402, MEC44565, MEC 4457, software engineering background is desirable

The project focuses on the development of a six legged robot. This involves the hardware design and construction, as well as development of control hardware and software.


Kim Marriott

A graphical equation editor

KM1

Kim Marriott, Bernd Meyer and Tony Jansen

The project is to build a WSIWYG equation editor using the diagram authoring tool CIDER being developed by Tony Jansen, Kim Marriott and Bernd Meyer (all of whom will co-supervise the project). It should also generate the corresponding MathML. CIDER is described in more detail on links from http://www.csse.monash.edu.au/~tonyj/


Authoring SVG animations

KM2

Kim Marriott

SVG, the new vector-based graphic standard for the web provides animation. An important question is how can a graphics editor provide good authoring support for these capabilities. This project will extend the SVG authoring tool inkscape to explore different possible tools. For information about inkscape see http://www.inkscape.org/

Drawing Unordered Trees Trees

KM3

Kim Marriott

Trees are used to represent many different types of information, including organisational hierarchies and simple family trees. Many algorithms for drawing trees have been developed but these are for ordered trees, i.e. trees in which the children are ordered. Little is known about how to draw trees in which the order of the children does not matter. In this case we may need to swap the order so as to find a more compact layout. This project will investigate techniques for drawing such unordered trees.


Bernd Meyer

please check Bernd's list of current projects at

http://www.csse.monash.edu.au/~berndm/Projects/


Jon McCormack

please check Jon's list of current projects at

http://www.csse.monash.edu.au/~jonmc/teaching/honsProj.html


Linda McIver

Automated Web Navigation (12pt or 20pt)

LMi1

Linda McIver

Navigation information is a critical component of a usable website. Many sites provide incomplete site maps, with poor categorisation, so it can be very difficult to track down the information you need. Once you are deep in the site hierarchy, it can be particularly difficult to work out where you are in relation to the rest of the site, and what else the site offers.

Good navigation information provides answers 3 questions: Where am I? Where can I go from here? Where do I want to be?

Much of this information can be extracted quite simply from collections of pages, in order to construct a consistent navigation mechanism for each page. This project aims to provide tools for automatic extraction of navigation information, as well as construction of customisable navigation mechanisms for a website. A natural extension of this might be an automated indexing system which uses the text on each page to construct a site map to guide the user to the desired information quickly and easily.


Matthew Mitchell

 

Well established connectionist approaches have serious limitations which have affected their usefulness for creating learning robots. For example, two limitations of back-propagation neural networks are the large amount of training required and difficulties when learning multiple tasks (McCloskey, 1989;Fahlman 1988). To combate this a new connectionist learning algorithm (called TRACA - Temporal Reinforcement-learning and Classification Architecture) has been developed (Mitchell 2003). This new system has a specific design intention: to create intelligent learning robots that can perform difficult tasks in unstructured real-world environments. There are three 20 point projects in relation to this new system:.

Performance Limitations

MM1

Matthew Mitchell and David Squire

TRACA is complex, making formal analysis difficult. Therefore it is desirable to conduct experiments in simulated environments which indicate the limits of this new system and to identify factors which will result in it finding poor solutions. This analysis should also compare the system to similar systems such as (Mozer 1991) and (Mozer 1992). It is intended that the experiments in this project should include using a real Nomad or ER1 robot.

Learning Subroutines

MM2

Matthew Mitchell and David Squire

Critical for usefulness in the real-world is the ability to perform different tasks and reuse the solutions for new novel tasks. This project proposes to investigate and implement potential methods of creating sub-routines using the connectionist network created by TRACA. These subroutines should then be able to be recombined to perform novel tasks. This is an interesting topic and existing work by a wide variety of researchers (eg: Lin (1993a) and Drescher (1991)) should provide clues to possible solutions.

Higher Order Concepts

MM3

Matthew Mitchell and David Squire

A learning agent needs a variety of forms of memory. The new connectionist system already has some memory mechanisms, however, it lacks the mechanisms necessary to learn about, represent and reason about physical objects in the real-world (Drescher 1991). This project aims to investigate the existing literature on how objects can be represented and understood by both man and machine, propose a possible approach and implement a prototype demonstrating one or more aspects of the proposed approach.

Contact either Matt or David (matt@csse.monash.edu.au, david.squire@csse.monash.edu.au) for further details.

References

Drescher, G. (1991). Made-Up Minds: A constructivist approach to Artificial Intelligence. MIT Press.

Fahlman, S. (1988).An empirical study of learning speed in back-propagation networks. Technical Report CMU-CS-88-162, Carnegie Mellon University, USA.

Lin, L. (1993). Reinforcement learning for robots using neural networks. Ph.D. thesis, School of Computer Science, Carnegie Mellon University, Pittburgh USA.

McCloskey, M. and N. Cohen (1989). Catastrophic interference in connectionist networks: The sequential learning problem. The Psychology of Learning and Motivation, 24, pp 109--164.

Mitchell, M. (2003). An Architecture for Situated Learning Agents. Ph.D. thesis, School of Computer Science and Software Engineering, Monash University, Melbourne Australia.

Mozer, M. (1992). Induction of multiscale structure. In J. Moody, S. Hanson, and R. Lippmann (Eds.), Advances in Neural Information Processing Systems 4, San Mateo, California, pp. 275--282. Morgan Kaufmann Publishers.

Mozer, M. and J. Bachrach (1991). SLUG: a connectionist architecture for inferring the structure of finite-state environments. Machine Learning, 7, 139--160.

 

Embodied Interactive Agents

MM4

Matthew Mitchell

Recently there has been a large interest in embodied conversational agents, which interact with users naturally in conversation (Cassell et al. 2000). These agents respond to user gestures and combine gestures and facial expressions in conversation. Such agents are used as interfaces to information systems and as virtual playmates for children. However, while these agents can exhibit sophisticated behaviour, they cannot learn. This project aims to investigate the possibility of having such agents learn by using TRACA as the brain in a virtual embodied agent. It is envisioned that the agent will be a virtual dog living in a virtual ``back yard''. It will be able to sense using vision and smell. The aim is for a user to be able to train and play with the virtual dog using gestures and speech. In addition to gesturing the user will be able to restrain the dog, place food for it or drag it from one location in the yard to another. This project will involve both providing the personality of the dog (which will have a natural inclination to run around the yard and occassionally get lazy and lie down, be more tempted by food when it is hungry) and providing for the training requirements of the user (which may include making the dog come, stay or go to a particular position in the yard).

Reference: Cassell, J. Sullivan, J. Prevost, S. and Churchill, E. (2000) Embodied Conversational Agents, MIT Press.


Ann Nicholson

see joint project with Kevin Korb


Ron Pose

Suburban Ad Hoc Networks (3 Projects, 20 points or 12 points)

RP1

Ron Pose and Carlo Kopp

The Suburban Ad Hoc Networking group focusses its research activities on techniques for implementing Suburban Ad Hoc Networks. These are self organising, quasi-static ad hoc (typically wireless) networks which provide an alternative technology for providing high speed digital connectivity to households, small businesses and distributed campuses. Specific areas of research interest include security, low level routing protocols, access controls and propagation behaviour. Given the broad scope of the research performed in this area, there is considerable choice in project topics. Students should consult Dr Ronald Pose or Dr Carlo Kopp for suitable project topics.

http://www.csse.monash.edu.au/research/san/

Walnut Password Capability Kernel (4 Projects, 20 points or 12 points)

RP2

Ron Pose and Carlo Kopp

The Walnut Password Capability System is a secure operating system design, which employs password capability techniques to control access to objects within the system. Previous projects included the addition of a stdio libraries, a compiler port, a network protocol stack design, a shell design and other miscellaneous topics. Future topics could include a revised network protocol stack, a port of the current gcc 3.X compiler, addition of POSIX compliant I/O libraries, porting to other hardware platforms, a CORBA like remote object mechanism. Students should consult Dr Ronald Pose or Dr Carlo Kopp for suitable project topics.

http://www.csse.monash.edu.au/~carlo/archive/CSSE/WALNUT/


Sid Ray

Sid may offer additional projects in 2004. Please contact him directly for details.

 

Content-Based Image Retrieval

SR1

Sid Ray

Content-based image retrieval is a well-known method of image retrieval. The objective of this project is to devise an algorithm for retrieval of images from a large image data base based on colour quantisation and regional quantisation.

Texture Analysis of Images

SR2

Sid Ray

Texture plays an important role in both human interpretation of visual scenes and computer analysis of images. Textural cues are of particular relevance in two different, but related, image analysis problems, namely, the problems of (i) segmentation and (ii) classification of images. The proposed project will deal with both of these problems. It will involve two phases. In the first phase some existing texture analysis methods will be investigated from the point of view of their theoretical soundness as textural measures as well as their practical applicability. In the second phase attempts will be made to derive a new methodology with particular attention to its computational efficiency.

Document Image Analysis System

SR3

Sid Ray

The objective of document image analysis is to recognize text, graphics, and pictures in printed documents and to extract the intended information from them. There are two broad categories of document image analysis, namely, textual processing and graphical processing. Textual processing includes skew determination (any tilt at which the document may have been scanned), finding columns, paragraphs, text lines and words, and performing optical character recognition. Graphical processing deals with lines and symbols. The scope of the proposed project will be the aspect of text processing and it will concentrate on the development of a system for page layout analysis.

Selection of Features for Pattern Recognition

SR4

Sid Ray

Experience shows that in recognition of patterns by humans, the recognition is made based on only a few important features (or attributes) characterizing the pattern classes. Conversely, in statistical pattern recognition, the patterns are often represented by a large number of numerical features. Although there is no conceptual justification in reducing the number of features to a small number, in practical problem solving, this becomes a necessary step due to the wellknown phenomenon of the 'curse of dimensionality' of the feature vector on the complexity of the pattern classifier. The aim of the proposed project is to develop an interactive feature selection paradigm well-suited to multiclass (rather than 2-class) pattern recognition problems.

The project will comprise four components, namely, (i) theoretical comparison of existing feature selection criteria, (ii) development of software tools for the existing criteria, followed by an experimental investigation of them, (iii) analysis of the above results leading to, hopefully, a feature selection paradigm, and (iv) development of an interactive software tool for the above paradigm. The software developed will be 'interactive' in the sense that depending on the classification accuracy arrived at a certain stage, the user will have the option of changing the dimensionality value. Options will also be available to supply interactively a range of values of the dimensionality. The software tools will include procedures for displaying the distribution of pattern samples in different feature spaces obtained by different feature selection methods.


Heinz Schmidt

 

please consult Heinz Schmidt's project list at

http://www.csse.monash.edu.au/~hws/honours.html


David Squire

 

A Tool for Integrating Concept Maps and Bibliography Management (12 pt)

DS1

David Squire

One of the challenges of research writing, particularly of theses, is the literature review. Carrying out a literature review involves much more than reading a lot of papers. It is necessary to work out the main ideas and contributions in each paper, and their relationships to each other. The relationships between these concepts are typically complex: there are hirearchies, and, more generally, graphs. One approach which can be used to organize such information is to use a graphical tool known as a concept map, where concepts are typically represented as "bubbles", and relationships by (possibly directed) lines linking them, i.e. a graph. The notion of a concept map is related to that of a semantic network. In this project you will develop a tool which allows a user to create, edit and store their concept maps, as well as to link them to his/her bibliographic database. Attention will be given to:

If successful, this project would be the first step in producing an open-source tool that would be of great value to researchers, and especially research students, aruond the world.

References

http://cmap.coginst.uwf.edu/info/

http://www.mind-map.com/mindmaps_definition.htm (perhaps better hyped than the term "concept map")

 

Segmentation for Content-Based Image Retrieval

DS2

David Squire

In recent years, the use of digital image collections has become common, both on the world wide web and in the preparation of both electronic and paper publications. The need for tools to manage this rapidly-increasing quantity of visual data is greater than ever. Specifically, there is great interest in systems which allow users to query image databases. The attachment of text labels to images is inadequate, since identical images can be described in different ways. Moreover it is very expensive. Consequently, there is great interest in Content-Based Image Retrieval. CBIR uses the Query by Example paradigm: to retrieve images, and example image is provided as the query, and the system endeavours to retrieve similar images, based on features such as colour, texture and shape. Many such systems have been developed, such as IBM's QBIC, VisualSEEk, the GIFT. Most current systems use whole images for query and for relevance feedback. In reality, users are often only interested in a particular object in, or region of, the query image. Some efforts have been made to support query by region, such as in BlobWorld, but the open-source GNU Image Finding Tool (the GIFT) does not yet do so. In this project you investigate a variety of image segmentation algorithms for suitability for CBIR. You will design a new feature extraction module for the GIFT, which treats segments, rather than images, as the basic elements. You will extend the XML-based MRML (the Multimedia Retrieval Markup Language) to work with segments.

References:

http://www.gnu.org/software/gift/gift.html

http://www.mrml.net/

http://viper.csse.monash.edu.au/~davids/cgi-bin/PerlMRMLClient/PerlMRMLClient.pl

Languages: C++, Perl (others possible too)

 

CBIR-based Dermatology Diagnostic Assistant

DS3

David Squire

Australia has the highest incidence rate of skin cancer, and specifically melanoma, in the world. Although most skin cancers are relatively harmless, more than one in seven people diagnosed with melanoma dies from it. Each year, around 1000 people die from what is an almost totally preventable disease. The initial step in melanoma diagnosis is visual inspection. The dermatologist inspects an image of a skin lesion image for features characteristic of malignant lesions, such as

The aim of this project is to invesitage the use of a content-based image retrieval (CBIR) system with a database of skin lesions to provide clinicians with diagnostic support. You will create feature extractors, building on existing work, which provide descriptors for lesion images related to the characteristics described above. The system will be based on the GIFT (GNU Image Finding Tool) platform.

References:

http://www.gnu.org/software/gift/gift.html

http://www.mrml.net/

http://viper.csse.monash.edu.au/~davids/cgi-bin/PerlMRMLClient/PerlMRMLClient.pl

http://www.csse.monash.edu.au/courseware/cse3308/cse3308_2002/assets/images/assign2.pdf (for an idea of the kind of system I'm talking about)

Philippe Schmid-Saugeon, Joël Guillod and Jean-Philippe Thiran, Towards a computer-aided diagnosis system for pigmented skin lesions, /Computerized Medical Imaging and Graphics/, *27*, pp. 65-78, 2003. (http://citeseer.nj.nec.com/571151.html)

Languages: C++, Perl (others possible too)

 


Bala Srinivasan

Implementation of the Viola-Jones Face Detection System

BS1

Bala Srinivasan and Andrew Paplinski

based on: Paul Viola, Michael J. Jones: "Robust Real-time Object Detection", The Cambridge Research Laboratory, Tech. Rep. CRL 2001/01, February 2001

The Viola-Jones face detection system is able to detect faces extremely rapidly. The detection speed of 15 frames/second for 384 by 288 pixel images on a conventional 700 MHz Intel Pentium III has been reported. That is 15 times faster than the Rowley-Baluja-Kanade detector and about 600 times faster than the Schneiderman-Kanade detector. The method is based on three main contributions: The introduction of a new image representation called the "Integral Image" which allows the features used by our detector to be computed very quickly. a learning algorithm, based on AdaBoost, which selects a small number of critical features and yields extremely efficient classifiers. a method for combining classifiers in a `cascade' which allows imputation on promising object-like regions.


Peter Tischer

Generalised Vector Median Filtering (BCS 20 pt, could suit BDS 20 pt)

PT1

Peter Tischer

Median filtering is a way of removing impulsive noise from images whose pixel values are scalar. The median operation is defined for scalars but it needs to be generalised when we want to talk about the median of a set of vectors. Colour images have pixels which have a red, green and blue scalar components. Other images whose pixels take on vector values include multispectral images and fMRIs. The project will involve coming up with a framework for generalising the concept of median filtering to images with vector valued pixels and studying the performance of these generalised median filters on a variety of image types but with special interest in performance on colour images and fMRI sequences.

Special pre-requisites: student need not have taken 3rd year Image Processing

Approximating frequency distributions by mixture models (BCS 20 pt)

PT2

Peter Tischer

Often when we are trying to estimate the probability associated with a set of events we compute the relative frequency distribution. A question we may ask is whether the relative frequency distribution can be adequately described as a mixture of probability distributions where the components in the mixture are in some sense simple probability distributions. If we can can find a good approximating mixture model we may be able to show that the events can be thought of as originating from different sub-populations of the overall population. Separating things into sub-populations is important in data mining and machine learning. This project aims to come up with good mixture models for a given relative frequency distribution by using MML techniques. In the first place, the problem will be approached by treating it as a lossless data compression problem. This can be seen as a simpler form of MML. Should time permit, the problem might be approached using MML in its general form.

It would be an advantage to have a maths background and to be taking the honours unit on 'Learning and Prediction'.

 

Improving the Near-Lossless Compression mode of JPEG-LS (12 or BCS, BDS 20 pt)

PT3

Peter Tischer

JPEG-LS is an international standard for the lossless compression of grey-scale images. It achieves good compression at high throughput rates. JPEG-LS also has a near-lossless mode. This achieves greater compression but introduces some reconstruction error. By specifying a parameter, the maximum reconstruction error in a pixel value can be controlled.

The near-lossless mode in JPEG-LS is relatively inefficient with respect to compression in situations where the resulting image is actually quite compressible. In addition, the prediction strategy used in JPEG-LS is not well suited to near-lossless compression. The aim of the project will be to take an implementation of JPEG-LS and improve its performance with respect to near-lossless image compression by both improving its coding strategies and its prediction strategies. In addition, the project will investigate incorporating region of interest coding. In some parts of an image the user may want to tolerate no reconstruction errors while in other parts the user may specify quite large bounds on the reconstruction error.

While the near-lossless work will aim at a wide range of image types, the region of interest coding may choose to concentrate more on medical images.

 

Document Segmentation for Document Image Compression (12 or BCS, BDS 20 pt)

PT4

Peter Tischer

The JBIG-2 standard for compressing binary images, that is images whose pixel values are either black or white, allows for images to be segmented and for different segments to be treated differently in the compression process. The standard, however, does not specify how this segmentation should be carried out. It specifies only how the encoder is to tell the decoder what the segmentation is.

The aim of this project is to investigate ways of segmenting binary images of documents, such as those that arise in fax and document processing systems, so that the resulting segmentation may be used to encode the binary image most efficiently. The segmentation will therefore be aimed at finding regions that have similar properties with respect to how compactly the information in those regions may be represented. Thus, an image of a page might be broken into segments which represent text, headlines, tables, half-tone images or white space at the margins of a page. The project might also investigate how to use such a segmentation to get best compression of the binary image.

 

Image Deblocking using Local Segmentation (12 or BCS,BDS 20 pt)

PT5

Peter Tischer

The baseline JPEG lossy compression mode has been hugely successful and is widely used, particularly on the internet. The same approach is also used in coding digital video such as videoCD, MPEG3, video-teleconferencing and digital television. However, at low bit rates the block approach leads to objectionable artifacts in the reconstructed image.

Local segmentation is a paradigm for constructing image processing algorithms in which the pixels within a block are either classified as belong to the same segment or as belong to a number of different segments. Operations on a pixel will only use values from those neighbouring pixels which belong to the same local segment. As one example, local segmentation has been shown to be effective in creating denoising algorithms for removing Gaussian noise while preserving underlying image structure.

The aim of the project is to use knowledge of how the image was compressed in conjunction with local segmentation to produce a reconstructed image with minimal block artifacts.

Document Similarity Measurement via Data Compression (12 or BCS 20 pt)

PT6

Peter Tischer

In document classification it is often desirable to measure how similar two documents are. This might be because we are interested in saying whether documents are spam or whether the source code of two programs is so similar that at least program is an example of plagiarism. If two documents are similar, we would expect that we would require little information to know the contents of the second document given that we already know the contents of the first. One way to measure information is by using data compression techniques as the compressed size is an upper bound on the amount of information needed to store the original data. The aim of the project is to adapt techniques from the area of text compression to compress the data which shows how one document can be created from another. Of particular interest is coming up with similarity measures that are also distance functions.


Charles Twardy

 

Modelling Fish Populations for Sustainability

CT1

Charles Twardy, David Green and Charles Todd (Sr. Scientist with The Department of Sustainability and the Environment-DSE)

The DSE would like to know the impacts of recreational fishing practices on native freshwater cod species. Using a 6-year mark-recapture study that is already in place, it may be possible to "jump the gun" by using computer modelling to estimate survival rates. We have used Snob to help discover age cohorts, and have talked about using more elaborate models (such as Bayesian networks) to combine expert knowledge with the limited data now available. We wish to develop practical models with confidence bounds. Future data from the continuing mark-recapture study offers the possibility of validating the models.

 

Causal Discovery from Prior Models (12pt or 20pt)

CT2

Charles Twardy and Kevin Korb

This project will enhance CaMML (Causal discovery via Minimum Message Length) to allow the specification of a prior probability distribution over the causal model space by specifying one or more causal models together with a weighted edit-distance metric. This will allow domain experts to indicate where in the model space they expect to find the true model so as to bias a subsequent automated search using sample data. The project will involved developing a Java GUI to specify the prior model(s) and metric and integrating that with the current CaMML GUI.

Snobmeric: Open Source Software for Science (12pt)

CT3

Charles Twardy and Trevor Dix

Many scientists we talk to want to use MML programs like Snob. They always have their data in Excel spreadsheets and need to convert it, and then make sense of unfamiliar output. Why not just embed MML methods into GNUMERIC, a powerful open-source spreadsheet that reads and writes Excel files, and take advantage of it's GUI? This project involves the student in active development of a large open-source software project. At base, it involves taking existing research code and putting into another project. But that will take work: the student must understand the existing code and what it does, figure out how to translate that into spreadsheet operations, especially given naive users, get spreadsheet data into Snob data structures, and get the output into some spreadsheet form or embedded graphic. Because accuracy is critical, it involves regression testing. Ideally, by the end of the semester, the student's work will already have been used by dozens of people around the world. In particular, we will focus on some evolutionary biologists at Monash.


Mark Wallace

 

Capturing a Core Combinatorial Problem within a Constraint (12 or 20 pt)

MW1

Mark Wallace

Over the last ten years, Constraint Programming (CP) has been used widely to solve real world complex combinatorial problems. Applications have been deployed in areas as diverse as aircraft scheduling, chocolate production, and microchip design. CP enables us to take clever algorithms and use them, in combination with with other algorithms, by reasoning on possible values taken by each of the problem variables. Each time the algorithm is invoked, it removes any values that it can tell are impossible from the domains of the variables. The different algorithms then communicate through the domains of their shared variables.

This project takes one such problem, the "Knapsack" problem, and turns it into a constraint that prunes variable domains in this way. The aim is to adapt one or even a combination of knapsack algorithms to do this. One approach we will exploit is integer/linear programming (1), and another is a form of dynamic programming (2). We will implement the constraint in a suitable constraint programming system: HAL (3) or ECLiPSe (4).

Two 12 pt projects could study one algorithm each, or one 20 pt project could integrate the two approaches into a single constraint.

References

(1) Meinolf Sellmann, Torsten Fahle Cost Based Filtering for the Constrained Knapsack Problem Annals of Operations Research, 73-93, 115

(2) Michael A. Trick Dynamic Programming Approach for Consistency and Propagation for Knapsack Constraints Annals of Operations Research, 73-84, 118

(3) HAL www.csse.monash.edu.au/~mbanda/hal/

(4) ECLiPSe www.icparc.ic.ac.uk/eclipse/


Geoff Webb

K-most-interesting Rule Discovery

GW1

Geoff Webb

K-most-interesting rule discovery allows a user to discover interesting rules from data, where the user is empowered to define the criteria by which a rule is judged interesting. This project will investigate new criteria of interestingness and develop techniques to support rule discovery in the context of those criteria.

Degree Suitability: Computer Science, Digital Systems, or Software Engineering

References:

G. Webb (2000) Efficient search for association rules. Proceedings of KDD-2000: the SIGKDD Conference of Knowledge Discovery and Datamining, Boston, pp.99-107.

G. Webb (1995) OPUS: An efficient admissible algorithm for unordered search. Journal of Artificial Intelligence Research., 3: 431-465.

Learning from quantitative data  (12pt or 20pt)

GW2

Geoff Webb

There are relatively few data mining techniques for analysis of quantitative (numeric) data. However, much data is quantitative, and there is a tremendous unmet demand for techniques that support sophisticated quantitative data analysis. This project will investigate extensions to the impact rules technique for learning rules that identify interesting distributions on a numeric variable.

Degree Suitability: Computer Science, Digital Systems, or Software Engineering

References:

G. Webb (2001) Discovering associations with numeric variables. Proceedings of KDD-2001: the SIGKDD Conference of Knowledge Discovery and Datamining, San Francisco, pp.383-388.
G. Webb (1995) OPUS: An efficient admissible algorithm for unordered search. Journal of Artificial Intelligence Research., 3: 431-465.

 

Techniques for efficient probabilistic prediction (12pt or 20pt)

GW3

Geoff Webb

AODE is an efficient new technique for estimating conditional probabilities from data that has demonstrated high levels of classification accuracy. This project will investigate extensions to AODE that seek to further improve AODE's efficiency while retaining its prediction accuracy.

Degree Suitability: Computer Science, Digital Systems, or Software Engineering

References:

G. Webb, J. Boughton & Z. Wang (2002) Averaged one-dependence estimators: Preliminary results. Proceedings of the Australasian Data Mining Workshop, Canberra.
Z. Zheng & G. Webb (2000) Lazy learning of Bayesian rules. Machine Learning, 41(1): 53-84.

 

An investigation of classification of objects that are dissimmilar to those seen before (12pt or 20pt)

GW4

Geoff Webb

It is relatively easy for a machine learning system to learn accurate classifiers for objects that are highly similar to those in the training data. However, the prediction accuracy of many systems falls as the similarity of test objects to training objects decreases.  This project will investigate this effect and develop new learning techniques that seek to minimise it.

Degree Suitability: Computer Science, Digital Systems, or Software Engineering

 

User modelling (12pt or 20pt)

GW5

Geoff Webb

Many computer systems can benefit from anticipating a user's likely intentions. This project will extend the Feature-Based Modelling approach to developing models of a user's interactions with a computer system.

Degree Suitability: Computer Science, Digital Systems, or Software Engineering


References:

G. Webb, M. Pazzani, D. Billsus (2001) Machine learning for user modeling. User Modeling and User-Adapted Interaction, 11(1-2): 19-29. Tenth anniversary issue.
B. C. Chiu & G. Webb (1998) Using decision trees for agent modelling: Improving prediction performance. User Modeling and User-Adapted Interaction, 8(1-2): 131-152.
G. Webb, B. C. Chiu & M. Kuzmycz (1997) Comparative evaluation of alternative induction engines for Feature Based Modelling. International Journal of Artificial Intelligence in Education, 8: 97-115.
G. Webb & M Kuzmycz (1996) Feature Based Modelling: A methodology for producing coherent, consistent, dynamically changing models of agents' competencies. User Modeling and User-Adapted Interaction. 5 (2): 117-150.