[Monash Home] [Monash Info] [News and Events] [Campuses and Faculties]
[Monash University]
School of Computer Science and Software Engineering
about courses People research student community internal

BCS, BDS, BSE Honours Projects 2005


 2006 list
 2004 list

This list contains Honours projects for BCS (20-point), BDS (20-point) and BSE (12-point). A project may be 12-point only, or 20-point only, or available in both 12- or 20-point versions.
BSE cannot take a 20-point project unless there is a 12-point version of it.
BCS and BDE cannot take a 12-point project unless there is a 20-point version of it.
You must discuss any project that you are interested in with the supervisor(s). If you do not you might not be considered for the project.

School of Computer Science and Software Engineering, Monash University.

Active Supervisors this year:

Investigation of Support Vector Machines (12- or 20-pts)
Project-id: AT-svm
David Albrecht & Peter Tischer (also see PT)
Support Vector Machines (SVM) have been used in various machine learning problems, including supervised classification, regression analysis and novelty detection. They process interesting properties and have been demonstrated to give excellent performance in a number of practical applications.
There are many different aspects of Support Vector Machines which could be investigated in this project. These include: * sensitivity to anomalous data, * effect of overlapping of classes on the performance of SVM algorithms, * methods of improving SVM algorithms, * application to image processing, and * application to novelty detection.
Students are not expected to have any special knowledge, but a background in mathematics will be an advantage. Also, students will not necessarily be expected to implement SVM algorithms as there are several public domain software packages available for computing SVMs.

MML + Haskell + templates or generics (20 or 12-pts)
Project id: LA-mml1
Lloyd Allison
Haskell (www.haskell.org) is a lazy functional programming language. I have been using it for MML-based machine learning, see http://dx.doi.org/10.1017/S0956796804005301 e.g. classification (clustering), decision trees, time-series, mixtures, Bayesian networks (including discrete and/or continuous variables), etc.. The project is to investigate the use of 'template Haskell' or 'generic Haskell' and similar extensions to Haskell to make it easier to process new multivariate data-sets, e.g. data held in spread-sheets or 'csv' files.
An application is to Search and Rescue data on Missing People -- who, where, how found, when, in what condition, etc..
MML + Haskell + cluster computing (20-pts)
Project id: LA-mml2
Lloyd Allison
Haskell (www.haskell.org) is a lazy functional programming language. I have been using it for MML-based machine learning, see http://dx.doi.org/10.1017/S0956796804005301 The project is to investigate cluster computing and/or parallel Haskell to speed up MML+Haskell machine learning on very large data sets.
Compression of DNA (20 or 12-pts)
Project id: LA-dna
Lloyd Allison
Most familiar file-compression algorithms fail to beat the trivial 2-bits/symbol {A->00, C->01, G->10, T->11} compression level on such data. Compression=modelling+coding, and arithmetic coding has solved the coding part of the compression problem, so we need better models; see www.csse.monash.edu.au/~lloyd/tildeStrings/#Compression The project is to develop improved models for use in compression of biological data.
Parallel Functional Language (20-pts)
Project id: LA-pfp
Lloyd Allison
The project is to implement a parser and interpreter for a small, typed, CCS-based language. The system will be written in Java. The final system should not be much larger than the (untyped, non-parallel) λ: www.csse.monash.edu.au/~lloyd/tildeFP/Lambda/
Large, Low-Cost Digital Display (20 or 12-pts)
Project-id: LA-display
Lloyd Allison
If you are inventive, like robotics, and can make things: The project is to design and build a very large but cheap digital display. The target size is about 2m×2m with a resolution of about 1cm/pixel (or higher), 40,000+ pixels, say. The cost must be low, a small number of cents (1c is good) per pixel. Display speed is "not the highest priority", e.g., a rewrite time of 1-hour could be acceptable. Small prototype(s) could be built before scaling-up. (You must discuss this project with LA before nominating it.)
Algorithms to find approximate palindromes (20 or 12-pts)
Project id: LA-pal
Lloyd Allison
Given a string, s, of symbols, a palindrome is a substring p=s[i..j] that reads the same forwards and backwards, i.e. either p=t;rev(t) or p=t;c;rev(t) where c is a symbol and rev(t) is the reverse of t. But in an approximate palindrome, p=t;u or p=t;c;u where u can be approximately, not necessarily identically, equal to rev(t). The project is to develop, test, and apply fast algorithms to find approximate palindromes in long strings. See http://arxiv.org/pdf/cs.DS/0412004

Haskell + gnumeric + plugins (20 or 12-pts)
Project id: LA-gnu
Lloyd Allison
Gnumeric (www.gnumeric.org) is an open-source spread-sheet. It allows a programmer to add new 'plug-ins'; these are usually written in C or C++. Recently Pang et al showed that "Haskell can be comfortably used as a statically typed extension language, and that it can support type-safe dynamic loading of plugins using dynamic types...."; see www.cse.unsw.edu.au/~dons/hs-plugins/paper/ The project is to investigate writing Haskell plugins (particularly plugins for MML-based machine learning) for gnumeric, possibly using the work of Pang et al.

A platform for DNA exploration and visualisation (12- or 20-pts)
Project-id: DP-dna
Trevor Dix and David Powell
The DNA of an organism contains the information about how that organism is to develop and function. Tools exist to find patterns and "interesting" regions of DNA based on information-theoretic and other techniques. Biologists also look at how much information one DNA sequence tells us about another DNA sequence, this may be within one organism, or between different organisms.
This project is to develop a GUI tool to explore the patterns and information content of one or more DNA sequences. This will allow a user to explore and visualise the output of existing pattern discovery tools, until a cumbersome and error-prone task.
The student completing this project will need to be proficient in Java. Some background or interest in biology would be helpful, but is not essential. [more]
Image Processing for Protein Expression (20-pts)
Project-id: DSB-image
Trevor Dix, Torsten Seemann, and John Boyce
Background: Monash is part of the Aust. Research Council centre for structural and functional microbial genomics that is interested in identifying vaccine targets from the bacterial pathogens Pasteurella multocida and Dichelobacter nodosus. The Centre has used bioinformatics analyses to predict all secreted and surface proteins from these organisms and are currently undertaking high-throughput cloning and protein expression of these predicted antigens. The scale of the protein cloning expression experiments (hundreds of proteins) means that data management and expression analysis become very time consuming.
This project is to develop an image analysis program which would allow rapid identification of 1) quantity of protein expression and 2) molecular mass of expressed protein from our protein gel images (see image 1). The package would integrate seamlessly with an in-house genomic information database to carry out automated checks of the integrity of the expressed products. The project could be extended to allow analysis of similar images of DNA amplification (sample image 2).
Images are available at www.csse.monash.edu.au/~trevor/honours/2005/John.html.
Analysing whole genome shotgun assembly (20-pts)
Project-id: DP-shotgun
Trevor Dix and Darren Platt
Background: The genomic sequence of an organism is sequenced (read) using many small individual reads of unknown location. These are typically around 600 letters (bases) long and must be correctly aligned, ordered and oriented to recover the original underlying DNA sequence. This process known as whole genome shot gun sequencing was traditionally used for small genomes of the few million bases but is now widely used to assemble giga base sized genomes such as that of human. While the assembly programs are gradually improving in quality, they routinely encounter problems with repetitive DNA (where one section of a genome looks very similar to another) resulting in mis-assembly.
The goal of this project would be to build on some primitive existing tools for visualising assemblies to devise automated methods for detecting potential mis-assemblies and better tools for visual inspection.
Real and simulated assemblies are available for testing. Techniques that could be employed include visualisation of the assembly, basic statistical analysis of the pairing distances, depth of coverage. A slightly more advanced analysis could include comparison with known genes that align to the genome. A technique that enabled a "diff" of two different assemblies would be extremely valuable for analysing how changes to assembly algorithms impact assemblies.
See www.csse.monash.edu.au/~trevor/honours/2005/Darren.html for images and more explanation. This is a difficult project and part of an international collaboration in the USA with a Monash expatriate. Agreement must be reached with all parties. The existing system will be made available and some supervision will be done remotely. [more]

David Dowe is interested in Minimum Message Length (MML) inductive inference. MML is particularly useful in machine learning, statistics, econometrics, "knowledge discovery", "data mining" and philosophy of science. Both theoretical and applied projects are available. I list a sample below. You should feel free to discuss any and all of these with David Dowe. Some of my areas of interest include clustering, angular data, correlated data, multi-dimensional data, supervised learning, decision trees/graphs, hybrid models, market forecasting, biological data, ..., etc. [more]. All of these would be done by MML. There is no need to have done any previous subject on AI.
Phylogenetic modelling of indigenous, endangered and other languages (20 pts)
Project-id: DD-lang
David Dowe
This project concerns phylogenetic modelling of languages. In other words, we wish to model how languages have descended from and evolved from one another. There will be an emphasis on endangered indigenous languages, largely as a way to help preserve them. The modelling should be carried out using Minimum Message Length (MML), largely because of its theoretical optimality properties and its wide-ranging achievements in all range of inference problems. The project could be taken in the direction of using (probablistic) finite state automata/machines (PFSAs) to model grammars or syntax. More info' on the project, including a list of references, is at www.csse.monash.edu.au/~dld/Hons/2005/dld2005_projects
Time-series autoregression using MML with econometric applications (20 pts)
Project-id: DD-econ
David Dowe
We adapt the minimum message length (MML) principle to the problem of efficient partitioning of economic units, such as firms or countries, into groups (or regions) whose behavioural patterns are similar within each group but distinct across groups. The approach to the partitioning will be MML multi-way join decision graphs (Tan and Dowe, 2003). We will develop an autoregression model in the leaves of the decision graph, possibly augmenting it with a variety of lags (or time delays). We hope to consider at least two specific applications. There could be many variations on this project. More info' on the project, including a list of references, is at www.csse.monash.edu.au/~dld/Hons/2005/dld2005_projects

On Go.
Project-id: GF-go
Graham Farr
The aim of this project is to estimate, as accurately as possible, the number of legal positions in the game of Go when played on large boards. In this game (known as Go or Igo in Japan, Weiqi in China and Baduk in Korea), two players (Black and White) take turns to place stones of their respective colours on the vertices of a large square grid (usually 19×19) so as to enclose (in a particular sense) as much territory as possible, and to capture the opponent's stones by surrounding them. In the project, both the size of the board and the number of players will be allowed to be arbitrary, and some alternative (non-square-grid) board topologies will be considered. The methods to be used include transfer matrices and Markov Chain Monte Carlo, which you will learn about. Programming will probably be in C, and user interaction will just be simple and text-based. You should have a good mathematical background and good results in cse2304, cse3305, cse2303.
I will be at Clayton for one day a week, and can meet students then. I am now mostly based at Caulfield, and can meet there as well, if the need arises.
Ref: G. E. Farr, The Go polynomials of a graph, Theoretical Computer Science, 306 (2003), pp.1-18.

Scalable Video Compression (20pts)
Project-id: TF-video
Tim Ferguson
The traditional method for compressing video is to use the common MC/DPCM/DCT hybrid block based coding technique. This hybrid technique is popular due to its simplicity, efficiency and general performance. However a relatively new and more scalable approach, showing significant promise in image compression (JPEG2000), is the wavelet transform. Research is currently being conducted as to how best apply the wavelet transform to video compression. The primary problem relates to motion compensation and how this can be combined with the wavelet transform. The MPEG standards organisation is currently investigating this area of research (www.chiariglione.org/mpeg/working_documents.htm).
Camera Calibration for Making Real-world Measurements (12 or 20pts)
Project-id: TF-camera
Tim Ferguson
The basic problem here is we wish to take a two-dimensional image and obtain three-dimensional information so that real-world measurements and positions can be found. Obviously some limitations must be applied to this process. Rather than locating points at any position in the three-dimensional space, we can only consider points on a reference planar surface. In this case, a road surface is considered. To solve this problem, we need to be able to determine the internal parameters of the camera (focal length, principal point of focus and radial lens distortion) and the external parameters of the camera (rotation and translation). Initial research uses the well known Tsai's coplanar camera calibration method. The aim of the project is to improve the current calibration methodology and possibly track changes to the calibration using visual queues.
Detection and Measurement of Pavement Cracking (20pts)
Project-id: TF-pavement
Tim Ferguson
Pavement cracking leads to the rapid degradation of road surfaces. Early restoration of cracking can lead to the prevention of more serious problems (like potholes). The present method for measuring pavement cracking is visual inspection. This is time consuming, costly and can be dangerous. This project investigates methods for automated, or semi-automated detection of pavement cracking from video taken of the pavement surface. Due to practical implementation reasons, the video has resolution (currently a relatively low 0.5 - 1 megapixel) and lighting limitations placed upon it. This presents a significant challenge for the detection system. Early work on this problem was commenced by a previous honours student. It is expected that further development and testing of his method, or the proposal of an improved method, can lead to improvement in these results.
Content Based Identification (CBID) (20pts)
Project-id: TF-cbid
Tim Ferguson
Content based identification (CBID) is also known as digital media fingerprinting. The idea is that multimedia content is assigned a unique fingerprint that can be used for future identification. Given a database of known fingerprints and an unknown piece of multimedia, it should be possible to identify the multimedia by generating a fingerprint and using that fingerprint as an index into our database. Ideally the CBID implementation should be robust enough to still identify content even after it has gone through noise additive processes (compression, transmission, etc). This project will investigate CBID for video. The primary goal will be to create unique fingerprints for video sequences which will be robust under noisy conditions. The fingerprint will need to be a continuous one that may be generated not only by analysing the video track, but may consider the audio track (if present) as well.

Smart Circuit Reconfiguration of Wireless Sensor Networks (20-pts).
Project-id: CG-recon
Charles Greif
The use of Ad Hoc wireless networks to allow smart sensor circuits to communicate has recently become an active area of research both from the Data Communications Theoretic point of view as well as from the practical circuit design point of view. This project extends the work of Steven Koelmeyer BDS(Hons 2004). The main aim of this project is to apply the Mentor Graphics "Seamless Hardware Software Co-Verification tools to the task of designing the Hardware and Software for the next generation of smart sensor notes. The nodes already carry a small processor and will be extended with the addition of a FPGA or CPLD to enhance their functionality.
Reference: I. Stojmenovic, J. Wu, "Ad Hoc Networks", IEEE Computer, 37, 23, 2004
C. Zuver, D. Lim, J. Lockwood, C. Neely, "Automated tools to implement and test internet systems in reconfigurable hardware", ACM SIGCOMM Computer Communications Review, 2004
Software development to support the programming of smart sensor networks running the Berkeley TinyOS Operating System (12 or 20-pts)
Project-id: CG-sensor
Charles Greif
Networks of small low-power sensor nodes are currently of significant research interest to both scientist and engineers for the purposes of remotely sensing environments and controlling machinery using that sensory data. This aim of this project is to use specialised compiler technology and a simulation environment to develop software for small sensor networks, similar to those used in building-climate control systems. The operating system used on the sensor nodes is Berkeley's TinyOS and the compiler that is to be used is the NESC compiler.
Reference: R Katz, J. Kahn, K. Pister, "Emerging Challenges: Mobile Networking for Smart Dust" Berkeley EECS Smart Dust website, 2000
P. Buonadonna, R. Szewcxyk, D. Culler, J. Hill and A. Woo, "A network- centric Approach to embedded software for tiny devices" Emsoft website page 16, 2001

Suburban Ad Hoc Networks (20- or 12-pts)
Project-id: KP-net
Carlo Kopp and Ron Pose
The Suburban Ad Hoc Networking group focusses its research activities on techniques for implementing Suburban Ad Hoc Networks. These are self organising, quasi-static ad hoc (typically wireless) networks which provide an alternative technology for providing high speed digital connectivity to households, small businesses and distributed campuses. Specific areas of research interest include security, low level routing protocols, access controls and propagation behaviour. Given the broad scope of the research performed in this area, there is considerable choice in project topics. Students should consult Dr Ronald Pose or Dr Carlo Kopp for suitable project topics.
See: www.csse.monash.edu.au/research/san/

Supervised Learning of Email Features (20 points)
Project-id: MA-mail
Yuval Marom (res fellow) and David Albrecht (also see DA)
Automatic processing of emails is required in numerous practical applications, such as summarizing a thread of emails (eg to extract the gist of a discussion), and grouping similar emails/threads (eg to provide a collection of topic-related threads in a newsgroup). The challenge that such applications face is finding a suitable data representation for coding and processing the emails. Success hinges on the choice of features used to transform the raw text in an email to a form suitable for processing by computer algorithms. Research has shown that high-level features, such as sentence types, are often very useful. However, they tend to be very domain-specific, and thus not easily ported between domains. One way to overcome this limitation is to employ a corpus-based approach, where high-level features are learned from low-level ones.
This project will investigate the supervised learning of sentence types from low-level features, in the domain of help-desk emails. Examples of sentence types are GREETING (eg "Hi again Mr Blob") and REQUEST ("Please send me the most recent driver for my printer"). Examples of low-level features are the words in a sentence, and the number of verbs used in the sentence. The main outcome of this project will be a classifier (such as an SVM) that automatically classifies sentences using low-level features. Optional extensions of the project, upon successful implementation of a classifier, are email clustering and summarization, or other applications of the student's choice.
Prerequisites: A strong mathematical and statistical background is required for this project. Knowledge of markup languages, such as xml, is useful but not essential.

Organic Modelling with Generalised Cylinders (12 or 20pts)
Project-id: JM-1
Jon McCormack
Generalised cylinders are a geometric modelling method, originally developed for use in computer vision. For this project we wish to apply them to the modelling of organic structures for computer graphics. The basic principle for creating a generalised cylinder is to define a series of cross-sectional profiles, possibly of varying shape and size, and distribute them over some continuous curve, known as the carrier curve. The cross-sections are connected to form a continuous surface. Sounds easy, but there are a number of important issues that need to be addressed to ensure that the geometry defined by the cylinder is legal (i.e. can be rendered). Constructing compound surfaces is very useful for modelling organic structures such as branches, leaves, tentacles, veins, shells, etc., as this image (below) illustrates. The image is procedurally generated using generalised cylinders.
The challenge for this project will be to create a software system to assist in the automated construction of such models using generalised cylinders. The system will also have to deal with managing geometric complexity and geometric output in a variety of formats (e.g. real-time, offline rendering). The software for the project should be written in C++ and OpenGL. You should have successfully completed CSE3313 Computer Graphics (or equivalent) in order to work on this project. [more]
Preliminary Reading: McCormack, J. 2004, Generative Modelling with Timed L-Systems, in Gero, J.S. (ed) Design Computing and Cognition '04, Kluwer Academic Publishers, Dordrecht. pp. 157-175. (Hargrave Library reference: H 620.00420285 I61.4D 2004)
Prusinkiewicz, P., L. Mündermann, R. Karwowski & B. Lane 2001, The Use of Positional Information in the Modeling of Plants. Proceedings of SIGGRAPH 2001 (Los Angeles, California, August 12-17). In Computer Graphics (Proceedings) Annual Conference Series, ACM SIGGRAPH, pp. 289-300.
Mech, R., P. Prusinkiewicz & J. Hanan 1997, 'Extensions to the Graphical Interpretation of L-Systems Based on Turtle Geometry', Technical Report, No. 1997-599-01, April 1, 1997. University of Calgary, Calgary, Alberta Canada.
Interactive Adaptive Learning Systems (12 or 20pts)
Project-id: JM-2
Jon McCormack
This project will investigate the use of adaptive learning systems for a real-time interactive application. Collections of virtual creatures are required to develop a symbiotic relationship with a human audience – responding to movement and gesture. Students should be prepared to investigate a number of adaptive learning techniques including classifier systems and evolutionary ANN (Artificial Neural Network) approaches, with the goal of creating a novel system that evolves and adapts to its real and virtual environments. The key emphasis for this system must be flexibility and real-time behaviour as the dynamic environment is actually changing in real-time. [more]
References: General Introductions to Adaptive and Evolutionary Systems (all available from the Hargrave library):
Flake, G.W. (1998), The Computational Beauty of Nature : Computer Explorations of Fractals, Chaos, Complex Systems, and Adaptation, MIT Press, Cambridge, Mass.
Pfeifer, R. and C. Scheier (1999), Understanding Intelligence, MIT Press, Cambridge, Mass.
Holland, J.H. (1995), Hidden Order : How Adaptation Builds Complexity, Helix Books, Addison-Wesley, Reading, Mass.
More specific papers: McCormack, J. (2002), Evolving for the Audience, International Journal of Design Computing 4 (Special Issue on Designing Virtual Worlds). pdf version.
McCormack, J. (2003), Evolving Sonic Ecosystems, Kybernetes 32(1/2). pdf version.
The Illusion of Beauty (20pts)
Project-id: JM-3
Jon McCormack
What is beauty and what makes a thing beautiful? Is beauty a property of things and can it be measured? This was famously answered by the mathematician Birkhoff who in 1933 proposed an 'aesthetic measure', equal to the ratio of order to complexity. Birkhoff was able to measure the aesthetics of simple shapes and vases. However, the measure was not successful in general. Many principles of what makes a good picture are well known, i.e. in composition we talk of balance, entrance and exit, symmetry, etc. as properties of what is considered a good composition. For this project the challenge is to try and encode or infer rules about visual composition to see if we can devise a better aesthetic measure of an image. This could have all sorts of applications, for example, automated selection of camera positions in 3D virtual environments, use as "fitness functions" to be used in evolutionary programs and so on — read this paper as a good starting point. Would we think a computer that can make beautiful images as being creative? [more]
Preliminary Reading: Saunders, R and Gero, JS: 2001, Artificial Creativity: A Synthetic Approach to the Study of Creative Behaviour, in Gero, JS (ed), Proceedings of the Fifth Conference on Computational and Cognitive Models of Creative Design, Key Centre of Design Computing and Cognition, Sydney.
Humphrey, NK: 1973, The Illusion of Beauty, Perception 2: 429-439.
Solomonoff, RJ: 1995, The discovery of algorithmic probability: A guide for the programming of true creativity., in Vatanyi, P (ed), Computational Learning Theory: EuroCOLT '95, Springer-Verlag, Berlin, pp. 1-22.
Boden, MA: 1994, What is Creativity?, in Boden, MA (ed), Dimensions of Creativity, MIT Press, Cambridge, MA, pp. 75-117.
Dartnall, T (ed.) 2002, Creativity, Cognition, and Knowledge: An Interaction, Praeger, Westport, Connecticut; London.
Partridge, D and Rowe, J: 1994, Computers and creativity, Intellect, Oxford, England.

Hybridizing Ant Colony Optimization and Constraint Propagation (24-pts)
Project-id: BM-ants-1
Bernd Meyer and Andreas Ernst
Solving combinatorial optimization problems, (COP) such as fleet routing, rostering and timetabling, is a core task of practically all industrial operations. As these problems are computationally hard, specialized techniques and algorithms are required to solve them. This makes COP a very interesting class of problems, both from the theoretical and from the applied perspective.
The common characteristics of COP is that a vast space of potential solutions has to be searched for a solution that satifies all problem constraints and is in some sense optimal. Problem constraints restrict the range of feasible solutions: in rostering, for example, a constraint may be that no employee can work two consecutive shifts. Optimality must in some sense be measurable, so that solutions can be compared. In rostering a component of optimality often is the total salary cost.
There is large variety of algorithms and techniques to tackle COPs. Two very different approaches are meta-heuristics and constraint programming. With a bit of over-simplification, it could be said that meta-heuristics are a good approach when finding feasible solutions is comparatively easy but achieving optimality is difficult, whereas constraint programming is at an advantage where it is difficult to find a feasible solution, but optimising among the feasible solutions is comparatively easy.
The project will investigate hybrid techniques that combine the complementary strengths of these methods. Particularly promising meta-heuristics for such a combination are Ant Colony Optimization (ACO) and the Bayesian Optimization Algorithm (BOA). ACO is a recent stochastic meta-heuristic inspired by the foraging behaviour of real ants. For an introduction to ACO see "A Review on the Ant Colony Optimization Metaheuristic: Basis, Models and New Trends" by Oscar Cordo, Francisco Herrera and Thomas Stützle in Mathware and Soft Computing, Vol. 9, No. 2-3, 2002, pp. 141-175. For an overview of BOA see "Linkage Problems, Distribution Estimation and Bayesian Networks" by M. Pelikan, D. Goldberg and E. Cantu-Paz (available at: www.cs.umsl.edu/~pelikan/publications.html). The project will be conducted in association with the Commonwealth Scientific and Industrial Research Organisation CSIRO.
Optimality of Ant Foraging (24-pts)
Project-id: BM-ants-2
Bernd Meyer and TBA
The emergence of co-operative behaviour, such as collective foraging, is one of the most fascinating aspects of social biology. How can very simple individuals, such as ants, collectively conduct highly organised activities, such as nest building or foraging, without any need for a central co-ordinator?
Probably the best studied forms of co-operative behaviour are foraging and combat in social insects, such as ant, wasps, termites and bees. Foraging, in particular, raises interesting questions. A wide-spread assumption in modern biology is the so-called Optimal Foraging Theory (OFT), which basically asserts that the foraging behaviour of a highly evolved species must in some sense be optimal or at least close to optimum. OFT really is a consequence of standard evolutionary theory, where foraging efficiency is taken as a currency (as a proxy) for general fitness. It is, however, worth questioning whether such a simple form of OFT is really justified.
While it is difficult or even impossible to rigorously test OFT in the field or in the lab, mathematical and computational modelling gives us a handle to do this. Mathematical modelling provides the opportunity to vary parameters that we cannot (easily) isolate or control in the lab and to derive predictions how the foraging behaviour will change with such parameter changes. Thus we can design representative simple, schematic scenarios and abstractly test OFT for these. A closer inspection of existing mathematical models of ant foraging immediately invites some types of computational experiments that allow us to analyze the optimality of ant foraging in certain types of environments.
The purpose of the project is to use mathematical and computational modelling to analyse whether ant foraging is optimally adapted to the environment. Results of a successful project will in turn inform a biological investigation and give guidance for the design of field or laboratory experiments with real ants to validate the results of the modelling.
These questions can be tackled with two complementary modelling techniques: individual-based simulation models (IBMs, sometimes also called agent-based micro simulation), and differential / difference equation models. Differential equation models, the classical modelling technique for collaborative foraging, unfortunately often fall short where complex environments have to be taken into account, for example dynamically changing food resources. The project is therefore more likely to rely upon IBM simulation. Students wishing to undertake this project should feel confident with programming these. Certain important aspects of the problem can, however, well be studied in abstract mathematical models without simulation and a student with the appropriate mathematical background may choose to specialise the project in this direction.
Interest in biology and some enthusiasm for interdisciplinary work is a must, but no particular biological knowledge is required as a prerequisite. Depending on the specialisation, this project may be conducted with an external supervisor and may give interested students the opportunity to collaborate on experiments with real ants in a biological laboratory if desired. For an introduction to collaborative foraging and its modelling see "Self-organization in biological systems" by S. Camazine, J.-L. Deneubourg, N.R. Franks, J. Sneyd, G. Theraulaz and E. Bonnabeau. Princeton University Press, 2001.

Bayesian Networks for Modeling Cardiovascular Health (Background)
The ARC has funded a 3-year project to apply Bayesian networks (BNs) to epidemiological data, specifically to predict and prevent coronary heart disease (CHD). Bayesian networks excel when researchers need to combine causal and diagnostic reasoning in areas characterised by uncertainty. But they have one flaw which hinders their use: they do not yet easily mix continuous and discrete variables. We intend to extend them to handle such mixes, then demonstrate how much they can improve on current methods for predicting, among other things, coronary heart disease. One aim, obviously, is to prevent deaths. Another is to reduce health-care costs by showing where we can shift resources from a small and expensive high-risk group to a large and inexpensive moderate-risk group.
The ARC project commenced in 2004 with the appointment of Dr. Charles Twardy as Research Fellow. We have constructed two BNs from epidemiological models of coronary heart disease (CHD), specifically (1) a regression model from the Australian Busselton study [Knuiman et al., 1998] and a simplified ``points-based'' model from the German PROCAM study [Assmann et al., 2002]. We then adapted both these BNs for the Busselton data set and compared them in cross-validation, along with machine-learned models. These BNs will serve as baseline predictors for the ARC project.
Project 1: Knowledge Engineering BNs for Epidemiology (12pt or 20pt)
Project-id: NK-epi-1
Ann Nicholson and Kevin Korb
This project involves constructing BNs that combine knowledge from experts (from the Department of Epidemiology, Monash Medical Ctr) and the results of the baseline predictor BNs constructed thus far. A number of stages in the knowledge engineering process (see KorbNicholson2004] will be undertaken including:
  • investigation of network structures using the software tool Matilda
  • interleaving expert and automated parameters estimation, using a previously developed methodology [WoodberryEtAl2004]
  • applying prior causal knowledge (from experts or the literature) to automated learners, including CaMML (See www.datamining.monash.edu.au/software/camml/ for an overview of CaMML).
  • sensitivity analysis (including sensitivity to findings and sensitivity to parameters), using the Netica software and software developed at Monash [WoodberryEtAl2004]
  • comparitive evaluation of the BNs using the Weka environment.
Project 2: Learning hybrid BNs for Epidemiology (12pt or 20pt)
Project-id: NK-epi-2
Ann Nicholson and Kevin Korb
This project involves extending existing BN methods to learn hybrid networks (those with mixed continuous and discrete variables). These methods will be implemented in CaMML [CaMML], a successful BN learner that uses Minimum Message Length (MML) metrics tuned for causal structure. (See www.datamining.monash.edu.au/software/camml/ for an overview of CaMML). The methods developed will be used to build hybrid BNs from the CHD epidemiological data. A comparive evaluation of these hybrid BNs against the baseline predictor BNs will be undertaken using the Weka environment.
Requirements: Student should have completed an undergraduate AI subject, and have a reasonable maths background.
[Assman et al. 2002]. G. Assmann, P. Cullen and H. Schulte., Simple scoring scheme for calculating the risk of acute coronary events based on the 10-year follow-up of the Prospective Cardiovascular Munster (PROCAM) Study. Circulation. 105(3):310-315, 2002.
[Knuiman et al., 1998].M.W. Knuiman, H.T. Vu and H. C. Bartholomew. Multivariate risk estimation for coronary heart disease: the Busselton Health Study. Australian & New Zealand Journal of Public Health. 22:747-753, 1998.
[Korb & Nicholson, 2004] K. B. Korb and A. E. Nicholson. Bayesian Artificial Intelligence. Chapman & Hall / CRC, 2004.
[Woodberry et al., 2004] O. Woodberry, A. Nicholson, K. Korb, C. Pollino. Parameterising Bayesian Networks: A Case Study in Ecological Risk Assessment, Technical Report 2004/159, School of Computer Science and Software Engineering, Monash University, 2004 (based on 2003 Honours project, see www.csse.monash.edu.au/hons/projects/2003/Owen.Woodberry/.

Computer models of autistic learning (allocated)
Andrew P. Paplinski and Lennart Gustafsson
I will be offering only one Honours project which has been already pre-assigned to Chris Mills.

Content-Based Image Retrieval (20 pt or 12 pt, 2 projects)
Background: In recent years Content-Based Image Retrieval (CBIR) has become one of the most active areas of research in image processing. The purpose of a CBIR system is to retrieve images from a database whose visual contents are similar to that of a query image. Two key research aspects (see below) in CBIR are 1. Representation of visual features such as colour, texture and shape, followed by indexing of image in terms of multi-dimensional features and 2. Use of relevance feedback in the retrieval process.

Project-id: SR-cbir1
Sid Ray
This project will focus on the second aspect mentioned above, namely, the use of relevance feedback in the retrieval process. In this method, the retrieved images are chosen in such a way that they reflect the user's choice that is based on visual inspection of images. The project will involve development of methodology for combining image features, such as colour, texture and shape, by incorporating the relevance feedback information.
Project-id: SR-cbir2
Sid Ray
This project will focus on the first aspect mentioned above, namely, representation of image features and image indexing. The project will involve development of an image indexing algorithm based on (a) the study of usefulness of various computational features in describing the visual contents of an image and (b) the study of combination of features leading to successful retrieval results.
For details on the projects please see the supervisor. In addition, reference may be made to the three honours projects in the area of CBIR that were carried out by Stewart Hore, Scott Clements and Hai Le during the years 2000, 2003 and 2004, respectively.
Texture Analysis of Images (20 pt)
Project-id: SR-texture
Sid Ray
Texture plays an important role in both human interpretation of visual scenes and computer analysis of images. Textural cues are of particular relevance in two different, but related, image analysis problems, namely, the problems of (i) segmentation and (ii) classification of images. The proposed project will deal with both of these problems. It will involve two phases. In the first phase some existing texture analysis methods will be investigated from the point of view of their theoretical soundness as textural measures as well as their practical applicability. In the second phase attempts will be made to derive a new methodology with particular attention to its computational efficiency.
Selection of Features for Pattern Recognition (20 pt or 12 pt)
Project-id: SR-features
Sid Ray
Experience shows that in recognition of patterns by humans, the recognition is made based on only a few important features (or attributes) characterizing the pattern classes. Conversely, in statistical pattern recognition, the patterns are often represented by a large number of numerical features. Although there is no conceptual justification in reducing the number of features to a small number, in practical problem solving, this becomes a necessary step due to the wellknown phenomenon of the 'curse of dimensionality' of the feature vector on the complexity of the pattern classifier. The aim of the proposed project is to develop an interactive feature selection paradigm well-suited to multiclass (rather than 2-class) pattern recognition problems.
The project will comprise four components, namely, (i) study of some existing feature selection criteria, (ii) development of software tools for the existing criteria, followed by an experimental investigation of them, (iii) analysis of the above results leading to, hopefully, a feature selection paradigm, and (iv) development of an interactive software tool for the above paradigm. The software developed will be 'interactive' in the sense that depending on the classification accuracy arrived at a certain stage, the user will have the option of changing the dimensionality value. Options will also be available to supply interactively a range of values of the dimensionality. The software tools will include procedures for displaying the distribution of pattern samples in different feature spaces obtained by different feature selection methods.
Document Image Analysis (20 pt or 12 pt)
Project-id: SR-docs
Sid Ray
The objective of document image analysis is to recognize text, graphics, and pictures in printed documents and to extract the intended information from them. There are two broad categories of document image analyses, namely, textual processing and graphical processing. Textual processing includes skew determination (any tilt at which the document may have been scanned), finding columns, paragraphs, text lines and words, and performing optical character recognition. Graphical processing deals with lines and symbols.
The scope of the proposed project will be the aspect of text processing and it will concentrate on the development of a system for page layout analysis.
Image Thresholding by Probabilistic Distance Criteria (20 pt)
Project-id: SR-threshold
Sid Ray
Thresholding is one of the most popular approaches to image segmentation. In this approach, all the pixels having a certain range of some image property, say, intensity, are considered to belong to one group. Connected regions in these groups lead to the desired segmentation. In the past, the probabilistic entropy measure of Shannon, defined in the context of Information Theory, has been successfully applied in determining the gray level threshold between the "object" and the "background" regions of an image, assuming separate probability distributions for the object pixels and the background pixels. Image segmentation methods also exist in which non-probabilistic entropic criteria, devised in the fuzzy set-theoretic framework, have been used.
Projects carried out in 1997 and 2002 included the study of two probabilistic distance criteria, namely, the Bhattacharyya distance and Kullback-Leibler Divergence function, as image thresholding criteria. The purpose of the proposed project will be to make further investigations in the area of probabilistic distance-based image segmentation methods and their applications for both monochrome and colour images. Knowledge of introductory Statistics and Probability Theory will be advantageous. Attendance of CSE456 Pattern Recognition in Semester 1 will be helpful but not essential.
Image Segmentation by Clustering (20 pt)
Project-id: SR-seg-1
Sid Ray
The aim of the project will be to make a critical study of a number of existing non-hierarchical cluster analysis methods, such as, c-means, fuzzy c-means, and isodata, with a view to their application in image segmentation. Study of cluster validity will form an essential part of the project. Applications will include both monochrome and colour images.
Segmentation of Textured Image (20 pt)
Project-id: SR-seg-2
Sid Ray
Note: See the supervisor for details.
Classification of Textured Image (20 pt)
Project-id: SR-class
Sid Ray
Note: See the supervisor for details.
Hybrid Image Segmentation (20 pt)
Project-id: SR-hybrid
Sid Ray
Note: See the supervisor for details.

Spatial models of evolutionary biology
Project-id: SG-spatial
Suzanne Sadedin (post doc) and David Green
A project with the general theme of spatial models of evolutionary biology. Some possible areas for investigation include the evolutionary dynamics of mating systems, game theory for territorial animals, and the paradox of the lek.
You will need to meet with me to negotiate a more specific topic that meets your interests and is relevant to current work. You should only consider this project if a) you are genuinely interested in evolutionary biology and willing to independently learn a lot of biological information and jargon and b) you are confident in your coding abilities, preferably in C++.

Soft Models for Analysing Real-Time properties (Robyn SMART) (20pt or 12pt)
Project-id: PS-rta
Ian Peake and Heinz Schmidt
Meet Robyn, our Evolution ER1 robot (ER1 Robot Page 2004). In this project you will extend compositional performance models to predict "soft real-time" properties of Robyn, and design experiments to validate these. This will involve extending an existing package written in Java (Schmidt et al. 2003) and becoming more experienced with Markov Chain analysis. Robyn and other real time systems have requirements on their ``extra-functional'' properties, such as resource (CPU and memory) requirements, performance over time, reliability, and quality of service. "Soft real-time" properties take into account probability, for instance "executes program in 12s with 0.6 probability assuming collision probability less than 0.1". As industry moves to component-based software engineering, predicting extra-functional properties compositionally remains a challenging research problem (Schmidt 2003) for realistically sized systems. The Evolution ER1 is a robotics kit with computer vision, hearing, speech, networking, remote control, email, autonomous mobility, gripping, and IR sensing capabilities, implemented under Windows(TM) and Linux, and controllable via simple state machines or even remotely via a character-based TCP/IP API. The ER1 is suitable for use in a wide range of concept demonstration and prototyping scenarios, which have real-time aspects.
References: ER1 robot page, 2005 (www.evolution.com/er1/)
Heinz W. Schmidt: Reliability prediction for component-based software architectures. In Journal of Systems and Software, 66(3), pp. 241-252, Elsevier Science Inc, 2003
Heinz W. Schmidt, Ian D. Peake, Jue Xie, Ian Thomas, Bernd J. Kramer, Alexander Fay, Peter Bort: Modelling Predictable Component-Based Distributed Control Architectures Proc. Ninth IEEE International Workshop on Object-Oriented Real-Time Dependable Systems (WORDS' 03), IEEE, pp. 339-346, 2003
Recovery-oriented Model checking (RoMoc) (20pt or 12pt)
Project-id: ST-RoMoc
Heinz Schmidt and Ian Thomas
In this project you will develop a new moderately-sized state machine model of recovery-oriented interactions between independent autonomous software components. Recovery oriented computing (Patterson 2000) branches into exceptional behaviour before a failure occurs, i.e., proactively, for instance triggered by panicking components. Immediate and efficient measures are taken to safeguard the computation and lower the failure probability - overall mean time to recovery shortens and availability increases. Contrast this with reactive fault-tolerance where exceptions are thrown and handled after a failure is detected. Think 'alert' instead of 'failure' or 'exercise' instead 'hospital bed'. Your model demonstrator will be implemented by extending an existing Java package for probabilistic finite state automata (PFSAs) (Jayaputera 2003). PFSAs serve as abstract models of programs (particularly in real-time and distributed, i.e., networked, systems). These models can be checked against expected properties including failure probabilities. Such properties are expressed in probabilistic temporal logic (PRISM home page) and checked symbolically against your abstract PFSA model or against actual executions of deployed and running software which is claimed to be recovery-oriented as modeled in the PFSA. The symbolic (model checking) executions could also form the basis for automated testing. Automata-based model checking has numerous practical applications in industry in advanced telecommunication, military and technical domains such as automotive software engineering.
While there are many challenging open problems with this novel recovery-oriented approach (Fox & Patterson 2003, Shapiro 2004) the scope of this Honours project is limited to developing the one model, implementing and exploring it as a model-checking case study.
PRISM home page
Jane Jayaputera, Iman H. Poernomo, Heinz W. Schmidt: Timed Probabilistic Reasoning on UML Specialization for Fault Tolerant Component Based Architectures, Proceedings of the SAVCBS Workshop at European Software Engineering Conference, 2003
Patterson, 2002: ROC: A new research agenda for a new century. Proc. 8th Intl Symp. on High-Performance Computer Architecture (HPCA'02), IEEE, 2002
Armando Fox and David Patterson: Self-Repairing Computers, Scientific American, June 2003
Schmidt, 2004: Building systems from unreliable components, Dagstuhl Workshop on Architecting Systems with Trustworthy Components, 12/2004
Shapiro: Self-healing in modern operating systems, ACM QUEUE vol 2 no 9, pp. 66-75, 12/2004

Learning and Selection of ICA feature extractors for Image Retrieval (20pt)
Project-id: DS-ica
David Squire
[NB Prefers to supervise one project only] Independent Component Analysis (ICA) is a statistical technique for discovering hidden factors underlying a set of random variables. It goes beyond Principal Component Analysis in that it seeks factors that are statistically independent, rather than just decorrelated. ICA can be used to discover filters that can be used to characterize visual textures in images. In this project you will investigate the discovery and selection of a set of ICA filters for an image collection, and test them in a content-based image retrieval setting, using the GIFT, and open source image retrieval system. It is hoped that such filters will give improved performance when applied to specific types of images, such as dermatoscopic images of melanoma.
References: www.cis.hut.fi/projects/compneuro/whatisica.html www.gnu.org/software/gift/gift.html
Segmentation for Content-Based Image Retrieval (12 or 20-pts)
Project-id: DS-cb
David Squire
[NB Prefers to supervise one project only] In recent years, the use of digital image collections has become common, on the world wide web, in the preparation of both electronic and paper publications, and via domestic digital cameras. The need for tools to manage this rapidly-increasing quantity of visual data is greater than ever. Specifically, there is great interest in systems which allow users to query image databases. Consequently, there is great interest in Content-Based Image Retrieval. CBIR uses the Query by Example paradigm: to retrieve images, an example image is provided as the query, and the system endeavours to retrieve similar images, based on features such as colour, texture and shape. Many such systems have been developed, such as IBM's QBIC, VisualSEEk, the GIFT. Most current systems use whole images for query and for relevance feedback. In reality, users are often only interested in a particular object in, or region of, the query image. Some efforts have been made to support query by region, such as in BlobWorld, but the open-source GNU Image Finding Tool (the GIFT) does not yet do so. In this project you investigate image segmentation algorithms for suitability for CBIR, with a focus on Markov Random Field techniques. You will design a new feature extraction module for the GIFT, which treats segments, rather than images, as the basic elements. www.glue.umd.edu/~vikas/Geman%26Geman.ppt www.gnu.org/software/gift/gift.html

Security in Out-Sourcing Databases (20 or 12pts)
Project-id: BS-dbase
Prof. B. Srinivasan
Database as a service model (DAS) is transferred to service provider for backup, administration, restoration, space management, upgrades, etc., the advantage being that the application service provider supports the S/W, H/W and provides their own human resources. The issues and challenges are: 1 How to charge for service? 2. What kind of service guarantees can be offered? 3. Liabilities of the service provider. 4. Powerful interfaces to support development environment are required. 5. Scalability in the web environment and the associated overhaed costs due to network latency. 6.Privacy / security of out sourced data from intruders and attacks, protecting clients from misuse by service providers. Secure and efficient RDBMS Storage model needs to reduce the overhead with encryption. A new RDBMS storage model needs to be developed to: Reduce number of encryption calls, padding overheads and avoid over encryption of queries on non-sensitive data.

Improved image reconstruction for JPEG-LS near-losslessly encoded images (12pt)
Project-id: PT-jpeg
Peter Tischer
JPEG-LS is an international standard for the lossless compression of grey-scale images. It achieves good compression at high throughput rates. JPEG-LS also has a near-lossless mode. This achieves greater compression but introduces some reconstruction error. By specifying a parameter, the maximum reconstruction error in a pixel value can be controlled.
The near-lossless mode in JPEG-LS enables a picture to be encoded in a way in which the maximum reconstruction error per pixel of the decoded picture is guaranteed not to exceed a user-defined amount. The decoder has a degree of flexibility in terms of how to assign a reconstructed value to a pixel. This project will use local segmentation-based techniques to reduce the amount of reconstruction error in a JPEG-LS near-losslessly encoded image. Special Prerequisites - none
Document Segmentation for Document Image Compression - (12 or 20pt)
Project-id: PT-doc
Peter Tischer
The JBIG-2 standard for compressing binary images, that is images whose pixel values are either black or white, allows for images to be segmented and for each segment to be treated differently in the compression process. However, the standard does not specify how this segmentation should be carried out. It specifies only how the encoder tells the decoder what the segmentation is.
The aim of this project is to investigate ways of segmenting binary images of documents, such as those that arise in fax and document processing systems, so that the resulting segmentation may be used to encode the binary image most efficiently. The segmentation will therefore be aimed at finding regions that have similar properties with respect to how compactly the information in those regions may be represented. Thus, an image of a page might be broken into segments which represent text, headlines, tables, half-tone images or white space at the margins of a page. The project might also investigate how to use such a segmentation to get best compression of the binary image. Special Prerequisites - none
Image Deblocking using Local Segmentation (20 pt)
Project-id: PT-seg
Peter Tischer
The baseline JPEG lossy compression mode has been hugely successful and is widely used, particularly on the internet. The same approach is also used in coding digital video such as videoCD, MPEG3, video-teleconferencing and digital television. However, at low bit rates the block approach leads to objectionable artifacts in the reconstructed image.
Local segmentation is a paradigm for constructing image processing algorithms in which the pixels within a block are either classified as belong to the same segment or as belong to a number of different segments. Operations on a pixel will only use values from those neighbouring pixels which belong to the same local segment. As one example, local segmentation has been shown to be effective in creating denoising algorithms for removing Gaussian noise while preserving underlying image structure.
The aim of the project is to use knowledge of how the image was compressed in conjunction with local segmentation to produce a reconstructed image with minimal block artifacts. This project is a continuation of a project undertaken as a BSE honours 12 point project. There are a number of possible extensions to the project. These include implementing and improving an objective technique for the blockiness artifact of DCT-encoded images, implementing a state-of-the-art deblocking technique to use in comparison studies of deblocking techniques, an investigation of deblocking based on a sub-band approach and a local segmentation-based approach based on using the DCT coefficients directly. Special Prerequisites - none
Document Similarity Measurement via Data Compression (12 or 20 pt)
Project-id: PT-sim
Peter Tischer
In document classification it is often desirable to measure how similar two documents are. This might be because we are interested in saying whether documents are spam or whether the source code of two programs is so similar that at least program is an example of plagiarism.
If two documents are similar, we would expect that we would require little information to know the contents of the second document given that we already know the contents of the first. One way to measure information is by using data compression techniques as the compressed size is an upper bound on the amount of information needed to store the original data.
The aim of the project is to adapt techniques from the area of text compression to compress the data which shows how one document can be created from another. Of particular interest is coming up with similarity measures that are also distance functions. Special Prerequisites - none
Scientific Visualisation using MCST Projections. (12 or 20pt)
Project-id: PT-vis
Peter Tischer
In Scientific Visualization we are often interested in representing higher dimensional data in some lower number of dimensions such as 2,3 or 4. Minimal Cost Spanning Trees (MCSTs) are simple to compute and are a way of preserving closeness information in representing higher dimensional data.
Current methods for carrying out projections for visualizing higher dimensional data tend to be computationally expensive with results that are sensitive to parameters within the algorithm that are difficult to tune. These methods are also sensitive to the dimension to which the projection is being made. MCST-based projection does have these drawbacks and can be engineered to preserve properties of the original higher dimensional data.
This project will involve implementing a package that uses MCSTs for visualizing higher dimensional data in lower dimensions such as 2, 3 or 4 and comparing the performance of this package against packages incorporating techniques like SOMs (Self Organising Maps). Special Prerequisites - none.
Automatic Cropping of Medical Images (20 pt)
Project-id: PT-med
Peter Tischer (+ Department of Medical Imaging and Radiation Sciences)
In many forms of medical imaging there is a region of interest in which all possible information needs to be retained but there may also be parts of the image where substantial information loss may be acceptable. For example, in an x-ray of an arm it might be acceptable that information in the backround could be removed by turning the values of the background pixels into some constant value.
This project will involve working with radiographers from the Department of Medical Imaging and Radiation Sciences to automate the process of deciding that a part of a medical image represents a background region which does not need to be stored with a high degree of fidelity. Special Prerequisites - none

Exploring the Four-colour Theorem (20 or 12-pts)
Project-id: MW-4col
Mark Wallace
That you only need four colours to colour a map, was proven using a computer enumeration which is brilliant but unsatisfactory. Given the years of effort on this problem we cannot expect a simple pencil-and-paper proof, but I believe that new visualisation techniques might enable one to achieve new insights. For example any region with four or more neighbours will have at least two neighbours linked by Kempe chains [Fritsch]. Repeatedly flipping the colours of all regions bounded by a chain must eventually produce a colourable map or return to the original colouring. Can the latter ever occur? If we transform arbitrary maps to a regular form based on shortest paths, can we get a better inkling why it is difficult to colouring some maps? This project will generate some nice animated graphics but it is unlikely to achieve anything more substantial. But if it did, it would be very exciting!
Rudolph Fritsch (1998) The four color theorem: history, topological foundations and idea of proof.
Rail Timetabling (20 or 12-pts)
project-id: MW-rail
Mark Wallace
Invent a small suburban railway and design a regular timetable that maximises the number of trains per hour without any trains coming too close to each other. This problem is quite practical and also surprisingly hard [Peeters]. Constraint programming was once used to solve a version of this problem for the Melbourne suburban railway [Gosselin]. The challenge is to use a combination of constraint programming and operational research techniques to do it faster, better and with more scalability.
Peeters, L. W. P. (2003). Cyclic Railway Timetable Optimization. Erasmus University Rotterdam.
Vincent Gosselin (1993) Train Scheduling Using Constraint Programming Techniques
Learning While Searching (20 or 12-pts)
Project-id: MW-learn
Mark Wallace
Solving NP-hard problems, such as the most efficient way to wire a car, or the lowest energy structure of a protein, involves searching. Searching means trial and error: and our challenge is to implement a search routine that learns from its mistakes and never makes a second similar error. This requires recording the choices that resulted either directly, or indirectly, in the error and extracting minimum "inconsistent" combinations of these choices, which can be added as constraints called "nogoods" on the remaining search [Junker]. The challenge is to combine this learning with forms of reasoning ahead "constraint propagation" so as to achieve a highly efficient complete hybrid algorithm. The implementation language will be constraint logic programming.
Ulrich Junker, QuickXPlain: Conflict Detection for Arbitrary Constraint Propagation Algorithms (2001)

Learning under Resource Constraints (20pt).
Project ids: GW1 and GW2
Geoff Webb (two projects available).
These projects, as part of a large research team investigating the wider problem, will each tackle one aspect of the question of how to optimise machine learning in a context where the computational resources available for learning are unpredictable and varying. This is typical of many online applications where the user load varies greatly from time to time. Such applications include information retrieval, recommender systems, user modeling, junk email filters and online fraud detection.

FIT units & help-desk

Bibliographies: [CS-bibs] [LA] [Mon.lib.]

Comp'l Sci in the news 17/6/2005, & [report]

Monash Eng & IT #1 in .au & [#18 in world]
Also overall [THES @idp].


Course code: 2770.   URL:   http://www.csse.monash.edu.au/courses/bse/
Created with "vi (Linux + Solaris)",   charset=iso-8859-1