|
Honours Research Projects 2002
|
Computer Science, Digital Systems
(NB: This list may change up until mid-February 2002.
Supervisors may also offer projects that are not explicitely mentioned in this
list. It pays to talk to potential supervisors who research in areras that you
are interested in.)
These are the projects being offered for honours research projects for
Honours students in 2002.
-
BCompSci(Hons) (20-points)
-
BDigSys(Hons) (20-points)
Notes
-
Some projects may require an agreement covering intellectual property rights
to be signed as a condition of undertaking them.
-
These projects are available to Computer Science and Digital
Systems students unless marked otherwise. Students must ensure they
have have the required background for the project, see the supervisor(s)
of projects that you are interested in a.s.a.p.
-
Projects may also be available on Caulfield campus.
Please contact the BComp Honours co-ordinator Peter Granville
for information about such projects and consult with the BCS/BSc or BDS
coordinators if you intend to choose some such project.
Supervisors
The CSSE Interactive Video Wall
David Abramson
The School of CSSE operates a MPEG-2 low latency video wall connection
between the staff common rooms at the Caulfield and Clayton campuses. The
system
consists of a set of FVC CODECS, high output data projectors and cameras.
This project concerns development of a number of supporting systems for the
video wall. These include technologies for optimizing the field of view,
the addition of software to detect when users are present and also for
varying the network bandwidth requirements automatically. Students will be
exposed to some leading edge digital video equipment and will be required
to interface with the University networking hardware. The project will may
involve both hardware and software, depending on the skills that the
student has.
Multi-user games for an interactive video wall
David Abramson
The School of CSSE operates a MPEG-2 low latency video wall connection
between the staff common rooms at the Caulfield and Clayton campuses. The
system consists of a set of FVC CODECS, high output data projectors and
cameras.
This project will explore ways of supporting conventional group games like
board games, cards and chess, etc, across the video wall. This would allow
users to play games where no physical contact is possible.
Master-Slave Parallel Processing using Nimrod
David Abramson
This project concerns extensions to a parallel computing paradigm embodied
in the tools like Nimrod and Clustor. These tools support the parallel
execution of computational models which are totally independent.
Consequently, there is no interprocess communication required and data is
usually transmitted via the file system. However, there are a class or
problems that can be supported by minimal communication between a master
process and a set of slaves. For example, a parallel genetic algorithm
might perform most of its computations by separate slave processes, but
still require occasional communication with a master which recombined the
results from the slaves and redistributes them.
This project will address efficient mechanisms for allowing a master-slave
type parallel program to execute under Nimrod, thereby extending its
capabilities.
Iterative algorithms for Support Vector Machines generation
David Albrecht
Joint supervisor: Adam Kowalczyk and Herman Ferra (Telstra Research Labs)
Recently, Telstra Research Laboratories has been investigating a
machine learning method known as Support Vector Machines (SVMs). They have
shown that SVMs possess interesting properties and have demonstrated
that they give excellent performance in a number of practical
applications, in particular text categorisation.
In typical applications, labelled examples are scarce, while unlabelled
examples are plentiful. Learning algorithms which can effectively utilise
the unlabelled examples will have a definite advantage. In this
collaborative project with Telstra Research Laboratories,
we will investigate developments of algorithms suitable for such learning
paradigm. We shall focus on transductive inference using Support Vector
Machines in particular.
Clustering (alias Supervised Classification, Mixture Modelling, etc.)
Lloyd Allison
The project is to add an MML clustering "plugin"
to the CDMS data-mining system.
This can be done in one of two ways
- in Java using the CDMS library of probability distributions etc.
- by modifying an existing `C' program and joining
it to CDMS through the C:Java interface.
It could well be worthwhile to do both.
Once the basic clustering plugin is added and thoroughly tested,
extensions to
e.g. [1-D time-series]
data and higher-dimensions,
e.g. images, could be considered.
Graph Based Models for Data Mining
Lloyd Allison and Trevor Dix
Workers in CSSE have investigated MML
[decision-trees],
decision graphs
(more properley classification-trees etc.)
and related structures for data mining
for the `supervised classification problem'.
(For references, search for Dtree or for Dgraph in the
[bibliography].)
The project is to develop a "plugin"
for this kind of Model for the CDMS data-mining system.
Artificial Neural Net for CDMS
Lloyd Allison and David Dowe
An artificial neural network (ANN) "plugin"
is being developed for the CDMS data-mining system.
The project is to develop
- the MML theory (and its implementation) of ANNs,
- which will allow us to choose between "simple" and "complex" ANNs thus
- preventing over-fitting,
and
- the control and visualization of the ANN through the CDMS GUI.
Bioinformatics: Analysing Microarray Data
Lloyd Allison and Trevor Dix
A `micro-array'
experiment can show the `level of expression'
of tens of thousands of genes on a single microscope slide.
Unfortunately the "noise level" in such experiments is high,
so they are usually replicated several times for each set of
experimental conditions
but even so correctly dealing with the noise is difficult.
Microarrays are often used to compare gene expression levels
of an organism under two or more conditions.
e.g. 1: Under normal conditions and
when some control gene is mutated.
e.g. 2: At different times in a life-cycle, or in some other cycle.
The project is to create tools in the CDMS data-mining system
to analyse micro-array data.
This may involve developing statistical models of the problem,
working out the (MML) complexity of such models, and
implementing the above as a plugin to CDMS
with controls and visualization through the CDMS GUI.
Parallel Programming Language [-LA]
Lloyd Allsion
A simple notion of `parallel-process' consists of:
- Channel, (typed) along which Values are sent and received.
- I/O action, to send or receive a Value on a channel.
- Process which carries out a sequences of actions and may stop.
- Composition of processes, in parallel or
by non-deterministic choice.
Note that channels, actions and processes are Values
and can be passed through channels.
A proof-of-concept, untyped interpreter exists.
The project is to
- develop a practical, typed system, including
- execution in more than one (Unix-)process and
- on more than one computer,
and
- investigate the use of the system in
- typical resource-management ~ operating-system tasks, and/or
- generalized search and optimization algorithms.
Overloading and/or Classes in Functional Programming
Lloyd Allsion
A light-weight functional programming language is being developed.
It has a parametric polymorphic type system.
The project is to explore one or more ways of incorporating
overloaded operators and functions
- by a class system, as in Haskell where
- each Value has a Type,
- each Type can belong to one or more Classes, and
- each Class is specified by the operators and functions
that must be defined for Types in that Class,
- by declaring certain operators and/or functions to be overloaded.
The system is intended to be used
- for general purpose programming, and
- (coincidentally) as the basis of
a scripting language in the CDMS data mining system.
TRANSMISSION CONTROL PROTOCOL (TCP) FLOW CONTROL SIMULATION AND
VISUALIZATION
Jim Breen
TCP is probably the most important protocol in the Internet, and its
ability to respond appropriately to errors and packet loss is of vital
importance. TCP's flow control mechanisms are fully understood by very few
people, and are in fact quite difficult to teach.
This proposed project is to develop a simulation of TCP that can be used
as a teaching and demonstration tool. The simulation would be embedded
in a visualization environment that would enable the user to:
(a) select particular connection conditions between the two TCP clients
(RTT, packet loss, available bandwidth)
(b) select different TCP implementation options (Reno, Reno 2, Long Fat
Pipes, etc.)
(c) make adjustments to these while the simulation is running
and observe the packet-by-packet operations of TCP and its reaction to the
changing environment. The project would involve developing or modifying
an implementation of TCP, including all the latest extensions (something
Microsoft has not yet done).
Generic interactive theorem proving
John Crossley with Iman Poernomo
This project involves the development of an environment for (visual)
interactive theorem proving. As a basic requirement, the environment should
permit the user to write proofs in traditional predicate logic. However,
logicians study many other logics. The design of our environment should be
generic enough to represent proofs of some of these other logical systems.
This will require approaching the project with a meta-level view. Thus it
will require NOT the implementing of well-known logical rules but a general
framework for taking (almost) anything that could be regarded as a logical
rule and automatically encoding all instances of it. The student will be
expected to
- Design an XML schema for the representation of theorem and
proof information
- Implement an object-oriented API for representing
theorems and making proofs
- Implement a GUI for visually representing
proofs of the API.
The student has a choice of implementation language:
.NET, C# or Java. It is assumed that, on completion of the project, the
student will have a good working knowledge of XML and reflection. However,
no prior knowledge of these concepts is assumed. A familiarity with at
least the formal logic discussed in Formal Methods I is essential.
Assembling Genomic Tag Sites
Trevor Dix
This is a bioinformatics project applying computer science to analyse
data from microbiology, in this case genomic DNA information.
Fortunately, we can view this problem purely as computer science.
The problem involves assembling fragments of DNA in a consistent way.
The fragments have been marked at so called tag sites. In effect, we
need to align these sites. Trouble is the data is noisy. Some sites
should have tags and don't; other sites are wrongly marked.
Fortunately we can apply the minimum-message-length (MML) principle
that was developed at Computer Science at Monash (CSSE) to this
problem. We can use MML as we search for the best explanation of the
observed data, including the noise.
We also need to be able to view the results.
Clustering Microarray Data
Trevor Dix and Lloyd Allison
New technology allows tens of thousands of test tubes to be replaced
by a single microscope slide - a `microarray'. There are two types -
one using VLSI-like masks to built up arrays of molecules on a
slide, the other using inkjet printer-like techniques to
print a library of molecules onto a slide. Each dot on the microarray
binds to complementary messenger RNA, mRNA, labelled with a
flourescent dye. Spots where labelled mRNA is bound are detected
using a scanner.
Microarrays are of great interest to the Victorian Bioinformatics
Consortium of which several CSSE staff are members.
Microarrays can be used to reveal gene expression under varied
experimental conditions.
The problem is that the data are very "noisy" with considerable
experimental error, variation and noise.
The project is to use machine learning techniques developed in CSSE
to extract significant results from the raw data.
Alan Dorin
The projects Alan Dorin and Jon McCormack are offering in graphics,
animation and artificial life can be found
at: [
http://www.cs.monash.edu.au/~cema/projects/hons2001/hons.html].
Jon will be on OSP in semester 1 and will not offer any projects.
Linda McIver undertakes research in the areas of
- Psychology of programming
- Computing education
- Human computer interaction
She may also offer other projects in these areas that are not listed here.
Automated Web Navigation
Linda McIver
Navigation information is a critical component of a usable website.
Many sites provide incomplete site maps, with poor categorisation, so it
can be very difficult to track down the information you need. Once you
are deep in the site hierarchy, it can be particularly difficult to work
out where you are in relation to the rest of the site, and what else the
site offers.
Good navigation information provides answers 3 questions: Where am I?
Where can I go from here? Where do I want to be?
Much of this information can be extracted quite simply from collections
of pages, in order to construct a consistent navigation mechanism for
each page. This project aims to provide tools for automatic extraction
of navigation information, as well as construction of customisable
navigation mechanisms for a website. A natural extension of this might
be an automated indexing system which uses the text on each page to
construct a site map to guide the user to the desired information
quickly and easily.
Bernd Meyer
With the rapidly increasing size of the world-wide web, the visualization of
web sites, their contents and structure is quickly becoming a neccessary
tool for managing the complexity of the web. Visual representations are
extremely helpful for analyzing and interpreting the results of
world-wide web searches, and they can also be used to understand and optimize
the structure of individual sites or intra-nets.
The project builts upon previous projects for the visualization of query results and
web site structures. It will aim to create systems that provide effective
visualizations based on the combination of different types of information:
(a) structure-based
representations and (b) contents-based visualizations. Structure-based
visualizations analyze the link structure and/or traffic information and
use them to produce typically network-like site maps. Contents-based
visualizations analyze the text contents (and/or keywords, metatags etc.)
in web pages and produce displays that make it easier to judge the
similarity of two pages.
We will put particular emphasis on the utilization of innovative
visualization metaphors, such as spatialization, and the questions how
these can be used in a browser to support efficient explorative web
searches.
Experience in Java programming is required and knowledge about interfacing
with web browsers and plug-in programming would certainly be a plus.
Artificial Immune Systems for Plagiarism Detection
Bernd Meyer
Artificial Immune Systems (AIS) are a new class of nature-inspired methods
that are becoming increasingly popular in security applications such as
network intrusion detection and virus protection as well as for adaptive
optimization.
In a broad sense, AIS are close relatives of genetic algorithms (GA). Their
function is to learn to distinguish known or expected patterns (such as
types of network traffic) from patterns previously not encountered. AIS
algorithms are inspired by and roughly modelled after the function of the
human immune system.
Many of the basic building block of AIS resemble components of GAs and AIS
can indeed be interpreted as performing a kind of fast micro-evolution.
However, the objective of AIS is very different from that of GA: AIS
generalize, whereas genetic algorithms specialize. More precisely, AIS
generate or "evolve" a large number of partial recognizers that, taken
as a whole population, cover all known cases. This means that the objective
of their learning process is generalization. Genetic algorithms, on the
other hand, attempt to breed a population in which each individual is
specialized towards the same narrowly defined objective.
AIS have previously been proposed and used for various kinds of intrusion
and fraud detection, such as network intrusion, credit card fraud etc. This
project will explore how the ideas of AIS can be used to support the
automatic detection of plagiarized program code. The basic idea is that the
system will be "immune" to previously encountered programs so that it will
have an "immune reaction" for every genuinely new solution. However, in the
case of plagiarized code, no immune reaction will take place, because the
AIS will already have been immunized with related forms of stimuli.
Declarative Languages for Reasoning with Diagrams
Bernd Meyer and Maria Garcia de la Banda
Diagrams are extensively used not only in scientific and technical
disciplines, but also in many areas of everyday communication. The long
term purpose of this project is to provide computational methods and infrastructure
for smart user-interfaces that communicate with the user via diagrams.
These systems need to be able to interpret or "understand" diagrams that the
user draws, translate them into other specifications and to
generate such diagrams as output from specifications.
The project aims to create a declarative logic language for computational
reasoning with diagrams. Constraint logic programming (CLP) languages are a
good basis for such an undertaking due to their high-level nature and their
support for geometric constraints.
Three honours sub-projects in this area are immediately available. These
are intended to supply the technical groundwork for diagrammatic reasoning
languages.
-
Unfortunately, most CLP languages only support solvers that are monotonic,
i.e., they do not allow arbitrary deletion of constraints, a step often
required to model change in diagrammatic reasoning. The first project is
therefore to integrate the non-monotonic constraint solver QOCA into the
CLP language HAL. Both QOCA and HAL are currently being developed by the
Optimization & Constraint Solving research group in collaboration with
groups outside Monash University. The main challenge will be the design of
a well-defined interface.
-
The second project is to implement a linear logic mode in HAL. Linear logic
is a specialized logic of resources and actions, in which truth values can
change over time. It has been shown to form an adequate basis for modelling
problems in diagrammatic reasoning. Linear logic programs can be executed
on top of standard logic programming by means of meta compilation. The
project will develop a linear logic meta compiler for HAL starting from
existing Prolog meta compilers for linear logic.
-
The third project is to implement a smart editor for a subclass of UML
diagrams, in particular for statecharts. The first step for this is to
formalize syntax and semantics of this class of diagrams in linear logic,
the second step will then be to implement this specification in a linear logic
programming language interfaced with a suitable graphical front-end.
Students wanting to tackle these projects should preferably have a working
knowledge of Prolog.
Layout of Complex Diagrams with Evolutionary/Genetic Algorithms
Bernd Meyer
Automatic layout of diagrams is of growing practical importance in many
application areas, but it proves to be conceptually difficult and
computationally expensive. The layout of simple node-edge graphs is
meanwhile, after more than a decade of dedicated research, relatively well
understood, but little attention has been paid to more complex types of
diagrams such as state charts or message sequence charts in UML.
The project will study the application of stochastic optimization
methods, such as genetic algorithms, to the layout of more complex
types of diagrams. We are particularly interested in the layout of
Hi-Graphs and state charts. Important questions are the definition of
appropriate layout esthetics, the design of evolutionary methods for these
esthetics, and possibly the hybridization of such methods with other layout
techniques.
David Dowe
David Dowe is interested in Minimum Message Length (MML) inductive inference.
The MML principle is particularly useful in machine learning, statistics,
"knowledge discovery" and "data mining". Both theoretical and applied projects
are available, some of which are listed below, and all of which you should
feel free to discuss with David Dowe. Areas of interest include clustering
and mixture modelling, the von Mises circular distribution, single and
multiple factor analysis, supervised learning, decision trees and decision
graphs with or without leaf regressions, sequentially and spatially correlated
data, protein folding, DNA string alignment, the human genome project and
market forecasting. All of these would be done by MML. I am also interested
in MML inference of neural nets. There is no need to have done any 3rd
year subject on AI for any of these projects.
Hypothesis testing and model selection by MML and other methods
David Dowe
We compare various forms of hypothesis testing in machine learning,
statistics and ``data mining''. Some and possibly all of the tests we
consider will be:
classical, (maximum) likelihood-based tests,
(Bayesian) Minimum Message Length (MML) tests and
other, non-MML, Bayesian tests.
We will compare MML and other Bayesian hypothesis testing with classical
hypothesis testing. Preliminary results indicate that classical hypothesis
testing does not work as well as Bayesian hypothesis testing. Nonetheless,
we endeavour to explain why classical hypothesis testing works as well as
it does.
A good mathematical background will be necessary, and a background in
statistics would be an unnecessary bonus.
References :
Wallace, C.S. and D.M. Boulton (1968), An information measure for
classification, Computer Journal, 11(2), pp185-194.
Wallace, C.S. and D.L. Dowe (1999). Minimum Message Length and
Kolmogorov Complexity, Computer Journal (special issue on Kolmogorov
complexity), Vol. 42, No. 4, pp270-283.
http://www3.oup.co.uk/computer_journal/hdb/Volume_42/Issue_04/
http://www3.oup.co.uk/computer_journal/hdb/Volume_42/Issue_04/pdf/420270.pdf
A part of a chapter in Em. Prof. Chris Wallace's book.
MML, Occam's razor, inference and prediction
David Dowe
The Minimum Message Length (MML) principle of machine learning, statistics,
inductive inference, "knowledge discovery" and "data mining" is an
operational form of Ockham's razor. MML (Wallace and Boulton, 1968;
Wallace and Dowe, 1999) gives a quantitative trade-off between the simplicity
of a theory and how well if fits the data, and seeks a simple theory that
fits the data well. An attempt to discredit Ockham's razor was given by
Murphy and Pazzani (1994), and this has been responded to by Needham and Dowe
(2001). A more recent attempt to discredit Ockham's razor has been given by
Webb (1996). This work seems to leave itself open to many responses, both in
terms of theory and empirical experiments. The Hons. project is to continue
the work of Needham and Dowe (2001) and, at least initially, to respond to
(and refute?) Webb (1996).
A good mathematical background will be necessary.
References :
Needham, S.L. and D.L. Dowe (2001). Message Length as an Effective
Ockham's Razor in Decision Tree Induction. Proc. 8th International
Workshop on Artificial Intelligence and Statistics (AI+STATS 2001),
pp253-260, Key West, Florida, U.S.A., Jan. 2001.
Wallace, C.S. and D.M. Boulton (1968), An information measure for
classification, Computer Journal, 11(2), pp185-194.
Wallace, C.S. and D.L. Dowe (1999). Minimum Message Length and
Kolmogorov Complexity, Computer Journal (special issue on Kolmogorov
complexity), Vol. 42, No. 4, pp270-283.
http://www3.oup.co.uk/computer_journal/hdb/Volume_42/Issue_04/
http://www3.oup.co.uk/computer_journal/hdb/Volume_42/Issue_04/pdf/420270.pdf
C.S. Wallace and J.D. Patrick, `Coding Decision Trees', Machine Learning, 11,
pp7-22, 1993.
G. I. Webb, Further Experimental Evidence against the Utility of Occam's
Razor, J. Artif. Intell. Research 4 (1996), pp397-417.
MML Inference of Support Vector Machines
Minimum Message Length (MML) is a universal principle in machine learning,
"data mining" and statistics. It dates back to Wallace and Boulton (1968)
and is closely connected to the notion of algorithmic information theory
(Wallace and Dowe, 1999).
An increasingly popular approach in machine learning is Support Vector
Machines (SVMs) (Vapnik, 1995). SVMs have been used in various
problems, including classification, regression analysis and the
building of decision trees. They possess interesting properties and
have been demonstrated to give excellent performance in a number of both
theoretical and practical applications.
Recently some work has been done in presenting SVMs in an MML framework.
The aim of this project is to continue this work and to break new ground
by applying the SVMs to various problems such as classification of data into
multiple classes. Possible approaches will include a combinatorial coding
scheme and a scheme for encoding the separating hyperplane.
A good mathematical background will be necessary. Programming might be in
Java, but this is negotiable.
References:
Vapnik, V.N. (1995), The Nature of Statistical Learning Theory, Springer.
Wallace, C.S. and D.M. Boulton (1968), An information measure for
classification, Computer Journal, 11(2), pp185-194.
Wallace, C.S. and D.L. Dowe (1999), Minimum Message Length and Kolmogorov
Complexity, Computer Journal, 42(4), pp270-283.
Well-behaved long-period pseudo-random number generators
David Dowe
Pseudo-random number generators (RNGs) are useful for computer games including
dice-rolling, card-dealing and bingo, as well as for computer simulation
and for search strategies such as simulated annealing and genetic algorithms.
For a pseudo-RNG to be of any practical use in these and other applications,
it needs to have a long period (length of cycle) and to exhibit very low
levels of correlation between the bits it generates and other signs of
apparently random behaviour. This project will initially involve tests
of randomness (as given by D. E. Knuth and others) on an RNG of period
(2^32) * (2^32 - 1) due to C.S. Wallace (1989), as well as other RNGs due
to G. Marsaglia and others (see WWW site below). A main aim of this project
will be to extend the Wallace generator to periods (2^32) * (2^32 - 1)
* (2^32 - 3) and higher while still maintaining its randomness.
Reading : 1) C. S. Wallace, A long-period pseudo-random generator, Tech
Rept #89/123, Dept of Computer Science, Monash University, February 1989.
2) Find Random Number Generator by doing the following:
1. Go to here
2. Click on " C Programming " on left of the screen
3. Click on " Code Snippets "
4. Under the heading " Portable functions and headers " click on
" Random number functions "
5. Click on " Rand1.C "
MML Inference of Musical Composition Style
This project aims to develop programs for inferring something about a musical
composer's composition style, just from records of pieces written.
Initially, a program capable of inferring very simple aspects of composition
- such as rhythm, melody and/or pitch - would be developed, and tested.
The testing would include inference of style of appropriately simple
computer composers and inference of style of human composers. The testing
would also include listening to some of the compositions made using the
inferred composition style(s).
A long-term and at least initially unrealistic goal is for the program
to infer a composition style which can then write pieces both more complicated
than nursery rhymes and pleasant to the ear.
Some basic methods have been developed for doing this inference (Macmillan,
1997) and related inference (Jansen, Dowe and Farr, 2000); and we hope to
extend these. Several inference techniques are possible - among these, we
intend to apply the principles of Minimum Message Length (MML) inference.
Programming will almost certainly be in Java, C++ or C.
A good mathematical background and an interest in Minimum Message Length (MML)
will be necessary, and a background in music would be a bonus. The student
would be expected to become familiar with most of the references below.
References :
------------
Jansen, A.R, D.L. Dowe and G.E. Farr (2000), Inductive Inference of Chess
Player Strategy. Proceedings of the Sixth Pacific Rim International
Conference on Artificial Intelligence (PRICAI'2000), Lecture Notes in
Artificial Intelligence (LNAI) 1886 (Copyright Springer-Verlag), pp61-71.
http://www.csse.monash.edu.au/~tonyj/pricai2000cr.ps
Macmillan, G. Brendan (1997), "MML Analysis of Melody", Master of Computing
thesis, Faculty of Information Technology, Monash University, 1997.
Wallace, C.S. and D.M. Boulton (1968), An information measure for
classification, Computer Journal, 11(2), pp185-194.
Wallace, C.S. and D.L. Dowe (1999). Minimum Message Length and
Kolmogorov Complexity, Computer Journal (special issue on Kolmogorov
complexity), Vol. 42, No. 4, pp270-283.
http://www3.oup.co.uk/computer_journal/hdb/Volume_42/Issue_04/
http://www3.oup.co.uk/computer_journal/hdb/Volume_42/Issue_04/pdf/420270.pdf
Inductive Inference of Game Player Strategy.
David Dowe
This project aims to develop programs for inferring something about a game
player's strategy, just from records of games played.
The sort of games
considered here are board games like Chess. A long term aim is to be able
to take as input a record of the chess games of (e.g.) Garry Kasparov (World
Champion), and infer something (but not everything!) about his chess playing
strategy. This may be an ambitious goal, but we propose to move toward
it in achievable steps. Initially, a program capable of inferring very
simple aspects of strategy would be developed, and tested using records
of games played by appropriately simple computer players. We have developed
some basic methods for doing the inference, and expect to improve on them.
Several types of inference are possible; among these, we intend to apply
the principles of Minimum Message Length (MML) inference. This project
builds on a 1997 project by Tony Jansen.
(See
http://www.csse.monash.edu.au/~dld/chess.html.)
Programming will almost certainly be in C. Please consult with the
supervisor on recommended subjects to take.
Evolving intelligence
David Dowe
Intelligence would appear to consist of three main components : rote learning
(memory), deductive learning (logical or mathematical reasoning) and inductive
inference (generalisation, abduction, induction or pattern recognition).
Human-devised intelligence test questions seem to be very much about inductive
inference. Inductive inference can, in turn, be thought of as being about
two-part compression, where the first part of a compressed data string encodes
an inferred hypothesis about the data string and the second part of the
compressed string encodes the data given the hypothesis. In this project,
using fitness functions related to compressive ability, we use genetic
algorithm (GA) techniques to evolve an artificial intelligence which is capable
of doing inductive learning and two-part compression in an artificial life
environment.
Reading :
D L Dowe and A R Hajek (1998). A non-behavioural, computational
extension to the Turing Test, pp101-106, Proceedings of the International
Conference on Computational Intelligence & Multimedia Applications
(ICCIMA'98), Gippsland, Australia, February 1998.
http://www.dartmouth.edu/~phil/events/LoebnerPrize2000.html
Topological Containment
(for BCompSc/BDigSys/BSE Hons, but scaled down for BSE)
Graham Farr
The aim of this project is to write programs to assist
in discovering new mathematical results and algorithms on graphs.
We say that a graph H is topologically contained in a graph
G if G has a subgraph that can be obtained from H by replacing edges
with paths. For example, let G be a graph whose vertices represent
countries in Europe with two vertices being adjacent if the countries
they represent share a land border. Let H be a graph with four
vertices u, v, w, x, with each pair of vertices being adjacent, so its
six edges are uv, uw, ux, vw, vx, wx. Then G topologically contains
H, by the following correspondence: u = Switzerland, v = France, w =
Germany, x = Italy; edges uv, uw, ux, vw and vx (in H) correspond to
paths (in G) consisting of single edges joining the
relevant countries (e.g., uv corresponds to the simple path
Switzerland-France), but edge wx corresponds to a slightly longer
path: Germany-Austria-Italy. Note that these paths in G cannot
intersect, except at their endpoints. In general, the paths will
often not be single edges, and may be quite long.
Topological containment has been an important concept in graph
theory for decades. It has been used to characterise many important
classes of graphs (trees, planar, series-parallel ...). It is
therefore of interest to find efficient algorithms for determining
when H is topologically contained in G. However, if G and H are both
allowed to vary, as part of the input, then the problem is NP-hard.
Suppose we fix H, so the input consists just of G.
When given such an input graph G, we ask whether it topologically
contains our fixed graph H. This gives a new problem for each fixed H.
The good news is that, for each H, the problem can be solved in
polynomial time (due to Robertson & Seymour). The bad news is that
the polynomials that measure the running time are so huge that the
algorithms are unlikely to ever be practical in our universe.
So the quest for practical algorithms continues.
Practical algorithms are known only for a small number of fixed graphs H.
Work in this area seemed to stall somewhat about ten years ago, due
partly to the prohibitive number of cases that had to be considered in
the mathematical analysis. It should now be worth trying to write a program
to do this case analysis. This may lead to new mathematical
results and new practical algorithms.
This project is part of joint research planned with Prof Mike Fellows
(U of Newcastle) whose book, Parameterized Complexity (co-authored
with Prof Rod Downey), considers this and many other problems.
You do not need a maths degree to do this project! You should, however,
have aptitute in theoretical subjects like Algorithms and Data Structures,
Formal Methods I and II.
It may be sensible to use LEDA (= Library of Efficient Data types and
Algorithms), which has many programs and data structures for graphs
and is written in C++.
Graph Clustering via MML
(for BCompSc/BDigSys Hons)
Graham Farr
Suppose you have a large graph, which may represent a software
engineering diagram, an electronic circuit, a human social network, or
whatever. In many applications it is useful to be able to identify
clusters within the graph: nonoverlapping sets of vertices that are
more highly connected, among each other, than the graph as a whole.
In the full form of the problem, we allow clusters within clusters and
so on, too.
One area where this may be useful is in Graph Drawing, where we
seek to represent graphs on a (usually) 2-D surface in a pleasing way,
minimising things like area, numbers of bends, crossings, etc. The
clusters can be drawn separately, then these drawings can be arranged
with respect to each other so that the whole graph can be well drawn.
Thus, clustering allows a large graph drawing problem to be broken
into several smaller ones.
In this project we use the Minimum Message Length (MML) principle
(based on information theory) to measure how good a clustering is.
The aim will be to produce programs that find MML clusterings of graphs,
compare these with clusterings obtained by other methods, and
to use the programs to produce good drawings of graphs.
Faculty Timetabling
Maria Garcia de la Banda
The constraint programming (CP) paradigm has had remarkable industrial
success even though it is quite recent. This is because CP languages
currently provide one of the best approaches for developing efficient
solutions to many combinatorial problems such as planning, scheduling,
routing and investment.
This summer scholarship deals with existing room timetabling application
developed for a faculty in a Germany University using the constraint logic
programming language CHIP. The aim of the project is to (a) port the
timetabling application to the logic programming language SICStus and (b)
to modify it to deal with the timetabling requirements of the Monash IT
Faculty.
The student should, preferably, have a
knowledge of Prolog and graphical user interfaces.
Compiling OPL to HAL
Maria Garcia de la Banda, Kim Marriott
The constraint programming (CP) paradigm has had remarkable industrial
success even though it is quite recent. This is because CP languages
currently provide one of the best approaches for developing efficient
solutions to many combinatorial problems such as planning, scheduling,
routing and investment.
OPL is a recent high-level mathematical modelling language for describing
combinatorial problems. HAL is a constraint programming language which is
currently being developed by the Optimization & Constraint Solving research
group in collaboration with groups outside Monash University. The aim of
the project is to develop a OPL-to-HAL compiler.
The student should be either 3rd or 4th year and, preferably, have a
knowledge of Prolog. We are ready to fund an additional two-four weeks in
order to facilitate the completion of the project.
Charles Greif
Charles Greif will
not be offering any projects.
Use Of Attribute Grammars To Construct
XML Translators
John Hurst
XML translators provide a mechanism for manipulating
documents written in the extensible markup language XML, and
performing arbitrary translations on them. XML documents
represent an evolving standard for information management that
allows the direct creation web documents, paper documents, and
database records. Derived from the SGML (Standard General
Markup Language) standard, XML provides for the separation of
document structure and document presentation, and is a very
flexible document protocol.
This project is to develop an attribute grammar (AG) based
description of possible XML translations, in much the same
style as is used in attribute grammar based compiler
generators, such as Eli. Just as compilers translate from
source to target representations, so this project will develop
a translator that can accept an AG of the translation to be
performed on arbitrary XML documents.
The Prefix-Content-Suffix Model Of XML
Translation
John Hurst
AXE
(http://www.csse.monash.edu.au/~ajh/research/doctech/index.html)
is an example of an XML translator. It is written in Perl,
and accepts a translation specification that converts XML
documents into other quite arbitrary document formats. For
example, it has been used to build all the web pages rooted at
http://www.csse.monash.edu.au/~ajh. It uses the paradigm of
specifying translations for each element as a prefix
component, a nested content component, and a suffix component.
It can be used as a declarative model, but becomes far more
powerful when each of these components may themselves be
procedural.
However, it is still a somewhat prototypical implementation,
and relies upon escaping into Perl code within the translation
file to perform some of the more arcane translations needed,
and it uses the restrictive SAX left-to-right translation
model. This project has two main goals: to rewrite AXE using
Python and the DOM tree-based evaluator, and to design
translation specifications that do not rely upon escaping into
embedded code.
This project and the previous one may be undertaken
conjointly or separately, and are not dependent upon each
other, although there are obvious areas of overlap.
Argumentation Game
Kevin Korb
A game in the style of the original computer adventure games is
envisioned in which the protaganist is required both to analyze a
sequence arguments, identifying conclusions, spotting fallacies,
etc. and to construct arguments in order to advance. The arguments
will be organized into one or more themes, such as arguments in a
court setting, or detective work (Sherlock Holmes style), or medical
detection.
Causal Learning
Kevin Korb
This project will continue work with CaMML in using MML (Bayesian)
inference for learning causal structure from statistical and
experimental data. This will involve work with a team of PhD students
and others. The exact role within the team is negotiable and may
involve one or more of: philosophy of causality; mathematics of MML;
statistics of causal inference; experimental work; creativity.
Programming in JAVA will be required.
ALife Simulation
Kevin Korb
The use of Artificial Life techniques to simulate social behavior.
I use ALife to investigate economic issues -- such as the nature of
money, transaction costs, market efficiency, etc. -- and ethical
issues -- such as the evolution of altruistic behavior. Projects
will combine theory with experimentation in one or the other
of these domains.
Bayesian Poker
Poker is an ideal vehicle for testing automated reasoning under
uncertainty. It introduces uncertainty both by physical randomization
and by incomplete information about opponents' hands. Another source
of uncertainty is the limited information available to construct
psychological models of opponents, their tendencies to bluff, play
conservatively, reveal weakness, etc. and the relation between their
hand strengths and betting behavior. All of these uncertainties must
be assessed accurately and combined effectively for any reasonable
level of skill in the game to be achieved, since good decision making
is highly sensitive to those tasks. We have developed a Bayesian Poker
Program (BPP), which uses a Bayesian network to model the program's
poker hand, the opponent's hand and the opponent's playing behavior
conditioned upon the hand, and betting curves which govern play given
a probability of winning. The history of play with opponents is used
to improve BPP's understanding of their behavior. BPP has been
compared experimentally with: a simple rule-based system; a program
which depends exclusively on hand probabilities (i.e., without
opponent modeling); and with human players. BPP has shown itself to be
an effective player against all these opponents, barring the better
humans.
This project involves extending and improving BPP. Possible tasks include:
Modifying the system to play "Texas Hold'em" and running
it in competition with alternative AI programs
which play Texas Hold'em
Modifying the system to use Dynamic Bayesian Networks, to
represent the play of the hand over time.
Addition of bluffing to opponent model.
Improved learning methods.
Cleaning Dirty Data
Kevin Korb
The booming field of "Knowledge Discovery in Databases" (KDD) or "Data
Mining" combines techniques developed in database theory with machine
learning. A number of techniques unique to this area have been
developed for coping with the inevitable noise in data, which results
in mismeasurement and missing values. KDD people typically clean
their data before providing it to their inferential or statistical
programs. There are excellent theoretical reasons for believing that
"cleaning" data throws out good information with the bad. This
project will look at the effect of cleaning data by throwing it out
and the prospects for the alternative of recycling dirty bath water.
Carlo Kopp
Please refer to SAN project co-supervised with Ron Pose.
Gordon Lowe
Gordon will be on OSP in 2002 and will not offer any projects.
Kim Marriott
Kim will be on OSP in semester 1 and will not offer any projects.
Ann will be on leave in semester 1 and will not offer any projects.
Development of High Resolution Ultrasonic Imaging System.
Andrew Paplinski,
Nandita Bhattacharjee, Charles Greif
An ultrasonic imaging system comprise three subsystems:
1. The analog front-end transmits signals between the transducer array
and the signal processing
subsystem.
2. The digital signal processing subsystem, which applies appropriately
delayed signals
to each transducer element to transmit a focused ultrasonic pulse wave
in the selected steering
direction. The reflected signals are first processed by the analog
front-end, and then
appropriately delayed, weighted and summed to form the resultant beam.
The beam is
dynamically focused at different ranges along the direction of the
transmitted ultrasonic pulse.
3. The imaging subsystem performs logarithmic compression of the pixels
from the signal processing
part, converts the scan lines into rectangular coordinates and displays
image frames of the
anatomy being scanned.
Specific projects are available to participate in the development of all
three subsystems.
Modelling Autism using Self-Organizing Maps
Andrew Paplinski and Lennart Gustafsson (Lulea Uni. Tech.)
Autism is a developmental disorder first described by Kanner
(1943) and Asperger (1944). Attention shift impairment and
the negative response to novelty are prevalent in individuals
with autism and are considered to be primary to other deficits
in autism, namely, impairments in social interaction, impairments
in verbal and non-verbal communication, and restricted repertoire
of activities and interests.
Theories on causes of autism, based on properties of artificial
neural networks, have been presented by Cohen (1994) and
Gustafsson (1997). Based on their work the purpose of this
project is to examine how the attention shift
impairment and novelty avoidance influence the self-organization
of an artificial neural network and to discuss the characteristics
of the resulting maps.
Existing prototype model has been written in MATLAB and will
be a starting point for the project.
Computer Controlled Car
Ron Pose and Lloyd Allison
See supervisors for details.
Suburban Area Networks
SANs are self organising ad hoc wireless networks in which household
computers in a suburb can be networked, bypassing the fixed wiring
infrastructure. Using current commodity hardware, such links can
operate at speeds of up to 11 Mbit/s. Future wireless hardware may
permit higher bit rates. In such a network, multiple households can
participate, each using a mast mounted 2.5 or 5.8 GHz microwave link
antenna. A number of possible projects are available, involving work
with protocols, traffic encryption or wireless propagation.
Information Theoretic Image Thresholding
Peter Tischer and Sid Ray
Image thresholding is a quick way to split the pixels of an image
into 2 classes. Often this is desirable if the image consists of a
foreground region and a background region. Most thresholding techiques
use the histogram of the intensity distribution and a large number of
thresholding techniques make use of principles from information theory
to approximate the histogram with two probability distributions - one
representing foreground pixels and one representing background.
The aim of the project is to investigate the mixture modeling
approach to image thresholding. Of particular interest is the use of
the Kullback-Leibler measure as an indicator of how well the histogram
is approximated. Since mixture modeling is a particular strength of the
MML approach to inductive inference, the project will also use the MML
mixture modeling program, 'snob', and apply it to the problem of finding
optimal mixtures with two or more probability distributions.
This project does not require any specific background knowledge of
image processing, information theory or MML.
Generating Support Vector Machines which minimize entropy of
prediction error
Support Vector Machines have been used in various machine learning
problems, including classification, regression analysis and the
building of decision trees. They process interesting properties and
have been demonstrated to give excellent performance in a number of
practical applications.
Also recently, in the field of image compression, iterative techniques
have been developed which generate linear predictors that minimize the
entropy of the prediction error distribution.
In this project we will investigate whether these recent techniques for
generating linear predictors can be adapted to generate Support Vector
Machines which minimize the entropy of the prediction distribution, and
compare the performance of these Support Vector Machines with ones created
by standard methods.
Bayesian Image Binarization
Peter Tischer
Image binarization involves taking the pixels of a 2D image and
placing them into two classes. Image thresholding is a particular
example of image binarization. The Iterated Conditional Mode (ICM)
algorithm is a Bayesian technique for putting pixels from images into
particular classes. In doing so it tries to exploit the fundamental
property of images that pixels tend to be similar in value to their
neighbours.
As a Bayesian technique, ICM has an inbuilt model of how the value of
a pixel is related to the values of its neighbours. The aim of the
project is to implement the ICM algorithm and investigate how it can be
modified to build up its model of interpixel dependency from the image
itself. The work can begin by trying to put pixels into two classes but
it should be possible to generalise the approach to include larger
numbers of classes.
This project does not require any specific background knowledge of
image processing or Bayesian statistics.
Processing of fMRI data
Peter Tischer
fMRI stands for functional Magentic Resonance Imaging. MRIs are
three dimensional images that are becoming widely used in medicine
and medical research. In fMRI a sequence of 3D MRIs of the brain are
taken while the subject is alternately carrying out some task or staying
inactive. A primary aim of analysing fMRIs is to determine which
regions of the brain are active when the task is carried out.
The project involves a number of image processing tasks that are
carried out in the analysis of fMRIs. Since the images are taken over a
number of seconds, the subject may move. Therefore there is a need to
estimate the amount of movement from one image to another and to try to
compensate for the effects of motion so different 3D images can be
compared. This involves a process called image registration.
Both the image capture process and the motion compensation may
introduce anomalous values into the data. Therefore one aim of the
project would be different ways of filtering the data to remove
anomalous values.
As a preliminary step, the processing of fMRIs tries to identify
which volume elements (voxels) of the 3D picture represent regions of
the brain that were active in the task. Another aim of the project
would be to investigate ways of producing binary 3D images whose voxels
indicate whether a particular voxel in the original MRIs was active or
inactive. Apart from identifying active voxels, the processing would
also try to identify connected regions of active brain regions.
This project does not require any specific background knowledge of
image processing.
Segmenting MRI brain images
Peter Tischer
MRI stands for functional Magentic Resonance Imaging. MRIs are
three dimensional images that are becoming widely used in medicine
and medical research. An important step in the processing of MRIs
of the human brain is segmenting the volume elements (voxels) into
small number of classes. Typically, this means 4 classes: White Matter
(WM), Grey Matter (GM), Cerebral-Spinal Fluid (CSF) and Other.
One of the most effective programs for segmenting brain images is a
program called FAST which uses the Iterated Conditional Mode (ICM)
algorithm to produce segmentations where a particular voxel is highly
likely to have similar properties to those of its neighbours.
As part of its normal execution the FAST algorithm also tries to
estimate the bias field associated with the particular MRI scanner that
was used to capture the image. The bias field is an indication of how
the magnetic field strength varied over the 3D volume in which the
subject was placed.
The project will investigate ways of improving the FAST program by
considering different models for the bias field, for the statistical
dependence of one voxel with its neighbours and for the way that the
intensity within a particular tissue class changes spatially.
This project does not require any specific background knowledge of
image processing or Bayesian statistics.
Optimisation of fuzzy control systems using genetic and
other algorithms
Bin Qiu
This project investigates genetic and other algorithms for the optimisation
of fuzzy controllers in terms of membership functions and linguistic rules.
A fuzzy logic based Connection Admission Control (CAC) function
Bin Qiu
This project is concerned with Connection Admission Control (CAC) in ATM
networks. The decision of source admission into a multimedia communication
network is a very active research area. Analytical solutions presented
so far are not adequate under a variety of network situations and source
characteristics. Fuzzy logic based CAC function may provide a breakthrough,
and therefore this research topic has the potential to be developed into
a major research project.
Using a fuzzy logic based scheduling algorithm to provide smart and
efficient load-balancing for clustered 'virtual' Internet servers.
Bin Qiu
This first part is taken from http://www.linuxvirtualserver.org/ :-
"With the explosive growth of the Internet and its increasingly important
role in our lives, the traffic on the Internet is increasing dramatically,
which has been growing at over 100% annual rate. The workload on the
servers is increasing rapidly so that servers will be easily overloaded
for a short time, especially for a popular web server. To overcome the
overloading problem of the servers, there are two solutions. One is the
single server solution, i.e. to upgrade the server to a higher performance
server, but it will soon be overloaded when requests increase so that we
have to upgrade it again, the upgrading process is complex and the cost is
high. The other is the multi-server solution, i.e. to build a scalable
server on a cluster of servers. When load increases, we can simply add a
new server or more into cluster to meet the increasing requests. However,
there are several methods to construct the cluster of servers.
Now the widely used one is Round-Robin DNS, which maps a single name to
the different IP address in a round-robin manner; thus different clients
will be mapped to different servers in the cluster for the ideal
situation. In this way, the load is distributed among the servers.
However, due to the caching nature of clients and hierarchical DNS system,
it easily leads to dynamic load imbalance among the servers, thus it is
not easy for a server to handle its peak load. ...
An even better way is to use a load balancer to distribute load among
servers in a cluster. The parallel services of servers can be made to
appear as a virtual service on a single IP address, so that the end users
see a virtual server, not a cluster of servers. The scheduling granularity
is per connection, which can make a sound load balance among the servers.
Fails can be masked when one server or more fail. Server management is
becoming easy, and administrator can take a server or more in and out of
service at any time, which won't interrupt services to users.
Load balancing can be done in two levels, application-level and IP-level.
..."
This project will pertain to IP-level load balancing.
Currently, the load balancing algorithms implemented in the
LinuxVirtualServer (LVS) consist of: Round-Robin, Weighted Round-Robin,
Least-Connection, and Weighted Least-Connection. These are fairly basic
algorithms that provide an adequate working load-balancing solution.
Following on from the idea that network communications aims for 1) Quality
Of Service (QoS) and 2) efficiency, it is proposed that by using a fuzzy
logic based algorithm for load-balancing virtual servers, it will
(possibly) improve both of these.
The aim would be to get the algorithm to efficiently load balance traffic
between nodes in the virtual server cluster in a 'smart' fashion... whilst
also making sure that the algorithm is lightweight enough to be practical.
The project tasks would be as follows:
1) Research the algorithms currently in use in LVS.
2) Research alternative fuzzy logic algorithms that could be used in load
balancing.
3) Assess the performance of currently implemented scheduling algorithms
in LVS (by simulation). Compare these and discover under what traffic
conditions each works best.
4) Investigate the performance of some new fuzzy logic based scheduling
algorithm. Compare the performance of the new algorithm with the existing
algorithms (by simulation). Determine the traffic conditions under which
the new algorithm operates best.
5) Implement the fuzzy logic based load balancing algorithm into LVS.
6) Assess LVS in operation?
The project could be extended to do smart prediction of overloading of the
load balancer and thus bring up secondary load balancers. Further,
bringing in more servers into the cluster intelligently as load / traffic
requires.
Sid Ray
The aim of the project will be to make a critical study of a number of
existing non-hierarchical cluster analysis methods, such as, c-means, fuzzy
c-means, and isodata, with a view to their application in image segmentation.
Study of cluster validity will form an essential part of the project. Applications
will include both monochrome and colour images.
Content-Based Image Retrieval
Sid Ray
Content-based image retrieval is a well-known method of image retrieval.
The objective of this project is to devise an algorithm for retrieval of
images from a large image data base based on colour quantisation and regional
quantisation.
Texture Analysis of Images
Sid Ray
Texture plays an important role in both human interpretation of visual
scenes and computer analysis of images. Textural cues are of particular
relevance in two different, but related, image analysis problems, namely,
the problems of (i) segmentation and (ii) classification of images. The
proposed project will deal with both of these problems. It will involve
two phases. In the first phase some existing texture analysis methods will
be investigated from the point of view of their theoretical soundness as
textural measures as well as their practical applicability. In the second
phase attempts will be made to derive a new methodology with particular
attention to its computational efficiency.
Document Image Analysis System
Sid Ray
The objective of document image analysis is to recognize text, graphics,
and pictures in printed documents and to extract the intended information
from them. There are two broad categories of document image analysis, namely,
textual processing and graphical processing. Textual processing includes
skew determination (any tilt at which the document may have been scanned),
finding columns, paragraphs, text lines and words, and performing optical
character recognition. Graphical processing deals with lines and symbols.
The scope of the proposed project will be the aspect of text processing
and it will concentrate on the development of a system for page layout
analysis.
Selection of Features for Pattern Recognition
Sid Ray
Experience shows that in recognition of patterns by humans, the recognition
is made based on only a few important features (or attributes) characterizing
the pattern classes. Conversely, in statistical pattern recognition, the
patterns are often represented by a large number of numerical features.
Although there is no conceptual justification in reducing the number of
features to a small number, in practical problem solving, this becomes
a necessary step due to the wellknown phenomenon of the 'curse of dimensionality'
of the feature vector on the complexity of the pattern classifier. The
aim of the proposed project is to develop an interactive feature selection
paradigm well-suited to multiclass (rather than 2-class) pattern recognition
problems.
The project will comprise four components, namely, (i) theoretical comparison
of existing feature selection criteria, (ii) development of software tools
for the existing criteria, followed by an experimental investigation of
them, (iii) analysis of the above results leading to, hopefully, a feature
selection paradigm, and (iv) development of an interactive software tool
for the above paradigm. The software developed will be 'interactive' in
the sense that depending on the classification accuracy arrived at a certain
stage, the user will have the option of changing the dimensionality value.
Options will also be available to supply interactively a range of values
of the dimensionality. The software tools will include procedures for displaying
the distribution of pattern samples in different feature spaces obtained
by different feature selection methods.
Heinz Schmidt and Ralf Reussner
In modern component-based software engineering (CBSE), and particular
in peer-to-peer (P2P) computing, software components often have to be
adapted. The reason is, that in CBSE components are often re-used in
contexts, where they have not been developed for. In P2P components
have to be adapted, because ften a systems have to interoperate
instantaneously with other systems, often unforeseen by the system's
developer.
In both areas, the adaptation needed to make a component interoperable
in a specific enivironment is unforeseen by the component
developer. Hence, it has to be performed automatically by the
component user. The TrustME project of the CRC for Enterprise
Distributed Systems Technology Pty Ltd (DSTC) deals with extensions of
industrial middleware platforms (like Microsoft's .NET or Sun
Microsystems EJB). In this project, we developed first approaches for
automatic adaptor generation.
In this particular student project, the student is expected to work in
the level of components adaptation. Therefore, an understanding of
finite state machines is required (or the willingness to aquire such
knowledge). In particular, the student is expected, to enhance an
existing API of a component configuration library protocol adaptation
mechanisms. This includes the implementation and enhancement of
existing algorithms for protocol adaptation.
Image compression via context coding of error bitplanes
Rod Worley
Image data may be compressed by predicting pixel values, forming an
image of the differences of the true and predicited values. The error
image may be coded efficiently by many techniques.
This project will look at the efficiency of compressing the error
image by converting it to bitplanes and using context coding to
compress the bitplane data.
Chintha Tellambura
Chinta will not offer any projects.
Henry Wu
Henry will be on OSP in 2002 and will not offer any projects.
A short-answer evaluation system
Ingrid Zukerman, Norm Eizenberg
(Dept. of Anatomy and Biology, University of Melbourne)
This project consists of developing a system which will give feedback to
students' short answers to questions. The project consists of the following
components:
- An interface which poses questions, accepts students' answers and provides
feedback. The feedback considered at present consists of an indication regarding
the number and type of missed elements in an incomplete or wrong answer.
- A mechanism which determines the quality of the students' answers. The
envisioned mechanism will apply statistical techniques to compare
students' answers with model answers supplied by a lecturer. In particular,
Latent Semantic Analysis will be investigated, and the feasibility of
an alternative approach based on edit-distance will be considered.
- A simple authoring tool, where lecturers will enter the different elements of
a model answer, e.g., essential components, or incidental components.
The main domain of implementation is Medicine, where a repository of
questions and answers is already available. A Computer Science application
is also possible.
The student undertaking this project should have some grounding in statistics.
Disclaimer