Modelling-Alignment for Non-Random Sequences: 17th Australian Joint Conf. on Artificial Intelligence (AI2004)

Modelling-Alignment for Non-Random Sequences (AI2004)

David R. POWELL, Lloyd ALLISON, & Trevor I. DIX


AI2004, Springer Verlag, LNCS Vol.3339, pp.203-214, 2004.

Abstract. Populations of biased, non-random sequences may cause standard alignment algorithms to yield false-positive matches and false-negative misses. A standard significance test based on the shuffling of sequences is a partial solution, applicable to populations that can be described by simple models. Masking out intervals of low information content throws information away. We describe a new and general method, modelling-alignment: population models are incorporated into the alignment process, which can (and should) lead to changes in the rank-order of matches between a query sequence and a collection of sequences, compared to results from standard algorithms. The new method is general and places very few conditions on the nature of the models that can be used with it. We apply modelling-alignment to local alignment, global alignment, optimal alignment and the relatedness problem.
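To make "low information content" concrete, the following is a minimal sketch, not the authors' software: it measures how many bits a DNA sequence costs under an adaptive Markov model of a chosen order. The function name `info_bits`, the add-one smoothing and the four-letter alphabet are all assumptions made for illustration. A highly repetitive sequence costs far fewer bits under an order-1 model than under an order-0 model, which suggests why matches between such sequences are weaker evidence of relatedness than a standard algorithm would assume.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumed alphabet, for illustration only


def info_bits(seq, order=0):
    """Code length of seq, in bits, under an adaptive order-`order` Markov model.

    Each symbol is predicted from counts of the symbols that followed the
    same context earlier in the sequence; counts start at 1 (add-one
    smoothing), so the first symbol in any context costs log2(4) = 2 bits.
    """
    counts = defaultdict(lambda: {c: 1 for c in ALPHABET})
    total = 0.0
    for i, ch in enumerate(seq):
        context = seq[max(0, i - order):i]
        table = counts[context]
        total += -math.log2(table[ch] / sum(table.values()))
        table[ch] += 1  # update the model after coding the symbol
    return total


if __name__ == "__main__":
    repetitive = "AT" * 50
    print("order 0:", round(info_bits(repetitive, 0), 1), "bits")
    print("order 1:", round(info_bits(repetitive, 1), 1), "bits")
```

The order-1 figure is much smaller because, once the model has seen a few repeats, each next symbol is almost certain. Any model that can supply such per-symbol probabilities could be substituted here; this sketch does not show how the paper combines them with the alignment itself.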

Results: As expected, modelling-alignment and the standard PRSS program from the FASTA package have similar accuracy on sequence populations that can be described by simple models, e.g. 0-order Markov models. However, modelling-alignment has higher accuracy on populations that are mixed or that are described by higher-order models: it gives fewer false positives and false negatives, as shown by ROC curves and other results from tests on real and artificial data.
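For readers unfamiliar with ROC curves, here is a small sketch of how one can be computed from a list of match scores and known related/unrelated labels. It is a generic illustration, not the evaluation code used in the paper; the names `roc_points` and `auc` and the toy scores are hypothetical, and tied scores are not given any special treatment.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) points.

    scores: higher means "more likely related".
    labels: True for truly related pairs, False for unrelated pairs.
    The threshold is swept from the highest score to the lowest.
    """
    positives = sum(1 for x in labels if x)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _score, related in ranked:
        if related:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    scores = [0.9, 0.8, 0.3, 0.1]
    labels = [True, True, False, False]
    print(roc_points(scores, labels))
    print(auc(roc_points(scores, labels)))
```

A method with fewer false positives and false negatives produces a curve that rises more steeply toward the top-left corner and therefore has a larger area under it.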

Availability: An implementation of the software is available via the Web [see top left].

Partially funded by Australian Research Council (ARC) grant A49800558.

Paper: [link]['04]; Preprint: [PP.ps]
Also see: [Comp.J.'99] and seminars [1], [2]; e.g. ROC curve.


© L. Allison http://www.allisons.org/ll/ (or as otherwise indicated),
Faculty of Information Technology (Clayton), Monash University, Australia 3800 (6/'05 was School of Computer Science and Software Engineering, Fac. Info. Tech., Monash University, was Department of Computer Science, Fac. Comp. & Info. Tech., '89 was Department of Computer Science, Fac. Sci., '68-'71 was Department of Information Science, Fac. Sci.)
