Modelling-Alignment for Non-Random Sequences (AI2004)

David R. POWELL, Lloyd ALLISON, & Trevor I. DIX

LA home
Computing
 Algorithms
 Bioinformatics
 FP,  λ
 Logic,  π
 MML
 Prog.Langs

Bioinformatics
 Alignment
  AI2004
   software

Also see:
 Comp.J.'99
 Compression
MML

AI2004, Springer Verlag, LNCS Vol.3339, pp.203-214, 2004.

Abstract. Populations of biased, non-random sequences may cause standard alignment algorithms to yield false-positive matches and false-negative misses. A standard significance test based on the shuffling of sequences is a partial solutions, applicable to populations that can be described by simple models. Masking-out low information content intervals throws information away. We describe a new and general method, modelling alignment: Population models are incorporated into the alignment process, which can (and should) lead to changes in the rank-order of matches between a query sequence and a collection of sequences, compared to results from standard algorithms. The new method is general and places very few conditions on the nature of the models that can be used with it. We apply modelling-alignment to local alignment, global alignment, optimal alignment and the relatedness problem.

Results: As expected, modelling-alignment and the standard PRSS program from the FASTA package have similar accuracy on sequence populations that can be described by simple models, e.g. 0-order Markov models. However, modelling-alignment has higher accuracy on populations that are mixed or that are described by higher-order models: It gives fewer false positives and false negatives as show by ROC curves and other results from tests on real and artificial data.

Availability: An implementation of the software is available via the Web [see top left].

Partially funded by Australian Research Council (ARC) grant A49800558.

Paper:
[link]['04].
Preprint: [PP.ps]
Also see: [Comp.J.'99] and seminars [1], [2].

ROC curves
e.g. ROC curve
window on the wide world:

Computer Science Education Week

Linux
 Ubuntu
free op. sys.
OpenOffice
free office suite,
ver 3.4+

The GIMP
~ free photoshop
Firefox
web browser
FlashBlock
like it says!

© L. Allison   http://www.allisons.org/ll/   (or as otherwise indicated),
Faculty of Information Technology (Clayton), Monash University, Australia 3800 (6/'05 was School of Computer Science and Software Engineering, Fac. Info. Tech., Monash University,
was Department of Computer Science, Fac. Comp. & Info. Tech., '89 was Department of Computer Science, Fac. Sci., '68-'71 was Department of Information Science, Fac. Sci.)
Created with "vi (Linux + Solaris)",  charset=iso-8859-1,  fetched Saturday, 19-Apr-2014 17:04:07 EST.