Modelling-Alignment for Non-Random Sequences (AI2004)

David R. POWELL, Lloyd ALLISON, & Trevor I. DIX


AI2004, Springer Verlag, LNCS Vol.3339, pp.203-214, 2004.

Abstract. Populations of biased, non-random sequences may cause standard alignment algorithms to yield false-positive matches and false-negative misses. A standard significance test based on shuffling sequences is only a partial solution, applicable to populations that can be described by simple models. Masking out intervals of low information content throws information away. We describe a new and general method, modelling-alignment: population models are incorporated into the alignment process, which can (and should) change the rank-order of matches between a query sequence and a collection of sequences, compared to results from standard algorithms. The new method places very few conditions on the nature of the models that can be used with it. We apply modelling-alignment to local alignment, global alignment, optimal alignment and the relatedness problem.
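The shuffling-based significance test mentioned in the abstract can be illustrated with a minimal sketch. It is not the PRSS program and not the modelling-alignment method; the function names, the scoring parameters (match=1, mismatch=-1, gap=-2) and the use of a plain global alignment score are assumptions made for illustration only. Shuffling preserves the letter counts of a sequence but destroys any higher-order structure, which is one way to see why such a test suits only populations described by simple models.

```python
import random

def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global (Needleman-Wunsch) alignment score with linear gap costs.

    The scoring parameters are illustrative assumptions, not taken from the paper.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

def shuffle_p_value(query, target, trials=200, seed=0):
    """Empirical p-value: how often a shuffled target scores at least as well.

    Hypothetical helper; the +1 terms avoid reporting a p-value of exactly zero.
    """
    rng = random.Random(seed)
    observed = global_score(query, target)
    letters = list(target)
    hits = 0
    for _ in range(trials):
        rng.shuffle(letters)  # keeps composition, destroys letter order
        if global_score(query, "".join(letters)) >= observed:
            hits += 1
    return observed, (hits + 1) / (trials + 1)
```

For example, `shuffle_p_value("ACGTACGT", "ACGTACGT")` returns the observed score together with an estimate of how unusual that score is relative to shuffled versions of the target.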

Results: As expected, modelling-alignment and the standard PRSS program from the FASTA package have similar accuracy on sequence populations that can be described by simple models, e.g. 0-order Markov models. However, modelling-alignment has higher accuracy on populations that are mixed or that are described by higher-order models: it gives fewer false positives and false negatives, as shown by ROC curves and other results from tests on real and artificial data.
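As a reminder of how an ROC curve summarises false positives and false negatives, the following minimal sketch builds one from two lists of alignment scores: scores for pairs known to be related and scores for pairs known to be unrelated. The function names and the input format are assumptions for illustration; this is not the evaluation code used for the paper.

```python
def roc_points(pos_scores, neg_scores):
    """Return (false-positive rate, true-positive rate) pairs, one per threshold.

    pos_scores: scores of truly related pairs; neg_scores: scores of unrelated pairs.
    A pair is called a match when its score is >= the threshold.
    """
    thresholds = sorted(set(pos_scores) | set(neg_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in pos_scores) / len(pos_scores)
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        points.append((fpr, tpr))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

A method that ranks every related pair above every unrelated pair gives an area of 1.0, while a method whose scores carry no information gives an area near 0.5.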

Availability: An implementation of the software is available via the Web [see top left].

Partially funded by Australian Research Council (ARC) grant A49800558.

Also see: [Comp.J.'99] and seminars [1], [2].

[Figure: example ROC curve]

© L. Allison   (or as otherwise indicated),
Faculty of Information Technology (Clayton), Monash University, Australia 3800 (6/'05 was School of Computer Science and Software Engineering, Fac. Info. Tech., Monash University,
was Department of Computer Science, Fac. Comp. & Info. Tech., '89 was Department of Computer Science, Fac. Sci., '68-'71 was Department of Information Science, Fac. Sci.)