[01] >>

Pattern Discovery in Plasmodium falciparum DNA

Work by: Linda Stern [1], Lloyd Allison [2], Ross Coppel [3] & Trevor Dix [2]
June 2001

[1] Department of Computer Science and Software Engineering, The University of Melbourne, Australia 3010.
[2] School of Computer Science and Software Engineering, Monash University, Australia 3800.
[3] Department of Microbiology, Monash University, Australia 3800.

This document is online at:  www.csse.monash.edu.au/~lloyd/Seminars/2001-Malaria/index.shtml   and contains hyper-links to more resources.

See: L. Stern, L. Allison, R. L. Coppel, & T. I. Dix. Discovering patterns in Plasmodium falciparum genomic DNA. Molecular and Biochemical Parasitology, 118, pp. 175-186, 2001.

<< [02] >>

We use


<< [03] >>
[Figure: Probabilistic Finite State Automaton (PFSA) for DNA, equivalent to a Hidden Markov Model (HMM)]
Also see [ISMB 1998]

<< [04] >>
 
[Figure: Plasmodium falciparum chromosome 2]
Above: 1-D plot of information per base, chr2, Plasmodium falciparum
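
As a rough illustration of what "information per base" means, the sketch below computes -log2 P(base | context) at each position under a simple adaptive order-k Markov model. This is a stand-in chosen for brevity, not the model used in this work; the function name `info_per_base`, the order `k` and the toy sequence are all hypothetical.

```python
import math
from collections import defaultdict

def info_per_base(seq, k=2):
    """Return -log2 P(base | previous k bases), in bits, for every position.

    Probabilities come from adaptive counts with +1 smoothing, so each base
    is scored before the model is updated with it.
    """
    counts = defaultdict(lambda: dict.fromkeys("ACGT", 1))
    bits = []
    for i, base in enumerate(seq):
        c = counts[seq[max(0, i - k):i]]   # counts for the current context
        bits.append(-math.log2(c[base] / sum(c.values())))
        c[base] += 1                       # learn from the base just scored
    return bits

seq = "ACGT" * 50            # hypothetical, highly repetitive toy sequence
bits = info_per_base(seq)
```

Plotting `bits` (or a smoothed version) against position gives a 1-D plot of the same general kind: predictable, repetitive regions fall well below 2 bits per base.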

<< [05] >>
[Figure: Plasmodium falciparum chromosome 3]
Above: Chromosome 3

<< [06] >>
[Figure: Plasmodium falciparum chromosome 3, given prior knowledge of chromosome 2]
Above: Chr3|Chr2, i.e. chromosome 3 given chromosome 2. Note, e.g., that the (sub)telomeric regions are (much) more related than chance, and the new spike at ~130kb ... new information about Chr3 from Chr2.
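
The idea of measuring one sequence given another can be sketched with a toy model: train the counts on the known sequence first, then score the new one. The code below uses a simple adaptive order-k Markov model as a stand-in, not the model used in this work; `total_bits`, `chr_a`, `chr_b` and the choice `k=8` are hypothetical.

```python
import math
import random
from collections import defaultdict

def total_bits(target, prior="", k=8):
    """Bits needed to encode `target` with an adaptive order-k Markov model,
    optionally after first training the counts on `prior`."""
    counts = defaultdict(lambda: dict.fromkeys("ACGT", 1))  # +1 smoothing
    bits = 0.0
    for seq, scored in ((prior, False), (target, True)):
        for i, base in enumerate(seq):
            c = counts[seq[max(0, i - k):i]]
            if scored:                      # only the target is charged for
                bits -= math.log2(c[base] / sum(c.values()))
            c[base] += 1
    return bits

rng = random.Random(0)
chr_a = "".join(rng.choice("ACGT") for _ in range(2000))  # toy "known" sequence
chr_b = chr_a[500:1000]                                   # toy sequence sharing a region with chr_a

alone = total_bits(chr_b)                # no prior knowledge
given = total_bits(chr_b, prior=chr_a)   # conditional on chr_a
```

Where `given` is well below `alone`, the known sequence carries information about the new one; regions with no saving are new information.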

<< [07] >>
[Figure: Plasmodium falciparum telomeric region]
Above: 5' end of chromosome 2

<< [08] >>
[Figure: Plasmodium falciparum subtelomeric regions]
Above: 5' ~ 3' subtelomeric regions of chromosome 2

<< [09] >>
[Figure: Plasmodium falciparum subtelomeric regions]
Above: Chromosome 3, 5' ++ 3' subtelomeric regions. Simpler history than chr 2.

<< [10] >>
[Figure: Plasmodium falciparum SERA cluster]
Above: Serine repeat antigen (SERA) cluster, chromosome 2.

<< [11] >>
[Figure: Plasmodium falciparum]
Above: As before, but with exon 4 of PFB0345c appended (circled)

<< [12] >>
[Figure: Plasmodium falciparum, dot plot by Dotter]
Above: As before, but by Dotter - "snow" from uninformative chance matches.
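
To see where "snow" comes from, consider a naive exact-match dot plot (a minimal sketch, not Dotter's own scoring; `dot_plot` and the toy sequence are hypothetical). With a four-letter alphabet, single-base comparisons match by chance in a large fraction of cells; longer words thin the chance matches out.

```python
def dot_plot(a, b, w=1):
    """Return the set of (i, j) where the length-w words a[i:i+w] and
    b[j:j+w] are identical."""
    index = {}
    for j in range(len(b) - w + 1):          # index every word of b
        index.setdefault(b[j:j + w], []).append(j)
    return {(i, j)
            for i in range(len(a) - w + 1)
            for j in index.get(a[i:i + w], [])}

a = "ACGTACGA"                    # toy sequence compared against itself
snowy = dot_plot(a, a, w=1)       # diagonal plus many chance matches
clean = dot_plot(a, a, w=4)       # only the main diagonal survives
```

Raising the word length removes chance matches but also hides short or approximate repeats, which is one reason such plots need parameter tuning.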

<< [13] >>
[Figure: Plasmodium falciparum, dot plot by Dotter]
Above: Contig c10m304 is very repetitive. Plot by Dotter, with snow from background repeats. It is necessary to tinker with the parameters.

<< [14] >>
[Figure: Plasmodium falciparum chromosome 10, contig c10m304]
Above: Contig c10m304 of chromosome 10 under the Approximate Repeats Model; it compresses to about 0.607 bits/bp. The 600bp repeat is obvious under the model.
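
For scale, the quoted figure can be set against the 2 bits/bp needed for four equiprobable bases. The baseline is the usual uniform-model assumption, not a number taken from the slides.

```python
import math

bits_per_bp = 0.607               # figure quoted in the caption above
baseline = math.log2(4)           # 2 bits/bp for four equiprobable bases
ratio = bits_per_bp / baseline    # fraction of the uncompressed size
```

So the contig encodes in roughly 30% of the space a uniform model would need.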

<< [15]

CONCLUSIONS


© L. Allison, School of Computer Science and Software Engineering, Monash University, Australia 3168.

Research partly supported by Australian Research Council (ARC) grant A9800558.

Thanks to the scientists and funding agencies of the International Malaria Genome Project for making data for Plasmodium falciparum available prior to publication of the completed sequences. A consortium composed of The Institute for Genomic Research and the Naval Medical Research Center (USA) sequenced chromosomes 2, 10, 11 and 14, with support from NIAID/NIH, the Burroughs Wellcome Fund, and the Department of Defense.

Sequences for Plasmodium falciparum chromosomes 2 and 3 were obtained from the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov), National Library of Medicine, National Institutes of Health (USA). Preliminary data for Plasmodium falciparum chromosome 10 were obtained from The Institute for Genomic Research (www.tigr.org). Sequencing of chromosomes 10 and 11 was part of the International Malaria Genome Project and was supported by an award from the National Institute of Allergy and Infectious Diseases, National Institutes of Health (USA).