DNA Sequence Compression (Random) Example

> Compress: L.Allison, Computer Science, Monash University 4/1998
     1 GGATATCACG TAGTCCCTAG CTCTTGGCGC TGGATGGGGC GGACGGAAGG
    51 GAAACGACCG TTGAATTCCA AATTCGGTCG TATGGAAATA TTGCAATGGA  100

> order-0 Markov Model
>                                                                                                    |   4.0 +
>                                                                                                    |   3.5 b
>                                                                                                    |   3.0 b
>      . .     ...   . .    . .         .   .          .  ..        ..     .   .              .      |   2.5 b
>--....-.--..-.---..--.-..-----.--..-------.---..---...--.---..-....--.....---.--...--.......--...--.|-  2.0 b
>..       .  .      .     .. .  ..  .... ..  ..  ...    .   .  .            ..  .   ..       .    .. |   1.5 b
>                                                                                                    |   1.0 b
>                                                                                                    |   0.5 b
>                                                                                                    |   0.0 b
> compress: Sequence length=100, |Alphabet|=4, log2(|Alphabet|) =2.0000
> hypothesis:  (H) =8.2 bits
> data:      (D|H) =197.4 bits, =1.9742 b/ch
> total: (H)+(D|H) =205.6 bits, =2.0564 b/ch
> ran 00/01/21  from 15:33:36  to 15:33:36  
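
The (D|H) figure for the order-0 model can be checked independently. The sketch below is in Python, which is not the language or code of the original program. It computes the empirical entropy of the 100-base sequence listed above. The names SEQ, counts, cost and bits_per_char are chosen for this illustration only. It assumes the data cost is the sum of -log2 of each base's observed frequency. The program's actual parameter estimator is not shown in the output, although this simple estimate agrees with the reported 1.9742 b/ch.

```python
from collections import Counter
from math import log2

# The 100-base sequence from the listing above, with the spaces removed.
SEQ = (
    "GGATATCACG" "TAGTCCCTAG" "CTCTTGGCGC" "TGGATGGGGC" "GGACGGAAGG"
    "GAAACGACCG" "TTGAATTCCA" "AATTCGGTCG" "TATGGAAATA" "TTGCAATGGA"
)

counts = Counter(SEQ)   # base -> number of occurrences
n = len(SEQ)

# Assumed cost of each base under a fitted order-0 model: -log2(frequency).
cost = {base: -log2(c / n) for base, c in counts.items()}

data_bits = sum(cost[base] for base in SEQ)   # estimate of (D|H)
bits_per_char = data_bits / n

for base in sorted(cost):
    print(f"{base}: count={counts[base]:3d}  cost={cost[base]:.2f} bits")
print(f"(D|H) = {data_bits:.1f} bits, = {bits_per_char:.4f} b/ch")
```

The per-base costs appear to correspond to the rows of the plot: the most frequent base, G, falls on the 1.5 b row, A and T fall on the 2.0 b row, and the least frequent base, C, falls on the 2.5 b row.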

> order-1 Markov Model
>                                                                                                    |   4.0 +
>                                                                                                    |   3.5 b
>                                                                                                    |   3.0 b
>.      .. . ..     ..      . .         .   .    .     .  .  .        .       .  .            ..     |   2.5 b
>--.-.-.----.--.....--....-----.--.--------.---.----.----.-.--.-.--...----..---.--.---.---.-.-------.|-  2.0 b
> . . .   .               .. .  .. ..... ..  .. . .. .. .   .  . ..    ...  ..  .  ... ... . .  .... |   1.5 b
>                                                                                                    |   1.0 b
>                                                                                                    |   0.5 b
>                                                                                                    |   0.0 b
> compress: Sequence length=100, |Alphabet|=4, log2(|Alphabet|) =2.0000
> hypothesis:  (H) =21.1 bits
> data:      (D|H) =191.0 bits, =1.9097 b/ch
> total: (H)+(D|H) =212.1 bits, =2.1208 b/ch
> ran 00/01/21  from 15:33:36  to 15:33:36  
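
An order-1 model predicts each base from the one before it. The Python sketch below is an illustration, not the original program. It counts the 99 adjacent pairs in the sequence and costs each transition at -log2 of its maximum-likelihood conditional frequency. Two things are assumptions: the use of plain maximum-likelihood frequencies, and the 2-bit uniform cost for the first base (first_bits). The output does not show how the program handles either, so this estimate should come out close to the reported 191.0 bits but need not equal it.

```python
from collections import Counter, defaultdict
from math import log2

# The 100-base sequence from the listing above, with the spaces removed.
SEQ = (
    "GGATATCACG" "TAGTCCCTAG" "CTCTTGGCGC" "TGGATGGGGC" "GGACGGAAGG"
    "GAAACGACCG" "TTGAATTCCA" "AATTCGGTCG" "TATGGAAATA" "TTGCAATGGA"
)

# Count each transition prev -> next over the adjacent pairs.
trans = defaultdict(Counter)
for prev, nxt in zip(SEQ, SEQ[1:]):
    trans[prev][nxt] += 1

# Cost of the transitions using maximum-likelihood conditional frequencies.
pair_bits = 0.0
for prev, row in trans.items():
    row_total = sum(row.values())
    for nxt, c in row.items():
        pair_bits += -c * log2(c / row_total)

first_bits = 2.0   # assumption: first base coded uniformly over 4 bases
data_bits = first_bits + pair_bits

print(f"transitions: {pair_bits:.1f} bits over {len(SEQ) - 1} pairs")
print(f"(D|H) estimate = {data_bits:.1f} bits, = {data_bits / len(SEQ):.4f} b/ch")
```

The order-1 model has more parameters to state than the order-0 model (one distribution per preceding base rather than a single distribution), which is consistent with the larger (H) reported for it.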

> AED fwd approx repeats
> [Frequencies B:99.1 R:0.3 C:0.6 E:0.3 =:0.8 ~:0.1 i:0.0 d:0.0 tot:101.3]
> [Frequencies B:98.5 R:0.7 C:0.9 E:0.7 =:0.7 ~:0.4 i:0.3 d:0.3 tot:102.7]
> [Frequencies B:97.9 R:1.1 C:1.2 E:1.1 =:0.8 ~:0.6 i:0.6 d:0.6 tot:104.0]
>                                                                                                    |   4.0 +
>                                                                                                    |   3.5 b
>                                                                                                    |   3.0 b
>.      .. . ..     ..      . .         .   .    .     .  .  .        .       .  .            ..     |   2.5 b
>--.-.-.----.--.....--....-----.--.--------.---.----.----.-.--.-.--...----..---.--.---.---.-.-------.|-  2.0 b
> . . .   .               .. .  .. ..... ..  .. . .. .. .   .  . ..    ...  ..  .  ... ... . .  .... |   1.5 b
>                                                                                                    |   1.0 b
>                                                                                                    |   0.5 b
>                                                                                                    |   0.0 b
> compress: Sequence length=100, |Alphabet|=4, log2(|Alphabet|) =2.0000
> hypothesis:  (H) =31.6 bits
> data:      (D|H) =191.1 bits, =1.9110 b/ch
> total: (H)+(D|H) =222.7 bits, =2.2269 b/ch
> ran 00/01/21  from 15:33:36  to 15:33:38  
> --- end ---
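
Each summary above reports a two-part length: (H) bits to state the model, plus (D|H) bits to state the data given the model. The Python sketch below is an illustration, not part of the original program. It recomputes the totals from the reported figures and compares them with the 200 bits needed at 2 bits per base with no model at all. The names results, totals and BASELINE are chosen for this illustration.

```python
LENGTH = 100               # sequence length reported in the log
BASELINE = 2.0 * LENGTH    # 2 bits per base with no model

# (H) and (D|H) in bits, copied from the three summaries above.
results = {
    "order-0 Markov": (8.2, 197.4),
    "order-1 Markov": (21.1, 191.0),
    "AED fwd approx repeats": (31.6, 191.1),
}

totals = {}
for name, (h_bits, d_bits) in results.items():
    total = h_bits + d_bits
    totals[name] = total
    print(f"{name:24s} total={total:6.1f} bits  {total / LENGTH:.3f} b/ch  "
          f"vs baseline {total - BASELINE:+.1f}")

best = min(totals, key=totals.get)
print("shortest total:", best)
```

All three totals exceed 200 bits. The richer models lower (D|H) slightly, but the saving is smaller than the extra cost of stating the model, so none of them compresses the sequence, as one would expect for random data.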

------------------------------------------------------------------------------

L.Allison, Computer Science and SWE, Monash University, Australia 3168
http://www.csse.monash.edu.au/~lloyd/tildeStrings/
Fri Jan 21 15:33:38 EST 2000