Tries

LA home
Computing
 Algorithms
 Bioinformatics
 FP,  λ
 Logic,  π
 MML
 Prog.Langs

Algorithms
 glossary
 Binary Trees
  Search T'
  23trees
  234trees
  Btrees
  Tries
  PATRICIA
  Suffix Trees

 Tables

A trie (from retrieval), is a multi-way tree structure useful for storing strings over an alphabet. It has been used to store large dictionaries of English (say) words in spelling-checking programs and in natural-language "understanding" programs. Given the data:

an, ant, all, allot, alloy, aloe, are, ate, be
the corresponding trie would be:
trie

The idea is that all strings sharing a common stem or prefix hang off a common node. When the strings are words over {a..z}, a node has at most 27 children - one for each letter plus a terminator.

L
.
A
l
l
i
s
o
n

The elements in a string can be recovered in a scan from the root to the leaf that ends a string. All strings in the trie can be recovered by a depth-first scan of the tree.

One fast but rather wasteful way to implement a trie in Pascal:

type trie = ^ node;
     node = record
               subtrie :array['a'..'z'] of trie;
               IsEndOfWord :boolean
            end;

It is wasteful as a lot of nodes near the edge of the trie will have most subtries set to nil. On the other hand lookup is direct as there is random access on each level/character position in a string.

function find(word:string; length:integer; t:trie):boolean;

   function f(level:integer; t:trie):boolean;
   begin if level=length+1 then
            f:=t^.IsEndOfWord
         else if t^.subtrie[word[level]]=nil then
            f:=false
         else f:=f(level+1, t^.subtrie[word[level]])
   end{f};

begin find := f( 1, t )
end{find}

A second way to implement the trie is to represent each node in the trie as a linked-list of at most 27 elements, each element being a pair <char,trie>. In this case, a search (linear, binary, ...) must be used at each level but less space is wasted at leaves; remember most trees have more leaves than forks.

Notes

  • The height of a trie is the length of the longest key in the trie.
  • E. Fredkin. Trie Memory. Comm. ACM 3(9) pp490-499 Sept. 1960.
  • See [PATRICIA trees].
  • See [Suffix Trees] for string searching etc..


L. A., Department of Computer Science, UWA 1984, Department of Computer Science, Monash 1986, and (HTML) School of Computer Science and Software Engineering, Monash 1999
window on the wide world:

Computer Science Education Week

Linux
 Ubuntu
free op. sys.
OpenOffice
free office suite,
ver 3.4+

The GIMP
~ free photoshop
Firefox
web browser
FlashBlock
like it says!

© L. Allison   http://www.allisons.org/ll/   (or as otherwise indicated),
Faculty of Information Technology (Clayton), Monash University, Australia 3800 (6/'05 was School of Computer Science and Software Engineering, Fac. Info. Tech., Monash University,
was Department of Computer Science, Fac. Comp. & Info. Tech., '89 was Department of Computer Science, Fac. Sci., '68-'71 was Department of Information Science, Fac. Sci.)
Created with "vi (Linux + Solaris)",  charset=iso-8859-1,  fetched Saturday, 25-Oct-2014 16:33:19 EST.