In computer science, use of string and handling them for different purpose is very inportant thing. There are some data structure for work with string in programming language. TRIE is one of them. Today we will discuss about it.
Trie is an ordered tree data structure used to store a dynamic set or associative array where the keys are usually strings. It is also known as radix tree or prefix tree in the computer world. The word "trie" comes from the word "retrieval". Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string. Keys tend to be associated with leaves, though some inner nodes may correspond to keys of interest. Hence, keys are not necessarily associated with every node.
Suppose we have a dictionary with the following words.
Algo, Algea, Also, Tom, to
Now we need to find a way to save them in such a manner that we can retrieve them easily. One of the way is to save them in sorted order. So after that binary search will do rest. and another way is using trie.
Now we will build the Trie for the above example. At first there is no other in our structure except root. Now we will add algo. We will add a edge from root node and give the edge name 'a'. From the newly created node, we add an 'l' edge to new node. After adding 'g' and 'o' edge the trie will look like this -
Now We will add the second word, which is Algea. So first we need to add 'a' edge from root. But there is already an 'a' edge from root. if you go further you will also find edge from 'a' to 'l' and 'l' to 'g'. We only need to add edge 'e' and 'a'. The same procedure follows for Also. We only need to add 's' and 'o' edge. So now our trie looks like -
next we will add Tom in the try. As there is no 't' edge so we need to do it from scratch. Now To is a complete prefix of Tom so we dont need to add anything. we only need to color the last node of each word as gray to understand the last of each word. So our final trie get the following structure -
So now what is the importance of storing words like this. Because it is easy to find any word from this structure. Suppose if you want to find "anh" in this structure. First you need to check for 'a' edge from root. Yes, we have this in trie. Next we need to find 'n' edge from the current node. But we dont have this. So "anh is not in our dictionary. Generally we need to do number of letter operations to find a word in Trie.
The ruby implementation is given below -
class Node attr_reader :data, :edges attr_accessor :color def initialize(data) @data = data @edges =  @color = false end def insert(char) return get(char) if have?(char) child = Node.new(char) @edges << child child end def have?(char) @edges.each do |c| return true if c.data == char end false end def get(char) @edges.each do |child| return child if child.data == char end false end end class Trie attr_reader :root def initialize @root = Node.new(nil) end def insert(word) node = @root word.size.times do |i| child = node.insert(word[i]) node = child end node.term = true end def have?(word) node = @root word.size.times do |i| return false unless node.have?(word[i]) node = node.get(word[i]) end node.color end end
Searching complexity of trie is O(length of search word). Inserting complexity is also the same. Memory required for storing is dependent on number of word and how much prefix is common.
Hope this helps. Thank you
Example taken: http://www.shafaetsplanet.com/?p=1679