The Binary Search Tree (BST) data structure is a classic computer science example of a simple, yet effective construct for managing and access information.
A BST is characterized by the following physical facts:
- Each node holds a value, and can only have upto two children
- For every node in the tree, the left subtree consists of smaller values, while the right tree consists of greater values
Binary Node Class
Below is a basic Ruby definition of our node class. It contains the standard constructor method "initialize", and variable access methods. You will also notice that the "=" operator has been overridden for the member variables.
class BinaryNode
@contents = nil
@left = nil
@right = nil
public
def initialize(data, leftChild, rightChild)
@contents = data
@left = leftChild
@right = rightChild
end
def contents
@contents
end
def left
@left
end
def right
@right
end
def contents=(data)
@contents = data
end
def left=(leaf)
@left = leaf
end
def right=(leaf)
@right = leaf
end
end
Binary Node Unit Test
Considering we're moving onto more complex code examples now, I decided to leverage the Ruby unit test framework to create structured test cases. As you step through the code, please note the subtleties which may be easily missed -- it tests all aspects of the class definition above, including the "=" override, and the variable access methods (e.g. n.contents). A more thorough test may also include additional assertions for the "initialize" method, to verify the "left" and "right" children.
require 'BinaryNode'
require 'test/unit'
class TestBinaryNode < Test::Unit::TestCase
def the_test
n = BinaryNode.new("1", nil, nil)
assert_equal("1", n.contents)
n.right = BinaryNode.new("2", nil, nil)
assert_equal("2", n.right.contents)
n.left = BinaryNode.new("0", nil, nil)
assert_equal("0", n.left.contents)
end
end
Binary Search Tree Class
Ok, so we've covered the basic data structure involved in a BST, the binary node, which we will often refer to as "leaf" as well. The BST class contains all of the fun meaty stuff that defines all of the tree operations available to a developer. Although the BinaryNode is the basic building block of a BST, it should remain invisible to the end user.
The operations we will make available via our public API will be:
- initialize: general constructor
- clear: resets the tree to empty
- empty?: boolean operator that returns true/false respectively
- find: locates the requested value in the tree
- findMin: finds the minimum value stored in the tree
- findMax: finds the maximum value stored in the tree
- insert: puts the requested value into the tree
- remove: removes the requested value from the tree
- print: simply prints out the contents of the tree, in order
Additionally, private helper functions which operate recursively, have been created to assist the public operations. A more efficient implementation could consist of loops rather than recursion, which incurs processing and storage overhead relative to traversal depth.
For the most part, if you read over the code, it is evident which helper functions support their respective public counterparts. Please note that Ruby does not allow method definitions with the same name, regardless of their signatures (i found this out through trial-and-error, so if i'm wrong, please let me know). I consider this to be a very bad weakness, and hope things change down the road.
I expect if you're reading this post, then you have a basic understanding of a BST, and I will not cover its behavior in heavy detail. However, there are three functions ([] Operator, delete, and deleteMin) which are reviewed below the following section which defines our BST class.
require 'BinaryNode'
class BinarySearchTree
@root = nil
#begin: public members
public
def initialize()
end
def clear()
@root = nil
end
def empty?()
@root.nil?
end
def find(target)
self[locate(target, @root)]
end
def findMin()
self[locateMin(@root)]
end
def findMax()
self[locateMax(@root)]
end
def insert(element)
@root = put(element, @root)
end
def remove(element)
@root = delete(element, @root)
end
def print()
if empty? then
puts "Empty Tree"
else output(@root)
end
end
#end: public members
private
#begin: private members
def [](leaf)
return leaf.contents unless leaf.nil?
nil
end
def locate(target, leaf)
if leaf.nil? then return nil
elsif leaf.contents < target then return locate(target, leaf.right)
elsif leaf.contents > target then return locate(target, leaf.left)
elsif leaf.contents == target then return leaf
end
end
def locateMin(leaf)
return nil if leaf.nil?
return leaf if leaf.left.nil?
return locateMin(leaf.left)
end
def locateMax(leaf)
return nil if leaf.nil?
return leaf if leaf.right.nil?
return locateMax(leaf.right)
end
def put(element, leaf)
if leaf.nil? then leaf = BinaryNode.new(element, nil, nil)
elsif leaf.contents < element then leaf.right = put(element, leaf.right)
elsif leaf.contents > element then leaf.left = put(element, leaf.left)
end
return leaf
end
def delete(element, leaf)
if leaf.nil? then
return nil
elsif leaf.contents < element then
leaf.right = delete(element, leaf.right)
elsif leaf.contents > element then
leaf.left = delete(element, leaf.left)
elsif ( !leaf.left.nil? && !leaf.right.nil?) then
leaf.contents = deleteMin(leaf)
else
leaf = (!leaf.left.nil?) ? leaf.left : leaf.right
end
return leaf
end
def deleteMin(tree)
parent = tree.right
minChild = parent.left
if minChild.nil?
tree.right = parent.right
return parent.contents
end
while (! minChild.left.nil?)
parent = minChild
minChild = minChild.left
end
parent.left = minChild.right
return minChild.contents
end
def output(leaf)
if (!leaf.nil?) then
output(leaf.left)
puts leaf.contents
output(leaf.right)
end
end
#end: private members
end
Code Review
After reviewing the BST and BinaryNode class definitions, you will notice variables beginning with the "@" symbol. If you're not familiar with Ruby, I suggest you become knowledgeable with basic class rules. In Ruby, there are local, instance, and class variables; in the same respect there are instance and class methods.
To briefly describe these concepts:
- Local variable: a variable who's scope is limited to the function or code block in which they were introduced
- Instance variable: an instance variable is denoted with the "@" character, and by default has a private class scope. Its scope is global to the instance, and therefore accessible to any of its methods
- Class variable: a class variable is denoted with the "@@" symbol, and by default has a private class scope. Unlike an instance variable, a class variable is shared across all instances of a class (read: static variable)
- Instance method: a method which is associated to a particular instance of a class
- Class method: similar to a class variable, a class method is a shared definition across all instances of a class. Moreover, a class method is universally available and does not require an instance for it to be executed (read: static method)
To illustrate the power of Ruby, I decided to override the [] operator. The implementation simply returns the contents of the given leaf node, and it's used by the public methods (e.g. self[...]). It is important to note that in Ruby, the last evaluated line of code produces the return value of a method, hence the last line in this method as "nil".
def [](leaf)
return leaf.contents unless leaf.nil?
nil
end
For the most part, all BST operations are fairly straightforward, except for the "delete" method which possess some complexity. The deletion operation requires a strategy which addresses three cases.
The target node to be delete is a leaf node with:
- no children: delete the given node
- 1 child: move the child into the given node's position
- 2 children - complex case, whose strategy could call for the replacement of the target node with the smallest leaf node in it's right subtree.
def delete(element, leaf)
if leaf.nil? then
return nil
elsif leaf.contents < element then
leaf.right = delete(element, leaf.right)
elsif leaf.contents > element then
leaf.left = delete(element, leaf.left)
elsif ( !leaf.left.nil? && !leaf.right.nil?) then
leaf.contents = deleteMin(leaf)
else
leaf = (!leaf.left.nil?) ? leaf.left : leaf.right
end
return leaf
end
In order to make the "delete" operation more efficient, we can write a "deleteMin" method which searches the right subtree for the smallest value, and deletes it in one step. The inefficient implementation would require two steps within the "delete" method: one to find the minimum leaf by re-using the "findMin" method, followed by a recursive call to the "remove" method to physically delete the minimum node.
The code below illustrates the efficient approach, which utilizes a "while" loop to traverse down the "right" subtree.
def deleteMin(tree)
parent = tree.right
minChild = parent.left
if minChild.nil?
tree.right = nil
return parent.contents
end
while (! minChild.left.nil?)
parent = minChild
minChild = minChild.left
end
parent.left = nil
return minChild.contents
end
The BST implementation provided in this tutorial is simply an illustration of Ruby concepts. There are many alternate approaches which could yield faster performance, or require less memory (e.g. an array implementation where traversal is calculated based on indices, loops vs. recursion, etc..).
Additionally, it should be noted that over time a BST can easily become too heavy (biased) to a certain direction (left vs. right), because there is no balancing mechanism implemented. If this were to happen, the O(log n) operations offered by a BST would fade away as the tree becomes, in the worst-case, linear.
For a more efficient BST, a balancing algorithm can be created; Adelson-Velski and Landis (AVL) trees are a perfect example of such a data structure.
I hope this has provided you with some value!