A Binary Search Tree (BST) implementation in Ruby
The Binary Search Tree (BST) data structure is a classic computer science example of a simple, yet effective construct for managing and access information.
A BST is characterized by the following physical facts:
- Each node holds a value, and can only have upto two children
- For every node in the tree, the left subtree consists of smaller values, while the right tree consists of greater values
Binary Node Class
Below is a basic Ruby definition of our node class. It contains the standard constructor method "initialize", and variable access methods. You will also notice that the "=" operator has been overridden for the member variables.
class BinaryNode @contents = nil @left = nil @right = nil public def initialize(data, leftChild, rightChild) @contents = data @left = leftChild @right = rightChild end def contents @contents end def left @left end def right @right end def contents=(data) @contents = data end def left=(leaf) @left = leaf end def right=(leaf) @right = leaf end end
Binary Node Unit Test
Considering we're moving onto more complex code examples now, I decided to leverage the Ruby unit test framework to create structured test cases. As you step through the code, please note the subtleties which may be easily missed -- it tests all aspects of the class definition above, including the "=" override, and the variable access methods (e.g. n.contents). A more thorough test may also include additional assertions for the "initialize" method, to verify the "left" and "right" children.
require 'BinaryNode' require 'test/unit' class TestBinaryNode < Test::Unit::TestCase def the_test n = BinaryNode.new("1", nil, nil) assert_equal("1", n.contents) n.right = BinaryNode.new("2", nil, nil) assert_equal("2", n.right.contents) n.left = BinaryNode.new("0", nil, nil) assert_equal("0", n.left.contents) end end
Binary Search Tree Class
Ok, so we've covered the basic data structure involved in a BST, the binary node, which we will often refer to as "leaf" as well. The BST class contains all of the fun meaty stuff that defines all of the tree operations available to a developer. Although the BinaryNode is the basic building block of a BST, it should remain invisible to the end user.
The operations we will make available via our public API will be:
- initialize: general constructor
- clear: resets the tree to empty
- empty?: boolean operator that returns true/false respectively
- find: locates the requested value in the tree
- findMin: finds the minimum value stored in the tree
- findMax: finds the maximum value stored in the tree
- insert: puts the requested value into the tree
- remove: removes the requested value from the tree
- print: simply prints out the contents of the tree, in order
Additionally, private helper functions which operate recursively, have been created to assist the public operations. A more efficient implementation could consist of loops rather than recursion, which incurs processing and storage overhead relative to traversal depth.
For the most part, if you read over the code, it is evident which helper functions support their respective public counterparts. Please note that Ruby does not allow method definitions with the same name, regardless of their signatures (i found this out through trial-and-error, so if i'm wrong, please let me know). I consider this to be a very bad weakness, and hope things change down the road.
I expect if you're reading this post, then you have a basic understanding of a BST, and I will not cover its behavior in heavy detail. However, there are three functions ([] Operator, delete, and deleteMin) which are reviewed below the following section which defines our BST class.
require 'BinaryNode' class BinarySearchTree @root = nil #begin: public members public def initialize() end def clear() @root = nil end def empty?() @root.nil? end def find(target) self[locate(target, @root)] end def findMin() self[locateMin(@root)] end def findMax() self[locateMax(@root)] end def insert(element) @root = put(element, @root) end def remove(element) @root = delete(element, @root) end def print() if empty? then puts "Empty Tree" else output(@root) end end #end: public members private #begin: private members def [](leaf) return leaf.contents unless leaf.nil? nil end def locate(target, leaf) if leaf.nil? then return nil elsif leaf.contents < target then return locate(target, leaf.right) elsif leaf.contents > target then return locate(target, leaf.left) elsif leaf.contents == target then return leaf end end def locateMin(leaf) return nil if leaf.nil? return leaf if leaf.left.nil? return locateMin(leaf.left) end def locateMax(leaf) return nil if leaf.nil? return leaf if leaf.right.nil? return locateMax(leaf.right) end def put(element, leaf) if leaf.nil? then leaf = BinaryNode.new(element, nil, nil) elsif leaf.contents < element then leaf.right = put(element, leaf.right) elsif leaf.contents > element then leaf.left = put(element, leaf.left) end return leaf end def delete(element, leaf) if leaf.nil? then return nil elsif leaf.contents < element then leaf.right = delete(element, leaf.right) elsif leaf.contents > element then leaf.left = delete(element, leaf.left) elsif ( !leaf.left.nil? && !leaf.right.nil?) then leaf.contents = deleteMin(leaf) else leaf = (!leaf.left.nil?) ? leaf.left : leaf.right end return leaf end def deleteMin(tree) parent = tree.right minChild = parent.left if minChild.nil? tree.right = parent.right return parent.contents end while (! minChild.left.nil?) parent = minChild minChild = minChild.left end parent.left = minChild.right return minChild.contents end def output(leaf) if (!leaf.nil?) then output(leaf.left) puts leaf.contents output(leaf.right) end end #end: private members end
Code Review
After reviewing the BST and BinaryNode class definitions, you will notice variables beginning with the "@" symbol. If you're not familiar with Ruby, I suggest you become knowledgeable with basic class rules. In Ruby, there are local, instance, and class variables; in the same respect there are instance and class methods.
To briefly describe these concepts:
- Local variable: a variable who's scope is limited to the function or code block in which they were introduced
- Instance variable: an instance variable is denoted with the "@" character, and by default has a private class scope. Its scope is global to the instance, and therefore accessible to any of its methods
- Class variable: a class variable is denoted with the "@@" symbol, and by default has a private class scope. Unlike an instance variable, a class variable is shared across all instances of a class (read: static variable)
- Instance method: a method which is associated to a particular instance of a class
- Class method: similar to a class variable, a class method is a shared definition across all instances of a class. Moreover, a class method is universally available and does not require an instance for it to be executed (read: static method)
To illustrate the power of Ruby, I decided to override the [] operator. The implementation simply returns the contents of the given leaf node, and it's used by the public methods (e.g. self[...]). It is important to note that in Ruby, the last evaluated line of code produces the return value of a method, hence the last line in this method as "nil".
def [](leaf) return leaf.contents unless leaf.nil? nil end
For the most part, all BST operations are fairly straightforward, except for the "delete" method which possess some complexity. The deletion operation requires a strategy which addresses three cases.
The target node to be delete is a leaf node with:
- no children: delete the given node
- 1 child: move the child into the given node's position
- 2 children - complex case, whose strategy could call for the replacement of the target node with the smallest leaf node in it's right subtree.
def delete(element, leaf) if leaf.nil? then return nil elsif leaf.contents < element then leaf.right = delete(element, leaf.right) elsif leaf.contents > element then leaf.left = delete(element, leaf.left) elsif ( !leaf.left.nil? && !leaf.right.nil?) then leaf.contents = deleteMin(leaf) else leaf = (!leaf.left.nil?) ? leaf.left : leaf.right end return leaf end
In order to make the "delete" operation more efficient, we can write a "deleteMin" method which searches the right subtree for the smallest value, and deletes it in one step. The inefficient implementation would require two steps within the "delete" method: one to find the minimum leaf by re-using the "findMin" method, followed by a recursive call to the "remove" method to physically delete the minimum node.
The code below illustrates the efficient approach, which utilizes a "while" loop to traverse down the "right" subtree.
def deleteMin(tree) parent = tree.right minChild = parent.left if minChild.nil? tree.right = nil return parent.contents end while (! minChild.left.nil?) parent = minChild minChild = minChild.left end parent.left = nil return minChild.contents end
The BST implementation provided in this tutorial is simply an illustration of Ruby concepts. There are many alternate approaches which could yield faster performance, or require less memory (e.g. an array implementation where traversal is calculated based on indices, loops vs. recursion, etc..).
Additionally, it should be noted that over time a BST can easily become too heavy (biased) to a certain direction (left vs. right), because there is no balancing mechanism implemented. If this were to happen, the O(log n) operations offered by a BST would fade away as the tree becomes, in the worst-case, linear.
For a more efficient BST, a balancing algorithm can be created; Adelson-Velski and Landis (AVL) trees are a perfect example of such a data structure.
I hope this has provided you with some value!
I think perhaps your delete method does not work. Try this.
bst = BinarySearchTree.new
bst.insert(5)
bst.insert(2)
bst.insert(8)
bst.insert(1)
bst.insert(10)
bst.insert(-5)
bst.insert(6)
bst.insert(9)
bst.insert(7)
bst.to_s();
puts
bst.remove(5)
bst.to_s();
I’m showing the output as
-5 1 2 6 6 7 8 9 10
I tried your solution because I had this same problem in my own implementation. Something very strange is happening when you assign one node to another.
Please let me know if your output does not mimick mine or if you have the same problem.
Check that, it was all on a modified verison. Your version of the code outputs
-5 1 2 5 6 7 8 9 10 <–before removing root
-5 1 2 6 8 9 10 <– after removing root.
Tyler, Thanks for the feedback… I hope everything worked out for you. I tried the sample you provided and everything seems fine.
Not sure exactly what problem you were having with your implementation regarding node re-assignment, but keep in mind that Ruby is a pass-by-value language.
[...] Binary Node Class can be re-written as follows (15 lines vs. 38 [...]
Indeed your delete method does not work properly. Consider this test (inorder method is just modified print, that returns array):
class TestTree < Test::Unit::TestCase
def setup
@t = Tree.new
@contents = []
100.times do |x|
x = rand(100)
@t.insert(x)
@contents << x
end
@contents.uniq!
end
def test_delete
@contents.length.times do
@t.remove(@contents.shift)
assert_equal(@t.inorder,@contents.sort)
end
end
end
oh wow, so sorry… the issue was my deleteMin function, … i was removing the left/right links/pointers of the parent node w/o fixing the tree
lines: 106 and 115
code has now been fixed, thanks!