Instance variable accessors in Ruby

Encapsulation is a vital characteristic for proper object oriented implementation. In order to accomplish meaningful tasks with a given class, instances variables may need to be exposed given certain design constraints. The exposure of these instances variables are done through "getters" and "setters" which are responsible for assigning or retrieving a value to/from a particular variable (respectively). If we use the Binary Node Class from my previous post an example, the member variables "contents", "left", and "right" have been exposed by overriding the "=" operator (getter) and self-named functions.

For defining access to instance variables, Ruby provides a nice short cut through the use of the "attr_accessor" directive.

The Binary Node Class can be re-written as follows (15 lines vs. 38 lines):

  1. class BinaryNode
  2. attr_accessor :contents, :left, :right
  3.  
  4. @contents = nil
  5. @left = nil
  6. @right = nil
  7.  
  8. public
  9. def initialize(data, leftChild, rightChild)
  10. @contents = data
  11. @left = leftChild
  12. @right = rightChild
  13. end
  14.  
  15. end

In fact, there are a few variations to access instance variables, which can be configured according to "read" and "write" privileges.

  • attr_reader :variable - read only access to "variable"
  • attr_writer :variable - write only access to "variable"
  • attr_accessor :variable - read and write access to "variable"

A Binary Search Tree (BST) implementation in Ruby

The Binary Search Tree (BST) data structure is a classic computer science example of a simple, yet effective construct for managing and access information.

A BST is characterized by the following physical facts:

  • Each node holds a value, and can only have upto two children
  • For every node in the tree, the left subtree consists of smaller values, while the right tree consists of greater values

Binary Node Class
Below is a basic Ruby definition of our node class. It contains the standard constructor method "initialize", and variable access methods. You will also notice that the "=" operator has been overridden for the member variables.

  1. class BinaryNode
  2.  
  3. @contents = nil
  4. @left = nil
  5. @right = nil
  6.  
  7. public
  8. def initialize(data, leftChild, rightChild)
  9. @contents = data
  10. @left = leftChild
  11. @right = rightChild
  12. end
  13.  
  14. def contents
  15. @contents
  16. end
  17.  
  18. def left
  19. @left
  20. end
  21.  
  22. def right
  23. @right
  24. end
  25.  
  26. def contents=(data)
  27. @contents = data
  28. end
  29.  
  30. def left=(leaf)
  31. @left = leaf
  32. end
  33.  
  34. def right=(leaf)
  35. @right = leaf
  36. end
  37.  
  38. end

Binary Node Unit Test
Considering we're moving onto more complex code examples now, I decided to leverage the Ruby unit test framework to create structured test cases. As you step through the code, please note the subtleties which may be easily missed -- it tests all aspects of the class definition above, including the "=" override, and the variable access methods (e.g. n.contents). A more thorough test may also include additional assertions for the "initialize" method, to verify the "left" and "right" children.

  1. require 'BinaryNode'
  2. require 'test/unit'
  3.  
  4. class TestBinaryNode < Test::Unit::TestCase
  5.  
  6. def the_test
  7. n = BinaryNode.new("1", nil, nil)
  8. assert_equal("1", n.contents)
  9.  
  10. n.right = BinaryNode.new("2", nil, nil)
  11. assert_equal("2", n.right.contents)
  12.  
  13. n.left = BinaryNode.new("0", nil, nil)
  14. assert_equal("0", n.left.contents)
  15. end
  16.  
  17. end

Binary Search Tree Class
Ok, so we've covered the basic data structure involved in a BST, the binary node, which we will often refer to as "leaf" as well. The BST class contains all of the fun meaty stuff that defines all of the tree operations available to a developer. Although the BinaryNode is the basic building block of a BST, it should remain invisible to the end user.

The operations we will make available via our public API will be:

  • initialize: general constructor
  • clear: resets the tree to empty
  • empty?: boolean operator that returns true/false respectively
  • find: locates the requested value in the tree
  • findMin: finds the minimum value stored in the tree
  • findMax: finds the maximum value stored in the tree
  • insert: puts the requested value into the tree
  • remove: removes the requested value from the tree
  • print: simply prints out the contents of the tree, in order

Additionally, private helper functions which operate recursively, have been created to assist the public operations. A more efficient implementation could consist of loops rather than recursion, which incurs processing and storage overhead relative to traversal depth.

For the most part, if you read over the code, it is evident which helper functions support their respective public counterparts. Please note that Ruby does not allow method definitions with the same name, regardless of their signatures (i found this out through trial-and-error, so if i'm wrong, please let me know). I consider this to be a very bad weakness, and hope things change down the road.

I expect if you're reading this post, then you have a basic understanding of a BST, and I will not cover its behavior in heavy detail. However, there are three functions ([] Operator, delete, and deleteMin) which are reviewed below the following section which defines our BST class.

  1. require 'BinaryNode'
  2.  
  3. class BinarySearchTree
  4.  
  5. @root = nil
  6.  
  7. #begin: public members
  8. public
  9.  
  10. def initialize()
  11. end
  12.  
  13. def clear()
  14. @root = nil
  15. end
  16.  
  17. def empty?()
  18. @root.nil?
  19. end
  20.  
  21. def find(target)
  22. self[locate(target, @root)]
  23. end
  24.  
  25. def findMin()
  26. self[locateMin(@root)]
  27. end
  28.  
  29. def findMax()
  30. self[locateMax(@root)]
  31. end
  32.  
  33. def insert(element)
  34. @root = put(element, @root)
  35. end
  36.  
  37. def remove(element)
  38. @root = delete(element, @root)
  39. end
  40.  
  41. def print()
  42. if empty? then
  43. puts "Empty Tree"
  44. else output(@root)
  45. end
  46. end
  47.  
  48. #end: public members
  49.  
  50. private
  51. #begin: private members
  52.  
  53. def [](leaf)
  54. return leaf.contents unless leaf.nil?
  55. nil
  56. end
  57.  
  58. def locate(target, leaf)
  59. if leaf.nil? then return nil
  60. elsif leaf.contents < target then return locate(target, leaf.right)
  61. elsif leaf.contents > target then return locate(target, leaf.left)
  62. elsif leaf.contents == target then return leaf
  63. end
  64. end
  65.  
  66. def locateMin(leaf)
  67. return nil if leaf.nil?
  68. return leaf if leaf.left.nil?
  69. return locateMin(leaf.left)
  70. end
  71.  
  72. def locateMax(leaf)
  73. return nil if leaf.nil?
  74. return leaf if leaf.right.nil?
  75. return locateMax(leaf.right)
  76. end
  77.  
  78. def put(element, leaf)
  79. if leaf.nil? then leaf = BinaryNode.new(element, nil, nil)
  80. elsif leaf.contents < element then leaf.right = put(element, leaf.right)
  81. elsif leaf.contents > element then leaf.left = put(element, leaf.left)
  82. end
  83. return leaf
  84. end
  85.  
  86. def delete(element, leaf)
  87. if leaf.nil? then
  88. return nil
  89. elsif leaf.contents < element then
  90. leaf.right = delete(element, leaf.right)
  91. elsif leaf.contents > element then
  92. leaf.left = delete(element, leaf.left)
  93. elsif ( !leaf.left.nil? && !leaf.right.nil?) then
  94. leaf.contents = deleteMin(leaf)
  95. else
  96. leaf = (!leaf.left.nil?) ? leaf.left : leaf.right
  97. end
  98. return leaf
  99. end
  100.  
  101. def deleteMin(tree)
  102. parent = tree.right
  103. minChild = parent.left
  104.  
  105. if minChild.nil?
  106. tree.right = parent.right
  107. return parent.contents
  108. end
  109.  
  110. while (! minChild.left.nil?)
  111. parent = minChild
  112. minChild = minChild.left
  113. end
  114.  
  115. parent.left = minChild.right
  116. return minChild.contents
  117. end
  118.  
  119. def output(leaf)
  120. if (!leaf.nil?) then
  121. output(leaf.left)
  122. puts leaf.contents
  123. output(leaf.right)
  124. end
  125. end
  126.  
  127. #end: private members
  128.  
  129. end

Code Review

After reviewing the BST and BinaryNode class definitions, you will notice variables beginning with the "@" symbol. If you're not familiar with Ruby, I suggest you become knowledgeable with basic class rules. In Ruby, there are local, instance, and class variables; in the same respect there are instance and class methods.

To briefly describe these concepts:

  • Local variable: a variable who's scope is limited to the function or code block in which they were introduced
  • Instance variable: an instance variable is denoted with the "@" character, and by default has a private class scope. Its scope is global to the instance, and therefore accessible to any of its methods
  • Class variable: a class variable is denoted with the "@@" symbol, and by default has a private class scope. Unlike an instance variable, a class variable is shared across all instances of a class (read: static variable)
  • Instance method: a method which is associated to a particular instance of a class
  • Class method: similar to a class variable, a class method is a shared definition across all instances of a class. Moreover, a class method is universally available and does not require an instance for it to be executed (read: static method)

To illustrate the power of Ruby, I decided to override the [] operator. The implementation simply returns the contents of the given leaf node, and it's used by the public methods (e.g. self[...]). It is important to note that in Ruby, the last evaluated line of code produces the return value of a method, hence the last line in this method as "nil".

  1. def [](leaf)
  2. return leaf.contents unless leaf.nil?
  3. nil
  4. end

For the most part, all BST operations are fairly straightforward, except for the "delete" method which possess some complexity. The deletion operation requires a strategy which addresses three cases.

The target node to be delete is a leaf node with:

  • no children: delete the given node
  • 1 child: move the child into the given node's position
  • 2 children - complex case, whose strategy could call for the replacement of the target node with the smallest leaf node in it's right subtree.
  1. def delete(element, leaf)
  2. if leaf.nil? then
  3. return nil
  4. elsif leaf.contents < element then
  5. leaf.right = delete(element, leaf.right)
  6. elsif leaf.contents > element then
  7. leaf.left = delete(element, leaf.left)
  8. elsif ( !leaf.left.nil? && !leaf.right.nil?) then
  9. leaf.contents = deleteMin(leaf)
  10. else
  11. leaf = (!leaf.left.nil?) ? leaf.left : leaf.right
  12. end
  13. return leaf
  14. end

In order to make the "delete" operation more efficient, we can write a "deleteMin" method which searches the right subtree for the smallest value, and deletes it in one step. The inefficient implementation would require two steps within the "delete" method: one to find the minimum leaf by re-using the "findMin" method, followed by a recursive call to the "remove" method to physically delete the minimum node.

The code below illustrates the efficient approach, which utilizes a "while" loop to traverse down the "right" subtree.

  1. def deleteMin(tree)
  2. parent = tree.right
  3. minChild = parent.left
  4.  
  5. if minChild.nil?
  6. tree.right = nil
  7. return parent.contents
  8. end
  9.  
  10. while (! minChild.left.nil?)
  11. parent = minChild
  12. minChild = minChild.left
  13. end
  14.  
  15. parent.left = nil
  16. return minChild.contents
  17. end

The BST implementation provided in this tutorial is simply an illustration of Ruby concepts. There are many alternate approaches which could yield faster performance, or require less memory (e.g. an array implementation where traversal is calculated based on indices, loops vs. recursion, etc..).

Additionally, it should be noted that over time a BST can easily become too heavy (biased) to a certain direction (left vs. right), because there is no balancing mechanism implemented. If this were to happen, the O(log n) operations offered by a BST would fade away as the tree becomes, in the worst-case, linear.

For a more efficient BST, a balancing algorithm can be created; Adelson-Velski and Landis (AVL) trees are a perfect example of such a data structure.

I hope this has provided you with some value!

A Shell Sort Implementation in Ruby

This algorithm expands on concepts derived from Insertion Sort, and effectively uses those strategies by optimizing and consolidating swap operations per scan. Wikipedia offers some great background information for Shell Sort.

The example below is coded to the "poor" implementation of Shell Sort which operates in O(N2) time1, and room for improvements exist by optimizing the gap sequence; N/2 is the gap strategy used here. An effective gap strategy can lend a O(N3/2) down to O(N4/3) runtime.

  1. def shell_sort (list)
  2. gap = list.length/2
  3. while(gap > 0)
  4. list[gap..list.length-1].each_with_index do |element, index|
  5. j = (index += gap)
  6. index.step(gap-1, -gap) do |j|
  7. if ( j > 0 && j >= gap && element < list[j-gap] ) then
  8. list[j] = list[j-gap]
  9. else
  10. break
  11. end
  12. end
  13. list[j] = element
  14. end
  15. gap /= 2
  16. end
  17. list
  18. end



Elements that are a "gap" apart, are compared and sorted; as the gap diminishes over each repitition, elements which are closest together are eventually compared.

Shell Sort Example 1
Gap Array Elements
input 81 94 11 96 12 35 17 95 28 58 41 75 15
5 35 17 11 28 12 41 75 15 96 58 81 94 95
3 28 12 11 35 15 41 58 17 94 75 81 96 95
1 11 12 15 17 28 35 41 58 75 81 94 95 96



Code Review
Start with a gap of N/2, and iterate until adjacent elements have been processed

  1. def shell_sort (list)
  2. gap = list.length/2
  3. while(gap > 0)

Currently Ruby does not support an iterator which starts at a given index, so we will use the standard array API to obtain a sub-list which starts from "gap" and ends at the last element. We then iterate over that sub-list, and obtain each element and index.

  1. list[gap..list.length-1].each_with_index do |element, index|

The sublist is returned as a seperate object, and therefore it's index starts at "0"; we must compensate for this by adjusting "index" to "index + gap". Additionally, we introduce "j" here so that we can use it outside of the upcoming loop; note, it's value at this point in the sequence is arbitrary and has no meaning.

  1. j = (index += gap)

We then take advantage of Ruby's "step" construct for integers, to loop using a custom step sequence of "-gap", starting at "gap-1". As we scan down the list, we essentially compare and swap elements, in a similar fashion to Insertion Sort.

  1. index.step(gap-1, -gap) do |j|
  2. if ( j > 0 && j >= gap && element < list[j-gap] ) then
  3. list[j] = list[j-gap]
  4. else
  5. break
  6. end
  7. end
  8. list[j] = element

I offer the same caveat as with the Insertion Sort tutorial; I attempted rubify the code, and therefore this implementation may not be as efficient as possible (e.g. use of "step" versus other loop tactics, the use of Ruby's sub-array functionality, having to re-adjust the index, etc..). However, I hope this provides some valuable insight into both Shell Sort and Ruby.

References
1. Weiss, Mark Allen (1999). "Data Structures and Algorithm Analysis in Java". Addison Wesley. ISBN 0201357542.
2. "Shell Sort". Wikipedia. http://en.wikipedia.org/wiki/Shell_sort

An Insertion Sort Implementation in Ruby

Ruby, which is progressively gaining popularity in the industry, has struck an interest with me, and I've decided to take up the task of learning the language. Although I was first introduced to Ruby back in 2002, I never took the time to really get to know it.

Fast forward 6yrs later, and I've decided to get serious about it, and what better way to learn than to get in there and code up some common algorithms. As I venture on this journey, I will document my experience in efforts to help out others who are learning as well.

For this first example, I've coded up the simple "Insertion Sort" algorithm. Please note that in my attempts to "rubify" the algorithm, it is highly likely I may not have written it efficiently.

  1. def insertion_sort (list)
  2. list.each_with_index do |element, index|
  3. index.downto(0) do |j|
  4. if (j > 0 && element < list[j-1]) then
  5. list[j] = list[j-1]
  6. else
  7. list[j] = element
  8. break
  9. end
  10. end
  11. end
  12. end

Insertion sort operates on the basis that elements are sorted progressively as the code scans the list, and therefore after each scan S, elements 0 through S-1 are assumed to be in order, and element S is moved towards the left until it is in order.

Insertion Sort Example 1
Scan Array Elements
input 34 8 64 51 32 21
1 8 34 64 51 32 21
2 8 34 64 51 32 21
3 8 34 51 64 32 21
4 8 32 34 51 64 21
5 8 21 32 34 51 64



Code Walkthrough
The following lines uses the standard list iterator API to obtain both the element and it's index, starting at the beginning of the list.

  1. list.each_with_index do |element, index|

The inner loop uses the "downto" construct to start at the current index (from the outer loop), and counts backward to "0". Note that "downto" does not alter the value of "index". This loop does the work of comparing and swaping the current element with it's predecessors, and will appropriately shift "greater" ones up the list.

  1. index.downto(0) do |j|

The following code segment defines the comparison logic in order to shift the element down the list, by moving all "greater" elements up the list. Once an appropriate location has been determined, the current element is inserted back into the list.

  1. if (j > 0 && element < list[j-1]) then
  2. list[j] = list[j-1]
  3. else
  4. list[j] = element
  5. break
  6. end

One very important point to note is that array boundaries are somewhat loose in Ruby, and therefore an index of "-1" is considered legitimate and will not produce an array/list access violation -- it would point to the last element in the list. Therefore, in the core logic presented above, we account for that by detecting when "j" is no longer greater than "0" in our conditional statement. Regardless, in order to properly implement "insertion sort", strict boundaries are needed.


Additionally, due to Ruby's OO nature, the code presented above will work with any list of classes that support the "<" comparison operator. With that being said, Ruby also makes it possible to have lists that contain a heterogeneous set of classes since there is a lack of type-safety -- this can be both an advantage or disadvantage depending on how you intend to use this algorithm. If you're using a mixed set of classes, you would have to ensure that there is some logical ordering defined within your classes to satisfy the comparison methods.


I hope this tutorial has been helpful!

References
1. Weiss, Mark Allen (1999). "Data Structures and Algorithm Analysis in Java". Addison Wesley. ISBN 0201357542


(c) 2009 sheel kapur / powered by WordPress