Creating A Binary Search Tree In Racket

Firstly, what is a binary tree?

A binary tree is a simple data structure where every node points to two more nodes, culminating in some type of final data type (usually null or nil).

1
2 3
4 5 6 7

A badly unbalanced binary tree might look more like this:

1
2 null
3 null
4 5 null
null null null 6 7

Both of these examples are not really sorted, and thus are not very useful as binary search trees. They look sorted in this format, and there is an order to them, but it’s not what I’ll be talking about when I’m trying to get to grips with the data structure in this post. A properly-sorted balanced binary tree would look like this:

4
2 6
1 3 5 7

This is much more useful, as it means that we can find our way around the tree by checking the value held by the current node against the target value. For instance, in the case of 5, we check to see if it’s larger than (4). It is, so we go right to the node with the value (6). 5 is smaller than 6 so we go left, and we’re there!

In a sorted list, we would need to make 5 checks to find our number, while for the BST we only need to make 3 (2 to traverse the tree, one to make sure we’ve ended on the correct value).

In an unsorted or unbalanced tree this would take a lot longer, and if the value didn’t exist in an unsorted tree the operation would take O(n) time to return #f. In a balanced, sorted binary search tree it takes O(log n) time to search for the value, even if the value doesn’t exist.

So, trees are most useful when they’re already sorted and balanced – or ideally automatically self-sorting and balancing.

Binary trees are closely related to a clever data structure called a skip list. The concept behind a skip list (crudely) is that a list is created:

(list 1 2 3 4 5 6 7 8 9)

Followed by a second list with only every other number, to speed up traversal times:

(list 1 3 5 7 9)

If we reach a number in the skip list which is larger than the target number, we know that we’ve moved one place past the target number. If the number matches, we’ve found the number. At a stroke, we’ve halved the search time for this list!

If you build your skip list to its intended height…

(list 1)
(list 1 5)
(list 1 5 9)
(list 1 3 5 7 9)
(list 1 2 3 4 5 6 7 8 9)

Looks a lot like a balanced binary tree!

So how do we make a binary tree in Racket?
(struct node (x left right)
#:transparent)

Ta-da.

Obviously there’s more to a binary tree than this, but that’s the basics of it. Once you have a node with a place for values, a left node and a right node, you’ve basically got a binary tree.

You can also extend the trees by hand:

(define tree (node 1 (node 2 null null) (node 3 null null)))

This is pretty rubbish though. If we treated linked lists like this in Racket, we’d have to make everything using (cons x (cons y null)), when what people actually do in the real world is write (list x y).

For this next part, I first created a leaf struct. Leaves should typically be very populous, so I wanted to identify them to save a large number of function calls.

(define-struct/contract leaf ([x (not/c null?)])
#:transparent)

Just the value, and it’s not allowed to be null.

Next, we create a function which takes a list and spits out a sorted, unbalanced tree.

;Modified quicksort to create ordered (but not balanced) binary search tree
(define (unsorted-list->binary-tree xs)
    (if (null? xs)
        null
        (if (null? (cdr xs))
            (leaf (car xs))
            (let* ([hd (car xs)]
                   [tail (cdr xs)]
                   [left (filter (lambda (x) (< x hd)) tail)]
                   [right (filter (lambda (x) (>= x hd)) tail)])
              (node hd (unsorted-list->binary-tree left) (unsorted-list->binary-tree right))))))

A simple modified racket quicksort function, similar to the quicksort I’ve discussed here before.

Sample input/output:

> (unsorted-list->binary-tree (list 1 2 3 4 5 6))
(node 1 '() (node 2 '() (node 3 '() (node 4 '() (node 5 '() (leaf 6))))))
>(unsorted-list->binary-tree (list 4 5 7 12 1 2))
(node 4 (node 1 '() (leaf 2)) (node 5 '() (node 7 '() (leaf 12))))

Meanwhile, turning a sorted list into a balanced tree is just as easy:

(define (sorted-list->balanced-tree xs)
    (if (null? xs)
        null
        (if (null? (cdr xs))
            (leaf (car xs))
            (let* ([n (floor (/ (length xs) 2))]
                   [mid (list-ref xs n)]
                   [left (take xs n)]
                   [right (drop xs (+ n 1))])
             (node mid (sorted-list->balanced-tree left) (sorted-list->balanced-tree right))))))

We don’t know what the median of a list is until it’s sorted, meaning I can’t balance a list that isn’t sorted without seeking out a different algorithm or data structure. What we could do is use a slightly different data structure, the Red-Black tree, but I’m still getting to grips with how and why exactly that structure works, so I’d rather not get into that now.

In other words, we might as well take an already-sorted list, then balance it.

If we don’t want to just feed lists into our tree, we make our tree-making function variadic. A variadic function is a function able to take a variable number of arguments.

In Racket, it’s simple:

;To create a binary tree from a variadic argument
(define binary-tree
  (lambda xs
    (letrec ([f (lambda (xs)
            (if (null? xs)
            null
            (let* ([hd (car xs)]
                   [tail (cdr xs)]
                   [left (filter (lambda (x) (< x hd)) tail)]
                   [right (filter (lambda (x) (>= x hd)) tail)])
             (if (null? tail)
                 (leaf hd)
                 (node hd (f left) (f right))))))])
    (f xs))))

If you’re wondering why I’ve bothered with the local recursive function (f xs) here, it’s because the variadic argument is a list as soon as it is passed to the function. Thus, if we tried recursing on it our function would get confused and think there was only one argument and stop recursing – unless we put a cond statement in to differentiate between lists and other types of input and…yes. This could get counter-productive.

This function works in the exact same way as the one we saw earlier, only it takes different numbers of arguments rather than different lengths of lists.

Finally, here is how I create a balanced tree with variadic arguments and of different types, and also a kth-select algorithm for any ordered tree in Racket:

;Requires a sign (e.g. <, >, <=, string<?) as the first argument to sort the rest of the arguments
(define balanced-tree
  (lambda xs
    (letrec ([f (lambda (xs)
                    (if (null? xs)
                        null
                        (let* ([n (floor (/ (length xs) 2))]
                               [mid (list-ref xs n)]
                               [left (take xs n)]
                               [right (drop xs (+ n 1))])
                          (if (null? (cdr xs))
                              (leaf mid)
                              (node mid (f left) (f right))))))])
        (f (sort (cdr xs) (car xs))))))
;count-tree-members - useful for kth selection algorithm
(define (count-tree-members ts)
  (cond
    [(null? ts) 0]
    [(leaf? ts) 1]
    [(node? ts)
     (+ 1 (count-tree-members (node-left ts)) (count-tree-members (node-right ts)))]))
;selection algorithm on tree
(define  (exn:fail "Error! Value k is too large or small for the search tree." (current-continuation-marks)))
(define (kth-select ts k)
 (letrec ([f (lambda (ts k)
              (cond
               [(null? ts) (raise )]
               [(leaf? ts) (if (= k 1) (leaf-x ts) (raise ))]
               [(node? ts) (let ([left (add1 (count-tree-members (node-left ts)))])
                             (cond
                               [(= k left) (node-x ts)]
                               [(> k left) (f (node-right ts) (- k left))]
                               [(< k left) (f (node-left ts) k)]))]))])
   (f ts k)))

In the mean-time, I’ve been fiddling around with small improvements to the Nearest Neighbour algorithm for the Travelling Salesman Problem (for fun, I’m not a crank I promise!), and continuing to work on my game in Java.

Here is the worst video this is it it is the worst one there is nothing worse than this.

Tagged , , , ,

4 thoughts on “Creating A Binary Search Tree In Racket

  1. I don’t get it 😦

    • It’s like a list of information, and at each step it splits into two points (instead of just listing one piece of information). So it looks a bit like (1 (2 3)).

      If we added more bits to that it might look like this: (1 (2 (3 4) 5 (6 7)).

      Proper trees like this are almost always sorted so that a middle value is the first in the list, higher values always go on the right, and lower values always go on the left. This means that you can quickly find what you’re looking for in a ton of information.

      If you meant the Neverending Story video, though, yeah, I don’t know wtf either.

  2. isa says:

    Nice blog post (it’s old I know). I learnt a lot from your posts about racket. Thanks!
    If you still write, I’d like to read more about rackets. 🙂

    Meanwhile, I tried making a function to make it easier to visualize your version of tree using pict library:

    (require pict)
    (require pict/tree-layout)

    (define (bt->tree-layout bt)
    (cond [(node? bt) (tree-layout #:pict (text (~a (node-val bt))) (bt->tree-layout (node-left bt)) (bt->tree-layout (node-right bt)))]
    [(leaf? bt) (tree-layout #:pict (text (~a (leaf-x bt))))]
    [(null? bt) (tree-layout)]))

    • Thanks! I don’t write much Racket these days (what I do write is boring XML-parsing stuff that relies heavily on inbuilt functionality). I will say that I think it’s really good at XML parsing because 1. Racket is relatively straightforwardly homoiconic, 2. S-Expressions (which all Lisp-like languages are made of) are very similar to X-Expressions (which XML is made of). But I don’t know if I could make a whole post out of that!

      EDIT: I like the results of the code you posted, thanks, that’s really cool.

Leave a comment