Merge Sort In Racket

I talked briefly about merge sort only yesterday, in my post about quicksort.

I also mentioned Lisp, which in the form of “Common Lisp” is a really nice practical language I really intend to learn properly one day. The more I learn about it, the more startled I am at how far ahead of its time it was – or more accurately, at how primitive popular ‘modern’ programming languages such as C, Java and C++ can be. Note that primitive doesn’t mean easy to grasp – if anything, the lower levels of abstraction possible with OOP and procedural languages mean that it’s harder to build the same kinds of things and more difficult to understand what’s going on once you have assembled a simulacrum of the abstraction you intended.

The things people build in these languages constantly astound me, and they have the added bonus that they’re often backed by powerful enough compilers that they’ll run much faster than their FP brethren. I’ve kind of got away from the topic though.

Merge Sort

So yes. Merge sort. Off we go.

I had a naive view that merge sort was quicksort’s less glamorous older brother, until watching the videos on the preview of my Coursera course in algorithm design, I realised that algorithm design and speed were incredibly complex issues subject to a whole host of factors.

I also realised, on trying to implement a quick and ugly merge sort in Racket, that it was harder to do than I had anticipated.

Merge sort pretty much forces you to create at least two different functions:

  • A zipping-up function, which merges ordered arrays, lists or vectors into bigger ordered listy things
  • An unzipping function, which splits the initial lists until they are small enough that ordering them is a trivial operation

Generally, either the zipping-up function or the unzipping function can also take care of ordering the tiny little lists.

For the tiny little lists, we either order them at length 2 by a simple conditional check, reversing the list if they’re “out of order”, at length 7 by doing an insertion sort (I’ve chosen this length because it’s the same length Java apparently use in the default Java sort function (source: Wikipedia) or at length 1 by simply having the zipping-up function handle them.

Length 1 is the easiest for me, so I go with that. I’d probably go with the length 7 option if I was actually building a sort function of my own for god-knows-what reason.

Let’s build our zippy function then – it’s going to be really simple. As before, comments are marked out with a semi-colon.

(define (merge-lists xs ys)
  (cond
    [(null? xs) ys]
;if xs is empty, return ys [(null? ys) xs]
;if ys is empty, return xs [(< (car xs) (car ys))
;if the head of list "xs" is bigger than head of list "ys" (cons (car xs) (merge-lists (cdr xs) ys))]
;cons head xs to (recurse) [#t ;I use true as an else cond here - possibly bad style? (cons (car ys) (merge-lists xs (cdr ys)))]))
;cons head ys to (recurse)

Sure, it only sorts ordered lists – but we’re gonna make sure these are ordered in the sort function.

(define (merge-sort xs)
  (cond
    [(or (null? xs) (null? (cdr xs))) xs]
    [(null? (cddr xs))
     (merge-lists (list (car xs)) (cdr xs))]
    [#t
     (let ([x (ceiling (/ (length xs) 2))])
       (merge-lists (merge-sort (take xs x))
                    (merge-sort (drop xs x))))]))

Wow, we’re done. That was SO easy. I love Racket.

All we do is check if the list is empty (null?) or has a length of 1 (if its tail is null? it must have a length of 1). If either of these is the case, we return the list.

The next bit uses (cddr), which checks if the tail of the tail is null. Similarly, (cdddr) would return the tail of the tail of the tail of the list, or (cadr) would return the head of the tail of the list – it’s simply a matter of combining (car) – head – and (cdr) – tail – in the proportions you desire. If this returns true, we merge-lists on a list of the head of list and tial of list.

But why do we specifically turn (car xs) into a list, and keep (cdr xs) as-is?

Well, that’s due to how linked lists work.

A pair in Racket, and other “tuples” in other languages, are constructed by (cons)ing values together. So (cons 1 2) results in a pair ‘(1 2). A linked list can be constructed by (cons)ing another value to this pair (e.g. (cons 1 (cons 2 3))).

However, a list is only officially a list in Racket if it ends in a null value, for many reasons – we’ve seen some of them here! When we find where the null value is, we know to end the recursive function for sure. So (cons 1 (cons 2 3)) – not a list. (cons 1 (cons 2 null)) – a list.

This affects us here because the head of our list cannot end in null – it is by definition a single value. Although the tail looks like a single value, ‘(x), it’s actually a list ‘(cons x null). So the head goes in a brand new list, while the tail remains as it is.

Finally, we define a local variable equal to half the list’s length (rounded up with the (ceiling) function) and we call (merge-list) on the first half and second half of our list, divided up using the (take) and (drop) functions from the Racket library. They do more-or-less what you’d expect – take the list up to position x, or drop the list up to position x. We keep merging and merging our merge-sort function until we finally end up with a 2-length list, at which point we order and merge them – and merge this with the other list that has been merged – and so on, all the way back until the function returns.

It’s nice and fast, we separate our concerns prettily, and it’s not even as complex as quicksort – merge sort is awesome.

As always, if you’ve seen something egregious in this post, please correct me. I make these posts primarily to learn, so anything you can let me know is gratefully accepted!

Advertisements
Tagged , , , ,

2 thoughts on “Merge Sort In Racket

  1. Rosario Pablo says:

    Hello
    I have a problem in my code I can not solve it.
    when entering a list that divides and remains in odd number throws error
    (define l2 (cons 4 (cons 2 (cons 8 (cons 1 ( cons 17 empty))))))

    #lang racket
    (define l ( cons 1 (cons 7 (cons 10 (cons 5 (cons 6 ( cons 2 ( cons 9 ( cons 2 empty)))))))))
    (define l2 (cons 4 (cons 2 (cons 8 (cons 1 ( cons 17 empty))))))
    (define l3 (cons 5 (cons 2 (cons 4 ( cons 26 empty)))))

    (define l6 (cons 6 (cons 16(cons 8 ( cons 10 empty)))))
    (define l5 ‘())

    (define (subListaR l n)
    (if( or (empty? l) (= n 0))
    ‘()
    (cons (car l) (subListaR (cdr l) (- n 1)))
    )
    )

    (define (subLista l n)
    (if ( and (list? l) ( >= n 0))
    (subListaR l n)
    ‘error
    ))
    (define (invierte l s )
    (if( empty? (cdr l))
    (cons (car l) s)
    (invierte (cdr l) (cons (car l) s ))))

    (define (mergesort l)
    (if (> (tam l 0) 1)
    (combina
    (mergesort (subListaR l (/ (tam l 0) 2)))
    (mergesort (subListaF l ‘() (ceiling ( / (tam l 0) 2.0)))))
    l
    )
    )

    (define (subListaF l s n)
    ( invierte ( subLista (invierte l s) n) s )

    )

    (define (combina l1 l2)
    (if(empty? l1)
    l2
    (if (empty? l2)
    l1
    (if ( = (car l1) (car l2) )
    (cons (car l1) (combina(cdr l1) l2))
    (if( < (car l1) (car l2))
    (cons (car l1) (combina(cdr l1) l2 ))
    (cons (car l2) (combina l1 (cdr l2)))
    )
    )
    )))

    (define (tam l n)

    (if (empty? l)
    n
    (tam (cdr l)(+ n 1))
    ) )

    could you help me solve the error

    • Good question, it’s been ages since I’ve touched Racket so this is a bit of an exercise for me. I think it is that in your mergesort function, you do not handle the case when there are exactly two elements left. There may be better ways of doing this, but I always start debugging by wrapping the function I’m calling in something like a (begin (print l) (mergesort l) ), which enables me to see exactly where and how things are going wrong. In this particular case I noticed that subListaF was never being printed or invoked, so something is wrong there. Also, you’ve written a lot of functions that duplicate what Racket already does – I’d recommend checking what subListaR and subListaF do against what I’ve written here. Change your mergesort function to:

      (define (mergesort l)
      (if (> (tam l 0) 1)
      (combina
      (mergesort (take l (ceiling (/ (tam l 0) 2))) )
      (mergesort (drop l (ceiling ( / (tam l 0) 2)))))
      l
      )
      )

      And then it works for me. I would honestly use (take) and (drop) over your sublist functions if I were you unless you absolutely have to write your own take or drop functions as homework. If that is the case, you can have a look at the docs to see how take and drop work and implement something that works exactly the same: https://docs.racket-lang.org/reference/pairs.html?q=take#%28def._%28%28lib._racket%2Flist..rkt%29._take%29%29.

      It is also important that you use (ceiling) in both left and right subListas, otherwise you will end up repeatedly running on a sublist of the same length, which I think was the original error you faced.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Advertisements
%d bloggers like this: