NOTE:

These are a little sketchy...I'll fill in details 'soon'.

Putting It All In Order

There comes a time in every programmer's career when s/he must place items of data into order. Sometimes this is ascending (smallest to largest). Sometimes it is descending (largest to smallest). But whichever way it goes, it must.

One of the earliest sorts was known as bubble sort. Followed by selection sort and insertion sort. After these quick and dirty algorithms came a period of reflection and study. At which time arrived on the scene much more powerful sorts of sorting algorithms: quick, heap, and merge. Let's look at each in turn.

Bubble Sort

This sort is named after the way soda bubbles rise to the top of a glass -- just sort of meandering their way from the base to the rim:

                   |___________|
                   |       ,   |
                   |     .     |
                   |       *   |
                   |         o |
                   |        0  |
                   |          O|
                   +-----------+

(Okay, so I'm not an artist. *sigh*)

The idea of the algorithm, however, is that data items will migrate their way toward their proper (ordered) position within the array -- slowly but surely. This is accomplished by comparing adjacent pairs of data within the array and swapping them (if necessary) to be in the correct (desired) order. If we were trying to sort the following array into descending order, for instance, watch the steps unfold (and watch how the 'a' element bubbles its way toward the opposite end of the array):

    +-----+-----+-----+-----+-----+-----+
    |  a  |  d  |  c  |  q  |  b  |  e  |      original
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  d  |  a  |  c  |  q  |  b  |  e  |      compare/swap 0 and 1
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  d  |  c  |  a  |  q  |  b  |  e  |      compare/swap 1 and 2
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  d  |  c  |  q  |  a  |  b  |  e  |      compare/swap 2 and 3
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  d  |  c  |  q  |  b  |  a  |  e  |      compare/swap 3 and 4
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  d  |  c  |  q  |  b  |  e  |  a  |      compare/swap 4 and 5
    +-----+-----+-----+-----+-----+-----+

This completes phase one. Next, we begin again at position 0 and move anything else that needs adjusting:

    +-----+-----+-----+-----+-----+-----+
    |  d  |  c  |  q  |  b  |  e  |  a  |      'original'
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  d  |  c  |  q  |  b  |  e  |  a  |      compare      0 and 1
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  d  |  q  |  c  |  b  |  e  |  a  |      compare/swap 1 and 2
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  d  |  q  |  c  |  b  |  e  |  a  |      compare      2 and 3
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  d  |  q  |  c  |  e  |  b  |  a  |      compare/swap 3 and 4
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  d  |  q  |  c  |  e  |  b  |  a  |      compare      4 and 5
    +-----+-----+-----+-----+-----+-----+

Note how this time we only had to swap two pairs of elements! Things seem to be looking up! (Some may ask, did we really have to compare those elements that didn't need swapping? We didn't, but the computer does. It doesn't know until it has checked whether two items are in order. The computer doesn't have a more global vision of the array. It sees just a single element at a time -- or two when comparing.)

But, our job isn't done yet! We have to keep going until all the elements are in order. Pass three:

    +-----+-----+-----+-----+-----+-----+
    |  d  |  q  |  c  |  e  |  b  |  a  |      'original'
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  d  |  c  |  e  |  b  |  a  |      compare/swap 0 and 1
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  d  |  c  |  e  |  b  |  a  |      compare      1 and 2
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  d  |  e  |  c  |  b  |  a  |      compare/swap 2 and 3
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  d  |  e  |  c  |  b  |  a  |      compare      3 and 4
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  d  |  e  |  c  |  b  |  a  |      compare      4 and 5
    +-----+-----+-----+-----+-----+-----+

Again, just two swaps, but 5 comparisons needed to determine if swaps were necessary. Pass four:

    +-----+-----+-----+-----+-----+-----+
    |  q  |  d  |  e  |  c  |  b  |  a  |      'original'
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  d  |  e  |  c  |  b  |  a  |      compare      0 and 1
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  e  |  d  |  c  |  b  |  a  |      compare/swap 1 and 2
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  e  |  d  |  c  |  b  |  a  |      compare      2 and 3
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  e  |  d  |  c  |  b  |  a  |      compare      3 and 4
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  e  |  d  |  c  |  b  |  a  |      compare      4 and 5
    +-----+-----+-----+-----+-----+-----+

There, done, right? Nope. The computer doesn't see the whole picture. It only knows that on this last pass, it swapped two things. Therefore, things were still out of order. It must take another stab at it. Pass five:

    +-----+-----+-----+-----+-----+-----+
    |  q  |  e  |  d  |  c  |  b  |  a  |      'original'
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  e  |  d  |  c  |  b  |  a  |      compare      0 and 1
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  e  |  d  |  c  |  b  |  a  |      compare      1 and 2
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  e  |  d  |  c  |  b  |  a  |      compare      2 and 3
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  e  |  d  |  c  |  b  |  a  |      compare      3 and 4
    +-----+-----+-----+-----+-----+-----+

    +-----+-----+-----+-----+-----+-----+
    |  q  |  e  |  d  |  c  |  b  |  a  |      compare      4 and 5
    +-----+-----+-----+-----+-----+-----+

Finally! The computer, having made it through an entire pass without a single swap, feels that it is safe to stop. *whew*

This, fairly obviously, takes quite a bit of time and effort. Let's look at the performance table:
Number of ElementsAverage Comparisons to Ordered
864
644096
10241048576
409616777216
10485761099511627776

Yeow! That's a lot of comparisons! In fact, if you have n items of data, it will take about n2 (that's n-squared) comparisons to place the items in order. After watching that work for a while, we decided there had to be a better way! Although the number of comparisons seemed as small as possible (okay, we were naive), we thought we could at least get rid of some of those swaps. Thus was born:

Selection Sort

Since our goal with selection sort was to reduce the number of swaps performed during the sort, the target seemed clear. To place 6 things in order, we should, at worst, have to move 6 things. So, we decided that we should first determine what piece to move into position and then move it. So, if we are looking to do an ascending sort, we'd be looking for the smallest piece to place first. Let's try with our previous array:

    +-----+-----+-----+-----+-----+-----+
    |  a  |  d  |  c  |  q  |  b  |  e  |      original
    +-----+-----+-----+-----+-----+-----+

First we try to find the smallest. (Again, it seems obvious to us because we have a global view. The computer can see at most two elements at a time and doesn't realize that 'a' really is the smallest until it has compared all of them.)

    position |  at this position  |  smallest so far
   ----------+--------------------+------------------
       0     |        'a'         |         'a'
       1     |        'd'         |         'a'
       2     |        'c'         |         'a'
       3     |        'q'         |         'a'
       4     |        'b'         |         'a'
       5     |        'e'         |         'a'

Now, we swap the 'a' with the first position. Um...better remember where it was instead of what it was, eh? *smile*

    position |  at this position  |  position of smallest so far
   ----------+--------------------+------------------------------
       0     |        'a'         |         0
       1     |        'd'         |         0
       2     |        'c'         |         0
       3     |        'q'         |         0
       4     |        'b'         |         0
       5     |        'e'         |         0

    +-----+-----+-----+-----+-----+-----+
    |  a  |  d  |  c  |  q  |  b  |  e  |      swap 0 and 0
    +-----+-----+-----+-----+-----+-----+

Okay, so this first one was a little redundant. Again, the computer doesn't realize that. Let's look at finding the next smallest, though. A novice might start again at position 0. However, this won't work out since we just placed the smallest piece of data in that position! Because of that, we can now eliminate the first position from our search. Let's start at 1 and work our way down:

    +-----+-----+-----+-----+-----+-----+
    |  a  |  d  |  c  |  q  |  b  |  e  |      'original'
    +-----+-----+-----+-----+-----+-----+

    position |  at this position  |  position of smallest so far
   ----------+--------------------+------------------------------
       1     |        'd'         |         1
       2     |        'c'         |         2
       3     |        'q'         |         2
       4     |        'b'         |         4
       5     |        'e'         |         4

    +-----+-----+-----+-----+-----+-----+
    |  a  |  b  |  c  |  q  |  d  |  e  |      swap 1 and 4
    +-----+-----+-----+-----+-----+-----+

Nifty! Now, we can eliminate positions 0 and 1 from our search for the third smallest item:

    +-----+-----+-----+-----+-----+-----+
    |  a  |  b  |  c  |  q  |  d  |  e  |      'original'
    +-----+-----+-----+-----+-----+-----+

    position |  at this position  |  position of smallest so far
   ----------+--------------------+------------------------------
       2     |        'c'         |         2
       3     |        'q'         |         2
       4     |        'd'         |         2
       5     |        'e'         |         2

    +-----+-----+-----+-----+-----+-----+
    |  a  |  b  |  c  |  q  |  d  |  e  |      swap 2 and 2
    +-----+-----+-----+-----+-----+-----+

*sigh* Once again our computer's narrow vision makes a silly move. *shrug* Oh, well. Let's find the fourth smallest (now not having to look through 0, 1, or 2):

    +-----+-----+-----+-----+-----+-----+
    |  a  |  b  |  c  |  q  |  d  |  e  |      'original'
    +-----+-----+-----+-----+-----+-----+

    position |  at this position  |  position of smallest so far
   ----------+--------------------+------------------------------
       3     |        'q'         |         3
       4     |        'd'         |         4
       5     |        'e'         |         4

    +-----+-----+-----+-----+-----+-----+
    |  a  |  b  |  c  |  d  |  q  |  e  |      swap 3 and 4
    +-----+-----+-----+-----+-----+-----+

And the fifth smallest (searching only positions 4 and 5 now):

    +-----+-----+-----+-----+-----+-----+
    |  a  |  b  |  c  |  d  |  q  |  e  |      'original'
    +-----+-----+-----+-----+-----+-----+

    position |  at this position  |  position of smallest so far
   ----------+--------------------+------------------------------
       4     |        'q'         |         4
       5     |        'e'         |         5

    +-----+-----+-----+-----+-----+-----+
    |  a  |  b  |  c  |  d  |  e  |  q  |      swap 4 and 5
    +-----+-----+-----+-----+-----+-----+

And...we're done! Why don't we go through it again? Since we were swapping pairs, we actually only needed 5 swaps to place 6 items in order. (To do n, we'd only need n-1 swaps.) So, has this really helped? Actually, yes and no. We have reduced the number of swaps (a slow thing to do, depending on the size of the data). We even reduced the absolute number of comparisons (a good side-effect!). However, analysis shows that the number of comparisons is still in the range of n-squared. (Take out your graphing calculator and graph x^2 and x^2-4x+3. Make the window [0,4000,0,20000]. Not much difference between the two lines, eh? That's the difference we just made in the number of comparisons. Well, not exactly, but close. To see the reduction we made in the number of swaps, leave x^2 and replace the second function with x-1. Leave the window the same. Now that's a significant decrease!)

Insertion Sort

Still trying to reduce the number of comparisons, someone came up with insertion sort. Bubble sort tried to work with the whole array (so to speak). Selection sort eliminated one element at a time from consideration. Insertion sort makes another approach. It starts by viewing the array as just the first element and says, "There, that's sorted!" Then it brings another element into consideration, finds where it should go, places it, and says, "There, those are sorted!". And so on. An example:

    Pass 1:
       |-view-|
        +-----+-----+-----+-----+-----+-----+
        |  a  |  d  |  c  |  q  |  b  |  e  |      original
        +-----+-----+-----+-----+-----+-----+
        done
       |-view-|
        +-----+-----+-----+-----+-----+-----+
        |  a  |  d  |  c  |  q  |  b  |  e  |      sorted (within view)
        +-----+-----+-----+-----+-----+-----+

    Pass 2:
       |----view----|
        +-----+-----+-----+-----+-----+-----+
        |  a  |  d  |  c  |  q  |  b  |  e  |      'original'
        +-----+-----+-----+-----+-----+-----+
        new item:  'd'
        before 'a'?  no
        done
       |----view----|
        +-----+-----+-----+-----+-----+-----+
        |  a  |  d  |  c  |  q  |  b  |  e  |      sorted (within view)
        +-----+-----+-----+-----+-----+-----+

    Pass 3:
       |-------view-------|
        +-----+-----+-----+-----+-----+-----+
        |  a  |  d  |  c  |  q  |  b  |  e  |      'original'
        +-----+-----+-----+-----+-----+-----+
        new item:  'c'
        before 'a'?  no
        before 'd'?  yes
        hold 'c'
        move 'd' down
        place 'c'
        done
       |-------view-------|
        +-----+-----+-----+-----+-----+-----+
        |  a  |  c  |  d  |  q  |  b  |  e  |      sorted (within view)
        +-----+-----+-----+-----+-----+-----+

    Pass 4:
       |----------view----------|
        +-----+-----+-----+-----+-----+-----+
        |  a  |  c  |  d  |  q  |  b  |  e  |      'original'
        +-----+-----+-----+-----+-----+-----+
        new item:  'q'
        before 'a'?  no
        before 'c'?  no
        before 'd'?  no
        done
       |----------view----------|
        +-----+-----+-----+-----+-----+-----+
        |  a  |  c  |  d  |  q  |  b  |  e  |      sorted (within view)
        +-----+-----+-----+-----+-----+-----+

    Pass 5:
       |-------------view-------------|
        +-----+-----+-----+-----+-----+-----+
        |  a  |  c  |  d  |  q  |  b  |  e  |      'original'
        +-----+-----+-----+-----+-----+-----+
        new item:  'b'
        before 'a'?  no
        before 'c'?  yes
        hold 'b'
        move 'q' down
        move 'd' down
        move 'c' down
        place 'b'
        done
       |-------------view-------------|
        +-----+-----+-----+-----+-----+-----+
        |  a  |  b  |  c  |  d  |  q  |  e  |      sorted (within view)
        +-----+-----+-----+-----+-----+-----+

    Pass 6:
       |----------------view----------------|
        +-----+-----+-----+-----+-----+-----+
        |  a  |  b  |  c  |  d  |  q  |  e  |      'original'
        +-----+-----+-----+-----+-----+-----+
        new item:  'e'
        before 'a'?  no
        before 'b'?  no
        before 'c'?  no
        before 'd'?  no
        before 'q'?  yes
        hold 'e'
        move 'q' down
        place 'e'
        done
       |----------------view----------------|
        +-----+-----+-----+-----+-----+-----+
        |  a  |  b  |  c  |  d  |  e  |  q  |      sorted
        +-----+-----+-----+-----+-----+-----+

When sorting an existing array, this doesn't perform very well. It still doesn't in the neighborhood (on the order of) n-squared comparisons. And, the number of movements has grown because sometimes you have to shift many items down to insert the new item in its place! (It is back around (x^2)/2. Check out the graph in the same window as before to see the difference.)

However, when you are trying to build a sorted array as the user enters their data, this algorithm is reasonably handy. After all, that is it's view of the world: one more element, where does it go?

Can't We Go Any Faster?

Why yes, Virginia, we can! So far, we have seen three sorts which have performance characteristics like this:
AlgorithmAverage ComparisonsAverage Swaps
bubble
 2
n
  2
 n
---
 2
selection
 2
n
n-1
insertion
 2
n
  2
 n
---
 2

As the size of the array (n) grows, these algorithms take much more time. There are (at least) three other algorithms, however. These take considerably less time (graphing is left to the interested reader):
AlgorithmAverage Comparisons
quick
n*(log n)
      2
heap
n*(log n)
      2
merge
n*(log n)
      2

Wow! After looking at the graphs for n-log-n versus n-squared, you easily see the improvement. However, given that these three sorts are so much alike, how can we tell which algorithm we should use? When the average performance is equal, we have other properties to consider:
AlgorithmWorst ComparisonsExtra MemoryStable
quick
 2
n
0
yes
heap
n*(log n)
      2
0
no
merge
(array)
n*(log n)
      2
n
yes

Worst performance of heap and merge is still n-log-n. Quick sort can degrade to n-squared performance if care isn't taken.

Quick and heap sort never take any extra memory. When used on arrays, merge sort requires a secondary array for processing -- another n memory locations.

Stability of a sort indicates that, when sorting on multiple attributes, the previous order is preserved. For instance, if you had parallel arrays containing the sales person id, the month, and the sales figures (for that person in that month), sorting first on sales figure and then on id would preserve the order of figures within the data about the same person:

    Original Data:
           id       month       sales
        +-----+    +-----+     +-----+
        | 432 |    |  8  |     |  89 |
        +-----+    +-----+     +-----+
        | 123 |    |  8  |     | 116 |
        +-----+    +-----+     +-----+
        | 555 |    |  8  |     | 203 |
        +-----+    +-----+     +-----+
        | 432 |    |  9  |     | 105 |
        +-----+    +-----+     +-----+
        | 123 |    |  9  |     | 143 |
        +-----+    +-----+     +-----+
        | 555 |    |  9  |     | 187 |
        +-----+    +-----+     +-----+

    Sorted by sales:
           id       month       sales
        +-----+    +-----+     +-----+
        | 432 |    |  8  |     |  89 |
        +-----+    +-----+     +-----+
        | 432 |    |  9  |     | 105 |
        +-----+    +-----+     +-----+
        | 123 |    |  8  |     | 116 |
        +-----+    +-----+     +-----+
        | 123 |    |  9  |     | 143 |
        +-----+    +-----+     +-----+
        | 555 |    |  9  |     | 187 |
        +-----+    +-----+     +-----+
        | 555 |    |  8  |     | 203 |
        +-----+    +-----+     +-----+

    Re-Sorted by id:
           id       month       sales
        +-----+    +-----+     +-----+
        | 123 |    |  8  |     | 116 |
        +-----+    +-----+     +-----+
        | 123 |    |  9  |     | 143 |
        +-----+    +-----+     +-----+
        | 432 |    |  8  |     |  89 |
        +-----+    +-----+     +-----+
        | 432 |    |  9  |     | 105 |
        +-----+    +-----+     +-----+
        | 555 |    |  9  |     | 187 |
        +-----+    +-----+     +-----+
        | 555 |    |  8  |     | 203 |
        +-----+    +-----+     +-----+

Merge sort exhibits this property, but both heap and quick sort completely disregard prior order information.

So, by looking at these properties (worst case, stability, extra memory required), we can decide which of these three sorts to use. Unfortunately for you, you'll have to wait a semester or two before you get to actually use them.

However, you should be able to know their characteristics (average behavior and properties that distinguish them).

Stray Bits