ICOM 4035 – Data Structures
Laboratory 9: Sorting Algorithms
Until now, the fastest way to find a specific piece of data has been to use either a HashTable or a BinarySearchTree. However, these are complex data structures, and sometimes a given application does not require them. For instance, recall the PriorityQueue and the SortedLinkList seen in previous labs. These data structures are linear, and up until now we inserted each new element at its corresponding place inside the collection, thus keeping the collection ordered. This way of sorting is known as insertion sort. In this case insertion is O(n), while removal is O(1).
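As a stand-alone illustration of this idea, here is a minimal insertion sort on an int array (the class and method names below are our own, not part of the lab files): each element is inserted into its correct place among the elements already sorted to its left, just as the SortedLinkList inserted each new element into its correct place in the list.

```java
import java.util.Arrays;

public class InsertionSortDemo {

    // Sorts the array in place. For each position i, the element a[i]
    // is inserted into its correct spot within the already-sorted
    // prefix a[0..i-1], shifting larger elements one slot to the right.
    public static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j]; // shift right
                j--;
            }
            a[j + 1] = key;      // insert in place
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 4, 1, 3};
        insertionSort(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 4, 5]
    }
}
```

Note that each insertion is O(n) in the worst case, matching the cost described above for inserting into an already-sorted linear collection.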
For the PriorityQueue in particular, where we only ever remove the element with the highest priority, we could perform the operations the other way around: insert elements at any position (the first, to make the operation O(1)), and when removing an element, search for the one with the highest priority and remove it (O(n)). This way of sorting is known as selection sort.
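Applied to a plain array, the same idea gives the classic selection sort: repeatedly select the smallest remaining element and move it to the front. This is a sketch with our own naming, not the lab's code:

```java
import java.util.Arrays;

public class SelectionSortDemo {

    // Sorts the array in place. On each pass, the smallest element of
    // the unsorted suffix a[i..] is located (the O(n) "search" step)
    // and swapped into position i (the O(1) "remove" step).
    public static void selectionSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[min]) {
                    min = j;
                }
            }
            int tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] data = {9, 3, 7, 1, 5};
        selectionSort(data);
        System.out.println(Arrays.toString(data)); // [1, 3, 5, 7, 9]
    }
}
```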
Another sorting algorithm, which is easy to implement but has a high computational cost, is bubble sort. This algorithm traverses the collection starting from the first element, swapping adjacent elements that are out of order, until it reaches the end. We then traverse the collection again starting from the second element and perform any necessary swaps. We continue doing this until the whole collection is sorted (O(n²)).
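The repeated passes described above can be sketched as follows (again with illustrative names of our own); after each pass, the largest remaining element has "bubbled" to the end, so each subsequent pass can stop one position earlier:

```java
import java.util.Arrays;

public class BubbleSortDemo {

    // Sorts the array in place. Each pass compares neighbors and swaps
    // those that are out of order; after pass k, the last k elements
    // are already in their final positions. Roughly n passes of up to
    // n comparisons each gives the O(n^2) cost.
    public static void bubbleSort(int[] a) {
        for (int pass = 0; pass < a.length - 1; pass++) {
            for (int j = 0; j < a.length - 1 - pass; j++) {
                if (a[j] > a[j + 1]) {
                    int tmp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = tmp;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {4, 2, 5, 1, 3};
        bubbleSort(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 4, 5]
    }
}
```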
You will now learn two more advanced algorithms: merge sort and quick sort. Both follow the divide-and-conquer principle. In the case of merge sort, we divide the collection into two sub-collections, each of size n/2, where n is the size of the main collection. We keep applying this partition to each sub-collection until every piece contains just one element. Then we start merging the sub-collections back together, repeatedly checking which of the two front elements is smaller and adding it to the main collection.
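A recursive sketch of this split-then-merge process (with our own class and method names, separate from the lab's mergeSort signature, which also has to return a partition count):

```java
import java.util.Arrays;

public class MergeSortDemo {

    // Recursively splits the array into halves of size n/2 until each
    // piece has one element, then merges the sorted halves back up.
    public static int[] mergeSort(int[] a) {
        if (a.length <= 1) {
            return a; // a single element is already sorted
        }
        int mid = a.length / 2;
        int[] left  = mergeSort(Arrays.copyOfRange(a, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
        return merge(left, right);
    }

    // Merges two sorted arrays by repeatedly taking the smaller of the
    // two front elements.
    private static int[] merge(int[] l, int[] r) {
        int[] out = new int[l.length + r.length];
        int i = 0, j = 0, k = 0;
        while (i < l.length && j < r.length) {
            out[k++] = (l[i] <= r[j]) ? l[i++] : r[j++];
        }
        while (i < l.length) out[k++] = l[i++];
        while (j < r.length) out[k++] = r[j++];
        return out;
    }

    public static void main(String[] args) {
        int[] sorted = mergeSort(new int[]{6, 2, 8, 4, 1});
        System.out.println(Arrays.toString(sorted)); // [1, 2, 4, 6, 8]
    }
}
```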
The other algorithm is quick sort. It partitions the collection just like merge sort does, but it performs the work before creating the partitions. The basic step is to select a pivot, typically the last element, although choosing a random pivot is faster in practice (randomized quick sort). After the pivot has been selected, we break the collection into three sub-collections. One of these collections (L) contains the values of the main collection that are smaller than the pivot. Another (E) contains the values that are equal to the pivot. Finally, G is a collection that contains the values that are greater than the pivot. We then apply the same partition to L and G. Once this is finished, the merging process begins: first we add the elements of L, followed by E and then G.
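The three-way L/E/G partition with a random pivot can be sketched like this (illustrative names; the lab's randomQuickSort additionally returns a partition count, which this sketch omits):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class QuickSortDemo {

    private static final Random RNG = new Random();

    // Returns a sorted copy of the input list. A random pivot is
    // chosen, the list is split into L (< pivot), E (== pivot) and
    // G (> pivot), L and G are sorted recursively, and the results
    // are concatenated as L, then E, then G.
    public static List<Integer> quickSort(List<Integer> s) {
        if (s.size() <= 1) {
            return new ArrayList<>(s);
        }
        int pivot = s.get(RNG.nextInt(s.size())); // random pivot
        List<Integer> L = new ArrayList<>();
        List<Integer> E = new ArrayList<>();
        List<Integer> G = new ArrayList<>();
        for (int x : s) {
            if (x < pivot)      L.add(x);
            else if (x == pivot) E.add(x);
            else                 G.add(x);
        }
        List<Integer> out = new ArrayList<>(quickSort(L));
        out.addAll(E);
        out.addAll(quickSort(G));
        return out;
    }

    public static void main(String[] args) {
        List<Integer> in = List.of(3, 1, 2, 3, 0);
        System.out.println(quickSort(in)); // [0, 1, 2, 3, 3]
    }
}
```

Because every element equal to the pivot lands in E, duplicates never need to be partitioned again, which is the advantage of the three-way split.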
There is another sorting algorithm that uses a heap to sort the information. Although it won't be covered in this lab, you are encouraged to see how it works in chapter 8 (PriorityQueues). The main idea is to insert all the elements into a min (or max) heap and then repeatedly remove the root, which yields the elements in sorted order.
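For reference, the same effect can be obtained with Java's built-in java.util.PriorityQueue, which is backed by a min-heap (the class and method names below are our own sketch, not part of the lab files):

```java
import java.util.Arrays;
import java.util.PriorityQueue;

public class HeapSortDemo {

    // Returns a sorted copy of the array: every element is added to a
    // min-heap, then repeatedly polling the heap removes the smallest
    // remaining element, producing ascending order. Each add and poll
    // is O(log n), so the whole sort is O(n log n).
    public static int[] heapSort(int[] a) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int x : a) {
            heap.add(x);
        }
        int[] out = new int[a.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = heap.poll();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
            heapSort(new int[]{4, 1, 3, 2}))); // [1, 2, 3, 4]
    }
}
```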
For this practice, download the files here. You are asked to implement the randomQuickSort and mergeSort methods. You can implement these with either recursion or iteration, your choice. Remember that the int that is returned indicates how many times the algorithm made a change in the collection; refer to the other methods for an example. In the case of mergeSort and quickSort, the value of that int is the number of partitions that were created to sort the collection.