CS 228 meeting -*- Outline -*-

* quicksort (HR 12.8)

** divide and conquer idea

   Idea is divide and conquer, quite a bit like binary search

	explain in pictures, show video

------------------------------------------
	QUICKSORT IDEA

void Quicksort(int vec[], int loBound,
               int hiBound)
{
  IF 0 or 1 items to sort, return;
  ELSE IF 2 items to sort, swap if needed;
  ELSE {
    partition vec around someSub so that
      vec[loBound..someSub-1] are all
        <= vec[someSub]
      and vec[someSub+1..hiBound] are all
        > vec[someSub];
    Quicksort(vec, loBound, someSub-1);
    Quicksort(vec, someSub+1, hiBound);
  }
}
------------------------------------------

   the value that winds up being vec[someSub] is called the pivot;
   want this to be near the median value, so that it splits the array
   into partitions of roughly equal size,
   then get O(N log N) performance on average.

   Q: What if there is only one element less than the pivot?
	if that happens at every level, it turns out to be
	like selection sort: O(N^2).

   Q: When does the recursion stop?
	when it runs into vectors of length 2 or less.
	Faster not to make the recursive call in that case.

   So with an additional assumption that the vector to be sorted
   has length at least 2, here's a version.

------------------------------------------
	IMPROVED QUICKSORT

// qsort2.C
#include "qsort2.h"
#include "partition.h"
#include "swap.h"

// Algorithm: improved quicksort
// PRE: loBound < hiBound
//      (so at least 2 items to sort)
void Quicksort(int vec[], int loBound,
               int hiBound)
{
  if (hiBound-loBound == 1) {
    // Two items
    if (vec[loBound] > vec[hiBound]) {
      swap(vec[loBound], vec[hiBound]);
    }
    return;
  }
  // ASSERT: 3 or more items, so the
  // precondition of partition holds
  int someSub
    = partition(vec, loBound, hiBound);
  if (loBound < someSub-1) {
    // 2 or more items in 1st subvec
    Quicksort(vec, loBound, someSub-1);
  }
  if (someSub+1 < hiBound) {
    // 2 or more items in 2nd subvec
    Quicksort(vec, someSub+1, hiBound);
  }
}
------------------------------------------

   Q: Assuming someSub is halfway between loBound and hiBound,
      how many times is Quicksort called on input of size N?
	draw picture to estimate it (fig 12.21)
	it's about O(N)

** partitioning (HR p.581-584)

   The partitioning has to split the slice evenly enough that the
   depth of the recursion averages about log N...

   to partition, divide slice into elements <= selected element
   and those > it.

   Describe partitioning as on

------------------------------------------
	PARTITIONING IDEA

// partition.h
extern int partition(int vec[],
                     int loBound,
                     int hiBound);
// PRE: loBound+1 < hiBound
//      (so at least 3 elements present)
//   && loBound..hiBound are
//      legal indexes of vec
// MODIFIES: vec[loBound..hiBound]
// POST: vec[loBound..hiBound] contains
//       the same values as
//       vec[loBound..hiBound] had at entry
//       but each vec[loBound..FCTVAL-1]
//           <= vec[FCTVAL]
//       && vec[FCTVAL] <
//           each vec[FCTVAL+1..hiBound]
------------------------------------------

	draw pictures.

------------------------------------------
// partition.C
#include "partition.h"
#include "swap.h"

int partition(int vec[], int loBound,
              int hiBound)
{
  // ASSERT: there are 3 or more items
  // Take the middle element as pivot
  // and park it in vec[loBound]
  int pivot = vec[(loBound+hiBound)/2];
  vec[(loBound+hiBound)/2] = vec[loBound];
  vec[loBound] = pivot;
  int loSwap = loBound + 1;
  int hiSwap = hiBound;
  do {
    while (loSwap <= hiSwap
           && vec[loSwap] <= pivot) {
      loSwap++;
    }
    // ASSERT: loSwap <= hiSwap+1
    //  && all vec[loBound+1..loSwap-1]
    //     are <= pivot
    //  && (loSwap < hiSwap)
    //     --> vec[loSwap] > pivot
    while (vec[hiSwap] > pivot) {
      hiSwap--;
    }
    // ASSERT:
    //
    if (loSwap < hiSwap) {
      swap(vec[loSwap], vec[hiSwap]);
    }
    // INV: vec[loBound..loSwap-1] <= pivot
    //  && vec[hiSwap+1..hiBound] > pivot
    //  && (loSwap < hiSwap) -->
    //     vec[loSwap] <= pivot < vec[hiSwap]
    //  && (loSwap >= hiSwap) -->
    //     vec[hiSwap] <= pivot
    //  && loBound <= loSwap <= hiSwap+1
    //  && hiSwap+1 <= hiBound+1
  } while (loSwap < hiSwap);
  // Move the pivot from vec[loBound]
  // into its final place, vec[hiSwap]
  vec[loBound] = vec[hiSwap];
  vec[hiSwap] = pivot;
  return hiSwap;
}
------------------------------------------

   Fill in the critical assertion.

    // ASSERT: hiSwap >= loSwap-1
    //  && All vec[hiSwap+1..hiBound]
    //     are > pivot
    //  && vec[hiSwap] <= pivot

   Talk about why the last swap is used
   (it moves the pivot, parked in vec[loBound] during the loop,
   into its final position, vec[hiSwap]).
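
   A way to check the POST of partition by machine instead of by
   picture. This is only a sketch: partition is copied from
   partition.C in these notes, with std::swap standing in for swap.h,
   and partitionPostHolds is a hypothetical helper that is not in HR
   or in the course files.

```cpp
#include <algorithm>  // std::is_permutation
#include <cassert>
#include <utility>    // std::swap
#include <vector>

// Copy of partition from partition.C; std::swap replaces swap.h
// (an assumption made so the sketch compiles on its own).
// PRE: loBound+1 < hiBound, so at least 3 items.
int partition(int vec[], int loBound, int hiBound)
{
  int pivot = vec[(loBound+hiBound)/2];
  vec[(loBound+hiBound)/2] = vec[loBound];
  vec[loBound] = pivot;
  int loSwap = loBound + 1;
  int hiSwap = hiBound;
  do {
    while (loSwap <= hiSwap && vec[loSwap] <= pivot) {
      loSwap++;
    }
    while (vec[hiSwap] > pivot) {
      hiSwap--;
    }
    if (loSwap < hiSwap) {
      std::swap(vec[loSwap], vec[hiSwap]);
    }
  } while (loSwap < hiSwap);
  vec[loBound] = vec[hiSwap];
  vec[hiSwap] = pivot;
  return hiSwap;
}

// Hypothetical checker: partitions a copy of v (which must have
// 3 or more items) and tests the postcondition:
//   each v[lo..p-1] <= v[p] < each v[p+1..hi],
//   and v is a rearrangement of the values it had at entry.
bool partitionPostHolds(std::vector<int> v)
{
  std::vector<int> atEntry = v;
  int lo = 0;
  int hi = static_cast<int>(v.size()) - 1;
  int p = partition(v.data(), lo, hi);
  if (p < lo || p > hi) { return false; }
  for (int i = lo; i < p; i++) {
    if (v[i] > v[p]) { return false; }
  }
  for (int i = p + 1; i <= hi; i++) {
    if (v[i] <= v[p]) { return false; }
  }
  return std::is_permutation(v.begin(), v.end(), atEntry.begin());
}
```

   Tracing by hand: on {4, 9, 2, 5, 7, 1, 8} the pivot is 5 (the
   middle element), partition returns 3, and the array ends up as
   {4, 1, 2, 5, 7, 9, 8}.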

   Might mention that one can eliminate the overhead of the
   function call to partition by writing it as an inline procedure,
   or by writing it in place... (see book)

** analysis of quicksort (HR p. 587ff)

   The best case and average times are O(N log N)

   Worst case is O(N^2), but it occurs only rarely
   (e.g., already sorted input in qsort1)

   so the important thing to remember is the average case time
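
   To make the gap between worst and average case concrete, could
   count comparisons. A sketch under assumptions: qsort1 is not shown
   in these notes, so countingQuicksort below is a hypothetical
   stand-in that always takes the FIRST element as pivot (which is
   what makes sorted input bad), and shuffledInput is a hypothetical
   helper for a repeatable "typical" input. Neither is from HR.

```cpp
#include <cassert>
#include <random>   // std::mt19937
#include <utility>  // std::swap
#include <vector>

// Hypothetical instrumented quicksort (assumption: first element
// is the pivot). Sorts v[lo..hi] and returns the number of
// element comparisons made.
long countingQuicksort(std::vector<int>& v, int lo, int hi)
{
  if (lo >= hi) { return 0; }          // 0 or 1 items
  int pivot = v[lo];
  int last = lo;                       // v[lo+1..last] < pivot
  long comparisons = 0;
  for (int i = lo + 1; i <= hi; i++) {
    comparisons++;
    if (v[i] < pivot) {
      last++;
      std::swap(v[last], v[i]);
    }
  }
  std::swap(v[lo], v[last]);           // pivot to its final place
  comparisons += countingQuicksort(v, lo, last - 1);
  comparisons += countingQuicksort(v, last + 1, hi);
  return comparisons;
}

// Hypothetical helper: the values 0..n-1 in a shuffled order that
// depends only on seed (Fisher-Yates shuffle).
std::vector<int> shuffledInput(int n, unsigned seed)
{
  std::vector<int> v(n);
  for (int i = 0; i < n; i++) { v[i] = i; }
  std::mt19937 rng(seed);
  for (int i = n - 1; i > 0; i--) {
    int j = static_cast<int>(rng() % static_cast<unsigned>(i + 1));
    std::swap(v[i], v[j]);
  }
  return v;
}
```

   With N = 1000, already sorted input costs exactly
   N(N-1)/2 = 499500 comparisons in this sketch, since every
   partition peels off just the pivot. A shuffled input should cost
   on the order of N log2 N, which is about 10,000 here; the exact
   count depends on the shuffle.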