Skip to content

Instantly share code, notes, and snippets.

@xxsang
Created June 7, 2019 02:04
Show Gist options
  • Save xxsang/cfb51b5fd2e1830e503eb72661a7a082 to your computer and use it in GitHub Desktop.
Save xxsang/cfb51b5fd2e1830e503eb72661a7a082 to your computer and use it in GitHub Desktop.
Sorting
# Sorting -- 3 Jun 2019
1. Callbacks
1. To sort any type of data
2. client passes array of objects to sort() function
3. The sort() function calls back object’s compareTo() method as needed.
2. Selection Sort
1. Find the smallest remaining entry and swap a[i] and a[min]
2. N^2/2 compares and N exchanges
3. Quadratic time even if input is sorted
3. Insertion Sort
1. In iteration i, swap a[i] with each larger entry to its left
2. ~1/4N^2 compares, ~1/4N^2 exchanges on average
3. Best case, N-1 compares and 0 exchanges (quicker than selection sort) (already sorted)
4. Worst case, ~1/2N^2 compares and ~1/2N^2 exchanges (slower than selection sort) (reverse order)
5. Partially-sorted arrays (an inversion is a pair of keys that are out of order) if the number of inversions is linear
6. Insertion sort runs in linear time for partially-sorted arrays, the number of exchanges equals number of inversions
4. Shellsort
1. move entries more than one position at a time by h-sorting the array
2. Insertion sort with stride length h
3. h should be 3x+1, or 1,5,19,41,109,209,505,929,2161,3905,…
4. worstcase O(N^(3/2))
5. average better than that but do not have model
6. advantage: fast, tiny, fixed footprint of code, hardware sort prototype
5. Shuffling
1. Shuffle sort
1. generate a random real number for each array entry
2. sort the array
3. Get the array shuffled.
4. Shuffle sort produces a uniformly random permutation of the input array provided no duplicate values
2. Knuth shuffle
1. in iteration i, pick integer r between 0 and i uniformly at random, swap a[i] and a[r]
2. O(N)
3. Use unbiased shuffle
6. Convex hull
1. The convex hull of a set of N points is the smallest perimeter fence enclosing the points
2. Graham scan
7. Merge Sort
1. divide array into two halves, recursively sort each half and merge the halves (by comparing pairs from two halves)
2. Copy the sorted halves and get back the sorted whole array
3. assert expression[,arguments]
4. sort process: divide-and-conquer (recursion)
5. NlgN compares, 6NlgN array access
6. memory, 2N, not in-place sorting algorithm
7. Improvements
1. cutoff to insertion sort for <7 items
2. stop if already sorted: is biggest item in first half <=smallest item in second half
3. switch the role of the input and auxiliary array in each recursive call (save time not memory, save moving items)
8. Bottom-up mergesort
1. no recursion
2. merge subarrays of size 1, 2, 4, 8, 16, …
9. Complexity
1. Model of computation: decision tree
2. cost model:#compares
3. upper bound: NlgN
4. lower bound: NlgN
5. mergesort is optimal with respect to # compares
6. mergesort is not optimal with respect to space usage
10. Comparators
1. Sort keys
2. Comparator interface, support multiple orderings of a given data type
11. Stability
1. Preserve the relative items with equal keys (sort by the first key and then the second, the relative order still holds)
2. Insertion sort and mergesort are stable
3. mergesort is stable depending on merge function implementation
4. Not selection sort or shellsort, have long exchange of keys
8. Quick Sort
1. Idea
1. Shuffle the array
2. Partition so that for same j, entry a[j] is in place, no larger entry to the left of j, no smaller entry to the right of j
3. Sort each piece recursively
2. Partition
1. random select a partition element
2. move i from left to right and j from right to left, exchange if it does not meet the partition criterion, until the two pointer cross
3. recursively partition the array into half
3. Partition-in-place
4. terminating the loop need to test the pointer cross, stay in bound (i==hi)
5. shuffling is needed for performance guarantee
6. Quicksort (NlgN) but on average quicker than mergesort
7. Best case NlgN, worst case 1/2N^2
8. The average # compares is ~2NlnN, exchanges~1/2NlnN
9. Quadratic time
1. If sorted or reverse sorted
2. has many duplicates
10. Not stable
11. Improvements
1. insertion sort <10 small arrays
2. take the median of the sample to partition the array
9. Selection
1. find the kth largest
2. Quick-select
1. Like quick-sort, random partition and recursively solve it
2. Linear time on average
10. Duplicate keys
1. Mergesort doesn’t matter, always between 1/2NlgN and NlgN compares
2. Quicksort goes quadratic unless partitioning stops on equal keys
1. Problem: put all items eqaul to partitioning item on one side
2. Consequence: ~1/2N^2 compares when all keys are equal
3. 3-way partitioning
1. partition array into 3 parts so that entries between lt and gt equal to partition item v, no larger entries to left of lt, no smaller entries to right of gt
2. Dijkstra 3-way partitioning
1. scan i from left to right, if small exchange a[lt] with a[i], increment both lt and i; if larger exchange a[gt] with a[i], decrement gt; if equal, increment i
4. Lower bound depend on distinct keys
11. System sorts
1. Selection: inplace, N exchanges
2. Insertion, inplace, stable, use for small N or partially ordered
3. Shell, inplace, tight code, subquadratic
4. merge, stable, NlogN guarantee, stable
5. quick, inplace, NlogN probabilistic guarantee, fatest in practice
6. 3-way quick, inplace, improves quick sort in presence of duplicate keys
7. Is there a inplace, stable, guaranteed NlgN sorting algorithm?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment