Skip to content

Instantly share code, notes, and snippets.

@alyssaq
Last active December 23, 2015 14:19
Show Gist options
  • Save alyssaq/6648100 to your computer and use it in GitHub Desktop.
Save alyssaq/6648100 to your computer and use it in GitHub Desktop.
Sorting and Searching

#Sorting and Searching Sorting is the basic building block of many algorithms. Interesting design of algorithms appear in the context of sorting: divide-and-conquer, data structures, and randomized algorithms.

Common Sorts

Merge sort:

  • Advantages: suitable for linked list, suitable for external sort.
  • Disadvantages: need extra buffer holding the merged data.

Insertion/Selection sort:

  • Advantages: easy to implement.
  • Disadvantages: too slow and become impractical when data is huge.

Heap sort:

  • Advantages: don't need recursion. Suitable for large data.
  • Disadvantages: usually slower than merge sort and quick sort.

Quick sort:

  • Advantages: practical fastest.
  • Disadvantages: recursive, worst case too slow.

Applications of Sorting

  • Searching: If keys are all sorted, binary search in a dictionary is O(log n)
  • Closest pair: How do you find the pair of numbers with the smallest difference between them? Once sorted, closest pair must lie next to each other with smallest gap. Thus, a linear-time scan through a sorted array is a total O(n log n)
  • Element uniqueness: Are there duplicates in a set of n items? Similar to closest pair with zero gap. Sort the numbers then a linear scan to check all adjacent pairs O(n log n). Alternatively, we could just build a hashtable O(n) and a duplicate occurs when a key already exists.
  • Frequency distribution: Given n items, which element k occurs the largest number of times in the set? Sort, look-up k then find the first element that is not k. O(log n + c)
  • Selection: What is the kth largest item? Once sorted, the kth largest can be found in constant time.
  • Convex Hulls: What is the polygon of smallest area that contains a given set of n points in (x,y) plane. Sort by x-coordinates O(n log n). We know the boundary is the first and last element. Traverse the remaining elements and identify whether they belong in the null O(n). So total of O(n log n)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment