Skip to content

Instantly share code, notes, and snippets.

@JogoShugh
Last active June 26, 2024 15:51
Show Gist options
  • Save JogoShugh/d2e8ef29cbc0e2a23242987ec1c55c04 to your computer and use it in GitHub Desktop.
Save JogoShugh/d2e8ef29cbc0e2a23242987ec1c55c04 to your computer and use it in GitHub Desktop.
Grokking Algorithms 2nd Ed Notes

The book provides this first implementation, which as written traverses each element past i two times per recursion:

def qsort(arr):
  if len(arr) < 2:
    return arr
  pivot = arr[0]
  less = [i for i in arr[1:] if i <= pivot]
  # NOTE: this traverses all N elements twice in the list.    
  greater = [i for i in arr[1:] if i > pivot]  
  return qsort(less) + [pivot] + qsort(greater)

I wanted to optimize this, to reduce the total number of reads, so I did it the following way and added in some tracking to help visualize the output.

The key difference is here, where it traverses the list items only one time per recursion:

  less = []
  greater = []
  for idx, value in enumerate(arr):
      state["touches"] += 1
      if not idx == random_index: 
        if value <= pivot:
          less.append(value)
        else:
          greater.append(value)

Full listing:

# Use a random array index, as per the book's ultimate advice for reducing the number of reads

import random

def printp(pad, message):
  padding = " " * pad
  print(f"{padding}{message}")

def qsort(arr, parentRecursion=0, state=None):
  if state is None:
    state = {'recursion':0, 'touches':0, 'initialPivotIndex': None}
  print("\n")
  recursion = state['recursion'] + 1
  state['recursion'] = recursion

  if (recursion > 1):
    printp(parentRecursion, f"Recursion: parent # {parentRecursion}; total #: {recursion}")
  else:
    printp(parentRecursion, f"Entering function for first time! (parentRecursion = {parentRecursion})")

  arr_len = len(arr)
  # This is obvious. If the length is 1 or 0, it's already sorted. Nothing to do
  if arr_len < 2:
    printp(parentRecursion, f"Found arr of 1 or fewer elements: {arr}")
    return (arr, state)
  else:
    printp(parentRecursion, f"Found arr with > 1 elements: {arr}")

  random_index = random.randint(0, arr_len - 1) 

  # Just take the 0th item in the list in this implementation, since we know 
  # there are at least indices 0 and 1 available at this point
  pivot = arr[random_index]
  if state['initialPivotIndex'] == None:
    state['initialPivotIndex'] = random_index
  printp(parentRecursion, f"the pivot arr[{random_index}]: {pivot}")
  # Count the touch here
  state['touches'] += 1

  # Pot the the i values of the entire array in appropriate division
  less = []
  greater = []
  for idx, value in enumerate(arr):
      state["touches"] += 1
      if not idx == random_index: 
        if value <= pivot:
          less.append(value)
        else:
          greater.append(value)
  
  printp(parentRecursion, f"touches after array division: {state['touches']}")

  # recursively call qsort, and use array concatenation to combine to final sorted list
  printp(parentRecursion, f"{less} + [{pivot}] + {greater} (less + pivot + greater)")
  return (qsort(less, recursion, state)[0] + [pivot] + qsort(greater, recursion, state)[0], state)
  
result = qsort([10, 20, 30, 40, 50, 60, 70, 80]) 

print(f"Final result: {result}")  

A few sample runs, redacted after the first one:

pivot = 6 => 9 recursions

Entering function for first time! (parentRecursion = 0)
Found arr with > 1 elements: [10, 20, 30, 40, 50, 60, 70, 80]
the pivot arr[6]: 70
touches after array division: 8
[10, 20, 30, 40, 50, 60] + [70] + [80] (less + pivot + greater)


 Recursion: parent # 1; total #: 2
 Found arr with > 1 elements: [10, 20, 30, 40, 50, 60]
 the pivot arr[4]: 50
 touches after array division: 14
 [10, 20, 30, 40] + [50] + [60] (less + pivot + greater)


  Recursion: parent # 2; total #: 3
  Found arr with > 1 elements: [10, 20, 30, 40]
  the pivot arr[2]: 30
  touches after array division: 18
  [10, 20] + [30] + [40] (less + pivot + greater)


   Recursion: parent # 3; total #: 4
   Found arr with > 1 elements: [10, 20]
   the pivot arr[1]: 20
   touches after array division: 20
   [10] + [20] + [] (less + pivot + greater)


    Recursion: parent # 4; total #: 5
    Found arr of 1 or fewer elements: [10]


    Recursion: parent # 4; total #: 6
    Found arr of 1 or fewer elements: []


   Recursion: parent # 3; total #: 7
   Found arr of 1 or fewer elements: [40]


  Recursion: parent # 2; total #: 8
  Found arr of 1 or fewer elements: [60]


 Recursion: parent # 1; total #: 9
 Found arr of 1 or fewer elements: [80]
Final result: [10, 20, 30, 40, 50, 60, 70, 80]

More results, just showing the initial pivot and total number of recurions and touches based on that initial pivot. Keep in mind that since we are only tracking the initial pivot, there will still be variation in the total touches count. However, the number of recursions should always stay the same, I think. (TODO: verify)

{'initialPivotIndex': 0, 'recursion': 11, 'touches': 30})
{'initialPivotIndex': 1, 'recursion': 11, 'touches': 26})
{'initialPivotIndex': 2, 'recursion': 11, 'touches': 25})
{'initialPivotIndex': 3, 'recursion': 11, 'touches': 24})
{'initialPivotIndex': 4, 'recursion': 11, 'touches': 24})
{'initialPivotIndex': 5, 'recursion': 13, 'touches': 30})
{'initialPivotIndex': 6, 'recursion': 11, 'touches': 28})
{'initialPivotIndex': 7, 'recursion': 11, 'touches': 29})

Notes

  • In the output notice the parent recursion is listed followed by the total recursion count.
    • The sample uses None as the default value for state because default parameters in Python functions are essentially "static". We want to avoid shared state between top-level invocations.
  • I idented by the parent count to get a clear sense of how the control flow occurs in relation to the total number.
    • This allows you to see how the less array in the final, recursive functions statement, return qsort(less, recursion, state) + [pivot] + qsort(greater, recursion, state) gets broken down all the way before we see any logging of what happens to the greater array.
    • For example in the case of pivot = 6 we can just 9 recursions, examine the relationship of each recursion to its parent recursion number. The easiest relationship to see is the "deepest" one, which shows a maximum depth of 4, where both 5 and 6 were children of the 4th recursion:
    0
    |
    1 => 0
      |
    * 2 => 1
        |
    **  3 => 2
          |
    ***   4 => 3
            |
    ****    5 => 4
            |
    ****    6 => 4
          |
    ***   7 => 3
        |
    **  8 => 2
      |
    * 9 => 1
    
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment