Last active
January 27, 2024 02:58
-
-
Save Per48edjes/a10eb10c98b63a6ecfcf9a7c4849c6d7 to your computer and use it in GitHub Desktop.
Walking myself through Cycle Sort
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"id": "08164e1b-959a-4ba4-900b-b634cb059023", | |
"metadata": {}, | |
"source": [ | |
"# Cycle Sort\n", | |
"\n", | |
"Ravi Dayabhai" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "3fdba7d2-8969-4f0a-874f-d860c1c0fb13", | |
"metadata": {}, | |
"source": [ | |
"## Introduction\n", | |
"\n", | |
"**Cycle sort** is a nifty sorting algorithm that exploits the fact that any permutation (of some sorted collection of elements) can be decomposed into a **unique set of disjoint _cycles_**. (There are broader, more profound mathematical machinations at play here, but we won't touch on symmetric groups or the like here.)\n", | |
"\n", | |
"### Properties\n", | |
"\n", | |
"For certain types of data (i.e., where we know elements are \"supposed to be\" in the array), the procedure yields a **stable**, **in-place** sort in **linear** time!\n", | |
"\n", | |
"Let's first consider what a _cycle_ actually is, using an example." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "340c6926-0fd3-4021-95a3-bd3bf1533d28", | |
"metadata": {}, | |
"source": [ | |
"## Example\n", | |
"\n", | |
"### Cycle & Un-Cycle\n", | |
"\n", | |
"The example below fleshes out the aforementioned concepts with the following toy inputs:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"id": "1597b12d-2a74-40f5-86f7-8585babff514", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"xs = [0, 1, 2, 3, 4, 5]\n", | |
"c = [1, 3, 4]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "90c9b13e-f3d2-40c6-868b-2f7d2cb486a6", | |
"metadata": {}, | |
"source": [ | |
"Notice that we know _ex ante_ where each element is \"supposed to go,\" since the `xs` consist of the integers in the interval $[0, 5]$, i.e., an _item_ (or value) in `xs` is precisely its _index_ position.\n", | |
"\n", | |
"The `cycle` list contains information about *which elements* (identified by their index) should be rotated.\n", | |
"\n", | |
"In the above example: \n", | |
"\n", | |
"- the item at index `1` should take the place of the item at index `3`\n", | |
"- the item at index `3` should take the place of the item at index `4`\n", | |
"- the item at index `4` should take the place of the item at index `1`" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"id": "8eb3a412-f553-4483-bc35-6368946abee3", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def apply_cycle(lst: list[int], cycle: list[int]) -> list[int]:\n", | |
" items = [lst[i] for i in cycle]\n", | |
" # This rotation is for indexing convenience\n", | |
" items = [items[-1]] + items[:-1]\n", | |
" for i, target in enumerate(cycle):\n", | |
" lst[target] = items[i]\n", | |
" return lst" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "769264bf-9ea0-4108-a101-43774f0ca310", | |
"metadata": {}, | |
"source": [ | |
"The above function, in the context of the toy inputs, would perform the following:\n", | |
"\n", | |
"1. Collect the actual `items` (elements) that are involved in the `cycle`: `items = [1, 3, 4]`\n", | |
"2. Does a convenience shift of the `items`: `items = [4,1,3]`\n", | |
"3. Iterates over `cycle` (which, importantly, contains _indices_ of `lst`) to replace `lst`'s item (at the index given by `cycle`) with the item that should take its place (given by `item`)\n", | |
"\n", | |
"For clarity, the following table outlines the values at play to perform the swaps:\n", | |
"\n", | |
"Object | Containing | Representation \n", | |
"--------|------------|--------\n", | |
"`cycle` | indices | `[1, 3, 4]`\n", | |
"`items` | elements | `[4, 1, 3]`\n", | |
"`lst` | elements | `[0, 1, 2, 3, 4]`\n", | |
"\n", | |
"To spell out the first iteration of the `for`-loop in `apply_cycle` using our example:\n", | |
"> `i = 0` and `target = 1`, so at `target` in `lst` (`lst[target]`) put the value that should take its place: `items[i] = 4`." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"id": "d4d201c6-a8db-4ee8-a5dc-1cf5329e0646", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[0, 4, 2, 1, 3, 5]" | |
] | |
}, | |
"execution_count": 3, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"perm1 = apply_cycle(xs, c)\n", | |
"perm1" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "135a8543-d675-4efc-a5a5-ae70f9075355", | |
"metadata": {}, | |
"source": [ | |
"Notice that reversing the cycle and re-applying the `apply_function` function on the permutated list **recovers the original list**!" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"id": "a1ba5511-4cac-4504-86e9-8962dfbbf9bb", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[0, 1, 2, 3, 4, 5]" | |
] | |
}, | |
"execution_count": 4, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"c.reverse()\n", | |
"apply_cycle(perm1, c)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "efbb394c-4264-4ee1-8b3a-6f66ddd6adff", | |
"metadata": {}, | |
"source": [ | |
"## Sorting with Cycles" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "ae86fb6d-7111-4b8d-bf78-4bec7a933855", | |
"metadata": {}, | |
"source": [ | |
"### 'Factoring' a Permutation\n", | |
"\n", | |
"The sensation of `apply_cycle` to a `lst` given a set of `cycle`s is analgous to multiplying a set of prime factors to produce a composite integer.\n", | |
"\n", | |
"Thus, we can ask the question, \"What _cycles_ produced this permutation?\"" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"id": "f63fadf2-2fe2-4630-b53b-1f7c45e27914", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"from collections import deque\n", | |
"\n", | |
"# Note: This function is restricted to, taking as input, lists [0 .. n]\n", | |
"def factor_perm(lst: list[int]) -> list[deque[int]]:\n", | |
" visited, cycles = set(), []\n", | |
" for i in range(len(lst)):\n", | |
" if i != lst[i] and not (i in visited):\n", | |
" cycle, cur = deque([]), i\n", | |
" while True:\n", | |
" cycle.appendleft(cur)\n", | |
" visited.add(cur)\n", | |
" cur = lst[cur]\n", | |
" if cur == i:\n", | |
" break\n", | |
" cycles.append(cycle)\n", | |
" return cycles" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"id": "742bb66d-2c90-4df8-986f-b5e69cc526f6", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[deque([3, 4, 1])]" | |
] | |
}, | |
"execution_count": 6, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"factor_perm([0, 4, 2, 1, 3, 5])" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "c7b6f459-63eb-47c7-8372-034d2b6ea174", | |
"metadata": { | |
"editable": true, | |
"slideshow": { | |
"slide_type": "" | |
}, | |
"tags": [] | |
}, | |
"source": [ | |
"### Sorting Using Cycles\n", | |
"\n", | |
"What's amazing is that **cycle sort** results from following a similar procedure as 'factoring' a permutation into its component cycles -- rather than keeping track of each cycle, we can just put out-of-position elements (mutating the input `lst`) where they should be according to the cycle. Even keeping a `visited` set is obviated because we can just skip over elements in their correct position!" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"id": "f87cbb32-2593-4c84-8c3a-e12a3c5c74cb", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# TODO: REVIEW THIS\n", | |
"# Note: This function is restricted to, taking as input, lists [0 .. n]\n", | |
"def cycle_sort(lst: list[int]) -> None:\n", | |
" for i in range(len(lst)):\n", | |
" if i != lst[i]:\n", | |
" cur = i\n", | |
" while True:\n", | |
" next_pos = lst[cur]\n", | |
" if cur != i:\n", | |
" lst[cur] = last_item\n", | |
" last_item = next_pos\n", | |
" cur = next_pos\n", | |
" if cur == i:\n", | |
" lst[cur] = last_item\n", | |
" break" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"id": "618905ed-6f15-4354-9bcc-a2fc03d0d340", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]" | |
] | |
}, | |
"execution_count": 8, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"l = [0, 5, 6, 8, 7, 4, 9, 1, 3, 2]\n", | |
"cycle_sort(l)\n", | |
"l" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "d3052c52-a56e-4738-80a3-e6c52f2256e4", | |
"metadata": {}, | |
"source": [ | |
"We can simplify this implementation above by using the `i`th location in `lst` itself to hold the swapped *out* element, knowing at some point that element will be swapped to its correct position:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"id": "e18bf5c3-cf52-4fa9-aac1-bd5a0d25de89", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# This function is restricted to, taking as input, lists [0 .. n]\n", | |
"def cyclic_sort(lst: list[int]) -> None:\n", | |
" n = len(lst)\n", | |
" for i in range(n):\n", | |
" while lst[i] != i and lst[lst[i]] != lst[i]:\n", | |
" # Ordering of this swap is critical!\n", | |
" lst[lst[i]], lst[i] = lst[i], lst[lst[i]]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"id": "1e5cf72e-d9c3-4ce8-8816-577904f0ab97", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]" | |
] | |
}, | |
"execution_count": 10, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"l = [0, 5, 6, 8, 7, 4, 9, 1, 3, 2]\n", | |
"cyclic_sort(l)\n", | |
"l" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3 (ipykernel)", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.10.10" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 5 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
NB: This borrows heavily from this blog post, as this notebook is an artifact of me digesting it!