@matthewpalmer
Created September 3, 2014 22:32
Graph: http://i.imgur.com/uxBzv0a.png
# Sort A
We ran our standard tests on input sizes from 1,000 to 100,000, at intervals of 1,000. This produced the graph above.
The average (random) case is roughly quadratic, and the algorithm performs nearly as badly on descending data. Interestingly, it performs very well on ascending data. From these observations we can narrow the algorithm down to one of the following:
- Oblivious bubble sort
- Vanilla insertion sort
- Insertion sort with binary search
- Selection sort (**??**)
- Shell sort times 4
- .... some more ....
All of the above algorithms have the characteristic that their average case is O(N^2), and that their best case occurs for already sorted data.
# Oblivious Bubble sort
- average case: quadratic
- stable (adjacent elements are only swapped when strictly out of order)
- best case O(N) on already sorted data (this relies on the early-exit check; a strictly oblivious version always performs O(N^2) comparisons)
- worst case O(N^2) on reverse-sorted data
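For reference, a minimal Python sketch of bubble sort (not necessarily how the mystery sort is written); the `swapped` flag is what gives the O(N) best case, and dropping it makes the algorithm oblivious:

```python
def bubble_sort(a):
    # Repeatedly bubble the largest remaining element to the end.
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:  # early exit: one clean pass means the array is sorted
            break
    return a
```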
# Vanilla Insertion sort
- average case: quadratic
- O(N) best case for sorted data
- worst case O(N^2) on reverse-sorted data
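A minimal sketch of plain insertion sort in Python; on sorted input the inner while loop never runs, which is where the O(N) best case comes from:

```python
def insertion_sort(a):
    # Grow a sorted prefix, shifting larger elements right to make room.
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:  # never runs on already sorted input
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a
```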
# Insertion sort with binary search
- depending on how expensive comparisons are relative to moves, this will likely outperform regular insertion sort, because finding each insertion point by binary search reduces the total number of comparisons to O(N log N)
- overall still quadratic, since the element shifts remain O(N^2)
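A sketch of the binary-search variant using Python's `bisect` module; comparisons drop to O(N log N) but the slice shift is still linear per insertion:

```python
import bisect

def binary_insertion_sort(a):
    for i in range(1, len(a)):
        key = a[i]
        # Binary search for the insertion point in the sorted prefix a[0:i].
        pos = bisect.bisect_right(a, key, 0, i)
        # Shifting the tail right by one is still O(i), hence overall O(N^2).
        a[pos + 1:i + 1] = a[pos:i]
        a[pos] = key
    return a
```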
# Quadratic Selection sort
- generally slightly faster than regular selection sort.
- the first sort (on the chunks of size sqrt(n)) will take sqrt(n)*O(n) time
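A hedged sketch of one common reading of quadratic selection sort: split the input into roughly sqrt(n) groups, track each group's minimum, and on every step take the smallest group minimum and rescan only that one group (this keeps group minimums rather than fully sorting the chunks):

```python
import math

def quadratic_selection_sort(a):
    n = len(a)
    size = max(1, math.isqrt(n))
    # Split into ~sqrt(n) groups and remember each group's current minimum.
    groups = [a[i:i + size] for i in range(0, n, size)]
    mins = [min(g) for g in groups]
    out = []
    for _ in range(n):
        # The smallest group minimum is the next output element.
        gi = min((i for i in range(len(groups)) if groups[i]),
                 key=lambda i: mins[i])
        out.append(mins[gi])
        groups[gi].remove(mins[gi])
        # Only this one group needs to be rescanned for a new minimum.
        mins[gi] = min(groups[gi]) if groups[gi] else None
    return out
```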
# Vanilla Quick sort
- average case is O(N log N), which is the same as the best case; this occurs when the pivot divides the set evenly
- worst case is O(N^2) on data that is either ascending or descending (i.e. already sorted), because with a first- or last-element pivot the partition step splits the array into chunks of size N-1 and 1, which achieves almost nothing
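A minimal sketch of vanilla quicksort with a fixed (last-element) pivot and Lomuto partitioning; the fixed pivot is exactly what produces the O(N^2) behaviour on already sorted input:

```python
def quick_sort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    pivot = a[hi]  # fixed pivot: worst possible choice for sorted input
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    quick_sort(a, lo, i - 1)
    quick_sort(a, i + 1, hi)
    return a
```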
# Randomised Quick sort
- almost all cases should be O(N log N), because randomly arranging the input eliminates vanilla quicksort's worst case (O(N^2) on sorted data)
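The randomised version only changes how the input (or the pivot) is chosen; a sketch that reuses the `quick_sort` function from the previous block:

```python
import random

def randomised_quick_sort(a):
    # Randomly arranging the input defeats the sorted-input worst case,
    # so the expected running time is O(N log N) regardless of input order.
    random.shuffle(a)
    return quick_sort(a)
```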
# Shell sort times 4
- in general, the efficiency of a shell sort depends on the sequence of h values (gaps) chosen (in this case 1024, 256, 16, 4, 1)
- very similar to insertion sort in terms of worst-case time complexity (O(N^2)), but should be slightly faster
- best case is O(N log N) when the array is already sorted
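A sketch of shell sort using the gap sequence mentioned above; each pass is a gapped insertion sort, with the final h = 1 pass being a plain insertion sort over nearly-sorted data:

```python
def shell_sort(a, gaps=(1024, 256, 16, 4, 1)):
    for h in gaps:
        # Gapped insertion sort: sort elements that are h positions apart.
        for i in range(h, len(a)):
            key = a[i]
            j = i
            while j >= h and a[j - h] > key:
                a[j] = a[j - h]
                j -= h
            a[j] = key
    return a
```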
# Radix sort (MSD first)
- radix sort works on significant digits (so an msd first radix sort will start at the first letter of each word)
- time grows as the length of the input strings grows -- i.e. the number of digits in each number affects the time
- overall time is also affected by the number of inputs
- time complexity is O(kN), where k is the length of the strings
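A sketch of an MSD radix sort on strings: bucket by the character at position d, then recurse into each bucket on position d + 1, so the running time scales with both the number of strings and their length:

```python
def msd_radix_sort(strings, d=0):
    if len(strings) <= 1:
        return strings
    done, buckets = [], {}
    for s in strings:
        if len(s) <= d:
            done.append(s)  # string exhausted: it sorts before longer ones
        else:
            buckets.setdefault(s[d], []).append(s)
    out = done
    # Recurse on each bucket using the next character position.
    for ch in sorted(buckets):
        out += msd_radix_sort(buckets[ch], d + 1)
    return out
```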
# Bucket sort w/ large but fixed no. of buckets
- each element put into a bucket based on its value (0..9, 10..19, 20..29, etc.)
- each bucket sorted (possibly recursively, or with a different sorting algorithm), then visit the sorted buckets and re-insert into the array
- time complexity affected by number of buckets and efficiency of sorting each bucket
- the worst case is O(n^2) when all of the elements fall into the same bucket
- if k is the number of buckets, the average case is O(N + k)
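A sketch of bucket sort for non-negative integers with a fixed number of buckets; `num_buckets` and `max_value` are illustrative parameters, not values from the assignment:

```python
def bucket_sort(a, num_buckets=100, max_value=1000):
    # Each bucket covers a fixed value range (0..9, 10..19, ... for width 10).
    width = max(1, (max_value + 1) // num_buckets)
    buckets = [[] for _ in range(num_buckets)]
    for x in a:
        buckets[min(x // width, num_buckets - 1)].append(x)
    out = []
    for b in buckets:
        out += sorted(b)  # any sort (or recursion) works within a bucket
    return out
```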