kgaughan/gist:a6a31918ccf1baaa6d3f14abdf803bef

## gistfile1.rst

      
    Raw
  

              gistfile1.rst
            
          
    I've an idea for a merge sort variant that's been rattling around my head for a while.
The idea is that sorting is broken up into two phase: run identification and merging. This has some similarities to Timsort, but it's not quite the same thing.
In the run identification phase, runs are identified and recorded. There is some deperturbation to try an identify the longest runs possible near the potential end of runs where if the next element, x[i], is less than x[i-1] and x[i-2], it is swapped with x[i-1].
If no run longer than N can be identified since the last identified run by M elements, those M elements are sorted using insertion sort to create a run.
When we have a list of runs, we start merging adjacent pairs of runs, starting with the smallest adjacent pairs.
Open questions:

How should N (the minimum run length) and M (the maximum span of any non-run), where M > N, be calculated? Should N have an upper or lower bound if calculated based on the sequence length?
There should be a way to identify runs in the opposite direction, and a heuristic for identifying those, and then reversing them. (Yes, naturally, but the key is when run reversal ought to be done).
What is the best way of doing the merge? There are a few things I think might be worth while:

End identification: if you have a pair of runs, compare the last element of the first run with the first element of the second run, and if the former is less than the latter, treat the two runs as merged.
Might it be worthwhile trying to minimise the size of the temporary storage needed to perform the run by checking if some chunk at the beginning/end of the two runs can be excluded from the merge?
Should reversal of opposite runs be done as part of the merge or after run identification? Is it better to do longer runs in the former and shorter runs in the latter?


Material:

The in-place merge chapter on the WikSort algorithm looks interesting.
Ratio based stable in-place merging by Pok-Son Kim and Arne Kutzner - this is mentioned in the previous link, but the link is broken. Also see: http://itbe.hanyang.ac.kr/publications/arne-kutzner-selected-publications/