Test setup:
- 1.1G, 100 million lines of integers.
- 24 CPUs, 48G memory
Use time ./parasort.sh all-ids 500000 20:
real 2m32.483s
user 10m52.996s
sys 0m9.066s
Compare with a single-core sort:
$ time sort -u -n all-ids -o all-ids.sortu
real 8m44.057s
user 8m39.567s
sys 0m4.038s
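
To put the two wall-clock (real) figures above side by side, the ratio can be worked out with a small helper. This is only a sketch; the function names to_seconds and speedup are made up for illustration and are not part of parasort.sh.

```shell
# Convert a `time`-style value such as "8m44.057s" to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print baseline / candidate, rounded to two decimals.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

# Single-core real time vs. parasort real time from the runs above.
speedup 8m44.057s 2m32.483s   # prints 3.44
```

The user time of the parallel run (10m52.996s) is higher than that of the single-core run (8m39.567s), so the gain comes from spreading the work over cores, not from doing less work.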
Use -S 10000000000 (allow a 10G buffer):
real 2m39.086s
user 10m54.576s
sys 0m14.016s
Use a larger part size: time ./parasort.sh all-ids 5000000 20
real 2m16.916s
user 11m33.363s
sys 0m28.230s
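
For readers without the script at hand, the runs above pass three arguments: an input file, a part size, and a job count. The sketch below shows one way a split / parallel sort / merge script with that calling convention might look. It is not the actual parasort.sh: the function name parasort_sketch, the reading of the second argument as lines per part, and the .sortu output name are assumptions, and it relies on xargs supporting -0 and -P (GNU and BSD xargs do; POSIX does not require them).

```shell
# Hypothetical sketch, NOT the real parasort.sh.
# Usage: parasort_sketch INPUT LINES_PER_PART JOBS   (writes INPUT.sortu)
parasort_sketch() {
    local input=$1 part_lines=$2 jobs=$3
    local tmp
    tmp=$(mktemp -d) || return 1

    # 1. Split the input into chunks of $part_lines lines each.
    split -l "$part_lines" -a 4 "$input" "$tmp/part." || return 1

    # 2. Sort each chunk numerically and drop duplicates,
    #    running at most $jobs sort processes at a time.
    printf '%s\0' "$tmp"/part.* |
        xargs -0 -n 1 -P "$jobs" sh -c 'sort -n -u -o "$1.sorted" "$1"' sh || return 1

    # 3. Merge the already-sorted chunks (-m) and drop
    #    duplicates that span chunks.
    sort -m -n -u -o "$input.sortu" "$tmp"/part.*.sorted
    local status=$?

    rm -rf "$tmp"
    return $status
}
```

Under these assumptions, a part size of 500000 yields 200 chunks for the 100-million-line file, while 5000000 yields 20 chunks, one per job.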
Further optimisations:
-S