@amosr
Created November 13, 2015 22:13
beating grep
$ ls -lah asx.psv
-rw-r--r-- 1 amos staff 59G 13 Nov 18:07 asx.psv
$ time grep -v EntryError asx.psv > /dev/null
real 14m33.885s
user 14m10.081s
sys 0m22.390s
$ time icicle-bench data/example/AsxDictionary.toml asx.psv output.psv
icicle-bench: starting compilation
icicle-bench: compilation time = 34.08s
icicle-bench: starting snapshot
icicle-bench: snapshot time = 278.51s (217.80MB/s)
real 5m12.648s
user 4m41.321s
sys 0m24.160s
amosr commented Nov 13, 2015

this is for computing a bunch of means and a bunch of Pearson product-moment correlation coefficients over the different fields. the data is scraped from ASX stock prices.
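for anyone wondering how a mean and a Pearson correlation can be computed in one pass over a 59G file, here is a minimal sketch in Python. it is not how icicle implements the query; the class name `RunningCorrelation` and its methods are hypothetical, and the update rule is the standard Welford-style running co-moment, chosen here only because it needs constant memory per pair of fields.

```python
import math


class RunningCorrelation:
    """Hypothetical one-pass accumulator for two means and their Pearson correlation."""

    def __init__(self):
        self.n = 0          # number of (x, y) pairs seen so far
        self.mean_x = 0.0   # running mean of x
        self.mean_y = 0.0   # running mean of y
        self.m2_x = 0.0     # running sum of squared deviations of x
        self.m2_y = 0.0     # running sum of squared deviations of y
        self.co = 0.0       # running sum of co-deviations of x and y

    def update(self, x, y):
        """Fold one (x, y) pair into the running statistics."""
        self.n += 1
        dx = x - self.mean_x            # deviation from the old mean of x
        dy = y - self.mean_y            # deviation from the old mean of y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        # Each term pairs an old-mean deviation with a new-mean deviation.
        self.m2_x += dx * (x - self.mean_x)
        self.m2_y += dy * (y - self.mean_y)
        self.co += dx * (y - self.mean_y)

    def correlation(self):
        """Pearson's r, or NaN when either field has zero variance."""
        denom = math.sqrt(self.m2_x * self.m2_y)
        return self.co / denom if denom > 0 else float("nan")


# Made-up values for illustration only; these are not from asx.psv.
acc = RunningCorrelation()
for x, y in [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]:
    acc.update(x, y)
```

the accumulator only ever holds six numbers, so the cost per row stays constant however large the input is, and many such accumulators can be updated during the same scan.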
the point is that, even after spending 34 seconds compiling and optimising the query program, we are nearly three times faster than grep end to end (5m12s against 14m33s of wall-clock time). I guess for significantly smaller files the compilation overhead would dominate, though.
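a rough way to estimate where the compilation overhead starts to dominate, using only the timings above. this is a back-of-the-envelope sketch that assumes both tools scale linearly with file size and that the compile time stays fixed; neither assumption was measured, and the function names are made up for illustration.

```python
# Timings copied from the transcript above.
GREP_SECONDS = 14 * 60 + 33.885      # grep wall-clock time
COMPILE_SECONDS = 34.08              # icicle-bench compilation time
SNAPSHOT_SECONDS = 278.51            # icicle-bench snapshot time
SNAPSHOT_MB_PER_S = 217.80           # throughput reported by icicle-bench

# File size implied by the reported snapshot time and throughput.
file_mb = SNAPSHOT_SECONDS * SNAPSHOT_MB_PER_S

# grep throughput over the same file, assuming linear scaling.
grep_mb_per_s = file_mb / GREP_SECONDS


def grep_time(mb):
    """Estimated grep time in seconds for a file of `mb` megabytes."""
    return mb / grep_mb_per_s


def icicle_time(mb):
    """Estimated icicle-bench time: fixed compile cost plus a linear scan."""
    return COMPILE_SECONDS + mb / SNAPSHOT_MB_PER_S


# Size at which both estimates are equal; below it, grep would finish first.
break_even_mb = COMPILE_SECONDS / (1 / grep_mb_per_s - 1 / SNAPSHOT_MB_PER_S)

# End-to-end speedup on the measured file, including compilation.
speedup = GREP_SECONDS / (COMPILE_SECONDS + SNAPSHOT_SECONDS)
```

under those assumptions the break-even point comes out at about 3.5 GB, so "significantly smaller" here would mean files of a few gigabytes or less.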
