@MatthewRDodds
Created December 23, 2014 15:25
Elasticsearch indexing speed tests

Trying to find the fastest way to index all the things.

Iterating over the entire collection, serializing and indexing each item individually:

    user     system      total        real
10.630000   0.400000  11.030000 ( 24.820179)
11.150000   0.340000  11.490000 ( 25.183031)
11.180000   0.340000  11.520000 ( 25.743142)
11.330000   0.340000  11.670000 ( 25.486202)
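
For reference, the per-item approach can be sketched roughly as below. This is only an illustration, not the code that produced the timings above: `FakeClient`, `Item`, `ITEMS`, and `index_each` are hypothetical stand-ins, and the fake client just counts requests instead of talking to Elasticsearch.

```ruby
require 'benchmark'
require 'json'

# Hypothetical stand-in for an Elasticsearch client: it only counts requests.
class FakeClient
  attr_reader :requests

  def initialize
    @requests = 0
  end

  # Mimics indexing a single document (one round trip per call).
  def index(index:, id:, body:)
    @requests += 1
  end
end

# Hypothetical records; in a real app these would come from a model.
Item = Struct.new(:id, :name)
ITEMS = (1..1_000).map { |i| Item.new(i, "item #{i}") }

# Serialize and index every item on its own: one request per item.
def index_each(client, items)
  items.each do |item|
    client.index(index: 'items', id: item.id, body: JSON.generate(item.to_h))
  end
end

client = FakeClient.new
Benchmark.bm(10) do |x|
  x.report('each') { index_each(client, ITEMS) }
end
```

The point of the sketch is the shape of the loop: every item costs one serialization and one request.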

Using model.find_in_batches, serializing each item in a batch, then bulk indexing the batch:

batches of 25:

       user     system      total        real
25 10.730000   0.370000  11.100000 ( 20.602062)
25 10.480000   0.340000  10.820000 ( 20.030325)
25 10.550000   0.250000  10.800000 ( 19.910053)
25 10.430000   0.310000  10.740000 ( 20.088754)

batches of 200:

       user     system      total        real
25 10.680000   0.350000  11.030000 ( 20.002928)
25 10.520000   0.370000  10.890000 ( 20.046314)
25 10.630000   0.310000  10.940000 ( 20.327363)
25 10.910000   0.340000  11.250000 ( 20.393303)
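
The batched approach might look something like the sketch below. It is an assumption-laden illustration rather than the code behind these numbers: `each_slice` stands in for `find_in_batches` so the example runs without a database, and `FakeBulkClient`, `Record`, `RECORDS`, and `index_in_batches` are hypothetical names. The action hash follows the general shape of a bulk request body, but treat the exact keys as an assumption.

```ruby
# Hypothetical stand-in for a client with a bulk endpoint: counts calls and documents.
class FakeBulkClient
  attr_reader :requests, :documents

  def initialize
    @requests = 0
    @documents = 0
  end

  # Mimics a bulk request carrying many index actions in one round trip.
  def bulk(body:)
    @requests += 1
    @documents += body.size
  end
end

# Hypothetical records; in a real app these would come from a model.
Record = Struct.new(:id, :name)
RECORDS = (1..1_000).map { |i| Record.new(i, "record #{i}") }

# Serialize each record in a batch, then send the whole batch in one bulk call.
# each_slice is used here in place of find_in_batches.
def index_in_batches(client, records, batch_size)
  records.each_slice(batch_size) do |batch|
    actions = batch.map do |record|
      { index: { _index: 'records', _id: record.id, data: record.to_h } }
    end
    client.bulk(body: actions)
  end
end

[25, 200].each do |size|
  client = FakeBulkClient.new
  index_in_batches(client, RECORDS, size)
  puts "batch size #{size}: #{client.requests} requests, #{client.documents} documents"
end
```

With 1,000 stand-in records, this makes 40 bulk calls at a batch size of 25 and 5 at a batch size of 200, while the per-record serialization work stays the same.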