The score was not parsed correctly by the bash script. A double can be returned in scientific notation, such as 4.06E-05, in JSON. This happens for very small values, where plain decimal notation would need many leading zeros (e.g. 0.0000406). The previous sed expression only captured the digits around the decimal point, so the exponent was silently dropped.
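To see what the scientific notation stands for, here is a minimal sketch that expands such a value to fixed-point notation. The helper name to_decimal is hypothetical (it is not part of the script in this answer), and the 10-digit precision is an arbitrary choice.

```shell
# to_decimal VALUE: print VALUE in fixed-point notation with 10 decimals.
# Hypothetical helper for illustration; not used by the script in this answer.
to_decimal() {
  # LC_ALL=C forces "." as the decimal separator whatever the user's locale is.
  # awk treats the string "4.06E-05" as a number when it is formatted with %f.
  LC_ALL=C awk -v x="$1" 'BEGIN { printf "%.10f\n", x }'
}

to_decimal 4.06E-05   # prints 0.0000406000
to_decimal 0.5        # prints 0.5000000000
```

Keeping the value as returned by Elasticsearch is usually enough for a spreadsheet; the conversion is only useful if a tool downstream does not understand the exponent.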
The following script parses Elasticsearch results correctly. I also added its raw output as the results.ok.tsv file.
ES_HOST=localhost
curl -XDELETE "http://$ES_HOST:9200/test"
curl -XPUT "http://$ES_HOST:9200/test" -d '{"settings":{"number_of_shards":1,"number_of_replicas":0},"mappings":{"test":{"properties":{"date":{"type":"date"}}}}}'
for scale in {3,6,9}; do
    for offset in {0,3,6}; do
        echo "scale=$scale, offset=$offset"
        for day in {0..60}; do
            curl -s -XPOST "http://$ES_HOST:9200/test/test/1?refresh=true" -d '{"date":"'"$(date -v-${day}d '+%Y-%m-%dT%H:%M:%S.000Z')"'"}' > /dev/null
            RESULT=$(curl -s -XGET "http://$ES_HOST:9200/test/_search" -d '{"_source":false,"query":{"function_score":{"query":{"match_all":{}},"functions":[{"gauss":{"date":{"origin":"now","scale":"'"$scale"'d","offset":"'"$offset"'d","decay":0.5}}}],"boost_mode":"replace"}}}')
            echo "$RESULT" | sed 's|.*"_score":\([^,\}]\+\).*|\1|' # <-- problem was here!
        done
    done
done
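Note that the script above relies on date -v, which is a BSD/macOS option. If you run it on Linux, something like the sketch below could replace the inline date call. The helper name days_ago is hypothetical, and I am assuming either BSD date or GNU date is installed. It also passes -u so that the printed time really is UTC, which is what the trailing Z claims; the script above formats local time instead.

```shell
# days_ago N: print the UTC timestamp N days before now, e.g. 2015-06-01T12:00:00.000Z
# Hypothetical helper; assumes BSD date (macOS) or GNU date (Linux).
days_ago() {
  # BSD date understands -v-Nd. GNU date rejects -v, so its error message is
  # discarded and the GNU form "-d 'N days ago'" is tried instead.
  date -u -v-"$1"d '+%Y-%m-%dT%H:%M:%S.000Z' 2>/dev/null ||
    date -u -d "$1 days ago" '+%Y-%m-%dT%H:%M:%S.000Z'
}

days_ago 0
days_ago 30
```

In the loop, the document body would then be built as '{"date":"'"$(days_ago "$day")"'"}'.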
- We (re)create a test index and add a simple mapping that declares the date field:
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "test": {
      "properties": {
        "date": {"type": "date"}
      }
    }
  }
}
- For various scale and offset values, we insert n docs with date equal to now minus n days. Each doc has the same id, so it replaces the previous one (there is always exactly one doc in the index):
{
  "date": "$date"
}
- For each doc, we run a function score query with a gauss decay function relative to now:
{
  "_source": false,
  "query": {
    "function_score": {
      "query": {"match_all": {}},
      "functions": [{
        "gauss": {
          "date": {
            "origin": "now",
            "scale": "${scale}d",
            "offset": "${offset}d",
            "decay": 0.5
          }
        }
      }],
      "boost_mode": "replace"
    }
  }
}
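To sanity-check the scores returned by the query above, the expected value can be computed by hand. The sketch below uses the gauss decay formula as I understand it from the Elasticsearch documentation: score = exp(-max(0, |value - origin| - offset)^2 / (2 * sigma^2)), with sigma^2 = -scale^2 / (2 * ln(decay)). The helper name gauss_score is hypothetical, and all distances are expressed in days.

```shell
# gauss_score DAYS SCALE OFFSET DECAY: expected score for a doc dated DAYS days ago.
# Hypothetical helper; the formula is my reading of the Elasticsearch docs.
gauss_score() {
  LC_ALL=C awk -v d="$1" -v s="$2" -v o="$3" -v decay="$4" 'BEGIN {
    dist = d - o                           # distance beyond the offset
    if (dist < 0) dist = 0                 # no decay at all inside the offset
    sigma2 = -(s * s) / (2 * log(decay))   # variance derived from scale and decay
    printf "%.6f\n", exp(-(dist * dist) / (2 * sigma2))
  }'
}

gauss_score 3 3 0 0.5   # prints 0.500000: at distance == scale the score equals decay
gauss_score 6 3 0 0.5   # prints 0.062500
gauss_score 2 3 6 0.5   # prints 1.000000: still inside the offset
```

The scores returned by Elasticsearch will not match to the last digit, since now is evaluated at query time and scores are single-precision floats, but they should be close.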
Beware of the , instead of . as the decimal separator:
ES_HOST=localhost
curl -XDELETE "http://$ES_HOST:9200/test"
curl -XPUT "http://$ES_HOST:9200/test" -d '{"settings":{"number_of_shards":1,"number_of_replicas":0},"mappings":{"test":{"properties":{"date":{"type":"date"}}}}}'
for scale in {3,6,9}; do
    for offset in {0,3,6}; do
        echo "scale=$scale, offset=$offset"
        for day in {0..60}; do
            curl -s -XPOST "http://$ES_HOST:9200/test/test/1?refresh=true" -d '{"date":"'"$(date -v-${day}d '+%Y-%m-%dT%H:%M:%S.000Z')"'"}' > /dev/null
            RESULT=$(curl -s -XGET "http://$ES_HOST:9200/test/_search" -d '{"_source":false,"query":{"function_score":{"query":{"match_all":{}},"functions":[{"gauss":{"date":{"origin":"now","scale":"'"$scale"'d","offset":"'"$offset"'d","decay":0.5}}}],"boost_mode":"replace"}}}')
            echo "$RESULT" | sed 's|.*"_score":\([0-9]*\).\([0-9]*\).*|\1,\2|' # Change sep here
        done
    done
done
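If you want the comma separator and still need to keep the exponent of very small scores, one option is to extract the whole number first and swap the separator afterwards. This is only a sketch: the helper name score_for_excel and the sample response are made up for illustration, and I am assuming a sed that supports -E (both GNU and BSD sed do).

```shell
# score_for_excel: read a search response on stdin, print the last _score
# with "," as the decimal separator. Hypothetical helper for illustration.
score_for_excel() {
  # Capture everything after "_score": up to the next "," or "}", so an
  # exponent such as E-05 is kept; then turn the decimal point into a comma.
  sed -E 's/.*"_score":([^,}]+).*/\1/' | tr '.' ','
}

# Made-up sample response, trimmed to the relevant fields.
echo '{"hits":{"max_score":4.06E-05,"hits":[{"_id":"1","_score":4.06E-05}]}}' | score_for_excel
# prints 4,06E-05
```

I would expect a spreadsheet configured for a comma locale to read 4,06E-05 as a number, but I have not checked every version, so treat that as an assumption.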
You can find sample graphed results in this Gist (files results-20-days.png and results-60-days.png), and the raw results in the results.tsv file (you can copy-paste or import it into Excel).