@markharwood
Created July 24, 2014 09:55
Example Groovy update script
//import org.apache.lucene.codecs.bloom.FuzzySet;
// Take the document source from the update context
doc = ctx._source;
// Convert basic array into map for ease of manipulation
tagMap = doc.tags.collectEntries{[it.tag, it]};
// Patch the new tags into the data structure, adding one to a usercount
for (usertag in usertags) {
    def tag = tagMap.get(usertag);
    if (tag == null) {
        // create new tag
        tag = ["tag": usertag, "usercount": 1];
        tagMap.put(usertag, tag);
    } else {
        // Existing tag
        if (tag.usercount == null) {
            tag.usercount = 1;
        } else {
            tag.usercount++;
        }
    }
}
//This won't work because of sandboxing
//FuzzySet users=FuzzySet.createSetBasedOnMaxMemory(500);
//TODO use HyperLogLog structure rather than count to prevent double-counting?
//def bytes=userid.getBytes();
//hash=MurmurHash2.hash32(bytes, 0, bytes.length);
//println(hash);
// Store the modified map's values back as array.
doc.tags = tagMap.values();
{
  "_index": "bookmarks",
  "_type": "bookmark",
  "_id": "http://www.bbc.co.uk/news/world-asia-27807157",
  "_source": {
    "description": "Singapore's Marina Bay Sands casino has banned shark fin, the latest in a series of boycotts around the region.",
    "image": "http://news.bbcimg.co.uk/media/images/75469000/jpg/_75469430_hi022417700.jpg",
    "title": "BBC News - Singapore casino bans shark fin",
    "tags": [
      {
        "_index": "bookmarks",
        "_id": "sharks",
        "highlight": {
          "description": [
            " <em>shark</em> fin, the latest in a series of boycotts around the region."
          ]
        },
        "tag": "sharks",
        "usercount": 5
      }
    ]
  }
}
{
  "userid": "hf74hs238adghs023j",
  "usertags": [
    "sharks",
    "casino"
  ]
}
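To see what the update script does to the sample document, it can help to run the same logic outside Elasticsearch. The following is a minimal standalone sketch, not part of the original gist: the `ctx` map and the `usertags` list are hand-built stand-ins (assumptions) for the update context and the request parameters shown above, and the document is trimmed to the fields the script actually reads. The null guard on `doc.tags` and the copy into an `ArrayList` are additions for robustness, not something the original script does.

```groovy
// Hypothetical stand-in for the update context Elasticsearch hands to the script.
// Only the fields the script touches are reproduced here.
def ctx = [_source: [
    title: "BBC News - Singapore casino bans shark fin",
    tags : [[tag: "sharks", usercount: 5]]
]]
// Hypothetical stand-in for the "usertags" request parameter.
def usertags = ["sharks", "casino"]

def doc = ctx._source
// Index existing tags by name; tolerate a document that has no tags yet.
def tagMap = (doc.tags ?: []).collectEntries { [it.tag, it] }
for (usertag in usertags) {
    def tag = tagMap.get(usertag)
    if (tag == null) {
        // New tag: start its count at one.
        tagMap.put(usertag, [tag: usertag, usercount: 1])
    } else {
        // Existing tag: treat a missing count as zero, then add one.
        tag.usercount = (tag.usercount ?: 0) + 1
    }
}
// Copy the values into a plain list before storing them back.
doc.tags = new ArrayList(tagMap.values())

println doc.tags
```

With the sample inputs, the existing "sharks" tag goes from a usercount of 5 to 6 and a new "casino" tag is added with a usercount of 1. As the TODO in the script notes, nothing here prevents the same user from being counted twice.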