Zachary Tong polyfractal

View boosted_syn.mvel
termInfo = _index[field].get(term,_PAYLOADS);
score = 0;
for (pos : termInfo) {
    score = score + pos.payloadAsFloat(0);
}
return score;
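To make the scoring logic easier to follow outside of Elasticsearch, here is a minimal Python sketch of what the script above appears to compute: the sum of the float payloads stored at each position of the term. The function name `payload_score` and the list-of-payloads input are hypothetical stand-ins for the `_index[field].get(term,_PAYLOADS)` iterator, and treating the `0` argument of `payloadAsFloat(0)` as a fallback for positions without a payload is an assumption.

```python
from typing import Iterable, Optional


def payload_score(position_payloads: Iterable[Optional[float]],
                  default: float = 0.0) -> float:
    """Sum the payload at every position where the term occurs.

    `position_payloads` is a hypothetical stand-in for the term-position
    iterator in the script; None marks a position with no payload.
    """
    score = 0.0
    for payload in position_payloads:
        # Assumed to mirror pos.payloadAsFloat(0): fall back to the default.
        score += default if payload is None else payload
    return score


# Three occurrences of the term, one of them without a payload.
boosted = payload_score([1.5, 0.5, None])
```

A term that occurs more often, or with larger payloads, ends up with a higher score, which is presumably how the synonym boost is applied.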
View Results.md

Ran on my MacBook Air against half a million docs. Single node, 5 primary shards, 0 replicas. The node was restarted between runs to make sure all caches were cleared.

Existing benchmark

$ python loadtester.py --es "http://localhost:9200/speedtest/_search" -i ../data/stoicism.txt -o test1.txt --ns 10000 --nt 3 --nf 10
0 26004 1.36110687256
1000 5561 0.0182199478149
2000 10516 0.0134048461914
3000 42137 0.0833399295807
4000 34922 0.0168430805206
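The output above can be summarised with a few lines of Python. This is only a sketch: the source does not label the three columns, so the field names `query_no`, `count`, and `seconds` are guesses, and the `parse_rows` helper is hypothetical rather than part of `loadtester.py`.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    # Column meanings are NOT stated in the source; these names are guesses.
    query_no: int
    count: int
    seconds: float


def parse_rows(text: str) -> List[Row]:
    """Parse whitespace-separated three-column lines, skipping anything else."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        try:
            rows.append(Row(int(parts[0]), int(parts[1]), float(parts[2])))
        except ValueError:
            continue  # Not a data line (for example, the command itself).
    return rows


SAMPLE = """\
0 26004 1.36110687256
1000 5561 0.0182199478149
2000 10516 0.0134048461914
3000 42137 0.0833399295807
4000 34922 0.0168430805206
"""

rows = parse_rows(SAMPLE)
# The row with the largest value in the third column.
slowest = max(rows, key=lambda r: r.seconds)
```

If the third column is a per-query time, the first row stands out as a cold-start outlier, which is consistent with restarting the node between runs.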
View gist:4063964
{
  "mappings":{
    "post":{
      "properties":{
        "body":{
          "fields":{
            "body":{
              "type":"string",
              "analyzer":"analyzer_term"
            },
View gist:4947188
### Changes:
- Added "include_in_root" for each nested object.
- Removed the "nested" params from the last facet.

This basically copies the nested doc into the root doc. You then reference the root "inner object" rather than the "nested object" to get the data. Be careful though: this breaks down if you have multiple nested docs that share the same name (e.g. an array of nested docs), since the facet will operate on the entire array instead of on the individual docs.
See this thread for more info: https://groups.google.com/d/topic/elasticsearch/pjoNmosdCPs/discussion
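The caveat about arrays can be illustrated with a small simulation in plain Python. It does not talk to Elasticsearch; it only mimics how copying nested docs into the root turns each field into one flat array, so the link between values from the same nested doc is lost. The field names (`comments`, `author`, `stars`) and helper functions are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def flatten_into_root(path: str,
                      nested_docs: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Mimic include_in_root: every field becomes one array on the root doc."""
    root = defaultdict(list)
    for doc in nested_docs:
        for field, value in doc.items():
            root[f"{path}.{field}"].append(value)
    return dict(root)


def matches_nested(nested_docs: List[Dict[str, Any]], author: str, stars: int) -> bool:
    """Both conditions must hold inside the same nested doc."""
    return any(d["author"] == author and d["stars"] == stars for d in nested_docs)


def matches_root(root: Dict[str, List[Any]], path: str, author: str, stars: int) -> bool:
    """On the root copy, each condition is checked against the whole array."""
    return author in root[f"{path}.author"] and stars in root[f"{path}.stars"]


comments = [
    {"author": "alice", "stars": 5},
    {"author": "bob", "stars": 1},
]
root = flatten_into_root("comments", comments)
# root holds one array per field: authors together, stars together.

nested_hit = matches_nested(comments, "alice", 1)      # alice never gave 1 star
root_hit = matches_root(root, "comments", "alice", 1)  # but the arrays contain both
```

With a single nested doc the two views agree, which is why the workaround is safe in that case and misleading for arrays.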
polyfractal / gist:4959909
Last active Dec 13, 2015
Mapping for indexing throughput benchmark
View gist:4959909
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 0,
    "index": {
      "analysis": {
        "analyzer": {
          "analyzer_shingle": {
            "tokenizer": "standard",
            "filter": [
polyfractal / gist:4968387
Created Feb 16, 2013
Logging.yml to enable a SocketAppender, which will be used to talk to Logstash
View gist:4968387
rootLogger: INFO, console, file, socketappender
logger:
  # log action execution errors for easier debugging
  action: DEBUG
  # reduce the logging for aws, too much is logged under the default INFO
  com.amazonaws: WARN
  # gateway
  #gateway: DEBUG
  #index.gateway: DEBUG
polyfractal / gist:4997040
Created Feb 20, 2013
Reroute API breaks Cluster/Node/Stats API. More details
View gist:4997040
##Listed in order of operations and the resulting output:
## reroute the shards - executed on C1 client node
$ curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
  "commands":[
    {
      "move":{
        "index":"test",
        "shard":2,
        "from_node":"J47gdOIwQMq2GTmzzmzJBA",
View import.php
<?php
require 'vendor/autoload.php';
use Sherlock\Sherlock;
function pprint($value) {
    print_r($value);
    echo "\r\n";
View gist:5482219
{
  "test":{
    "state":"open",
    "settings":{
      "index.analysis.filter.filter_shingle.type":"shingle",
      "index.number_of_replicas":"0",
      "index.analysis.filter.filter_shingle.output_unigrams":"true",
      "index.analysis.analyzer.analyzer_shingle.tokenizer":"standard",
      "index.analysis.filter.filter_shingle.min_shingle_size":"2",
      "index.analysis.analyzer.analyzer_shingle.filter.0":"standard",
View gist:5506185
public function assertThrowsException($exception, $code)
{
    $raisedException = null;
    try {
        $code();
    } catch (\Exception $raisedException) {
        // No more code, we only want to catch the exception in $raisedException.
    }
    $this->assertInstanceOf($exception, $raisedException);