@BeWilled
Last active December 14, 2015 19:09
Mapper-attachments plugin not parsing binary files.
Java Version: jdk1.6.0_26
ElasticSearch Engine Version: elasticsearch-0.20.5
Mapper Attachments version: 1.7.0
ElasticSearch.log:
[2013-03-11 09:07:58,381][INFO ][node ] [Modred the Mystic] {0.20.5}[6566]: initializing ...
[2013-03-11 09:07:58,407][INFO ][plugins ] [Modred the Mystic] loaded [mapper-attachments], sites []
[2013-03-11 09:08:00,163][INFO ][node ] [Modred the Mystic] {0.20.5}[6566]: initialized
[2013-03-11 09:08:00,163][INFO ][node ] [Modred the Mystic] {0.20.5}[6566]: starting ...
[2013-03-11 09:08:00,233][INFO ][transport ] [Modred the Mystic] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.22.42.135:9300]}
[2013-03-11 09:08:03,251][INFO ][cluster.service ] [Modred the Mystic] new_master [Modred the Mystic][EqjdWf0yQOiRGgxXky9t6g][inet[/172.22.42.135:9300]], reason: zen-disco-join (elected_as_master)
[2013-03-11 09:08:03,270][INFO ][discovery ] [Modred the Mystic] elasticsearch/EqjdWf0yQOiRGgxXky9t6g
[2013-03-11 09:08:03,283][INFO ][http ] [Modred the Mystic] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.22.42.135:9200]}
[2013-03-11 09:08:03,283][INFO ][node ] [Modred the Mystic] {0.20.5}[6566]: started
[2013-03-11 09:08:03,632][INFO ][gateway ] [Modred the Mystic] recovered [1] indices into cluster_state
Script:
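#!/bin/sh
host=localhost:9200
# Start from a clean index with one shard and no replicas.
curl -X DELETE "${host}/test"
curl -X PUT "${host}/test" -d '{
  "settings" : { "index" : { "number_of_shards" : 1, "number_of_replicas" : 0 }}
}'
# Wait until the cluster reports green before continuing.
curl -X GET "${host}/_cluster/health?wait_for_status=green&pretty=1&timeout=5s"
# Map "file" as an attachment; store the extracted title and content.
curl -X PUT "${host}/test/attachment/_mapping" -d '{
  "attachment" : {
    "properties" : {
      "file" : {
        "type" : "attachment",
        "fields" : {
          "title" : { "store" : "yes" },
          "file" : { "term_vector":"with_positions_offsets", "store":"yes" }
        }
      }
    }
  }
}'
# Download the sample PDF (-C - resumes a partial download, -O keeps the remote file name).
curl -C - -O http://www.intersil.com/data/fn/fn6742.pdf
# Base64-encode the file. Note: -n feeds perl one line at a time,
# so each line is encoded and padded separately.
coded=`cat fn6742.pdf | perl -MMIME::Base64 -ne 'print encode_base64($_)'`
json="{\"file\":\"${coded}\"}"
echo "$json" > json.file
# Index the document, then refresh so it is visible to search.
curl -X POST "${host}/test/attachment/" -d @json.file
echo
curl -XPOST "${host}/_refresh"
# Search the extracted content for a word expected in the PDF.
curl "${host}/_search?pretty=true" -d '{
  "fields" : ["title"],
  "query" : {
    "query_string" : {
      "query" : "amplifier"
    }
  },
  "highlight" : {
    "fields" : {
      "file" : {}
    }
  }
}'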
OUTPUT:
{"ok":true,"acknowledged":true}{"ok":true,"acknowledged":true}{
"cluster_name" : "elasticsearch",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 1,
"active_shards" : 1,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
}{"ok":true,"acknowledged":true}** Resuming transfer from byte position 104
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
curl: (33) HTTP server doesn't seem to support byte ranges. Cannot resume.
{"ok":true,"_index":"test","_type":"attachment","_id":"NlNzoDOOSYCy7BszwmQdUg","_version":1}
{"ok":true,"_shards":{"total":1,"successful":1,"failed":0}}{
"took" : 35,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
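The search returns zero hits, and the curl error (33) earlier in the output shows that the download step did not fetch a fresh copy of the file. A minimal sketch of a guard that could run before indexing, assuming it is enough to check for the `%PDF` signature at the start of the file; `is_pdf` is a hypothetical helper and not part of the original script.

```shell
#!/bin/sh
# is_pdf FILE: succeed only if FILE starts with the PDF signature "%PDF".
# Hypothetical helper; head -c is widely available but not strictly POSIX.
is_pdf() {
    [ "$(head -c 4 "$1" 2>/dev/null)" = "%PDF" ]
}

if is_pdf fn6742.pdf; then
    echo "fn6742.pdf looks like a PDF"
else
    echo "fn6742.pdf is missing or not a PDF; not indexing it" >&2
fi
```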
Fetching the indexed document at http://localhost:9200/test/attachment/NlNzoDOOSYCy7BszwmQdUg returns:
{"_index":"test","_type":"attachment","_id":"NlNzoDOOSYCy7BszwmQdUg","_version":1,"exists":true, "_source" : {"file":"PGh0bWw+VGhpcyBwYWdlIG1vdmVkIHRvIDxhIGhyZWY9Ig0KL2NvbnRlbnQvZGFtL0ludGVyc2lsL2RvY3VtZW50cy9mbjY3L2ZuNjc0Mi5wZGYNCg==Ij5oZXJlPC9hPjwvaHRtbD4NCg=="}}