Loren Siebert loren
loren / rails_routing_invalid_chars_fix.rb
Created August 17, 2012 17:37
Fix for ArgumentError: invalid byte sequence in UTF-8
require 'cgi'
require 'action_dispatch/routing/route_set'
# Based on https://gist.github.com/2830082
module ActionDispatch
  module Routing
    class RouteSet
      class Dispatcher
        def call_with_invalid_char_handling(env)
          uri = CGI.unescape(env["REQUEST_URI"].force_encoding("UTF-8"))
          # If anything in REQUEST_URI has an invalid encoding, return a 400 response, since it is likely to trigger errors further on.
          return [400, {'X-Cascade' => 'pass'}, []] if uri.is_a?(String) && !uri.valid_encoding?
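The encoding check in the preview above can be tried outside Rails. The following is a minimal sketch, not part of the gist: the helper name `decodable_utf8?` and the sample URIs are hypothetical, and it only illustrates how `CGI.unescape` combined with `String#valid_encoding?` separates well-formed percent-encoded UTF-8 from stray bytes.

```ruby
require 'cgi'

# Hypothetical helper (not from the gist): returns true when the
# percent-decoded URI is a valid UTF-8 string, false otherwise.
def decodable_utf8?(raw_uri)
  # dup so the caller's string is not mutated by force_encoding
  decoded = CGI.unescape(raw_uri.dup.force_encoding('UTF-8'))
  decoded.valid_encoding?
end

# %E2%9C%93 is the UTF-8 encoding of a check mark; %FF is never valid UTF-8.
puts decodable_utf8?('/search?q=%E2%9C%93')
puts decodable_utf8?('/search?q=%FF')
```

A request whose URI fails this check is the case the dispatcher patch answers with a 400 instead of letting an `ArgumentError` surface later in routing.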
loren / flickr_initial_mapping.json
Created October 23, 2014 02:25
Initial Flickr Photo mapping for Elasticsearch
{
  "development-asis-flickr_photos": {
    "mappings": {
      "flickr_photo": {
        "properties": {
          "description": {
            "type": "string",
            "analyzer": "en_analyzer"
          },
          "owner": {
loren / instagram_initial_mapping.json
Created October 28, 2014 17:29
Initial Instagram Photo mapping for Elasticsearch
{
  "development-asis-instagram_photos": {
    "mappings": {
      "instagram_photo": {
        "properties": {
          "caption": {
            "type": "string",
            "analyzer": "en_analyzer"
          },
          "popularity": {
loren / initial_asis_settings.json
Created October 28, 2014 17:30
Initial Elasticsearch settings
{
  "settings": {
    "index": {
      "analysis": {
        "char_filter": {
          "ignore_chars": {
            "type": "mapping",
            "mappings": [
              "'=>",
              "\u2019=>",
loren / initial_search_query.json
Created October 28, 2014 17:32
Initial Elasticsearch query across Instagram and Flickr
GET http://localhost:9200/development-asis-flickr_photos,development-asis-instagram_photos/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "field_value_factor": {
            "field": "popularity",
            "modifier": "log2p"
          }
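To see what the `log2p` modifier does to the `popularity` field, here is a small sketch in Ruby. It assumes the documented Elasticsearch behavior for `field_value_factor`, where `log2p` adds 2 to the (optionally scaled) field value and takes the common logarithm. The method name `log2p_factor` and the sample popularity values are hypothetical, and how the result is combined with the query score depends on parts of the query not shown in the preview.

```ruby
# Hypothetical illustration of field_value_factor with modifier "log2p":
# log10(2 + factor * field_value). Not taken from the gist.
def log2p_factor(popularity, factor = 1.0)
  Math.log10(2.0 + factor * popularity)
end

# Sample popularity values chosen only for illustration.
[0, 8, 98, 9_998].each do |popularity|
  puts format('popularity=%-5d factor=%.3f', popularity, log2p_factor(popularity))
end
```

The logarithm flattens large popularity counts, so a photo with thousands of likes outranks one with a handful without drowning out text relevance entirely.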
loren / filter_range.json
Created October 29, 2014 15:08
Combining Gaussian filter and constant boost factor based on date range filter
{
  "functions": [
    {
      "field_value_factor": {
        "field": "popularity",
        "modifier": "log2p"
      }
    },
    {
      "filter": {
loren / match_phrase.json
Created October 29, 2014 15:09
Recognize proximity of words
{
  "bool": {
    "should": [
      {
        "match": {
          "tags": {
            "query": "jefferson memorial",
            "analyzer": "tag_analyzer"
          }
        }
loren / second_flickr_mapping.json
Created October 29, 2014 15:12
Second iteration on Flickr mapping for Elasticsearch
{
  "properties": {
    "bigram": {
      "type": "string",
      "analyzer": "bigram_analyzer"
    },
    "description": {
      "type": "string",
      "analyzer": "en_analyzer",
      "copy_to": [
loren / second_instagram_mapping.json
Created October 29, 2014 15:13
Second iteration on Instagram mapping for Elasticsearch
{
  "properties": {
    "bigram": {
      "type": "string",
      "analyzer": "bigram_analyzer"
    },
    "caption": {
      "type": "string",
      "analyzer": "en_analyzer",
      "copy_to": [
loren / spelling.json
Created October 29, 2014 15:14
Spelling suggestion based on bigram field
{
  "suggest": {
    "text": "jeferson memorial",
    "suggestion": {
      "phrase": {
        "analyzer": "bigram_analyzer",
        "field": "bigram",
        "size": 1,
        "direct_generator": [
          {