bradfordcp /
Created September 2, 2010 19:12
Converts a WordNet prolog file into a flat file useful for Solr synonym matching.
* Based off of the Lucene prolog parser in the wordnet contrib package within the
* main Lucene project. It has been modified to remove the Lucene bits and generate
* a synonyms.txt file suitable for consumption by Solr. The idea was mentioned in
* a sidebar of the book Solr 1.4 Enterprise Search Server by Eric Pugh.
* @see <a href="">Lucene Sandbox WordNet page</a>
* @see <a href="">SVN Repository of the WordNet contrib</a>
* @see <a href="">Solr 1.4 Enterprise Search Server Book</a>
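The conversion described above can be sketched in Python. This is a minimal illustration, not the gist's Java code: it assumes each WordNet prolog line has the shape `s(synset_id,w_num,'word',ss_type,sense_number,tag_count).`, that Prolog doubles embedded single quotes, and that one comma-separated line per synset is the desired synonyms file layout. The function name `synonym_lines` is hypothetical.

```python
import re
from collections import defaultdict

# Assumed line shape: s(synset_id,w_num,'word',ss_type,sense_number,tag_count).
S_LINE = re.compile(r"^s\((\d+),\d+,'(.*)',[a-z],\d+,\d+\)\.$")


def synonym_lines(prolog_lines):
    """Group words by synset id and return one comma-separated line per synset."""
    groups = defaultdict(list)
    for line in prolog_lines:
        match = S_LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not an s(...) fact
        synset_id, word = match.groups()
        # Assumption: Prolog escapes a single quote by doubling it.
        word = word.replace("''", "'").lower()
        if word not in groups[synset_id]:
            groups[synset_id].append(word)
    # A synset with a single word carries no synonym information, so drop it.
    return [", ".join(words) for words in groups.values() if len(words) > 1]
```

Writing the returned lines to a file, one per line, gives a flat synonyms list. Words that themselves contain commas would need escaping first, which this sketch does not handle.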
curl -XPUT localhost:9200/test
curl -XPUT http://localhost:9200/test/data/_mapping -d '{
  "data" : {
    "dynamic_templates" : [
      {
        "string_template" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
karmi / movie-titles.rb
Created January 13, 2013 20:42
Multiple analyzers and query fields in Elasticsearch for auto-completion
require 'tire'
# Tire.configure { logger STDERR, level: 'debug' }
Tire.index('movie-titles') do
  create \
    settings: {
      index: {
        analysis: {
mrflip /
Last active January 21, 2024 21:06
Elasticsearch Tuning Plan

Next Steps

  • Measure time spent on index, flush, refresh, merge, query, etc. (TD - done)
  • Take hot threads snapshots under read+write, read-only, write-only (TD - done)
  • Adjust refresh time to 10s (from 1s) and see how load changes (TD)
  • Measure time of a rolling restart doing disable_flush and disable_recovery (TD)
  • Specify routing on query -- make it choose same node for each shard each time (MD)
  • GC new generation size (TD)
  • Warmers
  • Measure client query time before and after enabling warmers (MD)
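The refresh-interval step in the list above amounts to updating the `index.refresh_interval` setting. A minimal sketch, assuming the change goes through the index `_settings` endpoint; the helper name `refresh_settings_request` and the index name used below are hypothetical, and the function only builds the request rather than sending it:

```python
import json


def refresh_settings_request(index, seconds):
    """Return (method, path, body) for changing an index's refresh interval."""
    body = {"index": {"refresh_interval": f"{seconds}s"}}
    return "PUT", f"/{index}/_settings", json.dumps(body)


# Example: the 10s interval proposed in the plan, for a hypothetical index.
method, path, body = refresh_settings_request("my-index", 10)
print(method, path, body)
```

Keeping the request construction separate from the HTTP call makes it easy to record the exact settings applied before each measurement run.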
linjunpop /
Last active May 30, 2023 08:20
Deploy Rails 4 app with Dokku on DigitalOcean

Install dokku

First, create an Ubuntu 13.04 x64 droplet in the DigitalOcean Control Panel.

Then SSH in as root and run this in the terminal:

$ wget -qO- | sudo bash
psorianom /
Created August 23, 2014 13:40
Text feature extractor with okapi bm25 and delta idf
# -*- coding: utf-8 -*-
# Authors: Olivier Grisel
#          Mathieu Blondel
#          Lars Buitinck
#          Robert Layton
#          Jochen Wersdörfer
#          Roman Sinayev
# License: BSD 3 clause
joech4n /
Last active November 28, 2018 16:40
Get bucket size and object count by first level prefix (i.e. bucket/prefix1, bucket/prefix2)
BUCKETNAME=mybucketname; REGION=us-east-1; for prefix in $(aws s3api list-objects --bucket $BUCKETNAME --delimiter '/' --output text --region $REGION |grep COMMONPREFIX |tail -n+2| awk '{print $2}'); do echo "Totals for $prefix"; aws s3 ls --summarize --human-readable --recursive s3://$BUCKETNAME/$prefix --region $REGION ; done |grep Total
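The grouping that the one-liner above performs can be sketched as a pure Python function. This is an illustration only: it assumes the object listing has already been fetched as `(key, size_in_bytes)` pairs, and the name `totals_by_prefix` is hypothetical.

```python
from collections import defaultdict


def totals_by_prefix(objects, delimiter="/"):
    """Return {first_level_prefix: (object_count, total_bytes)}."""
    totals = defaultdict(lambda: [0, 0])  # prefix -> [object_count, total_bytes]
    for key, size in objects:
        head, sep, _ = key.partition(delimiter)
        # Keys without a delimiter sit at the bucket root; group them under "".
        prefix = head + sep if sep else ""
        totals[prefix][0] += 1
        totals[prefix][1] += size
    return {prefix: tuple(pair) for prefix, pair in totals.items()}
```

Unlike the shell loop, which issues one listing call per prefix, this aggregates in a single pass over a listing that is already in memory.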
synapticarbors /
Created September 2, 2016 18:46
Numba implementation of np.roll
import numpy as np
import numba as nb
from numba import types
from numba.extending import overload_method
@overload_method(types.Array, 'take')
def array_take(arr, indices):
    if isinstance(indices, types.Array):
import tensorflow as tf  # TensorFlow 1.x API (tf.random_uniform, tf.log)


def sample_gumbel(shape, eps=1e-20):
    """Sample from Gumbel(0, 1)"""
    U = tf.random_uniform(shape, minval=0, maxval=1)
    # eps guards both logarithms against log(0)
    return -tf.log(-tf.log(U + eps) + eps)


def gumbel_softmax_sample(logits, temperature):
    """Draw a sample from the Gumbel-Softmax distribution"""
    y = logits + sample_gumbel(tf.shape(logits))
    return tf.nn.softmax(y / temperature)
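For readers without TensorFlow at hand, the same sampling steps can be sketched with the Python standard library alone. This is a minimal illustration for a single logit vector, not a replacement for the tensor code; the `rng` parameter and the max-subtraction for numerical stability are additions that the original does not have.

```python
import math
import random


def sample_gumbel(n, eps=1e-20, rng=random):
    """Draw n samples from Gumbel(0, 1) via -log(-log(U)), with U uniform on [0, 1)."""
    return [-math.log(-math.log(rng.random() + eps) + eps) for _ in range(n)]


def gumbel_softmax_sample(logits, temperature, rng=random):
    """Add Gumbel noise to the logits, then apply a temperature-scaled softmax."""
    noise = sample_gumbel(len(logits), rng=rng)
    y = [(logit + g) / temperature for logit, g in zip(logits, noise)]
    m = max(y)  # subtract the max before exponentiating to avoid overflow
    exps = [math.exp(v - m) for v in y]
    total = sum(exps)
    return [e / total for e in exps]
```

Each call returns a probability vector; lower temperatures push the vector closer to one-hot, while higher temperatures flatten it.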
yzh119 /
Created January 12, 2018 12:25
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
def sample_gumbel(shape, eps=1e-20):
    U = torch.rand(shape).cuda()
    return -Variable(torch.log(-torch.log(U + eps) + eps))