Mark H. Butler butlermh

  • Santa Clara, United States
@butlermh
butlermh / lucene4cosine.java
Created January 30, 2013 12:32
Using Lucene 4 to calculate cosine similarity
import java.io.IOException;
import java.util.*;
import org.apache.commons.math3.linear.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.store.*;
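The preview above stops at the imports, so the gist's own method is not visible here. As a rough illustration of the cosine similarity the title refers to, the sketch below compares two term-frequency vectors using only `java.util`. The class name `CosineSketch`, its helper methods, and the whitespace tokenizer are assumptions made for this example; it does not use Lucene or Commons Math and is not the gist's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, self-contained sketch: cosine similarity over term counts.
public class CosineSketch {

    // Counts how often each lower-cased, whitespace-separated token occurs.
    public static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a term-frequency vector.
    public static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int count : v.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Dot product of the two vectors divided by the product of their lengths.
    // Returns 0.0 when either vector is empty, to avoid dividing by zero.
    public static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    public static void main(String[] args) {
        Map<String, Integer> d1 = termFrequencies("the quick brown fox");
        Map<String, Integer> d2 = termFrequencies("the quick red fox");
        // Three of the four terms are shared, so this prints 0.75.
        System.out.println(cosine(d1, d2));
    }
}
```

With Lucene, the term counts would presumably come from stored term vectors rather than from splitting strings, but the arithmetic on the resulting vectors is the same.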
butlermh / hadoop-build.sh
Created December 23, 2011 14:15
Snippet from Hadoop build script
# note you need to set the following environment variables
# NATIVE - the native LZO library location
# ANT17 - your Ant 1.7 installation
# FORREST_HOME - your Apache Forrest 0.8 installation
# HADOOP_VERSION - the version you are giving Hadoop
# JAVA_HOME - your Java installation
env CFLAGS=-m64 CXXFLAGS=-m64 C_INCLUDE_PATH=$NATIVE/include \
LIBRARY_PATH=$NATIVE/lib LD_LIBRARY_PATH=$NATIVE/lib
butlermh / forrest.properties
Created December 23, 2011 14:07
Changes to Forrest.properties so Hadoop will build with JDK 1.6
forrest.validate.sitemap=false
forrest.validate.skins.stylesheets=false
butlermh / hadoop-lzo-build.sh
Created December 23, 2011 13:31
Snippet from Hadoop-LZO build script
env CFLAGS=-m64 CXXFLAGS=-m64 C_INCLUDE_PATH=$NATIVE/include \
  LIBRARY_PATH=$NATIVE/lib LD_LIBRARY_PATH=$NATIVE/lib \
  JAVA_LIBRARY_PATH=$NATIVE/lib ant -Dtest.junit.output.format=xml \
  -Dtest.output=yes -Dversion=$HADOOP_LZO_VERSION test package published
butlermh / lzo-build.sh
Created December 23, 2011 13:15
Snippet from LZO build script
env CFLAGS="-m64" CXXFLAGS="-m64" ./configure --enable-shared
make
make check
make test
make DESTDIR=$PWD/build install
butlermh / applicationContext.xml
Created December 23, 2011 11:28
Snippet from Spring config file
<bean id="resourceStore" scope="singleton"
      class="my.hbase.using.Webapp">
  <constructor-arg>
    <bean scope="singleton"
          class="org.apache.hadoop.hbase.HBaseConfigurationSpringWrapper">
      <constructor-arg>
        <bean scope="singleton" class="org.apache.hadoop.conf.Configuration" />
      </constructor-arg>
      <constructor-arg>
        <map>
butlermh / HBaseConfigurationSpringWrapper.java
Created December 23, 2011 11:24
A Spring friendly wrapper for HBaseConfiguration
package org.apache.hadoop.hbase;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
public class HBaseConfigurationSpringWrapper extends HBaseConfiguration {
    public HBaseConfigurationSpringWrapper(Configuration config,
            Map<String, String> configParams) {
butlermh / QueryParserTest.java
Created June 29, 2011 11:09
Test that checks ElasticSearch JSON queries are valid
package org.elasticsearchtest;
import static org.elasticsearch.common.io.Streams.copyToStringFromClasspath;
import java.io.IOException;
import org.apache.lucene.search.Query;
import org.elasticsearch.common.inject.Injector;
import org.elasticsearch.common.inject.ModulesBuilder;
import org.elasticsearch.common.settings.ImmutableSettings;
butlermh / boolquery.json
Created June 29, 2011 11:05
Nested boolean queries for ElasticSearch
{
  "bool": {
    "must": [
      {
        "query_string": {
          "default_field": "content",
          "query": "test1"
        }
      },
      {
butlermh / wishlist.rb
Created June 16, 2011 19:29
Find out which books on your Amazon wishlist are available on Kindle
#!/usr/bin/ruby
require 'rubygems'
require 'open-uri'
require 'nokogiri'
require 'net/http'
require 'uri'
require 'amazon/aws/search'
include Amazon::AWS