Radosław Stankiewicz stankiewicz

POST /megacorp/transactions2/_search
{
  "query": {
    "filtered" : {
      "filter" : {
        "geo_distance" : {
          "distance" : "520km",
          "Location" : [-111.89028,40.76083]
        }
      },
@stankiewicz
stankiewicz / gist:57e80cf84385b85e304d
Last active August 29, 2015 14:13
Pig to Elastic
REGISTER /Users/radoslawstankiewicz/Downloads/elasticsearch-hadoop-2.0.2.jar
wejscie = LOAD '/Users/radoslawstankiewicz/Downloads/pig-0.14.0/bin/SalesJan2009_final.csv' using PigStorage(';')
    AS (transdate:chararray,
        product:chararray,
        price:long,
        paymenttype:chararray,
        name:chararray,
        city:chararray,
        state:chararray,
        ccode:chararray,
stankiewicz / gist:183cdc6997136b3e5f61
Created January 20, 2015 14:10
Pig-elastic search
wget http://central.maven.org/maven2/org/elasticsearch/elasticsearch-hadoop/2.0.2/elasticsearch-hadoop-2.0.2.jar
stankiewicz / gist:2e5a2c700b20e13e7cea
Created January 20, 2015 13:59
Creating an index with the transactions type
POST /megacorp/_mapping/transactions2
{
  "transactions2": {
    "_timestamp": {
      "enabled": true,
      "type": "date",
      "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZ",
      "store": true,
      "path": "_timestamp"
    },
GET /megacorp/transactions/_mapping
{
  "megacorp": {
    "mappings": {
      "transactions": {
        "properties": {
          "City": {
            "type": "string"
          },
POST /megacorp/transactions/
{
  "_timestamp" : "2015-01-12T23:37:00.000+01:00",
  "Product" : "Product2",
  "Price" : 5200,
  "Payment_type": "Mastercard",
  "Name" : "carolina",
  "City" : "Basildon",
  "State" : "England",
  "Country" : "GB",
https://chrome.google.com/webstore/detail/sense-beta/lhjgkmllcaadmopgmanpapmpjgmfcfig
stankiewicz / gist:73ece9b82f0870f4038a
Created January 19, 2015 21:44
Pig Download Path
wget http://ftp.ps.pl/pub/apache/pig/pig-0.14.0/pig-0.14.0.tar.gz
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming.jar -file mapper.py -mapper mapper.py -file reducer.py -reducer reducer.py -input /tmp/SalesJan2009_final.csv -output /tmp/my_output/
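The streaming command above refers to a mapper.py that is not shown in this listing. What follows is only a hypothetical sketch of what such a mapper could look like, not the author's script. It assumes the input is the semicolon-delimited sales file loaded in the Pig snippet, that the product is the second column, and that the reducer expects "product,count" lines. The function name map_line and the sample row are made up for illustration.

```python
def map_line(line, delimiter=";", product_index=1):
    """Turn one CSV row into a 'product,1' record, or None if the row is unusable.

    The delimiter and column position are assumptions taken from the Pig
    LOAD statement (PigStorage(';'), product as the second field).
    """
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) <= product_index:
        return None  # row is too short to contain a product column
    product = fields[product_index].strip()
    if not product:
        return None  # empty product name, nothing to count
    return f"{product},1"


# Small in-memory demonstration with a made-up row.
sample_rows = ["1/2/09 6:17;Product1;1200;Mastercard;carolina;Basildon;England;GB"]
for row in sample_rows:
    record = map_line(row)
    if record is not None:
        print(record)
```

In an actual streaming job the loop would iterate over sys.stdin instead of sample_rows, and the file would need a shebang line and execute permission.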
#!/usr/bin/env python
import sys

current_product = None
current_count = 0
product = None

for line in sys.stdin:
    line = line.strip()
    product, count = line.split(',', 1)
    count = int(count)
    if current_product == product:
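The preview above is cut off at the comparison against current_product. As a rough sketch of how a reducer of this shape is commonly completed (not the author's actual code), the version below sums counts for consecutive lines that share a product and emits a total whenever the product changes. It assumes the input arrives sorted by product, as Hadoop Streaming delivers it to a reducer, and that each line looks like "product,count". The function name reduce_counts and the sample lines are hypothetical.

```python
def reduce_counts(lines):
    """Yield (product, total) pairs from 'product,count' lines sorted by product."""
    current_product = None
    current_count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        product, count = line.split(",", 1)
        count = int(count)
        if current_product == product:
            # Same product as the previous line: keep accumulating.
            current_count += count
        else:
            # Product changed: emit the finished group, then start a new one.
            if current_product is not None:
                yield current_product, current_count
            current_product = product
            current_count = count
    # Emit the last group, which no later line would otherwise flush.
    if current_product is not None:
        yield current_product, current_count


# Small in-memory demonstration with made-up, already sorted input.
sample_lines = ["Product1,1", "Product1,1", "Product2,1"]
for product, total in reduce_counts(sample_lines):
    print(f"{product},{total}")
```

In a streaming job, sys.stdin would be passed to reduce_counts in place of sample_lines. The final flush after the loop is the step most often forgotten in this pattern; without it the last product is never written out.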