Radosław Stankiewicz stankiewicz

/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
git clone https://gitlab.com/gcp-training-bdtsw/dataflow-python.git
pip install --user -r requirements.txt
export DATAFLOW_BUCKET=gs://$GOOGLE_CLOUD_PROJECT-dataflow-temp
gsutil mb $DATAFLOW_BUCKET
export DATASET=$GOOGLE_CLOUD_PROJECT:dataflow_test
bq mk --dataset $DATASET
python dataflow/data_ingestion.py \
--project=$GOOGLE_CLOUD_PROJECT \
export HADOOP_USER_NAME=hdfs
hdfs dfs -mkdir /user/cloudbreak
hdfs dfs -chown -R cloudbreak:hdfs /user/cloudbreak
hdfs dfs -chmod -R 777 /user/cloudbreak
/usr/local/bin/pip install --upgrade pip
/usr/local/bin/pip install numpy
/usr/local/bin/pip install pandas
/usr/local/bin/pip install jupyter
stankiewicz / sh
Created March 2, 2017 09:54
Installing hunspell via pip on OS X
# based on https://coderwall.com/p/nhmyeg/installing-hunspell-via-pip-on-osx-mavericks
brew install hunspell
export C_INCLUDE_PATH=/usr/local/include/hunspell
ln -sf /usr/local/lib/libhunspell-1.6.a /usr/local/lib/libhunspell.a
pip install hunspell
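The symlink step above hard-codes libhunspell-1.6.a, and the version in that file name will differ between Homebrew releases. A minimal sketch of a version-independent variant, assuming the archive follows the libhunspell-<version>.a naming pattern; the function name `link_hunspell` is hypothetical and not part of the original gist:

```shell
# Hypothetical helper: link the last libhunspell-<version>.a found in a
# directory (default /usr/local/lib) to libhunspell.a in the same directory.
link_hunspell() {
  lib_dir="${1:-/usr/local/lib}"
  archive=""
  # Glob results are sorted, so the last existing match wins.
  # Assumption: lexical order is good enough to pick the newest version.
  for f in "$lib_dir"/libhunspell-*.a; do
    if [ -e "$f" ]; then
      archive="$f"
    fi
  done
  if [ -z "$archive" ]; then
    echo "no libhunspell-*.a found in $lib_dir" >&2
    return 1
  fi
  ln -sf "$archive" "$lib_dir/libhunspell.a"
  echo "$archive"
}
```

Calling `link_hunspell` with no argument would replace the `ln -sf` line above; passing a directory lets you try it on a scratch folder first.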
redash:
  image: redash/redash:latest
  ports:
    - "5000:5000"
  links:
    - redis
    - postgres
  environment:
    - REDASH_LOG_LEVEL=INFO
    - REDASH_REDIS_URL=redis://redis:6379/0
stankiewicz / where_is_my_collection.sh
Last active January 3, 2017 08:49
My problem with SolrCloud is very slow tree rendering, which makes it difficult to find the servers that hold each shard. Change zkhost to one or more of your ZooKeeper hosts, and change the COLLECTION export from MyCollection to the collection you want to find.
# Print "<shard> - <node>" for each shard listed under the collection's leader_elect node.
export COLLECTION=MyCollection && zookeeper-client -server zkhost:2181 ls /solr/collections/$COLLECTION/leader_elect | \
tail -n 1 | grep -E -o '[a-zA-Z0-9]+' | while read line; do
export SHARD=$line ;
# Take the leading number of the first entry in this shard's election list.
export to_grep=`zookeeper-client -server zkhost:2181 ls /solr/collections/$COLLECTION/leader_elect/$SHARD/election | tail -n 1 | \
grep -E -o '^\[[0-9]+' | cut -c 2-` ;
# Find the overseer election entry containing that number and print the text between its first "-" and first "_".
zookeeper-client -server zkhost:2181 ls /solr/overseer_elect/election | \
tail -n 1 | tail -c +2 | head -c -1 | grep -E -o '([^,]+)+' | grep "$to_grep" | \
gawk -v SHARD="$SHARD" 'match($0, "^[ 0-9]+-([^_]+)", ary) {print SHARD " - " ary[1]}' ;
done
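The pipeline above relies on `zookeeper-client ls` printing a node's children as a bracketed, comma-separated list on its last line. A minimal sketch of that parsing step in isolation, so it can be tried without a ZooKeeper connection; the function name `zk_children` and the sample input are hypothetical:

```shell
# Hypothetical helper: split a "[a, b, c]" line read from stdin
# into one entry per line.
zk_children() {
  # Strip the surrounding brackets, break on commas, trim leading spaces.
  sed -e 's/^\[//' -e 's/\]$//' | tr ',' '\n' | sed -e 's/^ *//'
}

# Sample input invented for illustration.
printf '%s\n' '[shard1, shard2]' | zk_children
```

Each resulting line can then be fed to a `while read` loop in the same way the script iterates over shard names.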
stankiewicz / where_is_my_collection.sh
Created January 3, 2017 08:44
change zkhost to one or more of your zookeeper hosts, change MyCollection export to something mo
export COLLECTION=MyCollection && zookeeper-client -server zkhost:2181 ls /solr/collections/$COLLECTION/leader_elect | tail -n 1 | egrep -o '[a-zA-Z0-9]+' | while read line; do export SHARD=$line ; export to_grep=`zookeeper-client -server zkhost:2181 ls /solr/collections/$COLLECTION/leader_elect/$SHARD/election | tail -n 1 | egrep -o '^\[[0-9]+' | cut -c 2-` ;zookeeper-client -server zkhost:2181 ls /solr/overseer_elect/election | tail -n 1 | tail -c +2 | head -c -1 | egrep -o '([^,]+)+' | grep $to_grep | gawk -v SHARD="$SHARD" 'match($0, "^[ 0-9]+-([^_]+)", ary) {print SHARD " - " ary[1]}' ; done
POST /megacorp/transactions2/_search
{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "520km",
          "Location": [
            -111.89028,
            40.76083
POST /megacorp/transactions2/_search
{
  "query": {
    "filtered": {
      "filter": {
        "geo_bounding_box": {
          "Location": {
            "top_left": [
              -111.89028,
              40.76083