Joris Bontje jorisbontje
@jorisbontje
jorisbontje / positweets.hive
Created May 15, 2012
Twitter sentiment analysis using Apache Hive
View positweets.hive
drop table if exists raw_tweets;
drop table if exists tweets;
drop table if exists positive_hashtags_per_day;
drop table if exists count_positive_hashtags_per_day;
drop table if exists top5_positive_hashtags_per_day;
create table raw_tweets (json string);
load data local inpath 'sample.json' into table raw_tweets;
create table tweets as
jorisbontje / gist:5803625
Created Jun 18, 2013
hadoop default xml
View gist:5803625
http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-project-dist/hadoop-common/core-default.xml
http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
View ansible_R.yml
---
- name: Install R
  yum: name=$item state=installed
  environment: $proxy_env
  with_items:
    - R
  tags:
    - packages
- name: Copy R package installer script
View gist:5056544
0) Download avro-tools jar file from avro.apache.org
1) Extract Avro schema using avro-tools.jar
java -jar avro-tools*.jar getschema file.avro > file.avsc
2) Upload Avro schema to hdfs
hadoop fs -put file.avsc /user/training/file.avsc
View WebHDFS.txt
curl -O http://python-distribute.org/distribute_setup.py
sudo python distribute_setup.py
curl -O https://raw.github.com/pypa/pip/master/contrib/get-pip.py
sudo python get-pip.py
sudo pip install webhdfs
cp /usr/lib/python2.6/site-packages/webhdfs/example.py .
jorisbontje / heroku_dynos.sh
Created Nov 14, 2012
Heroku total number of dynos
View heroku_dynos.sh
#!/bin/sh
# Return total number of Heroku dynos for an account
#
# Uses:
# jutil <https://github.com/misterfifths/jutil.git>
# underscore.js <http://underscorejs.org/>
API_KEY="<your Heroku API key>"
curl -s -H "Accept: application/json" -u ":$API_KEY" https://api.heroku.com/apps | jselect 'dynos' | jutil 'return _.reduce($, function(memo, num){ return memo + num; }, 0);'
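The reduction step in the pipeline above can be sketched in Python as well. This is a minimal illustration, not part of the original gist: it assumes the response body is a JSON array of app records that each carry a numeric `dynos` field, and the `sample` payload and the `total_dynos` name are hypothetical.

```python
import json


def total_dynos(apps):
    """Sum the 'dynos' field across a list of app records."""
    # Apps with a missing or null 'dynos' value count as zero.
    return sum(app.get("dynos") or 0 for app in apps)


# Hypothetical payload standing in for the API response body.
sample = json.loads(
    '[{"name": "a", "dynos": 2}, {"name": "b", "dynos": 3}, {"name": "c"}]'
)
print(total_dynos(sample))
```

Treating a missing field as zero keeps the sum from failing on apps that report no dynos; whether the real API omits the field in that case is an assumption.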
jorisbontje / export-scm-config.sh
Created May 27, 2012
Export the Cloudera Manager configuration
View export-scm-config.sh
#!/bin/bash
USERNAME=admin
PASSWORD=admin
SCM_URL=http://localhost:7180
COOKIES_FILE=cookies.txt
EXPORT_FILE=export.txt
wget -q --post-data="j_username=${USERNAME}&j_password=${PASSWORD}" --save-cookies ${COOKIES_FILE} --keep-session-cookies -O /dev/null ${SCM_URL}/j_spring_security_check
wget -q -O ${EXPORT_FILE} --load-cookies ${COOKIES_FILE} ${SCM_URL}/cmf/exportCLI
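The two `wget` calls above log in with a form POST, keep the session cookie, and then fetch the export. A rough Python equivalent using only the standard library might look like the sketch below. The function names `build_login_request` and `export_config` are hypothetical, and the endpoint paths and form field names are taken directly from the script above; nothing here has been checked against a running Cloudera Manager.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_login_request(scm_url, username, password):
    """Build the form POST that the first wget call sends."""
    form = urllib.parse.urlencode(
        {"j_username": username, "j_password": password}
    ).encode("ascii")
    # Supplying data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(scm_url + "/j_spring_security_check", data=form)


def export_config(scm_url, username, password):
    """Log in, keep the session cookie in memory, and return the export bytes."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # The login response body is not needed; only the cookie it sets.
    opener.open(build_login_request(scm_url, username, password)).close()
    with opener.open(scm_url + "/cmf/exportCLI") as resp:
        return resp.read()
```

Calling `export_config("http://localhost:7180", "admin", "admin")` would mirror the defaults in the script; the cookie jar lives in memory, so no `cookies.txt` file is written.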
View SortTest.java
@SuppressWarnings("serial")
@PlatformRunner.Platform({ LocalPlatform.class, HadoopPlatform.class})
public class SortTest extends PlatformTestCase {
    // Path to the input fixture used by the sort test
    private static final String inputFileSort = "src/test/data/sort.txt";

    public SortTest() {
        super(false);
    }
View gist:1074930
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException
LazySeq.java:47 clojure.lang.LazySeq.sval
LazySeq.java:56 clojure.lang.LazySeq.seq
Cons.java:39 clojure.lang.Cons.next
RT.java:1178 clojure.lang.RT.length
RT.java:1157 clojure.lang.RT.seqToArray
LazySeq.java:126 clojure.lang.LazySeq.toArray
RT.java:1135 clojure.lang.RT.toArray
core.clj:300 clojure.core/to-array
View gist:1054917
$ lein test
Testing cascalog-weather.test.weather
11/06/29 22:46:49 INFO hadoop.Hadoop18TapUtil: setting up task: 'attempt_002147483647_0000_m_000000_0' - file:/var/folders/YZ/YZO0QDWpEp0jBsowT4Bo4U+++TI/-Tmp-/tap57/4abafe2e-a136-4415-b7bc-09fd51acc301/_temporary/_attempt_002147483647_0000_m_000000_0
11/06/29 22:46:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
11/06/29 22:46:49 INFO hadoop.TapCollector: closing tap collector for: /var/folders/YZ/YZO0QDWpEp0jBsowT4Bo4U+++TI/-Tmp-/tap57/4abafe2e-a136-4415-b7bc-09fd51acc301/part-00000
11/06/29 22:46:49 INFO hadoop.Hadoop18TapUtil: committing task: 'attempt_002147483647_0000_m_000000_0' - file:/var/folders/YZ/YZO0QDWpEp0jBsowT4Bo4U+++TI/-Tmp-/tap57/4abafe2e-a136-4415-b7bc-09fd51acc301/_temporary/_attempt_002147483647_0000_m_000000_0
11/06/29 22:46:49 INFO hadoop.Hadoop18TapUtil: saved output of task 'attempt_002147483647_0000_m_000000_0' to file:/var/folders/YZ/YZO0QDWpEp0jBsow