Joris Bontje jorisbontje

jorisbontje / positweets.hive
Created May 15, 2012
Twitter sentiment analysis using Apache Hive
View positweets.hive
drop table if exists raw_tweets;
drop table if exists tweets;
drop table if exists positive_hashtags_per_day;
drop table if exists count_positive_hashtags_per_day;
drop table if exists top5_positive_hashtags_per_day;
create table raw_tweets (json string);
load data local inpath 'sample.json' into table raw_tweets;
create table tweets as
jorisbontje / gist:5803625
Created Jun 18, 2013
hadoop default xml
View gist:5803625
View ansible_R.yml
- name: Install R
  yum: name=$item state=installed
  environment: $proxy_env
  with_items:
    - R
    - packages
- name: Copy R package installer script
View gist:5056544
0) Download avro-tools jar file from
1) Extract Avro schema using avro-tools.jar
java -jar avro-tools*.jar getschema file.avro > file.avsc
2) Upload Avro schema to hdfs
hadoop fs -put file.avsc /user/training/file.avsc
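Where avro-tools is not at hand, the getschema step above can be approximated with the Python standard library alone. This is a minimal sketch, not the avro-tools implementation: it assumes the input is an Avro object container file, whose header is the magic bytes "Obj" plus 0x01 followed by a metadata map that holds the schema under the key avro.schema. The function names read_long, read_bytes, and read_schema are hypothetical.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def read_long(stream):
    """Decode one zigzag-encoded, variable-length Avro long."""
    shift = 0
    raw = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("unexpected end of Avro header")
        value = chunk[0]
        raw |= (value & 0x7F) << shift
        if not value & 0x80:  # high bit clear: last byte of this number
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Read a length-prefixed byte string (also used for Avro strings)."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of Avro header")
    return data


def read_schema(stream):
    """Return the schema stored in the container header as a Python object."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero-length block ends the metadata map
            break
        if count < 0:  # negative count: a block byte size follows
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            metadata[key] = read_bytes(stream)
    return json.loads(metadata["avro.schema"].decode("utf-8"))
```

Usage, assuming file.avro from the steps above: open the file in binary mode, pass it to read_schema, and write the result to file.avsc with json.dump.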
View WebHDFS.txt
curl -O
sudo python
curl -O
sudo python
sudo pip install webhdfs
cp /usr/lib/python2.6/site-packages/webhdfs/ .
jorisbontje /
Created Nov 14, 2012
Heroku total number of dynos
# Return total number of Heroku dynos for an account
# Uses:
# jutil <>
# underscore.js <>
API_KEY="<your Heroku API key>"
curl -s -H "Accept: application/json" -u :$API_KEY | jselect 'dynos' | jutil 'return _.reduce($, function(memo, num){ return memo + num; }, 0);'
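The jselect and jutil pipeline above adds up the dynos field across the apps returned by the API. The same reduction can be sketched in Python, assuming the response body is a JSON array of app objects that may each carry a numeric dynos field. The function name total_dynos is hypothetical, and fetching the response is left out.

```python
import json


def total_dynos(apps_json):
    """Sum the 'dynos' field over a JSON array of app objects.

    Assumption: apps_json is the raw response body, a JSON array of objects.
    Apps without a 'dynos' field (or with a null value) count as zero.
    """
    apps = json.loads(apps_json)
    return sum(app.get("dynos") or 0 for app in apps)
```

For example, the output of the curl command could be piped into a script that calls total_dynos(sys.stdin.read()).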
jorisbontje /
Created May 27, 2012
Export the Cloudera Manager configuration
wget -q --post-data="j_username=${USERNAME}&j_password=${PASSWORD}" --save-cookies ${COOKIES_FILE} --keep-session-cookies -O /dev/null ${SCM_URL}/j_spring_security_check
wget -q -O ${EXPORT_FILE} --load-cookies ${COOKIES_FILE} ${SCM_URL}/cmf/exportCLI
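The two wget calls above first log in to obtain a session cookie, then reuse that cookie to download the export. A rough Python equivalent using only the standard library might look like the sketch below. The endpoint paths and form field names are taken from the commands above; the function names build_login_request and export_config are hypothetical, and the code has not been run against a Cloudera Manager instance.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_login_request(scm_url, username, password):
    """Build the POST request that submits the login form."""
    form = urllib.parse.urlencode(
        {"j_username": username, "j_password": password}
    ).encode("ascii")
    # Supplying data makes urllib send this request as a POST.
    return urllib.request.Request(scm_url + "/j_spring_security_check", data=form)


def export_config(scm_url, username, password):
    """Log in, keep the session cookie in memory, and return the export body."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Step 1: log in; the cookie jar captures the session cookie.
    opener.open(build_login_request(scm_url, username, password)).close()
    # Step 2: fetch the export with the same opener, which sends the cookie back.
    with opener.open(scm_url + "/cmf/exportCLI") as response:
        return response.read()
```

Keeping the cookie jar in memory avoids the temporary cookies file that the wget version writes to disk.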
@PlatformRunner.Platform({ LocalPlatform.class, HadoopPlatform.class})
public class SortTest extends PlatformTestCase {
private static final String inputFileSort = "src/test/data/sort.txt";
public SortTest() {
View gist:1074930
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException clojure.lang.LazySeq.sval clojure.lang.LazySeq.seq clojure.lang.RT.length clojure.lang.RT.seqToArray clojure.lang.LazySeq.toArray clojure.lang.RT.toArray
core.clj:300 clojure.core/to-array
View gist:1054917
$ lein test
11/06/29 22:46:49 INFO hadoop.Hadoop18TapUtil: setting up task: 'attempt_002147483647_0000_m_000000_0' - file:/var/folders/YZ/YZO0QDWpEp0jBsowT4Bo4U+++TI/-Tmp-/tap57/4abafe2e-a136-4415-b7bc-09fd51acc301/_temporary/_attempt_002147483647_0000_m_000000_0
11/06/29 22:46:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
11/06/29 22:46:49 INFO hadoop.TapCollector: closing tap collector for: /var/folders/YZ/YZO0QDWpEp0jBsowT4Bo4U+++TI/-Tmp-/tap57/4abafe2e-a136-4415-b7bc-09fd51acc301/part-00000
11/06/29 22:46:49 INFO hadoop.Hadoop18TapUtil: committing task: 'attempt_002147483647_0000_m_000000_0' - file:/var/folders/YZ/YZO0QDWpEp0jBsowT4Bo4U+++TI/-Tmp-/tap57/4abafe2e-a136-4415-b7bc-09fd51acc301/_temporary/_attempt_002147483647_0000_m_000000_0
11/06/29 22:46:49 INFO hadoop.Hadoop18TapUtil: saved output of task 'attempt_002147483647_0000_m_000000_0' to file:/var/folders/YZ/YZO0QDWpEp0jBsow