André Kelpe fs111

14/02/28 14:54:31 INFO planner.HadoopPlanner: using application jar: /home/fs111/code/concurrent/es-test/build/libs/es-cascading-tryout.jar
14/02/28 14:54:31 INFO property.AppProps: using app.id: FD14E98AD3D446AF9912A2DBAAE9F14E
14/02/28 14:54:32 DEBUG cascading.EsTap: Detected Hadoop environment; initializing [EsHadoopTap]
14/02/28 14:54:32 TRACE cascading.EsTap: Registered Cascading serialization token 140 for org.elasticsearch.hadoop.mr.LinkedMapWritable
14/02/28 14:54:32 TRACE cascading.EsHadoopScheme: Initialized configuration {fs.s3n.impl=org.apache.hadoop.fs.s3native.NativeS3FileSystem, mapreduce.job.counters.max=120, mapred.task.cache.levels=2, mapreduce.job.restart.recover=true, map.sort.class=org.apache.hadoop.util.QuickSort, hadoop.tmp.dir=/tmp/hadoop-${user.name}, hadoop.native.lib=true, ipc.client.idlethreshold=4000, mapred.system.dir=${hadoop.tmp.dir}/mapred/system, mapred.job.tracker.persist.jobstatus.hours=0, io.skip.checksum.errors=false, fs.default.name=file:///, mapred.cluster.reduce.memory
$ curl 'http://elastic.local:9200/foo/bar/_search?pretty=true&q=*:*'
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
public class Main
{
public static void main( String[] args )
{
String outPath = args[ 0 ];
Map<Object, Object> properties = new HashMap<Object, Object>();
properties.put("es.http.timeout", "10m");
properties.put("es.scroll.size", "10");
AppProps.setApplicationJarClass( properties, Main.class );
$ hadoop jar build/libs/es-cascading-tryout.jar foo
14/02/28 13:32:41 INFO util.HadoopUtil: resolving application jar from found main method on: impatient.Main
14/02/28 13:32:41 INFO planner.HadoopPlanner: using application jar: /home/fs111/code/concurrent/es-test/build/libs/es-cascading-tryout.jar
14/02/28 13:32:41 INFO property.AppProps: using app.id: BB3B7C8730DC470D8E6ED817EFFC2E5B
14/02/28 13:32:41 INFO util.Version: Concurrent, Inc - Cascading 2.5.2
14/02/28 13:32:41 INFO flow.Flow: [] starting
14/02/28 13:32:41 INFO flow.Flow: [] source: EsHadoopTap["EsHadoopScheme[[UNKNOWN]->[ALL]]"]["employees/titles"]
14/02/28 13:32:41 INFO flow.Flow: [] sink: Hfs["TextDelimited[[UNKNOWN]->[ALL]]"]["foo"]
14/02/28 13:32:41 INFO flow.Flow: [] parallel execution is enabled: false
14/02/28 13:32:41 INFO flow.Flow: [] starting jobs: 1
public class Main
{
public static void main( String[] args )
{
String outPath = args[ 0 ];
Properties properties = new Properties();
AppProps.setApplicationJarClass( properties, Main.class );
HadoopFlowConnector flowConnector = new HadoopFlowConnector( properties );
$ gradle test
The TaskContainer.add() method has been deprecated and is scheduled to be removed in Gradle 2.0. Please use the create() method instead.
Using Apache Hadoop Stable [1.2.1]
:compileJava UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE
:compileTestJava UP-TO-DATE
:processTestResources UP-TO-DATE
:testClasses UP-TO-DATE
:test
fs111 / gist:7562431
Created November 20, 2013 12:32
hadoop-env for Lingual to make it work with CDH 4.4
if [ -z "${HADOOP_HOME}" ]; then
echo "warning: HADOOP_HOME is not set. see your hadoop documents for details."
BIN_PATH=`which hadoop`
if [ -n "${BIN_PATH}" ] ; then
HADOOP_HOME=${BIN_PATH%/bin/hadoop}
echo "lingual will use HADOOP_HOME of ${HADOOP_HOME}"
else
echo "exiting."
exit 1
fi
fs111 / gist:7013230
Created October 16, 2013 19:18
These are the notes from my talk about Lingual at the Big Data Beers meetup in Berlin: http://www.meetup.com/Big-Data-Beers/events/143392512/

Lingual - ANSI SQL for Apache Hadoop

Big Data Beers Berlin, October 16th 2013

Speaker

  • André Kelpe | @fs111 | andre[at]concurrentinc[dot]com
  • works for Concurrent, Inc. (http://concurrentinc.com)
    • company behind Cascading and Lingual

Cascading

$ du -sch .sbt .lein .gradle .m2 .ivy2
108M .sbt
14M .lein
167M .gradle
659M .m2
87M .ivy2
1.1G total
>>> d = {"a":42}
>>> t = d,
>>> l = [t]
>>> l[0][0]['a']
42
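
The REPL session above relies on a detail that is easy to miss: in `t = d,` it is the trailing comma, not any parentheses, that builds a one-element tuple. A minimal sketch of the same steps as a script, with the variable `value` and the assertions added here purely for illustration:

```python
# The trailing comma creates a one-element tuple; parentheses are optional.
d = {"a": 42}
t = d,        # equivalent to (d,): a tuple holding the dict
l = [t]       # a list holding that tuple

# The tuple wraps the very same dict object, it does not copy it.
assert isinstance(t, tuple) and len(t) == 1
assert t[0] is d

# Lookup order: list index, then tuple index, then dict key.
value = l[0][0]["a"]
print(value)
```

Without the comma, `t = d` would simply bind a second name to the dict, and `l[0][0]` would raise a `KeyError` because the dict has no key `0`.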