Set up a Scalding job (for the first time) to run on CDH3u2, using Maven
1. go to your scalding source directory
2. edit build.sbt (https://gist.github.com/238d74b081d9f2c6e5f1)
3. sbt -29 update
4. sbt -29 assembly
5. mvn install:install-file ..... (http://maven.apache.org/plugins/maven-install-plugin/usage.html) to install the created scalding-assembly.0.x.y.jar into your local Maven repository
6. download Cloudera's hadoop-0.20.2-cdh3u2.tar.gz (or just download hadoop-core-cdh3u2.jar)
6a. as in step 5, install the cdh3u2 hadoop-core jar into your local Maven repository (or embed Cloudera's parent pom instead)
7. in your IDE, create a new project using this pom: https://gist.github.com/40f1838bbdd15cc25b21
8. create the file src/assembly/job.xml with this content: https://gist.github.com/9c5e6f04da287667983a
9. create your Scala class extending Scalding's Job, e.g. "class SomethingCool(args: Args) extends Job(args)"
10. mvn package
11. the jar is created under your project's target folder, named like: YOURPROJECT-0.0.1-SNAPSHOT-job.jar
12. set up your hadoop conf files (most importantly, your core-site.xml file) and point fs.default.name at your namenode:
    <property>
      <name>fs.default.name</name>
      <value>hdfs://namenode.somethingcool.com:8020/</value>
    </property>
13. cd to your hadoop-0.20-cdh3u2 folder
14. bin/hadoop jar YOURPROJECT-0.0.1-SNAPSHOT-job.jar com.twitter.scalding.Tool your.package.your.class --hdfs --input hdfs://namenode.somethingcool.com/user/hdfs/tmp/hello.txt --output hdfs://namenode.somethingcool.com/user/hdfs/tmp/hello_out.txt -libjars YOURPROJECT-0.0.1-SNAPSHOT-job.jar
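
Steps 5 and 6a leave the install-file arguments out, so here is a rough sketch of what those two commands might look like. The -Dfile, -DgroupId, -DartifactId, -Dversion and -Dpackaging parameters are the ones described on the maven-install-plugin usage page linked in step 5. The jar paths, group IDs, artifact IDs and versions below are assumptions for illustration only; use whatever coordinates the pom from step 7 actually declares. The script prints the commands instead of running them, so they can be checked first.

```shell
#!/bin/sh
# Sketch of steps 5 and 6a: build the mvn install:install-file commands.
# All jar paths and Maven coordinates here are hypothetical placeholders;
# they must match the <dependency> entries in the pom you use in step 7.

# Print (do not run) one install-file command.
# $1 = jar path, $2 = groupId, $3 = artifactId, $4 = version
install_cmd() {
  printf 'mvn install:install-file -Dfile=%s -DgroupId=%s -DartifactId=%s -Dversion=%s -Dpackaging=jar\n' \
    "$1" "$2" "$3" "$4"
}

# Step 5: the assembled scalding jar (coordinates are assumed).
SCALDING_CMD=$(install_cmd target/scalding-assembly.0.x.y.jar com.twitter scalding 0.x.y)

# Step 6a: the cdh3u2 hadoop-core jar (coordinates are assumed).
HADOOP_CMD=$(install_cmd hadoop-core-cdh3u2.jar org.apache.hadoop hadoop-core 0.20.2-cdh3u2)

echo "$SCALDING_CMD"
echo "$HADOOP_CMD"
```

Once the printed commands match your pom, run them from the directory that holds the jars, or pipe the script's output to sh.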