@cotdp cotdp/gist:3062892
Created Jul 6, 2012

Example ZipFile Job
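// Fragment from inside a job driver method. It assumes imports for
// org.apache.hadoop.fs.Path, org.apache.hadoop.io.{Text, IntWritable},
// org.apache.hadoop.mapreduce.Job,
// org.apache.hadoop.mapreduce.lib.output.TextOutputFormat,
// and ZipFileInputFormat on the classpath.

// Standard job setup
// (on Hadoop 2.x and later, new Job(conf) is deprecated in favor of Job.getInstance(conf))
Job job = new Job(conf);
job.setJobName(this.getClass().getSimpleName());
job.setJarByClass(this.getClass());
job.setMapperClass(MyMapper.class);
job.setReducerClass(MyReducer.class);

// Hello there ZipFileInputFormat!
job.setInputFormatClass(ZipFileInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);

// The output files will contain "Word [TAB] Count"
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);

// We want to be fault-tolerant: lenient mode is meant to tolerate
// bad ZIP data instead of failing the whole job
ZipFileInputFormat.setLenient(true);
ZipFileInputFormat.setInputPaths(job, new Path("/data/archives/*.zip"));
TextOutputFormat.setOutputPath(job, new Path("/tmp/zip_wordcount"));

// Submit the job and block until it finishes (true = print progress)
job.waitForCompletion(true);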
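
To make the "fault-tolerant" setting above more concrete, here is a minimal sketch in plain Java (`java.util.zip` only, no Hadoop) of what lenient handling of a ZIP archive might look like while counting words. It is not the `ZipFileInputFormat` implementation: the class `LenientZipWordCount` and the helpers `countWords` and `zipOf` are hypothetical names introduced for illustration, and the assumption is that "lenient" means swallowing read errors from a damaged archive rather than propagating them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration class; not part of Hadoop or ZipFileInputFormat.
class LenientZipWordCount {

    /**
     * Counts whitespace-separated words across all entries of one ZIP stream.
     * Assumption: in lenient mode a read error ends processing of this archive
     * and the counts gathered so far are returned; otherwise the error is rethrown.
     */
    public static Map<String, Integer> countWords(InputStream in, boolean lenient)
            throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            byte[] chunk = new byte[4096];
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // Read the whole entry into memory before counting its words.
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                String text = new String(buf.toByteArray(), StandardCharsets.UTF_8);
                for (String word : text.split("\\s+")) {
                    if (!word.isEmpty()) {
                        counts.merge(word, 1, Integer::sum);
                    }
                }
            }
        } catch (IOException e) {
            if (!lenient) {
                throw e;
            }
            // Lenient: give up on this archive but keep what was counted.
        }
        return counts;
    }

    /** Builds a one-entry ZIP archive in memory, for demonstration only. */
    public static byte[] zipOf(String name, String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry(name));
            zip.write(text.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] archive = zipOf("a.txt", "hello world hello");
        Map<String, Integer> counts = countWords(new ByteArrayInputStream(archive), true);
        // Same "Word [TAB] Count" layout as the job's output files.
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In a real job the per-entry reading would happen inside the input format's record reader and the counting in the mapper and reducer; the sketch collapses those steps into one method only to show where a lenient flag could change the error handling.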