@msukmanowsky
Created October 7, 2011 21:18
An Apache Hadoop DataFileInputFormat for Omniture hit file data.
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
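// Input format for Omniture hit data files; record parsing is delegated
// to OmnitureDataFileRecordReader (defined elsewhere).
class OmnitureDataFileInputFormat extends TextInputFormat {

    @Override
    public RecordReader<LongWritable, Text> createRecordReader(
            InputSplit split, TaskAttemptContext taskAttemptContext) {
        // The framework calls initialize(split, context) on the returned reader.
        return new OmnitureDataFileRecordReader();
    }
}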
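
The input format above only wires in OmnitureDataFileRecordReader, which this gist does not include. As a rough illustration of why a custom reader might be needed, the sketch below assumes that hit data values escape embedded newlines with a preceding backslash, so a record can span several physical lines. EscapedRecordSplitter and its split method are hypothetical names with no Hadoop dependency; this is not the author's reader, only a minimal sketch of that kind of record-boundary logic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the gist: splits text into records,
// assuming a newline preceded by a backslash belongs to the record.
final class EscapedRecordSplitter {

    private EscapedRecordSplitter() {}

    /** Splits input on newlines that are not escaped by a backslash. */
    static List<String> split(String input) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false; // true when the previous char was an unconsumed backslash

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (escaped) {
                // Keep the escaped character verbatim, including an escaped newline.
                current.append(c);
                escaped = false;
            } else if (c == '\\') {
                current.append(c);
                escaped = true;
            } else if (c == '\n') {
                // Unescaped newline: the current record is complete.
                records.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            // Trailing record without a final newline.
            records.add(current.toString());
        }
        return records;
    }
}
```

Under that assumption, calling EscapedRecordSplitter.split on two tab-separated rows where the first contains a backslash-escaped newline returns two records rather than three, with the escape sequence left intact for a later unescaping step.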