Avro append-to-existing file example with the DataFileWriter.appendTo(…) API.
package com.cloudera.example;

import java.io.OutputStream;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.mapred.FsInput;
import org.apache.avro.reflect.ReflectData;
import org.apache.avro.reflect.ReflectDatumWriter;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DFWAppendTest {

  // Simple record type; its Avro schema is derived by reflection.
  public static class Sample {
    CharSequence foo;

    public Sample(CharSequence bar) {
      this.foo = bar;
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://localhost");
    conf.setInt("dfs.replication", 1);
    FileSystem fs = FileSystem.get(conf);

    Schema sample = ReflectData.get().getSchema(Sample.class);
    ReflectDatumWriter<Sample> rdw = new ReflectDatumWriter<Sample>(Sample.class);
    DataFileWriter<Sample> dfwo = new DataFileWriter<Sample>(rdw);

    // Create the file and write the first two records.
    Path filePath = new Path("/sample.avro");
    OutputStream out = fs.create(filePath);
    DataFileWriter<Sample> dfw = dfwo.create(sample, out);
    dfw.append(new Sample("Eggs"));
    dfw.append(new Sample("Spam"));
    dfw.close();
    out.close();

    // Reopen the same file for append. appendTo(...) reads the schema and
    // sync marker from the existing header, then writes new blocks to aout.
    OutputStream aout = fs.append(filePath);
    FsInput in = new FsInput(filePath, conf);
    dfw = dfwo.appendTo(in, aout);
    in.close(); // the header is only read during appendTo(...)
    dfw.append(new Sample("Monty"));
    dfw.append(new Sample("Python"));
    // create(...) and appendTo(...) return the writer itself, so dfw and dfwo
    // are the same object.
    dfw.close();
    aout.close();
  }
}
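
To try the same appendTo(…) flow without an HDFS cluster, here is a minimal local-filesystem sketch. It is not part of the original gist. The class name LocalAppendCheck, the helpers sample and writeAppendAndRead, and the hand-written schema are hypothetical names chosen for illustration. It assumes Avro is on the classpath and uses generic records in place of the reflect-based Sample class. It writes two records, appends two more with DataFileWriter.appendTo(File), and reads all of them back.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class LocalAppendCheck {

  // Hypothetical schema with a single string field "foo", mirroring Sample.
  static final Schema SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"Sample\","
          + "\"fields\":[{\"name\":\"foo\",\"type\":\"string\"}]}");

  static GenericRecord sample(String value) {
    GenericRecord record = new GenericData.Record(SCHEMA);
    record.put("foo", value);
    return record;
  }

  // Writes two records, appends two more, then returns every "foo" value.
  static List<String> writeAppendAndRead(File file) throws Exception {
    // First pass: create (or overwrite) the file with a fresh header.
    try (DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.create(SCHEMA, file);
      writer.append(sample("Eggs"));
      writer.append(sample("Spam"));
    }

    // Second pass: reopen the existing file and append after its last block.
    try (DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.appendTo(file);
      writer.append(sample("Monty"));
      writer.append(sample("Python"));
    }

    // Read back; the reader takes the schema from the file header.
    List<String> values = new ArrayList<>();
    try (DataFileReader<GenericRecord> reader =
        new DataFileReader<>(file, new GenericDatumReader<GenericRecord>())) {
      for (GenericRecord record : reader) {
        values.add(record.get("foo").toString());
      }
    }
    return values;
  }

  public static void main(String[] args) throws Exception {
    File file = File.createTempFile("sample", ".avro");
    file.deleteOnExit();
    System.out.println(writeAppendAndRead(file));
  }
}
```

Reading with a GenericDatumReader avoids needing a no-argument constructor on the record class, which the reflect-based Sample above does not have.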

ChetanBhasin commented Feb 2, 2016

Do you know whether a property such as dfs.support.append has to be set to true in the Hadoop site configuration file for the append operation to work?

JC16268 commented Apr 28, 2016

Yes.
