Created
November 12, 2016 19:30
Reading a broken Avro file with DataFileReader does not produce any exception; fastavro duly complains about the same file.
% cat twitter.snappy.incomplete.avro
Objavro.codec
snappyavro.schema�{"type":"record","name":"twitter_schema","namespace":"com.miguno.avro","fields":[{"name":"username","type":"string","doc":"Name of the user account on Twitter.com"},{"name":"tweet","type":"string","doc":"The content of the user's Twitter message"},{"name":"timestamp","type":"long","doc":"Unix epoch time in milliseconds"}],"doc:":"A basic schema for storing Twitter messages"}5\�1��~����H����d�c
migunoFRock: Nerf paper, scissors is fine.���
BlizzardCSFWor%
% fastavro twitter.snappy.incomplete.avro
Traceback (most recent call last):
  File "/usr/local/bin/fastavro", line 9, in <module>
    load_entry_point('fastavro==0.9.9', 'console_scripts', 'fastavro')()
  File "/usr/local/lib/python2.7/site-packages/fastavro/__main__.py", line 54, in main
    for record in reader:
  File "fastavro/_reader.py", line 469, in _iter_avro (fastavro/_reader.c:9127)
  File "fastavro/_reader.py", line 422, in fastavro._reader.snappy_read_block (fastavro/_reader.c:8077)
  File "fastavro/_reader.py", line 426, in fastavro._reader.snappy_read_block (fastavro/_reader.c:7985)
snappy.UncompressError: Error while decompressing: invalid input
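==== AvroReadTest.java ==
import java.io.File;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.testng.annotations.Test;

public class AvroReadTest {
    @Test
    public void testRead() throws Exception {
        final File file = new File(getClass().getResource("/twitter.snappy.incomplete.avro").getFile());
        final GenericDatumReader<GenericRecord> genericDatumReader = new GenericDatumReader<GenericRecord>();
        final DataFileReader<GenericRecord> dataFileReader = new DataFileReader<GenericRecord>(file, genericDatumReader);
        // Print every record the reader yields; no exception surfaces for the broken file.
        while (dataFileReader.hasNext()) {
            System.out.println(dataFileReader.next());
        }
    }
}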
===============================================
Default Suite
Total tests run: 1, Failures: 0, Skips: 0
===============================================
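A reader-independent way to spot this kind of breakage is to walk the container framing directly. The sketch below is not part of the original test and uses only the Python standard library. It assumes the layout described in the Avro specification for object container files: the magic bytes `Obj\x01`, a metadata map, a 16-byte sync marker, then blocks of (record count, payload size, payload, sync marker). The names `count_records`, `read_header`, `read_exact`, `read_long`, and `TruncatedAvroError` are hypothetical helpers chosen for illustration. It never decompresses payloads, so it would only catch a file that ends mid-block (as the name `twitter.snappy.incomplete.avro` suggests), not corruption inside an otherwise complete block.

```python
import io

MAGIC = b"Obj\x01"   # first four bytes of an Avro object container file
SYNC_SIZE = 16       # length of the sync marker that ends the header and every block


class TruncatedAvroError(Exception):
    """Hypothetical error type: the file ended before the framing said it should."""


def read_exact(buf, size, what):
    """Read exactly `size` bytes or raise."""
    if size < 0:
        raise ValueError(f"negative length for {what}")
    data = buf.read(size)
    if len(data) != size:
        raise TruncatedAvroError(f"expected {size} bytes of {what}, got {len(data)}")
    return data


def read_long(buf):
    """Decode one Avro long (zigzag-encoded variable-length integer)."""
    accum = 0
    shift = 0
    while True:
        value = read_exact(buf, 1, "varint byte")[0]
        accum |= (value & 0x7F) << shift
        if not value & 0x80:
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def read_header(buf):
    """Return (metadata dict, sync marker) from the start of the file."""
    if read_exact(buf, 4, "magic") != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(buf)
        if count == 0:          # a zero count terminates the map
            break
        if count < 0:           # negative count: a byte size follows, unused here
            read_long(buf)
            count = -count
        for _ in range(count):
            key = read_exact(buf, read_long(buf), "metadata key").decode("utf-8")
            meta[key] = read_exact(buf, read_long(buf), "metadata value")
    return meta, read_exact(buf, SYNC_SIZE, "header sync marker")


def count_records(data):
    """Walk every block; raise if a block is cut short or badly terminated."""
    buf = io.BytesIO(data)
    _, sync = read_header(buf)
    total = 0
    while buf.tell() < len(data):
        n_records = read_long(buf)
        size = read_long(buf)
        if n_records < 0:
            raise ValueError("negative record count in block header")
        read_exact(buf, size, "block payload")   # payload is skipped, not decoded
        if read_exact(buf, SYNC_SIZE, "block sync marker") != sync:
            raise ValueError("sync marker mismatch")
        total += n_records
    return total
```

Calling `count_records` on the raw bytes of a file (for example, the result of `open(path, "rb").read()`) either returns the record count declared by the block headers or raises. Comparing that count with the number of records a reader actually yields is one way to notice silently dropped data.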
I'm seeing this error. Could you explain what is breaking the Avro file here?