Created July 8, 2016 21:22
RFile Corruption

Bad

Written via AccumuloOutputFormat to encrypted HDFS directory

00000000  d1 11 d3 68 91 b5 d7 b6  39 df 41 40 92 ba e1 50  |...h....9.A@...P|
00000010  78 9c bd 96 41 6e 83 30  10 45 c1 a9 a2 26 4d 7b  |x...An.0.E...&M{|
00000020  01 ce 80 64 63 c0 66 49  09 25 2c b0 22 82 54 75  |...dc.fI.%,.".Tu|
00000030  c5 aa 52 d8 b6 39 4d 4f  5b 9a a6 1b 67 2a 4d ad  |..R..9MO[...g*M.|
00000040  8e 59 1a 90 9f 3c 7f 9e  27 78 60 87 80 ed 03 b6  |.Y...<..'x`.....|
00000050  6b 38 4f 72 11 76 8b ae  32 c1 d7 13 b2 68 bd 2f  |k8Or.v..2....h./|
00000060  87 b6 36 c3 d8 6e e7 85  db 9f 8f a2 9b be ac ea  |..6..n..........|
00000070  79 65 c9 9e 8f d3 e9 f5  23 bc 0f 93 55 d9 d4 63  |ye......#...U..c|
00000080  5f 9a e6 fb 45 22 62 c9  a3 55 3d ec 4c 5b b5 c3  |_...E"b..U=.L[..|
00000090  cb f9 ff c7 b7 e9 34 bd  1f a3 65 53 9b 6d dd cf  |......4...eS.m..|
000000a0  6b 8c 3d 45 eb f6 30 5e  36 3a af 98 e8 c2 f0 2b  |k.=E..0^6:.....+|
000000b0  41 02 11 48 8b 20 13 71  8e 22 e8 1c 08 24 44 90  |A..H. .q."...$D.|
000000c0  5a 04 b9 88 15 d9 19 a4  10 41 e6 7a 06 2e 04 19  |Z........A.z....|
000000d0  44 90 5b 04 52 c4 29 59  15 72 88 40 f9 4c a2 82  |D.[.R.)Y.r.@.L..|
000000e0  08 b4 cf 2a 68 88 a0 70  4d a2 4b 15 8a 6b 82 0d  |...*h..pM.K..k..|
000000f0  53 c2 63 3b 2a 94 92 28  a3 a8 50 4a a2 8c a2 42  |S.c;*..(..PJ...B|
00000100  29 89 b4 0a 28 25 51 46  51 a1 94 44 d9 8e 0a a5  |)...(%QFQ..D....|
00000110  24 d2 24 a2 94 44 9a 44  40 49 1b a6 b9 c7 28 6a  |$.$..D.D@I....(j|
00000120  0e 1d 82 ad 24 ca 28 6a  94 92 28 a3 a8 41 25 d9  |....$.(j..(..A%.|
00000130  42 a0 8c a2 46 09 01 1d  45 17 02 50 08 3e ef 67  |B...F...E..P.>.g|
00000140  0d 36 43 61 37 03 65 14  0b a0 19 ee 16 92 7b bc  |.6Ca7.e.......{.|
00000150  1b e6 cd fe 75 58 fc 7b  21 24 07 cd ec ec 45 97  |....uX.{!$....E.|
00000160  33 40 8d 6a 84 61 94 1c  1c d5 84 c7 30 4a 81 32  |3@.j.a......0J.2|
00000170  33 65 14 05 68 66 7b 54  a3 8c a2 00 47 35 8f 5e  |3e..hf{T....G5.^|
00000180  94 e2 ca 8b 9f f4 26 0f  51 78 9c 53 48 2e 29 61  |......&.Qx.SH.)a|
00000190  60 60 60 07 62 46 46 06  08 00 d1 8c be 50 ce 1d  |```.bFF......P..|
000001a0  46 3e 7e 21 21 a6 60 06  a6 00 06 26 0f 77 03 03  |F>~!!.`....&.w..|
000001b0  23 33 43 5f 5f 67 3f b8  4a 38 80 71 94 f8 f8 85  |#3C__g?.J8.q....|
000001c0  85 91 34 18 1b 9a fa 06  39 3a bb 42 8c 13 38 5b  |..4.....9:.B..8[|
000001d0  79 87 13 00 68 f0 0d 55  78 9c 63 4a af 62 00 00  |y...h..Ux.cJ.b..|
000001e0  02 35 00 e4 02 11 64 61  74 61 3a 42 43 46 69 6c  |.5....data:BCFil|
000001f0  65 2e 69 6e 64 65 78 02  67 7a ce 92 0c 04 10 64  |e.index.gz.....d|
00000200  61 74 61 3a 52 46 69 6c  65 2e 69 6e 64 65 78 02  |ata:RFile.index.|
00000210  67 7a cd 89 4f 76 00 00  00 00 00 00 01 e4 00 01  |gz..Ov..........|
00000220  00 00 d1 11 d3 68 91 b5  d7 b6 39 df 41 40 92 ba  |.....h....9.A@..|
00000230  e1 50                                             |.P|
00000232

Output of rfile-info

2016-07-08 14:20:13,291 [bcfile.BCFile] ERROR: Got IOException when trying to create DataIndex block
2016-07-08 14:20:13,291 [start.Main] ERROR: Thread 'rfile-info' died.
java.io.EOFException: Cannot seek after EOF
	at org.apache.hadoop.fs.ChecksumFileSystem$FSDataBoundedInputStream.seek(ChecksumFileSystem.java:323)
	at org.apache.accumulo.core.file.rfile.bcfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:98)
	at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:159)
	at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:143)
	at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
	at java.io.FilterInputStream.read(FilterInputStream.java:83)
	at java.io.DataInputStream.readByte(DataInputStream.java:265)
	at org.apache.accumulo.core.file.rfile.bcfile.Utils.readVLong(Utils.java:175)
	at org.apache.accumulo.core.file.rfile.bcfile.Utils.readVInt(Utils.java:152)
	at org.apache.accumulo.core.file.rfile.bcfile.Utils.readString(Utils.java:239)
	at org.apache.accumulo.core.file.rfile.bcfile.BCFile$DataIndex.<init>(BCFile.java:1139)
	at org.apache.accumulo.core.file.rfile.bcfile.BCFile$Reader.<init>(BCFile.java:917)
	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.init(CachableBlockFile.java:246)
	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(CachableBlockFile.java:257)
	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.access$100(CachableBlockFile.java:137)
	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$MetaBlockLoader.get(CachableBlockFile.java:209)
	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(CachableBlockFile.java:313)
	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:368)
	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:137)
	at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:843)
	at org.apache.accumulo.core.file.rfile.PrintInfo.execute(PrintInfo.java:107)
	at org.apache.accumulo.start.Main$1.run(Main.jav

Good

Written via AccumuloOutputFormat to unencrypted HDFS directory

00000000  d1 11 d3 68 91 b5 d7 b6  39 df 41 40 92 ba e1 50  |...h....9.A@...P|
00000010  78 9c bd 96 41 6e 83 30  10 45 c1 a9 a2 26 4d 7b  |x...An.0.E...&M{|
00000020  01 ce 80 64 63 c0 66 49  09 25 2c b0 22 82 54 75  |...dc.fI.%,.".Tu|
00000030  c5 aa 52 d8 b6 39 4d 4f  5b 9a a6 1b 67 2a 4d ad  |..R..9MO[...g*M.|
00000040  8e 59 1a 90 9f 3c 7f 9e  27 78 60 87 80 ed 03 b6  |.Y...<..'x`.....|
00000050  6b 38 4f 72 11 76 8b ae  32 c1 d7 13 b2 68 bd 2f  |k8Or.v..2....h./|
00000060  87 b6 36 c3 d8 6e e7 85  db 9f 8f a2 9b be ac ea  |..6..n..........|
00000070  79 65 c9 9e 8f d3 e9 f5  23 bc 0f 93 55 d9 d4 63  |ye......#...U..c|
00000080  5f 9a e6 fb 45 22 62 c9  a3 55 3d ec 4c 5b b5 c3  |_...E"b..U=.L[..|
00000090  cb f9 ff c7 b7 e9 34 bd  1f a3 65 53 9b 6d dd cf  |......4...eS.m..|
000000a0  6b 8c 3d 45 eb f6 30 5e  36 3a af 98 e8 c2 f0 2b  |k.=E..0^6:.....+|
000000b0  41 02 11 48 8b 20 13 71  8e 22 e8 1c 08 24 44 90  |A..H. .q."...$D.|
000000c0  5a 04 b9 88 15 d9 19 a4  10 41 e6 7a 06 2e 04 19  |Z........A.z....|
000000d0  44 90 5b 04 52 c4 29 59  15 72 88 40 f9 4c a2 82  |D.[.R.)Y.r.@.L..|
000000e0  08 b4 cf 2a 68 88 a0 70  4d a2 4b 15 8a 6b 82 0d  |...*h..pM.K..k..|
000000f0  53 c2 63 3b 2a 94 92 28  a3 a8 50 4a a2 8c a2 42  |S.c;*..(..PJ...B|
00000100  29 89 b4 0a 28 25 51 46  51 a1 94 44 d9 8e 0a a5  |)...(%QFQ..D....|
00000110  24 d2 24 a2 94 44 9a 44  40 49 1b a6 b9 c7 28 6a  |$.$..D.D@I....(j|
00000120  0e 1d 82 ad 24 ca 28 6a  94 92 28 a3 a8 41 25 d9  |....$.(j..(..A%.|
00000130  42 a0 8c a2 46 09 01 1d  45 17 02 50 08 3e ef 67  |B...F...E..P.>.g|
00000140  0d 36 43 61 37 03 65 14  0b a0 19 ee 16 92 7b bc  |.6Ca7.e.......{.|
00000150  1b e6 cd fe 75 58 fc 7b  21 24 07 cd ec ec 45 97  |....uX.{!$....E.|
00000160  33 40 8d 6a 84 61 94 1c  1c d5 84 c7 30 4a 81 32  |3@.j.a......0J.2|
00000170  33 65 14 05 68 66 7b 54  a3 8c a2 00 47 35 8f 5e  |3e..hf{T....G5.^|
00000180  94 e2 ca 8b 9f f4 26 0f  51 78 9c 53 48 2e 29 61  |......&.Qx.SH.)a|
00000190  60 60 60 07 62 46 46 06  08 00 d1 8c be 50 ce 1d  |```.bFF......P..|
000001a0  46 3e 7e 21 21 a6 60 06  a6 00 06 26 0f 77 03 03  |F>~!!.`....&.w..|
000001b0  23 33 43 5f 5f 67 3f b8  4a 38 80 71 94 f8 f8 85  |#3C__g?.J8.q....|
000001c0  85 91 34 18 1b 9a fa 06  39 3a bb 42 8c 13 38 5b  |..4.....9:.B..8[|
000001d0  79 87 13 00 68 f0 0d 55  78 9c 63 4a af 62 00 00  |y...h..Ux.cJ.b..|
000001e0  02 35 00 e4 02 11 64 61  74 61 3a 42 43 46 69 6c  |.5....data:BCFil|
000001f0  65 2e 69 6e 64 65 78 02  67 7a cd d8 0c 04 10 64  |e.index.gz.....d|
00000200  61 74 61 3a 52 46 69 6c  65 2e 69 6e 64 65 78 02  |ata:RFile.index.|
00000210  67 7a cd 89 4f 76 00 00  00 00 00 00 01 e4 00 01  |gz..Ov..........|
00000220  00 00 d1 11 d3 68 91 b5  d7 b6 39 df 41 40 92 ba  |.....h....9.A@..|
00000230  e1 50                                             |.P|
00000232

Output of rfile-info

Reading file: file:/tmp/rfile_investigation/good/part-r-00001.rf
Locality group         : <DEFAULT>
	Start block          : 0
	Num   blocks         : 1
	Index level 0        : 38 bytes  1 blocks
	First key            : %02;S%00;%02;P%00;%02;HG00261 M:MCN [] 0 false
	Last key             : %02;S%00;%02;P%00;%02;HG00315 M:RACE [] 0 false
	Num entries          : 220
	Column families      : [M]

Meta block     : BCFile.index
      Raw size             : 4 bytes
      Compressed size      : 12 bytes
      Compression type     : gz

Meta block     : RFile.index
      Raw size             : 118 bytes
      Compressed size      : 79 bytes
      Compression type     : gz

Summary

The files differ in exactly 2 bytes, at offsets 0x1FA and 0x1FB: the good file reads CD D8; the bad file reads CE 92. Every other byte in the two dumps is identical.
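
The difference can be reproduced and interpreted with a short sketch. It rests on two assumptions that do not come from the dumps themselves: that BCFile encodes variable-length integers the way `Utils.readVLong` in Hadoop's TFile does (only the 1-byte and 2-byte forms are implemented here), and that each meta index entry stores offset, compressed size, and raw size, in that order, right after the compression name `gz`. The names `diff_offsets`, `read_vlong`, and `region` are illustrative helpers, not part of Accumulo.

```python
import io

FILE_LENGTH = 0x232  # final offset printed by both hex dumps (562 bytes)
ROW_BASE = 0x1F0     # offset of the hex-dump row that contains the difference

# The 16 bytes of row 0x1F0, copied from the two dumps above.
GOOD_ROW = bytes.fromhex("65 2e 69 6e 64 65 78 02 67 7a cd d8 0c 04 10 64")
BAD_ROW = bytes.fromhex("65 2e 69 6e 64 65 78 02 67 7a ce 92 0c 04 10 64")


def diff_offsets(a: bytes, b: bytes, base: int = 0) -> list:
    """Return the absolute offsets at which two equal-length buffers differ."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    return [base + i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def read_vlong(stream: io.BytesIO) -> int:
    """Decode the 1- and 2-byte forms of the assumed TFile-style VLong."""
    first = int.from_bytes(stream.read(1), "big", signed=True)
    if first >= -32:
        return first  # single-byte form: the byte is the value
    if -72 <= first <= -33:
        # two-byte form: high bits come from the first byte, low 8 from the next
        return ((first + 52) << 8) | stream.read(1)[0]
    raise NotImplementedError("longer encodings are not needed for this row")


def region(row: bytes) -> tuple:
    """Decode (offset, compressed size, raw size) stored after the name 'gz'.

    The field order is an assumption; see the lead-in.
    """
    stream = io.BytesIO(row[row.index(b"gz") + 2:])
    return read_vlong(stream), read_vlong(stream), read_vlong(stream)


if __name__ == "__main__":
    print([hex(o) for o in diff_offsets(GOOD_ROW, BAD_ROW, ROW_BASE)])
    for label, row in (("good", GOOD_ROW), ("bad", BAD_ROW)):
        offset, compressed, raw = region(row)
        in_bounds = offset + compressed <= FILE_LENGTH
        print(label, hex(offset), compressed, raw, "in bounds:", in_bounds)
```

Under these assumptions, the good file's entry decodes to offset 0x1D8 with a compressed size of 12 and a raw size of 4. That agrees with the BCFile.index sizes reported by rfile-info, and the dump shows bytes 78 9C at 0x1D8. The bad file's entry decodes to offset 0x292, which lies past the 0x232-byte end of the file and would be consistent with the `Cannot seek after EOF` exception. This explains what the reader trips over, not why the two bytes were written differently.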
