@elliotmoore
Last active September 16, 2015 21:06
s3 gotcha

If you upload a file whose name contains certain control characters (a backspace, 0x08, in this example), S3 can return a bucket listing that is not valid XML, which breaks most AWS S3 tools.

Create a new test bucket; please don't use an existing one.

# test all is well, list new bucket
$ aws s3 ls s3://ells-test-filename

# upload a test file and confirm you can list it
$ touch a-working-file
$ aws s3 cp a-working-file s3://ells-test-filename 
upload: ./a-working-file to s3://ells-test-filename/a-working-file
$ aws s3 ls s3://ells-test-filename
2015-09-15 22:35:14 0 a-working-file

# Here we go!
# make a dodgy file - where ^H is a literal backspace (type Ctrl-V, then Ctrl-H)
$ touch a-broken-^H^H-file
# Upload our dodgy file
$ aws s3 cp a-broken??-file s3://ells-test-filename 
upload: ./a-brok-file to s3://ells-test-filename/a-brok-file

# now try to list the folder!
$ aws s3 ls s3://ells-test-filename

Unable to parse response (reference to invalid character number: line 2, column 233), invalid XML received:
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Name>ells-test-filename</Name>
<Prefix></Prefix>
<Marker></Marker>
<MaxKeys>1000</MaxKeys>
<Delimiter>/</Delimiter>
<IsTruncated>false</IsTruncated>
<Contents>
<Key>a-broken&#x8;&#x8;-file</Key>
<LastModified>2015-09-15T21:37:56.000Z</LastModified><ETag>&quot;2222222&quot;</ETag>
<Size>0</Size>
<Owner>
<ID>12345678911</ID>
<DisplayName>bobby</DisplayName>
</Owner>
<StorageClass>STANDARD</StorageClass>
</Contents>
<Contents>
<Key>a-working-file</Key>
<LastModified>2015-09-15T21:35:14.000Z</LastModified><ETag>&quot;111111111&quot;</ETag>
<Size>0</Size>
<Owner><ID>12345678910</ID><DisplayName>bobby</DisplayName>
</Owner>
<StorageClass>STANDARD</StorageClass>
</Contents>
</ListBucketResult>
# run through a linter...
$ xmllint s3-payload.xml 
s3-payload.xml:2: parser error : xmlParseCharRef: invalid xmlChar value 8
imiter>/</Delimiter><IsTruncated>false</IsTruncated><Contents><Key>a-broken&#x8;
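
If you have saved a listing payload like the one above, a grep for the offending character references can point at the bad keys. This is only a rough sketch: `find_bad_char_refs` is a hypothetical helper name, and it assumes the references are written in hex (`&#x8;`), as in the response shown here. It flags the code points that XML 1.0 does not allow: 0x0 to 0x8, 0xB, 0xC, and 0xE to 0x1F.

```shell
# Hypothetical helper: print (with line numbers) every line of a saved
# listing that contains a hex character reference XML 1.0 forbids.
# Tab (0x9), LF (0xA) and CR (0xD) are valid and are not reported.
find_bad_char_refs() {
  grep -nE '&#x(0*[0-8BbCcEeFf]|0*1[0-9A-Fa-f]);' "$1"
}

# Usage, assuming the payload was saved as s3-payload.xml:
#   find_bad_char_refs s3-payload.xml
```

Decimal references (such as `&#8;`) are not covered by this pattern.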

This also breaks s3cmd and s3s3mirror, and could render a backup/sync job useless.

Tools list the folder (in segments) before displaying, copying or syncing, so a job can get halfway through and then bail when the next segment (marker) contains invalid XML. If you don't clean your input, you could bust your backups; then you need to play hunt the dodgy filename, and then try to delete it ;-D
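
One way to clean your input is to refuse any filename that contains a control character before it ever reaches S3. The sketch below assumes a POSIX shell; `has_control_chars` and `safe_upload` are hypothetical helper names, and the bucket is the test bucket from the transcript. The check is deliberately stricter than XML 1.0 requires, since it also rejects tab, LF and CR.

```shell
# Hypothetical helper, not part of any AWS tool.
# Returns 0 if the given name contains a control character, 1 otherwise.
has_control_chars() {
  case "$1" in
    *[[:cntrl:]]*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: only upload names that pass the check.
safe_upload() {
  if has_control_chars "$1"; then
    printf 'refusing to upload: name contains a control character\n' >&2
    return 1
  fi
  aws s3 cp "$1" s3://ells-test-filename
}
```

Rejecting every control character is a blunt rule, but it keeps the check to a single pattern and avoids having to enumerate exactly which code points the XML parser in each tool will accept.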
