@yannk
Last active December 21, 2015 09:59
SendGrid bug in the Parse API: text data is mangled if it contains high-bit characters.

SendGrid seems to have a bug in the Parse API: the fields are inconsistently encoded. According to the docs, they should all be UTF-8, but they are not.

I sent an email to the parse endpoint with the subject and the body containing: 日本国.

The text field, shown below, is incorrect (not UTF-8).

Here is a capture of the data on the wire:

  5a 59 0d 0a 43 6f 6e 74    65 6e 74 2d 44 69 73 70    ZY..Content-Disp
  6f 73 69 74 69 6f 6e 3a    20 66 6f 72 6d 2d 64 61    osition: form-da
  74 61 3b 20 6e 61 6d 65    3d 22 74 65 78 74 22 0d    ta; name="text".
  0a 0d 0a c8 d5 b1 be b9    fa 0d 0a 0d 0a 2d 2d 78    .............--x

So the interesting data in the text field is: c8 d5 b1 be b9 fa
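For what it's worth, those six bytes do not look like random garbage: they match 日本国 encoded as GB2312. That reading comes from the bytes themselves, not from anything the docs state, so treat it as a hint about where the mangling might come from rather than a diagnosis. A minimal Python sketch to check it (`try_decode` is a helper made up for this note):

```python
# Bytes captured on the wire for the "text" field.
text_bytes = bytes.fromhex("c8 d5 b1 be b9 fa")


def try_decode(data, encodings):
    """Return {encoding: decoded string, or None if decoding fails}."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = data.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results


# Candidate list is an assumption; extend it with whatever charsets you suspect.
results = try_decode(text_bytes, ["utf-8", "gb2312"])
for enc, decoded in results.items():
    print(enc, "->", decoded if decoded is not None else "not valid")
```

If the field really were UTF-8 as documented, the `utf-8` entry would decode cleanly and no guessing would be needed.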

  2d 78 59 7a 5a 59 0d 0a    43 6f 6e 74 65 6e 74 2d    -xYzZY..Content-
  44 69 73 70 6f 73 69 74    69 6f 6e 3a 20 66 6f 72    Disposition: for
  6d 2d 64 61 74 61 3b 20    6e 61 6d 65 3d 22 73 75    m-data; name="su
  62 6a 65 63 74 22 0d 0a    0d 0a e6 97 a5 e6 9c ac    bject"..........
  e5 9b bd 0d 0a 2d 2d 78    59 7a 5a 59 0d 0a 43 6f    .....--xYzZY..Co

And the subject field is correct: e6 97 a5 e6 9c ac e5 9b bd
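The subject bytes can be checked the same way. This short sketch decodes the nine bytes from the capture as UTF-8 and also encodes the original string to confirm the round trip (variable names are just for illustration):

```python
# Bytes captured on the wire for the "subject" field.
subject_bytes = bytes.fromhex("e6 97 a5 e6 9c ac e5 9b bd")

# Strict decoding raises UnicodeDecodeError on invalid UTF-8.
subject = subject_bytes.decode("utf-8")
print(subject)

# Encoding the string that was sent should give the same bytes back.
expected = "日本国".encode("utf-8")
print(expected.hex(" "))
print(expected == subject_bytes)
```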

Interpreted as UTF-8, the bytes of the text field translate to:

Byte number 1 is decimal 200, hex 0xC8, octal \310, binary 11001000 This is the first byte of a 2 byte sequence.

Byte number 2 is decimal 213, hex 0xD5, octal \325, binary 11010101 Previous UTF-8 multibyte sequence incomplete, earlier bytes dropped. This is the first byte of a 2 byte sequence.

Byte number 3 is decimal 177, hex 0xB1, octal \261, binary 10110001 This is continuation byte 1, expecting 0 more.

U+0571 ARMENIAN SMALL LETTER JA

Byte number 4 is decimal 190, hex 0xBE, octal \276, binary 10111110 Unexpected continuation byte.

Byte number 5 is decimal 185, hex 0xB9, octal \271, binary 10111001 Unexpected continuation byte.

Byte number 6 is decimal 250, hex 0xFA, octal \372, binary 11111010 This is the first byte of a 5 byte sequence. End of file during multibyte sequence, some bytes dropped

And the subject bytes (correct; only the first character is shown):

Byte number 1 is decimal 230, hex 0xE6, octal \346, binary 11100110 This is the first byte of a 3 byte sequence.

Byte number 2 is decimal 151, hex 0x97, octal \227, binary 10010111 This is continuation byte 1, expecting 1 more.

Byte number 3 is decimal 165, hex 0xA5, octal \245, binary 10100101 This is continuation byte 2, expecting 0 more.
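The byte-by-byte breakdown above can be roughly reproduced with a few range checks. This is a simplified sketch, not the tool that produced that output: `classify` is a hypothetical helper that only looks at the high bits of each byte and does not validate whole sequences (so it ignores overlong forms and the like).

```python
def classify(byte):
    """Describe one byte's structural role in UTF-8, by its high bits only."""
    if byte < 0x80:
        return "ascii"
    if byte < 0xC0:
        return "continuation"          # 10xxxxxx
    if byte < 0xE0:
        return "lead of 2"             # 110xxxxx
    if byte < 0xF0:
        return "lead of 3"             # 1110xxxx
    if byte < 0xF8:
        return "lead of 4"             # 11110xxx
    # 0xF8 and up were 5- and 6-byte leads in older UTF-8 definitions;
    # RFC 3629 no longer allows them.
    return "invalid"


text_roles = [classify(b) for b in bytes.fromhex("c8 d5 b1 be b9 fa")]
subject_roles = [classify(b) for b in bytes.fromhex("e6 97 a5")]

print(text_roles)
print(subject_roles)
```

A lead byte followed directly by another lead byte (c8 then d5) is already enough to show the text field is not well-formed UTF-8, while the subject bytes follow the expected lead, continuation, continuation pattern.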
