@justinabrahms
Created November 17, 2015 12:34
"""Goal here is to obfuscate GPG blocks using markov chains.
Steps:
1. Train markov chain with some news articles on a similar topic.
2. Take in a GPG armored ascii block.
3. Use that input to determine which branch of the markov chain to walk.
4. The result would be a new article similar to the inputs.
5. The user could then take the same program, run it against the same
inputs, and output the GPG block.
For #4, we need to send the receiving user the inputs. Since this is an
article though, we could just include them as links in the text.
Inspiration:
I was reading an article on theintercept.com about blaming encryption
for terror attacks. I think that's an absurd premise. The author, Glenn
Greenwald, quoted a former FBI director about needing "viable key
management infrastructure" which sounds fucking terrifying.
I thought this would be an interesting mechanism to "hide" that you're
transmitting an encrypted message, in case it ever became such a thing
that you weren't legally allowed to send.
Original Link: https://theintercept.com/2015/11/15/exploiting-emotions-about-paris-to-blame-snowden-distract-from-actual-culprits-who-empowered-isis/
"""
@apg

apg commented Nov 17, 2015

First thought is that a Markov chain isn't the right approach, because by definition it lacks determinism, which is crucial to recovering the original ASCII armored message. But, maybe there's some steganography research that would help figure this out.

@justinabrahms
Author

Steganography makes me think of images. Given that each character in an ASCII-armored message is one of 64 values, it seems reasonable that you could simply use a 7-bit color space. Or perhaps an animated GIF with one full panel of color per character, each as its own frame.

I thought Markov chains might be deterministic given the same input, but "research Markov chains" was next on my list. :)

@apg

apg commented Nov 17, 2015

This is likely to be indistinguishable from random noise, though, no? The problem here, and the reason steganography applies, is that you're trying to hide seemingly random noise within something that simply appears to be unmodified. Take a look at Bacon's Cipher as it directly applies to "stegotext", according to the wiki page linked above.
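
For reference, a minimal sketch of Bacon's cipher itself. It assumes the 26-letter variant (letter index written as five A/B symbols) and, since plain text has no typefaces, uses letter case as the two "faces": lowercase for A, uppercase for B. The function names are made up for illustration.

```python
def bacon_encode(message):
    """26-letter variant: each letter's index (A=0) as five A/B symbols."""
    groups = []
    for ch in message.upper():
        if "A" <= ch <= "Z":
            bits = format(ord(ch) - ord("A"), "05b")
            groups.append(bits.replace("0", "A").replace("1", "B"))
    return "".join(groups)


def bacon_decode(symbols):
    """Inverse of bacon_encode: read five symbols at a time."""
    bits = symbols.replace("A", "0").replace("B", "1")
    return "".join(
        chr(int(bits[i:i + 5], 2) + ord("A")) for i in range(0, len(bits), 5)
    )


def hide_in_case(cover, symbols):
    """Carry one symbol per ASCII letter: lowercase for A, uppercase for B."""
    letters = sum(1 for ch in cover if ch.isascii() and ch.isalpha())
    if letters < len(symbols):
        raise ValueError("cover text has too few letters")
    pending = iter(symbols)
    out = []
    for ch in cover:
        if ch.isascii() and ch.isalpha():
            # Letters past the end of the message are left lowercase.
            out.append(ch.upper() if next(pending, "A") == "B" else ch.lower())
        else:
            out.append(ch)
    return "".join(out)


def read_case(stego, n_symbols):
    """Recover the first n_symbols A/B symbols from letter case."""
    symbols = [
        "B" if ch.isupper() else "A"
        for ch in stego
        if ch.isascii() and ch.isalpha()
    ]
    return "".join(symbols[:n_symbols])
```

The reader has to know how many symbols to read, which is the same end-of-message problem that comes up with any of these carriers.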

@justinabrahms
Author

That's neat. I could see making that w/ font tags, then maybe capturing it as an svg image or something.

I was thinking less of hiding it in random noise and more of obfuscating that you're sending crypto things.

The AB thing Bacon uses is neat. If you had a markov chain with 2+ possible stems from the original word, you could just deterministically pick them based on the expression of A or B. If you had < 2 possible stems, you skip that one and continue. Not sure if you'd hit a cyclical loop or not. shrug
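
A rough sketch of that idea, under a few assumptions not in the thread: the chain is first-order over whitespace-separated words, successors are sorted so both sides see the same branch order, only the first two branches are ever used (0 for A, 1 for B), and a forced-move counter bails out of a cycle that never branches. `build_chain`, `encode_bits`, and `decode_bits` are hypothetical names.

```python
from collections import defaultdict


def build_chain(corpus):
    """Map each word to a sorted list of the distinct words that follow it."""
    words = corpus.split()
    successors = defaultdict(set)
    for current, following in zip(words, words[1:]):
        successors[current].add(following)
    # Sorting fixes the branch order so sender and receiver agree on it.
    return {word: sorted(nexts) for word, nexts in successors.items()}


def encode_bits(chain, start, bits):
    """Walk from `start`, spending one bit at every word with 2+ successors."""
    out = [start]
    word = start
    pending = list(bits)
    forced = 0  # consecutive moves that carried no bit
    while pending:
        options = chain.get(word, [])
        if not options:
            raise ValueError("dead end: %r has no successor" % word)
        if len(options) >= 2:
            word = options[pending.pop(0)]  # 0 -> first branch, 1 -> second
            forced = 0
        else:
            word = options[0]  # only one way forward, carries no information
            forced += 1
            if forced > len(chain):
                raise ValueError("stuck in a cycle that never branches")
        out.append(word)
    return " ".join(out)


def decode_bits(chain, text):
    """Replay the walk and read a bit back at every branching word."""
    words = text.split()
    bits = []
    for current, following in zip(words, words[1:]):
        options = chain.get(current, [])
        if len(options) >= 2:
            bits.append(options.index(following))
    return bits
```

Both sides need the exact same training text for this to round-trip. The cyclical-loop worry is real: a run of single-successor words that loops back on itself never consumes a bit, so the sketch raises instead of spinning.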

@apg

apg commented Nov 17, 2015

I'm not entirely sure Markov chains fit in here, but it might be possible.

To your original point of storing the encoded message in the original article, what if you use spaces between words, or between sentences to encode the {A,B} from Bacon's Cipher? Now, we have 7 words per armor block character, and we'd have to encode some mechanism for "EOM", but, now we have perfectly readable text with some inconsistent use of the space bar. If it's encoded further in HTML, it displays perfectly fine to the viewer, and most would be none the wiser. Don't treat '\n' as space.

Algorithm becomes:

  1. Encrypt message and get ASCII armor block
  2. Ensure that the article you're sending is at least Len(ASCII armor block) * 7 words.
  3. Bacon encipher the ASCII block
  4. Explode the article text on spaces (spaces could even be contained in the HTML tags)
  5. Reassemble based on the Bacon Encipher output, inserting 1 space for A, 2 for B
  6. ???
  7. Profit
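
Steps 2 through 5 might look something like the sketch below. Assumptions: the article uses single spaces between words, the armor block is pure ASCII so 7 symbols per character is enough, and the receiver is simply told the character count (the end-of-message question is left open here). `to_ab`, `hide`, and `reveal` are made-up names. Note the article actually needs one more word than `len(armor) * 7`, since symbols live in the gaps.

```python
import re

WIDTH = 7  # A/B symbols per armor character


def to_ab(armor):
    """Bacon-style encipher: each character becomes WIDTH symbols (A=0, B=1)."""
    if any(ord(ch) > 127 for ch in armor):
        raise ValueError("armor block must be ASCII")
    bits = "".join(format(ord(ch), "07b") for ch in armor)
    return bits.replace("0", "A").replace("1", "B")


def hide(article, armor):
    """Reassemble the article with 1 space for A and 2 spaces for B."""
    words = article.split(" ")  # split on spaces only; '\n' is not a gap
    symbols = to_ab(armor)
    if len(words) - 1 < len(symbols):
        raise ValueError("article too short for this armor block")
    pieces = [words[0]]
    for i, word in enumerate(words[1:]):
        # Gaps past the end of the message stay as single spaces.
        gap = "  " if i < len(symbols) and symbols[i] == "B" else " "
        pieces.append(gap + word)
    return "".join(pieces)


def reveal(text, n_chars):
    """Read the first n_chars * WIDTH gaps back into characters."""
    gaps = re.findall(r" +", text)[: n_chars * WIDTH]
    bits = "".join("1" if len(gap) == 2 else "0" for gap in gaps)
    return "".join(
        chr(int(bits[i:i + WIDTH], 2)) for i in range(0, len(bits), WIDTH)
    )
```

Since HTML collapses runs of spaces when rendering, the doubled spaces only show up in the page source.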

Idea: Instead of encoding an EOM, frame the message with a 4 "byte" byte count, Bacon ciphered, of course.
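
One way the framing could work, as a sketch only: treat the count as four 7-bit "bytes" (28 bits, big-endian) placed ahead of the message, all written as A/B symbols. The 7-bit width and the names `frame` and `unframe` are assumptions for illustration.

```python
WIDTH = 7          # A/B symbols per "byte"
COUNT_SYMBOLS = 4  # "bytes" used for the length prefix
HEADER = WIDTH * COUNT_SYMBOLS


def frame(armor):
    """Prefix the armor block with its length, then encipher everything as A/B."""
    if len(armor) >= 1 << HEADER:
        raise ValueError("armor block too long for a 28-bit count")
    if any(ord(ch) > 127 for ch in armor):
        raise ValueError("armor block must be ASCII")
    bits = format(len(armor), "0%db" % HEADER)
    bits += "".join(format(ord(ch), "07b") for ch in armor)
    return bits.replace("0", "A").replace("1", "B")


def unframe(symbols):
    """Read the count, then exactly that many characters; ignore what follows."""
    bits = symbols.replace("A", "0").replace("B", "1")
    count = int(bits[:HEADER], 2)
    body = bits[HEADER:HEADER + count * WIDTH]
    if len(body) < count * WIDTH:
        raise ValueError("symbol stream ends before the message does")
    return "".join(
        chr(int(body[i:i + WIDTH], 2)) for i in range(0, len(body), WIDTH)
    )
```

With the count up front, the trailing gaps in the article (which all read as A) no longer matter, so no EOM marker is needed.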

Someone should blog about this, with a basic implementation, though it's probably not novel.

@justinabrahms
Author

Recording discussion on twitter.
I pointed out we could use &nbsp; instead of 2 spaces. Andrew suggested we could use U+00A0 instead so it was less obvious.
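
A small sketch of the U+00A0 variant, assuming the cover article uses single ordinary spaces and contains no non-breaking spaces of its own. Every gap stays one character wide: a regular space for A, U+00A0 for B. `hide_nbsp` and `reveal_nbsp` are hypothetical names, and the A/B string is taken as given.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE


def hide_nbsp(article, symbols):
    """Join the words with ' ' for A and U+00A0 for B."""
    words = article.split(" ")
    if len(words) - 1 < len(symbols):
        raise ValueError("article too short for this many symbols")
    pieces = [words[0]]
    for i, word in enumerate(words[1:]):
        # Gaps past the end of the message stay as ordinary spaces.
        gap = NBSP if i < len(symbols) and symbols[i] == "B" else " "
        pieces.append(gap + word)
    return "".join(pieces)


def reveal_nbsp(text, n_symbols):
    """Read the first n_symbols gaps back as A/B."""
    gaps = [ch for ch in text if ch in (" ", NBSP)]
    return "".join("B" if ch == NBSP else "A" for ch in gaps[:n_symbols])
```

One side effect worth knowing about: a non-breaking space also stops the browser from wrapping the line at that gap.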
