@quinthar
Last active February 7, 2021 16:59
Expensify.cash end-to-end encryption proposal

Hi! I'm one of the developers behind Expensify.cash (and the CEO/founder of the company -- chat with me on Twitter @dbarrett), and these are some quick notes on how we might add end-to-end encryption. I'm generally familiar with the basics of encryption, and have read quite a bit on the Signal protocol, but I'm no expert, so I'm eager to get advice from those who are. In particular, I would like to build a system optimized for simplicity and security against very plausible real-world attacks people care about, without overengineering against exotic attacks that are unlikely to happen in the real world.

tl;dr

In short, I think this design can make Expensify.cash provide very strong protection against Your Friends, Your Boss, and Lawyers. And I also think it will protect you from The Cops, Hackers, and The Feds for all but the most severe concerns. But no amount of encryption will protect you from The Feds if sufficiently motivated, and anyone claiming otherwise is lying.

Attack Surface

To start, here are the major layers that an adversary might attack:

  1. The Network. This could mean eavesdropping on your wifi, intercepting your home internet connection to your provider, mass capture on some internet backbone, or targeted capture inside our datacenters right before the packets go to our cabinets.

  2. The Code. This could mean inserting an intentional vulnerability into our own code, or the code of some library we depend upon, or even into the operating system of our servers or your device.

  3. The Servers. This could mean someone establishing a persistent foothold in our servers, or even replacing our servers entirely with alternatives.

  4. Your Device. This means the phone in your hand or laptop in your possession (or the device of any friend you are talking with).

  5. The Math. This means the mathematical complexity of the encryption algorithms themselves.

Adversaries, Capabilities, and Countermeasures

Clearly there is no shortage of people who might want to intercept your communications. But here are probably the major categories you might think of, and a quick summary of their likely capabilities:

The Feds

This might be the CIA, NSA, FBI, or some other shadowy three-letter organization we don't even know about. Additionally, in light of Five Eyes, the GWOT, and a bunch of multilateral information sharing agreements, it's probably safest to assume nearly all democratic first-world nations have combined their efforts, with all of them being collectively known as "The Feds".

Though there's no way to know their full capabilities, I think it's safest to assume that they have at least full control over The Network (ie, they can realistically record broad swathes of encrypted internet traffic for long periods of time). In practice this is likely paranoid -- it's unlikely they are truly recording every single byte forever. But it seems safest to design on the assumption that they are.

Furthermore, despite being a rather disturbing prospect, we should likely acknowledge that if sufficiently motivated, The Feds also have control over The Code: they can get a FISA warrant to secretly force any US citizen to ship modified code (ie, to insert a back door), and then use a gag order to prevent them from admitting it. Any US-hosted code (including all code on GitHub), as well as US-hosted binaries (including every app in the iPhone App Store and Google Play), also falls under their potential purview. Again, it's very unlikely that they are doing this in all but the most extreme situations. But we shouldn't ignore the possibility that they are, and that we'd have no way of knowing.

Next, we should also assume they have control over The Servers: they can get a warrant to access the physical hardware of any US-hosted server. This means if necessary they can force any US citizen to reveal any information stored on the servers, either at a point in time or on an ongoing basis. They might also force anyone to install special hardware, or really do anything you can imagine can be done with physical access to the actual hardware.

Finally, there's Your Device. And this might be hard to hear, but I think a good security model should assume that The Feds can access your device pretty much whenever they want, so long as they are sufficiently incentivized. It doesn't need to be some Hollywood heist plot: they would just stop you, physically take your phone from you, and put you in jail until you unlock it (assuming they don't already have some trick to get it themselves). Hopefully that's not true, but let's build a system that anticipates this possibility, and then any surprises are good surprises.

In fact, the only thing I think is reasonable to assume they don't have control over is the basic math of encryption itself. It's pretty widely inspected stuff, and even if The Feds have a monopoly on violence, they don't have a monopoly on mathematicians. Granted, every algorithm is always one breakthrough from being obsolete -- some new technique (or the ever-lurking threat of quantum computing) is always right around the corner. And it's likely that The Feds will have a leg up on any new techniques merely because they are more motivated than most to find them. But to the degree encryption can be trusted at all, I think it's a reasonable position to assume that it in fact works as advertised. If nothing else, the alternative is pretty demoralizing.

Regardless, I can assure you that as of right now, they aren't doing anything of the sort with Expensify's code, data, or servers, and we would fight every request to the fullest extent of the law. And I'm sure the CEOs of Signal, Google, and Apple would say the same. They might even claim that they would personally go to jail or wipe all their servers to avoid data falling into the wrong hands -- and they might even mean it!

Unfortunately, you can't actually trust any of us: it's entirely possible we are being forced to lie to the public, conceivably at risk of physical harm to us and our families. There are a wide variety of opinions on whether that's a good or a bad thing, but a full analysis of the moral foundation of a nation's monopoly on violence in a particular geographic region is out of the scope of this document.

The main thing is: if you are planning on becoming a "person of interest" of The Feds, I'd urge you to reconsider your life choices, and be damn sure whatever cause you are fighting for is worth it. Because "the grid" goes far and wide, and no amount of encryption will protect you forever.

Hackers

If The Feds have the most power due to just being able to get a warrant for this information, the next most threatening entity would be a nebulous group of "Hackers". These would be people who have technical sophistication and presumably some motive to attack, as well as a willingness to work outside the law. Hackers fall into a few major buckets:

  • Nation States - These would be major state-sponsored hacking groups who have enormous skill and patience, and no real profit motive. They are willing to spend virtually unlimited time and money, so long as doing so aligns with national security objectives. Russia, China, and Iran are the most commonly mentioned, but it's best to assume that every nation with a military has some form of offensive "cyber" (eyeroll) capability. Nation States will take the time to create bespoke attacks to specific services, or specific targets. Just like it's a pretty dangerous move to piss off The Feds, it's maybe not the best idea to piss off other Nation States with active hacking groups.

  • Failed Nation States - The only difference between a Nation State and a Failed Nation State is a profit motive. In general, there's only one Failed Nation State hacker, and that's North Korea's Lazarus Group. Unlike other nation states who are hacking for national security, North Korea is hacking for cold hard cash: much of their work is to generate hard currency to fund state operations and bypass sanctions. This is also the group that hacked Sony as punishment for releasing a movie about assassinating the North Korean dictator Kim Jong Un. Once again, if you can avoid it, probably best to avoid pissing off the North Koreans.

  • Industrial Espionage - The most common accusation for this is against China, which (supposedly) works in close coordination with major Chinese companies to steal military and civilian technology to give their industry a leg up. I don't think anyone really has a good idea what's going on here, but it seems safest to assume that it is in fact happening, and that if there is some reason for Chinese industry to steal your IP, you should assume they will try.

  • Profiteers - Hackers with an overt profit motive definitely exist. However, they generally target banks -- especially those who run off-the-shelf banking software that runs on Windows servers. These are typically very well organized groups that have a whole network of specialized parties that work in coordination to not just attack the servers, but also steal the money, transfer it abroad, and then eventually withdraw it from thousands of ATMs around the world. That last step especially is very dangerous, but in general every group distrusts all the others, so it's much safer and more cost effective to repeat attacks that are already known to work rather than investing in creating a new one.

  • Bitcoin Blackmailers - The next most dangerous group are those who are trying to steal and threaten to release embarrassing information, extorting you for Bitcoin. Another variation would be encrypting all your data to hold it hostage. Either way, these are likely opportunistic groups that will use general tools to hack a wide variety of systems -- probably starting with phishing attacks to install malware. This is a lot safer because there is no physical money to deal with, though Bitcoin is (despite its reputation for anonymity) quite literally the most carefully tracked currency in the history of the world, so it's not without risk to actually convert their ill-gotten gains into real money.

  • Script Kiddies - These are amateur hackers who are just learning the ropes, typically by running well-known off-the-shelf tools designed to test servers for known vulnerabilities. Script Kiddies are just in it for the lolz.

Regardless, there are as many kinds of hackers as there are Hollywood plots. But just like you don't make a unique defense to protect against every possible infection, the general defense against all these is good hygiene: keep all your systems patched, and don't click on sketchy links.

The Cops

Granted, the line between The Cops and The Feds is pretty murky -- and the closer you model your behavior on Snowden or Bin Laden, the closer you get to The Feds' territory (which as discussed, is a real dangerous place to be). But 99% of people who care about encryption have no interest in genuinely overthrowing the government, and instead are looking for protection against government overreach.

One scenario of concern could be excessive force or other abuse of authority by an individual officer, such as at a border crossing, traffic stop, or protest. Thankfully, the powers of any individual officer to compel you to reveal your information on the spot are very limited, and a good lock screen is your best defense.

That said, if you are arrested and charged with a crime (even a trumped-up crime that will never be prosecuted, just to intimidate you), it's possible that you'll be asked to unlock your phone and present your chat history to officials. In this situation, end-to-end encryption and self-destructing chats are important tools to ensure you remain in control of the information you reveal, by preventing The Cops from subpoenaing your chat history out of our servers.

Accordingly, it's best to assume The Cops can gain control of The Servers via a valid subpoena, and maybe Your Device if they are particularly aggressive. But it's unlikely they could get access to The Network or The Code.

Lawyers

The next threat -- and we're finally getting to something a bit more realistic that might actually concern you -- is some Lawyer coming after you via civil lawsuit: them claiming you did something, and you claiming you didn't. This would be resolved in a court, and would generally result in a court ordering you to turn over any written documents (including chats) related to the case. To be clear, you are legally required to do this; encryption doesn't absolve you of this legal requirement. But it does ensure that the court can only make this request to you directly, and can't bypass you by sending the request straight to someone else (eg, us). Because if they did, we would honestly explain that we cannot help them due to you using legal encryption technology to protect your own privacy, and we don't have the key.

This means Lawyers have largely the same capabilities as The Cops: access to The Servers via subpoena, but probably not Your Device.

Your Boss

Expensify already has very strong privacy protections in place to ensure employers do not arbitrarily inspect the private data of employees. And similar to above, if your boss is served a legal subpoena to produce documents related to some civil lawsuit, any chats you have related to your employer are legally "discoverable" -- even if you use an end-to-end encrypted chat tool, or SMS, or one-time-pad encrypted paper notes, you are required to honor the court's request. But this proposed design creates very strong protections to ensure that no private communications with non-employees get accidentally revealed as well, by ensuring all chats with non-employees are fully encrypted and that neither Your Boss nor Expensify has the key.

Similar to The Cops and Lawyers, Your Boss can realistically have access to The Servers via subpoena in relation to a legitimate lawsuit, but probably not Your Device.

Your Friends

The final group you might want to protect your chats from being revealed to are the people immediately around you -- your friends, family, and so on. Again, a good lockscreen PIN along with biometric unlock is probably your best defense. But unless your grip is really good, it's not infeasible that your unlocked phone could be snatched from your hand. Toward that end, self-destructing chats are likely the strongest protection.

Strangely, Your Friends probably have an easier time getting access to Your Device than literally anyone else -- even though they have no access to anything else.

Common Attacks & Protections

Ok, with that quick review of attack surfaces and common adversaries, let's talk about the most likely attacks to happen in the real world that you might want protection against:

  1. Dragnet surveillance. Probably the most common fear is just of some undirected, omnipresent observer eavesdropping on all communications and taking notes for some unknown future purpose. I wager 90% of the desire for (and value of) end-to-end encryption is just to eliminate the anxiety that you might accidentally say something that could be misinterpreted and used against you at some future point by some unknown party. Even the most social person needs the ability to retreat to a private space every once in a while, just to recharge out of public eye. Accordingly, I think by far the most important value of end-to-end encryption is just so online conversations can maintain the same natural privacy expectations we have enjoyed for millennia offline, by eliminating any central unencrypted chokepoint through which dragnet surveillance could be realistically installed.

  2. Targeted surveillance. Next, even though very few of us are interesting enough to be the subject of any kind of targeted surveillance, nobody likes the lurking suspicion that someone is secretly listening in. The next value of end-to-end encryption is to ensure that if someone does have a legal reason to review your historical communications, they need to come directly to you to get it -- they can't do it secretly, without your explicit consent. This can be done by putting each user in full control of access to their communications history, such that any legal requests to access it must go through them. Similar to how protection against dragnet surveillance creates an instant anxiety relief even if no such surveillance is in place, the knowledge that nobody can access your history without going through you is a continuous relief even if nobody ever asks to see it.

  3. Device confiscation. The next most worrisome attack is simply having your device confiscated. Whether or not you are pressured to unlock it, the possibility that they have found some method to unlock your phone without you can be extremely alarming. To be clear, this has nothing to do with the presumption of guilt: as more and more conversations go online, the range of personal information contained on your device -- including not just chats but also personal photos -- creates an enormous risk of an invasion of privacy. Toward that end, the best protection against privacy invasion upon device confiscation is to use expiring messages, thereby limiting the potential for exposure.

  4. Device compromise. Though it's clearly alarming for someone to grab an unlocked phone out of your hand, it at least has the benefit of you knowing it happened. An even worse attack involves your device somehow being compromised without you realizing it. In Hollywood that seems to involve swapping out the SIM card when you aren't looking; in practice it's probably a more mundane issue of someone clicking on a malware link in some kind of phishing email. Either way, this is probably the most damaging attack because it conceivably allows your adversary to extract the decrypted communications directly from the device in an ongoing fashion. Unfortunately, there's really nothing encryption can do to prevent this, because the attack happens after the decryption is finished.

  5. Key extraction. Though you're very unlikely to be dealing with an attacker this sophisticated, and this is nearly impossible to do without first confiscating or otherwise compromising the device itself, an attacker might somehow obtain a copy of the encryption key, enabling them to eavesdrop on the encrypted network communication and decrypt it in an ongoing manner without needing ongoing access to the device. The best protection against this is to use a protocol that has two properties:

    • Forward Secrecy - This means that the information on the device can't be used to reveal all future conversations. The most common way to solve this is to frequently reset the encryption key, rendering any compromised key useless for future messages.
    • Backward Secrecy - As the name implies, if Forward Secrecy ensures the keys on the device can't be used to decrypt future messages, Backward Secrecy ensures that past messages (other than those currently stored on the device, which are revealed) cannot be decrypted using any key on the device. So even if you were recording all the network traffic to a device for weeks or years ahead of time, if you finally get ahold of the device and extract its key, the key will be useless for all of those previous years of messages. This is most commonly done by combining the frequent key resetting of Forward Secrecy with simply deleting the decryption key after it's used, rendering undecipherable any outstanding copies of the historical encrypted messages.
  6. Password extraction. Given the difficulty of extracting and using your encryption keys (not to mention the limited value of doing so as a result of Forward/Backward Secrecy), another attack is to simply extract the username/password you use to sign into the service (along with perhaps any underlying key used to power a two-factor authentication app) to simply sign in as you at a future time. The best protection against this is to make it really obvious when a new device signs into your account (by notifying the other devices), as well as using a "pairing" feature (where an existing device needs to actively authorize the new device) before revealing any sensitive data to the new device.

Summary of Encryption Primitives

Though there are a zillion important details being left out, here is a quick summary of the major tools used to build encrypted systems:

Symmetric (aka "Secret Key") Encryption

The simplest (though still insanely complex) form of encryption is called "symmetric" encryption, because the same key is used to both encrypt and decrypt. In general, all encryption works on "blocks" of data (eg, AES works on 16-byte blocks), meaning long messages need to be split up into many blocks, and short messages need to be "padded" to fill out the block. Symmetric encryption is extremely fast, in part because most modern chips have instructions specifically designed for it. It's also extremely secure: a block encrypted with a symmetric key is nearly impossible for an attacker to decrypt within any reasonable timeframe (ie, longer than your lifetime).

That said, encrypting each block with the same key isn't enough to fully protect the message, because if there are two blocks with the same "plaintext" content, the encrypted "ciphertext" will be the same for those blocks. Even if the attacker won't know precisely what that block is, they'll know that there were multiple instances of the same block. Said another way, even if it's unclear what the message is, it might reveal that the same message was sent several times in a row. Accordingly, symmetric encryption is paired with an encryption "mode" (seeded with an "initialization vector"), which generally combines the output of previously encrypted blocks with each new block, such that even the same block encrypted many times in a row will look random to an attacker.
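To make the identical-block problem concrete, here's a toy sketch in Python. The "cipher" is just an XOR pad derived with SHA-256 -- purely illustrative, not real AES -- but it shows why encrypting each block independently leaks structure, and how CBC-style chaining with an initialization vector hides it:

```python
import hashlib

BLOCK = 16  # bytes per block, mirroring AES

def pad_for(key: bytes, tweak: bytes) -> bytes:
    # Toy "block cipher": derive a 16-byte XOR pad from the key and a
    # tweak. This is NOT a real cipher -- it only illustrates blocking.
    return hashlib.sha256(key + tweak).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_ecb(key: bytes, blocks: list) -> list:
    # Each block encrypted independently: identical plaintext blocks
    # yield identical ciphertext blocks, leaking repetition.
    return [xor(blk, pad_for(key, b"")) for blk in blocks]

def encrypt_chained(key: bytes, iv: bytes, blocks: list) -> list:
    # CBC-style mode: mix each plaintext block with the previous
    # ciphertext block (seeded by the IV) before encrypting it.
    out, prev = [], iv
    for blk in blocks:
        ct = xor(xor(blk, prev), pad_for(key, prev))
        out.append(ct)
        prev = ct
    return out

msg = [b"ATTACK AT DAWN!!"] * 3  # three identical 16-byte blocks
ecb = encrypt_ecb(b"secret", msg)
cbc = encrypt_chained(b"secret", b"\x00" * BLOCK, msg)
assert ecb[0] == ecb[1] == ecb[2]  # repetition is visible to an attacker
assert cbc[0] != cbc[1] != cbc[2]  # chaining makes each block look random
```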

Though not strictly a feature of symmetric encryption per se, it's becoming increasingly popular to "ratchet" your symmetric key over the course of communications using a "hash function", such that every message you send is encrypted with a different key. To break this down further, the purpose of a "one-way hash function" (like SHA-256) is to generate a "fingerprint" of some input that perfectly identifies it, without being able to "reverse" it and reproduce the input. So two different people who have the same content can each independently hash it, and produce the same exact fingerprint. Then if you hash that fingerprint, you get another one -- and so on. This is a neat trick, because so long as two parties agree on the initial content (in this case, the content is a symmetric key), then they can agree on an infinite sequence of keys -- each generated from the one before. So if both parties agree to "ratchet" the key (ie, replace the symmetric key with its own hash) after each message, then they will both change the key every single message -- in a way that appears completely random to an outside party. Even better, if someone intercepts one of the keys somehow, because the hash function is "non-reversible", the attacker wouldn't be able to figure out any of the previous keys that came before it (though they would be able to "ratchet along" for future messages).
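A minimal sketch of such a symmetric ratchet, using Python's hashlib (the starting secret and key sizes here are illustrative assumptions, not Signal's actual KDF chain):

```python
import hashlib

def ratchet(key: bytes) -> bytes:
    # Replace the key with its own hash after every message.
    return hashlib.sha256(key).digest()

# Alice and Bob start from the same shared secret...
alice_key = bob_key = hashlib.sha256(b"initial shared secret").digest()

alice_chain, bob_chain = [], []
for _ in range(5):
    alice_chain.append(alice_key)
    alice_key = ratchet(alice_key)
    bob_chain.append(bob_key)
    bob_key = ratchet(bob_key)

# ...so they independently derive the same sequence of per-message keys,
assert alice_chain == bob_chain
# and every message is encrypted under a different key.
assert len(set(alice_chain)) == 5
# Because the hash is one-way, stealing key N reveals nothing about key
# N-1 -- though an attacker holding key N can still ratchet forward.
```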

Symmetric encryption is also called "secret key" encryption because anyone who has the key can decrypt the content, so it's important to keep that key "secret". That is hard to do when two parties are communicating over a public channel that others can see, which is what brings us to...

Asymmetric (aka "Public Key") Encryption

To solve the "bootstrapping" problem of enabling two parties to agree upon a common symmetric key, a second kind of "asymmetric" encryption exists that uses a key split into two parts: the "public" and "private" half. As the name implies, the "public" half of the key can be safely shared in the open, and in fact is the most common way of cryptographically identifying yourself to the rest of the world. The "private" half, again as the name implies, must be kept secret, and the fact that you are the only one who knows it is what enables you to prove you are in fact the person who is identifying yourself with the corresponding public key. Though like anything there are a million different algorithms, the two main asymmetric encryption algorithms are:

  • RSA - A workhorse from the 70's, RSA is neat in that a message encrypted with one half of the key can only be decrypted with the other. This means anybody can securely send a message to someone over a public network by encrypting the message with the recipient's public key, confident that only the recipient has the private key that can decrypt it. On the other hand, this works in such a fashion that someone can demonstrate that they do in fact have the private key by "signing" a message using the private key. This is done by sharing a message like "My name is David", and then encrypting that message with the private key, enabling anybody to decrypt it with the public key and confirm that the message was in fact sent by the holder of the private key. RSA encrypts in blocks of 256 bytes, but in practice the maximum message size is 190 bytes due to the need to "pad" the message. RSA is much, much slower than symmetric encryption, and much more computationally expensive to create a key. Accordingly, in nearly all cases, RSA is not used to encrypt actual content (ie, by splitting it up into a bunch of 190-byte blocks and encrypting each separately). Rather, the sender would typically pick some other symmetric key (aka a "session key", as it's generally unique for each communication session), encrypt that session key with RSA, then send the encrypted session key to the recipient. After the session has been "initiated" in this way, subsequent communications would just use the much, much faster symmetric encryption key.
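To see the public/private relationship in action, here is "textbook" RSA with the classic tiny primes p=61 and q=53 -- wildly insecure and unpadded, purely to illustrate the math of encrypting a session key and signing:

```python
# Textbook RSA toy: real RSA uses 2048+ bit keys plus OAEP/PSS padding.
p, q = 61, 53
n = p * q                 # public modulus (3233)
phi = (p - 1) * (q - 1)
e = 17                    # public exponent
d = pow(e, -1, phi)       # private exponent = modular inverse of e

session_key = 65          # a "session key" small enough to fit in one block

# Anyone can encrypt to the key holder using only (e, n)...
ciphertext = pow(session_key, e, n)
# ...but only the holder of d can decrypt.
assert pow(ciphertext, d, n) == session_key

# Signing is the mirror image: "encrypt" with the private half,
signature = pow(session_key, d, n)
# and anyone can verify with the public half.
assert pow(signature, e, n) == session_key
```

After the recipient recovers the session key this way, both sides would switch to fast symmetric encryption for the actual content.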

  • Diffie Hellman (DH) - DH isn't just an alternative to RSA, it's actually a totally different thing. Though both RSA and DH have the same concept of public/private keys, DH doesn't let you encrypt with one key and decrypt with the other, like RSA. Rather, DH is used exclusively to enable two parties to independently derive the same session key. Basically, given two parties Alice and Bob, each of whom has publicly revealed their public key halves (Alice.pub and Bob.pub), and secretly stored their private key halves (Alice.priv and Bob.priv), DH allows Alice and Bob to each combine their private key with the other's public key, to independently calculate the same shared session key. So it accomplishes the same real-world effect as RSA in that the end result is both parties have securely agreed upon a symmetric key to encrypt/decrypt the session content, but just goes about it a different way. So while DH can't encrypt arbitrary messages directly with the public key like RSA -- nor can it sign messages with the private key -- DH has a very significant advantage over RSA in that generating a new key is vastly less expensive, so it's much easier to "rotate" (ie, regenerate) a DH key than an RSA key. Indeed, DH with Curve25519 can use any 32-byte block of data as a DH key, meaning generating a new key is as simple as picking a random 32 bytes. This is particularly useful when doing a "DH ratchet", which is just like a symmetric key ratchet (explained above), but changing your DH key with every message. Ratcheting with RSA would likely be prohibitively expensive.
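And a sketch of classic finite-field DH with a toy 32-bit prime (an illustrative assumption -- real deployments would use Curve25519 or a 2048-bit MODP group), showing both parties deriving the same session key over a public channel:

```python
import secrets

# Toy Diffie-Hellman parameters: a small public prime and base, for
# demonstration only -- far too small to be secure in practice.
P = 4294967291   # a prime (2**32 - 5)
g = 5            # public base

# Each party picks a random private key and publishes g^priv mod P.
alice_priv = secrets.randbelow(P - 2) + 2
bob_priv = secrets.randbelow(P - 2) + 2
alice_pub = pow(g, alice_priv, P)
bob_pub = pow(g, bob_priv, P)

# Each side combines its own private key with the other's public key...
alice_shared = pow(bob_pub, alice_priv, P)
bob_shared = pow(alice_pub, bob_priv, P)

# ...and both arrive at the same session key, which never crossed the wire:
# (g^b)^a == (g^a)^b mod P.
assert alice_shared == bob_shared

# Rotating a DH key is cheap: just pick a new random private exponent --
# no expensive prime generation as with RSA, which is what makes a
# per-message "DH ratchet" practical.
```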

Problem

Alright, with all that background out of the way, what is the actual problem we are trying to solve? Security is never absolute; it is always a tradeoff. So as we think about Expensify's end-to-end encryption design, what specifically are we trying to protect against? I would suggest that the bulk of our users would prioritize being protected from the following, in order:

  1. Mass surveillance. Most people have nothing to hide and no particular "need" for secrecy. But when continuously observed we get anxious and behave differently. People just want to know that there is nobody casually and continuously looking over their shoulder, just for a general peace of mind.

  2. Your Boss. Expensify is a tool used both for personal and professional use -- and because we're not robots, we often have personal relationships in the workplace that need protecting. Expensify needs to guarantee that nobody can see your private chats except those who participated in them.

  3. Lawyers. As law-abiding citizens we have an obligation to participate in "legal holds" and "discovery", and to enable our employers to do the same. But we should put our users in control of their information by ensuring that the requests come directly to them, rather than people they don't know revealing their private information to other people they don't know.

  4. The Cops. We expect that many of our users will be engaged in righting society's many wrongs, which might include protest -- and even civil disobedience. This could involve tangling with law enforcement -- possibly in the streets, possibly at border crossings -- who have an imperfect track record in honoring the full rights of citizens. Ideally the information they could obtain would be limited.

  5. Hackers. Expensify users would expect that every reasonable precaution will be taken to ensure that hackers of all kinds will not be able to access private information, whether business or personal.

That said, at risk of being controversial, I'd like to call out one threat that is "out of scope" for a v1 of our end-to-end encryption:

  • The Feds. A full protection against the united global efforts of the world's richest and most technologically advanced nations is... difficult. Indeed, I'd say it's basically impossible for any private company to realistically claim a serious defense against the multi-trillion dollar military/intelligence complex, and anyone claiming otherwise is simply delusional. More to the point, given that these nations have all the legal and physical authority to force anyone involved in this process to bypass their own security and lie to everyone about it, as a customer, nobody should trust anyone who claims they are providing this kind of impossible protection.

Solution Summary

The above sections outline the various kinds of attackers and attacks that we should protect against, and hinted at how those protections might work in practice. Let's dig into each of those protections now:

  • Device keys. The first thing the client device does is generate an asymmetric key, which will be used to communicate with this specific device (of which a user might have multiple). Whenever the device's mailbox on the server is empty (which is most of the time), this key can be safely regenerated, with new public keys broadcast out to the other devices.

  • User authentication. Expensify uses a standard email/sms + password + optional (but highly recommended) two-factor authentication scheme. The device's public key is uploaded with this authentication request, and on successful authentication, is used to open a store-and-forward "mailbox" for other devices owned by the same user to send it messages.

  • New device pairing. Upon signing in, the new device broadcasts to all the other devices "I have successfully authenticated, and my random pairing PIN is XXXX!" by encrypting this message to each other device's public key, and sending it to the other devices' mailboxes. Every device gets this message (either immediately, if online, or later when it comes online), and shows a "Did you sign in to the device showing PIN XXXX?" modal. Whether yes or no, that device broadcasts a message to all other devices saying "This device's pairing request was approved/denied", and they all hide the pairing modal.

  • Prekey initialization. The client takes a page from Signal's book and generates 100 "prekeys" and uploads their public keys to the server. The server will give each of these keys out as necessary, one at a time, whenever initiating a new session. The server notifies all devices whenever a prekey is used, and each device uploads a new one to replace it, such that there are always at least 100 prekeys for each user.

  • Prekey sharing. Because Expensify is built for multi-device operation, the private half of each prekey needs to be synchronized between all devices. Accordingly, when uploading the public prekey to the server, each device also encrypts the private prekey half with each other device's public key, and inserts it into that device's mailbox. This mailbox will deliver instantly if the client is online, otherwise it'll be held until the client comes online again. In this way, all of a user's devices will have every prekey's private half, and thus be equally ready to process the new incoming session.

  • Session initialization. Every chat in Expensify is a room with 2 or more users -- there is no technical differentiation between "1:1" and "group" DMs. Furthermore, unlike Signal, all messages are recorded on the server forever, and delivered to clients "in order". In essence, each chat is a sequential list of messages, each of which can be optionally encrypted. (Some rooms have encryption disabled, to enable serverside searching and legal hold discovery for business purposes.) Accordingly, a "new chat" just means a new room has been created with two or more people in it -- and once created, it never goes away. So if Alice is creating a room with Bob and Carol:

    1. Alice's device 0 (Alice0) notifies the server it is creating a chat with Bob and Carol
    2. The server picks the next prekey for Alice, Bob, and Carol, and adds a message to the chat saying Alice, Bob, and Carol are using prekeys A, B, and C respectively -- all devices currently online receive this message (and all that are not online will receive it when they sign in)
    3. Alice0 sees it has no symmetric "room key" R for this room
    4. Alice0 picks a random symmetric "room key" R and creates a message M like "Alice will be encrypting future messages to this room with room key R, until further notice."
    5. Alice0 encrypts message M once each for Bob's and Carol's public prekeys, as well as once again to her own prekey (such that her other devices will get it), combines it all into a single post signed by her private key A, and sends to the room
    6. Bob and Carol, along with all of Alice's other devices receive this message on all their devices, each:
      1. verify it's signed by A's public key
      2. decrypt it using the private half of their own prekey for this room
      3. record Alice's symmetric room key for future use
  • Session communications. Now Alice has an active room key known to Bob and Carol; all of Alice's future messages are just encrypted once using the symmetric room key, and all devices for all users can decrypt it.

  • Disappearing messages. If Alice would like disappearing messages (eg, after 1 week) then at some cadence (eg, daily):

    1. Alice0 picks a new room key and broadcasts it out just like before, but this time includes, along with the key itself, a request to delete all messages encrypted with this key after 7 days
    2. All devices keep track of all historical room keys, and delete them at their configured expiration dates
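The client-side bookkeeping for expiring room keys could be sketched as below. The class name and injectable clock are invented for illustration:

```javascript
// Hypothetical key ring: each device tracks historical room keys and
// forgets expired ones, making messages encrypted under them unreadable.
class RoomKeyRing {
  constructor(now = Date.now) {
    this.keys = new Map();
    this.now = now; // injectable clock, handy for testing
  }

  // expiresAtMs === null means the key never expires.
  add(keyId, key, expiresAtMs = null) {
    this.keys.set(keyId, { key, expiresAtMs });
  }

  // Called at some cadence (eg, daily): drop keys past their expiry.
  prune() {
    for (const [id, { expiresAtMs }] of this.keys) {
      if (expiresAtMs !== null && this.now() >= expiresAtMs) {
        this.keys.delete(id);
      }
    }
  }

  get(keyId) {
    const entry = this.keys.get(keyId);
    return entry ? entry.key : null;
  }
}
```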

Conclusion

This proposal to add end-to-end encryption is heavily influenced by Signal's design, but adapted for a different set of requirements, and hopefully simplified where possible as a result.

@olabiniV2

I'm not sure if you are actually noticing it, but from my perspective you seem to be engaging in the fallacy of "moving the goalposts" - you started with one question, but with every argument, you have changed the parameters for the question, until you have now reduced it to "why RSA cannot be used to encrypt each block of a stream (so long as each block uses OAEP, and performance isn't a consideration)".

It seems to me as if you have made a decision already, and are now trying to find confirmation that this decision is correct, instead of being open to changing your mind.

For that reason, I'm not going to even try to respond to your final question, especially since this thread already contains abundant reasons for why you shouldn't do what you are proposing. Instead, I'll tell you how you can go about answering this yourself. This will also be my final message to this thread.

Let's look at this from two perspectives. Are you asking the question from a practical perspective or from a theoretical perspective? If it's practical, meaning that you are planning on putting this in production unless I give you arguments to not do it, then the answer would go something like this:

When it comes to implementing cryptography and choosing how to put together protocols, the burden of proof HAS to be on the side of deviation from conservative choices. Meaning, you can't say something random and then expect others to prove why what you are proposing is dangerous. In cryptography we know from experience that deviating from the conservative choices is risky. Extremely risky. Which means that you have the responsibility of proving that what you are proposing is safe. It's like if you were an astrophysicist and said that "stars are made of cheese", and then expected other astrophysicists to give you evidence that you are incorrect - and if they don't, that is proof that "stars are made of cheese". It has to be the other way around, where your hypothesis is something you should back up with proof that it's safe. Or to put it another way, in cryptography, our base assumption is always that something is unsafe. If you think that something should be safe, it's your responsibility to show that.

Now, using RSA in the way you are describing might be safe, for some definition of safe. But the history of cryptography shows us that there are several warning signs in what you are proposing, including things like non-contribution from all parties, input with structure, and input with mathematically related structures between blocks. Even one of these things by itself should be enough to disqualify the design, unless you can show that it is safe.

Implementing cryptography, and creating cryptographic protocols, is a very hard discipline. Even if you make all the conservative choices, and create an extremely simple protocol, it's still hideously complicated and hard to get right, such that end users are actually protected. And adding more unknowns by making non-standard choices for no good reason, is simply a way of making the situation even worse.

OK, so, if you are not asking from a practical perspective, but from a theoretical standpoint - meaning, you want to know the answer, but you have no real plan to design a protocol in this way, the steps you should take are these:

  • Define the question in a well defined way, with all details necessary. This includes specifying exactly what security properties you expect your solution to have, and so on. This needs to be done in a semi-mathematical way, not in human language, saying things like "it should be safe against network observers". No, it should be something like, "it should be IND-CPA against any attacker with less than 120 bits storage, 160 bits computing and 70 bits time advantage".
  • Read all papers about asymmetric cryptography, starting with all the Diffie-Hellman papers, Merkle, RSA and forward. Look at all the cryptanalytic papers (the attacks) against them. Finally, read all the papers about possible countermeasures against attacks, things like OAEP, and so on
  • Create an actual "instantiation" of your protocol proposal, once again including all relevant security properties
  • Create a proof that shows that this protocol delivers these security properties

The thing is, no one else is likely to do this work for you, because there's no advantage for anyone to do it. As a community, we have better ways of achieving what you want to do already, at lower cost. RSA can be notoriously brittle, and it's used less and less these days. So, why would any cryptographer spend valuable time analyzing this question? We don't necessarily know that non-contribution is a problem for RSA or specific RSA implementations. But we know that it has been a problem in other circumstances, so why take the chance? And why even analyze it in a setting which no-one would use anyway?

One final point. You are simplifying or mis-representing my opinion by saying "...you haven't shown any reason why it's secure to encrypt a 190-byte payload with RSA so long as that payload contains a symmetric key, but not secure if that 190-byte payload contains something else.". My actual opinion is that it is less likely to be secure to encrypt input with structure, using RSA, than it is to encrypt data without structure. Further, my actual opinion going back to my first answers, is that neither of those two alternatives are likely to be secure enough for any kind of production deployment, especially not if you are comparing to the possibility of using a DH style scheme instead.
