quinthar/end-to-end-encryption.md

## end-to-end-encryption.md

      
    Raw
  

              end-to-end-encryption.md
            
          
    Expensify.cash end-to-end encryption proposal

Hi!  I'm one of the developers behind Expensify.cash (and the CEO/founder of the company -- chat with me on Twitter @dbarrett) and these are some quick notes on how we might add end-to-end encryption.  I'm generally familiar with the basics of encryption, and have read quite a bit on the Signal protocol, but am no expert so am eager to get advice from those who are.  In particular, I would like to build a system optimized for simplicty and security against very plausible real world attacks people care about, without overengineering against exotic attacks that are unlikely to happen in the real world.
tl;dr

In short, I think this design can make Expensify.cash provide very strong protection against Your Friends, Your Boss, and Lawyers.  And I also think it will protect you from The Cops, Hackers, and The Feds for all but the most severe concerns.  But no amount of encryption will protect you from The Feds if sufficiently motivated, and anyone claiming otherwise is lying.
Attack Surface

To start, here are the major layers that an adversary might attack:


The Network.  This could mean eavesdropping on your wifi, intercepting your home internet connection to your provider, mass capture on some internet backbone, or targeted capture inside our datacenters right before the packets go to our cabinets.


The Code.  This could mean inserting an intentional vulnerability into our own code, or the code of some library we depend upon, or even into the operating system of our servers or your device.


The Servers.  This could mean someone establishing a persistent foothold in our servers, or even replacing our servers entirely with alternatives.


Your Device.  This means the phone in your hand or laptop in your posession (or the device of any friend you are talking with).


The Math.  This means the mathemtical complexity of the encryption algorithms themselves.


Adverseries, Capabilities, and Countermeasures

Clearly there are no shortage of potential poeple who might want to intercept your communications.  But here are probably the major categories you might think of, and a quick summary of their likely capabilities:
The Feds

This might be the CIA, NSA, FBI, or some other shadowy three-letter organization we don't even know about.  Additionally, in light of Five Eyes, the GWOT, and a bunch of multilateral information sharing agreements, it's probably safest to assume nearly all democratic first-world nations have combined their efforts, with all of them being collectively known as "The Feds".
Though there's no way to know their full capabilities, I think it's safest to assume that they have at least full control over The Network (ie, they can realistically record broad swathes of encrypted internet traffic for large periods time).  In practice this is likely paranoid -- it's unlikely they are truly recording every single byte forever.  But it seems safest to design on the assumption they are.
Furthermore, despite being a rather disturbing prospect, we should likely acknowledge that if sufficiently motivated, The Feds also have control over The Code: they can get a FISA warrant to secretly force any US citizen ship modified code (ie, to insert a back door), and then use a gag order to prevent them from admitting it.  Any US hosted code (including all code in Github), as well as US hosted binaries (including every app in the iPhone App Store and Google Play) also fall under thier potential purview.  Again, it's very unlikely that they are doing this in all but the most extreme situations.  But we shouldn't ignore the possibility that they are, and that we'd have no way of knowing.
Next, they we should also assume they have control over The Servers: they can get a warrant to access the physical hardware of any US hosted server.  This means if necessary they can force any US citizen to reveal any information stored on the servers, either on a point in time or on an ongoing basis.  They might also force anyone to install special hardware or really do anything you can imagine can be done with physical access to the actual hardware.
Finally, there's Your Device.  And this might be hard to hear, but I think a good security model should assume that The Feds can access your device pretty much whenever they want, so long as they are sufficienty incentivized.  It doesn't need to be some Hollywood heist plot: they would just stop you, physically take your phone from you, and put you in jail until you unlock it (assuming they don't already have some trick to get it themselves).  Hopefully that's not true, but let's build a system that anticipates this possibility, and then any surprises are good surprises.
In fact, the only thing I think is reasonable to assume they don't have control over is the the basic math of encryption itself.  It's pretty widely inspected stuff, and even if The Feds have a monopoly on violence, they don't have a monoploy on mathemeticians.  Granted, every algorithm is always one breakthrough from being obsolete -- some new technique (or the ever-lurking threat of quantum computing) is always right around the corner.  And it's likely that The Feds will have a leg up on any new techniques merely because they are more motivated than most to find them.  But to the degree encryption can be trusted at all, I think it's a reasonable position to assume that it in fact works as advertised.  If nothing else, the alternative is pretty demoralizing.
Regardless, I can assure you that as of right now, they aren't doing anything of the sort with Expensify's code, data, or servers, and we would fight every request to the fullest extent of the law.  And I'm sure the CEOs of Signal, Google, and Apple would say the same.  They might even claim that they would personally go to jail or wipe all their servers to avoid data falling into the wrong hands -- and they might even mean it!
Unfortunately, you can't actually trust any of us: it's entirely possible we are being forced to lie to the public, conceivably at risk of physical harm to us and our families.  There are a wide variety of opinions on whether that's a good or a bad thing, but a full analysis of the moral foundation of a nation's monopoly on violence in a particular geographic region is out of the scope of this document.
The main thing is: if you are planning on becoming a "person of interest" of The Feds, I'd urge you to reconsider your life choices, and be damn sure whatever cause you are fighting for is worth it.  Because "the grid" goes far and wide, and no amount of encryption will protect you forever.
Hackers

If The Feds have the most power due to just being able to get a warrant for this information, the next most threatening entity would be a nebulous group of "Hackers".  These would be people who have technical sophistication and presumably some motive to attack, as well as a willingness to work outside the law.  Hackers fall into a few major buckets:


Nation States - These would be major state-sponsored hacking groups who have enormous skill and patience, and no real profit motive.  They are willing to spend virtually unlimited time and money, so long as doing so aligns with national security objectives.  Russia, China, and Iran are the most commonly mentioned, but it's best to assume that every nation with a military has some form of offensive "cyber" (eyeroll) capability.  Nation States will take the time to create bespoke attacks to specific services, or specific targets.  Just like it's a pretty dangerous move to piss off The Feds, it's maybe not the best idea to piss off other Nation States with active hacking groups.


Failed Nation States - The only difference between a Nation State and a Failed Nation State is a profit motive.  In general, there's only one Failed Nation State hacker, and that's North Korea's Lazarus Group.  Unlike other nation states who are hacking for national security, North Korea is hacking for cold hard cash: much of their work is to generate hard currency to fund state operations and bypass sanctions.  This is also the group that hacked Sony as punishment for releasing a movie about assassinating the North Korean dictator Kim Jong Un.  Once again, if you can avoid it, probably best to avoid pissing off the North Koreans.


Industrial Espionage - The most common accusation for this is against China, which (supposedly) works in close coordination with major Chinese companies to steal military and civilian technology to give their industry a leg up.  I don't think anyone really has a good idea what's going on here, but it seems safest to assume that it is in fact happening, and that if there is some reason for Chinese industry to steal your IP, you should assume they will try.


Profiteers - Hackers with an overt profit motive definitely exist.  However, they generally target banks -- especially those who run off-the-shelf banking software that runs on Windows servers.  These are typically very well organized groups that have a whole network of specialized parties that work in coordination to not just attack the servers, but also steal the money, transfer it abroad, and then eventually withdraw it from thousands of ATMs around the world.  That last step especially is very dangerous, but in general every group distrusts all the others, so it's much safer and more cost effective to repeat attacks that are already known to work rather than investing in creating a new one.


Bitcoin Blackmailers - The next most dangerous group are those who are trying to steal and threaten to release embarassing information, extorting you for Bitcoin.  Another variation would be encrypting all your data to hold it hostage.  Eihter way, these are likely opportunistic groups that will use general tools to hack a wide variety of systems -- probably starting with phishing attacks to install malware.  This is a lot safer because there is no physical money to deal with, though Bitcoin is (despite its reputation for anonymity) quite literally the most carefully tracked currency in the history of the world, so it's not without risk to actually convert thier ill-gotten gains into real money.


Script Kiddies - These are amateur hackers who are just learning the ropes, typically by running well-known off-the-shelf tools designed to test servers for known vulnerabilities.  Script Kiddies are just in it for the lolz.


Regardless, there are as many kinds of hackers as there are Hollywood plots.  But just like you don't make a unique defense to protect against every possible infection, the general defense against all these is good hygiene: keep all your systems patched and don't click, don't click on sketchy links.
The Cops

Granted, the line between The Cops and The Feds is pretty murky -- and the closer you model your behavior on Snowden or Bin Laden, the closer you get to The Fed territory (which as discussed, is a real dangerous place to be).  But 99% of people who care about encryption have no interest in genuinely overthrowing the government, and instead are looking for protection against government overreach.
One sceanrio of concern could be excessive force or other abuse of authority by an individual officer, such as at a border crossing, traffic stop, or protest.  Thankfully, the powers of any individual officer to compel you to reveal your information on the spot are very limited, and a good lock screen is your best defense.
That said, if you are arrested and charged with a crime (even a trumped up crime that will never be prosecuted just to intimidate you), it's possible that you'll be asked to unlock your phone and present your chat history to officials.  In this situation, end-to-end encrytion and self-destructing chats are important tools to ensure you remain in control of the information you reveal, by preventing The Cops from subpoening your chat history out of our servers.
Accordingly, it's best to assume The Cops can gain control of The Servers via a valid subpoena, and maybe Your Device if they are particularly aggressive.  But it's unlikely they could get access to The Network or The Code.
Lawyers

The next threat -- and we're finally getting to something a bit more realistic that might actually concern you -- is some Lawyer coming after you via civil lawsuit: them claiming you did something, and you claiming you didn't.  This would be resolved in a court, and would generally result in a court ordering you to turn over any written documents (including chats) related to the case.  To be clear, you are legally required to do this; encryption doesn't absolve you this legal requirement.  But it does ensure that the court can only make this request to you directly, and can't bypass you by sending the request straight to someone else (eg, us).  Because if they did, we would honestly explain that we cannot help them due to you using legal encryption technology to protect your own privacy, and we don't have the key.
This means Lawyers have largely the same capabilities of The Cops: access to The Servers via subpoena, but probably not Your Device.
Your Boss

Expensify already has very strong privacy protections in place to ensure employers do not arbitrarily inspect the private data of employees.  And similar to above, if your boss is served a legal subpoena to produce documents related to some civil lawsuit, any chats you have related to your employer are legally "discoverable" -- even if you use an end-to-end encrypted chat tool, or SMS, or one-time-pad encrypted paper notes, you are required to honor the court's request.  But this proposed design creates very strong protections to ensure that no private communications with non-employees get accidentally revealed as well, by ensuring all chats with non-employees are fully encrypted and that neither Your Boss nor Expensify has the key.
Similar to The Cops and Lawyers, Your Boss can realistically have access to The Servers via subpoena in relation to a legitimate lawsuit, but probably not Your Device.
Your Friends

The final group you might want to protect your chats from being revealed to, are the people immediately around you -- your firends, family, and so on.  Again, a good lockscreen PIN along with biometric unlock is probably your best defense.  But unless your grip is really good, it's not infeasible that your unlocked phone could be snatched from your hand.  Toward that end, self-destructing chats are likely the strongest protection.
Strangely, Your Friends probably have an easier time getting access to Your Device than literally anyone else -- even though they have no access to anything else.
Common Attacks & Protections

Ok, with that quick review of attack surfaces and common adversaries, let's talk about the most likely attacks to happen in the real world that you might want protection against:


Dragnet surveillance.  Probably the most common fear is just of some undirected, omnipresent observer eavesdropping on all communications and taking notes for some unknown future purpose.  I wager 90% of the desire for (and value of) end-to-end encryption is just to eliminate the anxiety that you might accidentally say something that could be misinterpreted and used against you at some future point by some some unknown party.  Even the most social person needs the ability to retreat to a private space every once in a while, just to recharge out of public eye.  Accordingly, I think by far the most important value of end-to-end encryption is just so online conversations can maintain the same natural privacy expectations we have enjoyed for millennia offline, by eliminating any central unencrypted chokepoint through which dragnet surveillance could be realistically installed.


Targeted surveillance.  Next, even though very few of us are interesting enough to be the subject of any kind of targeted surveillance, nobody likes the lurking suspicion that someone is secretly listening in.  The next value of end-to-end encryption is to ensure that if someone does have a legal reason to review your historical communcations, they need to come directly to you to get it -- they can't do it secretly, without your explicit consent.  This can be done by putting each user in full control of access to their communications history, such that any legal requests to access it must go through them.  Similar to how protection against dragnet surveillance creates an instant anxiety relief even if no such surveillance is in place, the knowledge that nobody can access your history without going through you is a contiuous relief even if nobody ever asks to see it.


Device confiscation.  The next most worriesome attack is simply having your device confiscated.  Whether or not you are pressured to unlock it, the possibilty that they have found some method to unlock your phone without you can be extremely alarming.  To be clear, this has nothing to do with the presumption of guilt: as more and more conversations go online, the range of personal information contained on your device -- including not just chats but also personal photos -- creates an enormous risk of an invasion of privacy.  Toward that end, the best protection against privacy invasion upon device confiscation is to use expiring messages, thereby limiting the potential for exposure.


Device compromise.  Though it's clearly alarming for someone to grab an unlocked phone out of your hand, it at least has the benefit of you knowing it happened.  An even worse attack involves your device somehow being compromised without you realizing it.  In Hollywood that seems to involve swapping out the SIM card when you aren't looking; in practice it's probably a more mundane issue of someone clicking on a malware link in some kind of phishing email.  Either way, this is probably the most damaging attack because it conceivably allows your adversary to extract the decrypted communications directly from the device on an ongoing fashion.  Unfortunately, there's really nothing encryption can do to prevent this because the attack happens after the decryption is finished.


Key extraction.  Though it's very unlikely to be dealing with an attacker this sophisticated, and this is nearly impossible to do without first confiscating or otherwise compromising the device itself, an attacker might somehow obtain a copy of the encryption key, enabling them to eavesdrop on the encrypted network communication and decrypt on an ongoing manner without needing ongoing access to the device.  The best protection against this is to use a protocol that has two properties:

Forward Secrecy - This means that the information on the device can't be used to reveal all future conversations.  The most common way to solve this is to frequently reset the encryption key, rendering any compromised key useless for future messags.
Backward Secrecy - As the name implies, if Forward Secrecy ensures the keys on the device can't be used to decrypt future messags, Backward Secrecy ensures that past messages (other those currently stored on the device, which are revealed) cannot be decrypted using any key on the device.  So even if you were recording all the network traffic to a device for weeks or years ahead of time, if you finally get ahold of the device and extract its key, the key will be useless for all of those previous years of messages.  This is most commonly done by combining the frequent key resetting of Forward Secrecy, with simply deleting the decryption key after it's used, rendering undecipherable any outstanding copies of the historical encrypted message.


Password extraction.  Given the difficulty of extracting and using your encryption keys (not to mention the limited value of doing so as a result of Forward/Backward Secrecy), another attack is to simply extract the username/password you use to sign into the service (along with perhaps any underlying key used to power a two-factor authentication app) to simply sign in as you at a future time.  The best protection against this is to make it really obvious when a new device signs into your account (by notifying the other devices), as well as using a "paring" feature (where an existing device needs to actively authorize the new device) before revealing any sensitive data to the new device.


Summary of Encryption Primitives

Though there are a zillion important details being left out, here is a quick summary of the major tools used to build encrypted systems:
Symmetric (aka "Secret Key") Encryption

The simplest (though still insanely complex) form of encryption is called "symmetric" encryption, because the same key is used to both encrypt and decrypt.  In general, all encryption works on "blocks" of data (eg, AES works on 16-byte blocks), meaning long messages need to be split up into many blocks, and small messages need to be "padded" to fill out the block.  Symmetric encryption is extremely fast, in part because most modern chips have instructions specifically designed for it.  It's also extremly secure: a block encrypted with a symmetric key is nearly impossible for an attacker to decrypt, within any reasonable timeframe (ie, longer than your lifetime).
That said, encrypting each block with the same key isn't enough to fully protect the message, because if there are two blocks with the same "plaintext" content, the encrytped "ciphertext" will be the same for those blocks.  Even if the attacker won't know precisely what that block is, they'll know that there were multiple instances of the same block.  Said another way, even if it's unclear what the message is, it might reveal that the same messages was sent several times in a row.  Accordingly, symmetric encryption is paired with an encryption "mode" (seeded with an "initialization vector"), which generally combines the output of previously encrypted blocks with each new block, such that even the same block encrypted many times in a row will look random to an attacker.
Though not strictly a feature of symmetric encryption per se, it's becoming increasingly popular to "ratchet" your symmetric key over the course of communications using a "hash function", such that every message you send is encrypted with a different key.  To break this down further, the purpose of a "one way hash function" (like sha) is to generate a "fingerprint" of some input that perfectly identifies it, without being able to "reverse" it and reproduce the input.  So two different people who have the same content, can each independently hash it, and produce the same exact fingerprint.  Then if you hash that fingerprint, you get another one -- and so on.  This is a neat trick, because so long as two parties agree on the initial content (in this case, the content is a symmetric key), then they can agree on an infinite sequence of keys -- each generated from the one before.  So if both parties agree to "ratchet" the key (ie, replace the symmetric key with its own hash) after each message, then they will both change the key every single message -- in a way that appears completely random to an outside party.  Even better, if someone intercepts one of the keys somehow, because the hash function is "non-reversible", the attacker wouldn't be able to figure out any of the previous keys that came before it (though they would be able to "ratchet along" for future messages).
Symmetric encryption is also called "secret key" encryption because anyone who has the key can decrypt the content, so it's important to keep that key "secret".  That is hard to do when two parties are communicating over a public channel that others can see, which is what brings us to...
Asymmetric (aka "Public Key") Encryption

To solve the "bootstrapping" problem of enabling two parties to agree upon a common symmetric key, a second kind of "asymmetric" encryption exists that uses a key split into two parts: the "public" and "private" half.  As the name implies, the "public" half of the key can be safely shared in the open, and in fact is the most common way of cryptographically identifying yourself to the rest of the world.  The "private" half, again as the name implies, must be kept secret, and the fact that you are the only one who knows it is what enables you to prove you are in fact the person who is identifying yourself with the corresponding public key.  Though like anything there are a million different algorithms, the two main asymmetric encryption algorithms are:


RSA - A workhorse from the 70's, RSA is neat in that a message encrypted with one half of the key can only be decrypted with the other.  So, this means anybody can securely send a message to someone over a public network by encrypted the message with the recipient's public key, confident that only the recipient has the private key that can decrypt it.  On the other hand, this works in such a fashion where someone can demonstrate that they do in fact have the private key by "signing" a message using the private key.  This is done by sharing a message like "My name is David", and then encrypting that message with the private key, enabling anybody to decrypt it with the public key and confirm that the message was in fact sent by the holder of the private key.  RSA encrypts in blocks of 256 bytes, but in practice the maximum message size is 190 bytes due to the need to "pad" the message.  RSA is much, much slower than symmetric encryption, and much more computationally expensive to create a key.  Accordingly, in nearly all cases, RSA is not used to encrypt actual content (ie, by splitting it up into a bunch of 190-byte blocks and encrypting each separately).  Rather, the sender would typically pick some other symmetric key (aka a "session key" as it's generally unique for each communication's session), encrypt that session key with RSA, then send the encrypted session key to the recipient.  After the session has been "initiated" in this way, subsequent communications would just use the much, much faster symmetric encryption key.


Diffie Hellman (DH) - DH isn't just an alternative to RSA, it's actually a totally different thing.  Though both RSA and DH has the same concept of public/private keys, DH doesn't let you encrypt with one key and decrypt with the other, like RSA.  Rather, DH is used exclusively to enable two parties to independently derive the same session key.  Basically, given two parties Alice and Bob, each of whom has publicly revealed their public key halves (Alice.pub and Bob.pub), and secretly stored their private keyhalves (Alice.priv and Bob.priv), DH allows Alice and Bob to each combine thier private key with the others' public key, to independently calculate the same shared session key.  So it accomplishes the same real-world effect as RSA in that the end result is both parties have securely agreed upon a symmetric key to encrypt/decrypt the session content, but just goes about it a differnet way.  So while DH can't encrypt arbitrary messages directly with the public key like RSA -- nor can it sign messages with the private key -- DH has a very significant advantage over RSA in that generating a new key is vastly less expensive, so it's much easier to "rotate" (ie, regenerate) a DH key than an RSA key.  Indeed, DH with Curve25519 can use any 32-byte block of data as a DH key, meaning generating a new key is as simple as picking a random 32 bytes.  This is particularly useful when doing a "DH ratchet", which is just like a symmetric key ratchet (explained above), but changing your DH key with every message.  Ratcheting with RSA would likely be prohibitively expensive.


Problem

Alright, with all that background out of the way, what is the actual problem we are trying to solve?  Security is never absolute, it is always a tradeoff, so as we think about Expensify's end-to-end encryption design, what specifically are we trying to protect again?  I would suggest that the bulk of our users would prioritize being protected from the following, in order:


Mass surveillance.  Most people have nothing to hide and no particular "need" for secrecy.  But when continuously observed we get anxious and behave differnetly.  People just want to know that there is nobody casually and continuously looking over their shoulder, just for a general peace of mind.


Your Boss.  Expensify is a tool used both for personal and professional use -- and because we're not robots, we often have personal relationships in the workplace that need protecting.  Expensify needs to guarantee that nobody can see your private chats except those who participated in them.


Lawyers.  As law-abiding citizens we have an obligation to participate in "legal holds" and "discovery", and to enable our employers to do the same.  But we should put our users in control of their information by ensuring that the requests come directly to them, rather than people they don't know revealing thier private information to other people they don't know.


The Cops.  We expect that many of our users will be engaged in righting society's many wrongs, which might include protest -- and even civil disobediance.  Thia could involve tangling with law enforcement -- possibly in the streets, possibly at border crossings -- who have an imperfect track record in honoring the full rights of citizens.  Ideally the information they could obtain would be limited.


Hackers.  Expensify users would expect that every reasonable precaution will be taken to ensure that hackers of all kinds will not be able to access private information, whether business or personal.


That said, at risk of being controversial, I'd like to call out one threat that is "out of scope" for a v1 of our end-to-end encryption:

The Feds.  A full protection against the united global efforts of the world's richest and most technologically advanced nations is... difficult.  Indeed, I'd say it's basically impossible for any private company to realistically claim a serious defense against the multi-trillion dollar military/intelligence complex, and anyone claiming otherwise is simply delusional.  More to the point, given that these nations have all the legal and physical authority to force anyone involved in this process to bypass their own security and lie to everyone about it, as a customer, nobody should trust anyone who claims they are providing this kind of impossible protection.

Solution Summary

The above sections outline the various kinds of attackers and attacks that we should pro_tect against, and hinted at how those protections might work in practice.  Let's dig into each of those protections now:


Device keys.  The first thing the client device does is generate an asymmetric key, which will be used to communciate with this specific device (of which a user might have multiple).  Whenever the device's mailbox on the server is empty (which is most of the time), this key can be safely regenerated, with new public keys broadcast out to the other devices.


User authentication.  Expensify uses a standard email/sms + password + optional (but highly recommended) two factor authentication scheme.  The device's public key is uploaded with this authentication request, and on succesful authentication, is used to open a store-and-forward "mailbox" for other devices owned by the same user to send it messages.


New device pairing.  In particular, the new device broadcasts to all the other devices "I have successfully authenticated, and my random pairing PIN is XXXX!" by encrypting this message to each other device's public key, and sending it to the other devices' mailboxes.  Every device gets this message (either immediately, if online, or later when it comes online), and shows a "Did you sign in to the device showing PIN XXXX?" modal.  Whether yes or no, that device broadcasts a message to all other devices saying "This device's pairing request was approved/denied", and they all hide the pairing modal.


Prekey initialization.  The client takes a page from Signal's book and generates 100 "prekeys" and uploads their public keys to the server.  The server will give each of these keys out as necessary, one at a time, whenever initiating a new session.  The server notifies all devices whenever a prekey is used, and each device uploads a new one to replace it, such that there are always at least 100 prekeys for each user.


Prekey sharing.  Becuase Expensify is built for multi-device operation, the private half of each prekey needs to be synchronized between all devices.  Accordingly, when uploading the public prekey to the server, each device also encrypts the private prekey half with each other device's public key, and inserts it into that device's mailbox.  This mailbox will deliver instantly if the client is online, otherwise it'll be held until teh client comes online again.  In this way, all of a user's devices will have every prekey's private half, and thus be equally ready to process the new incoming session.


Session initialization.  Every chat in Expensify is a room with 2 or more users -- there is no technical differentiation between "1:1" and "group" DMs.  Furthemore, unlike Signal, all messages are recorded on the server forever, and delivered to clients "in order".  In essence, each chat is a sequential list of messages, each of which can be optionally encrypted.  (Some rooms have encryption disabled, to enable serverside searching and legal hold discovery for business purposes.)  Accordingly, a "new chat" just means a new room has been created with two or more people in it -- and once created, it never goes away.  So if Alice is creating a room with Bob and Carol:

Alice's device 0 (Alice0) notifies the server it is creating a chat with Bob and Carol
The server picks the next prekey for Alice, Bob, and Carol, and adds a message to the chat saying Alice, Bob, and Carol are using prekeys A, B, and C respectively -- all devices currently online receive this message (and all that are not online will receive it when they sign in)
Alice0 sees it has no symmetric "room key" R for this room
Alice0 picks a random symmetric "room key" R and creates a message M like Alice will be encrypting future messages to this room with room key R, until further notice.
Alice0 encrypts message M once each for Bob's and Carol's public prekeys, as well as once again to her own prekey (such that her other devices will get it), combines it all into a single post signed by her private key A, and sends to the room
Bob and Carol, along with all of Alice's other devices receive this message on all their devices, each:

verify it's signed by A's public key
decrypt using their own room's prekey private half
record Alice's symmetric room key for future use


Session communications.  Now Alice has an active room key known to Bob and Carol; all of Alice's future messages are just encrypted once using the symmetric room key, and all devices for all users can decrypt it.


Disappearing messages. If Alice would like disappearing messages (eg, after 1 week) then at some cadence (eg, daily):

Alice0 picks a new room key and broadcasts it out just like before, but this time including along with the key a request to delete all messages encrypted with this key after 7 days -- along with the key itself
All devices keep track of all historical room keys, and delete them at their configured expiration dates


Conclusion

This proposal to add end-to-end encryption is heavily influenced by Signal's design, but adapted for a different set of requirements, and hopefully simplified where possible as a result.