An RSA Class Project Protocol

By Connie Hilarides

Overview

I have created a fairly simple protocol for encrypting and decrypting files so that
we all have a common format to work with. This guide does not give a reference
implementation because we are all using different languages and frameworks, but I will
describe on an abstract level the process for reading and saving these files.

Reading the files

When reading a file, we must decide how we are going to break up the data so that it can
be reasonably and efficiently transformed and transported through RSA. It is relatively
simple to support arbitrary chunk sizes, and different chunk sizes will be appropriate
depending on the size of P and Q (each chunk's value must be smaller than the modulus
P * Q), so this is the approach I have chosen. Using a 1-byte chunk size is tempting for
its simplicity, but it comes with a glaring security issue: there are only 256 possible
chunk values, so anyone with the public key can encrypt all of them and look up each
ciphertext to recover the message. With that out of the way, let's get down to the
actual procedure.

When reading a file, we can simply pick a number of bytes to read at a time, say 4. We
read these bytes into a chunksize buffer, and then we convert the buffer into a BigInt
(or whatever the equivalent in your multiple-precision math library is), interpreting the
bytes in big-endian order. The byte order is very important for compatibility across
clients. If you just need a small refresher, big-endian means the first byte in the buffer
is the most significant digit. If your library has a way to convert a buffer of big-endian
bytes into a BigInt, great! Just do that 😄! Otherwise, you can do something along the
lines of the following pseudocode:
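
let buffer = file.read(chunksize);

// Only the last chunk can come up short; pad it on the right.
while buffer.len() < chunksize {
    buffer.push(padding_byte);
}

// Fold the bytes into one integer, most significant byte first.
let value = BigInt::ZERO;
for byte in buffer {
    value = value << 8;
    value = value | byte;
}

chunks.push(value);
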
And now we repeat this process until we reach the end of the file. If the size of the file
is not a multiple of the chunksize, that's okay. We simply treat the least significant bytes
of the last chunk's BigInt as 0x00 (or any other padding byte, it doesn't particularly
matter). If you're following my pseudocode above for your file reading, the padding loop
already takes care of this: only the first remaining_bytes bytes of the last buffer come
from the file. Put all of these values into an array and we're good to go! From here you can
treat these numbers exactly like you did the encoded strings from earlier in the class.
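
For anyone working in Python, the whole reading step might look roughly like this. It is only a sketch: `read_chunks` is a name I made up, it assumes the stream is a buffered binary file (so `read` only comes up short at the end of the file), and it returns the total byte count alongside the chunks because the receiver needs it later.

```python
import io


def read_chunks(stream, chunksize, padding_byte=0x00):
    """Read a binary stream into big-endian integers, one per chunk.

    Returns (chunks, total_size); total_size counts only real file bytes.
    """
    chunks = []
    total_size = 0
    while True:
        buffer = stream.read(chunksize)
        if not buffer:
            break
        total_size += len(buffer)
        # Pad a short final chunk on the right (the least significant end).
        buffer = buffer.ljust(chunksize, bytes([padding_byte]))
        # Python's built-in int plays the role of BigInt here.
        chunks.append(int.from_bytes(buffer, "big"))
    return chunks, total_size


# Six bytes with a chunksize of 4: the second chunk gets two padding bytes.
chunks, total_size = read_chunks(io.BytesIO(b"\x01\x02\x03\x04\xab\xcd"), 4)
```

Here `chunks` ends up as `[0x01020304, 0xABCD0000]` and `total_size` as 6.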

Addendum

Since there's been a little bit of confusion about this, I'll clarify. On the last chunk, when
filesize % chunksize != 0, you should pad the buffer up to chunksize with whatever value
you would like, although 0x00 seems like a good choice for consistency. On the decryption
side, this means that after converting the integer back into a byte buffer, you should make
sure you ignore the trailing bytes which go beyond the data in the last chunk.

Transmitting encrypted data

Now that we can read our file into an array of numbers and encrypt it, we need a way to send
the data to each other. For this I opted for an extremely simplified protocol to avoid the
messiness of working with binary files across disparate codebases as much as possible. The
encrypted files are simply a long list of decimal numbers separated by newlines. For reference,
these are the first few lines of the result of encrypting the sample PDF file on Moodle:

4
156629
280142762805293730221154046
833819895614268281427979597
348435674451341947894218752
1251450166634126079199772553
1036140818739004042990078616

The very first line of the data is the chunksize which was used when we read the data
from the file. We need this because otherwise it would be impossible to tell how many
bytes each decrypted number stands for: leading 0x00 bytes in the original data vanish
when a chunk is turned into an integer. The second line is the total number of bytes in
the original file. We need to know this in order to know how many significant bytes are
in the final chunk if the total size is not divisible by the chunksize. The rest of the
data is just the list of integers making up the encrypted data. An extremely simple
format to send! 😊
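
As a rough Python sketch of this format (the function names `dump_encrypted` and `load_encrypted` are my own, and the trailing newline and the tolerance for blank lines are assumptions, not part of the protocol):

```python
def dump_encrypted(chunksize, total_size, numbers):
    """Serialise the two header lines and the encrypted integers as text."""
    values = [chunksize, total_size, *numbers]
    return "\n".join(str(value) for value in values) + "\n"


def load_encrypted(text):
    """Parse the text format back into (chunksize, total_size, numbers)."""
    values = [int(line) for line in text.splitlines() if line.strip()]
    chunksize, total_size, *numbers = values
    return chunksize, total_size, numbers


# The header and the first two numbers from the sample above.
text = dump_encrypted(
    4, 156629, [280142762805293730221154046, 833819895614268281427979597]
)
chunksize, total_size, numbers = load_encrypted(text)
```

Because everything is plain decimal text, any language with a big-integer parser can read the file without worrying about binary layout.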

Writing the files back to disk

Writing the data back to disk is a fairly straightforward process. After you've decrypted
the RSA numbers, you simply need to reverse the process outlined in the first section.
There are just a couple of edge cases you need to watch out for. Your BigInt library may
or may not come with a method for writing your integer's bytes into a buffer. If it does,
it may be giving you the minimum number of bytes required to represent that number, which
could be less than the chunksize if the most significant bytes were 0x00. This is just
something to think about. And then you'll need to deal with the last chunk potentially
containing less than chunksize bytes worth of data. With that preamble over with, I'll
just give you some pseudocode so that you can implement this.

let buffer = new_bytearray(chunksize);
let total = totalsize;
for chunk in chunks {
    // Put the raw bytes of the number at the start of our buffer.
    // `count` is the number of bytes which got written, which may
    // be fewer than chunksize because leading zeros are dropped.
    let count = chunk.to_bytes_bigendian(&buffer);

    // Every chunk was chunksize bytes wide when it was read, so the
    // missing most significant bytes have to come back as zeros.
    let zeros = chunksize - count;

    // On the last chunk, when totalsize is not a multiple of
    // chunksize, only `total` bytes are real data. The rest of the
    // chunk is padding and must not be written.
    let size = min(total, chunksize);

    let leading = min(zeros, size);
    file.write_zeros(leading);
    file.write(buffer, size - leading);

    total -= size;
}

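
If your language can produce fixed-width bytes directly, the two edge cases collapse into a few lines. Here is a sketch in Python, where `write_chunks` is a made-up name and the built-in `int.to_bytes` stands in for the BigInt conversion:

```python
import io


def write_chunks(stream, chunks, chunksize, total_size):
    """Write decrypted chunk integers back out as the original bytes."""
    remaining = total_size
    for value in chunks:
        # to_bytes always yields exactly chunksize bytes, so leading zeros
        # are restored for free. It raises OverflowError if the value does
        # not fit, which usually points at a wrong key or wrong chunksize.
        data = value.to_bytes(chunksize, "big")
        # On the last chunk, keep only the real data and drop the padding.
        keep = min(remaining, chunksize)
        stream.write(data[:keep])
        remaining -= keep


# Two chunks of size 4 holding 6 real bytes; the first has leading zeros.
out = io.BytesIO()
write_chunks(out, [0x00000304, 0xABCD0000], 4, 6)
```

After this runs, `out.getvalue()` is `b"\x00\x00\x03\x04\xab\xcd"`: the leading zeros survive and the two padding bytes are gone.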

I hope this is enough for you to go on! (I know that last pseudocode block is closer to
real code than pseudocode, hope nobody minds 😊)
If you have any questions or comments, feel free to leave them on this gist, or send
me an email at c.hilarides@digipen.edu