NikhilPeri/Secure and Verifiable Database Schema.md

## Secure and Verifiable Database Schema.md

      
    Raw
  

              Secure and Verifiable Database Schema.md
            
          
    Anonymous Verifiable Database

This document outlines solutions to building the database for a voting system with the following features:

Anonymous means that it is not possible to associate a vote back to a single citizen, but a citizen can
log back in and review their ballot was cast correctly.
Personally Verifiable means that we can automatically detect their vote was counted as they cast it
this is something that is not possible in the current voting system
Publicly Verifiable__ means that we can automatically go through the database and independently count the votes
and ensure they have not been altered

Each of the solutions will be evaluated based on resiliency in the case of database leaked database read_access, and write_access
What is Hashing

d4811bee88f1480710302b89425266ee =  Hash('my_secret'+'a_random_salt')

Hashing is a one way function that transforms data into a sequence of psuedo random bits.  The process uses modulo
arithmetic is irreversible, without brute force.  If a strong hash is used such as SHA3-512 which generates a
64 byte word there are 1.34e154 possible
combinations which would take more computing time than there is in the universe to break.
Furthermore there are libraries to use slow hashing which will require a certain amount of time to compute which further
decreases the ability of a brute fore attack. For this Rails Project we will use bcrypt.
BCrypt also supports salting which is good approach to reduce the likely hood of hash collisions when hashing no random data
BCrypt::Engine.cost = 8           # value between 4 and 10

salt = BCrypt::Engine.generate_salt                   # $2a$08$982gj3KvnmnjgGCR/O3vR...
hash = BCrypt::Engine.hash_secret('my_secret', salt)  # bwhVn9QRYbaetibsjCbnymulhLjD...

We can now store the hash and the salt in out database and as long as we provide the same my_secret input and salt we will get the same
string output. This is what makes hashing a oneway process
Anonymous


Foreign key associations are bidirectional, and not useful when trying to build an anonymous lookup
by using the following hash function to generate a secure uni-directional foreign key lookup
user_secure_id = hash(user_password, user_id, election_id, voter_salt)

This means that a user will have to enter their password in order to lookup their vote.  However if read_access is compromised
it is not possible to identify the vote associated to a particular user.  In fact by adding the election_id to the hash generation
a user will have a unique secure_id for each election they participate in.  This means for a single vote we are secure if read access
is compromised.
However if write access is compromised it is possible to alter a individual users vote and it will only be apparent if the user logs
in and physically notices their vote was changed.  This means that the system is not publicly or personally verifiable
read access manually personally verifiable
write access unverifiable
Publicly Verifiable


Currently the secure foreign key only allows anonymous database lookups. However the election is still not publicly verifiable.
However if we can generate a hash using publicly available information private citizens can iterate through all the votes and detect
any anomalies.  If we chain these operations by using part of the previous hash in the next hash, therefore
changing a single vote will break all later votes as well.
When it comes to sampling the previous signature we want to sample as many bits as practical.  Sampling too many will increase the
likelyhood of a hash collision, not sampling enough bits will result in missing a tampered vote.
P_false_positive = 16 ^ - (# of sample bytes)

sampling 10 bytes means that in an election with 1 billion votes you can get away with tampering with 0.014 votes and not get caught
However this approach is great for the read_access case.  But if the database is write_access compromised and signatures are generated
using all publicly available information it is very easy for an attacker to re-generate the signatures and automatic verification by an independant
citizen would not detect any anomalies
private_vote_signature = hash(election_id, vote, previous_public_vote_signature)

read access automatically publicly & personally verifiable
write access unverifiable
Personally Verifiable


To solve the final problem of automatic verifiability we generate a final hash of the public_vote_signature and the user_password.  This
signature can not be regenerated by an attacker. If write_access is compromised and an attacker changes a single vote they are able to modify the
public signatures, however they cannot modify the private_vote_signature.  This means any user who logs in can verify the correctness of their
vote along with all pervious votes in the system
private_vote_signature = hash(user_password, public_vote_signature)

read access automatically publicly & personally verifiable
write access automatically personally verifiable, by all users
Open Issues

This approach secures the lowest level of vote storage, but there are still other unresolved attack vectors

When Foreign Keys are integers it is very easy to index them in and preform optimized queries.  However when the keys you are
searching for are pseudo random queries will become very slow.  A solution could be parallel distributed databases clusters where nodes are added
when the master table exceeds a threshold, the chain can be forked and two tangent chains can be created
Still maintain a single point of failure, if datacenter is destroyed. A better solution would be high replication of data centers
Still susceptible to malware and MITM attacks!! please help me solve this
How do we ensure a voters physical identity, again help me solve this!!