In Haskell, functions are pure and effects are explicit.
append :: [a] -> [a] -> [a]
filter :: (a -> Bool) -> [a] -> [a]
If you want to cause an effect, you need to write a function that returns an action:
filterIO :: (a -> IO Bool) -> [a] -> IO [a]
putStr :: String -> IO ()
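As a warm-up, here is one way filterIO could be implemented. This is essentially filterM from Control.Monad specialized to IO; the definition below is a sketch, not taken from any particular library:

```haskell
module Main where

-- A possible implementation of the effectful filter above; this is
-- essentially filterM from Control.Monad, specialized to IO.
filterIO :: (a -> IO Bool) -> [a] -> IO [a]
filterIO p = foldr step (pure [])
  where
    step x acc = do
      keep <- p x   -- run the effectful predicate
      rest <- acc   -- filter the remainder of the list
      pure (if keep then x : rest else rest)

main :: IO ()
main = do
  evens <- filterIO (pure . even) [1 .. 10 :: Int]
  if evens == [2, 4, 6, 8, 10]
    then putStrLn "ok"
    else error "unexpected result"
```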
We will now look at an API that consists of two functions: one for getting the system time, and one for getting the contents of /etc/shadow. Clearly, granting access to the first function is far less problematic. We start by providing the unprotected functionality in a module called TCB (for "trusted computing base").
module Api.TCB where
unsafeGetTime :: IO UTCTime
unsafeGetShadowFile :: IO [String]
What we want instead is two functions that internally may call the above, but on the surface keep track of the legitimacy of all operations:
-- (still in Api.TCB)
data L = Public | Member | Admin
deriving (Eq, Ord, Show)
instance Label L where ...
getTime :: LIO L UTCTime
getShadowFile :: LIO L [String]
L is the clearance level required for the action currently performed. We will look at a more sophisticated Label instance in the next section. For now, just think of the labels as totally ordered.
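To make "totally ordered" concrete, here is a sketch of what such a label type could provide. The class below is illustrative and deliberately simplified; the real Label class lives in the lio package and has a slightly different interface:

```haskell
module Main where

-- Illustrative stand-in for lio's Label class (simplified sketch).
class (Eq l, Show l) => MyLabel l where
  canFlowTo :: l -> l -> Bool  -- may data at the first label flow to the second?
  lub       :: l -> l -> l     -- least upper bound (join)
  glb       :: l -> l -> l     -- greatest lower bound (meet)

data L = Public | Member | Admin
  deriving (Eq, Ord, Show)

-- For a totally ordered label type, the lattice operations are just
-- the operations of the derived Ord instance.
instance MyLabel L where
  canFlowTo = (<=)
  lub       = max
  glb       = min

main :: IO ()
main =
  if Public `canFlowTo` Admin
       && not (Admin `canFlowTo` Member)
       && lub Member Admin == Admin
    then putStrLn "ok"
    else error "lattice laws violated"
```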
If we run an action, we need to provide a clearance level:
runTCB :: Label l => l -> LIO l a -> IO a
runTCB = ...
main = do
  role <- getUserRole
  runTCB role getShadowFile >>= print
In a web application, getUserRole may extract the user name and password from the HTTP request, verify the password, and look up the user's roles in the database if the password is correct. (In our current implementation, it needs to crash if the credentials are incorrect.)
LIO L is a monad that makes sure that a given clearance level is never violated. How is this achieved? Let's look at the implementation.
getTime is easy, because it does not require any particular clearance level. We simply need to introduce a function that turns a labelled action into an unlabelled one:
ioTCB :: LIO l a -> IO a
Note that calling ioTCB requires you to manually verify that the clearance is sufficient for performing the action. This is why modules that import it must be trusted, and why we called the current module TCB.
Now we can provide a trivial implementation of getTime:

getTime :: Label l => LIO l UTCTime
getTime = ioTCB unsafeGetTime
(FIXME: drop this example for something more entertaining?)
Access to /etc/shadow, however, needs to be marked as privileged access. lio provides the function taint for this:
taint :: Label l => l -> LIO l ()
getShadowFile :: LIO L [String]
getShadowFile = taint Admin >> ioTCB unsafeGetShadowFile
Note that the label must be fixed to L here, because we actually update the current label in the labelled IO monad.
This is the gist of module Api.TCB. We can now define a module:
module Api (getTime, getShadowFile) where
import Api.TCB (getTime, getShadowFile)
And we can mark this module as Safe! [link to SafeHaskell wiki page].
From here on, we can outsource development of an API extension in module Api.Ext to a company that employs positively malicious developers. The only thing you need to do is check that it is marked as Safe in the header. Then Api.Ext will:
- be able to use getTime and getShadowFile to implement more complex functions on top;
- provably not construct any action that causes an effect that is not legitimized by the clearance level with which the action is called.
A DCLabel has two components: the secrecy and the integrity. Each component is given by a formula in conjunctive normal form over principals. Note that there is no negation of principals available (discussed further below).
DCLabels are attached to data; they only have meaning when looked at in connection with that data.
Let's look at secrecy first:
False | means ultimate secrecy, nobody is allowed to see the value (useless)
True  | means completely public, everybody is allowed to see the value
P     | means that exactly P is allowed to see the value
We can build
P \/ Q | both P and Q individually are allowed to see the value
P /\ Q | only P and Q together are allowed to see the value
We can not build
not P | everybody but P is allowed to see the value
Let's also look at integrity:
False | means ultimate integrity, nobody can have touched the value (see below)
True  | means completely untrusted, everybody could have touched the value
P     | means that exactly P must have produced the value
We can build
P \/ Q | either P or Q individually might have touched the value
P /\ Q | only P and Q together could have produced the value
We can not build
not P | everybody but P could have touched the value
A natural observation is that secrecy can only increase, and integrity can only decrease: the secrecy of compound data is always the maximum of the secrecy of its components; the integrity of compound data is always the minimum of the integrity of its components.
So if we look at the secrecy lattice, we have
True <= P <= False
And if we look at the integrity lattice, we have
False <= P <= True
The minimal element of the combined lattice is thus
True %% False
which corresponds to "completely public" and "ultimately trusted". This is also the label that is naturally attached to an empty piece of data (e.g. Haskell's () type): for a value of type (), there is no choice. Everybody can know it, so it is naturally public. Also, nobody can or could have messed with it, so it can be ultimately trusted.
The maximal element of the combined lattice is
False %% True
which corresponds to "ultimately secret" and "completely untrusted".
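For a restricted fragment of DCLabels (secrecy a plain conjunction of principals, integrity a plain disjunction, no nested formulas), the lattice operations can be sketched with sets. This is a toy model for intuition, not the lio implementation:

```haskell
module Main where

import qualified Data.Set as Set

type Principal = String

-- Toy model: secrecy is a conjunction of principals ({"P","Q"} means P /\ Q),
-- integrity is a disjunction ({"P","Q"} means P \/ Q). The empty set plays
-- the role of True for secrecy and False for integrity.
data Lbl = Lbl
  { secrecy   :: Set.Set Principal
  , integrity :: Set.Set Principal
  } deriving (Eq, Show)

-- a `canFlowTo` b: b must be at least as secret and at most as trusted.
canFlowTo :: Lbl -> Lbl -> Bool
canFlowTo a b = secrecy a `Set.isSubsetOf` secrecy b
             && integrity a `Set.isSubsetOf` integrity b

-- Least upper bound: secrecy only grows, integrity only degrades.
lub :: Lbl -> Lbl -> Lbl
lub a b = Lbl (secrecy a `Set.union` secrecy b)
              (integrity a `Set.union` integrity b)

main :: IO ()
main = do
  let public  = Lbl Set.empty Set.empty             -- True %% False
      secretP = Lbl (Set.singleton "P") Set.empty
      secretQ = Lbl (Set.singleton "Q") Set.empty
      -- reading P's secret and then Q's secret raises us to P /\ Q:
      afterPQ = public `lub` secretP `lub` secretQ
  if secrecy afterPQ == Set.fromList ["P", "Q"]
       && public `canFlowTo` secretP
       && not (secretP `canFlowTo` public)
    then putStrLn "ok"
    else error "unexpected lattice behaviour"
```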
Now let us look at the lifetime of a program, and at reading and writing labelled data. Let us at first disregard our actual permissions. Let us look at the program as a policy-free and trusted actor that just wants to maintain labelling information properly, but is allowed to do everything in principle. (This corresponds in LIO to assuming we have maximum clearance.)
Under this assumption, we can always read data of any label. However, reading this data effectively combines it with everything we already know. That is why it makes sense to maintain a current label that starts out low (e.g. at the minimal element described above) and increases over time depending on all the things we have seen.
Example for secrecy: We start out with the current label True, indicating that all information we currently have is public. We then read a secret belonging to P. Then the current label's secrecy is raised to P. We then read another secret belonging to Q. Then the current label's secrecy is raised to P /\ Q. We now have secret data available for P and Q together.
(NOTE: This is not the same as the data being available for both P and Q, which would mean either P or Q can read it. P and Q have to cooperate to access data labelled with P /\ Q.)
Example for integrity: We start out with the current label False, which means all information we currently have is ultimately trusted. We then read some information signed by P (i.e., with integrity P). Then our current label's integrity component is raised to P. If we combine this with information signed by Q (i.e., with integrity Q), then our current label's integrity component is raised to P \/ Q. We now have data available that could have been interfered with by either P or Q.
We can also write data to a variable that already has a fixed label attached to it, but we cannot always do so. It is easy to see that we can always write out data at our current label, or at any label higher than that. But if the variable has a label that is lower than our current label, we could potentially leak secret information or have untrusted information taint supposedly trusted data.
Example for secrecy: We start out with the current label True, indicating that all information we currently have is public. At this point, we can write to a variable that is marked with secrecy True, secrecy P, secrecy Q, or even secrecy P /\ Q. Once we read a secret belonging to P, we can no longer write to a variable marked with secrecy True or secrecy Q, but writing to secrecy P or secrecy P /\ Q is still fine. If we then also read a secret belonging to Q, our current secrecy label becomes P /\ Q, and the only variable we can still write to is that of secrecy P /\ Q (or yet higher). Corner case (not exception): we can always write to a variable that is ultimately secret.
Example for integrity: We start out with the current label False, indicating that all information we currently have is ultimately trusted. At this point, we can write to a variable that is marked with integrity False, integrity P, integrity Q, integrity P \/ Q, or integrity True. Once we read data that has integrity P, our current label is raised to integrity P, meaning that all data we have available could have been interfered with by P. We can now only write to variables with integrity P, integrity P \/ Q, or integrity True. If we also read data with integrity Q, our current integrity label is raised to P \/ Q, and we can only write to variables with integrity P \/ Q or integrity True. In fact, we can always write to a variable marked with integrity True, indicating that it is completely untrusted.
Doesn't this feel asymmetric? Reading always succeeds, but writing can fail? Yes, but this is caused by the role of the current label. If we read data, we can simply raise the current label as needed. If we write data, we would need to raise the label of the variable we write to. But LIO's assumption is that variables have fixed labels attached to them. So if we can't raise it, the only other option is to fail.
We can block reading of data as well. This is the role of the current clearance of a computation. The current clearance provides an upper bound to the current label. So at all times the current clearance has to be at least as high as the current label. However, the current clearance cannot be raised, it can only be lowered (voluntarily).
Example for secrecy: Suppose we set the secrecy component of the clearance of a computation to P. We still start out with the current label True. We can read a variable that is secret to P, which will raise the current secrecy label to P. But we can never read a variable that is secret to Q, because this is beyond our clearance. In general, we can read nothing that is too secret for our current clearance.
Example for integrity: If we set the integrity component of the clearance of a computation to P, we still start out with the current label False. We can read a variable that is marked with integrity P, which will raise the current integrity label to P. But we can never read a variable that is marked with integrity Q, because that is beyond our clearance. In general, we can read nothing that would be less trustworthy than indicated by our current clearance.
Clearance also affects writing of data. TODO: explain why ...
TODO: What is a useful / sane value for clearance? I think that typically the secrecy clearance will be set globally, i.e., to P for user P. This ensures that a computation running on behalf of P cannot work in any way with data that is too secret for P. However, the integrity clearance is a different issue. Any user might very well work with untrusted data. We might temporarily want to ensure that we work with trusted data. In this case, we might want to artificially lower the integrity clearance temporarily.
You can read everything that is below your current label without changing your label. You can read everything that is below your current clearance by raising the current label to the join of the old current label and the label of the data you have seen.
You can write to everything that has a label between your current label and your current clearance. Writing never changes the current label.
Special case where the current label is equal to the current clearance: you can read everything below your current clearance / label. You can write only at your current clearance / label.
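The rules above can be stated as small predicates. The sketch below uses the totally ordered L from earlier; lio expresses the same conditions via canFlowTo on arbitrary labels:

```haskell
module Main where

data L = Public | Member | Admin
  deriving (Eq, Ord, Show)

-- May data labelled `lbl` be read under clearance `clr`?
canRead :: L -> L -> Bool
canRead clr lbl = lbl <= clr

-- The current label after reading data labelled `lbl`.
afterRead :: L -> L -> L
afterRead cur lbl = max cur lbl

-- May we write to a slot labelled `target`? It has to lie between
-- the current label and the current clearance.
canWrite :: L -> L -> L -> Bool
canWrite cur clr target = cur <= target && target <= clr

main :: IO ()
main = do
  let cur = Public
      clr = Member
  if canRead clr Member
       && not (canRead clr Admin)          -- beyond clearance
       && afterRead cur Member == Member   -- label is raised by reading
       && canWrite cur clr Member
       && not (canWrite Member clr Public) -- target below current label
    then putStrLn "ok"
    else error "rule check failed"
```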
If you use LIO references that are directly marked with a label, you get everything for free:
newLIORef :: Label l => l -> a -> LIO l (LIORef l a)
This is essentially a write (of the initial data), but as you are creating new data, you can choose the label yourself. So the label has to be between your current label and your current clearance. The current label does not change.
readLIORef :: Label l => LIORef l a -> LIO l a
Reads, so the label has to be below the current clearance, and the current label is raised if necessary.
writeLIORef :: Label l => LIORef l a -> a -> LIO l ()
Writes, so it has exactly the same constraints attached to it as newLIORef. The current label does not change.
modifyLIORef :: Label l => LIORef l a -> (a -> a) -> LIO l ()
Is seen as only a write, because the original value of the reference is not exposed. So it has the same restrictions as writeLIORef.
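The LIORef behaviour can be mimicked outside of LIO with a toy model: an IORef holding the current label, plus a reference with a fixed label. This is not the lio library, just a sketch of the semantics described above (with the clearance implicitly fixed at the top label):

```haskell
module Main where

import Data.IORef

data L = Public | Member | Admin
  deriving (Eq, Ord, Show)

-- A reference with a fixed label attached to it.
data LRef a = LRef L (IORef a)

-- Reading raises the current label to the join with the ref's label.
readLRef :: IORef L -> LRef a -> IO a
readLRef cur (LRef l r) = do
  modifyIORef cur (max l)
  readIORef r

-- Writing is only allowed if the current label flows to the ref's label.
writeLRef :: IORef L -> LRef a -> a -> IO ()
writeLRef cur (LRef l r) x = do
  c <- readIORef cur
  if c <= l
    then writeIORef r x
    else ioError (userError "current label exceeds target label")

main :: IO ()
main = do
  cur    <- newIORef Public
  pub    <- LRef Public <$> newIORef "hello"
  secret <- LRef Admin  <$> newIORef "s3cret"
  writeLRef cur pub "still fine"   -- Public flows to Public: ok
  _ <- readLRef cur secret         -- raises the current label to Admin
  c <- readIORef cur
  if c == Admin then putStrLn "ok" else error "label was not raised"
```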
It is natural that everyone has a certain amount of control over the secrecy and integrity of data, going against the natural order of everything automatically becoming only more secret and less trusted.
For example, if I receive data marked secret to me, I can very well decide that I don't in fact consider it secret, and mark it public. However, if I receive data that is marked secret to me (P) and someone else (Q), i.e., has secrecy P /\ Q, then I can at most remove my secrecy component and declassify it to secrecy Q. I cannot make it completely public, however.
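In a restricted set model (secrecy as a plain conjunction of principals), declassification by a principal simply removes that principal from the secrecy component, and nothing more. A toy sketch, not the lio API:

```haskell
module Main where

import qualified Data.Set as Set

type Principal = String

-- Secrecy as a conjunction of principals: {"P","Q"} stands for P /\ Q.
type Secrecy = Set.Set Principal

-- Declassification: P may remove only its own component. Declassifying
-- P /\ Q as P yields Q, never complete publicity.
declassify :: Principal -> Secrecy -> Secrecy
declassify me = Set.delete me

main :: IO ()
main = do
  let pAndQ = Set.fromList ["P", "Q"]
  if declassify "P" pAndQ == Set.singleton "Q"
       && declassify "P" (Set.singleton "P") == Set.empty
    then putStrLn "ok"
    else error "declassification check failed"
```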
Does that mean that I can't myself read it any more after declassification?
Dually, if I receive untrusted data, I can check it and declare that it is, in fact, trusted by me. I can sign it. It then has integrity P. I can also sign data that already has higher integrity than True, e.g., integrity Q. If I then sign it, it gets integrity P /\ Q. However, I cannot usually sign data in any other way. So I cannot take untrusted data and suddenly claim it has integrity Q.
This is covered in LIO by the concept of privileges.
TODO: As I understand it, privs are values that I can give untrusted code to do things that they are not supposed to do according to their LIO state. I thought declassification and endorsement would in principle work without privs, but I need to think more about this.