frasertweedale/crit.rst Secret

## crit.rst

      
    Raw
  

              crit.rst
            
          
    Extensible Objects part 2: parsing critical extensions

In the previous post, I investigated and compared a couple of
approaches to implementing an extensible object format in Haskell,
motivated by requirements of the JSON Object Signing and Encryption
(JOSE) formats.  In this post we push forward with the record
approach and implement the notion of critical extensions.

Criticality

Several extensible object formats, particularly in the security
sphere, include the concept of "critical" extensions.  The short
story is that when an implementation encounters a critical
extension, if it does not recognise the extension it must refuse to
process the object.  There are a multitude of scenarios where this
is important.
For example, the X.509 certificate format (RFC 5280) has a
Subject Alternative Name (SAN) extension.  Normally this extension
is non-critical; if an implementation doesn't recognise it, no big
deal, just ignore it and carry on.  But if the certificate has an
empty Subject Distinguished Name, then the SAN extension alone
provides information about the certificate subject.  In that case
the extension must be marked critical, and implementations that do
not recognise the SAN extension must reject the certificate.
The JOSE formats, specifically JSON Web Signature (JWS) (RFC
7515) and JSON Web Encryption (JWE) (RFC 7516) also support
critical extensions.  Whereas in X.509 each extension instance on a
certificate carries a boolean flag to indicate criticality...
Extension  ::=  SEQUENCE  {
   extnID      OBJECT IDENTIFIER,
   critical    BOOLEAN DEFAULT FALSE,
   extnValue   OCTET STRING
   }

JWS and JWE support a header field containing a list of keys of the
critical parameters in the object.  An excerpt from RFC 7515
explains:

The "crit" (critical) Header Parameter indicates that extensions to
this specification and/or [JWA] are being used that MUST be
understood and processed.  Its value is an array listing the Header
Parameter names present in the JOSE Header that use those extensions.
If any of the listed extension Header Parameters are not understood
and supported by the recipient, then the JWS is invalid.
It goes on to state that the list (if it appears) must not be empty,
must not contains keys that are not present in the object, must not
contain duplicate keys, and so on.

Data types for (a subset of) JWS

Let's define a small subset of JWS to play with, which we'll call
J.  Like its big brother, J objects get passed around as JSON.
A J object contains a "signature" which for the purposes of
this post we'll say is a freeform string, and a "header" (the J
Header).  The J Header is another object that contains an "alg"
(algorithm) which again we'll pretend is a freeform string, and an
optional "crit" array as defined above.
When defining extensions to J, it is the J Header that gets
extended.  No new fields ever get added to the other J object, so
we define a concrete :hs:`JHeader` record type, and a :hs:`J` record
type that has a type parameter for the header field.
data JHeader = JHeader
  { _alg :: String
  , _crit :: Maybe (NonEmpty Text)
  }
  deriving (Show)

data J a = J
  { _header :: a
  , _sig :: String
  }
  deriving (Show)
Note that although :hs:`Maybe (NonEmpty Text)` is isomorphic
to plain old :hs:`[String]`, the former encodes the presence or
absense of the "crit" field as well as enforces the requirement
that the "crit" array be non-empty.  For the purposes of this
post we ignore the requirement that the array not contain duplicate
values.
We're going to use classy optics so let's define a class containing
lenses for the fields of a J Header:
class HasJHeader a where
  jHeader :: Lens' a JHeader
  alg :: Lens' a String
  alg = ... -- default implementation
  crit :: Lens' a (Maybe (NonEmpty Text))
  crit = ... -- default implementation

instance HasJHeader JHeader where
  jHeader = id
Instead of writing the above by hand, lens provides a Template
Haskell function to automate it.  Some people avoid Template
Haskell but if the shoe exactly fits I don't mind using it.
{-# LANGUAGE TemplateHaskell #-}

import Control.Lens.TH (makeClassy)

data JHeader = ...  -- as above
makeClassy ''JHeader
Now we can write :hs:`FromJSON` instances for :hs:`JHeader` and
:hs:`J`.  (We'll ignore :hs:`ToJSON` in this post but it is required
in the real jose library).
instance FromJSON JHeader where
  parseJSON = withObject "J Header" $ \o -> JHeader
    <$> o .: "alg"
    <*> o .:? "crit"

instance (FromJSON a) => FromJSON (J a) where
  parseJSON = withObject "J" $ \o -> J
    <$> o .: "header"
    <*> o .: "signature"
At this point, we are parsing the "crit" parameter but not yet
validating it.

Validating critical headers

In J there are two requirements for validating the "crit"
header:

Keys must be present in the JSON object
Unrecognised keys must cause the object to be rejected

Let's address 1. by enhancing the :hs:`JHeader` parser to check that
each element in the "crit" header is present in the object.
instance FromJSON JHeader where
  parseJSON = withObject "J Header" $ \o ->
    (JHeader <$> o .: "alg" <*> o .:? "crit")
      >>= checkCrits o

checkCrit :: Monad m => Object -> String -> m String
checkCrit o k
  | not (k `member` o)  = fail "crit key not present"
  | otherwise           = pure k

checkCrits :: Monad m => Object -> JHeader -> m JHeader
checkCrits o = crit (mapM (mapM checkCrit o))
In :hs:`checkCrits` we use the :hs:`crit` lens and a couple of
applications of :hs:`mapM` to apply :hs:`checkCrit` to each listed
extension, within the monadic parser context.
The following ghci transcript demonstrates:
λ> eitherDecode "{\"alg\":\"\"}" :: Either String JHeader
Right (JHeader {_alg = "", _crit = Nothing})

λ> eitherDecode "{\"alg\":\"\",\"crit\":[\"x\"]}" :: Either String JHeader
Left "Error in $: crit key not present"

λ> eitherDecode "{\"alg\":\"\",\"crit\":[\"x\"],\"x\":\"\"}" :: Either String JHeader
Right (JHeader {_alg = "", _crit = Just ("x" :| [])})

So we have addressed requirement 1.  As for requirement 2, we can
immediately see a problem in the final example above: J Header
does not define the "x" parameter, so this object should be
rejected!

Extending J

J does not have any extensions, so let's define K.  A K object
is a J object with one additional header field: "ext", a
freeform string.  In Haskell:
data KHeader = KHeader
  { _kJHeader :: JHeader
  , _ext :: String
  }
  deriving (Show)
makeClassy ''KHeader

instance HasJHeader KHeader where
  jHeader = kJHeader

type K = J KHeader
Now the :hs:`FromJSON` instance:
instance FromJSON KHeader where
  parseJSON = withObject "K Header" $ \o ->
    KHeader <$> parseJSON (Object o) <*> o .: "ext"
But now we need a way to tell the :hs:`JHeader` parser about the
K's extensions!  Well, first we need to be able to associate a
list of "known extensions" with arbitrary datatypes.  For that, we
will make a new type class (recall that JWE as well as JWS uses this
extension mechanism, so there is no point making it part of the
:hs:`HasJHeader` class).
class HasParams a where
  extensions :: Proxy a -> [Text]
We use a :hs:`Data.Proxy.Proxy` as the argument because we don't
need to convey any information at the value level; we only use it to
select the correct instance.  It also allows any person reading the
type signature to know that it is a constant function.
Now, as well as being about to find out what extensions are
supported by a particular type, we need a way to for information
about the outer type (:hs:`KHeader`) to be "pushed down" to where it
will actually be used in the :hs:`JHeader` parser.  We cannot change
the :hs:`FromJSON` API, but we're never going to be parsing header
objects on their own anyway - they'll always be included in the
larger J object, so we can dispense with the :hs:`FromJSON`
instances for the header objects and add the parsing functions to
:hs:`HasParams` instead.
class HasParams a where
  extensions :: Proxy a -> [Text]
  parseParamsFor :: HasParams b => Proxy b -> Value -> Parser a

instance HasParams JHeader where
  extensions = const []
  parseParamsFor proxy = withObject "J Header" $ \o ->
    (JHeader <$> o .: "alg" <*> o .:? "crit") >>= checkCrits o

instance HasParams KHeader where
  extensions = const ["ext"]
  parseParamsFor proxy = withObject "K Header" $ \o ->
    KHeader <$> parseParamsFor proxy (Object o) <*> o .: "ext"
In :hs:`parseParamsFor` we can see the proxy being "passed down"
through :hs:`KHeader` to :hs:`JHeader` where it is as yet unused
(we will fix that soon).
:hs:`parseParamsFor` at the type of interest requires a proxy for
that same "outermost" type: b and a will be the same type.
We can write a convenience function to relieve the user of having to
construct the :hs:`Proxy` value themselves.  The
ScopedTypeVariables language extension is needed to do this.
parseParams :: forall a. HasParams a => Value -> Parser a
parseParams = parseParamsFor (Proxy :: Proxy a)
Now we must update :hs:`checkCrit` and related functions,
parameterising them over the list of known extensions and failing if
an unrecognised key is encountered.  The JHeader parser must also
be updated; it will use the proxy to retrieve the list of known
extensions.
checkCrit :: Monad m => [Text] -> Object -> Text -> m Text
checkCrit exts o k
  | not (k `member` o)  = fail "crit key not present"
  | k `notElem` exts    = fail "crit key not understood"
  | otherwise           = pure k

checkCrits :: Monad m => [Text] -> Object -> JHeader -> m JHeader
checkCrits exts o = crit (mapM (mapM (checkCrit exts o)))

instance HasParams JHeader where
  extensions = const []
  parseParamsFor proxy = withObject "J Header" $ \o ->
    (JHeader <$> o .: "alg" <*> o .:? "crit")
      >>= checkCrits (extensions proxy) o
Finally we update the :haskell:`J a` parser to use
:hs:`parseParams` for the header parameter:
instance (HasParams a) => FromJSON (J a) where
  parseJSON = withObject "J" $ \o -> J
    <$> (o .: "header" >>= parseParams)
    <*> o .: "signature"
Let's see the results in ghci as we try to parse a J object
and a K object with a critical "ext" parameter:
λ> let s = "{\"signature\":\"\",\"header\":{\"alg\":\"\",\"crit\":[\"ext\"],\"ext\":\"\"}}"

λ> eitherDecode s :: Either String (J JHeader)
Left "Error in $: crit key not understood"

λ> eitherDecode s :: Either String (J KHeader)
Right (J {_header = KHeader {_kJHeader = JHeader {_alg = "", _crit = Just ("ext" :| [])}, _ext = ""}, _sig = ""})

Perfect!

Conclusion

As has just been demonstrated, we are able to parse arbitrary header
types - ostensibly, extensions of the base J Header object -
whilst ensuring that the innermost object's "crit" parameter
checking logic is sufficient even for critical extensions defined by
other types.  The overhead for library authors is small: they only
need to declare what extensions are understood by their type, and
ensure that their parser passes a proxy value on to the parser for
the encapsulated type.
What has been sacrificed to accomplish this?  We no longer have
FromJSON instances for the header types - but we didn't really
need them anyway.  This has an impact on testing: you can't test
just the parsing of the header type.  But a :hs:`newtype` and a
simple :hs:`FromJSON` instance will get you over the line.
In this whole post we ignored the :hs:`ToJSON` instances, which are
likely to be necessary if you're using J (or the real JOSE
formats) in anger.  The instances are straightforward but reveal a
problem with modelling the "crit" parameter as :hs:`NonEmpty
Text`.  If the field accessor (or corresponding lens) is exported
users could add any old value and construct an invalid object.
Various workarounds are possible: functions that only allow you to
add values that appear in the :hs:`extensions` list is one possible
approach.
Our J specification ignored a couple of other requirements that
exist in JOSE.  The first is that parameter keys defined by JOSE
itself are prohibited for some reason (presumably because they would
be redundant).  In my jose library, this requirement is met by
parameterising :hs:`checkCrit` over the list of reserved keys as
well as the list of known extensions.  The remaining requirement is
that duplicate keys are prohibited.  This can be addressed by using
a non-empty set data structure.  I am not aware of any Haskell
library providing such a structure but I have written one as part of
a different library, so I will probably extract it and publish it as
a standalone package so that I can use it in jose (and others can
use it, too).
Finally, there is craziness in JOSE headers that I have not even
touched on in this post, notably the notion of "protected" and
"unprotected" headers: a concept which introduces a lot of
complexity.  I might write about it in a future post, but if you're
interested and/or experiencing a lack of being mad you could read
about it in RFC 7515.