In the previous post, I investigated and compared a couple of approaches to implementing an extensible object format in Haskell, motivated by requirements of the JSON Object Signing and Encryption (JOSE) formats. In this post we push forward with the record approach and implement the notion of critical extensions.
Several extensible object formats, particularly in the security sphere, include the concept of "critical" extensions. The short story is that when an implementation encounters a critical extension, if it does not recognise the extension it must refuse to process the object. There are a multitude of scenarios where this is important.
For example, the X.509 certificate format (RFC 5280) has a Subject Alternative Name (SAN) extension. Normally this extension is non-critical; if an implementation doesn't recognise it, no big deal, just ignore it and carry on. But if the certificate has an empty Subject Distinguished Name, then the SAN extension alone provides information about the certificate subject. In that case the extension must be marked critical, and implementations that do not recognise the SAN extension must reject the certificate.
The JOSE formats, specifically JSON Web Signature (JWS) (RFC 7515) and JSON Web Encryption (JWE) (RFC 7516) also support critical extensions. Whereas in X.509 each extension instance on a certificate carries a boolean flag to indicate criticality...
Extension ::= SEQUENCE { extnID OBJECT IDENTIFIER, critical BOOLEAN DEFAULT FALSE, extnValue OCTET STRING }
JWS and JWE support a header field containing a list of keys of the critical parameters in the object. An excerpt from RFC 7515 explains:
The "crit" (critical) Header Parameter indicates that extensions to this specification and/or [JWA] are being used that MUST be understood and processed. Its value is an array listing the Header Parameter names present in the JOSE Header that use those extensions. If any of the listed extension Header Parameters are not understood and supported by the recipient, then the JWS is invalid.
It goes on to state that the list (if it appears) must not be empty, must not contains keys that are not present in the object, must not contain duplicate keys, and so on.
Let's define a small subset of JWS to play with, which we'll call
J. Like its big brother, J objects get passed around as JSON.
A J object contains a "signature"
which for the purposes of
this post we'll say is a freeform string, and a "header"
(the J
Header). The J Header is another object that contains an "alg"
(algorithm) which again we'll pretend is a freeform string, and an
optional "crit"
array as defined above.
When defining extensions to J, it is the J Header that gets extended. No new fields ever get added to the other J object, so we define a concrete :hs:`JHeader` record type, and a :hs:`J` record type that has a type parameter for the header field.
data JHeader = JHeader
{ _alg :: String
, _crit :: Maybe (NonEmpty Text)
}
deriving (Show)
data J a = J
{ _header :: a
, _sig :: String
}
deriving (Show)
Note that although :hs:`Maybe (NonEmpty Text)` is isomorphic
to plain old :hs:`[String]`, the former encodes the presence or
absense of the "crit"
field as well as enforces the requirement
that the "crit"
array be non-empty. For the purposes of this
post we ignore the requirement that the array not contain duplicate
values.
We're going to use classy optics so let's define a class containing lenses for the fields of a J Header:
class HasJHeader a where
jHeader :: Lens' a JHeader
alg :: Lens' a String
alg = ... -- default implementation
crit :: Lens' a (Maybe (NonEmpty Text))
crit = ... -- default implementation
instance HasJHeader JHeader where
jHeader = id
Instead of writing the above by hand, lens provides a Template Haskell function to automate it. Some people avoid Template Haskell but if the shoe exactly fits I don't mind using it.
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens.TH (makeClassy)
data JHeader = ... -- as above
makeClassy ''JHeader
Now we can write :hs:`FromJSON` instances for :hs:`JHeader` and :hs:`J`. (We'll ignore :hs:`ToJSON` in this post but it is required in the real jose library).
instance FromJSON JHeader where
parseJSON = withObject "J Header" $ \o -> JHeader
<$> o .: "alg"
<*> o .:? "crit"
instance (FromJSON a) => FromJSON (J a) where
parseJSON = withObject "J" $ \o -> J
<$> o .: "header"
<*> o .: "signature"
At this point, we are parsing the "crit"
parameter but not yet
validating it.
In J there are two requirements for validating the "crit"
header:
- Keys must be present in the JSON object
- Unrecognised keys must cause the object to be rejected
Let's address 1. by enhancing the :hs:`JHeader` parser to check that
each element in the "crit"
header is present in the object.
instance FromJSON JHeader where
parseJSON = withObject "J Header" $ \o ->
(JHeader <$> o .: "alg" <*> o .:? "crit")
>>= checkCrits o
checkCrit :: Monad m => Object -> String -> m String
checkCrit o k
| not (k `member` o) = fail "crit key not present"
| otherwise = pure k
checkCrits :: Monad m => Object -> JHeader -> m JHeader
checkCrits o = crit (mapM (mapM checkCrit o))
In :hs:`checkCrits` we use the :hs:`crit` lens and a couple of applications of :hs:`mapM` to apply :hs:`checkCrit` to each listed extension, within the monadic parser context.
The following ghci
transcript demonstrates:
λ> eitherDecode "{\"alg\":\"\"}" :: Either String JHeader Right (JHeader {_alg = "", _crit = Nothing}) λ> eitherDecode "{\"alg\":\"\",\"crit\":[\"x\"]}" :: Either String JHeader Left "Error in $: crit key not present" λ> eitherDecode "{\"alg\":\"\",\"crit\":[\"x\"],\"x\":\"\"}" :: Either String JHeader Right (JHeader {_alg = "", _crit = Just ("x" :| [])})
So we have addressed requirement 1. As for requirement 2, we can
immediately see a problem in the final example above: J Header
does not define the "x"
parameter, so this object should be
rejected!
J does not have any extensions, so let's define K. A K object
is a J object with one additional header field: "ext"
, a
freeform string. In Haskell:
data KHeader = KHeader
{ _kJHeader :: JHeader
, _ext :: String
}
deriving (Show)
makeClassy ''KHeader
instance HasJHeader KHeader where
jHeader = kJHeader
type K = J KHeader
Now the :hs:`FromJSON` instance:
instance FromJSON KHeader where
parseJSON = withObject "K Header" $ \o ->
KHeader <$> parseJSON (Object o) <*> o .: "ext"
But now we need a way to tell the :hs:`JHeader` parser about the K's extensions! Well, first we need to be able to associate a list of "known extensions" with arbitrary datatypes. For that, we will make a new type class (recall that JWE as well as JWS uses this extension mechanism, so there is no point making it part of the :hs:`HasJHeader` class).
class HasParams a where
extensions :: Proxy a -> [Text]
We use a :hs:`Data.Proxy.Proxy` as the argument because we don't need to convey any information at the value level; we only use it to select the correct instance. It also allows any person reading the type signature to know that it is a constant function.
Now, as well as being about to find out what extensions are supported by a particular type, we need a way to for information about the outer type (:hs:`KHeader`) to be "pushed down" to where it will actually be used in the :hs:`JHeader` parser. We cannot change the :hs:`FromJSON` API, but we're never going to be parsing header objects on their own anyway - they'll always be included in the larger J object, so we can dispense with the :hs:`FromJSON` instances for the header objects and add the parsing functions to :hs:`HasParams` instead.
class HasParams a where
extensions :: Proxy a -> [Text]
parseParamsFor :: HasParams b => Proxy b -> Value -> Parser a
instance HasParams JHeader where
extensions = const []
parseParamsFor proxy = withObject "J Header" $ \o ->
(JHeader <$> o .: "alg" <*> o .:? "crit") >>= checkCrits o
instance HasParams KHeader where
extensions = const ["ext"]
parseParamsFor proxy = withObject "K Header" $ \o ->
KHeader <$> parseParamsFor proxy (Object o) <*> o .: "ext"
In :hs:`parseParamsFor` we can see the proxy being "passed down" through :hs:`KHeader` to :hs:`JHeader` where it is as yet unused (we will fix that soon).
:hs:`parseParamsFor` at the type of interest requires a proxy for
that same "outermost" type: b
and a
will be the same type.
We can write a convenience function to relieve the user of having to
construct the :hs:`Proxy` value themselves. The
ScopedTypeVariables
language extension is needed to do this.
parseParams :: forall a. HasParams a => Value -> Parser a
parseParams = parseParamsFor (Proxy :: Proxy a)
Now we must update :hs:`checkCrit` and related functions, parameterising them over the list of known extensions and failing if an unrecognised key is encountered. The JHeader parser must also be updated; it will use the proxy to retrieve the list of known extensions.
checkCrit :: Monad m => [Text] -> Object -> Text -> m Text
checkCrit exts o k
| not (k `member` o) = fail "crit key not present"
| k `notElem` exts = fail "crit key not understood"
| otherwise = pure k
checkCrits :: Monad m => [Text] -> Object -> JHeader -> m JHeader
checkCrits exts o = crit (mapM (mapM (checkCrit exts o)))
instance HasParams JHeader where
extensions = const []
parseParamsFor proxy = withObject "J Header" $ \o ->
(JHeader <$> o .: "alg" <*> o .:? "crit")
>>= checkCrits (extensions proxy) o
Finally we update the :haskell:`J a` parser to use :hs:`parseParams` for the header parameter:
instance (HasParams a) => FromJSON (J a) where
parseJSON = withObject "J" $ \o -> J
<$> (o .: "header" >>= parseParams)
<*> o .: "signature"
Let's see the results in ghci
as we try to parse a J object
and a K object with a critical "ext"
parameter:
λ> let s = "{\"signature\":\"\",\"header\":{\"alg\":\"\",\"crit\":[\"ext\"],\"ext\":\"\"}}" λ> eitherDecode s :: Either String (J JHeader) Left "Error in $: crit key not understood" λ> eitherDecode s :: Either String (J KHeader) Right (J {_header = KHeader {_kJHeader = JHeader {_alg = "", _crit = Just ("ext" :| [])}, _ext = ""}, _sig = ""})
Perfect!
As has just been demonstrated, we are able to parse arbitrary header
types - ostensibly, extensions of the base J Header object -
whilst ensuring that the innermost object's "crit"
parameter
checking logic is sufficient even for critical extensions defined by
other types. The overhead for library authors is small: they only
need to declare what extensions are understood by their type, and
ensure that their parser passes a proxy value on to the parser for
the encapsulated type.
What has been sacrificed to accomplish this? We no longer have
FromJSON
instances for the header types - but we didn't really
need them anyway. This has an impact on testing: you can't test
just the parsing of the header type. But a :hs:`newtype` and a
simple :hs:`FromJSON` instance will get you over the line.
In this whole post we ignored the :hs:`ToJSON` instances, which are
likely to be necessary if you're using J (or the real JOSE
formats) in anger. The instances are straightforward but reveal a
problem with modelling the "crit"
parameter as :hs:`NonEmpty
Text`. If the field accessor (or corresponding lens) is exported
users could add any old value and construct an invalid object.
Various workarounds are possible: functions that only allow you to
add values that appear in the :hs:`extensions` list is one possible
approach.
Our J specification ignored a couple of other requirements that exist in JOSE. The first is that parameter keys defined by JOSE itself are prohibited for some reason (presumably because they would be redundant). In my jose library, this requirement is met by parameterising :hs:`checkCrit` over the list of reserved keys as well as the list of known extensions. The remaining requirement is that duplicate keys are prohibited. This can be addressed by using a non-empty set data structure. I am not aware of any Haskell library providing such a structure but I have written one as part of a different library, so I will probably extract it and publish it as a standalone package so that I can use it in jose (and others can use it, too).
Finally, there is craziness in JOSE headers that I have not even touched on in this post, notably the notion of "protected" and "unprotected" headers: a concept which introduces a lot of complexity. I might write about it in a future post, but if you're interested and/or experiencing a lack of being mad you could read about it in RFC 7515.