@THS-on
Last active March 6, 2022 10:22
Non atomic quotes for Keylime

Non atomic Quotes for attestation

Issue

A TPM contains multiple PCRs and can generate a signed quote over a hash of the concatenated values of a selection of PCRs. The quote itself does not contain the PCR values. To obtain a quote with matching PCR values, most implementations (including Keylime) use the following trick:

  1. Read PCR values (8 at a time)
  2. Generate quote
  3. Read PCR values (8 at a time)
  4. Check if the PCR values from steps 1 and 3 match; if not, start again at step 1.
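
The read, quote, read, compare loop above can be sketched as follows. This is only an illustration: `read_pcrs` and `generate_quote` are hypothetical callables standing in for the actual TPM access, and the `max_attempts` limit is an assumption added so the sketch terminates (the steps above simply retry).

```python
from typing import Callable, Dict, Tuple

PcrValues = Dict[int, bytes]


def atomic_quote(
    read_pcrs: Callable[[], PcrValues],
    generate_quote: Callable[[], bytes],
    max_attempts: int = 5,  # assumption: the original steps have no retry limit
) -> Tuple[bytes, PcrValues]:
    """Retry until the PCR values are identical before and after quoting."""
    for _ in range(max_attempts):
        before = read_pcrs()       # step 1
        quote = generate_quote()   # step 2
        after = read_pcrs()        # step 3
        if before == after:        # step 4
            return quote, after
    raise RuntimeError("PCRs kept changing; no atomic quote obtained")
```

With a frequently extended PCR 10, every attempt costs one quote signature, which is why the loop becomes expensive once IMA is enabled.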

This works fine if the PCR values are essentially static, which is the case for all the PCRs used during UEFI Secure Boot. It is not the case when IMA is enabled, because IMA extends PCR 10 quite frequently.

When IMA is enabled, this can cause unintentional attestation failures because an atomic quote may never be generated. Quote signing is also a computationally expensive operation that might block the TPM from performing other actions.

Current Implementation in Keylime

For the verifier to attest the agent, the agent sends the following information:

  1. quote
  2. signature for the quote
  3. PCR values, checked with the method described above to match the hash in the quote.
    1. We currently use a binary data structure from tpm2-tools for that
  4. IMA log (optional)
  5. UEFI log (optional)
  6. NK transport key measured into PCR 16 (might not be always sent)

Then the verifier does the following steps:

  1. Check if the hash in the quote matches the hash of the concatenated PCR values
  2. Check that the signature, quote and AK of the agent match
  3. (Validate data quote for NK against PCR 16 sent by the agent)
  4. (optional) Validate UEFI log
    1. Walk the UEFI log and get the computed PCR values
    2. Validate the UEFI log against a measured boot policy
    3. Check the computed PCR values against the PCR values sent by the agent
  5. (optional) IMA validation
    1. Validate an entry of the IMA log
    2. compute running hash for PCR 10
    3. check if running hash matches PCR 10 sent by the agent
      1. if yes stop
      2. if no goto i. or fail if there are no more entries
  6. Validate static PCRs (all PCRs that were not covered by UEFI or IMA log validation)
    1. Check PCR value against static allow list

    2. Check if all PCRs that should be validated are now actually validated
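
As a rough illustration of step 1, the hash that the verifier compares against the quote can be recomputed from the PCR values sent by the agent. The sketch below assumes a single PCR bank, values concatenated in ascending PCR index order, and SHA-256 as the digest algorithm; `pcr_digest` is a hypothetical helper name, not an existing Keylime function.

```python
import hashlib
from typing import Dict


def pcr_digest(pcrs: Dict[int, bytes], hash_alg: str = "sha256") -> bytes:
    """Hash the selected PCR values, concatenated in ascending index order."""
    h = hashlib.new(hash_alg)
    for index in sorted(pcrs):
        h.update(pcrs[index])
    return h.digest()
```

The verifier would compare the result with the digest carried inside the signed quote; any single changed PCR value changes the digest.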

This model has several disadvantages:

  • It requires that the quote and sent PCR values match exactly
  • More complex validators (e.g. the ImaBuf validator or measured boot policies) run before the integrity of the data is fully validated against the quote. This is not directly a security issue, but it increases the attack surface of Keylime.

Moving to non atomic quotes

We do not actually require the PCR values and the quote to be atomic to implement any of the functionality above (assuming that no PCR other than PCR 10 changes frequently).

The agent still sends the same data as above, with the difference that the PCR values might not match the quote.

The verification steps are now:

  1. Check if the signature, quote and AK match
  2. (skip if no UEFI log validation enabled) walk UEFI log and only save computed PCRs
  3. Build a list from PCR 16 sent by the agent and the computed PCRs from the UEFI log. Where computed values are not present, use the selected PCR values sent by the agent.
  4. (skip if no IMA log validation enabled) IMA entry structure validation. In this step we iterate over the log until we find a running hash that matches the quote. If there are no external failures this should always work, because entries are first added to the IMA log and then measured into the TPM. In this step we only validate the structure of each entry (the hash of the entire struct), not its content. For the first iteration, start with the comparison in 4.i, because the quote might already match before we have validated a single entry (this happens often with incremental attestation).
    1. Compute the hash over the running hash for PCR 10 and the list from above. Stop if it matches the quote, or fail if there are no more entries and no match was found. We also save which entry of the IMA log was the last one and later validate the content only up to that point.
    2. Parse and compute hash of IMA entry and update running hash for PCR 10
    3. goto i.
  5. If now the running hash matches the quote, we can assume that all the PCR values and data are valid.
  6. (optional) Validate UEFI log using a measured boot policy
    1. Parse UEFI log into JSON format
    2. Run the measured boot policy and produce a failure if the policy fails
  7. (optional) Validate content of IMA entries
    1. run complex validator on IMA entry and keep track of any failures
    2. goto i if not last entry.
    3. Output all the failures
  8. Validate static PCRs (all PCRs that were not covered by UEFI or IMA log validation)
    1. Check PCR value against static allow list
  9. Check if all PCRs that should be validated are now actually validated
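
Step 4 is the core of the proposal, so here is a minimal sketch of it. It assumes a single SHA-256 bank, that each IMA entry has already been reduced to the digest that was extended into PCR 10, and that the quote digest is the hash of the selected PCR values in ascending index order. All function names are hypothetical.

```python
import hashlib
from typing import Dict, Sequence, Tuple


def extend(pcr: bytes, measured: bytes, alg: str = "sha256") -> bytes:
    """TPM extend: new PCR value = H(old value || measured digest)."""
    return hashlib.new(alg, pcr + measured).digest()


def quote_digest(pcrs: Dict[int, bytes], alg: str = "sha256") -> bytes:
    """Hash of the selected PCR values in ascending index order."""
    h = hashlib.new(alg)
    for index in sorted(pcrs):
        h.update(pcrs[index])
    return h.digest()


def find_matching_entry(
    expected_digest: bytes,
    other_pcrs: Dict[int, bytes],
    entry_digests: Sequence[bytes],
    running_pcr10: bytes,
    ima_pcr: int = 10,
) -> Tuple[int, bytes]:
    """Return (entries consumed, PCR 10 value) once the quote matches."""
    consumed = 0
    while True:
        candidate = dict(other_pcrs)
        candidate[ima_pcr] = running_pcr10
        # Compare first: the quote may match before any entry is consumed.
        if quote_digest(candidate) == expected_digest:
            return consumed, running_pcr10
        if consumed == len(entry_digests):
            raise ValueError("IMA log exhausted without matching the quote")
        # Fold the next entry into the running hash and try again.
        running_pcr10 = extend(running_pcr10, entry_digests[consumed])
        consumed += 1
```

The returned count marks the last entry covered by the quote, so content validation in step 7 can stop at that point and ignore entries that were logged after the quote was generated.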

Up to step 5, we return early if a failure occurs. After that, we collect failures and handle them according to their severity level. More information on that can be found here: https://github.com/keylime/enhancements/blob/master/46_revocation_severity_and_context.md

Design considerations

Verifier

Currently, Keylime puts most of the PCR validation into an abstract TPM which does the necessary calls to tpm2-tools for validation. This made sense for supporting TPM 1.2 and TPM 2.0 and for sharing the code with the agent. We no longer support TPM 1.2, and in the long term the Python agent will be deprecated and removed. Therefore the new validation code should have the following properties:

  1. Content validation of logs (UEFI, IMA) should be fully separate from testing that the quote and data are valid
  2. This should allow us to add new data for validation easily
  3. If there is a new TPM or a similar (Pluton??) protocol, we should be able to add support for it easily without changing our data validation
  4. Quote validation is abstracted in a way that the current dependency on tpm2-tools can be swapped out, for example with tpm2-pytss
  5. Easily unit testable. The current code is only covered through end-to-end testing.

With that in mind the proposed steps from above can be implemented without changing how users currently use Keylime.
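
One possible shape for this separation is sketched below. The interfaces `QuoteVerifier` and `ContentValidator` and the function `attest` are hypothetical names invented for illustration; they are not existing Keylime or tpm2-pytss APIs. The point is only that the quote backend and the log content validators do not know about each other, which also makes each piece unit testable with simple fakes.

```python
from typing import Dict, List, Protocol, Sequence


class QuoteVerifier(Protocol):
    """Hypothetical backend interface (tpm2-tools, tpm2-pytss, ...)."""

    def pcr_digest(self, quote: bytes, signature: bytes, ak_public: bytes) -> bytes:
        """Verify the signature and return the PCR digest from the quote."""
        ...


class ContentValidator(Protocol):
    """Hypothetical interface for UEFI or IMA content validation."""

    name: str

    def validate(self, data: bytes) -> List[str]:
        """Return failure messages (empty if the data is acceptable)."""
        ...


def attest(
    backend: QuoteVerifier,
    quote: bytes,
    signature: bytes,
    ak_public: bytes,
    expected_digest: bytes,
    validators: Sequence[ContentValidator],
    logs: Dict[str, bytes],
) -> List[str]:
    # Integrity first: fail early if the quote does not cover the data.
    if backend.pcr_digest(quote, signature, ak_public) != expected_digest:
        raise ValueError("quote does not match the expected PCR digest")
    # Content validation afterwards: collect failures instead of aborting.
    failures: List[str] = []
    for validator in validators:
        failures.extend(validator.validate(logs.get(validator.name, b"")))
    return failures
```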

Agent

The agent needs only one major change, which should simplify the code in most cases. Instead of checking that the PCR values and the quote are atomic, the agent first reads the PCR values, then generates the quote, and sends the data to the verifier.

API changes

With this change we want to reduce the dependency on tpm2-tools. We currently send the PCR values in a tpm2-tools specific data structure (tpm2_pcrs) and use a custom format to encode it together with the quote and signature. This will be replaced by a JSON structure of the following form:

{
    "pcrs": {
        "0": "HEX_ENCODED_VALUE_OF_PCR_0",
        "1": "HEX_ENCODED_VALUE_OF_PCR_1",
        ...
    },
    "quote": "BASE64_ENCODED_VALUE_OF_TPM_QUOTE",
    "signature": "BASE64_ENCODED_VALUE_OF_TPM_QUOTE_SIGNATURE"
}

Only the PCRs that were requested by the verifier are present.
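
A small sketch of how such a payload could be produced and parsed with the Python standard library. It assumes the third key is spelled `signature`, treats the quote and signature as opaque byte strings, and uses the hypothetical helper names `encode_payload` and `decode_payload`.

```python
import base64
import json
from typing import Dict, Tuple


def encode_payload(pcrs: Dict[int, bytes], quote: bytes, signature: bytes) -> str:
    """Serialize PCR values (hex) plus quote and signature (base64) as JSON."""
    return json.dumps({
        "pcrs": {str(index): value.hex() for index, value in sorted(pcrs.items())},
        "quote": base64.b64encode(quote).decode("ascii"),
        "signature": base64.b64encode(signature).decode("ascii"),
    })


def decode_payload(text: str) -> Tuple[Dict[int, bytes], bytes, bytes]:
    """Parse the JSON payload back into PCR values, quote and signature."""
    data = json.loads(text)
    pcrs = {int(index): bytes.fromhex(value) for index, value in data["pcrs"].items()}
    return pcrs, base64.b64decode(data["quote"]), base64.b64decode(data["signature"])
```

Hex for the PCR values keeps them directly comparable with the output of common TPM tooling, which helps when debugging.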

Note that the old 2.0 API still provides all the necessary information. Only the data structures are changed to make implementations simpler, so the verifier can easily support both APIs.

Other ideas related to this change

Clock and firmware validation

The TPM quote also contains clock and firmware information besides the quote hash. Keylime currently does not use this data. The firmware version can be just another data point that is validated like the logs. With the clock, two checks can be implemented:

  1. Checking that there were no changes to the clock (the safe flag is set to true)
  2. Detecting whether the system was rebooted between two quotes, by checking if the clock advances at the right pace and by checking the reset and restart counters. Note that the two counters are obfuscated to make fingerprinting harder, so they can only be checked for equality.

The second point will allow Keylime to easily detect scenarios where a device left the trusted state for a short period of time and then rebooted to get into a trusted state again.
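
The two clock checks could look roughly like this. The `ClockInfo` fields mirror the clock information carried in a TPM 2.0 quote (clock in milliseconds, reset counter, restart counter, safe flag); the function name and the `tolerance_ms` value are assumptions chosen for illustration, and a real policy would need to account for TPM clock drift.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClockInfo:
    clock_ms: int        # TPM clock in milliseconds
    reset_count: int     # obfuscated in quotes: compare for equality only
    restart_count: int   # obfuscated in quotes: compare for equality only
    safe: bool


def check_clock(
    previous: ClockInfo,
    current: ClockInfo,
    elapsed_ms: int,            # verifier-side time between the two quotes
    tolerance_ms: int = 5000,   # hypothetical tolerance
) -> List[str]:
    """Return a list of failures found when comparing two quotes."""
    failures: List[str] = []
    if not current.safe:
        failures.append("clock safe flag is not set")
    if (current.reset_count, current.restart_count) != (
        previous.reset_count,
        previous.restart_count,
    ):
        failures.append("reset or restart counter changed")
    if abs((current.clock_ms - previous.clock_ms) - elapsed_ms) > tolerance_ms:
        failures.append("clock did not advance at the expected pace")
    return failures
```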

Moving from tpm2-tools to tpm2-pytss

There are now Python bindings for the TPM with tpm2-pytss, which implements parsing of TPM specific data structures and makes it possible to implement the quote signature validation fully in Python. Moving the verifier to pytss would allow us to remove external calls to tpm2-tools. It might make sense to put more generic validation code into pytss first before using it in Keylime.


THS-on commented Mar 6, 2022

  1. I've already seen proposals for using PCR 11 along with PCR 10. I suggest that the basic design should accommodate more than one.

Good to know! The issue then is that walking the log until we find a match is a lot harder with two frequently updated PCRs than with just one. (Complexity-wise it might be O(n*m) for n entries in one PCR and m in the other.)

  1. HEX may add more network traffic, but it's minimal after the first quote. The main performance hits are the quote at the agent side and the signature verification at the verifier. The network traffic has no effect on performance in my benchmarks.

Yeah you are right, compared to the IMA log the data is quite small. I'm fine with moving to HEX to make debugging easier.

  1. The tools do many context save and loads. I posted the analysis previously, and it will affect performance. For makecredential, sure, it's not on a critical path. Checkquote is. Does it use the TPM to check the signature?

We use a pure software implementation of make credential that does not use a TPM; the same goes for checkquote. On the server side we use tpm2-tools only for those helper tools and not to interact with a TPM.
