@lrvick
Last active April 11, 2023 06:53
Sig v2 Design

Sig v2 Design (Draft!)

The goal of this document is to describe the desired user experience for the next generation of "sig" and its predecessor "git-signatures".

These were useful prototypes but significant improvement is needed before widespread use.

Challenges

Optional Quality Indication

Sig already uses PGP signatures, but with no metadata to signify what the review means.

Was it a cursory look? A deep dive? How familiar with the subject matter is the reviewer?

What if they didn't even look at the code, but at least compiled it and verified that it is reproducible? This should still be signed too, but it is a different type of review than reading the code.

Metadata tags offering a normalized way to communicate review quality have emerged in similar tools and spec attempts.

Notably:

Of note, Lance has also had exposure to a number of proprietary tools developed in house for various companies.

A wide range of metadata tags may be useful for specific environments but the ones most useful in practice seem to be:

  • thoroughness
  • understanding
  • rating
  • reproducibility

Once one chooses to include metadata, a simple detached signature no longer suffices. The tools and specs above attempt to solve this with an opinionated format that hard-codes various metadata tags and then allows a chosen signature to be included with that report. Crev, git-wotr, and basically all signatures on packages for popular Linux distributions like Debian, Fedora, and Arch take this approach.

The trouble with this approach of many one-off specs is that it will forever require complex custom tooling, even though the ability to add custom metadata tags directly to OpenPGP signatures, which can solve these problems, has existed all along.

The only limiting factor is that any custom development to take advantage of OpenPGP features like this has historically required dealing with GnuPG, which is notoriously unfriendly to modern development sensibilities.

Thankfully this landscape has changed with the sequoia-pgp team making a modern, well-tested, library-first implementation of OpenPGP in Rust, finally giving us a clear path to use the OpenPGP standard to the fullest.

There is now a clear path to implement a simple scheme for detached OpenPGP signatures that contain any review metadata desired.
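As a rough illustration of what "metadata inside the signature" can look like today, the sketch below builds a GnuPG command line that attaches review tags as OpenPGP notations via gpg's `--sig-notation name=value` option. This is only a sketch under assumptions: the `sig.example.org` notation namespace, the `build_sign_command` helper, and the tag list handling are hypothetical and not part of sig, and a Sequoia-based tool would set the same notations through its library API instead.

```python
# Hypothetical notation namespace; user-defined OpenPGP notation names
# take the form "tag@domain".
NOTATION_DOMAIN = "sig.example.org"

# The four tags the document lists as most useful in practice.
ALLOWED_TAGS = ("thoroughness", "understanding", "rating", "reproducibility")


def build_sign_command(path, metadata, domain=NOTATION_DOMAIN):
    """Return a gpg argv for a detached signature carrying review notations."""
    cmd = ["gpg", "--armor"]
    for tag in sorted(metadata):
        if tag not in ALLOWED_TAGS:
            raise ValueError(f"unknown review tag: {tag}")
        value = str(metadata[tag])
        if not value or "\n" in value:
            raise ValueError(f"invalid value for {tag}")
        # --sig-notation attaches a name=value notation to the signature.
        cmd += ["--sig-notation", f"{tag}@{domain}={value}"]
    # Options first, then the command and the file to sign.
    cmd += ["--detach-sign", path]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_sign_command("release.tar.gz", {"thoroughness": 3})))
```

The resulting list can be handed to `subprocess.run` on a machine with gpg and a signing key; nothing here executes gpg by itself.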

First User Experience

Additionally, while the value of the OpenPGP standard, its portability, and its compatibility with hardware signing devices cannot be overstated, the gap between "I have a new Yubikey in my hand and want to review code" and "there is a published signature" is so wide that most give up instantly and seemingly blame the OpenPGP standard for the UX failings of GnuPG.

Neither of these problems is an issue with OpenPGP itself; both are problems with the UX of the tools.

Historically this was not easy to solve because GnuPG didn't have a well developed library to easily embed workflows for an easy and portable user experience in cases like these.

Once again, the existence of a modern library like sequoia-pgp creates a path for a tool like "sig" to fully streamline the first user experience of key creation and publishing, so users can use PGP correctly without actually having to care about it when they are just trying to get some code review done.

Sig should be a one-stop shop that sets users up for success with PGP so they can get their review done, and also leaves them set up to take advantage of all the other wins of PGP with other tools when they are ready.

Desired UX

Here is the outcome we want to target: a streamlined hardware-signed code review process whose resulting signing keys and artifacts can be easily published and discovered, so every review can help secure the entire open source supply chain.

Sig Review

  1. cd to directory containing any open source project you wish to review
  2. Run "sig review"
  3. All code changes between the last signed review you made and HEAD will be displayed as a diff in your chosen git difftool
  4. (template launches to submit review)
  5. User fills in thoroughness, understanding, and rating boxes
  6. User saves.
    1. If needed: "Please insert a PGP compatible smartcard such as a yubikey 5"
    2. If needed: "This smartcard has no keys. Would you like to generate a new set? [Y/n]"
    3. If needed: "Keys generated for UID foo@bar.com. Publish to keyservers? [Y/n]"
    4. If needed: "Keys published to x, y, z servers. For maximum visibility consider publishing to popular locations like Github, Gitlab, Keyoxide, or to your own domain via WKD."
  7. "Signing! Please confirm on your "Yubico 5A" USB device."
  8. "Do you want to save this signature as a Git Note on this Git repository?"
  9. "Would you like to publish this signature to a public signing database so others can benefit from your review?"
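Steps 3 and 8 above map fairly directly onto existing git commands. The sketch below only builds the argument lists; the `signatures` notes ref and both helper names are hypothetical choices for illustration, not something this design has settled on. It uses `git notes append` rather than `add` on the assumption that several reviews of the same commit should be able to coexist in one note.

```python
# Git's well-known SHA-1 id of the empty tree, used as the diff base
# when the reviewer has no earlier signed review of this project.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

# Hypothetical notes ref for storing review signatures.
NOTES_REF = "signatures"


def diff_command(last_reviewed=None, head="HEAD"):
    """Build the difftool call showing everything changed since the last review."""
    base = last_reviewed if last_reviewed is not None else EMPTY_TREE
    # git difftool accepts the same revision arguments as git diff.
    return ["git", "difftool", base, head]


def note_command(signature_path, commit="HEAD"):
    """Build the call that stores a signature file as a git note on a commit."""
    return ["git", "notes", "--ref", NOTES_REF, "append", "-F", signature_path, commit]


if __name__ == "__main__":
    print(" ".join(diff_command("v1.0.0")))
    print(" ".join(note_command("review.asc")))
```

Either list can be passed to `subprocess.run(..., check=True)` from inside the repository being reviewed.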

Sig verify

Assuming simple detached signatures exist, and are optionally published to git notes or a public database, they should now be readily discoverable by "sig verify".

Every organization will have different preferences on verification.

For example a strict org worried about supply chain attacks may want the following before production use:

  • 1+ review by a release engineer with keys from set a with a minimum "reproducibility" of 1
  • 2+ reviews by an internal software engineer from set b with a minimum "thoroughness" of 5 and "understanding" of 5
  • 1+ external review done by trusted entities from set c with a minimum "thoroughness" of 2 and "understanding" of 2

"sig verify" could output any signatures found, but it would be more useful if you could set a standardized policy file in a root repo at an organization to ensure your root repo (and all dependencies) meet the desired policy.

This could prevent untrusted code from ever touching production, putting a full stop to many classes of supply chain attack.

I proposed a simple format that could meet these needs, expressed as JSON or YAML:

[
    {
        "name": "release-engineers",
        "min": 1,
        "members": [
            "05F597E88FAD0763449F8D1F573FD879821C2735",
            "097722C2A2EF2E7AFB2D0C345902E7D5FB4E1ECD"
        ],
        "metadata": {
            "thoroughness": {
                "min": 2
            },
            "understanding": {
                "min": 4
            },
            "rating": {
                "present": true
            }
        }
    },
    {
        "name": "engineers",
        "min": 3,
        "members": [
            "1351878A47640D0812452E5057546E564D259DBB",
            "FE9BE6F2F92C4A3B536D326FCC3160C3C54E50BB",
            "4E27CF523A3880CD4FD3B4532D147A3EE202DBBE"
        ],
        "metadata": {
            "thoroughness": {
                "min": 1
            },
            "understanding": {
                "min": 2
            },
            "rating": {
                "present": true
            }
        }
    }
]
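To make the intended semantics concrete, here is a minimal standard-library sketch of how such a policy might be evaluated. It is not Wiktor's proof of concept. It assumes signatures have already been cryptographically verified elsewhere and reduced to a signer fingerprint plus a dictionary of metadata values; the function names and that input shape are hypothetical. It also assumes that a "min" rule implies the tag must be present, and that each group's "min" counts distinct signers rather than signatures.

```python
import json


def review_meets(required, metadata):
    """Check one review's metadata against a group's "metadata" rules."""
    for tag, rule in required.items():
        value = metadata.get(tag)
        if value is None:
            # Assumption: both "present" and "min" rules need the tag to exist.
            if rule.get("present") or "min" in rule:
                return False
            continue
        if "min" in rule:
            try:
                if int(value) < rule["min"]:
                    return False
            except (TypeError, ValueError):
                return False
    return True


def evaluate_policy(policy, reviews):
    """Return {group name: satisfied?} for already-verified reviews."""
    results = {}
    for group in policy:
        members = {fp.upper() for fp in group["members"]}
        # Count distinct member fingerprints with a qualifying review.
        signers = {
            r["fingerprint"].upper()
            for r in reviews
            if r["fingerprint"].upper() in members
            and review_meets(group.get("metadata", {}), r["metadata"])
        }
        results[group["name"]] = len(signers) >= group["min"]
    return results


POLICY = json.loads("""
[{"name": "release-engineers", "min": 1,
  "members": ["05F597E88FAD0763449F8D1F573FD879821C2735"],
  "metadata": {"thoroughness": {"min": 2}, "rating": {"present": true}}}]
""")

if __name__ == "__main__":
    reviews = [{
        "fingerprint": "05F597E88FAD0763449F8D1F573FD879821C2735",
        "metadata": {"thoroughness": "3", "rating": "positive"},
    }]
    print(evaluate_policy(POLICY, reviews))
```

A caller would treat the repository as acceptable only if every group in the result is satisfied.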

Wiktor of the sequoia-pgp team graciously implemented a proof of concept of this here:

https://gitlab.com/wiktor/lance-verifier#lance-verifier

Note: Lance did not name it.

Questions

  1. Do we integrate the work from wiktor directly into the sequoia-pgp library and/or the sq command line app?
    • Arbitrary policy based m-of-n verification seems generally useful far beyond just code review use cases
  2. Do we keep sig as a simple bash wrapper for tools like gnupg or sq, or rewrite it in Rust, importing sequoia-pgp?
    • The answer to #1 will impact #2 here.
    • Lance is not yet very good at rust so this may slow progress, but not opposed to getting more experience here.
  3. We need to spec and develop a public database to publish and discover open source signatures sig makes
    • Should also be able to harvest and index existing signing formats best effort from linux distros, Crev, etc.
    • "sig verify" could be able to pull down and use a crev or debian signature to satisfy one external signer, etc.
    • Lance would default to developing this in PostgREST for a highly testable SQL-only implementation
    • Anyone could mirror it to a blockchain or IPFS for long-term durability as desired

Funding

Currently we have none, so this will likely not go very quickly.

We know of several independent security engineers who would love to work on this at least part time if we had some funding to offset other opportunities.

If you or your organization sees supply chain attacks as a real threat and wants to accelerate efforts for an open toolchain to make signed and distributed code review of open source dependencies easier for everyone, contact lance@lrvick.net.

Contributors

  • Lance Vick - Distrust, LLC
  • Wiktor - P==P, Sequoia PGP
@jnaulty

jnaulty commented Feb 15, 2022

This is very cool!

I like the idea of using the in-toto spec for the code review artifact (the reviewer feels like a type of functionary). It would be great for increased adoption in more 'industrial' + 'institutional' software shops that are steadily improving their software supply chain security posture.

So, you'd get some kind of review-code.[keyid].link type artifact after a code review (e.g. using the example of write-code instead, you'd get review-code signed attestations: https://github.com/in-toto/docs/blob/18a8f2a053e089dfea39d4b567c35601840e64c5/in-toto-spec.md#write-codealice-keyid-prefixlink-1 ).

It looks like these 'human-review' predicates to attest to are still 'up for debate': in-toto/attestation#77
So, perhaps following + contributing thoughts to this Issue would be wise (it would have impact across many orgs now and in the future)

@wiktor-k

Sig Review

I guess some of these things we can save in git config (e.g. whether to publish, to which database, etc.)

Do we keep sig as a simple bash wrapper for tools like gnupg or sq, or rewrite in rust importing sequoia-pgp

I think it's better to directly rewrite it in Rust. Rust nowadays has quite a lot of tools to make CLI apps look nice and we could utilize Heiko's work for OpenPGP Cards to deliver a complete solution (including card signing) without a bit of GnuPG.

Lance is not yet very good at rust so this may slow progress, but not opposed to getting more experience here.

Maybe we can discuss it on #sequoia (oftc) or somewhere as I guess me and Heiko could help a bit (I didn't ask Heiko but know he's also interested in this project).

In general - very nice summary. Thanks Lance! 👏

@RyanSquared

Lance is not yet very good at rust so this may slow progress, but not opposed to getting more experience here.

Maybe we can discuss it on #sequoia (oftc) or somewhere as I guess me and Heiko could help a bit (I didn't ask Heiko but know he's also interested in this project).

I would be willing to help out with building a Rust version.

@RyanSquared

One of the things that should be considered if doing an implementation in Rust is that there should still be a way to manually retrieve the signature information using gpg to match current existing workflows without having to audit Sequoia and any other dependencies.

@wiktor-k

@RyanSquared, yeah. Actually the app is structured in parts: the first part verifies the signatures and extracts metadata (notations), the second reads the JSON policy document, and the third does the verification logic.

The first part is the only one using Sequoia and it could be replaced with a couple of invocations of the gpg binary. I did something like this for the simple-wot crate that uses gpg to get the graph info and just operates on raw data.

The second part is using serde and serde_json. Not sure how you feel about the audit of these. Theoretically it's possible to avoid them by writing some simple config format.

The third part is using only the standard library so it's clean.
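For the first part, a gpg-only replacement could lean on GnuPG's machine-readable status output. Running `gpg --status-fd 1 --verify review.sig file` prints `[GNUPG:]` lines, including `VALIDSIG` (whose first argument is the signing key fingerprint) and `NOTATION_NAME` / `NOTATION_DATA` pairs with percent-escaped values. The parser below is a sketch under assumptions: `parse_status` is a hypothetical helper, it expects the output of a single-signature verification, and it says nothing about whether the key is trusted, which is left to the policy step.

```python
from urllib.parse import unquote

PREFIX = "[GNUPG:] "


def parse_status(status_text):
    """Extract the signer fingerprint and notations from gpg status output.

    Returns None when no VALIDSIG line is present, i.e. the signature
    did not verify.
    """
    fingerprint = None
    notations = {}
    current = None
    for line in status_text.splitlines():
        if not line.startswith(PREFIX):
            continue
        keyword, _, rest = line[len(PREFIX):].partition(" ")
        if keyword == "VALIDSIG":
            fingerprint = rest.split(" ")[0]
        elif keyword == "NOTATION_NAME":
            current = unquote(rest)
            notations[current] = ""
        elif keyword == "NOTATION_DATA" and current is not None:
            # Long values may be split across several NOTATION_DATA lines.
            notations[current] += unquote(rest)
    if fingerprint is None:
        return None
    return {"fingerprint": fingerprint, "notations": notations}


SAMPLE = """\
[GNUPG:] NEWSIG
[GNUPG:] GOODSIG 573FD879821C2735 Example Reviewer
[GNUPG:] NOTATION_NAME thoroughness@sig.example.org
[GNUPG:] NOTATION_FLAGS 0 1
[GNUPG:] NOTATION_DATA 3
[GNUPG:] VALIDSIG 05F597E88FAD0763449F8D1F573FD879821C2735 2022-02-15 1644883200
"""

if __name__ == "__main__":
    print(parse_status(SAMPLE))
```

The sample status text is hand-written for illustration, and the `sig.example.org` namespace in it is made up.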

@lrvick
Author

lrvick commented Feb 16, 2022

Thanks for the interest all! Will be in touch.

@jnaulty In-toto looked significantly complex and I didn't at first glance (over a year ago) grok how one could use PGP with it, and PGP seems to have the metadata support to do really clean portable blobs, which is attractive now that we have a reliable Rust OpenPGP implementation that forms a bridge to many smartcards. That said, if there is some traction on in-toto elsewhere and we can find a way to make it work well with PGP/smartcards (and thus have clean tooling for code review and signing workflows above, making it transparent to the user), then I could possibly be convinced. More thoughts on how this might work will be welcome.

I see in-toto has an IRC channel so I'll pop in there and see if anyone has thoughts.

@lrvick
Author

lrvick commented Feb 16, 2022

#in-toto:libera.chat is apparently a thing on matrix/irc if anyone else wants to pop in.

@SantiagoTorres

Hi @lrvick! In theory our Python impl should support PGP w/ smartcards (that's what Datadog uses right now through Yubikeys). Of course it can be finicky, but I'd be happy to help out in any way possible!

@RyanSquared

RyanSquared commented Feb 16, 2022

The second part is using serde and serde_json. Not sure how do you feel about the audit of these. Theoretically it's possible to avoid them by writing some simple config format.

Sounds like a great first project to audit, then? According to cloc it's about 10k lines of code. It's not the easiest project to look at but it's by far not the worst. I may take a look through the project, at the very least to confirm there's nothing obviously malicious in the verification logic, and see if I can find a JSON test suite to test it against -- unless it already has one.

EDIT: I forgot serde_derive and serde_json aren't just in the serde directory... That's a bit more code to review.

@RyanSquared

Another option is having a Python version of the validation code that ignores all of the PGP stuff, so it gets passed in a GnuPG compatible blob and the JSON policy and gives output from that. I don't see any obvious reasons this wouldn't be doable for the standard library, and would be a good option to bootstrap the sequoia-pgp and serde based version, as you could then audit the reviews of those using a "trusted" Python and GnuPG version, which I imagine more people implicitly trust by default.

@wiktor-k

It's not the easiest project to look at but it's by far not the worst.

Yep. Also, this crate is used by a majority of the Rust ecosystem so if there were a crate one would like to start to have the biggest RoI it would be serde: https://crates.io/crates/serde/reverse_dependencies

@SantiagoTorres

(FWIW the in-toto Rust implementation uses serde and serde_json as well. I'd like to know if you find anything security relevant!)
