@oaustegard
Created March 17, 2026 11:45
tdap RFC

RFC: tdap — A Safety Audit and Fuzz Inoculation Toolkit for Rust

Status: Pre-RFC (Proposal)
Author: A Helpful Corvid, with human supervision
Date: 2026-03-17
Rust Version: Stable (edition 2021+)
License: MIT OR Apache-2.0


Summary

We propose tdap, a cargo subcommand and library that treats unsafe Rust as an infection vector. It statically analyzes a crate’s exposure surface — unsafe blocks, FFI boundaries, raw pointer arithmetic, transmutes, and union field access — produces a quantified exposure report, and then generates targeted fuzz harnesses and safe abstraction scaffolds to inoculate the codebase.

The medical metaphor is load-bearing: the mental model of infection, exposure, vaccination, and immunity maps cleanly onto the actual problem of auditing and hardening unsafe code. The vocabulary should make the tool’s purpose self-evident to anyone who has ever touched a rusty nail.

Motivation

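Rust’s safety guarantees are only as strong as the unsafe boundary. Inside an unsafe block the compiler still type-checks and borrow-checks the code, but it cannot verify the extra operations the block permits (raw pointer dereferences, FFI calls, transmutes, and the like). The ecosystem currently offers only fragmented tooling for understanding how much of a crate’s safety rests on programmer discipline rather than compiler proof.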

Existing tools address pieces of this:

  • cargo-geiger counts unsafe usage but doesn’t assess risk or generate mitigations.
  • cargo-fuzz / libFuzzer provide fuzzing infrastructure but require manual harness authoring.
  • cargo-audit checks dependency vulnerabilities but not first-party unsafe code.
  • miri interprets code under a strict operational semantics but can’t explore input spaces automatically.
  • clippy lints for some unsafe patterns but isn’t a dedicated audit tool.

No single tool connects the static analysis (“where is the risk?”) to the dynamic mitigation (“generate a fuzz target for this exact boundary”). tdap bridges that gap.

Design Principles

  1. Exposure, not prohibition. The goal is not to eliminate unsafe — it exists for a reason. The goal is to ensure every unsafe boundary is known, scored, and tested.
  2. Vaccination over avoidance. Generating fuzz harnesses is more valuable than generating warnings. Developers ignore warnings. They don’t ignore failing tests.
  3. Progressive inoculation. A codebase doesn’t get safe in one pass. tdap tracks exposure over time and rewards incremental improvement.
  4. Composability. tdap should integrate with existing tools (cargo-fuzz, miri, CI pipelines) rather than replace them.

Detailed Design

Architecture Overview

┌──────────────────────────────────────────────────────┐
│                       tdap CLI                       │
│          (cargo subcommand via cargo-tdap)           │
├────────────┬─────────────┬────────────┬──────────────┤
│   check    │    shot     │   boost    │  quarantine  │
├────────────┴─────────────┴────────────┴──────────────┤
│                  Analysis Engine                      │
│         (syn + quote AST parsing layer)               │
├──────────────────────────────────────────────────────┤
│               Risk Model & Scoring                    │
├──────────────────────────────────────────────────────┤
│  Fuzz Generator  │  Wrapper Generator  │  Reporter   │
└──────────────────┴─────────────────────┴─────────────┘

Subcommand: tdap check

Purpose: Static analysis of a crate’s unsafe exposure surface.

Behavior:

  1. Parse all .rs source files in the crate (and optionally dependencies) using syn.
  2. Identify and classify every unsafe usage site into one of the following exposure classes:
Exposure Class | Description                           | Base Risk Weight
UnsafeBlock    | Inline unsafe {} block                | 1.0
UnsafeFn       | Function declared unsafe fn           | 1.5
FfiCall        | Call to extern "C" function           | 3.0
FfiDecl        | extern "C" block declaration          | 2.0
RawDeref       | Dereference of *const T / *mut T      | 2.5
Transmute      | std::mem::transmute or transmute_copy | 4.0
UnionAccess    | Read from a union field               | 2.0
GlobalMut      | Read or write of a static mut         | 3.5
InlineAsm      | asm! / global_asm! macro usage        | 5.0
UnsafeTrait    | unsafe impl of a trait                | 2.0
  3. Apply contextual modifiers to the base weight:
Modifier                                            | Multiplier | Rationale
Inside #[no_mangle] or #[export_name] fn            | ×1.5       | Exposed to foreign callers
Accepts raw pointer parameter                       | ×1.3       | Caller controls pointer validity
No // SAFETY: comment within 3 lines above          | ×1.2       | Undocumented invariant
Inside #[test] or #[cfg(test)]                      | ×0.3       | Test-only code, lower production risk
Wrapped in debug_assert! guard                      | ×0.8       | Some runtime checking exists
Has associated fuzz target (detected by convention) | ×0.5       | Actively tested
  4. Produce an Exposure Report:
╔══════════════════════════════════════════════════════════╗
║                   TDAP EXPOSURE REPORT                   ║
║  Crate: my-cool-crate v0.3.1                            ║
║  Date:  2026-03-17                                       ║
║  Files scanned: 42 │ Lines analyzed: 18,304              ║
╠══════════════════════════════════════════════════════════╣
║  EXPOSURE INDEX: 34.7  (Serious)                         ║
║                                                          ║
║  0 ─────────■──────────────────── 100                    ║
║  Clean    Moderate    Serious    Critical                 ║
╠══════════════════════════════════════════════════════════╣
║  Exposure Sites: 23                                      ║
║  ├─ UnsafeBlock ×12     (weighted: 11.4)                 ║
║  ├─ FfiCall ×5          (weighted: 14.2)                 ║
║  ├─ RawDeref ×4         (weighted: 7.1)                  ║
║  ├─ Transmute ×1        (weighted: 2.0)                  ║
║  └─ GlobalMut ×1        (weighted: 0.0) [test-only]      ║
║                                                          ║
║  Undocumented Sites: 8 of 23 (34.8%)                     ║
║  Fuzz Coverage: 2 of 23 sites (8.7%)                     ║
╠══════════════════════════════════════════════════════════╣
║  HIGHEST RISK SITES:                                     ║
║  1. src/ffi/bindings.rs:142  FfiCall        score: 4.5   ║
║     raw_process_buffer(ptr, len)                         ║
║     ⚠ No safety comment · Accepts raw pointer            ║
║  2. src/codec.rs:89          Transmute      score: 4.0   ║
║     transmute::<[u8; 4], f32>(bytes)                     ║
║     ⚠ No safety comment                                  ║
║  3. src/ffi/bindings.rs:201  FfiCall        score: 3.9   ║
║     raw_init_context(cfg_ptr)                            ║
║     ⚠ Accepts raw pointer                                ║
╚══════════════════════════════════════════════════════════╝

Output formats: --format human (default, shown above), --format json, --format sarif (for CI integration with GitHub Code Scanning).

Flags:

  • --include-deps — Scan direct dependencies (uses source from cargo registry cache).
  • --threshold <N> — Exit with nonzero status if exposure index exceeds N. Designed for CI gates.
  • --ignore <pattern> — Ignore files matching a glob. Supports tdap.toml config.
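
The per-site weighting described above can be sketched in a few lines of Rust. This is a minimal illustration, not the proposal's implementation: the names ExposureClass, Modifier, and site_score are hypothetical, and only the weights and multipliers are taken from the two tables.

```rust
/// Exposure classes from the classification table (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ExposureClass {
    UnsafeBlock,
    UnsafeFn,
    FfiCall,
    FfiDecl,
    RawDeref,
    Transmute,
    UnionAccess,
    GlobalMut,
    InlineAsm,
    UnsafeTrait,
}

impl ExposureClass {
    /// Base risk weight, copied from the table.
    pub fn base_weight(self) -> f64 {
        match self {
            ExposureClass::UnsafeBlock => 1.0,
            ExposureClass::UnsafeFn => 1.5,
            ExposureClass::FfiCall => 3.0,
            ExposureClass::FfiDecl => 2.0,
            ExposureClass::RawDeref => 2.5,
            ExposureClass::Transmute => 4.0,
            ExposureClass::UnionAccess => 2.0,
            ExposureClass::GlobalMut => 3.5,
            ExposureClass::InlineAsm => 5.0,
            ExposureClass::UnsafeTrait => 2.0,
        }
    }
}

/// Contextual modifiers from the modifier table (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Modifier {
    ExportedSymbol,   // inside a #[no_mangle] or #[export_name] fn
    RawPointerParam,  // accepts a raw pointer parameter
    NoSafetyComment,  // no // SAFETY: comment within 3 lines above
    TestOnly,         // inside #[test] or #[cfg(test)]
    DebugAssertGuard, // wrapped in a debug_assert! guard
    HasFuzzTarget,    // an associated fuzz target was detected
}

impl Modifier {
    /// Multiplier, copied from the table.
    pub fn multiplier(self) -> f64 {
        match self {
            Modifier::ExportedSymbol => 1.5,
            Modifier::RawPointerParam => 1.3,
            Modifier::NoSafetyComment => 1.2,
            Modifier::TestOnly => 0.3,
            Modifier::DebugAssertGuard => 0.8,
            Modifier::HasFuzzTarget => 0.5,
        }
    }
}

/// Score of one site: the base weight times the product of its modifiers.
pub fn site_score(class: ExposureClass, modifiers: &[Modifier]) -> f64 {
    modifiers
        .iter()
        .fold(class.base_weight(), |acc, m| acc * m.multiplier())
}
```

Under this sketch, an FfiCall that accepts a raw pointer scores 3.0 × 1.3 = 3.9, which matches the third site listed in the sample report.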

Subcommand: tdap shot

Purpose: Generate fuzz harnesses targeting the exposure sites found by check.

Behavior:

  1. Run the analysis engine (or accept a cached exposure report via --from-report).
  2. For each unsafe site, determine the fuzz strategy:
Pattern                                 | Strategy
unsafe fn with scalar/slice params      | Arbitrary input via arbitrary crate
FFI function accepting *const u8, usize | Length-bounded byte buffer
transmute of fixed-size input           | Byte array of matching size
Raw pointer deref inside a method       | Fuzz the method’s safe parameters, construct valid internal state
union field access                      | Fuzz the union’s byte representation
  3. Emit fuzz targets into fuzz/targets/tdap_*.rs, wired to cargo-fuzz (libFuzzer) or optionally cargo-bolero for multi-engine support. Because cargo-fuzz’s own default directory is fuzz/fuzz_targets/, each emitted target also needs a [[bin]] entry in fuzz/Cargo.toml pointing at its path.

Example generated harness:

// Auto-generated by tdap shot
// Target: src/codec.rs:89 — transmute::<[u8; 4], f32>(bytes)
// Exposure class: Transmute (score: 4.0)
// 
// Review this harness and adjust constraints before running.
// The generated code is a starting point, not a guarantee.

#![no_main]
use libfuzzer_sys::fuzz_target;
use my_cool_crate::codec;

fuzz_target!(|data: [u8; 4]| {
    // Exercise the transmute path.
    // If this panics or triggers UB under miri, the site needs hardening.
    let _ = codec::decode_sample(data);
});

Flags:

  • --engine libfuzzer|afl|bolero — Target fuzzing backend.
  • --dry-run — Print generated harnesses to stdout without writing files.
  • --miri — Additionally generate a miri-compatible test for each site (runs under cargo +nightly miri test).

Important constraint: Generated harnesses are drafts. The tool emits a prominent comment in every file indicating the harness requires human review. Some function signatures won’t be fuzzable without domain-specific setup (e.g., constructing a valid Context struct). The generator should err toward a compilable skeleton with todo!() markers over a clever-but-broken attempt.
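
One way to honor the "compilable skeleton over clever guess" rule is to fall back to a todo!() body whenever the generator cannot resolve a callable entry point. The sketch below renders a harness as text using only the standard library; the Site struct, its fields, and render_harness are hypothetical names for illustration, and a real generator would more likely build the output with quote.

```rust
/// Hypothetical description of one exposure site as seen by the generator.
pub struct Site<'a> {
    pub file: &'a str,
    pub line: u32,
    pub class: &'a str,
    pub score: f64,
    /// (function path, fuzz input type) when both could be resolved.
    pub entry: Option<(&'a str, &'a str)>,
}

/// Render a draft cargo-fuzz harness for `site` as source text.
pub fn render_harness(site: &Site<'_>) -> String {
    // Without a resolved entry point, emit a skeleton that still compiles.
    let (input_type, body) = match site.entry {
        Some((path, ty)) => (ty, format!("let _ = {}(data);", path)),
        None => (
            "&[u8]",
            "todo!(\"build valid inputs from `data` and call the unsafe site\");".to_string(),
        ),
    };

    let lines = [
        "// Auto-generated by tdap shot".to_string(),
        format!("// Target: {}:{}", site.file, site.line),
        format!("// Exposure class: {} (score: {:.1})", site.class, site.score),
        "// Review this harness and adjust constraints before running.".to_string(),
        String::new(),
        "#![no_main]".to_string(),
        "use libfuzzer_sys::fuzz_target;".to_string(),
        String::new(),
        format!("fuzz_target!(|data: {}| {{", input_type),
        format!("    {}", body),
        "});".to_string(),
    ];
    lines.join("\n")
}
```

A site with entry set to ("my_cool_crate::codec::decode_sample", "[u8; 4]") renders a call like the example harness above; a site with entry set to None renders the same frame around a todo!() marker for the reviewer to fill in.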

Subcommand: tdap boost

Purpose: Track exposure over time and measure improvement.

Behavior:

  1. Run check.
  2. Compare against the most recent stored report in .tdap/history/.
  3. Output a diff:
TDAP BOOST — Comparing against baseline from 2026-03-10

  Exposure Index: 34.7 → 28.3  (▼ 18.4% — improving)
  
  Sites resolved:
    ✓ src/codec.rs:89        Transmute    Added safety comment + fuzz target
    ✓ src/ffi/bindings.rs:201 FfiCall     Wrapped in safe abstraction

  New exposures:
    ✗ src/net/tls.rs:44      FfiCall      raw_ssl_handshake(ctx, bio)
                                           New in this scan, no mitigations

  Fuzz coverage: 8.7% → 21.7%  (▲ 13.0pp)
  Undocumented:  34.8% → 17.4%  (▼ 17.4pp)
  4. Store the new report in .tdap/history/<timestamp>.json.

Flags:

  • --baseline <path> — Compare against a specific report instead of the latest.
  • --ci — Output in a machine-readable format and exit nonzero if the exposure index increased.
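
The comparison step could look roughly like the following. It is a sketch under stated assumptions: Report, BoostDiff, diff, and ci_exit_code are hypothetical names, sites are keyed by a "file:line class" string (which would break when lines shift, so a real implementation would need a sturdier identity), and a site counts as resolved when it was unmitigated in the baseline and is now either mitigated or gone.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory form of a stored report.
pub struct Report {
    pub exposure_index: f64,
    /// Site key ("file:line class" here) mapped to whether it is mitigated.
    pub sites: BTreeMap<String, bool>,
}

pub struct BoostDiff {
    /// Current index minus baseline index.
    pub index_delta: f64,
    /// Relative change in percent; None when the baseline index is zero.
    pub index_change_pct: Option<f64>,
    pub resolved: Vec<String>,
    pub new_exposures: Vec<String>,
}

pub fn diff(baseline: &Report, current: &Report) -> BoostDiff {
    let index_delta = current.exposure_index - baseline.exposure_index;
    let index_change_pct = if baseline.exposure_index == 0.0 {
        None
    } else {
        Some(100.0 * index_delta / baseline.exposure_index)
    };

    // Unmitigated in the baseline, and now mitigated or no longer present.
    let resolved: Vec<String> = baseline
        .sites
        .iter()
        .filter(|(key, was_mitigated)| {
            !**was_mitigated && current.sites.get(*key).copied().unwrap_or(true)
        })
        .map(|(key, _)| key.clone())
        .collect();

    // Present now but absent from the baseline.
    let new_exposures: Vec<String> = current
        .sites
        .keys()
        .filter(|key| !baseline.sites.contains_key(*key))
        .cloned()
        .collect();

    BoostDiff { index_delta, index_change_pct, resolved, new_exposures }
}

/// Exit status for --ci: nonzero only when the exposure index increased.
pub fn ci_exit_code(d: &BoostDiff) -> i32 {
    if d.index_delta > 0.0 { 1 } else { 0 }
}
```

Applied to the figures in the sample output, a move from 34.7 to 28.3 is a relative change of about -18.4%, so the --ci gate in this sketch would exit with status 0.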

Subcommand: tdap quarantine

Purpose: Generate safe wrapper modules around unsafe code.

Behavior:

For each flagged unsafe site, generate a module that:

  1. Encapsulates the unsafe operation behind a safe public API.
  2. Documents the safety invariants as both doc comments and debug_assert! preconditions.
  3. Uses Rust’s type system to enforce invariants where possible (e.g., newtype wrappers for validated pointers, NonNull<T> instead of *mut T).

Example output for an FFI boundary:

// Auto-generated by tdap quarantine
// Wrapping: src/ffi/bindings.rs:142 — raw_process_buffer
// 
// This module provides a safe interface to an unsafe FFI call.
// Review all safety invariants before using in production.

use std::ptr::NonNull;

/// A validated, non-null buffer pointer with a known length.
/// 
/// # Invariants
/// - The pointer is non-null and aligned to `u8`.
/// - The length accurately reflects the allocated region.
/// - The buffer remains valid for the lifetime `'a`.
pub struct ValidBuffer<'a> {
    ptr: NonNull<u8>,
    len: usize,
    _lifetime: std::marker::PhantomData<&'a [u8]>,
}

impl<'a> ValidBuffer<'a> {
    /// Construct from a byte slice. This is the *only* safe entry point.
    pub fn from_slice(data: &'a [u8]) -> Self {
        Self {
            ptr: NonNull::new(data.as_ptr() as *mut u8)
                .expect("slice pointer is never null"),
            len: data.len(),
            _lifetime: std::marker::PhantomData,
        }
    }
}

/// Safe wrapper around `raw_process_buffer`.
/// 
/// # Safety Invariants (enforced by `ValidBuffer`)
/// - `ptr` is non-null and points to `len` initialized bytes.
/// - The buffer is valid for the duration of this call.
pub fn process_buffer(buf: &ValidBuffer<'_>) -> Result<(), ProcessError> {
    debug_assert!(buf.len <= isize::MAX as usize, "buffer length overflow");

    // SAFETY: ValidBuffer guarantees non-null, valid-length, 
    // lifetime-bounded pointer. Length bound checked above.
    let result = unsafe {
        crate::ffi::raw_process_buffer(buf.ptr.as_ptr(), buf.len)
    };

    match result {
        0 => Ok(()),
        e => Err(ProcessError::from_code(e)),
    }
}

Flags:

  • --site <file:line> — Generate wrapper for a specific site only.
  • --style module|inline — Emit as a new module (default) or as an inline replacement with // TODO: review markers.
  • --dry-run — Print to stdout.

Constraint: Quarantine output is always a draft. The generated code will compile and encode the right shape of safety invariant, but domain-specific invariants (e.g., “this pointer must refer to an initialized TLS context”) require human review. The tool cannot infer semantic invariants from syntax alone, and it should not pretend otherwise.

Configuration

tdap.toml at crate root:

[tdap]
# Fail CI if exposure index exceeds this value
threshold = 50.0

# Sites to exclude from analysis (glob patterns)
ignore = [
    "src/generated/**",
    "benches/**",
]

# Custom risk weights (override defaults)
[tdap.weights]
Transmute = 5.0     # We consider transmute extra dangerous here
UnsafeBlock = 0.8   # Our unsafe blocks are well-audited

# Modifier overrides
[tdap.modifiers]
no_safety_comment = 1.5   # We're strict about documentation

[tdap.shot]
engine = "libfuzzer"
output_dir = "fuzz/targets"

Scoring Model

The Exposure Index is computed as:

E = Σ (base_weight(class_i) × Π modifier_j) for each site i

Normalized to a 0–100 scale:

Index = 100 × (1 - e^(-E / k))

Where k is a normalization constant calibrated so that:

  • A crate with 0 unsafe sites scores 0.
  • A crate with 10 typical unsafe blocks scores ~30 (Moderate).
  • A crate with 50+ unmitigated sites with FFI and transmutes scores 80+ (Critical).

The exponential curve means the first few unsafe sites cost you more per-site than later ones — reflecting the reality that going from “no unsafe” to “some unsafe” is a bigger architectural decision than adding one more site to an already-unsafe crate.

The suggested initial value for k is 20.0, tunable per-project in tdap.toml.
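
To make the normalization concrete, here is a small sketch of the formula plus its inverse for calibration. The function names exposure_index and k_for_target are hypothetical; the formula itself is the one given above.

```rust
/// Index = 100 * (1 - e^(-E / k)), where `raw_sum` is E.
pub fn exposure_index(raw_sum: f64, k: f64) -> f64 {
    100.0 * (1.0 - (-raw_sum / k).exp())
}

/// Solve the same formula for k: the constant that maps `raw_sum` to
/// `target_index`. Only meaningful for 0 < target_index < 100.
pub fn k_for_target(raw_sum: f64, target_index: f64) -> f64 {
    -raw_sum / (1.0 - target_index / 100.0).ln()
}
```

Worked through by hand, ten sites at base weight 1.0 give E = 10, and with k = 20 the index is 100 × (1 − e^(−0.5)) ≈ 39.3. Landing that same crate on the ~30 calibration target would need k ≈ 28.0, so the suggested default and the calibration targets would have to be reconciled when the model is tuned against real crates.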

Severity Bands

Range  | Label    | Interpretation
0–10   | Clean    | Minimal unsafe surface. Typical for pure-Rust crates.
10–30  | Moderate | Some unsafe usage, common for crates wrapping system APIs.
30–60  | Serious  | Significant exposure. Fuzz coverage and wrappers strongly recommended.
60–100 | Critical | Extensive unsafe surface. Prioritize quarantine and audit.
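
The bands share their endpoints (10, 30, and 60 each appear in two rows), so an implementation has to pick a side. The sketch below assumes half-open intervals in which a boundary value belongs to the higher band; that choice, along with the Severity and severity names, is an assumption rather than something the proposal specifies.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Severity {
    Clean,
    Moderate,
    Serious,
    Critical,
}

/// Map an exposure index (0 to 100) to a band, treating each range as
/// [low, high) so that a boundary value falls into the higher band.
pub fn severity(index: f64) -> Severity {
    if index < 10.0 {
        Severity::Clean
    } else if index < 30.0 {
        Severity::Moderate
    } else if index < 60.0 {
        Severity::Serious
    } else {
        Severity::Critical
    }
}
```

With this convention an index of exactly 30.0 is reported as Serious, which errs on the side of flagging a crate rather than passing it.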

CI Integration

GitHub Actions Example

name: Tdap Check
on: [push, pull_request]

jobs:
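  safety-audit:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write   # required by the upload-sarif action
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - run: cargo install cargo-tdap
      # Assumes the report is written to stdout; redirect it to the file
      # that the upload step reads.
      - run: cargo tdap check --format sarif --threshold 50 > tdap-report.sarif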
      - name: Upload SARIF
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: tdap-report.sarif

Pre-commit Hook

#!/bin/sh
# .git/hooks/pre-commit
cargo tdap check --threshold 50 --quiet || {
    echo "tdap: exposure index exceeds threshold. Run 'cargo tdap boost' for details."
    exit 1
}

Crate Structure

cargo-tdap/
├── Cargo.toml
├── src/
│   ├── main.rs              # CLI entry point (clap)
│   ├── cli/
│   │   ├── mod.rs
│   │   ├── check.rs
│   │   ├── shot.rs
│   │   ├── boost.rs
│   │   └── quarantine.rs
│   ├── analysis/
│   │   ├── mod.rs
│   │   ├── scanner.rs       # syn-based AST walker
│   │   ├── classifier.rs    # Exposure class detection
│   │   └── modifiers.rs     # Contextual modifier detection
│   ├── scoring/
│   │   ├── mod.rs
│   │   └── model.rs         # Risk model and normalization
│   ├── codegen/
│   │   ├── mod.rs
│   │   ├── fuzz.rs           # Fuzz harness templates
│   │   └── wrappers.rs       # Safe abstraction scaffolds
│   └── report/
│       ├── mod.rs
│       ├── human.rs          # Terminal output
│       ├── json.rs
│       └── sarif.rs
├── tdap-lib/              # Library crate for programmatic use
│   ├── Cargo.toml
│   └── src/
│       └── lib.rs            # Re-exports analysis + scoring
└── tests/
    ├── fixtures/             # Crates with known unsafe patterns
    └── integration/

Key Dependencies

Crate              | Purpose
syn (2.x)          | Rust source parsing
quote              | Code generation for harnesses and wrappers
proc-macro2        | Token stream manipulation
clap (4.x)         | CLI argument parsing
serde + serde_json | Report serialization, config parsing
toml               | Config file parsing
owo-colors         | Terminal coloring (because the output should look good)
similar            | Diffing for boost comparisons

Open Questions

  1. Dependency scanning depth. --include-deps is useful but potentially slow. Should it default to direct deps only, or walk the full tree? Should it respect cargo-geiger’s existing database to avoid redundant work?
  2. Macro expansion. syn parses surface syntax. A transmute hidden inside a macro invocation won’t be caught without expansion. Should tdap optionally invoke cargo expand as a preprocessing step? The tradeoff is speed vs. accuracy.
  3. Interaction with the unsafe_op_in_unsafe_fn lint. The lint is allow-by-default in editions through 2021 and warn-by-default in the 2024 edition. Crates that adopt it have more granular unsafe blocks inside unsafe fn bodies. The scanner should handle both styles: counting the outer unsafe fn as a single site when the lint is suppressed, and the inner blocks individually when it’s enabled.
  4. Scoring calibration. The initial weights and normalization constant are educated guesses. The right approach is probably to run the scorer against a corpus of well-known crates (e.g., libc, nix, ring, tokio, hyper) and tune until the distribution feels right. This is an empirical question, not a design question.
  5. Should quarantine attempt to run the generated wrappers through cargo check? It could verify that the scaffolds at least compile, but this requires building the target crate, which may have complex build dependencies. A --verify flag that opts into compilation checking seems right.
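
For the third question, the two styles the scanner would meet look like this. The function names are made up for illustration; the lint attributes are real and are set per function here only so that both styles fit in one snippet.

```rust
/// Style A: the lint is suppressed, so the whole body of the `unsafe fn`
/// is an implicit unsafe context and there is no inner block to find.
///
/// # Safety
/// `ptr` must be valid for a one-byte read.
#[allow(unsafe_op_in_unsafe_fn)]
pub unsafe fn read_first_implicit(ptr: *const u8) -> u8 {
    *ptr
}

/// Style B: the lint is enforced, so every unsafe operation sits in its
/// own explicit block inside the `unsafe fn`.
///
/// # Safety
/// `ptr` must be valid for a one-byte read.
#[deny(unsafe_op_in_unsafe_fn)]
pub unsafe fn read_first_explicit(ptr: *const u8) -> u8 {
    // SAFETY: the caller guarantees `ptr` is valid for a one-byte read.
    unsafe { *ptr }
}
```

Following the rule proposed in that question, a scanner would record style A as a single UnsafeFn site and style B as its inner UnsafeBlock (with a RawDeref inside it); whether the enclosing unsafe fn is then also counted is one of the details the question leaves open.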

Prior Art and Acknowledgments

  • cargo-geiger — The original unsafe counter. tdap is spiritually a successor that adds risk-weighting and mitigation generation. The name geiger set the precedent for radiation/hazard metaphors in safety tooling; we continue the tradition with a different pathogen.
  • cargo-fuzz — The fuzzing infrastructure tdap shot emits targets for. Not a competitor; a dependency.
  • cargo-audit — Dependency vulnerability scanning. Complementary: cargo-audit checks known vulnerabilities in dependencies; tdap checks potential vulnerabilities in your code.
  • miri — The gold standard for detecting undefined behavior. tdap can generate miri-compatible tests, but miri does the actual UB detection.
  • Rudra — A research tool for detecting memory safety bugs in Rust. More sophisticated static analysis than tdap proposes, but not maintained for recent Rust versions and not focused on mitigation generation.

Why tdap?

The original working names — tetanus and lockjaw — are both taken on crates.io. Neither is related to this proposal’s domain:

  • tetanus (v0.3.0) is a generic stdlib extension crate: collection macros, string utils, ring buffers, rate limiters. A utility grab-bag with no connection to unsafe auditing.
  • lockjaw is a compile-time dependency injection framework inspired by Dagger. Its own tagline is “It is also what you get when jabbed by a rusty dagger” — the rust pun was already taken.

tdap — the abbreviation for the Tetanus, Diphtheria, and Acellular Pertussis vaccine — is available on crates.io as of this writing and is arguably the better name anyway. The tool is the vaccine: it inoculates your Rust code against the consequences of touching unsafe. The medical acronym is precise, memorable, and self-documenting for anyone who’s ever gotten a shot after stepping on a nail. It also avoids competing with two established, unrelated crates in the same thematic space.


“You touched Rust. Now you need a shot.”
