In a way this is a follow on from this post's Types of Reasoning section
Formal machinery underpinning mathematical reasoning and computation
| Foundation | Core Idea | Supports Lenses | Tool Lineage |
|---|---|---|---|
| Set theory (ZFC) | Collections, membership | Platonism, Formalism | Model checking, specification |
| Type theory | Propositions-as-types (Curry-Howard) | Intuitionism, Structuralism | Coq, Agda, Lean, Haskell, Rust |
| Category theory | Structure-preserving maps, composition | Structuralism | fp-ts, Cats, Haskell typeclasses |
| Lambda calculus | Computation as primitive | Intuitionism, Formalism | Functional languages (ML, Haskell) |
| Proof theory | Derivability, syntax | Formalism | Compilers, static analysis |
| Model theory | Semantics, interpretations | Formalism, Platonism | SMT solvers (Z3), Alloy |
| Computability theory | Decidability limits | All (constrains all) | Undecidability results, complexity bounds |
Ways of looking at mathematical objects, knowledge, and proof—not fixed positions but perspectives to adopt per problem
Understanding these lenses began with Frege's logicist project: describing mathematics through logic alone. Gödel's incompleteness theorems responded with proofs showing the limits of formal systems—using constructive diagonal arguments to demonstrate what formalism cannot capture. Encountering structuralism later revealed a predisposition in my own thinking toward that lens. This recognition—that one's reasoning has implicit philosophical commitments—makes it valuable to understand alternative perspectives on mathematics and, by extension, computer science.
| Lens | Ontology | Epistemology (what proof achieves) | Methodology (admissible techniques) | Tool Affinity |
|---|---|---|---|---|
| Platonism | Abstract objects exist independently | Discovers pre-existing truths | Any valid derivation | Formal specification (TLA+, Alloy); model checking |
| Formalism | No ontological commitment; symbol manipulation | Establishes consistency within system | Any valid derivation | Compilers, linters, static analysis, type checkers |
| Logicism | Math reduces to logic | Derives mathematical truths from logical axioms | Logical derivation | Prolog, Datalog, SMT solvers (Z3), rule engines |
| Intuitionism | Mental constructions | Constructs mathematical object | Constructive only (no classical reductio) | Coq, Agda, Idris; property-based testing; TDD |
| Structuralism | Relations, not objects | Characterizes structural positions | Varies | Interface-driven design; algebraic data types; fp-ts, Cats |
| Nominalism | No abstract objects | Reinterprets abstract claims as concrete | Varies | Integration testing; observability/tracing; runtime verification |
| Atheoretical | No commitment | None | None | Vibe coding, unreviewed LLM output |
Pragmatism (Peirce, Dewey) operates at a meta-level: it selects whichever lens best serves the problem at hand. A question is meaningful only if answering it has practical consequences. This requires understanding multiple frameworks to select appropriately—more rigorous than adopting a fixed position unreflectively.
Atheoretical: No engagement with evaluative frameworks; accepts outputs without considering what criteria would validate them. Distinct from pragmatism, which actively selects among lenses.
Vibe engineering describes pragmatic tool use: accepting LLM assistance while applying appropriate evaluative frameworks per context.
Vibe coding (atheoretical) accepts outputs without engagement with evaluative frameworks.
Foundation: TLA+ is grounded in set theory and temporal logic. The TLC model checker exhaustively explores state spaces—sets of possible system states—checking properties across all reachable configurations.
Problem: Distributed systems at scale (S3, DynamoDB, EBS) require correctness guarantees that conventional testing cannot provide. Subtle bugs in concurrent protocols survive design reviews, code reviews, and extensive testing.
Approach: Since 2011, AWS engineers have used TLA+ formal specification and the TLC model checker to verify distributed system designs. Engineers specify system behavior as sets of states and transitions, then exhaustively check correctness properties across all possible execution traces.
Outcome: Found critical bugs in DynamoDB, S3, EBS, and internal distributed lock manager that had evaded all other verification methods. Seven teams adopted TLA+; executive management now proactively encourages TLA+ specs for new features. Engineers report learning TLA+ in 2-3 weeks and finding it more reliable than informal proofs.
Source: Newcombe et al., "How Amazon Web Services Uses Formal Methods," CACM 2015; "Systems Correctness Practices at AWS," CACM 2025
Foundation: Isabelle/HOL is a proof assistant based on higher-order logic and type theory. Proofs are constructed as typed terms; the Curry-Howard correspondence connects proofs to programs. Refinement proofs establish that concrete implementations preserve abstract specifications.
Problem: Operating system kernels are critical security boundaries. Bugs in kernel code can enable privilege escalation and arbitrary code execution. Traditional testing cannot guarantee absence of bugs.
Approach: The seL4 team performed complete formal verification of an OS microkernel using Isabelle/HOL theorem prover. They proved functional correctness: the C implementation (8,700 lines) always strictly follows the high-level abstract specification. The proof guarantees no buffer overflows, no null pointer access, no code injection attacks.
Outcome: seL4 is the first general-purpose OS kernel with machine-checked proof of functional correctness. The proof comprises ~200,000 lines of proof script. Used in high-assurance systems including military and aerospace applications.
Source: Klein et al., "seL4: Formal Verification of an OS Kernel," SOSP 2009; "Comprehensive Formal Verification of an OS Microkernel," ACM TOCS 2014
Foundation: QuickCheck specifications are executable models—formal interpretations of what correct behavior means. Testing checks whether implementations satisfy these models. Shrinking finds minimal counterexamples: smallest inputs that distinguish implementation from specification.
Problem: Modern vehicles contain 100+ processors running AUTOSAR software from multiple vendors. Components must interoperate correctly; incompatibility risks are high when mixing vendors. Manual test creation cannot cover the configuration space.
Approach: Quviq translated 3,000+ pages of AUTOSAR specifications into executable QuickCheck models. Property-based testing automatically generated random test sequences from these models, testing implementations against formal specifications of expected behavior.
Outcome: Found 200+ issues across multiple vendor implementations, including 100+ inconsistencies in the AUTOSAR standard itself. Outperformed manually crafted test suites in both efficiency and fault-finding capability. Now used for acceptance testing of safety-critical automotive components.
Source: Arts et al., "Testing AUTOSAR software with QuickCheck," IEEE ICSTW 2015; Hughes, "Experiences with QuickCheck: Testing the Hard Stuff and Staying Sane"
Foundation: OCaml is rooted in lambda calculus (computation via function application) and type theory (Hindley-Milner type inference, algebraic data types). Category-theoretic patterns appear in functor/monad usage and compositional design. Types encode invariants; the compiler enforces them.
Problem: Proprietary trading requires rapid development of correct, high-performance systems. Code must be readable, maintainable, and change quickly to adapt to markets. Bugs in trading systems have direct financial consequences.
Approach: Jane Street adopted OCaml as primary development language in 2005. They leverage strong static typing, algebraic data types, and functional programming patterns. Heavy use of interfaces-as-contracts: type signatures communicate function behavior; Option/Result types make failure cases explicit.
Outcome: Hundreds of thousands of lines of OCaml in production. Sustain rate of change an order of magnitude higher than previously possible. OCaml's type system catches bugs at compile time that would otherwise reach production. Non-engineers complete "OCaml Bootcamp" in 4 weeks and build useful applications. Open-sourced Core library, contributed to OCaml compiler.
Source: Minsky et al., "Caml Trading: Experiences with Functional Programming on Wall Street," JFP 2008; Jane Street Engineering Blog
Foundation: State machine models formalize protocol behavior as mathematical structures. QuickCheck generates sequences that explore the model's state space, checking that implementations conform to the specified transition semantics.
Problem: Telecom software (Megaco protocol implementation) must handle complex concurrent state machines. Edge cases in protocol implementations cause production failures that are difficult to reproduce and debug.
Approach: Quviq QuickCheck used to define state machine models of the Megaco protocol. Random test generation explored state space systematically; shrinking automatically simplified failing test cases to minimal reproducible examples.
Outcome: Detected faults that had not been found by other testing techniques. Found unclarities in specifications and potential faults when software used in different contexts. Results promising enough that Ericsson invested in larger case studies from the beginning of new product development.
Source: Arts et al., "Testing Telecoms Software with Quviq QuickCheck," Erlang Workshop 2006
| Structuralism | Pragmatism | Atheoretical | |
|---|---|---|---|
| Level | Lens | Meta-position | A-positional |
| Ontology | Relations are real | Selects per problem | No engagement |
| Truth criterion | Structural isomorphism | Practical consequences | None |
| Engineering analog | "The interface is the contract" | "Right tool for the job" | "It compiled" |