@matu3ba
Last active September 8, 2021 19:02
comparing language semantics of systems programming languages

#comparison

This is an article about how I think different C-family languages can be semantically distinguished by their approach to programming, and what the effects are on the most important categories in systems programming (highest performance and reliability demands).

Non-goals: looking into language philosophies and the origins of design decisions, because they are very deep rabbit holes (even for a single language), and explaining what the categories mean. Please investigate these yourself.

The languages to compare are

  • C language
  • C++
  • Rust
  • Zig

Categories

  1. Safety 1.1. Type safety 1.2. Memory safety 1.3. Bounds checking 1.4. UB

  2. Concurrency 2.1. Multi-threading 2.2. Asynchronous execution

  3. Code-generalization 3.1. Polymorphism 3.2. Comptime-execution

  4. Speed 4.1. Compilation-Speed 4.2. Runtime-Speed

  5. Complexity handling

1.1. Type safety

  • C language
    • no, opaque pointers needed for many operations
    • integer arithmetic semantics are type-dependent (unsigned integers wrap, signed overflow is UB)
    • usage of C "explicit by convention"
  • C++
    • yes (pointers cannot store the object count in contiguous memory)
    • integer arithmetic semantics are type-dependent (unsigned integers wrap, signed overflow is UB)
    • usage of C is not always explicit: #include <example.h> may include a C or a C++ header
  • Rust
    • yes (pointers cannot store the object count in contiguous memory)
    • integer arithmetic semantics are not type-dependent
    • usage of C functions and types is explicit with extern "C"
  • Zig
    • yes (pointers can store the object count in contiguous memory)
    • integer arithmetic semantics are not type-dependent
    • usage of C functions and types is explicit with @cImport and @cInclude
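
The Rust bullet on integer arithmetic can be illustrated with a minimal sketch. The helper names add_u8 and add_i8 are hypothetical; checked_add, wrapping_add and saturating_add are standard integer methods. The point is that the overflow behaviour is picked per operation and reads the same for signed and unsigned types.

```rust
/// Overflow handling is chosen per operation, not implied by signedness.
/// Returns the (checked, wrapping, saturating) results of an unsigned add.
fn add_u8(a: u8, b: u8) -> (Option<u8>, u8, u8) {
    (a.checked_add(b), a.wrapping_add(b), a.saturating_add(b))
}

/// The same three operations exist with the same meaning for signed types.
fn add_i8(a: i8, b: i8) -> (Option<i8>, i8, i8) {
    (a.checked_add(b), a.wrapping_add(b), a.saturating_add(b))
}
```

For example, add_u8(250, 10) reports the overflow as None in the checked slot, wraps to 4 and saturates at 255.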

1.2. Memory safety

  • C
    • none, alloca can also smash your stack
  • C++
    • RAII is memory-safe; race conditions and access via pointers or references are not
  • Rust
    • yes, without the unsafe annotation
      • safe code: memory lifetimes are annotated on variables and functions. The borrow checker then does an alias analysis to ensure that no aliasing pointer/reference can access the same memory on stack/heap without synchronisation, and that the accessed memory exists. This results in tree-shaped lifetimes.
      • unsafe code: provides hardware access etc., i.e. it establishes the premises that "safe code" relies on
  • Zig
    • compilation modes with(out) safety checks, testing and allocator decisions to detect and fix memory problems
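
A small sketch of what the Rust bullets mean in practice, assuming nothing beyond the standard library. The function names longest and push_twice are made up for illustration. The commented-out lines show a pattern the borrow checker is expected to reject.

```rust
/// Returns the longer of two string slices.
/// The lifetime 'a states that the result must not outlive either input.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

/// Appends to a vector through an exclusive (&mut) borrow.
/// While this borrow is alive, no other reference to `v` may be used.
fn push_twice(v: &mut Vec<i32>, x: i32) {
    v.push(x);
    v.push(x);
}

// Rejected at compile time (shared and exclusive borrow overlap):
//
//     let mut v = vec![1];
//     let first = &v[0];
//     push_twice(&mut v, 7); // error: `v` is still borrowed by `first`
//     println!("{first}");
```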

1.3. Bounds checking

  • C
    • default: no check of run-time bounds
    • enabling: manual code
  • C++
    • default: no check of run-time bounds
    • enabling: standard library or library functions (e.g. std::vector::at)
  • Rust
    • default: check of run-time bounds
    • disabling: unchecked access inside an unsafe block (e.g. get_unchecked)
  • Zig
    • default (Debug and ReleaseSafe): check of run-time bounds
    • disabling: language annotation (@setRuntimeSafety(false)) or compilation mode (ReleaseFast, ReleaseSmall)
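
As a sketch of the Rust row: plain indexing (data[i]) panics when the index is out of range, get returns the failure as a value, and get_unchecked skips the check inside unsafe. The wrapper names read_checked and read_unchecked are hypothetical.

```rust
/// Checked access that reports an out-of-range index as `None`
/// instead of panicking like `data[i]` would.
fn read_checked(data: &[i32], i: usize) -> Option<i32> {
    data.get(i).copied()
}

/// Opts out of the bounds check.
/// SAFETY contract: the caller must guarantee `i < data.len()`;
/// violating it is undefined behavior.
unsafe fn read_unchecked(data: &[i32], i: usize) -> i32 {
    unsafe { *data.get_unchecked(i) }
}
```

The unchecked variant is only worth it when a profile shows the check matters and the bound is established elsewhere.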

1.4. Undefined Behavior (UB)

  • C
    • the standard and compilers use a lot of UB for optimisations
    • compilers do not check for it by default (sanitizers such as UBSan are opt-in)
  • C++
    • the standard and compilers use a lot of UB for optimisations
    • compilers do not check for it by default (sanitizers such as UBSan are opt-in)
  • Rust
    • code without the unsafe annotation has no UB
    • unsafe code may invoke UB
    • UB checks at runtime (inside Miri) are very slow
  • Zig [specification and implementation incomplete]
    • default (Debug and ReleaseSafe): aims to check all forms of UB
    • disabling for speed: language annotation or compilation mode
    • UB checks at runtime are fast
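
To make the safe/unsafe split in the Rust row concrete, here is a hedged sketch built on two standard functions, std::str::from_utf8 and std::str::from_utf8_unchecked. The wrapper names to_str_checked and to_str_trusted are invented for this example.

```rust
/// Safe API: validates the bytes first, so it cannot invoke UB.
fn to_str_checked(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Unsafe API: skips the validation.
/// SAFETY contract: `bytes` must be valid UTF-8, otherwise the
/// behavior is undefined. The compiler does not check this.
unsafe fn to_str_trusted(bytes: &[u8]) -> &str {
    unsafe { std::str::from_utf8_unchecked(bytes) }
}
```

Calling to_str_trusted with invalid bytes compiles fine; detecting that mistake is the kind of job Miri does at runtime.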

2.1. Multi-threading [TODO: clarify this (sections in standard)]

  • C language
    • POSIX primitives
    • more stuff?
    • basic synchronisation primitives
  • C++
    • POSIX primitives
    • more stuff?
    • many synchronisation primitives in libstd
  • Rust
    • POSIX primitives
    • multi-threading synchronisation primitives
    • compile-time check for data races, but not deadlocks
  • Zig [TODO: finish up by asking lithdrew or kprotty]
    • fences, atomics, atomic stack, atomic queue
    • Futex, Mutex, Semaphore
    • Kernel threading primitives
    • TODO check what is POSIX
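
A minimal sketch of the Rust row, using only std::thread, Arc and Mutex. The function name parallel_count is hypothetical. Sharing the counter without the Mutex would not compile, which is the compile-time check mentioned above. Nothing here prevents a deadlock if locks were taken in a bad order.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` workers that each add 1 to a shared counter
/// `increments` times, then returns the final value.
fn parallel_count(threads: usize, increments: usize) -> usize {
    // Arc gives shared ownership across threads, Mutex gives exclusive access.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    // Wait for all workers before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```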

2.2. Asynchronous execution

  • C language
    • no asynchronous execution in the standard library
    • no cancellation abstraction
  • C++
    • colored asynchronous functions, completion based, no io_uring support
    • runtimes not in libstd
    • coroutines (stackless) since C++20, no cancellation abstraction yet
  • Rust
    • colored asynchronous functions, polling based, safety checked
    • runtimes not in libstd
    • no cancellation abstraction
  • Zig
    • colorless asynchronous functions, completion based
    • io_uring support in libstd
    • no cancellation abstraction yet
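
"Polling based" and "runtimes not in libstd" for Rust can be sketched with the standard library alone. Everything named here (NoopWake, CountDown, polls_until_ready, add_async) is made up for the example, and the busy loop stands in for a real runtime, which would sleep until the waker fires.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough for a busy-polling toy executor.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// A hand-written future that is pending for `self.0` polls, then ready.
struct CountDown(u32);
impl Future for CountDown {
    type Output = &'static str;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.0 == 0 {
            Poll::Ready("done")
        } else {
            self.0 -= 1;
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending
        }
    }
}

/// An `async fn` is "colored": calling it only builds a future.
async fn add_async(a: i32, b: i32) -> i32 {
    a + b
}

/// Toy executor: polls the future in a loop and counts the polls.
fn polls_until_ready<F: Future>(fut: F) -> (usize, F::Output) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (polls, out);
        }
    }
}
```

Nothing runs until something polls the future, which is why a runtime from outside the standard library is needed for real work.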

3.1. Polymorphism

  • C language
    • static dispatch: macros (text replacement), either the generic selection expression (_Generic) or hand-written ones
    • dynamic dispatch: hand-written, usually via casting (function) pointers
  • C++
    • static dispatch: operator and function overloading (latter with function call overhead)
    • dynamic dispatch: virtual functions
  • Rust
    • static dispatch: operator overloading, trait bounds, declarative macros, procedural macros
    • dynamic dispatch: operator overloading, trait objects
  • Zig
    • static dispatch: compile-time function evaluation (CTFE) with tagged enum
    • dynamic dispatch: comptime construction and usage of (tagged) unions and vtables
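
The difference between the two Rust dispatch forms, as a minimal sketch. The trait Shape, the types Square and Rect and both functions are hypothetical names chosen for the example.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Static dispatch: a trait bound. The compiler generates one copy
/// of this function per concrete type it is called with.
fn area_static<S: Shape>(shape: &S) -> f64 {
    shape.area()
}

/// Dynamic dispatch: trait objects. Each call goes through a vtable,
/// so one slice can hold different concrete types.
fn total_area_dynamic(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```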

3.2. Comptime-execution [please provide benchmark sources, if you have any]

  • C language
    • The macro processor runs at compile-time and can be extended to a high-level functional language called Order.
  • C++
    • extends C-properties
    • constexpr: all calls must be constant expressions, heap objects not allowed (before C++20; C++20 allows transient allocations)
    • template metaprogramming is known for creating a lot of boilerplate and for slow compile-time execution
  • Rust
    • constant expressions: might be evaluated at compile-time
    • constant context: always evaluated at compile-time
    • const functions: upon call from constant context the function is evaluated at compile-time
  • Zig
    • arbitrary CTFE, with limitations in IO (for portability), async, assembler and memoization; closures are forbidden for compilation efficiency
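
The three Rust bullets can be seen side by side in a short sketch. The names factorial, FACT_10 and SCRATCH are hypothetical.

```rust
/// A const function: usable at runtime, and evaluated by the
/// compiler when called from a constant context.
const fn factorial(n: u64) -> u64 {
    let mut acc = 1;
    let mut i = 2;
    while i <= n {
        acc *= i;
        i += 1;
    }
    acc
}

/// Constant context: the initializer of a `const` item is always
/// evaluated at compile time.
const FACT_10: u64 = factorial(10);

/// Array lengths are a constant context as well.
static SCRATCH: [u8; factorial(3) as usize] = [0; factorial(3) as usize];
```

A call such as factorial(5) inside an ordinary function is only a constant expression, so the compiler may fold it at compile time but does not have to.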

4.1. Compilation-Speed (for numbers, see https://github.com/nordlow/compiler-benchmark)

  • C language
    • fast
    • external tooling for caching compilation units
    • debug: no optimisations to reduce linking time
    • macros not optimally implemented
  • C++
    • slow
    • external tooling for caching compilation units
    • debug: no optimisations to reduce linking time
    • templates create a lot of boilerplate
  • Rust
    • slow
    • built-in caching system for compilation artefacts
    • debug: no optimisations to reduce linking time
    • macros notoriously slow, static traits slow, dynamic traits ok
  • Zig
    • very fast (once self-hosted)
    • built-in caching system for compilation artefacts
    • debug: in-binary patching to modify the GOT for all function calls
    • comptime runs at roughly CPython speed, but is relatively parallelizable

4.2. Runtime-Speed

Runtime speed is better compared by use case. The overall reachable speed is comparable, so what follows are the criteria that matter for runtime performance in each language.

  • C language
    • portability
    • code size
    • simplicity
  • C++
    • simple to build abstractions for objects and object-specific operators
    • provide (functional) optimisations inside language
    • duck-type abstractions
  • Rust
    • memory safety for lifetimes that form derivation trees
    • provide (functional) optimisations inside language
    • Prolog-like logic checks of interfaces via the type system
    • absence of data races and related errors (but not deadlocks)
  • Zig
    • portability
    • code size
    • simplicity (including macro replacement)
    • development experience (dev speed, dev control and understandability inside code reviews)

=> personal opinion: C for portability, C++ for graph stuff, Rust for tree stuff, Zig if

  1. functional optimisations are not worth the extra compile time (or are provided/implementable),
  2. lifetimes are not tree-like or not worth the extra compile time,
  3. the multithreading design and its problems are understood,
  4. deriving logic checks is not worth the extra compile time,
  5. development experience matters (to me this means that the complexity or semantic assumptions of the language do not prevent tooling and developers from fixing it).

Graph stuff in Zig needs further exploration.

5. Complexity handling

  • C language
    • (1) macros select and merge exported file-parts (or file as a whole)
    • (2) differentiation between headers and source files to simplify (1)
    • file as compilation unit
    • only struct members are namespaced
    • imports are always file-scoped, enum members also
  • C++
    • macros to glue things together
    • file as compilation unit
    • (3) namespaces to scope and limit access of C language parts (1, 2)
    • objects and namespaces are namespaced, but must also be handled redundantly in the build system, otherwise they lead to linker errors
  • Rust
    • interface inheritance with constraints (traits)
    • several types of macros, which are namespaced
    • crates as compilation units, modules as crate namespaces
    • lifetimes are inferred automatically or can be annotated explicitly
  • Zig
    • comptime select and merge file-parts (or file as a whole)
    • file as compilation unit
    • everything is namespaced individually
    • usingnamespace to mix all public declarations of an operand into the scope of an object, i.e. for forward declarations
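
For the Rust row, "modules as crate namespaces" looks roughly like this. The module and function names (geometry, square, rect, area, scale) are hypothetical.

```rust
mod geometry {
    pub mod square {
        pub fn area(side: u32) -> u32 {
            side * side
        }
    }

    pub mod rect {
        // Same function name as in `square`; the module path disambiguates.
        pub fn area(width: u32, height: u32) -> u32 {
            scale(width, height)
        }

        // Private helper: not visible outside `rect`.
        fn scale(width: u32, height: u32) -> u32 {
            width * height
        }
    }
}
```

Callers write geometry::square::area(3) or geometry::rect::area(2, 5). A call to geometry::rect::scale from outside the module is rejected at compile time.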