@Snaipe
Last active August 29, 2015 14:23

Experiments on building high level abstractions with C

Introduction: facts vs bullshit (5 min)

C ⊄ C++

"C is just C++ without classes"

Standard C has not been a subset of C++ since C99 (1999)!

  • Compound literals
  • Designated initializers
  • Flexible array members
  • Variable length arrays
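To make the list concrete, here is a small sketch that uses all four features in a few lines of C99. The names (`point`, `packet`, `packet_new`, `vla_sum`, `c99_demo`) are made up for illustration. Note that C++20 later adopted a restricted form of designated initializers, and that variable length arrays became an optional feature in C11.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct point { int x, y; };

/* Flexible array member: `data` has no size of its own, the allocation
 * decides how many bytes follow the header. */
struct packet {
    size_t len;
    unsigned char data[];
};

static int sum_point(const struct point *p) { return p->x + p->y; }

/* Returns a heap-allocated packet holding a copy of `len` bytes, or NULL. */
struct packet *packet_new(const void *bytes, size_t len)
{
    struct packet *p = malloc(sizeof (*p) + len);
    if (p) {
        p->len = len;
        memcpy(p->data, bytes, len);
    }
    return p;
}

/* Sums 0..n-1 through a variable length array (n must be > 0). */
int vla_sum(int n)
{
    int values[n];                 /* size is only known at run time */
    int total = 0;
    for (int i = 0; i < n; ++i) values[i] = i;
    for (int i = 0; i < n; ++i) total += values[i];
    return total;
}

int c99_demo(void)
{
    /* Designated initializer: members are named, in any order. */
    struct point a = { .y = 2, .x = 1 };
    /* Compound literal: an unnamed object whose address can be taken. */
    int s = sum_point(&(struct point){ .x = 10, .y = 20 });
    return a.x + a.y + s;
}
```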

C is concise

  • It's easier to have syntactic smells in C++ than in C (mostly because of template horrors)
  • Casts are also simpler in C: there is a single cast syntax, and it leaves full control to the programmer (but with great power comes great responsibility)
  • Types are tailored to a specific use instead of being generic (this also has a major drawback: you have to reinvent the wheel every time)

C can be safe

  • If you avoid casts and restrict your use of void pointer conversions, you get solid safety guarantees enforced by the type system
  • Opaque types help to provide a data type without unsafely exposing its internals
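A minimal sketch of the opaque type idea, squeezed into one file for brevity. The `stack` type and its functions are hypothetical. In a real project the first part would live in a header and the second in a `.c` file, so user code could only ever hold a `struct stack *` and never touch the fields.

```c
#include <stddef.h>
#include <stdlib.h>

/* ---- public interface (what a header such as stack.h would expose) ---- */
struct stack;                                  /* incomplete type: layout hidden */
struct stack *stack_new(size_t capacity);
int  stack_push(struct stack *s, int value);   /* 0 on success, -1 if full */
int  stack_pop(struct stack *s, int *out);     /* 0 on success, -1 if empty */
void stack_free(struct stack *s);

/* ---- private implementation (what would stay in stack.c) ---- */
struct stack {
    size_t size;
    size_t capacity;
    int items[];
};

struct stack *stack_new(size_t capacity)
{
    struct stack *s = malloc(sizeof (*s) + capacity * sizeof (int));
    if (s) {
        s->size = 0;
        s->capacity = capacity;
    }
    return s;
}

int stack_push(struct stack *s, int value)
{
    if (s->size == s->capacity)
        return -1;
    s->items[s->size++] = value;
    return 0;
}

int stack_pop(struct stack *s, int *out)
{
    if (s->size == 0)
        return -1;
    *out = s->items[--s->size];
    return 0;
}

void stack_free(struct stack *s) { free(s); }
```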

C can be used for convenient abstractions and syntactic sugar

  • Better varargs
  • keyword arguments
  • Macro pseudo metaprogramming
  • All the examples below
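Keyword arguments, for instance, can be approximated with a variadic macro that forwards its arguments into a designated initializer. The sketch below is one possible way to do it, not necessarily the one used in the talk. `rect_area`, `area_args` and the `sentinel_` member are names invented for the example, and a zero field is treated as "not given", so an explicit 0 cannot be passed.

```c
struct area_args {
    int sentinel_;   /* keeps the initializer non-empty when no keyword is given */
    int width;       /* 0 means "use the default of 1" */
    int height;      /* 0 means "use the default of 1" */
    int scale;       /* 0 means "use the default of 1" */
};

/* The real function: takes every option bundled in one struct. */
int rect_area_impl(struct area_args args)
{
    int width  = args.width  ? args.width  : 1;
    int height = args.height ? args.height : 1;
    int scale  = args.scale  ? args.scale  : 1;
    return width * height * scale;
}

/* The sugar: rect_area(.width = 3, .height = 4) builds a compound literal
 * whose unmentioned members are zero-initialized. */
#define rect_area(...) \
    rect_area_impl((struct area_args){ .sentinel_ = 0, __VA_ARGS__ })
```

Calls then read like `rect_area(.width = 3, .height = 4)` or even `rect_area()`, with the arguments in any order.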

ISO C is fairly limited compared to GNU C

  • GNU extensions built mostly for the Linux kernel -> Stronger developer community for C on *nix

Smart pointers & RAII: safe memory management in C (35 min)

A typical day in a C programmer's life (5 min)

  • malloc/use/free cycle for long term storage
  • nontrivial to track for complex programs
  • many prefer automatic storage and passing pointers as arguments -> but stack size is limited

Is there a way to reproduce the behaviour of automatic storage on the heap?

Enter RAII (10 min)

In C++ we have native RAII and smart pointers

  • Smart pointers denote ownership of the dynamic memory, with unique_ptr having only one owner and shared_ptr having multiple ones.
    • Memory allocated with unique_ptr is freed when the owning smart pointer is destroyed; and memory allocated with shared_ptr is freed when all the owning smart pointers are destroyed.
  • RAII is simply cleaning up the resources you used in an object's destructor.
    • Known examples are standard streams

In C however, we are pretty limited, but there are workarounds.

  • MSVC: __try/__finally
    • Exception safe on windows
    • Simple but verbose, you still need to manually call the cleanup function
  • GCC: __attribute__((cleanup(fun)))
    • Variable attribute
    • Calls fun with a pointer to the variable being cleaned up when exiting the scope
    • Bypassed by noreturn functions (exit, longjmp) and signals

Both are nonstandard C, however.
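A small sketch of the GCC attribute in action (Clang accepts it too). The `handle` type and the `open_handles` counter are made up for the example. The point is that the cleanup function runs on every path out of the scope, early returns included.

```c
int open_handles;   /* instrumentation for the example only */

struct handle { int id; };

static struct handle handle_open(int id)
{
    ++open_handles;
    return (struct handle){ id };
}

/* Cleanup functions receive a pointer to the variable leaving scope. */
static void handle_close(struct handle *h)
{
    (void)h;
    --open_handles;
}

int use_handle(int fail_early)
{
    __attribute__((cleanup(handle_close))) struct handle h = handle_open(42);

    if (fail_early)
        return -1;   /* handle_close(&h) still runs */
    return h.id;     /* runs here too, after the return value has been read */
}
```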

Building automatic storage variables on the heap (2 min)

  • Introducing autofree: the keyword that automatically frees your dynamically allocated memory
  • Same concept can be applied for an autoclose and autofclose keyword.
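One plausible way to build `autofree` is a macro over the cleanup attribute. This is a sketch assuming GCC or Clang. `autofree_cleanup`, `autofree_calls` and `duplicate_length` are hypothetical names, and the counter exists only so the effect can be observed.

```c
#include <stdlib.h>
#include <string.h>

int autofree_calls;   /* instrumentation for the example only */

/* Receives the address of the pointer variable, hence the double indirection. */
static inline void autofree_cleanup(void *ptr_to_ptr)
{
    free(*(void **)ptr_to_ptr);
    ++autofree_calls;
}

#define autofree __attribute__((cleanup(autofree_cleanup)))

size_t duplicate_length(const char *s)
{
    autofree char *copy = malloc(strlen(s) + 1);
    if (!copy)
        return 0;            /* cleanup runs, free(NULL) is a no-op */
    strcpy(copy, s);
    return strlen(copy);     /* copy is freed once this value is computed */
}
```

An `autoclose` or `autofclose` would follow the same pattern, with `close` or `fclose` in the cleanup function.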

Problem: it's still not good enough

  • Nested pointers in structure are untouched
  • Variables cannot live outside the scope

Wrapping malloc and adding destructors (5 min)

We introduce smalloc and sfree to wrap malloc and free and implement our special logic

  • We attach metadata by reserving some space before the allocated memory zone
  • We get destructors!
  • sfree becomes the universal deallocator, akin to delete in C++
  • let's make a smart keyword to automatically call sfree when going out of scope

What we get is a simple working implementation of unique_ptr
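The steps above can be sketched as follows. This is a simplified illustration, not the actual implementation from the talk. The `smalloc` signature, the `s_header` union and the `person` example are assumptions, and the `smart` macro relies on the GCC cleanup attribute. The header is padded to `max_align_t` (C11) so the pointer handed to the user stays suitably aligned.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef void (*dtor_fn)(void *ptr);

/* Metadata stored immediately before the memory returned to the user. */
union s_header {
    struct { dtor_fn dtor; } meta;
    max_align_t align_;              /* keeps the user block aligned */
};

void *smalloc(size_t size, dtor_fn dtor)
{
    union s_header *h = malloc(sizeof (*h) + size);
    if (!h)
        return NULL;
    h->meta.dtor = dtor;
    return h + 1;                    /* user memory starts after the header */
}

void sfree(void *ptr)
{
    if (!ptr)
        return;
    union s_header *h = (union s_header *)ptr - 1;
    if (h->meta.dtor)
        h->meta.dtor(ptr);           /* destructor first, then the block */
    free(h);
}

static inline void sfree_scoped(void *ptr_to_ptr) { sfree(*(void **)ptr_to_ptr); }
#define smart __attribute__((cleanup(sfree_scoped)))

/* Example: a struct with a nested pointer that plain free() would leak. */
struct person { char *name; };
int persons_destroyed;               /* instrumentation for the example only */

static void person_dtor(void *ptr)
{
    struct person *p = ptr;
    free(p->name);
    ++persons_destroyed;
}

int person_demo(void)
{
    smart struct person *p = smalloc(sizeof (*p), person_dtor);
    if (!p)
        return -1;
    p->name = malloc(6);
    if (p->name)
        strcpy(p->name, "Alice");
    return p->name ? (int)strlen(p->name) : -1;
}   /* sfree(p) runs here: person_dtor frees name, then the block goes away */
```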

Reference counting and thread safety shenanigans (7 min)

  • We need to be able to share a pointer between multiple owners, hence we add a counter to the prepended metadata.
  • sfree decrements the counter, and we add a sref function that increments it; if the counter goes to 0, free is called on the memory block.
  • however this is not thread safe; and adding a lock would slow allocation down horribly
  • We get to (manually!) implement atomic operations over size_t (because C11 is still not that widespread)
  • We also get to enjoy race conditions if we don't carefully think the code through
  • To add even more complexity we need to implement weak pointers that require another independent counter for weak pointer instances
  • And everything needs to be thread safe

After countless hours of pain we get a nice shared_ptr implementation.
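For comparison, here is roughly what the strong-reference part looks like when C11 `<stdatomic.h>` is available. That is a shortcut: the text above describes hand-written atomics precisely because C11 could not be assumed. The names `shared_alloc`, `rc_header` and `note_release` are hypothetical, and weak pointers are left out entirely.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

union rc_header {
    struct {
        atomic_size_t refs;          /* number of owners */
        void (*dtor)(void *);
    } meta;
    max_align_t align_;              /* keeps the user block aligned */
};

/* Allocates a block owned by exactly one reference. */
void *shared_alloc(size_t size, void (*dtor)(void *))
{
    union rc_header *h = malloc(sizeof (*h) + size);
    if (!h)
        return NULL;
    atomic_init(&h->meta.refs, 1);
    h->meta.dtor = dtor;
    return h + 1;
}

/* Adds an owner and returns the same pointer. */
void *sref(void *ptr)
{
    union rc_header *h = (union rc_header *)ptr - 1;
    atomic_fetch_add_explicit(&h->meta.refs, 1, memory_order_relaxed);
    return ptr;
}

/* Drops an owner; the last one out destroys and frees the block. */
void sfree(void *ptr)
{
    if (!ptr)
        return;
    union rc_header *h = (union rc_header *)ptr - 1;
    /* fetch_sub returns the previous value: 1 means we were the last owner */
    if (atomic_fetch_sub_explicit(&h->meta.refs, 1, memory_order_acq_rel) == 1) {
        if (h->meta.dtor)
            h->meta.dtor(ptr);
        free(h);
    }
}

int released;                        /* instrumentation for the example only */
void note_release(void *ptr) { (void)ptr; ++released; }
```

The decrement uses acquire-release ordering so that writes made by other owners are visible before the destructor runs. The increment can be relaxed because the caller already holds a reference.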

Wrapping up and the final result (6 min)

  • smalloc takes a lot of parameters by now, so we wrap it up in a macro
  • we introduce the unique_ptr and shared_ptr macros that take a type instead of a size
  • we also introduce smart arrays under the same interface (this one was hard)

After all this, we get an extremely satisfying result.

Criterion: doing modern unit testing with C (40 min)

Unit testing in C: what a sad world we live in (10 min)

  • Current (known) unit testing frameworks for C are a mess
  • They are stable but require lots of boilerplate code
  • You need to care about the implementation before adding tests
  • Most other languages have an automatic test registration mechanism
  • Most other languages don't force you to set up your test runner

Framework hall of shame:

  • CUnit is horribly verbose and you get to manually clean up its garbage (CU_cleanup_registry() extravaganza)
  • Check, WTF ARE YOU DOING WITH MACROS AND M4 (BAD CHECK, BAD)

What we actually need:

  • Default entry point that sets up the test runner with sane defaults and provides useful command line parameters
  • Automatic test registration
  • Test & Suite metadata set with a declarative syntax

Producing a state-of-the-art library (10 min)

In order to stand a chance against existing solutions, we need to be feature-complete

  • xUnit structure (tests, suites, fixtures, runner)
  • TAP 12 support
  • Custom logging & reporting
  • Command-line arguments & environment variables to alter runtime behaviour
  • Test filtering and matching (globbing for simplicity)
  • Be crash-resilient; DO NOT TRUST USER CODE
    • Also test crashes (signals) where they are expected

But also:

  • Being simple
  • Being extensible
  • Being solid (smart pointers help 😉)

Getting rid of the entry point (5 min)

  • Static and shared libraries can provide a default main that can be overridden by user code
  • We implement a main function that uses sane defaults, command line parameters and environment variables
  • We provide a hook mechanism to allow the user to handle events from the runner without needing to provide a main

Tackling the automatic test registration (5 min)

  • GCC/Clang/MSVC allow you to put data in arbitrary sections
  • We use this to associate metadata to tests, suites, and report hooks -> beware of padding between variables in a section
  • We iterate the structures when the runner is started and set up everything
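A minimal sketch of the section trick, assuming GCC or Clang on an ELF platform with GNU ld or lld, which define `__start_<name>` and `__stop_<name>` symbols for sections whose name is a valid C identifier. The `TEST` macro, the `demo_tests` section and `run_all_tests` are made up for the example and are not Criterion's actual API. Storing pointers instead of whole structs is one way to sidestep the padding problem mentioned above.

```c
#include <stddef.h>
#include <string.h>

struct test_entry {
    const char *suite;
    const char *name;
    void (*fn)(void);
};

/* Declares the test body, its metadata, and a pointer to that metadata
 * placed in the "demo_tests" section. Pointers are all the same size and
 * alignment, so the section can be walked like a plain array. */
#define TEST(Suite, Name)                                                \
    static void Suite##_##Name##_impl(void);                             \
    static const struct test_entry Suite##_##Name##_entry = {            \
        #Suite, #Name, Suite##_##Name##_impl                             \
    };                                                                   \
    static const struct test_entry *const Suite##_##Name##_ptr           \
        __attribute__((used, section("demo_tests"))) =                   \
            &Suite##_##Name##_entry;                                     \
    static void Suite##_##Name##_impl(void)

/* Provided by the linker: bounds of the "demo_tests" section. */
extern const struct test_entry *const __start_demo_tests[];
extern const struct test_entry *const __stop_demo_tests[];

/* Runs every registered test and returns how many were found. */
int run_all_tests(void)
{
    int count = 0;
    for (const struct test_entry *const *it = __start_demo_tests;
         it < __stop_demo_tests; ++it) {
        (*it)->fn();
        ++count;
    }
    return count;
}

int checks_passed;   /* instrumentation for the example only */

TEST(math, addition)  { if (1 + 1 == 2) ++checks_passed; }
TEST(strings, length) { if (strlen("abc") == 3) ++checks_passed; }
```

Nothing ever lists the tests by hand: defining one with `TEST` is enough for `run_all_tests` to find it.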

The result & testing done easy (7 min)

  • Show sample to test malloc()

In the future (3 min)

  • Parallel test execution
  • Concolic test generation for functions
  • Windows support

Reflection: an Insight on C/C++ (30 min)

Metadata is an omnipresent god (5 min)

Programs are made by humans. However, humans are flawed. Therefore, these programs are buggy.

Hence a program cannot properly function without a debugger, which in turn cannot properly function without debugging information

  • Statements claiming that runtime reflection cannot be done are utter bullshit
  • Debuggers use runtime remote introspection all the time
  • typing var->x in your debugging console proves that it knows that x exists in var and what its offset is.
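To give an idea of what that knowledge amounts to, here is a hand-written stand-in for the field metadata a debugger reads from debugging information. It is only an illustration: the table is filled in manually with `offsetof`, whereas the point of the talk is to extract it from DWARF. The names `field_info`, `find_field` and `read_field` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

struct var { int id; double x; };

/* What a debugger knows about each member: name, offset, size. */
struct field_info {
    const char *name;
    size_t offset;
    size_t size;
};

static const struct field_info var_fields[] = {
    { "id", offsetof(struct var, id), sizeof (int)    },
    { "x",  offsetof(struct var, x),  sizeof (double) },
};

/* Looks a member up by name, or returns NULL if there is none. */
const struct field_info *find_field(const char *name)
{
    for (size_t i = 0; i < sizeof var_fields / sizeof var_fields[0]; ++i)
        if (strcmp(var_fields[i].name, name) == 0)
            return &var_fields[i];
    return NULL;
}

/* Copies the named member out of obj; returns 0 on success, -1 otherwise. */
int read_field(const void *obj, const char *name, void *out, size_t out_size)
{
    const struct field_info *f = find_field(name);
    if (!f || f->size != out_size)
        return -1;
    memcpy(out, (const char *)obj + f->offset, f->size);
    return 0;
}
```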

A tale of elves, dwarves and tears (10 min)

  • DWARF and libdwarf++
  • Traversing the DIE tree
  • Building incremental metadata

Insight in action (10 min)

  • Get TypeInfo from type, expression
  • Manipulate an object from its TypeInfo
  • Mutate its internals
  • Call any method
  • Instantiate an object
  • Having fun with the standard library

What can I do with that? (5 min)

  • Universal marshalling!
  • Object views and mirrors!
  • Wrappers and proxies!
  • Object instantiation from type info!