Author: Kirk Shoop
feature | description |
---|---|
link-time replaceable | Allows the std library to know at link-time which services to construct. Allows specified construction order of services by the std library. Prevents any need to address the complexities of runtime replacement in the specification or in the implementation (e.g. when to construct, when to destruct, how many replacements are allowed, etc.) |
canonical across all stdlib implementations | Libraries like TBB and Qt implement a replacement once and that works with all std libraries. |
user provided storage | Minimizes allocations by allowing a user to supply storage for an operation |
user-provided allocator | Allows the user to determine how operations allocate additional storage |
specified construction order | The order of the construction of services provided by the std lib is specified in start and term sections of the std |
compile-time versioned | Specify how to version types that change or extend the ABI. When additional features are added to the system context, how does that look in the type-system? std::execution::system_context -> std::execution::system_context2, or std::execution::v1::system_context -> std::execution::v2::system_context? |
run-time versioned | The user can inspect the run-time version of the implementation and use that to select a different code-path. This allows an implementation of v2 system_context to be provided by the std library as a v1 system_context. When that v1 system_context is passed by an app or a library to a library that is aware of v2, the library can recognize at runtime that the provided v1 system_context has a v2 implementation and use the v1 system_context as a v2 system_context |
run-time extensible | A replacement can expose additional functionality and code has a mechanism to query for that functionality at runtime. An example might be that replacements provided by hpux, TBB, Qt, and ASIO might expose tuning options and additional features like IO |
backward-compatible | It is specified how existing code using the system_context and existing replacements of the system_context are not affected by changes to the ABI over time. The ABI explicitly specifies how to add features, remove features, and change features without breaking any existing users, libraries, std libraries, or compiled binaries. |
feature | Paper | POC godbolt |
---|---|---|
link-time replaceable | ✅ | ✅ |
canonical across all stdlib implementations | ✅ | ✅ |
user provided storage | ✅ | ✅ |
user-provided allocator | ❌ | ❌ |
specified construction order | ❌ | ✅ |
compile-time versioned | ❌ | ✅ |
run-time versioned | ❌ | ✅ |
run-time extensible | ❌ | ✅ |
backward-compatible | ❌ | ✅ |
- An ABI should be defined in a way that avoids or prevents common programming mistakes
  - mistakes like: buffer overruns, deadlocks, races, resource leaks, bad casts, units represented by storage types like ‘int’, unversioned ABIs
  - examples:
    - Use struct with ptr and count instead of separate pointer and count arguments
      - Add functions to manipulate these safely
      - Write adapters to idiomatic forms in each language (std::span etc.)
    - Use struct for each unit that contains a value member instead of native arithmetic types
      - Add functions to manipulate these safely
      - Write adapters to idiomatic forms in each language (std::chrono::nanoseconds etc.)
    - Define a canonical unit as the abstraction unit and then do conversions to and from the canonical unit. A balance must be achieved in the abstraction that favors safety.
      - If nanometers, then no inches, etc.
      - If kelvin, then no fahrenheit or celsius, etc.
      - Rationale
        - reduces the size of the abstraction API
        - reduces bugs
        - increases safety
      - Effects
        - increases errors in the values as they are converted
        - increases overhead as they are converted
- An ABI should provide structured-concurrency
  - all resources are tied to (async-)function-scopes
  - exclude
    - create_handle() -> handle & close_handle(handle) -> void
- An ABI should be replaceable
  - link-time replacement of a function in order to replace a whole component with a different implementation of the ABI
- An ABI should be ‘canonical’
  - same across all std library implementations
- An ABI should be versioned
  - features added in new versions
  - features deprecated in old versions
  - features replaced by new versions
- An ABI should be backward-compatible (no limits on changes to the std library that modify the ABI)
  - old binaries are unaffected by
    - features added
    - features deprecated
    - features replaced
- An API, expressed in terms of an ABI, will likely need to be supported for 10+ years
Sketch of such an ABI for the C++ system-executor
- C++
- C
Microsoft defined COM in the early 1990s
COM used a tool called 'midl' to generate C and C++ definitions of interfaces that were structs with vtables in C and abstract bases in C++.
Each interface has an associated uuid that uniquely identifies that interface. interface + uuid are immutable. This creates ABI-stability and backwards-compatibility.
Change occurs by:
- adding an interface + uuid that extends an existing interface + uuid
- creates a new vtable that includes all of the old vtable functions and appends new functions to the end of the vtable
- adding a new interface + uuid that replaces an existing interface + uuid
- new vtable does not include the functions from the vtable of an existing interface + uuid
- add an interface + uuid to the implementation of an object
- remove an interface + uuid from the implementation of an object
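The "extend by appending" rule above can be sketched with plain function-pointer tables, which is roughly how such an interface looks from C. This is an illustration under assumptions: `greeter_v1`, `greeter_v2`, `counter`, and `old_caller` are hypothetical names, and the uuid that would accompany each table is omitted.

```cpp
#include <cassert>

// Hypothetical object state shared by both versions of the interface.
struct counter {
    int calls = 0;
};

// v1 of a hypothetical interface: a table of function pointers.
struct greeter_v1 {
    int (*hello)(counter* self);
};

// v2 extends v1: the v1 table stays first and new functions are appended,
// so the layout that v1 callers were compiled against never changes.
struct greeter_v2 {
    greeter_v1 v1;
    int (*world)(counter* self);
};

inline int hello_impl(counter* self) { return ++self->calls; }
inline int world_impl(counter* self) { return self->calls += 10; }

// One implementation provides the extended table.
inline const greeter_v2 greeter_table{{&hello_impl}, &world_impl};

// An old caller, compiled against v1 only, sees just the v1 prefix.
inline int old_caller(const greeter_v1* table, counter* self) {
    return table->hello(self);
}
```

A caller that knows v2 uses `greeter_table.world`, while an old binary keeps calling through `&greeter_table.v1` and is unaffected by the addition.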
The COM model of using an abstract base interface paired with a unique uuid value has a proven track record of allowing:
- extensive modifications of object implementations with no impact on existing binaries using those implementations
- major features added with no impact on existing binaries that are using other features
Every COM interface includes IUnknown.
IUnknown has two services.
- ref-count managed lifetime
- interface selection by uuid
All COM interfaces allow changing to a different interface by calling QueryInterface with a uuid for the interface desired. The query will fail when the underlying object does not implement the given interface + uuid.
Replicate? Yes
uuid assigned by generating new uuids from scratch for each interface
Replicate? No
COM QueryInterface has a `void**` argument in which the selected interface pointer is placed.
Replicate? No
All COM interfaces include IUnknown.
The way to check two different interfaces for Object Identity depends on queries for IUnknown being canonical across all interfaces on the same object.
If two interfaces are queried for exactly IUnknown, then the pointers returned from both queries must have the same value when the given interfaces were retrieved from the same underlying object.
Replicate? Yes
All COM interfaces share Ownership of the underlying object. There are no Ref semantics. There are no Unique semantics.
Replicate? No
COM created an integer value type that was expected to hold a superset of all errors across all time, across all features.
Replicate? No
Rationale:
- callers must handle every possible error at every call site
- usually this is reduced to a binary if ()
COM created its own string type. A BSTR was a pointer to a string that was preceded by a 4-byte length prefix (allowing embedded null chars in a string). It also required that BSTRs were allocated from a dedicated BSTR allocator.
Replicate? No
Rationale:
- negative indexing to retrieve untyped struct members and then `reinterpret_cast` to a type is not safe
- forcing allocation limits the usage in no-alloc targets
- designating a single allocator implementation does not allow using fit-for-purpose allocators
VARIANT is a tagged union of a fixed set of POD types. This has been a severe limitation when designing interfaces that will pass values whose type is not known at the time that the interface + uuid was defined.
Replicate? No
The uuid v5 namespace ids for abi interfaces and objects:
namespace std::abi::v1 {
struct interface_id_t {
inline static constexpr uuid_t namespace_id = /* 31e5da35-7745-46f4-a241-7cac52889433 */;
uuid_t value;
};
struct object_id_t {
inline static constexpr uuid_t namespace_id = /* 584fc8d0-15ac-4c43-a5be-07b4e9f68dc3 */;
uuid_t value;
};
} // namespace std::abi::v1
The abstract base struct, `std::abi::v1::base::interface_t`, that all interfaces derive from.
namespace std::abi::v1::base {
enum class select_interface_expected_t : std::uint8_t {
select_interface_expected_uninitialized = 0,
not_implemented = 1,
implemented = 2
};
struct interface_t;
struct select_interface_result_t {
select_interface_expected_t expected;
object_id_t object_id;
interface_id_t interface_id;
interface_t* value;
};
struct interface_t {
inline static constexpr interface_id_t id{/* "base::interface_t" */};
virtual ~interface_t() {}
virtual auto select_interface(interface_id_t) noexcept -> select_interface_result_t = 0;
};
} // namespace std::abi::v1::base
The ref wrapper to simplify correct usage of base::interface_t
namespace std::abi::v1::base {
template<std::derived_from<interface_t> Interface>
struct ref_t {
using interface_t = Interface;
inline static constexpr interface_id_t id{interface_t::id};
~ref_t() noexcept { value = nullptr; }
ref_t() = delete;
explicit ref_t(object_id_t oid, Interface* i) noexcept :
object_id(oid),
value(i) {
if (!i) { std::terminate(); }
}
ref_t(const ref_t&) = default;
ref_t& operator=(const ref_t&) = default;
ref_t(ref_t&& o) noexcept :
object_id(o.object_id),
value(std::exchange(o.value, nullptr)) {
}
ref_t& operator=(ref_t&& o) noexcept {
object_id = o.object_id;
value = std::exchange(o.value, nullptr);
return *this;
}
template<std::derived_from<interface_t> DesiredInterface>
auto select_interface() const noexcept
-> std::expected<
ref_t<DesiredInterface>,
std::tuple<interface_id_t, object_id_t>> {
select_interface_result_t result = value->select_interface(DesiredInterface::id);
if (select_interface_expected_t::implemented != result.expected) {
// report the failure through the error channel of std::expected
return std::unexpected(std::make_tuple(DesiredInterface::id, result.object_id));
}
return ref_t<DesiredInterface>{result.object_id, static_cast<DesiredInterface*>(result.value)};
}
Interface* get_native() const noexcept {
return value;
}
object_id_t get_object_id() const noexcept {
return object_id;
}
private:
object_id_t object_id;
interface_t* value;
};
} // namespace std::abi::v1::base
The implementation of a destructible object
namespace std::abi::v1::object {
struct interface_t : base::interface_t {
inline static constexpr interface_id_t id{/* "object::interface_t" */};
virtual auto destruct() noexcept -> void = 0;
};
template<class Object>
struct ref_t : base::ref_t<Object::interface_t> {
~ref_t() {
Object::object_t* o = std::exchange(object, nullptr);
o->destruct();
}
explicit ref_t(Object::object_t* o, Object::interface_t* i) noexcept :
base::ref_t<Object::interface_t>(Object::id, i),
object(o) {
if (!i) { std::terminate(); }
}
private:
Object::object_t* object;
};
} // namespace std::abi::v1::object
The implementation of an object factory
namespace std::abi::v1::factory {
enum class construct_expected_t : std::uint8_t {
construct_expected_uninitialized = 0,
not_implemented = 1,
implemented = 2
};
struct construct_result_t {
construct_expected_t expected;
object_id_t object_id;
interface_id_t interface_id;
base::interface_t* base;
object::interface_t* value;
};
struct storage_info_t {
construct_expected_t expected;
std::uint32_t size;
std::uint16_t alignment;
};
struct storage_t {
std::byte* begin;
std::byte* end;
};
struct interface_t : base::interface_t {
inline static constexpr interface_id_t id{/* "factory::interface_t" */};
virtual auto get_storage_info(object_id_t, base::interface_t*) noexcept -> storage_info_t = 0;
virtual auto construct(object_id_t, storage_t, base::interface_t*) noexcept -> construct_result_t = 0;
};
template<std::derived_from<interface_t> Factory>
struct ref_t : base::ref_t<Factory> {
template<class Object>
auto get_storage_info(base::ref_t<Object::arguments_t> args) const noexcept
-> std::expected<
std::tuple<std::uint32_t, std::uint16_t>,
object_id_t> {
// get_native() lives in a dependent base class, so it is reached through this->
storage_info_t info = this->get_native()->get_storage_info(Object::id, args.get_native());
if (construct_expected_t::implemented != info.expected) {
return std::unexpected(Object::id);
}
return std::make_tuple(info.size, info.alignment);
}
template<class Object>
auto construct(std::span<std::byte> storage, base::ref_t<Object::arguments_t> args) const noexcept
-> std::expected<
object::ref_t<Object>,
std::tuple<interface_id_t, object_id_t>> {
// std::span iterators are not guaranteed to be pointers, so build the range from data()
construct_result_t result = this->get_native()->construct(
Object::id, storage_t{storage.data(), storage.data() + storage.size()}, args.get_native());
if (construct_expected_t::implemented != result.expected) {
return std::unexpected(std::make_tuple(Object::interface_t::id, result.object_id));
}
return object::ref_t<Object>{
static_cast<Object::object_t*>(result.value),
static_cast<Object::interface_t*>(result.base)};
}
};
} // namespace std::abi::v1::factory
The implementation of an empty object
namespace std::abi::v1 {
struct empty_object_t : base::interface_t {
inline static constexpr object_id_t id{/* "empty_object_t" */};
using object_t = object::interface_t;
using arguments_t = base::interface_t;
using interface_t = base::interface_t;
auto select_interface(interface_id_t desired) noexcept -> select_interface_result_t override {
// initializers follow the member order: expected, object_id, interface_id, value
if (interface_t::id == desired) {
return {select_interface_expected_t::implemented, id, interface_t::id, static_cast<interface_t*>(this)};
}
return {select_interface_expected_t::not_implemented, id, desired, nullptr};
}
};
inline static constexpr empty_object_t empty_object{};
inline static constexpr base::ref_t<base::interface_t> ref_empty_object{empty_object_t::id, &empty_object};
} // namespace std::abi::v1
uuid is an IETF RFC https://datatracker.ietf.org/doc/html/rfc4122
There are multiple versions of uuid defined. v5 (rfc4122#section-4.3) has namespacing properties that fit an ABI very well.
A version 5 (SHA-1) UUID is created using SHA-1 to hash a "name" string and a "namespace" uuid.
A namespace uuid is a normal uuid generated in the normal way, for example with uuid_create().
A v5 uuid is created by starting with a namespace uuid, using SHA-1 to hash the namespace uuid and a given "name" string together, and using the output of the SHA-1 hash to produce the v5 uuid.
Hana Dusikova started writing a constexpr uuid_t: https://github.com/hanickadot/cthash/blob/feature/uuid/examples/uuid.cpp
For an ABI, an abstract base
struct mylib::my_interface : std::abi::v1::base::interface_t {..};
would have a v5 uuid built by
uuid_v5_create(std::abi::v1::interface_id_t::namespace_id, "mylib::my_interface");
Each ABI version would generate a new interface namespace uuid and a new object namespace uuid, and use those to create all the v5 uuids for the interfaces and objects in that version of the ABI.
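The v5 construction described above can be sketched in a few dozen lines. This is a run-time illustration under assumptions, not the constexpr `uuid_t` the paper refers to: `uuid_bytes_t` is a hypothetical stand-in for `uuid_t`, `uuid_v5_create` borrows the name used above but its signature is invented here, and the SHA-1 routine is a plain, unoptimized implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

using uuid_bytes_t = std::array<std::uint8_t, 16>;  // hypothetical stand-in for uuid_t

inline std::uint32_t rotl32(std::uint32_t x, int n) { return (x << n) | (x >> (32 - n)); }

// Plain SHA-1 over a byte buffer, written for clarity rather than speed.
inline std::array<std::uint8_t, 20> sha1(std::vector<std::uint8_t> msg) {
    std::uint32_t h[5] = {0x67452301u, 0xEFCDAB89u, 0x98BADCFEu, 0x10325476u, 0xC3D2E1F0u};
    const std::uint64_t bits = static_cast<std::uint64_t>(msg.size()) * 8;
    msg.push_back(0x80);  // padding: a single 1 bit, then zeros
    while (msg.size() % 64 != 56) msg.push_back(0);
    for (int i = 7; i >= 0; --i) msg.push_back(static_cast<std::uint8_t>(bits >> (8 * i)));
    for (std::size_t off = 0; off < msg.size(); off += 64) {
        std::uint32_t w[80];
        for (int i = 0; i < 16; ++i) {
            const std::uint8_t* p = &msg[off + 4 * i];
            w[i] = (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
                   (std::uint32_t{p[2]} << 8) | std::uint32_t{p[3]};
        }
        for (int i = 16; i < 80; ++i) w[i] = rotl32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
        std::uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
        for (int i = 0; i < 80; ++i) {
            std::uint32_t f, k;
            if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999u; }
            else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1u; }
            else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDCu; }
            else             { f = b ^ c ^ d;                   k = 0xCA62C1D6u; }
            const std::uint32_t t = rotl32(a, 5) + f + e + k + w[i];
            e = d; d = c; c = rotl32(b, 30); b = a; a = t;
        }
        h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
    }
    std::array<std::uint8_t, 20> out{};
    for (int i = 0; i < 20; ++i) out[i] = static_cast<std::uint8_t>(h[i / 4] >> (24 - 8 * (i % 4)));
    return out;
}

// v5 uuid: SHA-1 over (namespace bytes + name), truncated to 16 bytes,
// with the version and variant bits overwritten as RFC 4122 requires.
inline uuid_bytes_t uuid_v5_create(const uuid_bytes_t& ns, std::string_view name) {
    std::vector<std::uint8_t> input(ns.begin(), ns.end());
    input.insert(input.end(), name.begin(), name.end());
    const auto digest = sha1(std::move(input));
    uuid_bytes_t id{};
    for (std::size_t i = 0; i < 16; ++i) id[i] = digest[i];
    id[6] = static_cast<std::uint8_t>((id[6] & 0x0F) | 0x50);  // version 5
    id[8] = static_cast<std::uint8_t>((id[8] & 0x3F) | 0x80);  // RFC 4122 variant
    return id;
}

// Canonical 8-4-4-4-12 lowercase hex form.
inline std::string to_string(const uuid_bytes_t& id) {
    std::string s;
    char buf[3];
    for (std::size_t i = 0; i < 16; ++i) {
        if (i == 4 || i == 6 || i == 8 || i == 10) s += '-';
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(id[i]));
        s += buf;
    }
    return s;
}
```

With the RFC 4122 DNS namespace (6ba7b810-9dad-11d1-80b4-00c04fd430c8) and the name "python.org", this produces 886313e1-3b8a-5372-9b90-0c9aee199e5d. For the ABI, the namespace would instead be a namespace id such as `interface_id_t::namespace_id` and the name would be the interface name, e.g. "mylib::my_interface".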
`void*` casts are not needed. `static_cast` is safer than `reinterpret_cast`, so having an abstract base interface struct that all other interfaces inherit from allows `reinterpret_cast` to be eliminated.
Have an abstract base struct ala `std::abi::v1::base::interface_t`.
The base interface contains a `select_interface` function that takes a `std::abi::v1::interface_id_t`.
The `select_interface` function returns `std::abi::v1::base::select_interface_result_t`.
The `select_interface_result_t` struct has four members: `expected`, `object_id`, `interface_id`, and `value`.
- `expected` reports the expected conditions: `not_implemented` or `implemented`
- `object_id` reports the uuid of the object implementing the given `interface_id`
- `interface_id` carries the uuid of the interface that the result refers to
- `value` contains a pointer that can be `static_cast` to the interface associated with the selected `interface_id`
This is an ABI form of `dynamic_cast` that requires no RTTI from the compiler. All the type information (interface + uuid) is defined by the ABI.
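The selection mechanism described above can be reduced to a small, compilable sketch. It is an illustration under assumptions: integer ids replace the v5 uuids, the result struct omits `object_id`, and the names `speaker_interface`, `walker_interface`, `dog`, and `select_as` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for interface_id_t; a real ABI would use a v5 uuid here.
using interface_id = std::uint32_t;

enum class select_expected : std::uint8_t {
    uninitialized = 0,
    not_implemented = 1,
    implemented = 2
};

struct base_interface;

struct select_result {
    select_expected expected;
    interface_id id;
    base_interface* value;
};

struct base_interface {
    static constexpr interface_id id = 1;
    virtual ~base_interface() = default;
    virtual select_result select_interface(interface_id desired) noexcept = 0;
};

struct speaker_interface : base_interface {
    static constexpr interface_id id = 2;
    virtual int speak() noexcept = 0;
};

struct walker_interface : base_interface {
    static constexpr interface_id id = 3;
    virtual int walk() noexcept = 0;
};

// An object that implements speaker_interface but not walker_interface.
struct dog final : speaker_interface {
    select_result select_interface(interface_id desired) noexcept override {
        if (desired == speaker_interface::id || desired == base_interface::id) {
            return {select_expected::implemented, desired, this};
        }
        return {select_expected::not_implemented, desired, nullptr};
    }
    int speak() noexcept override { return 42; }
};

// Typed helper: the only cast is a static_cast from the common base,
// and it is performed only after the object confirmed the interface.
template <class Desired>
Desired* select_as(base_interface& from) noexcept {
    const select_result r = from.select_interface(Desired::id);
    return r.expected == select_expected::implemented ? static_cast<Desired*>(r.value) : nullptr;
}
```

The caller never names the concrete type `dog`, and no `dynamic_cast` or `typeid` is involved; the object itself decides which ids it answers to.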
- allow an object to implement N different interface + uuid and let the user select which interface + uuid they want to use.
- allow new interfaces to replace old ones.
- allow multiple representations for data. an object that supports an interface that provides a string and an interface that provides functions for a URL contained in the string.
- support a value type that has infinite variants. an object might support an interface that provides a string or instead support an interface that provides a float or instead support an interface that provides a URL.
vtable selection provides an infinite variant with no fixed size.
an object might support an interface that provides a string or instead support an interface that provides a float or instead support an interface that provides a URL, to infinity.
An object will have a point of creation into some storage and a point of destruction of the object in some storage.
ref-counting an object will be an opt-in feature of the user of an object and this will not affect the ABI of the object.
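Opt-in ref-counting, as described above, can be layered on by the user without the object knowing about it. The following is a sketch under assumptions: `widget` and `share` are hypothetical names, and `std::shared_ptr` is just one possible choice of counting wrapper (it allocates its own control block, which is the user's decision and not the object's).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical object with an explicit destruct(), constructed into caller storage.
struct widget {
    explicit widget(int* destroyed) noexcept : destroyed_(destroyed) {}
    void destruct() noexcept {
        int* d = destroyed_;  // copy first: members are gone after the destructor runs
        this->~widget();
        ++*d;
    }
private:
    int* destroyed_;
};

// Opt-in sharing: the user wraps the object and the deleter calls destruct().
// The object's own ABI has no ref-count and is unchanged.
inline std::shared_ptr<widget> share(widget* w) {
    return std::shared_ptr<widget>(w, [](widget* p) noexcept { p->destruct(); });
}
```

A user who wants unique ownership instead could call `destruct()` directly at scope exit; either choice leaves the object's interface untouched.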
A solution for storage involves some difficult tradeoffs.
There is a need to support:
- zero-alloc systems where all storage is statically known at compile-time.
- dynamic replacement of the ABI implementation - without recompiling the binary using the ABI.
- different implementations with different object sizes and alignments.
Conflicts:
- If there is a fixed size defined for each declared object, then all ABI implementations will be constrained to that size.
- If there is a dynamic query for the storage info required for each declared object, then all users of the ABI must either over-size static reserved storage based on a guess or must dynamically allocate.
It may be possible to separate the cases enough to reduce the impact of the conflicts.
- Decide that the declaration of an object includes a static size, and that implementations that support dynamic replacement must fit in that declared static size. The implementations are allowed to use internal allocations in this case.
- Decide that zero-alloc systems must recompile in order to change the implementation of the ABI. Dynamic ABI replacement is not allowed for zero-alloc systems.
- Decide that implementations that support zero-alloc systems are allowed to define replacement objects whose declarations include an implementation-specific size
- creates a new ABI that is implementation-specific and thus is not replaceable with another implementation
- allows updates of the implementation to retain other features provided by an ABI
- eg. maintain backwards compatibility, versioning, ..
Other:
The user of an object should be in control of where the object is stored. This is a requirement for zero-alloc systems that might need to be relaxed for dynamic replacement of the implementation.
Different implementations of the object are allowed to have different storage requirements.
In the case of an object exposed in an ABI, the object declaration will need to provide the information for the size and alignment. All implementations that support dynamic replacement are required to support the declared storage for their object implementation.
The constructor for an object exposed in an ABI will take a reference to the storage for the object.
The constructor will require that the storage provided is sufficient for the object. The program will terminate when there is not enough storage provided. This would happen if an implementation that did not support dynamic replacement was used with a binary that was compiled to support dynamic replacement.
Async Allocation:
It is possible for an implementation that requires larger storage to keep an internal pool of storage and allocate that to supplement the user supplied storage.
Lewis implemented an io-scheduler for Microsoft IOCP that had a fixed pool of internal state that contained IOCP records. Scheduling a new IO operation would have a fixed size that was used to add the request to an intrusive list of pending IO until an IOCP slot became available after which it was moved to an intrusive list of in-flight IO.
This approach effectively hides an async allocation from a fixed-size pool of object storage behind an async function. This works for objects when the object construction is an async-function.
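The pending and in-flight lists described above can be sketched without any platform API. This is a simplified illustration, not the IOCP scheduler itself: `io_request` and `io_pool` are hypothetical names, the in-flight set is tracked with a counter instead of a second intrusive list, and completion is driven by an explicit call rather than by the operating system.

```cpp
#include <cassert>
#include <cstddef>

// Caller-owned request. The intrusive 'next' link means queuing never allocates.
struct io_request {
    io_request* next = nullptr;
    bool in_flight = false;
};

// A pool with a fixed number of slots and an intrusive FIFO of waiting requests.
template <std::size_t Slots>
class io_pool {
public:
    // Start the request if a slot is free, otherwise park it on the pending list.
    void submit(io_request& r) noexcept {
        if (used_ < Slots) { start(r); return; }
        r.next = nullptr;
        if (pending_tail_) { pending_tail_->next = &r; } else { pending_head_ = &r; }
        pending_tail_ = &r;
    }

    // Release the finished request's slot and promote the oldest pending request.
    void complete(io_request& r) noexcept {
        assert(r.in_flight);
        r.in_flight = false;
        --used_;
        if (io_request* n = pending_head_) {
            pending_head_ = n->next;
            if (!pending_head_) { pending_tail_ = nullptr; }
            n->next = nullptr;
            start(*n);
        }
    }

    std::size_t in_flight() const noexcept { return used_; }

private:
    void start(io_request& r) noexcept { r.in_flight = true; ++used_; }

    std::size_t used_ = 0;
    io_request* pending_head_ = nullptr;
    io_request* pending_tail_ = nullptr;
};
```

From the caller's side, waiting for a slot looks the same as waiting for the operation itself, which is what lets the pool stay fixed-size.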
In many systems precondition failures and code-bugs are represented as errors.
Examples: invalid-argument, invalid-pointer, out-of-range.
There is no valid representation of a code-bug or precondition failure. These failures must result in process termination.
This is required for ABI stability for callers across ABI implementations. The alternative is that implementors have to support every creative invalid usage of an ABI in every version/implementation that provides that ABI 'bug-compatible'.
This includes each and every std library implementation needing to maintain 'bug-compatible' implementations even when that makes new features less efficient to implement or more complicated to support.
The ABI must define all preconditions for each function so that the caller of a function can guarantee that the function will not encounter a precondition failure.
A function result has expected states and output values.
expected states are specific to each function.
changing the set of expected result states for a function changes the ABI, just like changing the arguments for a function changes the ABI.
expected states provide information about the state when the function exited.
expected states can indicate:
- the state of one or more disjoint post-conditions as enum or flag-set
- corrupt-data (parsing untrusted data)
- not-found (a try-.. method did not produce a value)
- disconnected
- additional state
- end-of-file
- default-value
expected states must be handled by every caller. The design of the function ABI must balance the usage complexity of many expected states with the impact of leaving states that cannot be represented as an expected state, which will then result in program termination when those states are encountered.
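The split between expected states and precondition failures can be illustrated with a small function. This is a sketch under assumptions: `parse_digit`, `parse_digit_expected`, and `parse_digit_result` are hypothetical names that follow the `*_expected_t` pattern used elsewhere in this paper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <string_view>

// Expected states belong to this one function and are part of its ABI.
enum class parse_digit_expected : std::uint8_t {
    uninitialized = 0,
    parsed = 1,
    end_of_input = 2,
    corrupt_data = 3
};

struct parse_digit_result {
    parse_digit_expected expected;
    std::uint8_t value;  // meaningful only when expected == parsed
};

// Precondition: position <= input.size().
// A violation is a code bug, so it terminates rather than producing a result.
inline parse_digit_result parse_digit(std::string_view input, std::size_t position) noexcept {
    if (position > input.size()) { std::terminate(); }
    if (position == input.size()) { return {parse_digit_expected::end_of_input, 0}; }
    const char c = input[position];
    if (c < '0' || c > '9') { return {parse_digit_expected::corrupt_data, 0}; }
    return {parse_digit_expected::parsed, static_cast<std::uint8_t>(c - '0')};
}
```

Untrusted input and running out of data are states the caller must handle, so they appear in the enum; an out-of-range position is something the caller can always rule out beforehand, so it has no representation.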
namespace mylib::myfoo {
using namespace std::abi::v1;
struct interface_t : base::interface_t {
inline static constexpr interface_id_t id{/* "mylib::myfoo::interface_t" */};
virtual auto hello() noexcept -> void = 0;
};
template<std::derived_from<interface_t> Foo>
struct ref_t : base::ref_t<Foo> {
void hello() noexcept {
this->get_native()->hello(); // get_native() is inherited from a dependent base
}
};
} // namespace mylib::myfoo
namespace mylib::myfooworld {
using namespace std::abi::v1;
struct interface_t : myfoo::interface_t {
inline static constexpr interface_id_t id{/* "mylib::myfooworld::interface_t" */};
virtual auto world() noexcept -> void = 0;
};
template<std::derived_from<interface_t> FooWorld>
struct ref_t : myfoo::ref_t<FooWorld> {
void world() noexcept {
this->get_native()->world(); // get_native() is inherited from a dependent base
}
};
} // namespace mylib::myfooworld
namespace mylib::mynewfoo {
using namespace std::abi::v1;
struct interface_t : base::interface_t {
inline static constexpr interface_id_t id{/* "mylib::mynewfoo::interface_t" */};
virtual auto helloworld() noexcept -> void = 0;
};
template<std::derived_from<interface_t> NewFoo>
struct ref_t : base::ref_t<NewFoo> {
void helloworld() noexcept {
this->get_native()->helloworld(); // get_native() is inherited from a dependent base
}
};
} // namespace mylib::mynewfoo
namespace mylib::mybar {
using namespace std::abi::v1;
struct object_t {
inline static constexpr object_id_t id{/* "mylib::mybar::object_t" */};
using object_t = object::interface_t;
using arguments_t = base::interface_t;
using interface_t = myfoo::interface_t;
};
struct implementation_t :
object_t,
myfooworld::interface_t,
mynewfoo::interface_t,
object::interface_t {
// select_interface can be generated from a list of interface types
auto select_interface(interface_id_t desired) noexcept -> select_interface_result_t override {
myfooworld::interface_t* identity = static_cast<myfooworld::interface_t*>(this);
if (myfooworld::interface_t::id == desired) {
return implemented<myfooworld::interface_t>(identity);
} else if (myfoo::interface_t::id == desired) {
return implemented<myfoo::interface_t>(identity);
} else if (base::interface_t::id == desired) {
// casting from 'this' would be ambiguous, but casting from 'identity' is not.
return implemented<base::interface_t>(identity);
}
if (mynewfoo::interface_t::id == desired) {
return implemented<mynewfoo::interface_t>(this);
}
if (object::interface_t::id == desired) {
return implemented<object::interface_t>(this);
}
return {select_interface_expected_t::not_implemented, id, desired, nullptr};
}
auto destruct() noexcept -> void override {
this->~implementation_t();
}
// original interface
auto hello() noexcept -> void override {
this->myhello();
}
// extended interface
auto world() noexcept -> void override {
this->myworld();
}
// replacement interface
auto helloworld() noexcept -> void override {
this->myhelloworld();
}
private:
template<class Interface, class Base>
static auto implemented(Base* base) noexcept -> select_interface_result_t {
return {select_interface_expected_t::implemented, id, Interface::id, static_cast<Interface*>(base)};
}
void myhello() noexcept { /**/ }
void myworld() noexcept { /**/ }
void myhelloworld() noexcept { /**/ }
};
struct factory_t : factory::interface_t {
inline static constexpr object_id_t id{/* "mylib::mybar::factory_t" */};
// select_interface can be generated from a list of interface types
auto select_interface(interface_id_t desired) noexcept -> select_interface_result_t override {
// initializers follow the member order: expected, object_id, interface_id, value
if (factory::interface_t::id == desired) {
return {select_interface_expected_t::implemented, id, factory::interface_t::id, static_cast<factory::interface_t*>(this)};
} else if (base::interface_t::id == desired) {
return {select_interface_expected_t::implemented, id, base::interface_t::id, static_cast<base::interface_t*>(this)};
}
return {select_interface_expected_t::not_implemented, id, desired, nullptr};
}
auto get_storage_info(object_id_t desired, base::interface_t*) noexcept -> storage_info_t override {
if (object_t::id == desired) {
return {construct_expected_t::implemented, sizeof(implementation_t), alignof(implementation_t)};
}
return {construct_expected_t::not_implemented, 0, 0};
}
auto construct(object_id_t desired, storage_t storage, base::interface_t*) noexcept -> construct_result_t override {
if (object_t::id == desired) {
if (static_cast<std::size_t>(storage.end - storage.begin) < sizeof(implementation_t)) { std::terminate(); }
implementation_t* object = new(storage.begin) implementation_t();
return {
construct_expected_t::implemented,
object_t::id, object_t::interface_t::id,
static_cast<object_t::interface_t*>(object),
static_cast<object_t::object_t*>(object)};
}
return {construct_expected_t::not_implemented, desired, base::interface_t::id, nullptr, nullptr};
}
};
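factory::ref_t<factory::interface_t> get_bar_factory() noexcept {
// an abstract interface cannot be returned by value, so return a ref to one static instance
// (this assumes factory::ref_t exposes the base::ref_t constructor)
static factory_t instance{};
return factory::ref_t<factory::interface_t>{factory_t::id, &instance};
}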
} // namespace mylib::mybar
using namespace std::abi::v1;
void use_bar() {
std::vector<std::byte> storage;
factory::ref_t<factory::interface_t> fctry = mylib::mybar::get_bar_factory();
auto info = fctry.get_storage_info<mylib::mybar::object_t>(ref_empty_object);
if (!info.has_value()) { return; }
auto [size, alignment] = info.value();
storage.resize(size);
auto object = fctry.construct<mylib::mybar::object_t>(storage, ref_empty_object);
if (!object.has_value()) { return; }
auto newfoo = object.value().select_interface<mylib::mynewfoo::interface_t>();
if (newfoo.has_value()) {
newfoo.value().helloworld();
return;
}
auto fooworld = object.value().select_interface<mylib::myfooworld::interface_t>();
if (fooworld.has_value()) {
fooworld.value().hello();
fooworld.value().world();
return;
}
auto foo = object.value().select_interface<mylib::myfoo::interface_t>();
if (foo.has_value()) {
foo.value().hello();
return;
}
}