@Nekotekina
Last active January 17, 2019 00:03
Bound Class Templates Proposal

Bound Class Templates

Introduction

Bound class - a class that has direct access to non-static members of some outer class and knows its own location within this outer class. Whether it can access private or protected members of the outer class is governed by the existing friendship rules. The only proposed way to define and instantiate a bound class is through a bound class template. Non-template bound classes are not proposed.

Outer class - a class that has a member or a base class which is a bound class. An outer class itself can be a bound class, or a normal class. This term does not imply anything special.

Bound class template - a class template whose template parameter list has a special parameter, with special syntax. The presence of this template parameter turns a normal class template into a bound class template.

template <class T = int, this auto::* Self>
struct bound
{
	using outer = std::outer_t<Self>;

	T foo() const
	{
		return (T)this->z;
	}
};

This parameter is implicitly defaulted in regard to other template parameters. This parameter shall appear only once in the template parameter list. This parameter is essentially a pointer to a data member, which points to the bound class itself inside the outer class.

struct bar : bound<>, bound<void>
{
	using btype = bound<>; // Respelled base type
	
	bound<> x, y;
	int z;
	
	using xtype = decltype(x);
	using typex = bound<int, &bar::x>; // Respelled member type
	static_assert(std::is_same_v<xtype, typex>);
	
	// All three types are unique: base bound<> class, x, and y
	static_assert(!std::is_same_v<xtype, btype>);
	static_assert(!std::is_same_v<xtype, decltype(y)>);
	static_assert(!std::is_same_v<btype, decltype(y)>);
};

// Only int z; occupies the storage
static_assert(sizeof(bar) == sizeof(int));

Motivation

Bound class templates can solve multiple problems, some of which are thoroughly discussed in other proposals, for example: https://github.com/NicolBolas/Proposal-Ideas/blob/master/older/Mixins%2C%20Inner%2C%20and%20Stateless%20Classess.md

  • "Properties" with zero overhead, for example: https://gist.github.com/Nekotekina/84f333674c1fcd902115e0010d889745
  • Base classes which can access the derived class directly (unnamed bound classes) - similar to CRTP but with terser syntax: no need to spell the outer (derived) class type, and no need to static_cast the this pointer to reach its members.
  • Named subobjects with similar, potentially unlimited functionality and zero access overhead.
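For comparison, here is a minimal sketch of the CRTP boilerplate that an unnamed bound class would replace in today's C++. The names `z_reader`, `widget` and `read_widget` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// CRTP base: it has to be told the derived type explicitly.
template <class Derived>
struct z_reader
{
	int foo() const
	{
		// An explicit downcast is needed to reach members of the derived class.
		return static_cast<const Derived*>(this)->z;
	}
};

// The derived type is spelled twice: once as the class name, once as the argument.
struct widget : z_reader<widget>
{
	int z = 42;
};

// Small helper so the behaviour can be checked.
inline int read_widget()
{
	widget w;
	return w.foo();
}
```

With the proposed syntax, both the repeated `widget` argument and the `static_cast` would disappear from user code.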

Design principles

  • Keep bound classes as close to normal classes as possible.
  • Minimal special syntax to declare bound class templates. Bound class template syntax may look strange, but it is intended primarily for library writers. On the library user side, no additional syntax is required. Possible alternative: specifying the template parameter this to denote a bound class template instantiation:
struct bar : bound<int, this>, bound<void, this>
{
	bound<int, this> x; // This is verbose and less convenient; makes it harder to spell base class type
};

Rules

  • Bound class templates have exactly one template parameter with the special type this auto::*. This parameter is implicitly defaulted and cannot be explicitly defaulted, so non-defaulted template parameters shall not appear after it. The type of this template parameter is a pointer to data member, pointing from the outer class to the bound class. Its value allows converting a pointer (reference) to the outer class into a pointer (reference) to the bound class, just like a normal pointer to data member.
  • Bound class templates are instantiated only in the following contexts, with defaulted this auto::* parameter:
    • Declaring a non-static data member. Cannot declare an array. Cannot declare a pointer or a reference on the same line.
    • Declaring a non-virtual base class. A given specialization can be used as a base at most once, so that it can be spelled unambiguously using only the template parameters that precede the defaulted this auto::* parameter.
  • Upon instantiation, a unique type is created, recursively referring to itself through the this auto::* pointer type.
  • The this auto::* parameter can be matched in templates as if it were an ordinary pointer to member or auto template parameter. It can be used in a template template parameter list or in an alias template parameter list. It shall not appear in a function template parameter list.
  • No instance of the bound class shall be constructed outside of the context it has been instantiated in. The bound class behaves like an abstract class in this sense. If something like an external instance of the bound class seems necessary, I believe that a proxy object (a pointer to the outer class along with the bound class bound to the proxy) should be constructed instead.
  • Non-static members of the outer class can be accessed via this pointer, or unambiguously, via the typename of the outer class. If a bound class template is declared as a nested class template, non-static members of the outer class can be accessed directly as if they were in the scope. Non-static members of the bound class hide those of the outer class in the name lookup. Friendship rules fully apply.
  • A pointer (reference) to the bound class can be implicitly converted to a pointer (reference) to the outer class, as if the outer class were a base class of the bound class (although it is not).
  • Because every bound class has a unique type (uniquely parametrized by the pointer to member), empty bound classes occupy no space inside the outer class, just as with the empty base optimization. This optimization is mandatory and applies to members too, not only to base classes.
  • If a bound class has a virtual destructor, it's overridden so the outer class can be deleted by the pointer to the bound class, even if it's a named member.

Explanation

A pointer to data member is a ubiquitous object which carries the following information:

  1. Outer class type.
  2. Member object type, in our case a unique type recursively referring to itself.
  3. Offset value, or other implementation-specific information.

All this data describes the unique binding between the bound class and the outer class, while remaining a normal pointer to data member. When the outer class needs to access any of its members, it performs an offset computation; in the simplest case (without virtual bases) this is just adding a constant to the this pointer value. Normal data members cannot access the outer class, but a bound class does it trivially: it knows its own offset, so it can reverse the computation by subtracting a constant from its this pointer. This has practically no overhead, and it is transparent and safely hidden from the programmer.

What about complex cases with virtual bases? In this case, reversing the member address computation involves additional overhead, just as accessing data members already does. There is nothing extraordinary here: compilers are free to store as much information inside pointers to members as they need. A bound class itself cannot be a virtual base, but other classes can, and that comes at a price.
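The offset reversal described above can be approximated by hand in today's C++ for the simplest case. This is a minimal sketch under stated assumptions: `holder`, `part` and `owner_round_trip` are hypothetical names, `holder` is deliberately a standard-layout class so that `offsetof` is well defined, and the pointer arithmetic relies on behaviour that mainstream compilers accept rather than on a clear guarantee from the standard.

```cpp
#include <cassert>
#include <cstddef>

struct holder;

struct part
{
	int payload;

	// Recover the enclosing object by undoing the member offset computation.
	// Only meaningful when *this really is the member holder::p.
	holder* owner();
};

struct holder
{
	int z;
	part p;
};

inline holder* part::owner()
{
	// Subtract the constant offset of p inside holder from this.
	char* raw = reinterpret_cast<char*>(this) - offsetof(holder, p);
	return reinterpret_cast<holder*>(raw);
}

inline bool owner_round_trip()
{
	holder h{7, {1}};
	return h.p.owner() == &h && h.p.owner()->z == 7;
}
```

A bound class would have the compiler perform this subtraction, so neither the casts nor the standard-layout restriction would appear in user code.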

Alternatives

  • Duplicating the this pointer. This makes bound classes less parametrized at the cost of runtime overhead. In the absence of other template parameters, a bound class would not need to be a template at all. Its methods would have to receive all necessary information at runtime, and it seems obvious they would have to do it via two (or more) pointers.
    • Pro - can have its methods easily implemented in translation units.
    • Con - needs to pass two this pointers, to the outer class, and to the bound class itself, for calling bound class methods. Or have the pointer to the outer class implicitly stored inside the bound class, increasing its size, which is very undesirable, because it affects the bound class dramatically - it stops being trivially copyable/assignable, for example.
    • Con - scales badly with nesting depth: if a bound class needs another bound class inside, we will potentially have 3 this pointers being passed to the deepest level, or we will be unable to access the top-level outer class. Note that this is trivially possible in the current proposal and requires no additional overhead.
    • This sort of runtime overhead cannot be moved to compile time if necessary. But the opposite is easy: we can always add a "second this" parameter to some non-template method, or store additional this pointer inside it. And it may become much easier with reflection/metaclasses when they are available in C++.
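The stored-pointer variant of this alternative can be sketched in current C++ to show its cost. The names `linked_member` and `tracked_outer` are hypothetical. The point of the sketch is that a defaulted copy leaves the back pointer aimed at the original object, which is why such a class would need user-defined copy operations.

```cpp
#include <cassert>

struct tracked_outer;

// The member stores a pointer to its owner, so it is no longer empty.
struct linked_member
{
	tracked_outer* owner;

	int read_z() const;
};

struct tracked_outer
{
	int z;
	linked_member m;

	explicit tracked_outer(int v) : z(v), m{this} {}
};

inline int linked_member::read_z() const
{
	return owner->z;
}

inline bool copy_keeps_stale_back_pointer()
{
	tracked_outer a(1);
	tracked_outer b = a; // the defaulted copy duplicates the pointer verbatim
	b.z = 2;
	// b.m still refers to a, so it reads a.z instead of b.z.
	return b.m.owner == &a && b.m.read_z() == 1;
}
```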

How to implement it without bound classes

It is possible for data members, but the result is horrifying. Consider this practically working example:

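#include <cstddef>    // offsetof, std::ptrdiff_t
#include <functional> // std::invoke

// memptr_traits (not shown) is a helper exposing the class type (::base)
// and the member type (::type) of a pointer to data member
template <auto Get, auto Set, auto Offset>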
class property
{
	using outer = typename memptr_traits<decltype(Get)>::base;
	using inner = typename memptr_traits<decltype(Get)>::type;

	// Hidden copy assignment operator
	property& operator=(const property&) noexcept = default;
	
	property(const property&) noexcept = default;
	
	property() = default;

	const outer* hack_this() const
	{
		return reinterpret_cast<const outer*>(reinterpret_cast<const char*>(this) - Offset());
	}

	outer* hack_this()
	{
		return reinterpret_cast<outer*>(reinterpret_cast<char*>(this) - Offset());
	}

	friend outer;
public:
	// Invoke getter
	inner get() const
	{
		return std::invoke(Get, *hack_this());
	}

	// Invoke setter, forward return value
	decltype(auto) operator=(inner value)
	{
		// ... omitted
	}
};

struct prop_test
{
	int m_x;

	static inline std::ptrdiff_t off_x()
	{
		return offsetof(prop_test, x);
	}

	[[no_unique_address]] property<&prop_test::m_x, &prop_test::m_x, &off_x> x;
};

This implementation has several problems:

  • The main problem: offsetof is not guaranteed to work if the class is not standard layout. The proposal solves it by hiding the offset computation from the programmer - the compiler will guarantee correct behaviour regardless of the layout.
  • Construction of the property object outside of the outer class. It will lose its connection with the outer class, so the offset will be calculated incorrectly. This is impossible to prevent completely, since the private constructors are still available to the friend outer class. In the proposal, the bound class object simply cannot be constructed separately.
  • Manipulating constructors to support the workaround affects the outer class. In this proposal, the outer class can remain an aggregate, a standard layout class, a trivially copyable class, a trivially assignable class, etc.
  • In the upcoming C++20 standard, [[no_unique_address]] cannot guarantee the empty object optimization; it is just a hint. This proposal provides a reason why this optimization can be enforced for bound classes: each bound class type is unique and distinct from all other types, so no two objects of the same type will share the same address.
  • No type safety at all: two reinterpret casts for upcasting the pointer, no way to verify the offset function points to the correct member.

Proposed type traits

namespace std
{
	// Helper typedef to extract direct outer class type from the 'Self' pointer to member
	template <auto MemPtr>
	using outer_t = typename member_pointer_traits<decltype(MemPtr)>::base;
	
	// Get top-level outer class in bound hierarchy (which is not a bound class)
	template <auto MemPtr>
	using outer_top_t = ...;
	
	// If T is a bound class, return type of its direct outer class type, otherwise not defined
	template <class T>
	using outer_type_t = typename outer_type<T>::type;
	
	// Similar to outer_type_t, but return top-level outer class type, or the type itself if not bound
	template <class T>
	using outer_top_type_t = ...;
	
	// Check whether the type T is a bound class
	template <class T>
	constexpr bool is_bound_v = is_bound<T>::value;
}
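The `member_pointer_traits` helper used by `outer_t` above is not part of the current standard library. One way such a helper could be written today is sketched below. The names `memptr_traits`, `outer_of_t` and `sample` are assumptions chosen for illustration and are kept outside namespace `std`.

```cpp
#include <cassert>
#include <type_traits>

// Primary template: intentionally left undefined for non-member-pointer types.
template <class T>
struct memptr_traits;

// Partial specialization that decomposes a pointer to data member.
template <class Class, class Member>
struct memptr_traits<Member Class::*>
{
	using base = Class;  // the class the pointer is relative to
	using type = Member; // the type of the pointed-to member
};

// Counterpart of the proposed outer_t, usable with any pointer to data member.
template <auto MemPtr>
using outer_of_t = typename memptr_traits<decltype(MemPtr)>::base;

struct sample
{
	int value;
};
```

For example, `outer_of_t<&sample::value>` names `sample`, and `memptr_traits<decltype(&sample::value)>::type` names `int`.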

Problems

  • With guaranteed copy elision in C++17, we can create factory functions even for non-movable objects. But it is not possible to return a bound class object from a function, due to its uniquely-typed nature. Creating rules to allow it seems extremely tricky and is better avoided. A possible simpler workaround is to use a conversion operator:
struct bound_factory
{
	template <class T, auto Self>
	operator bound<T, Self>() const
	{
		return {};
	}
};
  • A single-purpose bound class cannot be specified without a template. The closest thing possible is:
// In x.h
struct base
{
	template <this auto::*>
	struct inner_t
	{
		int z;
		void foo();
	};

	int x;
	inner_t<> y;
	using y_type = inner_t<&base::y>;
};

// In x.cpp
template <>
void base::y_type::foo()
{
	x = z;
}

Possible solution (contextual keyword unique, not proposed):

// In x.h
struct base
{
	// Keyword `unique` turns the class into a bound class
	struct y_type unique
	{
		int z;
		void foo();
	} y;

	int x;
	//y_type z; // Error: already declared a member of y_type
};

// In x.cpp
void base::y_type::foo()
{
	x = z;
}