The current proposal for pattern matching (P0095R1) seems overly heavy to me. The following recombines existing syntax and concepts into a smaller modification providing a fairly expressive solution for handling variants.
💭
|
Instead of addressing discriminated unions with a full language level variant, let it be a concept which types like
std::variant<...> implement.
|
// the interface for interacting with discriminated unions is visit()
template <typename T>
concept Visitable = requires(T a) { visit([](auto&&){}, a); };
💭
|
Add a new statement union switch for visiting variants with less boilerplate.
|
This will just be syntactic sugar around visit()
, in much the same way range-based-for
is sugar around begin()
and end()
.
- union-switch-statement:
-
union switch ( variant-expression… ) { union-case-statement… }
- union-case-statement:
-
case case-template-parameters opt parameter-declaration-clause : statement
- case-template-parameters:
-
< template-parameter-list >
- variant-expression:
-
⇒ satisfies
Visitable
(C# has a very similar control structure: switch statement with type pattern.)
Rather than a constant expression, case labels are followed by optionally templated parameter lists.
These are used to unpack the variant-expression(s) using the robust mechanism of overload resolution.
The union switch
statement is evaluated exactly as if the variant were visited by a closure with
an overloaded call operator for each union-case-statement. Each call operator’s parameter list and body
would be identical to the union-case-statement's parameter list and labeled statement.
✏️
|
For example: |
void obligatory_and_artificial(std::variant<int, double, std::vector<int>, std::string, std::wstring> v)
{
extern int &evil_counter();
union switch (v)
{
case std::vector<int> &v:
v.push_back(evil_counter()++);
std::cout << "the vector was " << v.size() << " long" << std::endl;
case<typename Char> std::basic_string<Char> const &s:
std::cout << "the string was " << s.size() << " long" << std::endl;
case auto int_or_double:
std::cout << "it was a number: " << int_or_double << std::endl;
}
}
💩
|
This could be considered equivalent to: |
struct closure
{
void operator()(std::vector<int> &v) const
{
v.push_back(evil_counter()++);
std::cout << "the vector was " << v.size() << " long" << std::endl;
}
template<typename Char>
void operator()(std::basic_string<Char> const &s) const
{
std::cout << "the string was " << s.size() << " long" << std::endl;
}
template <typename T>
void operator()(T int_or_double) const
{
std::cout << "it was a number: " << int_or_double << std::endl;
}
using evil_counter_t = int&();
evil_counter_t &evil_counter;
};
void obligatory_and_artificial(std::variant<int, double, std::vector<int>, std::string, std::wstring> v)
{
extern int &evil_counter();
visit(closure{ evil_counter }, v);
}
Unlike a switch
statement, there is no fall through between cases in union switch
; exactly one case’s statement will be executed.
As a consequence, a union switch
wherein a visited argument list does not match is ill formed (so a union switch
with no cases
would necessarily be ill formed). As a nice side effect, the lack of fall through means we wouldn’t have to wrap declaration statements
in {}
because there would be no transfer of control into the scope of variables from different cases.
Unlike a switch
statement, a default
case should probably not be allowed because it may be impossible to declare a parameter
list which matches all visited argument lists but matches them less than any of the explicit parameter lists.
The default case would have to be equivalent to case ...
or case auto&&...
and would conflict with those cases if they were
explicitly provided. IMHO the slightly increased expressiveness
of a default case does not warrant the added subtlety of remembering the parameter list it implies.
💭
|
Since Visitable is a concept, we can easily define unboxing for existing types which act like discriminated unions.
For example, we could expose nlohmann::json as a discriminated union:
|
template <typename Visitor>
void visit(Visitor &&vis, json &j)
{
switch (j.type())
{
case json::value_t::object:
vis(j.get<json::object_t>());
break;
case json::value_t::array:
vis(j.get<json::array_t>());
break;
// ...
}
}
// now we can use union switch on jsons
void print_one_line_summary(std::ostream &os, json &j)
{
union switch (j)
{
case json::object_t const &o:
os << "object { size=" << o.size() << " }";
case json::array_t const &a:
os << "array { size=" << a.size() << " }";
case json::string_t const &s:
auto eol = s.find('\n');
if (eol == json::string_t::npos)
{
os << std::quoted(s);
}
else
{
auto extra_line_count = std::count(s.begin(), s.end(), '\n');
auto first_line = s.substr(0, eol);
os << std::quoted(first_line) << " (" << extra_line_count << " more lines)";
}
case auto &&other_scalar:
os << other_scalar;
}
}
json concatenate(json const &left, json const &right)
{
json ret;
union switch (left, right)
{
case json::object_t const &left, json::object_t const &right:
ret = left;
for (auto it = right.begin(); it != right.end(); ++it)
{
ret[it->key()] = it->value();
}
case json::array_t const &left, json::array_t const &right:
ret = left;
ret.insert(ret.end(), right.begin(), right.end());
case json::string_t const &left, json::string_t const &right:
ret = left + right;
case ...:
ret = nullptr;
}
return ret;
}
Pattern matching like this would also expedite querying for interfaces. For example, if your class
hierarchy is wired with LLVM-style RTTI
you could implement visit()
to take a pointer to the base and call getKind() then switch on the
result and cast to a pointer to the appropriate derived class. After this, you can use union switch
to down cast a pointer at whatever level you need. For example, if you needed to
union switch (decl)
{
case clang::NamedDecl *ND:
// do something NamedDecl-specific
llvm::raw_os_ostream OS(std::cout);
ND->printQualifiedName(OS);
case clang::TypeDecl *TD:
// TypeDecl inherits NamedDecl so TD is-a NamedDecl but
// this case overrides the above case for NamedDecl
auto T = TD->getTypeForDecl();
T->dump();
case ...:
decl->dump();
}
- What this doesn’t address
-
-
Pattern matching is capable of much more than just unboxing variants but the above proposal is not.
-
return
andbreak
will require some reinterpretation inside a union-case-statement. If the case statement is transposed unchanged to the body of a call operator as described abovereturn;
will terminate theunion switch
and non-void return would be ill formed. Alsobreak;
would not be allowed. This is highly non-intuitive and we’d need to replacereturn;
with abreak;
at the very least:union switch (v) { case ...: break; }
⇒struct closure { void operator()(...) const { return; } };
-
Duplicated types- I feel this is best handled by tags. Actually that’s part of why I wrote rekt
-
interesting