Python Atrocity: Structural Pattern Matching built on Exception Handling Syntax
| import sys | |
| import functools | |
| """ | |
| Python Atrocity: Structural Pattern Matching built on Exception Handling Syntax | |
| Daniel Spitz | |
| (Python 3.7+ only) | |
| The goal of this Python Atrocity is to implement extensible, composable pattern | |
| matching in a way that is syntactically remeniscent of the pattern matching in | |
| functional languages. | |
| We target exception handling syntax for this purpose. | |
| (See the very bottom for 3 demos of it in use.) | |
| Exception handling in python allows for multiple exception types to be checked | |
| sequentially, with a different code block getting executed according to which | |
| exception type matches first. It also allows for the optional binding of a new | |
| name within the scope of the handling code block (e.g. "except Exception as e:"), | |
| although binding multiple names is not supported. | |
| So with the right massaging, we ought to be able to change this: | |
| try: | |
| ... | |
| except IndexError: | |
| ... # handle case 1 | |
| except TypeError: | |
| ... # handle case 2 | |
| to this: | |
| try: | |
| raise obj | |
| except Pattern1 as p1: | |
| ... # do stuff with p1 | |
| except Pattern2 as p2: | |
| ... # do stuff with p2 | |
| Unfortunately, such a direct expression of pattern matching is not actually possible. | |
| Here are the stumbling blocks: | |
| 1. We cannot raise any old object as an exception. Only instances of BaseException can be raised. | |
| 2. We cannot use any old class in an except clause. Only subclasses of BaseException can be used. | |
| 3. The class in an except clause is determined to be a match via a direct check on the raised | |
| object's __mro__, not via the more generic and overrideable isinstance() machinery. | |
| 4. We can only bind a single new name in a given except clause, making concise destructuring | |
| assignment, a key feature of pattern matching, difficult. | |
| In this python atrocity, we attempt to address all 4 stumbling blocks, to varying levels of | |
| success (if you can really call it that). | |
| """ | |
| """ | |
| Exception handling hacking implementation. | |
| Lifted from an old python-ideas thread[0] with far less nefarious purposes in mind | |
| originally[1]. | |
| Thanks to Random832 for the key insight about using sys.exc_info() *inside the except clause*! | |
| "This is, of course, a *profoundly* ugly way to do this." - Random832 | |
| The idea here is that we can call this new function _isinstance_except() inside of | |
| each except clause, returning its result rather than the original exception class in | |
| the clause. | |
| So, this: | |
| except TypeError: | |
| pass | |
| would become | |
| except _isinstance_except(TypeError): | |
| pass | |
| The *effect* of this change would be roughly the same as if python used | |
| isinstance() to check if an exception clause is a match, rather than directly | |
| checking the __mro__. Thus do we address stumbling block #3. | |
| To address the syntactic ugliness, we can hide this call in an attribute access on | |
| the pattern class being checked: | |
| except MyPatternType.M: | |
| pass | |
| The ".M" implicitly calls _isinstance_except(), and a naive end user might | |
| simply shrug it off as standing for "Match". Ignorance is bliss. | |
| [0] https://mail.python.org/pipermail/python-ideas/2015-November/037121.html | |
| [1] https://mail.python.org/pipermail/python-ideas/2015-November/037104.html | |
| """ | |
| class _NeverThrown(BaseException): | |
| """ | |
| Exception class which is expected never to be thrown. | |
| Used to disqualify certain `except` blocks from ever | |
| matching a raised exception. | |
| """ | |
| def _isinstance_except(type): | |
| """ | |
| Normally python checks if an exception is of a certain type | |
| by looking for the type in the exception's class' __mro___. | |
| By wrapping your exception types in a call to this function we | |
| simulate what it'd be like if the core exception handling logic | |
| used isinstance() instead of explicitly checking the __mro__. | |
| This allows us to attain good, useful, worthwhile power by | |
| arbitrarily overriding __instancecheck__ on the metaclass of a | |
| new BaseException subclass. | |
| """ | |
| etype, e = sys.exc_info()[:2] | |
| if isinstance(e, type): | |
| return etype | |
| return _NeverThrown | |
| """ | |
| Structural Pattern Matching Implementation | |
| We define a collection of classes and decorators for the purposes of | |
| encapsulating Structural Patterns and the mechanisms by which they can be hooked | |
| into python's exception handling machinery. | |
| Importantly, any object raised must be an instance of BaseException, | |
| and any class appearing in an except clause must be a subclass of | |
| BaseException. | |
| We therefore define a MatchWrapper class to wrap arbitrary objects for | |
| matching. Since MatchWrapper is a subclass of BaseException, it can be | |
| raised, and can then carry along the object being matched to any of the | |
| patterns appearing in the except clauses. | |
| We likewise define a Pattern base class for all patterns intended to | |
| appear in an except clause. In order to gain control of instance-checking | |
| against Pattern subclasses, we give Pattern a metaclass with an overridden | |
| __instancecheck__ method. Together, the MatchWrapper and Pattern classes | |
| help us address stumbling blocks #1 and #2. | |
| Our overridden __instancecheck__ will dispatch to an overrideable classmethod | |
| match() to determine if a given object is an instance of the given pattern. It | |
| will also specially recognize MatchWrapper objects. If a MatchWrapper is passed | |
| in, then the wrapped object will be passed along to the match() call, rather than | |
| the wrapper itself. | |
| Additionally, if the call to match() returns something, then | |
| that result is set to the wrapper's .bindings attribute, for usage within | |
| the exception handling code block. While a bit disappointing, this was the best | |
| way I could find to address stumbling block #4. While this does not let us | |
| do destructuring assignment within the "except" line itself, it lets us do it | |
| in the first line of the handling block, via e.g.: | |
| except Pair.M as p: | |
| first, second = p.bindings | |
| ... | |
| The Pattern base class is intended to be extended by end users to | |
| define their own patterns. A function decorator, @pattern, is provided for | |
| defining simple patterns with less boilerplate than writing a class would | |
| require. | |
| On the other side of things, the match() function provides the means of | |
| pushing an object through the matching machinery, and returning the result. | |
| The notion of a "matcher" is defined as a generator that behaves in a certain | |
| way, and end users can use the @matcher decorator to bundle up their | |
| matcher generator functions with an implicit call to match(). | |
| """ | |
| class MatchWrapper(BaseException): | |
| """ | |
| Wrap any object to make it throwable. This in turn lets us | |
| "catch" any object according to arbitrary predicates, allowing | |
| us to achieve structural pattern matching with exception handling | |
| syntax! | |
| """ | |
| def __init__(self, object): | |
| self.object = object | |
| self.bindings = None | |
| class NoMatch(Exception): | |
| """ | |
| Raised by pattern matchers when they find no match. | |
| """ | |
| class PatternMeta(type): | |
| """ | |
| Metaclass for Pattern classes. Overrides | |
| __instancecheck__ to call the class' .match() method. | |
| """ | |
| def __instancecheck__(cls, instance): | |
| if isinstance(instance, MatchWrapper): | |
| try: | |
| result = cls.match(instance.object) | |
| except NoMatch: | |
| return False | |
| else: | |
| instance.bindings = result | |
| return True | |
| else: | |
| try: | |
| result = cls.match(instance, match) | |
| except NoMatch: | |
| return False | |
| else: | |
| return True | |
| @property | |
| def M(cls): | |
| """ | |
| Just syntax sugar to concisely wrap any given exception class | |
| with a call to _isinstance_except(). | |
| """ | |
| return _isinstance_except(cls) | |
| class Pattern(BaseException, metaclass=PatternMeta): | |
| """ | |
| Base class for patterns. A pattern subclass must implement | |
| match(). You can then use the pattern class in an except clause. | |
| """ | |
| @classmethod | |
| def match(cls, obj): | |
| """ | |
| Attempt to match obj, returning a bindings object (which can be anything) | |
| if successful. If the object fails to match, raise a NoMatch exception. | |
| """ | |
| raise NotImplementedError() | |
| def pattern(func): | |
| """ | |
| Convenience decorator to create simple Pattern subclasses. | |
| The decorated function becomes the match() implementation on | |
| the new subclass. It is assumed to be a static method, so must not | |
| accept a class as the first argument. | |
| """ | |
| return PatternMeta( | |
| func.__name__, | |
| (Pattern,), | |
| {'match': staticmethod(func)}) | |
| def match(m_gen, obj): | |
| """ | |
| Actually match an object against a "matcher". | |
| A matcher is a python generator which is expected to | |
| yield once, then the object being matched is wrapped in a | |
| MatchWrapper instance and thrown into the generator. | |
| The generator is then expected to handle the exception and optionally | |
| return some result. | |
| Because of all the hacks, the "matcher" generator's code can | |
| look eerily like pattern matching syntax seen in some functional | |
| languages, but with strange python boilerplate. | |
| By putting matchers into a no-arg generator with a "try: yield" at the | |
| beginning, we "abstract out" the actual process of wrapping the | |
| input object in a MatchWrapper and raising it for the user. This minimizes | |
| the needed boilerplate down to "try: yield", leaving more room for the | |
| actual beef of the logic, the matching code. Put another way, this helps | |
| further obscure the horror that's actually happening from the user, for their | |
| own psychological well-being. | |
| """ | |
| try: | |
| next(m_gen) | |
| except StopIteration as e: | |
| raise RuntimeError("Matcher raised StopIteration too early") from e | |
| try: | |
| m_gen.throw(MatchWrapper(obj)) | |
| except StopIteration as e: | |
| return e.value | |
| except MatchWrapper: | |
| raise RuntimeError('Matcher failed to match object: {}'.format(obj)) | |
| else: | |
| raise RuntimeError('Matcher did not return or raise') | |
| def matcher(genfunc): | |
| """ | |
| Decorator for matcher generator functions. This will convert | |
| the generator function into a single-argument function which | |
| invokes match() on the passed-in argument against the decorated | |
| generator function. | |
| The generator function is always called with no arguments. | |
| """ | |
| @functools.wraps(genfunc) | |
| def wrapped(x): | |
| return match(genfunc(), x) | |
| return wrapped | |
| """ | |
| Core Pattern Types | |
| Here we define a few core pattern types that permit writing a wide | |
| range of pattern matching code. | |
| Included here are: | |
| - Destructuring Tuple patterns | |
| - Dictionary patterns | |
| - Wildcard (Any) patterns | |
| - Non-destructuring type-based patterns (Instance) | |
| """ | |
| class Tuple(Pattern): | |
| """ | |
| Tuple pattern. Matches a tuple of values against one | |
| pattern for each index into the tuple. | |
| """ | |
| def __class_getitem__(cls, patterns): | |
| if not isinstance(patterns, tuple): | |
| raise TypeError('Tuple[] input must be a tuple.') | |
| return PatternMeta('Tuple', (Tuple,), {'patterns': patterns}) | |
| @classmethod | |
| def match(cls, obj): | |
| if not isinstance(obj, tuple) or len(obj) != len(cls.patterns): | |
| raise NoMatch() | |
| return tuple(p.match(o) for p, o in zip(cls.patterns, obj)) | |
| class Dict(Pattern): | |
| """ | |
| Dictionary pattern. Matches the values of a dict against the | |
| corresponding pattern for each value's key. | |
| """ | |
| def __class_getitem__(cls, patterns): | |
| if not isinstance(patterns, dict): | |
| raise TypeError('Dict[] input must be a dict.') | |
| return PatternMeta('Dict', (Dict,), {'patterns': patterns}) | |
| @classmethod | |
| def match(cls, obj): | |
| if not isinstance(obj, dict) or not obj.keys() >= cls.patterns.keys(): | |
| raise NoMatch() | |
| return {k: p.match(obj[k]) for k, p in cls.patterns.items()} | |
| def D(*args, **kwds): | |
| """ | |
| Syntax sugar for Dict patterns (especially useful for string keys): | |
| D(a=Any, b=Any) -> Dict[{'a': Any, 'b': Any}] | |
| """ | |
| result = {} | |
| for a in args: | |
| result.update(a) | |
| result.update(kwds) | |
| return Dict[result] | |
| @pattern | |
| def Any(obj): | |
| """ | |
| Matches any object, and binds it to the match's .bindings attribute. | |
| """ | |
| return obj | |
| class Instance(Pattern): | |
| """ | |
| Just matches anything that isinstance() of a given type. | |
| """ | |
| def __class_getitem__(cls, t): | |
| return PatternMeta('Instance', (Instance,), {'type': t}) | |
| @classmethod | |
| def match(cls, obj): | |
| if not isinstance(obj, cls.type): | |
| raise NoMatch() | |
| return obj | |
| """ | |
| Demo: Type-based dispatch! | |
| """ | |
| @matcher | |
| def say_type(): | |
| try: yield | |
| except Instance[str].M as s: | |
| print('got a string:', s.bindings) | |
| except Instance[int].M as i: | |
| print('got an int:', i.bindings) | |
| say_type('horse') # got a string: horse | |
| say_type(35904) # got an int: 35904 | |
| """ | |
| Demo: Add two things! | |
| """ | |
| from numbers import Number | |
| @matcher | |
| def add(): | |
| """ | |
| A matcher can only receive a single argument as input, so | |
| by convention if we are adding two arguments then we receive | |
| a 2-tuple as input. | |
| Tuple[Any, Any] ~> A 2-tuple whose contents can both be anything | |
| """ | |
| try: yield | |
| except Tuple[Any, Any].M as t: | |
| x, y = t.bindings | |
| return x + y | |
| print(add((6, 18))) # 24 | |
| print(add(('af', 'sdfg'))) # 'afsdfg' | |
| # error: | |
| # print(add((3,5,9))) # error: input is length 3, not 2 | |
| """ | |
| Demo: Destructuring python lists into linked lists! | |
| """ | |
| class Cons(Pattern): | |
| head_pattern = Any | |
| tail_pattern = Any | |
| def __class_getitem__(cls, item): | |
| head_pattern, tail_pattern = item | |
| return PatternMeta('Cons', (Cons,), { | |
| 'head_pattern': head_pattern, | |
| 'tail_pattern': tail_pattern}) | |
| @classmethod | |
| def match(cls, obj): | |
| if isinstance(obj, list) and len(obj) > 0: | |
| # return the bindings for list head and tail | |
| return cls.head_pattern.match(obj[0]), cls.tail_pattern.match(obj[1:]) | |
| raise NoMatch() | |
| @pattern | |
| def Nil(obj): | |
| if not (isinstance(obj, list) and len(obj) == 0): | |
| raise NoMatch() | |
| @matcher | |
| def sum(): | |
| """ | |
| Recursive sum() function straight outta CS101. | |
| Demonstrates how bindings can be used for destructuring rather | |
| than just returning the original object. | |
| """ | |
| try: yield | |
| except Cons[Instance[Number], Instance[list]].M as c: | |
| head, tail = c.bindings | |
| return add((head, sum(tail))) | |
| except Nil.M: | |
| return 0 | |
| print(sum([1,2,3,4])) # 10 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
This comment has been minimized.
mcspud commentedApr 26, 2019
Its beautiful.