@ryan-williams
Created March 14, 2020 17:06
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Interesting Python language features (for the Scala/FP-minded)\n",
"- [dataclasses](#Dataclasses) (`case class`es)\n",
"- [`property`/`cached_property`](#property,-cached_property) (omitting parens on 0-arg methods, memoization)\n",
"- [singledispatch](#Single-dispatch-methods) (`match`/`case` and/or typeclasses)\n",
"- [significant indentation / optional braces](#Significant-indentation,-optional-braces-in-Dotty) (coming to Dotty, from Python?)\n",
"- [varargs, object-{,de}structuring](#varargs,-object-structuring/destructuring)\n",
"- [Metaclasses](#Metaclasses)\n",
"- [Operator overloading, eDSLs](#Operator-overloading-/-eDSLs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dataclasses\n",
"Analogous to Scala `case class`es; see [`dataclass` docs](https://docs.python.org/3.8/library/dataclasses.html):"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from dataclasses import dataclass"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"@dataclass\n",
"class Foo:\n",
" a: int\n",
" b: str"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Foo(a=111, b='bbb')"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"foo = Foo(111, 'bbb'); foo"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"(111, 'bbb')"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"foo.a, foo.b"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`.__dict__` gives you a dict of the fields 😎:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'a': 111, 'b': 'bbb'}"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"foo.__dict__"
]
},
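{
"cell_type": "markdown",
"metadata": {},
"source": [
"A few more `case class`-like conveniences, as a minimal sketch (the `Point` class is a hypothetical example, not part of the cells above): `frozen=True` for immutability, `dataclasses.replace` as a rough analogue of `.copy(...)`, and `dataclasses.asdict` as a documented alternative to reaching into `.__dict__`:\n",
"\n",
"```python\n",
"from dataclasses import FrozenInstanceError, asdict, dataclass, replace\n",
"\n",
"@dataclass(frozen=True)  # frozen=True makes instances immutable\n",
"class Point:  # hypothetical example class\n",
"    x: int\n",
"    y: int\n",
"\n",
"p = Point(1, 2)\n",
"q = replace(p, y=5)  # roughly Scala's p.copy(y = 5)\n",
"\n",
"assert q == Point(1, 5)  # structural equality is generated for us\n",
"assert asdict(p) == {'x': 1, 'y': 2}\n",
"\n",
"try:\n",
"    p.x = 10  # assigning to a field of a frozen instance raises\n",
"except FrozenInstanceError:\n",
"    print('immutable!')\n",
"```\n",
"\n",
"Note that `asdict` recurses into nested dataclasses, whereas `.__dict__` is a shallow view of the instance's attributes."
]
},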
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## `property`, `cached_property`\n",
"- `property`: interesting analogue to Scala's automatic calling of nullary methods (without requiring `()` parens)\n",
"- `cached_property`: memoized `property`"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"from functools import cached_property"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"@dataclass\n",
"class Bar:\n",
" a: int\n",
" b: str\n",
"\n",
" @property\n",
" def a2(self):\n",
" a = self.a\n",
" print(f'squaring a ({a})')\n",
" return a**2\n",
"\n",
" @cached_property\n",
" def a3(self):\n",
" a = self.a\n",
" print(f'cubing a ({a})')\n",
" return a**3"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Bar(a=222, b='BBB')"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bar = Bar(222, 'BBB'); bar"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`a2` runs every time (see multiple `squaring a (222)` outputs):"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"squaring a (222)\n",
"squaring a (222)\n"
]
},
{
"data": {
"text/plain": [
"(49284, 49284)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bar.a2, bar.a2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`a3` only runs the first time (see: only one instance of `cubing a (222)`):"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"cubing a (222)\n"
]
},
{
"data": {
"text/plain": [
"(10941048, 10941048)"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bar.a3, bar.a3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Running again later: `a2` still computes, `a3` still doesn't:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"squaring a (222)\n"
]
},
{
"data": {
"text/plain": [
"49284"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bar.a2"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"10941048"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bar.a3"
]
},
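{
"cell_type": "markdown",
"metadata": {},
"source": [
"Where does the memoized value live? A minimal sketch (the `Circle` class and its `calls` counter are hypothetical, added only to make the behaviour observable): `cached_property` stores the result in the instance's `__dict__` under the property's name, so deleting the attribute forces a recompute on the next access:\n",
"\n",
"```python\n",
"from functools import cached_property\n",
"\n",
"class Circle:  # hypothetical example class\n",
"    def __init__(self, r):\n",
"        self.r = r\n",
"        self.calls = 0  # counts how often the body actually runs\n",
"\n",
"    @cached_property\n",
"    def area(self):\n",
"        self.calls += 1\n",
"        return 3.14159 * self.r ** 2\n",
"\n",
"circle = Circle(2)\n",
"circle.area\n",
"circle.area\n",
"assert circle.calls == 1  # second access hit the cache\n",
"assert 'area' in circle.__dict__  # the cached value is a plain instance attribute\n",
"\n",
"del circle.area  # drop the cached value\n",
"circle.area\n",
"assert circle.calls == 2  # recomputed after invalidation\n",
"```\n",
"\n",
"One consequence of this design: the class needs a writable instance `__dict__`, so it does not combine with `__slots__` unless `__dict__` is one of the slots."
]
},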
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Single-dispatch methods\n",
"- [`functools.singledispatch`](https://docs.python.org/3/library/functools.html#functools.singledispatch): analogous to `match`/`case`\n",
"- also isomorphic to **typeclasses**?!\n",
" - ad-hoc polymorphism\n",
" - \"evidence\" instances/implementations"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"from functools import singledispatch"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example: `Show[T]`\n",
"Declare a base method, defaulting to `raise` for unrecognized types:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"@singledispatch\n",
"def show(arg): raise NotImplementedError"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Define for `int`s:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"@show.register(int)\n",
"def _(n): return '%d' % n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Works for `int`s!"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'111'"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"show(111)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But not for other types (e.g. `str`s):"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"error!\n"
]
}
],
"source": [
"try:\n",
" show('aaa')\n",
"except NotImplementedError:\n",
" print('error!')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Define for `float`s:"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"@show.register(float)\n",
"def _(f): return '%.2f' % f"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Round to 2 decimal places; nice!"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'1.23'"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"show(1.2345)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`int`s still work:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'111'"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"show(111)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`str`s still don't:"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"error!\n"
]
}
],
"source": [
"try:\n",
" show('aaa')\n",
"except NotImplementedError:\n",
" print('error!')"
]
},
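{
"cell_type": "markdown",
"metadata": {},
"source": [
"Two further details, sketched under a hypothetical function name (`describe`, chosen so as not to clobber `show` above): since Python 3.7 `register` can read the type from the first parameter's annotation, and dispatch follows the runtime class's MRO, so subclasses pick up their parent's implementation. That second point is where the typeclass analogy gets looser, since resolution is by runtime subtype rather than by static type:\n",
"\n",
"```python\n",
"from functools import singledispatch\n",
"\n",
"@singledispatch\n",
"def describe(arg):  # hypothetical name, to keep `show` above intact\n",
"    raise NotImplementedError(f'no instance for {type(arg).__name__}')\n",
"\n",
"@describe.register\n",
"def _(n: int) -> str:  # dispatch type taken from the annotation\n",
"    return f'int {n}'\n",
"\n",
"@describe.register\n",
"def _(xs: list) -> str:  # an instance defined in terms of other instances\n",
"    return '[' + ', '.join(describe(x) for x in xs) + ']'\n",
"\n",
"assert describe(1) == 'int 1'\n",
"assert describe([1, 2]) == '[int 1, int 2]'\n",
"\n",
"# bool is a subclass of int, so it resolves to the int implementation\n",
"assert describe(True) == 'int True'\n",
"assert describe.dispatch(bool) is describe.dispatch(int)\n",
"```"
]
},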
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Significant indentation, optional braces in Dotty\n",
"- [dotty docs](https://dotty.epfl.ch/docs/reference/other-new-features/indentation.html)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## varargs, object structuring/destructuring\n",
"Python has really useful varargs and object-destructuring syntax with little analogue in Scala:\n",
"\n",
"### `*args` (varargs), `**kwargs` (\"keyword args\")\n",
"Functions can take `*args` (like varargs) and `**kwargs` (a bag of keyword args):"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"def fn(*args, **kwargs):\n",
" print(f'args: {args}')\n",
" print(f'kwargs: {kwargs}')"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"args: (111, 'aaa', False)\n",
"kwargs: {'c': 'some', 'd': 'keyword', 'e': 'args'}\n"
]
}
],
"source": [
"fn(\n",
" 111, 'aaa', False, \n",
" c='some',\n",
" d='keyword',\n",
" e='args',\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can even have named positional args before `*args`/`**kwargs`:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"def fn2(first, second, *varargs, **other):\n",
" print(f'first: {first}')\n",
" print(f'second: {second}')\n",
" print(f'args: {varargs}')\n",
" print(f'kwargs: {other}')"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"first: 111\n",
"second: aaa\n",
"args: (False,)\n",
"kwargs: {'c': 'some', 'd': 'keyword', 'e': 'args'}\n"
]
}
],
"source": [
"fn2(\n",
" 111, 'aaa', False, \n",
" c='some',\n",
" d='keyword',\n",
" e='args',\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `**` prefix operator: object *destructuring*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This function takes `a` and `b` arguments (perhaps with similar meanings to the `a` and `b` fields of [the `Foo` class above!](#Dataclasses)):"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"def handle_ab(a, b):\n",
" print(f'a: {a}, b: {b}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Perhaps we ended up with an object with `a` and `b` fields:"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'a': 111, 'b': 'bbb'}"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ab = foo.__dict__; ab"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can melt it right into a function that wants such named fields:"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a: 111, b: bbb\n"
]
}
],
"source": [
"handle_ab(**ab)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(note the `**` in `**ab`)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are some Big Ideas lurking in this kind of pattern; I don't exactly have the vocabulary to articulate them, but I've definitely struggled to mimic them in Scala.\n",
"\n",
"Lots of {shapeless,magnolia}-style typeclass-derivation is accomplished very easily in Python with the `**` operator. In a way, the named args to a function define a product-type, and you can play with "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### object *structuring* (\"comprehensions\"): `{ _: _ for _ in _ }`\n",
"This function wants an `a`, `b`, and `c`:"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"def handle_abc(a, b, c):\n",
" print(f'a {a}, b {b}, c {c}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have an `a` and `b` in a dict ([from `foo` above](#Dataclasses)):"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'a': 111, 'b': 'bbb'}"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ab"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Perhaps someone gave us a `c` separately:"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"c = [4,5,6]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Inline `ab`'s fields in a new object, with some extra fields to boot!"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'a': 111, 'b': 'bbb', 'c': [4, 5, 6], 'd': False}"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"abcd = { **ab, 'c': c, 'd': False }; abcd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Convert easily between `list`s and `dict`s:"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[('a', 111), ('b', 'bbb'), ('c', [4, 5, 6]), ('d', False)]"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pairs = [ (k,v) for k,v in abcd.items() ]; pairs"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Back to a `dict` (this is the same as the original `abcd` above):"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'a': 111, 'b': 'bbb', 'c': [4, 5, 6], 'd': False}"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"{ k:v for k,v in pairs }"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Error: handle_abc() got an unexpected keyword argument 'd'\n"
]
}
],
"source": [
"try:\n",
" handle_abc(**abcd)\n",
"except TypeError as e:\n",
" print(f'Error: {e}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Slice only certain fields from the object we're inlining:"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a 111, b bbb, c [4, 5, 6]\n"
]
}
],
"source": [
"handle_abc(\n",
" **{ \n",
" k:v\n",
" for k,v \n",
" in abcd.items() \n",
" if k in {'a','b','c'} \n",
" }\n",
")"
]
},
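{
"cell_type": "markdown",
"metadata": {},
"source": [
"The hand-written `{'a','b','c'}` filter above can be derived from the function itself. A minimal sketch, assuming a hypothetical helper name (`call_with_matching`) and a `handle_abc` variant that returns its string instead of printing it; `inspect.signature` is the standard-library way to read a function's parameter names:\n",
"\n",
"```python\n",
"import inspect\n",
"\n",
"def call_with_matching(fn, fields):  # hypothetical helper\n",
"    # Keep only the entries of `fields` that `fn` declares as parameters\n",
"    names = inspect.signature(fn).parameters\n",
"    return fn(**{k: v for k, v in fields.items() if k in names})\n",
"\n",
"def handle_abc(a, b, c):  # variant of the function above that returns the string\n",
"    return f'a {a}, b {b}, c {c}'\n",
"\n",
"abcd = {'a': 111, 'b': 'bbb', 'c': [4, 5, 6], 'd': False}\n",
"assert call_with_matching(handle_abc, abcd) == 'a 111, b bbb, c [4, 5, 6]'\n",
"```\n",
"\n",
"This sketch only handles plain named parameters; a function that itself takes `*args`/`**kwargs` would need extra care."
]
},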
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Metaclasses\n",
"Meta-classes are a powerful hook that classes can opt into:\n",
"- at *`class`-creation time*:\n",
" - the `class`'s name, base classes, and namespace (a `dict` of all its attributes and methods, by name) are passed to the metaclass\n",
" - arbitrary reflections/modifications can be performed\n",
" - whatever the metaclass returns (normally a class built from the possibly-modified namespace) becomes the actual `class`\n",
"- at *instance-creation time*:\n",
" - `__new__` and `__init__` are just methods on the class\n",
" - they can also be arbitrarily modified during `class`-creation time above\n",
"\n",
"### Example: locally-cloned gists (annotated fields automatically persisted to an on-disk cache):\n",
"- [example metaclass (`Meta`)](https://gitlab.com/runsascoded/ur/-/blob/0da85dad427c0a0d923e0dd52c9fbb9c77a2dc19/pclass/dircache.py#L8)\n",
" - from [the `ur` library](https://pypi.org/project/ur/) (something I wrote that imports code from notebooks/gists, even remotely! 😱)\n",
" - class fields tagged with `@field` or `@directfield` are:\n",
" - memoized (only computed once)\n",
" - persisted to an on-disk cache\n",
"- [Example usage: `class Gist`](https://gitlab.com/runsascoded/ur/-/blob/0da85dad427c0a0d923e0dd52c9fbb9c77a2dc19/gist.py#L144)\n",
" - represents local clones of GitHub Gists\n",
" - fields overview:\n",
" ![](./gist-py.png)\n",
" - `clone` and `user` are automatically-disk-cached fields\n",
" - `clone` is itself a `Meta`-class with further cached fields\n",
"\n",
"#### Demo:\n",
"To see it in action:\n",
"- install the `ur` module (live in this notebook! if it's not already installed)\n",
"- import the `Gist` class"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [],
"source": [
"from sys import executable as python\n",
"!{python} -m pip install --quiet ur\n",
"from gist import Gist"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Instantiating a `Gist` object clones it into an `.objs` cache locally (if it's not already present):"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Cloning https://gist.github.com/4e2f21fa1d3493674ac52c766a02e637 into .objs/Gist/4e2f21fa1d3493674ac52c766a02e637/clone\n"
]
},
{
"data": {
"text/plain": [
"Commit(id=e1b58e5a4dc482fcc7eda7d96e07edf6c9efbaec)"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"gist = Gist.from_url('https://gist.github.com/ryan-williams/4e2f21fa1d3493674ac52c766a02e637'); gist"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A `Commit` object is returned (a reference to the top/`HEAD` commit/revision on the gist).\n",
"\n",
"Check out the `.objs` cache:"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1m.objs\u001b[00m\r\n",
"├── \u001b[1mCommit\u001b[00m\r\n",
"│   └── \u001b[1me1b58e5a4dc482fcc7eda7d96e07edf6c9efbaec\u001b[00m\r\n",
"│   └── xml\r\n",
"└── \u001b[1mGist\u001b[00m\r\n",
" └── \u001b[1m4e2f21fa1d3493674ac52c766a02e637\u001b[00m\r\n",
" ├── \u001b[1mclone\u001b[00m\r\n",
" │   ├── _index.md\r\n",
" │   ├── gform-shuffle.png\r\n",
" │   ├── group-slacking.png\r\n",
" │   ├── jekyll.png\r\n",
" │   ├── ryan-desk.jpg\r\n",
" │   ├── ryan-mission-control.jpg\r\n",
" │   ├── slack-msg-count.png\r\n",
" │   ├── speaker-view.gif\r\n",
" │   ├── t-shirts.jpg\r\n",
" │   ├── talks-channel.gif\r\n",
" │   ├── tix.png\r\n",
" │   └── zoom-hosts.png\r\n",
" └── user\r\n",
"\r\n",
"5 directories, 14 files\r\n"
]
}
],
"source": [
"!tree .objs"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\"ryan-williams\""
]
}
],
"source": [
"!cat .objs/Gist/4e2f21fa1d3493674ac52c766a02e637/user"
]
},
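{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the class-creation hook concrete, here is a minimal, self-contained sketch. It is not `ur`'s implementation: the `field` marker and `Meta` class below are hypothetical stand-ins that only memoize in memory (no on-disk cache), assuming the metaclass rewrites tagged methods while the class is being built:\n",
"\n",
"```python\n",
"def field(fn):  # hypothetical marker decorator: just tags the function\n",
"    fn._is_field = True\n",
"    return fn\n",
"\n",
"class Meta(type):  # hypothetical metaclass, in-memory memoization only\n",
"    def __new__(mcs, name, bases, namespace):\n",
"        # Runs at class-creation time: swap each tagged method for a memoized property\n",
"        for attr, value in list(namespace.items()):\n",
"            if getattr(value, '_is_field', False):\n",
"                namespace[attr] = mcs._memoize(attr, value)\n",
"        return super().__new__(mcs, name, bases, namespace)\n",
"\n",
"    @staticmethod\n",
"    def _memoize(attr, fn):\n",
"        def getter(self):\n",
"            cache = self.__dict__.setdefault('_cache', {})\n",
"            if attr not in cache:\n",
"                cache[attr] = fn(self)  # computed at most once per instance\n",
"            return cache[attr]\n",
"        return property(getter)\n",
"\n",
"class Thing(metaclass=Meta):  # hypothetical example class\n",
"    def __init__(self):\n",
"        self.calls = 0\n",
"\n",
"    @field\n",
"    def value(self):\n",
"        self.calls += 1\n",
"        return 42\n",
"\n",
"t = Thing()\n",
"assert (t.value, t.value) == (42, 42)\n",
"assert t.calls == 1  # the body ran only once\n",
"```\n",
"\n",
"A persistent variant would replace the per-instance `_cache` dict with reads and writes against files, which is roughly the shape of the `.objs` directory shown above."
]
},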
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Operator overloading / eDSLs\n",
"Python just hardcodes names for 90% of the fancy operator overloads you might require:\n",
"- `__add__`/`__radd__` (`+` with the object as the {left,right} operand)\n",
"- `__mul__`/`__rmul__` (`*` with the object as the {left,right} operand)\n",
"- `__and__`/`__rand__` (`&` with the object as the {left,right} operand)\n",
"- etc.\n",
"\n",
"This makes it very easy to do operator overloading.\n",
"\n",
"There are even more powerful \"meta-overloads\" available:\n",
"- `__getitem__`/`__setitem__`: how should square-bracketing behave? (e.g. make `x['foo']` default to returning `None` instead of `raise`ing)\n",
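"- `__getattr__`/`__setattr__`: how should attribute access/assignment behave? `__getattr__` runs only when normal lookup fails (e.g. make `x.foo` default to returning `None` instead of `raise`ing); `__getattribute__` intercepts every access\n",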
"\n",
"This (and things like `__dict__` above) make it easy to \"delegate\" calls/accesses on an object to another, and intercept them and make them lazy.\n",
"\n",
"In fact, every Python ML framework is basically an `IO`/`Lazy`/`Deferred` monad that does some compute-graph optimizing, will render you a nice dataflow DAG, can sometimes be nested inside another similar \"`IO` monad\", etc.!\n",
"\n",
"I don't think the users (or even developers) of these frameworks even think of it this way or know that's how the literature / PLT communities would view what they've done, but they've figured it out and it works…"
]
},
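{
"cell_type": "markdown",
"metadata": {},
"source": [
"A toy illustration of the deferred-computation idea, as a sketch only (the `Expr` class and `evaluate` function are hypothetical and not taken from any real framework): the overloaded operators build an expression tree instead of computing, and a separate step interprets it later:\n",
"\n",
"```python\n",
"class Expr:  # hypothetical lazy-expression node\n",
"    def __init__(self, op, *args):\n",
"        self.op, self.args = op, args\n",
"\n",
"    def __add__(self, other):  # self + other\n",
"        return Expr('+', self, other)\n",
"\n",
"    def __radd__(self, other):  # other + self, when other's __add__ gives up\n",
"        return Expr('+', other, self)\n",
"\n",
"    def __mul__(self, other):  # self * other\n",
"        return Expr('*', self, other)\n",
"\n",
"    def __rmul__(self, other):  # other * self\n",
"        return Expr('*', other, self)\n",
"\n",
"    def __repr__(self):\n",
"        if self.op == 'var':\n",
"            return self.args[0]\n",
"        left, right = self.args\n",
"        return f'({left!r} {self.op} {right!r})'\n",
"\n",
"def evaluate(e, env):  # interpret the tree against variable bindings\n",
"    if not isinstance(e, Expr):\n",
"        return e  # plain constant\n",
"    if e.op == 'var':\n",
"        return env[e.args[0]]\n",
"    left, right = (evaluate(a, env) for a in e.args)\n",
"    return left + right if e.op == '+' else left * right\n",
"\n",
"x = Expr('var', 'x')\n",
"graph = 1 + x * 2  # nothing is computed yet, only a tree is built\n",
"assert repr(graph) == '(1 + (x * 2))'\n",
"assert evaluate(graph, {'x': 5}) == 11\n",
"```\n",
"\n",
"Because the tree is ordinary data, it could be inspected, rewritten, or rendered before anything runs, which is the property the paragraph above attributes to compute-graph frameworks."
]
},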
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## TODO:\n",
"- `functools.partial`\n",
"- notebook life"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[![](https://imgs.xkcd.com/comics/python.png)](https://xkcd.com/353/)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "nescala-3.8.1",
"language": "python",
"name": "nescala-3.8.1"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}