Why Not Python?

Hi! My name is Corbin. Several years ago, I gave up on Python as my future language of choice and joined an effort to produce Monte. The purpose of this document is to explain what the Monte team found lacking in Python.

We do not expect any action to be taken, and indeed this enumeration mostly focuses on things which cannot be changed, but we wanted to share this information so that future programming language designers have some food for thought. (Additionally, a CPython core developer requested that I prepare this list.)

To satisfy typical Python community conventions, "Python" is the common syntax and semantics of Python 2 and Python 3, if not otherwise specified, and I will mostly discuss Python 3. All REPL interactions were performed on CPython 3.6.1.

As a final reminder, this is an opinion. I have tried to mention evidence which was crucial to the formation of this opinion, but I am not trying to change anybody's mind today.

Things Python 3 Got Right

Integers

The integer type in a language should reflect the ring of integers. Python 3 fixes Python 2's misstep by unifying int and long. This is the fulfillment of the plan first outlined in PEP 237.
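A minimal sketch of what the unification means in practice, assuming CPython 3; the name `big` is only illustrative:

```python
# Python 3 integers grow as needed; there is no separate long type.
big = 2 ** 64 + 1

print(type(big) is int)   # True: same type as small integers
print(big - 2 ** 64)      # 1: arithmetic stays exact past 64 bits
```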

print is a Function

A language should strive to limit the number of special forms. PEP 3105 removes the print keyword and its special syntax.

Generalized Comprehensions

Set comprehensions and dictionary comprehensions sensibly extend comprehension syntax to cover the full range of Python's builtin collections.
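For illustration, the three comprehension forms side by side; the word list is made up for this sketch:

```python
words = ["monte", "python", "haskell", "monte"]

first_letters = [w[0] for w in words]        # list comprehension
unique_lengths = {len(w) for w in words}     # set comprehension
length_by_word = {w: len(w) for w in words}  # dictionary comprehension

print(first_letters)
print(sorted(unique_lengths))
print(length_by_word["haskell"])
```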

Names and Scopes

Python is a dynamically-scoped language. Names are not really declared, just assigned, and not all blocks introduce new scopes.

To be fair, Python is a somewhat-lexically-scoped language. However, there are many syntactic forms which let names linger beyond their expected scope. In Python 3, a list comprehension and for-loop have surprisingly different scoping rules:

>>> [i for i in range(5)]; i
[0, 1, 2, 3, 4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'i' is not defined
>>> for i in range(5):
...  i
...
0
1
2
3
4
>>> i
4

Learning which scopes are lexically contained is not hard.

More generally, Python does not have static scoping, where each lexical scope's contents are fixed at compile time. This massively frustrates static analysis; it is undecidable whether there are undefined or unused names, by Rice's theorem.
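One way to see the problem, as a sketch: the hypothetical helper below reads a global name that is never assigned by any statement in the source text, so whether the name is defined depends entirely on what ran earlier.

```python
def lookup(define_first):
    # `late_name` is a made-up name. It is only ever bound through the
    # globals() dictionary, which a purely textual analysis cannot follow
    # in general.
    if define_first:
        globals()["late_name"] = "bound at run time"
    return late_name

try:
    lookup(False)
    outcome = "found"
except NameError:
    outcome = "undefined"

print(outcome)
print(lookup(True))
```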

Audience participation time: I have found evidence in the archeological record that GvR is opposed to static scoping (http://legacy.python.org/search/hypermail/python-1994q1/0313.html) but I do not yet know why. I would love to read more information about this viewpoint.

Closure Quirk

Additionally, Python has an idiosyncratic behavior when closing over some names:

>>> l = [(lambda y: x + y) for x in range(5)]
>>> [f(0) for f in l]
[4, 4, 4, 4, 4]

Again, learning this rule is not hard, but it makes Python an odd duck amongst its peers, and is a reliable source of frustration for expatriates from other language communities. In Haskell:

GHCi, version 8.0.2: http://www.haskell.org/ghc/  :? for help
Prelude> [ f 0 | f <- [ (x +) | x <- [0..4] ] ]
[0,1,2,3,4]

In Monte:

▲> [for f in ([for x in (0..!5) fn y { x + y }]) f(0)]
Result: [0, 1, 2, 3, 4]
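For comparison, a common Python workaround is to capture the current value through a default argument. A sketch, with both function names invented for this example:

```python
def late_bound():
    # Every lambda shares the comprehension's single `x` variable,
    # so all of them see its final value.
    return [(lambda y: x + y) for x in range(5)]

def early_bound():
    # The default `x=x` is evaluated when each lambda is created,
    # freezing the value of `x` at that moment.
    return [(lambda y, x=x: x + y) for x in range(5)]

print([f(0) for f in late_bound()])   # [4, 4, 4, 4, 4]
print([f(0) for f in early_bound()])  # [0, 1, 2, 3, 4]
```

The workaround functions, but it changes each lambda's signature, which is part of why the rule surprises newcomers.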

nonlocal

PEP 3104 doubled down on global by adding nonlocal, which only increases the number of special Python-specific scoping rules.
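A short sketch of the rule in question; the counter functions are hypothetical. Assignment inside a nested function creates a new local unless the name is declared nonlocal:

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count  # rebind the enclosing function's `count`
        count += 1
        return count
    return increment

def make_broken_counter():
    count = 0
    def increment():
        # Without the declaration, the assignment makes `count` local
        # to increment(), so reading it first fails at call time.
        count += 1
        return count
    return increment

counter = make_counter()
print(counter(), counter())

try:
    make_broken_counter()()
except UnboundLocalError as exc:
    print("broken:", type(exc).__name__)
```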

Immutability

Types

Python has great support for some immutable types, but a few important types are missing. In particular, frozendict (PEP 416) is not available. Additionally, using tuple or frozenset in place of list and set can be awkward.
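The closest standard-library stand-in is probably `types.MappingProxyType`, which is a read-only view rather than an immutable mapping. A sketch with made-up data showing the difference:

```python
from types import MappingProxyType

backing = {"host": "localhost"}
view = MappingProxyType(backing)

try:
    view["host"] = "example.org"
    write_result = "written"
except TypeError:
    # The proxy itself refuses item assignment.
    write_result = "rejected"

# Anyone still holding the backing dict can change what the view shows.
backing["host"] = "example.org"

print(write_result)
print(view["host"])
```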

Names

Python has no syntax for declaring a name which cannot be reassigned. Adding injury to insult, Python generally permits shadowing builtin names.
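Both points in a few lines, as a sketch; `LIMIT` and `shadowed_len` are names invented for this example:

```python
LIMIT = 10
LIMIT = 20  # no error: nothing marks the first binding as final

def shadowed_len(items):
    # A local named `len` silently hides the builtin inside this function.
    len = lambda _: -1
    return len(items)

print(LIMIT)
print(shadowed_len([1, 2, 3]))
print(len([1, 2, 3]))  # the builtin is untouched outside the function
```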

Equality Semantics

Without getting especially philosophical (see [Baker 1992] for background), object identity and equality operators should not be so flexible. Indeed, the presence of both is and == operators is a frequent source of confusion, and Python should have only one equality operator. Python should not have is, but Python should not have its current incarnation of == either.

Suppose that Python did not have is. Then the common idiom x is None, for testing whether x is equal to None, would be unavailable. Why can't we use x == None? Because x may have a custom .__eq__() or .__ne__(). In general, by allowing objects to define their own equality arbitrarily, we lose the ability to precisely reason about object equality.
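A deliberately pathological sketch of the problem; the class `AlwaysEqual` is hypothetical:

```python
class AlwaysEqual:
    # A user-defined __eq__ may return anything, for any argument.
    def __eq__(self, other):
        return True

x = AlwaysEqual()

print(x == None)  # True: the object claims to equal None
print(x is None)  # False: identity cannot be overridden
```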

Equality should be an equivalence relation; or, in other words, equality should satisfy three laws:

  • Reflexivity: For all x, x == x
  • Symmetry: x == y if (and only if) y == x
  • Transitivity: If x == y and y == z, then x == z

In addition to the relatively large class of problems caused by user-controllable equality, Python has some non-user-defined failures:

>>> x = float("nan"); x == x
False
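The failure of reflexivity also leaks into containers, which check identity before falling back to ==. A short sketch:

```python
nan = float("nan")

print(nan == nan)             # False
print([nan] == [nan])         # True: same object, identity is checked first
print(nan in [nan])           # True, for the same reason
print(float("nan") in [nan])  # False: a different NaN object
```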

Concurrency

Python does not have a single high-level concurrency toolkit. Instead, concurrency tooling must be built ad-hoc from syscalls and libc, leading to fragmentation in the community.

Coroutines

Coroutines destroy local reasoning and reduce readability:

class Victim:
    state = "invariants holding"

    def go(self, f):
        self.state = "invariants broken"
        rv = f(42)
        self.state = "invariants holding"
        return rv

def pause(out):
    x = yield
    out()
    p = yield

v = Victim()
def out():
    print("Victim's state:", v.state)

p = pause(out)
p.send(None)
v.go(p.send)

Adding insult to injury, Python coroutines require a spurious .send(None).
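The priming requirement can be seen in isolation with a small generator; `echo` is a hypothetical example, not part of the code above:

```python
def echo():
    received = yield "ready"
    yield received

# Sending a real value to a generator that has not started is an error.
fresh = echo()
try:
    fresh.send("hello")
    first_attempt = "accepted"
except TypeError:
    first_attempt = "rejected"

# The generator must first be advanced to its first yield.
primed = echo()
greeting = primed.send(None)  # equivalent to next(primed)
reply = primed.send("hello")

print(first_attempt, greeting, reply)
```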

Object Model

Classes

Python has classes instead of object literals. This complicates the creation of singletons and creates a special so-called "class scope" where assignments have special meaning.
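One concrete consequence of class scope, sketched with a made-up `Settings` class: names assigned in the class body are visible to later statements in that body, but not to the methods defined there.

```python
class Settings:
    base = 10
    doubled = base * 2  # fine: the class body can read `base`

    def tripled(self):
        # Class scope is not an enclosing scope for methods, so this
        # looks for a global named `base` and fails.
        return base * 3

    def tripled_via_self(self):
        return self.base * 3

print(Settings.doubled)
print(Settings().tripled_via_self())

try:
    Settings().tripled()
except NameError as exc:
    print("tripled failed:", type(exc).__name__)
```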

Inheritance

Python permits class composition via inheritance. This was probably a mistake.

Plan Coordination

This section is a bland restatement of [Miller, Tribble, Shapiro 2005]. The goal is to outline an intuition for the class of bugs known as "plan interference."

Consider this trivial class:

class StatusHolder:
    def __init__(self, status):
        self._status = status
        self._listeners = []

    def addListener(self, listener):
        self._listeners.append(listener)

    @property
    def status(self):
        return self._status

    @status.setter
    def status(self, status):
        self._status = status
        for listener in self._listeners:
            listener(status)

Aborting the Wrong Plan

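A thrown exception introduces inconsistency: the status has already been updated, but listeners registered after the failing one are never notified: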

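# tag(label) builds a listener that prints its label and each status it sees.
tag = lambda t: lambda status: print(t, status)
def fail(status):
    raise Exception(status)

sh = StatusHolder("idle")
sh.addListener(fail)
sh.addListener(tag("tagged"))  # label chosen for illustration; never notified
sh.status = "going"  # raises, yet sh.status is already "going"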

Nested Subscription

Minor semantic details clobber expected behaviors. The extra prints here occur because a Python list iterator also visits items appended to the list during the iteration:

sh = None

leaf = lambda status: print("leaf", status)
def branch1(status):
    sh.addListener(leaf)
def branch2(status):
    leaf(status)
    sh.addListener(leaf)

sh = StatusHolder("idle")
sh.addListener(branch1)
sh.status = "branching"

sh = StatusHolder("idle")
sh.addListener(branch2)
sh.status = "branching"

Nested Publication

Recursively updating the status leads to some observers receiving out-of-order updates; here, observer2 sees "second" before "first":

sh = StatusHolder("idle")

def inner(status):
    print("inner", status)
    if status == "first":
        sh.status = "second"

sh.addListener(lambda status: print("observer1", status))
sh.addListener(inner)
sh.addListener(lambda status: print("observer2", status))
sh.status = "first"

Attempted Fixes

Techniques which do not work here include:

  • Changing the list of listeners to a set. This can remove double-subscription bugs, but makes ordering bugs much worse and less deterministic.
  • Copying the list of listeners before delivering status updates. This prevents reëntrancy bugs on a single status holder, but not on two or more mutually-recursive status holders.
  • Using threads to deliver updates asynchronously. Threads are not an answer to any question except the question of how one can ensure that one has a bad day.
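To make the second bullet concrete, here is a sketch of the copying variant. The class name `CopyingStatusHolder` and the `seen` log are inventions for this example. Copying stops a listener added mid-delivery from being called in the same round, but it does nothing about double subscription:

```python
class CopyingStatusHolder:
    """Hypothetical variant of StatusHolder that iterates over a snapshot."""

    def __init__(self, status):
        self._status = status
        self._listeners = []

    def addListener(self, listener):
        self._listeners.append(listener)

    @property
    def status(self):
        return self._status

    @status.setter
    def status(self, status):
        self._status = status
        # Snapshot the list so that listeners added during delivery
        # are not visited in this round.
        for listener in list(self._listeners):
            listener(status)

seen = []
holder = CopyingStatusHolder("idle")

def leaf(status):
    seen.append(("leaf", status))

def branch(status):
    holder.addListener(leaf)

holder.addListener(branch)
holder.status = "branching"
seen_after_first = list(seen)  # leaf was added but not called this round

holder.status = "done"         # branch subscribes leaf a second time
print(seen_after_first, seen, len(holder._listeners))
```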

For more details, read the paper linked above.

Capability Safety

Capability-safe design is the next great leap in semantics after memory-safe design.

  • Promises, eventual sends, vats

Perfect Encapsulation

Producing a Python object which has a private member is non-trivial. Producing a private member which cannot, under any circumstances, be accessed with plain user-level Python is probably impossible.
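Two common attempts at privacy and how plain Python reaches past them, as a sketch; `Vault` and `make_checker` are hypothetical names:

```python
class Vault:
    def __init__(self, secret):
        # Double underscores only rename the attribute to _Vault__secret.
        self.__secret = secret

def make_checker(secret):
    # Hiding the value in a closure instead of an attribute.
    def check(guess):
        return guess == secret
    return check

vault = Vault("hunter2")
leaked_attribute = vault._Vault__secret

checker = make_checker("hunter2")
# Closure cells are ordinary, inspectable objects.
leaked_closure = checker.__closure__[0].cell_contents

print(leaked_attribute, leaked_closure)
```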

Similarly, modules are neither encapsulated nor private; they are global mutable state.
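A small sketch of what that means: any code that imports a module can rebind its attributes for every other importer. The function `circumference` is made up for this example, and the original value is restored afterwards.

```python
import math

def circumference(radius):
    return 2 * math.pi * radius

original_pi = math.pi
try:
    math.pi = 3                 # visible to every user of the math module
    patched = circumference(1)
finally:
    math.pi = original_pi       # restore the shared state

print(patched)
print(circumference(1) == 2 * original_pi)
```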

Ambient Authority

Python exposes many ambient authorities, including the builtin scope, the builtin modules, and the module cache.
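As a sketch, a function that looks like a pure text utility can still reach all of these without being handed anything. The function `word_count` is hypothetical and only collects the references instead of using them:

```python
def word_count(text):
    """Counts words, and shows what else it could reach for."""
    import builtins
    import sys

    # None of these were passed in as arguments.
    reachable = {
        "open": builtins.open,            # file system access
        "import": builtins.__import__,    # any installed module
        "module_cache": sys.modules,      # every module already loaded
    }
    return len(text.split()), sorted(reachable)

print(word_count("why not python"))
```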

Unsorted

  • Guards vs. type annotations