@stroxler
Last active December 22, 2021 14:40

Considering tweaks to parentheses rules for PEP 677

This gist contains

  • A description of some major concerns I have about adding -> as a legal top-level operator in type expressions.
  • Discussion of solutions that involve requiring parentheses around callable types.
  • Two files of examples of proposed alternative syntax
    • First, a comparison of the current syntax to requiring parentheses both around the args list (as we do now) and around the entire type.
    • Then, a runthrough of the same examples if we dropped the parentheses around the arguments, and only required the outer ones (to me this syntax looks weird in isolation, but actually works very well in real code)

My worries about allowing -> as a top-level operator for creating type expressions

The more I've worked with real code examples for PEP 677, the more opposed I'm becoming to allowing a bare -> as a top-level token in types.

I still strongly favor arrow syntax over having to live with typing.Callable, so I still very much want to see PEP 677 succeed, but I've become worried about a number of situations:

  1. A lot of folks, including to some extent me, are uncomfortable with chained arrows. This is probably the single biggest concern that could sink the PEP entirely.
  2. There are some real problems defining binding order of -> vs |. Opinions are split down the middle, and it will come up often enough that it really does matter.
  3. I personally have found callables even in argument position harder to read, especially when I'm looking at the rightmost part of an arguments list.

To my mind, these concerns are big enough that I want to require that callable types be parenthesized. There are two ways we could do this.

The first is "full" parentheses, i.e. requiring outer parentheses around the current PEP 677:

((int, str) -> bool)

A more concise alternative would be to drop the parentheses on the args list, e.g.:

(int, str -> bool)

To me, this looks funny at first by itself but in code it seems okay. One concern raised is that it could look like a tuple of an int and something else, but since tuples are never legal as annotations anyway this doesn't really concern me in the context of real code - I haven't seen a single example where I thought it was confusing.
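
To make the tuple point concrete, here is a small sketch. Python itself will evaluate a parenthesized tuple in annotation position at runtime, but the result is just a tuple object rather than a type, and type checkers such as mypy reject it as an invalid annotation. The function name takes_pair is hypothetical, purely for illustration.

```python
# Hypothetical function whose parameter is annotated with a bare tuple.
# This is legal Python *syntax*, but it is not a valid type annotation.
def takes_pair(x: (int, str)) -> None:
    ...


# At runtime the annotation is stored as an ordinary tuple object.
annotation = takes_pair.__annotations__["x"]
print(type(annotation).__name__, annotation)
```

Since a tuple never carries meaning as a type, a reader who sees (int, str -> bool) in annotation position has little reason to parse it as a tuple.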

A deeper dive into each problem

1. Chained arrows

Quite a few folks have had a negative visceral reaction to either of these code samples:

def f(
  x: int,
) -> (int) -> int:          # double arrow in function signature
  return lambda y: x + y
  
f: (int) -> (int) -> int    # double arrow in callable type

There's nothing objectively wrong with this; Haskell and OCaml both have it. But it doesn't feel terribly Pythonic to me.

Also, in Python code the -> is a visual cue for end-of-function-signature. Having several of them at the top level can be distracting. To me this is the bigger concern: we can get used to a right-associative ->, but the visual noise might remain annoying forever.
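
For reference, the two samples above can be spelled today with typing.Callable, which makes the nesting implied by the chained arrows explicit. This is only a sketch: make_adder, add_three and curried are illustrative names, and it assumes the right-associative reading of -> mentioned above.

```python
from typing import Callable


# Today's spelling of:  def f(x: int) -> (int) -> int
def make_adder(x: int) -> Callable[[int], int]:
    return lambda y: x + y


# Today's spelling of:  f: (int) -> (int) -> int
# Right-associativity means the *return type* is itself a callable.
curried: Callable[[int], Callable[[int], int]] = make_adder

add_three: Callable[[int], int] = make_adder(3)
print(add_three(4))
```

The Callable form is noisier, but the brackets leave no doubt about where the inner callable starts and ends, which is the property that required parentheses would preserve.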

2. -> vs |

The PEP currently proposes that | bind tighter than ->. There are several major reasons to do it this way:

  • It follows TypeScript's lead. This is a pretty big deal, especially given how many developers use both languages.
  • It got an overwhelming majority of typing-sig's vote in early polls. To me, this suggests most users will also expect this, so doing anything else is likely to trip people up more.

There are also several major reasons to have -> bind tighter.

  • str | (int) -> bool has to be a SyntaxError if we make | bind tighter. Syntax errors are bad UX; they're just about the worst way to tell someone they got the binding order wrong. If we have -> bind tighter, the problem goes away and users who use the wrong order will get obvious errors.
  • Pradeep demonstrated that because of the tendency to use None as a default value, fully one-quarter of existing callable annotations are optional. That means they'll have to be written as <callable> | None, and has led several folks to favor having -> bind tighter.
  • It's standard that "or" binds loosest in logic (looser than implication), so prior conventions are mixed - TypeScript on the one hand, preexisting math conventions on the other.

My conundrum is that the downsides of either choice are pretty serious and could result in constant user confusion. For me, this is a big factor in favoring required parentheses so that the edge cases don't become painful.
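
To see why the choice matters, here is a sketch of the two possible readings of an unparenthesized (int) -> bool | None, written with today's typing constructs. The alias names are mine, not part of the PEP.

```python
from typing import Callable, Optional, get_args

# Reading 1: -> binds tighter, i.e. ((int) -> bool) | None.
# The callable itself is optional (the common "default is None" case).
optional_callable = Optional[Callable[[int], bool]]

# Reading 2: | binds tighter, i.e. (int) -> (bool | None).
# The callable is required, but it may return None.
callable_returning_optional = Callable[[int], Optional[bool]]

# The two readings are different types.
print(optional_callable == callable_returning_optional)
print(get_args(optional_callable))
print(get_args(callable_returning_optional))
```

With required outer parentheses, the two readings would be written ((int) -> bool) | None and ((int) -> bool | None), so the reader never has to remember a precedence rule.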

3. Right-most callable arguments can be confusing

Most of the debate about allowing bare top-level -> has focused on what happens when we chain -> in return position.

But I actually find that my eye has a little bit of trouble even in argument position. For example when I look at:

(int, str, (int, str) -> SomeBigClassName) -> bool
#                        ^ this kind of looks like the last argument when I'm skimming!

it's easy, whenever my eye jumps to the right side of the args list, to think that the function takes SomeBigClassName as the last argument. But it doesn't; that type is in the return position of a callable.

The same problem happens in function signatures, at the rightmost part of a single-line parameters list.

If we require parentheses, then there'll be an extra ) which for my eye serves as a clear cue that there's more going on here, and I no longer get confused:

(int, str, ((int, str) -> SomeBigClassName)) -> bool
#                                         ^ now it's more obvious what's going on!
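
As a sanity check on what that type actually says, here is a sketch that builds the same shape with typing.Callable and inspects it. SomeBigClassName is a stand-in class, and Inner and Outer are names I made up for illustration.

```python
from typing import Callable, get_args


class SomeBigClassName:
    """Placeholder for the class named in the example above."""


# (int, str) -> SomeBigClassName
Inner = Callable[[int, str], SomeBigClassName]

# (int, str, (int, str) -> SomeBigClassName) -> bool
Outer = Callable[[int, str, Inner], bool]

# get_args on a Callable alias yields (parameter list, return type).
params, result = get_args(Outer)
last_param = params[-1]

# The last parameter is the inner callable, not SomeBigClassName itself.
print(last_param == Inner)
print(last_param is SomeBigClassName)
```

SomeBigClassName only appears as the return type of the last parameter, which is exactly the detail that is easy to miss when skimming the unparenthesized form.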

Conclusion

At this point I strongly favor requiring outer parentheses, to the point where I might oppose my own PEP on readability grounds without them.

I like the idea of dropping required parentheses around arguments, but I can see why people might not: it's a weirder-looking syntax, particularly when you show an isolated example rather than showing it in the context of real code.

And on to the typeshed examples!

# A quick example to illustrate the two syntaxes
def f(x: str, y: int) -> bool: ...
f: ((str, int) -> bool) # full parentheses syntax
f: (str, int) -> bool # args only parentheses
# Now, some samples from typeshed (I abbreviated a few args lists so that it's easier to find the relevant part)
# full parentheses
def check_call_abbreviated(
    args: _CMD,
    bufsize: int = ...,
    preexec_fn: (() -> Any) = ...,
    close_fds: bool = ...,
) -> int: ...
# only args parenthesized
def check_call_abbreviated(
    args: _CMD,
    bufsize: int = ...,
    preexec_fn: () -> Any = ...,
    close_fds: bool = ...,
) -> int: ...
# full
class UnixDatagramServer(BaseServer):
    def __init__(
        self,
        server_address: str | bytes,
        RequestHandlerClass: ((...) -> BaseRequestHandler),
        bind_and_activate: bool = ...,
    ) -> None: ...
# only args
class UnixDatagramServer(BaseServer):
    def __init__(
        self,
        server_address: str | bytes,
        RequestHandlerClass: (...) -> BaseRequestHandler,
        bind_and_activate: bool = ...,
    ) -> None: ...
# full
def configure_abbreviated(
    self,
    foreground: _Color = ...,
    postcommand: (() -> Any) | str = ...,
    relief: _Relief = ...,
    tearoffcommand: ((str, str) -> Any) | str = ...,
    title: str = ...,
) -> Dict[str, Tuple[str, str, str, Any, Any]] | None: ...
# only args, if -> binds tighter (if | binds tighter, same as full!)
def configure_abbreviated(
    self,
    foreground: _Color = ...,
    postcommand: () -> Any | str = ...,
    relief: _Relief = ...,
    tearoffcommand: (str, str) -> Any | str = ...,
    title: str = ...,
) -> Dict[str, Tuple[str, str, str, Any, Any]] | None: ...
# full
def Pool(
    processes: Optional[int] = ...,
    initializer: ((...) -> Any) | None = ...,
    initargs: Iterable[Any] = ...,
) -> pool.Pool: ...
# only args parenthesized, if -> binds tighter (if | binds tighter, same as full!)
def Pool(
    processes: Optional[int] = ...,
    initializer: (...) -> Any | None = ...,
    initargs: Iterable[Any] = ...,
) -> pool.Pool: ...
# full
class ProcessPoolExecutor(Executor):
    def __init__(
        self,
        max_workers: int | None = ...,
        initializer: ((...) -> None) | None = ...,
        initargs: Tuple[Any, ...] = ...,
    ) -> None: ...
# only args, if -> binds tighter (otherwise, same as full)
class ProcessPoolExecutor(Executor):
    def __init__(
        self,
        max_workers: int | None = ...,
        initializer: (...) -> None | None = ...,
        initargs: Tuple[Any, ...] = ...,
    ) -> None: ...
# full
def tag_bind(
    self, tagname: str, sequence: str | None = ..., callback: ((tkinter.Event[Treeview]) -> Any) | None = ...
) -> str: ...
# only args (if -> binds tighter, otherwise same as full)
def tag_bind(
    self, tagname: str, sequence: str | None = ..., callback: (tkinter.Event[Treeview]) -> Any | None = ...
) -> str: ...
# full
class Element(MutableSequence[Element]):
    def __init__(self, tag: str | ((...) -> Element), attrib: Dict[str, str] = ..., **extra: str) -> None: ...
# only args, if -> binds tighter (otherwise same as full)
class Element(MutableSequence[Element]):
    def __init__(self, tag: str | (...) -> Element, attrib: Dict[str, str] = ..., **extra: str) -> None: ...
# full
def start_new_thread(function: ((...) -> Any), args: Any, kwargs: Any = ...) -> int: ...
# args
def start_new_thread(function: (...) -> Any, args: Any, kwargs: Any = ...) -> int: ...
# full
def takewhile(predicate: ((_T) -> Any), iterable: Iterable[_T]) -> Iterator[_T]: ...
# args
def takewhile(predicate: (_T) -> Any, iterable: Iterable[_T]) -> Iterator[_T]: ...
# full
def min(__arg1: _T, __arg2: _T, *_args: _T, key: ((_T) -> SupportsLessThanT)) -> _T: ...
# args
def min(__arg1: _T, __arg2: _T, *_args: _T, key: (_T) -> SupportsLessThanT) -> _T: ...
# full
def contextmanager(func: ((...) -> Iterator[_T])) -> ((...) -> ContextManager[_T]): ...
# args
def contextmanager(func: (...) -> Iterator[_T]) -> (...) -> ContextManager[_T]: ...
# full
class StartResponse(Protocol):
    def __call__(
        self, status: str, headers: List[Tuple[str, str]], exc_info: _OptExcInfo | None = ...
    ) -> ((bytes) -> Any): ...
# only args
class StartResponse(Protocol):
    def __call__(
        self, status: str, headers: List[Tuple[str, str]], exc_info: _OptExcInfo | None = ...
    ) -> (bytes) -> Any: ...
# full
def lru_cache(maxsize: int | None = ..., typed: bool = ...) -> (((...) -> _T) -> _lru_cache_wrapper[_T]): ...
# only args
def lru_cache(maxsize: int | None = ..., typed: bool = ...) -> ((...) -> _T) -> _lru_cache_wrapper[_T]: ...
# full
def module_for_loader(fxn: ((...) -> types.ModuleType)) -> ((...) -> types.ModuleType): ...
# only args
def module_for_loader(fxn: (...) -> types.ModuleType) -> (...) -> types.ModuleType: ...
# A quick example to illustrate
def f(x: str, y: int) -> bool: ...
# full parentheses syntax
f: ((str, int) -> bool)
# outer only
f: (str, int -> bool)
# args only
f: (str, int) -> bool
def f(x: str, y: int) -> bool: ...
f: (str, int -> bool) # standard outer-only syntax
def g() -> bool: ...
g: (() -> bool) # variant 1: use a special () for empty args
g: (-> bool) # variant 2: use a bare -> for empty args
# typeshed examples, using variant 1
def check_call_abbreviated(
    args: _CMD,
    bufsize: int = ...,
    preexec_fn: (() -> Any) = ...,
    close_fds: bool = ...,
) -> int: ...
class UnixDatagramServer(BaseServer):
    def __init__(
        self,
        server_address: str | bytes,
        RequestHandlerClass: (... -> BaseRequestHandler),
        bind_and_activate: bool = ...,
    ) -> None: ...
def configure_abbreviated(
    self,
    foreground: _Color = ...,
    postcommand: (() -> Any) | str = ...,
    relief: _Relief = ...,
    tearoffcommand: (str, str -> Any) | str = ...,
    title: str = ...,
) -> Dict[str, Tuple[str, str, str, Any, Any]] | None: ...
def Pool(
    processes: Optional[int] = ...,
    initializer: (... -> Any) | None = ...,
    initargs: Iterable[Any] = ...,
) -> pool.Pool: ...
class ProcessPoolExecutor(Executor):
    def __init__(
        self,
        max_workers: int | None = ...,
        initializer: (... -> None) | None = ...,
        initargs: Tuple[Any, ...] = ...,
    ) -> None: ...
def tag_bind(
    self, tagname: str, sequence: str | None = ..., callback: (tkinter.Event[Treeview] -> Any) | None = ...
) -> str: ...
class Element(MutableSequence[Element]):
    def __init__(self, tag: str | (... -> Element), attrib: Dict[str, str] = ..., **extra: str) -> None: ...
def start_new_thread(function: (... -> Any), args: Any, kwargs: Any = ...) -> int: ...
def takewhile(predicate: (_T -> Any), iterable: Iterable[_T]) -> Iterator[_T]: ...
def min(__arg1: _T, __arg2: _T, *_args: _T, key: (_T -> SupportsLessThanT)) -> _T: ...
def contextmanager(func: (... -> Iterator[_T])) -> (... -> ContextManager[_T]): ...
class StartResponse(Protocol):
    def __call__(
        self, status: str, headers: List[Tuple[str, str]], exc_info: _OptExcInfo | None = ...
    ) -> (bytes -> Any): ...
def lru_cache(maxsize: int | None = ..., typed: bool = ...) -> ((... -> _T) -> _lru_cache_wrapper[_T]): ...
def module_for_loader(fxn: (... -> types.ModuleType)) -> (... -> types.ModuleType): ...