@lunacodes
Last active May 10, 2016 16:03
Greg Price - Static Typing In Python Talk - 5/10/16 - Initial Notes
[NOTE: These are raw. I may have been getting a bit sick, so there are probably errors or things I just didn't hear/get. If something seems off, assume it's something I missed, not him.]
Greg Price - works w/ Guido van Rossum
Python has Static Types in Beta
Core Mypy team:
Jukka Lehtosalo - PhD research was the start of this
Guido vR
Greg
David Fisher
Reid Barton
Why Static Types:
Understanding Code
Dropbox Engineers spend 40% of their engineering time reading & trying to understand code (Dropbox survey, 2015)
Computers too (navigate, refactor, find bugs)
Understanding Code
What does this code do?
for entry in entries:
    entry.data.validate()
Want to read what validate does now
git grep "def validate" -> 45 results
This is actually true in Dropbox environment
What's the type of entry.data?
Is it a parameter? Grep for call sites
Return Value? Find that Method
Subclasses, duck typing**, generic containers (list, dict, etc)
Are there related types or a base class?
Undecidable in General
Same issue of too many possibilities to sort through
It's like a Search Tree where you hope to get everything right/clear
What's the type of (some expression)?
Lots of excuses, but:
The author had some kind of answer or expectation of what the type should have been
Did they think it could be anything? List of anything?
One could be a type T, one could be a list of things that are Type T
Static type: the expectation the author had of the (runtime) type of an expression's value (note: there are other competing definitions in academia)
Two principles:
Explicit is better than implicit!!
People do write helpful, explicit docstrings
But the more a codebase has been modified over time, you can't trust these docstrings
Issues of Mutability/Side effects as well
Checked is better than unchecked
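Sketch (mine, not from the slides): the entries loop from earlier, with the author's expectations written down as explicit types. Entry, EntryData, the size field and validate_all are all made-up names for illustration; the talk never showed the real classes.

```python
from typing import List


class EntryData:
    """Hypothetical payload type, invented for this sketch."""

    def __init__(self, size: int) -> None:
        self.size = size

    def validate(self) -> bool:
        # Stand-in rule: a payload is valid if its size is non-negative.
        return self.size >= 0


class Entry:
    """Hypothetical entry type that wraps one EntryData."""

    def __init__(self, data: EntryData) -> None:
        self.data = data


def validate_all(entries: List[Entry]) -> bool:
    # The annotation says each entry is an Entry, so entry.data is an
    # EntryData and validate() is EntryData.validate. No grep needed.
    return all(entry.data.validate() for entry in entries)
```

With the signature in place, a reader (or a checker like Mypy) can jump straight to the one validate that matters instead of sifting through every grep hit.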
PEP 484 - Static Typing in Python 2 & 3
Use Explicit is better than Implicit
Use
Mypy typechecker
Notation:
Python 3: function annotations (PEP 3107)
def gd(a: int, b: int) -> int:
Annotations are Python expressions
Design Constraint affected notation
Minimal Semantics, since the implementation hadn't been planned well
Would alter the way you'd write lists(??)
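Sketch (mine): since annotations are ordinary Python expressions, the interpreter evaluates them and stores the results on the function without enforcing anything. The gcd name and body are my guess at what the example function did; only the signature shape comes from the notes.

```python
def gcd(a: int, b: int) -> int:
    # Euclid's algorithm; this body is an assumption, not from the slides.
    while b:
        a, b = b, a % b
    return a


# The annotations are evaluated once, at definition time, and kept in a dict.
print(gcd.__annotations__)
```

Nothing checks these at runtime. It's up to a separate tool (like Mypy) to read them.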
Python 2 & 3
*** didn't get slides
Parametric polymorphism
generic containers:
from typing import List, Dict, Set, Iterable
def sum(a):
    # type: (List[int]) ->
    ...
def toposort(data):
    # type: (Dict[str, Set[str]]) -> Iterable[Set[str]]
    ...
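Sketch (mine): a runnable version of the two signatures above, using the same comment notation. The function bodies are my own guesses, the int return type on the first one is an assumption (I didn't catch it), and I renamed it total so it doesn't shadow the builtin sum.

```python
from typing import Dict, Iterable, List, Set


def total(a):
    # type: (List[int]) -> int
    result = 0
    for x in a:
        result += x
    return result


def toposort(data):
    # type: (Dict[str, Set[str]]) -> Iterable[Set[str]]
    # data maps each item to the set of items it depends on.
    # Yields sets of items whose dependencies have all been yielded already.
    remaining = {item: set(deps) for item, deps in data.items()}
    # Items that appear only as dependencies have no dependencies themselves.
    for deps in data.values():
        for dep in deps:
            remaining.setdefault(dep, set())
    while remaining:
        ready = {item for item, deps in remaining.items() if not deps}
        if not ready:
            raise ValueError("cyclic dependency")
        yield ready
        remaining = {item: deps - ready
                     for item, deps in remaining.items()
                     if item not in ready}
```

The type comments don't change how this runs; they only tell the checker (and the reader) what goes in and what comes out.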
from typing import TypeVar, Generic
T = TypeVar('T')
class MyList(Generic[T]):
    '''T is bound to *this* Instance of the class. Will not affect classes outside of this subclass. T is not shared'''
    def append(self, item):
        # type: (T) -> None
        ...
Note: Generic is an annotation, not a type; it affects how you write the type out later (when you decide what T is). It makes sure that within one instance the T's will all be the same type (meaning different instances, or different subclasses of the base class, could use different types)
The disadvantage to this notation is that you can't really indicate scope with multiple nested classes.
The "# type:" comment is what the type checker reads
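Sketch (mine): a filled-in MyList so you can see T being pinned per instance. The _items field and the first method are my additions; only the class header and append come from the notes.

```python
from typing import Generic, List, TypeVar

T = TypeVar('T')


class MyList(Generic[T]):
    def __init__(self):
        # type: () -> None
        self._items = []  # type: List[T]

    def append(self, item):
        # type: (T) -> None
        self._items.append(item)

    def first(self):
        # type: () -> T
        return self._items[0]


ints = MyList()  # type: MyList[int]
ints.append(1)

names = MyList()  # type: MyList[str]
names.append("a")
```

A checker should accept both instances but complain about something like ints.append("a"), since T is int for that instance. At runtime nothing stops you.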
Gradual Typing
Typed code co-exists with untyped code
Still want to type-check the typed code
At the boundary: type Any
Untyped code will default to type Any
If this is being passed to typed code, good idea to explicitly annotate this in the typed code
Some aspects of Python are hard to write types for
decorators that do reflection things
meta something in class (*didn't hear what he said*)
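Sketch (mine): what the typed/untyped boundary might look like. All three function names are made up. legacy_load has no annotations, so a checker treats everything about it as Any; the typed caller says so explicitly.

```python
from typing import Any, List


def legacy_load(path):
    # Untyped code: the parameter and the return value are implicitly Any.
    return [len(path), path]


def count_items(items: List[Any]) -> int:
    return len(items)


def load_and_count(path: str) -> int:
    # Explicit Any at the boundary, so readers know this value is unchecked.
    raw = legacy_load(path)  # type: Any
    return count_items(raw)
```

The typed functions still get checked against each other; only the value coming out of legacy_load is taken on faith.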
Other people's code:
Libraries you use may lack written types
Solution: write them down, in separate files ("stub files"), extension .pyi
Type checker is the only thing that will care about this
Python standard library: the "typeshed" repo - people are writing types for the standard library
See PEP 484 for Details on typing!!
Tools:
Type-checker: Mypy
http://github.com/python/mypy
Think of it as break and build
Designed for gradual adoption
Run on just the files you've given types to:
$ mypy --silent-imports file.py dir/
Other modules all become Any
This means the typechecker doesn't worry about them
Status at Dropbox:
15 eager early adopters
50,000 Lines of Code have explicit types
People love it:
100% agree "easier to read and understand"
100% "I am more productive"
45% "adding to existing code is a lot of work"
90% are glad to have done it
They have ideas about how to reduce the amount of work it is
"Refactors are already so much easier with the typings that's been added. Game changing for the sync engine!"
"This makes it sooo much easier for me to read code and figure out what parameters are supposed to be"
Missed third
Maybe annotate your favorite Python Codebase
Zulip.readthedocs.io/en/latest/mypy.html
Report any issues on Github
They respond quickly
There's a contributing file
Patches and code reviews are also welcome
Question:
Facebook had issues with lots of PHP
They had a project called Hack
They added types to their PHP
They mark files with "Everything here has to be typed"
70-80% have types
They used a tool to do Type Inference
Straightforward Line to Line Inference (based on what's passed/called, etc)
Would spit out guesses
Human would then review it
Mypy team will probably write a similar tool within a year
Question:
What's the likelihood of adoption?
Answer:
Dropbox adoption will help a lot
It's up to each little team adopting it on their projects and being excited about adding types
"What are the things we need to do to keep making it a better experience for early adopters and users"
A lot of people already feel the pain of this
Question:
How does this work with default arguments?
Answer:
Syntactically there's a way for combining it
Something like x: int = 3
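Sketch (mine): in Python 3 annotation syntax the type goes between the parameter name and the default, as name: type = value. The function here is made up.

```python
def repeat(word: str, times: int = 3) -> str:
    # times is annotated as int and defaults to 3 when omitted.
    return word * times
```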
Question:
Small Code Base. Is the static type checker stable? We don't want to break code
Answer:
We have a pin for each time we run it.
It's pretty reliable
Occasionally will break, but we review it periodically
Question:
Can you statically duck type?
Answer:
Iterable (in notes above) is a type of duck typing
Iterable was designed prior to Mypy
Uses Abstract Base Classes (specific Python thing)
Puts methods on the objects
Registers itself as a thing that is part of the other things
You can define a new class as being an Iterable
You have to explicitly say this somewhere in the class
Question:
Can you do outright duck typing?
Answer:
No. You have to specify
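Sketch (mine): the "you have to say it explicitly" version of duck typing, as I understood the answer. Countdown is a made-up class that declares itself an Iterable of int by subclassing; total accepts anything declared that way, including plain lists.

```python
from typing import Iterable, Iterator


class Countdown(Iterable[int]):
    """Hypothetical class that explicitly declares itself an Iterable of int."""

    def __init__(self, start: int) -> None:
        self.start = start

    def __iter__(self) -> Iterator[int]:
        # Counts from start down to 1.
        return iter(range(self.start, 0, -1))


def total(numbers: Iterable[int]) -> int:
    # Works for any declared Iterable[int]: lists, sets, Countdown, ...
    return sum(numbers)
```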
Question:
T = TypeVar('T') - why is it a string?
Answer:
Just so you have an easy way to print it in error messages and stuff
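Sketch (mine): an object can't see the name of the variable it gets assigned to, so the name has to be handed in as a string if it's going to show up anywhere.

```python
from typing import TypeVar

T = TypeVar('T')

# The string passed in is what the TypeVar reports as its own name.
print(T.__name__)
```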
Question:
Can you reduce the resource load/runtime of type inference?
Answer:
In the future
Something about Pyston (Python JIT)
JIT = Just In Time compiler
Allows optimized compiling
Essentially implements DRY in the context of compilation
Part of why Python runs slowly is that nobody has yet built a good JIT for Python
In a future where both of these have succeeded, they would gain performance enhancements from each other
Jit wouldn't need
Question:
A lot of the time people write code that works dynamically (could be two different types)? Something about classes
Answer:
Union[...] (listing whatever kinds it could be) is how you do the dynamic version
Instead of Int you'd just say whatever the class is
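Sketch (mine): a value that can legitimately be one of two types, written with Union. The describe function is made up.

```python
from typing import Union


def describe(value: Union[int, str]) -> str:
    # isinstance narrows the Union to one branch at a time.
    if isinstance(value, int):
        return "int: %d" % value
    return "str: " + value
```

For your own classes you'd list them the same way, e.g. Union[Foo, Bar], where Foo and Bar are whatever the classes are.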
Question:
How do you implement this on *args and **kwargs?
Answer:
If it's supposed to be a specific type, you just write down what that should be?
If the kwargs are multiple generic types, we don't have a good way to write that down, and you'd use Any instead
We are thinking about ways to implement this
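Sketch (mine): both cases from the answer, with made-up functions. When every extra argument is the same type you annotate with that type; when keyword values are a mix, you fall back to Any.

```python
from typing import Any


def total(*args: int) -> int:
    # Every positional argument is an int; inside, args is a tuple of ints.
    return sum(args)


def render(**kwargs: Any) -> str:
    # Values have mixed types, so Any is the best available annotation.
    return ", ".join("%s=%r" % (key, kwargs[key]) for key in sorted(kwargs))
```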
Question:
Is there a way to just have a type def?
Answer:
A pure type def falls out for free from the fact that these are Python expressions
If you have a complicated type, just give it a variable name and you've achieved this
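Sketch (mine): a type def is just an assignment. Graph and node_count are made-up names; the aliased type is the one from the toposort signature.

```python
from typing import Dict, Set

# A plain variable holding a type expression acts as a type alias.
Graph = Dict[str, Set[str]]


def node_count(graph: Graph) -> int:
    # Count every node that appears as a key or as a dependency.
    nodes = set(graph)
    for deps in graph.values():
        nodes |= deps
    return len(nodes)
```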
Question:
How did the 'Generic' being set at the class level come to be?
Answer:
You might have a free-standing function (not method) that operates on a list and gives you the first two elements of a list
You just write your Def and have a type variable there
It's not explicitly bound
If you have several methods, they may all use that Variable
A given instance of the class pins the type variable to one particular type
You have to pass any of the instances the same kind of type
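Sketch (mine): the free-standing case from the answer, a function that returns the first two elements of a list. The name first_two is made up. T isn't bound to any class here; it just ties the element type of the argument to the element types of the result for each call.

```python
from typing import List, Tuple, TypeVar

T = TypeVar('T')


def first_two(items: List[T]) -> Tuple[T, T]:
    # Whatever the list holds, the result holds two of the same thing.
    return items[0], items[1]
```

Compare with a Generic class, where all the methods share one T per instance.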
Question:
Dictionaries, if I have a specific type of key and specific type of value?
Answer: see toposort func
Question:
Does it play well with higher-order functions?
Answer:
Yes, there's a type called Callable
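Sketch (mine): PEP 484's type for function values is Callable[[ArgTypes], ReturnType]. The apply_all function is made up.

```python
from typing import Callable, List


def apply_all(func: Callable[[int], str], values: List[int]) -> List[str]:
    # func must accept one int and return a str.
    return [func(v) for v in values]
```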
Question:
Of the people who found it difficult, how much was because of refactoring?
Answer:
Most of it wasn't refactoring
A lot of it was figuring out (or guessing) what type something was supposed to be
Question:
is it common for people to identify bugs as a result of doing that work
Answer:
It happens. It isn't super common.
I don't think we found any really terrible bugs this way
One of the Hack People (Facebook) on Security & Privacy team told me they found a real howler/scary bug that they were glad to have fixed.
More often it finds something very early/sooner
The great bulk of the value is in future readability
Question:
Interesting Challenges?
Answer:
I'm very happy with the outcome at this point.
Dropbox is paying 3 ppl full time (1 contractor) to work on building this system
That took work and convincing, to show people that this already delivers value and is not just an isolated research project
Technically
Notating Aspects of Python that are hard to write a type down for (like * and **)
Common Example
Dicts you pass around
Keys are all strings
Dicts of strings
You have a whole family that accept this.
For different types of keys they expect different values
This structure shouldn't really be here in the first place
Gnarly and Hard to understand
Having types first would make it easier to understand
We have thoughts and hopes to implement this in the future
Currently:
Working on the concept of Optional
Whether something is *allowed* to be optional or not
This saves you from running into attribute errors due to misunderstandings between multiple authors
PEP 484 specifies this being explicit
Mypy didn't originally implement this, we're working on it
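Sketch (mine): what explicit Optional looks like. Both functions are made up. Optional[str] means "a str or None", and the caller has to deal with the None case before treating the value as a str.

```python
from typing import Dict, Optional


def find_email(users: Dict[str, str], name: str) -> Optional[str]:
    # dict.get returns None when the name is missing.
    return users.get(name)


def email_domain(users: Dict[str, str], name: str) -> str:
    email = find_email(users, name)
    if email is None:
        # Without this check, email.split would raise AttributeError on None.
        return "unknown"
    return email.split("@")[1]
```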
Question:
Are there similar things in dynamic langs that you wish you could have done
Answer:
Yes, implementing Optional
Hack & TypeScript
Hack Has it
TypeScript does not
People regret it
But seems to be too difficult to add at this point
It would be nice to have more control of the (Standard Python) syntax
Questions:
Humans are taking a lot of time to reason about code and you said they've been improved. Machines take a lot of time as well. Is it possible for PyCharm to use the parser in Mypy to enable something like this?
Answer:
We hope so, don't know about PyCharm in particular
It won't throw off other systems that use the standard AST, b/c the annotation is in a # comment
We've created a variant of the ast module that's as high performance as the ast module, but also includes type comments
Called typed_ast
Question:
Isn't the appeal of Python that it's dynamically typed? How do we reconcile this? Is some of the code I've seen where this makes sense not idiomatic to Python (ie you'd need to refactor it)?
Answer:
The things I and most of my colleagues value are compatible with adding typing
People like dynamic typing:
1. It's concise.
We're adding a small amount, but not really that much
Unlike C/C++/Java you don't have to deal with the mess of those types
This is more about Human Readability and security than enforced Types
LISP & Java didn't have Generic Types for a while
This meant you couldn't easily use certain data structures
You don't have to put this on all your code. It's optional
2.