
Intro to Advanced Python

By Adam Anderson

adam.b.anderson.96@gmail.com

These notes are mostly just summaries of the provided references. The purpose of this document is to centralize the resources I found useful so they would be easy to find. Most definitions and explanations are paraphrased or quoted directly from the sources.

How the Language Works

Is Python Interpreted or Compiled?

Everyone says that Python is an interpreted language, but this is technically incorrect. Python is “a language” in the sense of defining a class of language implementations that must be similar in some fundamental respects (syntax, most semantics). As long as two implementations satisfy the language requirements, they are allowed to differ in other implementation details (how to deal with source files, whether or not they compile sources to lower level forms, how they execute said forms, etc).

The standard Python implementation is CPython, which compiles source files to Python bytecode and caches it in .pyc files. This is done automatically when needed, if there is no cached bytecode or the cached bytecode is out of date. Other major implementations of Python include IronPython, which targets .NET, Jython, which compiles to JVM bytecode, and PyPy. You can think of the Python language reference as an interface that each implementation of Python must satisfy.

The lower level bytecode is executed by interpreters, AKA virtual machines - the CPython VM, the .NET runtime, the JVM, etc.
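As a rough illustration of the compile-then-execute pipeline in CPython, the sketch below compiles a string of source into a code object and then runs it. The built-ins compile() and exec() and the standard dis module are real; the source string and the variable names are made up for this example, and the exact instructions printed by dis differ between CPython versions.

```python
import dis

# Hypothetical one-line program, used only for this illustration.
source = "result = 2 + 3 * x"

# Step 1: compile the source text into a code object that holds bytecode.
code_obj = compile(source, "<example>", "exec")

# The raw bytecode is a bytes object; its contents depend on the interpreter version.
raw_bytecode = code_obj.co_code

# Step 2: the virtual machine executes the bytecode in a namespace we supply.
namespace = {"x": 4}
exec(code_obj, namespace)

# Human-readable listing of the instructions (output varies by version).
dis.dis(code_obj)
```

After this runs, namespace["result"] holds 14, computed by the VM from the bytecode rather than from the source text.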

Python compilation process

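You can also use the -O flag when invoking the Python interpreter to optimize the bytecode. In Python 2 the result is stored in a .pyo file; in Python 3.5 and later it is stored in a .pyc file with an opt- tag in its name. See “Compiled” Python Files.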

A good place to see the C implementation behind Python is Type Objects, where we can see the C struct used to describe built-in types. When you define a magic method in a class, the function ends up in a slot of the PyTypeObject struct.

References

  1. What Are .pyc Files?
  2. Python is Compiled Language or Interpreted Language?

Data Model - Everything Is An Object

All data is represented by objects or by relationships between objects. Every object has three things:

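  • Identity - Unique to the object; never changes once the object has been created. The is operator compares object identity. The id() function returns an integer representing the identity.
  • Type - Determines the operations that the object supports. The type() function returns the type. An object's type generally cannot be changed.
  • Value - The data the object holds. Objects whose value can change are mutable (lists, dictionaries); objects whose value cannot change are immutable (numbers, strings, tuples).

Containers are objects that hold references to other objects (lists, dictionaries, tuples).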
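A small sketch of identity, type, and value, using throwaway names (a, b, c) that are not from the references. It only relies on the built-ins is, ==, id(), and type().

```python
a = [1, 2, 3]
b = a            # a second name bound to the same object
c = [1, 2, 3]    # a distinct object that happens to hold an equal value

same_identity = a is b                            # one object, two names
equal_but_distinct = (a == c) and (a is not c)    # equal value, different identity

# id() returns an integer that stays constant for the object's lifetime.
id_before = id(a)
a.append(4)      # mutating the value changes neither identity nor type
id_after = id(a)

type_of_a = type(a)
```

Because b refers to the same list as a, the appended element is also visible through b.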

References

  1. Python Data Model

Dynamic, Strongly Typed Language

Static Typing - Type checking is performed at compile-time. Once a variable's type is set, you cannot change it.

  • Errors can be detected at compile time
  • Execution may be made more efficient by omitting runtime type checks
  • Java, C, C++, Fortran, Haskell

Dynamic Typing - Type checking is performed at runtime. Types are associated with values, not with variables.

  • May result in runtime type errors - a value may have an unexpected type, making bugs difficult to locate
  • Python, JavaScript, Ruby, PHP, Prolog

Strong vs Weak Typing - How strictly types are distinguished, i.e. whether the language performs automatic casting. For example: does the language implicitly convert strings to numbers?

Python is dynamically and strongly typed. This means that types are determined at runtime, but they are not automatically changed for operations. For example, 5 + "a" will cause an exception.

Recall that everything in Python is an object which keeps track of its type. One variable can refer to different objects at different points throughout the program because variables do not care what type an object is. To understand why, see the Namespaces section below.
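The two properties can be seen side by side in a short sketch. The variable names are illustrative only; the behavior shown (a TypeError on int + str, and rebinding a name to a different type) is standard Python.

```python
number = 5
error_message = None

# Strong typing: Python refuses to guess how to add an int and a str.
try:
    number + "a"
except TypeError as exc:
    error_message = str(exc)

# An explicit conversion is required instead.
as_text = str(number) + "a"

# Dynamic typing: the same name can be rebound to objects of different types.
thing = 42
first_type = type(thing)
thing = "forty-two"
second_type = type(thing)
```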

References

  1. Difference Between Statically Dynamically Typed Languages
  2. Static/Dynamic vs Strong/Weak Typing

Duck Typing

Duck Typing is a style of dynamic typing in which an object's methods and properties determine valid semantics rather than its inheritance from a particular class or implementation of a specific interface.

This is best explained through an example. Imagine we want to call the drive() function on an object. We could do one of two things:

  • Make drive() only accept data of a certain type (car). Then code will not compile if we try to pass in the wrong type. This is what we do in Java and C.
  • We allow drive() to accept any object parameter, and if something that cannot drive is passed in, the program will create a runtime error. This is duck typing.

With duck typing, we are interested in what the object can do rather than what the object is. This is similar to polymorphism.

Also, recall Late (Dynamic) Binding, a mechanism where the method being called on an object is looked up by name at runtime.
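The drive() scenario can be sketched as follows. The classes Car, GolfCart, and Rock and the function go_for_a_ride() are hypothetical names invented for this example.

```python
class Car:
    def drive(self):
        return "car is driving"


class GolfCart:
    # Unrelated to Car by inheritance, but it also has a drive() method.
    def drive(self):
        return "golf cart is driving"


class Rock:
    pass  # no drive() method at all


def go_for_a_ride(vehicle):
    # No type check: drive is looked up by name on the object at call time.
    return vehicle.drive()


rides = [go_for_a_ride(Car()), go_for_a_ride(GolfCart())]

rock_failed = False
try:
    go_for_a_ride(Rock())
except AttributeError:
    # The failure only shows up at runtime, when the lookup fails.
    rock_failed = True
```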

References

  1. What is Duck Typing?

Bound and Unbound Methods

Python will create a bound method when a function is an attribute of a class and you access the function via an instance of the class. This is why when you define a method, you need to add self as a parameter.

class MyClass(object):
	def __init__(self, input):
		self.variable = input
		
	def func(self, multiplier):
		self.variable *= multiplier
		
obj = MyClass(5)

When you call a method through an instance (a bound method), the call is translated into a call to the underlying function on the class, with the instance passed as the first argument. This means that the following two lines of code are equivalent:

obj.func(10)
MyClass.func(obj, 10)

Above, the func needs to be bound to an object. In this case, it is bound to obj. When we call obj.func(10), it is translated into the unbound function call MyClass.func(obj, 10). This is why we need the self parameter in our function definitions.
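The relationship can be inspected directly. This is a minimal sketch for Python 3, where accessing a function through the class gives back the plain function rather than a separate "unbound method" object; the Counter class and its scale() method are made-up names.

```python
class Counter:
    def __init__(self, start):
        self.value = start

    def scale(self, multiplier):
        self.value *= multiplier
        return self.value


obj = Counter(5)

bound = obj.scale        # bound method: remembers obj as self
plain = Counter.scale    # in Python 3 this is just the function

first = bound(10)                  # self is supplied automatically
second = Counter.scale(obj, 2)     # self is passed explicitly

# A bound method is a thin wrapper around (function, instance).
bound_self_is_obj = bound.__self__ is obj
bound_wraps_plain = bound.__func__ is plain
```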

References

  1. How to Pass Member Function as Argument
  2. Decorators

For-Else Loop

The for...else construct is used to recognize whether we broke out of a loop or if the loop iterated through every element. If we break out of a loop with a break statement, the else is ignored. If we don't, and the loop is executed to completion, the else is evaluated.

Below is an example of how this can save us the trouble of making extra variables to keep track of whether or not the loop is exited.

flag_found = False
for i in alist:
	if i == the_flag:
		flag_found = True
		break
	# Other code

if not flag_found:
	pass  # Loop executed to completion

The above code needs the flag_found variable to see if the_flag is found in alist. We can simplify this code to the below using for...else:

for i in alist:
	if i == the_flag:
		break
	# Other code
else:
	pass  # Loop executed to completion
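A runnable version of the same pattern, wrapped in a hypothetical helper find_flag() so that both outcomes can be observed:

```python
def find_flag(items, the_flag):
    for index, item in enumerate(items):
        if item == the_flag:
            result = "found at index {}".format(index)
            break
    else:
        # Runs only when the loop finished without hitting break,
        # which includes the case of an empty sequence.
        result = "not found"
    return result


hit = find_flag([3, 7, 9], 7)
miss = find_flag([3, 7, 9], 4)
empty = find_flag([], 4)
```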

References

  1. Why Does Python Use 'Else' After for and while Loops?

The Meaning of Single and Double Underscore Before an Object Name

Single leading underscore - A name such as _ignored is skipped by from module import *. By convention it marks something as internal: good if you want to signal that something is private, or that it is intended to be used or overridden by subclasses. Nothing actually prevents outside code from accessing it.

Double leading underscore - Name mangling. Inside a class body, __varname is textually replaced with _classname__varname. This is useful for preventing accidental access or accidental overriding in subclasses.

Double leading and trailing underscores - These look like __init__ and __dict__. These are called "dunder" attributes (for double-underscore). Things with dunder names are used by Python for specific purposes, and the user usually does not directly call them. Python automatically creates dunder variables and methods for certain things. Specific examples show up in the following sections.
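The difference between the single and double leading underscore is easy to observe. The Account class below is a made-up example; the mangled name _Account__secret follows from Python's documented name-mangling rule.

```python
class Account:
    def __init__(self):
        self._hint = "internal by convention"   # single leading underscore
        self.__secret = "mangled"               # double leading underscore

    def reveal(self):
        # Inside the class body the mangling is transparent.
        return self.__secret


acct = Account()
convention_only = acct._hint         # still accessible from outside
mangled = acct._Account__secret      # the name actually stored on the instance

direct_access_failed = False
try:
    acct.__secret                    # no such attribute outside the class
except AttributeError:
    direct_access_failed = True

stored_names = sorted(vars(acct))
```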

References

  1. The Meaning of Single and Double Underscore Before an Object Name

Magic Methods and Operator Overloading

Magic methods are dunder methods like __init__(). These methods are usually not called directly, because there is syntactic sugar that calls them. For example, x + y is (roughly) syntactic sugar for x.__add__(y). We can therefore use magic methods for operator overloading. Some examples are shown below.

Example magic methods
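A minimal sketch of operator overloading through magic methods. The Vector class is a hypothetical example, not taken from the reference; __add__, __eq__, and __repr__ are the standard hooks behind +, ==, and repr().

```python
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called for: self + other
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        # Called for: self == other
        return isinstance(other, Vector) and (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        # Called for: repr(self)
        return "Vector({}, {})".format(self.x, self.y)


total = Vector(1, 2) + Vector(3, 4)
explicit = Vector(1, 2).__add__(Vector(3, 4))   # what the + operator resolves to here
text = repr(total)
```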

References

  1. Magic Methods and Operator Overloading

Namespaces

In other languages, a variable can be thought of as a container with allocated memory that stores data.

In Python, a variable is just a name. Names can be assigned to values, functions, etc. You can also reuse a name to refer to objects of different types throughout a program in Python.

A namespace is a mapping from names to objects. Namespaces are implemented as dictionaries mapping a string name to the object to which the name refers.

Each module gets its own global namespace. Each namespace is isolated, so two things can have the same name in different modules. When you call a function, it gets a private namespace where its local variables are created. We can see variables in the local namespace using locals() and in the global namespace using globals(). See Locals and Globals.

Multiple names (in multiple scopes) can be bound to the same object. This is known as aliasing in other languages. Changes made to a mutable object using one reference to it will persist for all aliased references to the same object.

When you run a Python script, the interpreter treats it as a module called __main__, which gets its own global namespace. Import statements have the following behavior:

  • import Module - Gives access to the namespace for Module, so names can be accessed using Module.name
  • from Module import name - Imports a name directly into the program's namespace. Now you can use name directly.
  • from Module import * - Imports all public names in Module's namespace into the current one. This is not recommended, as it can cause "namespace bloat".

When you access namespace.name, it is roughly syntactic sugar for namespace.__dict__["name"].
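The dictionary view of namespaces can be sketched with module objects built by hand. types.ModuleType is a real standard-library class; the module names first_module and second_module and the function show_locals() are invented for this example.

```python
import types

# Two throwaway module objects standing in for imported modules.
first = types.ModuleType("first_module")
second = types.ModuleType("second_module")

# The same name can live in both namespaces without conflict.
first.answer = 42
second.answer = "forty-two"

via_attribute = first.answer
via_dict = first.__dict__["answer"]    # the namespace is a dictionary


def show_locals(a, b):
    total = a + b
    # Copy the function's private namespace while it is still alive.
    return dict(locals())


local_names = show_locals(1, 2)
```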

References

  1. A Guide to Python Namespaces
  2. Namespaces and Scopes
  3. Classes in Python

Pass by Reference?

Once you understand namespaces, you can appreciate how arguments are passed into functions. Recall that in compiled languages, a variable is a memory space that holds a value. In Python, a variable is a name (captured internally as a string) bound to a reference value to an object.

When you pass an argument into a function, a namespace for the function is created. Parameters in the function's namespace will reference the objects that were passed in as arguments using the names given in the function prototype. An example to illustrate this is shown below.

def outer_func(a, b):
	print(locals())
	inner_func(a, b)

def inner_func(x, y):
	print(locals())

outer_func(5, 10)

The output to the above script will be:

{'a': 5, 'b': 10}
{'x': 5, 'y': 10}

The below image shows all of the relationships going on in a simple program between names and objects. See the reference for an explanation of the diagram if it looks confusing.

How namespaces work with function calls
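One practical consequence of this model is that mutating an argument is visible to the caller, while rebinding the parameter name is not. A minimal sketch, with hypothetical function names:

```python
def mutate(items):
    # The parameter name is bound to the caller's list, so this change is shared.
    items.append(99)


def rebind(items):
    # Assignment binds the local name to a new object; the caller is unaffected.
    items = [0, 0, 0]
    return items


def same_object(arg, expected_id):
    # The parameter refers to the very object that was passed in.
    return id(arg) == expected_id


numbers = [1, 2, 3]

mutate(numbers)
after_mutate = list(numbers)

rebind(numbers)
after_rebind = list(numbers)

shares_identity = same_object(numbers, id(numbers))
```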

References

  1. How Do I Pass a Variable by Reference?

Monkey Patching

In Python, an attribute is any name following a dot. These can be variables, classes, functions, etc. contained in a namespace. For example, if the class Number has the method add(), then we can use number_object.add() to access the method, which is an attribute of the object. In Python, attributes can be changed at runtime, so we can dynamically add variables and functions to classes and modules.

Monkey Patching is the dynamic replacement of attributes at runtime. We call a function using namespace.function(), but we can access and mutate the function object. For example, say we create new_function() and want to put it in a class to replace the function the class came with. We can just use classname.function = new_function.

Note that data attributes set on an instance override method attributes with the same name.
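A short sketch of monkey patching, including the shadowing behavior noted above. Greeter, loud_greet(), and wave are hypothetical names used only for illustration.

```python
class Greeter:
    def greet(self):
        return "hello"


def loud_greet(self):
    return "HELLO"


g = Greeter()
before = g.greet()

# Replace the method on the class at runtime.
Greeter.greet = loud_greet
after = g.greet()          # existing instances see the new function


# Add a brand-new method to the class after it was defined.
def wave(self):
    return "waving"


Greeter.wave = wave
waved = g.wave()

# A data attribute on the instance shadows the method of the same name.
g.greet = "just a string now"
shadowed = g.greet
```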


Classes, Metaclasses, and Objects

In most languages, classes are just code describing how to produce an object. In Python, classes are also objects. When you use the class keyword, Python executes it and creates an object.

Since a class is an object, you can...

  • Assign it to a variable
  • Copy it
  • Add attributes to it
  • Pass it as a function parameter

Since classes are objects, they must be constructed by something. A metaclass defines how a class object should be constructed. By default in Python, type is the metaclass for a given class. The class keyword automatically invokes type's constructor:

type(name of class, tuple of parents, dict of names:values)

Thus, the following two ways of defining a class are equivalent:

class MyClass(object):
	pass

is the same as

MyClass = type('MyClass', (), {})

A slightly more complicated example:

class FooChild(Foo):
	bar = True

is the same as

FooChild = type('FooChild', (Foo,), {'bar': True})

When you declare a class, Python checks whether a metaclass is specified (the __metaclass__ attribute in Python 2, or the metaclass keyword argument of the class statement in Python 3). Python will use the given metaclass to construct the class object; type is the default metaclass. As we will see in the Optimizations section, in Python 2 you actually get a slight performance boost from specifying __metaclass__ = type.
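A minimal custom metaclass in Python 3 syntax. Registry, Plugin, and Other are hypothetical names; the sketch only assumes that a metaclass is a subclass of type whose __new__ receives the class name, bases, and namespace.

```python
class Registry(type):
    """Hypothetical metaclass that records every class it constructs."""

    created = []

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.created.append(name)
        return cls


# The class statement invokes Registry instead of type.
class Plugin(metaclass=Registry):
    bar = True


# The same kind of construction, done by calling the metaclass directly.
Other = Registry("Other", (Plugin,), {"bar": False})

default_meta = type(int)      # built-in classes are constructed by type
plugin_meta = type(Plugin)    # our class was constructed by Registry
```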

References

  1. What is a Metaclass in Python?
  2. Understanding Python Metaclasses

Iterables, Iterators, and Generators

  • Container - Data structure that holds elements. Containers support membership tests and live in memory. For example, list, set, dict, tuple. You can check if an element is in a container using if element in container.
  • Iterable - Any object that can return an iterator with the purpose of returning its elements. Most containers are iterable, as are open files and open sockets. We can think of an iterable as a stream of data, where the iterator allows us to get the next item in the stream. An iterable must implement __iter__(), which returns an iterator. You can get the iterator by calling iter(iterable_object).
  • Iterator - Stateful helper object that produces the next value when you call next() on it. In Python 3, an iterator implements the __next__() method (in Python 2 the method is named next()).
  • Generator - Special kind of iterator. A Generator Object is returned by a Generator Function, but we also talk about Generator Expressions. These are discussed below after the notes about iterators and iterables.

Below is a diagram showing the relationship between an iterable and an iterator.

Iterator and iterable relationship

Now, let's look at some code snippets from Iterators and Generators.

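Basic Iterable Usage - A list is an iterable, and the iter() function returns an iterator for the list (using the list's internal __iter__() method). We can then call next() on the iterator to get consecutive elements in the iterable. The session below is from Python 2, where next() is a method of the iterator; in Python 3 you would write next(x) instead.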

>>> x = iter([1, 2, 3])
>>> x
<listiterator object at 0x1004ca850>
>>> x.next()
1
>>> x.next()
2
>>> x.next()
3
>>> x.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Creating an Iterable - The __iter__() method makes a yrange object iterable, and in this case the yrange object is also its own iterator because it implements a next() method (the code was written for Python 2; in Python 3 this method must be named __next__()).

class yrange:
    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self

    def next(self):
        if self.i < self.n:
            i = self.i
            self.i += 1
            return i
        else:
            raise StopIteration()

Now, we look at the usage of a yrange object.

>>> y = yrange(3)
>>> y.next()
0
>>> y.next()
1
>>> y.next()
2
>>> y.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 14, in next
StopIteration
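For comparison, here is a sketch of the same idea written for Python 3, where the method is named __next__() and the built-in next() is used to advance the iterator. The class name CountUpTo is made up for this example.

```python
class CountUpTo:
    """Python 3 counterpart of the yrange idea (hypothetical name)."""

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        # The object is its own iterator.
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        value = self.i
        self.i += 1
        return value


counter = CountUpTo(3)
first = next(counter)        # advances the iterator by one item
rest = list(counter)         # consumes whatever is left
exhausted = list(counter)    # nothing left: the iterator is stateful

# A for loop calls iter() and next() behind the scenes.
collected = []
for item in CountUpTo(3):
    collected.append(item)
```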

Because iterators do not require every item to be stored in memory at once, they can be a lot more memory-efficient. For example, consider iterating over the numbers from 1 to 10. It would waste memory to store a list containing all numbers from 1 to 10 and iterate over that when we could just have an iterator give us each number when necessary. The savings grow with the size of the range.

Now that we understand iterables and iterators, we are ready to talk about generators.

Generator Expression - If you are familiar with list comprehension, generator expressions work similarly. We use a comprehension statement in parentheses, and Python gives us a Generator Object that iterates over things produced by the comprehension.

For example, consider the following list comprehension:

stripped_lines = [line.strip() for line in line_list if line != ""]

This comprehension combines a lot of instructions into a single line of code to produce a list of the non-empty lines in line_list, each stripped of leading and trailing whitespace. If we wanted an iterator for these lines, we could use the generator expression:

stripped_iter = (line.strip() for line in line_list if line != "")

Two common operations we want to perform on iterables are:

  • Perform operation on every element
  • Select subset of elements meeting a condition

Using comprehensions and generator expressions, we can achieve this. Below is another example of a generator expression for Pythagorean Triples to show how much we can get done in a single line.

triples = ((x, y, z) for z in range(100) for y in range(1, z) for x in range(1, y) if x*x + y*y == z*z)
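The lazy behavior of generator expressions can be checked with a small sketch. The contents of line_list are invented sample data; itertools.islice is used to take only the first few triples without computing the rest.

```python
import itertools

line_list = ["  alpha ", "", "beta\n", "", " gamma"]

# The list comprehension builds the whole list immediately.
stripped_list = [line.strip() for line in line_list if line != ""]

# The generator expression produces items one at a time, on demand.
stripped_iter = (line.strip() for line in line_list if line != "")
first = next(stripped_iter)
remaining = list(stripped_iter)

triples = ((x, y, z) for z in range(100) for y in range(1, z) for x in range(1, y) if x*x + y*y == z*z)

# Only as much of the search as needed for three results is performed.
first_three = list(itertools.islice(triples, 3))
```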

Generator Function - A function that contains a yield statement. Calling it returns a generator object. This is explained below.

Normal functions compute a value and return it. They are subroutines, which are entered at a specific point and exited at another. When a regular function is called, a namespace is created for local variables. When a return statement is reached, local variables are destroyed and the value is returned to the caller. A later call to the same function creates a new local namespace with new local variables.

Generator Functions act like "resumable" functions. When a Generator Function hits a yield expression, a value is returned, but the function just suspends execution rather than throwing out the local variables. In this sense they behave like coroutines, which can be entered, exited, and resumed at different points.

To illustrate, consider another example from Iterators and Generators:

>>> def foo():
...     print "begin"
...     for i in range(3):
...         print "before yield", i
...         yield i
...         print "after yield", i
...     print "end"
...
>>> f = foo()
>>> f.next()
begin
before yield 0
0
>>> f.next()
after yield 0
before yield 1
1
>>> f.next()
after yield 1
before yield 2
2
>>> f.next()
after yield 2
end
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>

We see above that when the function is called, a generator object is returned without executing the body of the function at all. When next() is called, the function executes until the yield statement. When next() is called again, the function resumes until it reaches the yield statement again.
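The same suspend-and-resume behavior can be reproduced in Python 3, where next(gen) replaces gen.next(). In this sketch the hypothetical generator function countdown() appends to a log list instead of printing, so the order of events can be inspected afterwards.

```python
def countdown(start, log):
    log.append("begin")
    for i in range(start, 0, -1):
        log.append("before yield {}".format(i))
        yield i
        log.append("after yield {}".format(i))
    log.append("end")


events = []
gen = countdown(2, events)

# Calling the generator function has not run any of its body yet.
nothing_ran_yet = (events == [])

first = next(gen)                  # runs up to the first yield
events_after_first = list(events)

second = next(gen)                 # resumes after the first yield

finished = False
try:
    next(gen)                      # resumes, falls off the end of the function
except StopIteration:
    finished = True
```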

References

  1. Iterables vs. Iterators vs. Generators
  2. What are Python's Iterator, Iterable, and Iteration Protocols?
  3. Iterators and Generators
  4. Improve Your Python: 'yield' and Generators Explained

Sorting

We can easily sort an iterable using the sorted() function. The prototype is

sorted(iterable [,key][,reverse])

Where key and reverse are optional parameters. key should be a key function. A key function is a function of one argument that is used to extract a comparison key from each element. This is efficient because Python computes the key once for each object in the iterable and then sorts by those keys.

To sort an iterable, just define the key function and pass the iterable and the key function into sorted(). The operator module provides the functions itemgetter() and attrgetter() if you want to use indices or attributes as sorting keys.

For example, if we have tuples representing students of the form (name, grade) and we want to sort by grade, we can do

from operator import itemgetter

new_list = sorted(tuple_list, key=itemgetter(1))
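A fuller sketch with sample data. The student names and grades are invented, and Student is a hypothetical namedtuple used to show attrgetter() alongside itemgetter().

```python
from collections import namedtuple
from operator import attrgetter, itemgetter

tuple_list = [("Ann", 88), ("Bob", 72), ("Cy", 95)]

# Sort by the element at index 1 (the grade).
by_grade = sorted(tuple_list, key=itemgetter(1))
by_grade_desc = sorted(tuple_list, key=itemgetter(1), reverse=True)

# The same data as objects with named attributes.
Student = namedtuple("Student", ["name", "grade"])
students = [Student(*pair) for pair in tuple_list]
by_attr = sorted(students, key=attrgetter("grade"))

# Any one-argument function works as a key; ties keep their original order.
by_name_length = sorted(students, key=lambda s: len(s.name))
```

sorted() returns a new list and leaves tuple_list unchanged.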

References

  1. Sorting How To

*args and **kwargs

*args and **kwargs Explained

Optimizations
