Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 2 You must be signed in to fork a gist
  • Save bespokoid/205efd91546ddb16b210678830fa9eeb to your computer and use it in GitHub Desktop.
Save bespokoid/205efd91546ddb16b210678830fa9eeb to your computer and use it in GitHub Desktop.
Transforming code into Beautiful, Idiomatic Python by Raymond Hettinger

Transforming code into Beautiful, Idiomatic Python

This is merely a gist of great speech made by Raymond Hettinger @raymondh at PyCon 2013 Original video can be enjoyed here and original slides can be found there

Almost all code snippets are presenting two or three ways of doing things: bad, ugly and pythonic :) I changed code a bit to make it Python3 compatible and added some notes regarding Python2/3 transition

Loops

When you see this, do that instead

  • Replace traditional index manipulation with Python’s core looping idioms
  • Learn advanced techniques with for-else clauses and the two argument form of iter()
  • Improve your craftmanship and aim for clean, fast, idiomatic Python code

If you mutate something you're iterating over, you're living in a state of sin and deserve whatever happens to you Raymond Hettinger

Looping over a range of numbers

# Code below could use some improvement
for i in [0, 1, 2, 3, 4, 5]:
    print(i**2)

# This one faster & better
for i in range(6):
    print(i**2)

# This was kinda best, but in Python3 xrange() got renamed to range()  
for i in xrange(6):
    print(i**2)

Looping over a collection

colors = ['red', 'green', 'blue', 'yellow']

# Copy-pasted from straight PyUglyCodeConf slides :)
for i in range(len(colors)):
    print(colors[i])

# and reworked in pythonic way
for color in colors:
    print(color)

Looping backwards

colors = ['red', 'green', 'blue', 'yellow']

# Another piece of delight to get rid of
for i in range(len(colors)-1, -1, -1):
    print(colors[i])

# To make things pythonic
for color in reversed(colors):
    print(color)

Looping over a collection and indicies

colors = ['red', 'green', 'blue', 'yellow']

# Python was lot of questionable fun back in a day
for i in range(len(colors)):
    print(i, '-->', colors[i])

# Now it's about using right tool for the task in hand
for i, color in enumerate(colors):
    print(i, '-->', color)

Looping over two collections

names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']

n = min(len(names), len(colors))
for i in range(n):
    print(names[i], '-->', colors[i])

for name, color in zip(names, colors):
    print(name, '-->', color)

# Note: zip computes all the list at once, izip computes the elements only when requested. 
# In Python3 zip returns iterator pretty much like old izip 
for name, color in izip(names, colors):
    print(name, '-->', color)

Looping in sorted order

colors = ['red', 'green', 'blue', 'yellow']

for color in sorted(colors):
    print(color)

for color in sorted(colors, reverse=True):
    print(color)

Custom sort order

colors = ['red', 'green', 'blue', 'yellow']

def compare_length(c1, c2):
    if len(c1) < len(c2): return -1
    if len(c1) > len(c2): return 1
    return 0

print(sorted(colors, cmp=compare_length))

print(sorted(colors, key=len))

Call a function until a sentinel value

blocks = []
while True:
    block = f.read(32)
    if block == '':
        break
    blocks.append(block)

blocks = []
for block in iter(partial(f.read, 32), ''):
    blocks.append(block)

Distinguishing multiple exit points in loops

def find(seq, target):
    found = False
    for i, value in enumerate(seq):
        if value == tgt:
            found = True
            break
    if not found:
        return -1
    return i

def find(seq, target):
    for i, value in enumerate(seq):
        if value == tgt:
            break
    else:
        return -1
    return i

Dictionary Skills

There's two kinds of people in the world: people who've mastered dictionaries and total goobers. Raymond Hettinger

  • Mastering dictionaries is a fundamental Python skill
  • They are fundamental for expressing relationships, linking, counting, and grouping

Looping over dictionary keys

d = {'matthew': 'blue', 'rachel': 'green', 'raymond':'red'}

for k in d:
    print(k)

for k in d.keys():
    if k.startswith('r'):
        del d[k]

d = {k : d[k] for k in d if not k.startswith('r')}

Looping over a dictionary keys and values

for k in d:
    print(k, '-->', d[k])

# Note: in Python3 this works pretty much like next snippet
for k, v in d.items():
    print(k, '-->', v)

for k, v in d.iteritems():
    print(k, '-->', v)

Construct a dictionary from pairs

names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue']

# Change zip to izip in case of Python2 
d = dict(zip(names, colors))
{'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

d = dict(enumerate(names))
{0: 'raymond', 1: 'rachel', 2: 'matthew'}

Counting with dictionaries

colors = ['red', 'green', 'red', 'blue', 'green', 'red']

d = {}
for color in colors:
    if color not in d:
        d[color] = 0
    d[color] += 1

{'blue': 1, 'green': 2, 'red': 3}

d = {}
for color in colors:
    d[color] = d.get(color, 0) + 1

from collections import defaultdict   
d = defaultdict(int)
for color in colors:
    d[color] += 1

Grouping with dictionaries

names = ['raymond', 'rachel', 'matthew', 'roger',
        'betty', 'melissa', 'judith', 'charlie']

d = {}
for name in names:
    key = len(name)
    if key not in d:
        d[key] = []
    d[key].append(name)

#>  {5: ['roger', 'betty'], 6: ['rachel', 'judith'],
#>  7: ['raymond', 'matthew', 'melissa', 'charlie']}

d = {}
for name in names:
    key = len(name)
    d.setdefault(key, []).append(name)

d = defaultdict(list)
for name in names:
    key = len(name)
    d[key].append(name)

Is a dictionary popitem() atomic?

d = {'matthew': 'blue', 'rachel': 'green', 'raymond':'red'}

while d:
    key, value = d.popitem()
    print(key, '-->', value)

Linking dictionaries

defaults = {'color': 'red', 'user': 'guest'}
parser = argparse.ArgumentParser()
parser.add_argument('-u', '--user')
parser.add_argument('-c', '--color')
namespace = parser.parse_args([])
command_line_args = {k:v for k, v in
                    vars(namespace).items() if v}

d = defaults.copy()
d.update(os.environ)
d.update(command_line_args)

d = ChainMap(command_line_args, os.environ, defaults)

Improving Clarity

  • Positional arguments and indicies are nice
  • Keywords and names are better
  • The first way is convenient for the computer
  • The second corresponds to how human’s think

Clarify function calls with keyword arguments

twitter_search('@obama', False, 20, True)

twitter_search('@obama', retweets=False, numtweets=20,
popular=True)

Clarify multiple return values with named tuples

doctest.testmod()
#> (0, 4)

doctest.testmod()
TestResults(failed=0,attempted=4)

TestResults = namedtuple('TestResults', ['failed', 'attempted'])

(Un)packing

Unpacking sequences

p = 'Raymond', 'Hettinger', 0x30, 'python@example.com'

fname = p[0]
lname = p[1]
age = p[2]
email = p[3]

fname, lname, age, email = p

Updating multiple state variables

def fibonacci(n):
    x = 0
    y = 1
    for i in range(n):
        print(x)
        t = y
        y = x + y
        x = t

def fibonacci(n):
    x, y = 0, 1
    for i in range(n):
    print(x)
    x, y = y, x+y

Tuple packing and unpacking

  • Don’t underestimate the advantages of updating state variables at the same time
  • It eliminates an entire class of errors due to out‐of‐order updates
  • It allows high level thinking: “chunking”

Simultaneous state updates

tmp_x = x + dx * t
tmp_y = y + dy * t
tmp_dx = influence(m, x, y, dx, dy, partial='x')
tmp_dy = influence(m, x, y, dx, dy, partial='y')
x = tmp_x
y = tmp_y
dx = tmp_dx
dy = tmp_dy

x, y, dx, dy = (x + dx * t,
                y + dy * t,
                influence(m, x, y, dx, dy, partial='x'),
                influence(m, x, y, dx, dy, partial='y'))

Efficiency

  • An optimization fundamental rule
  • Don’t cause data to move around unnecessarily
  • It takes only a little care to avoid O(n**2) behavior instead of linear behavior

Concatenating strings

names = ['raymond', 'rachel', 'matthew', 'roger',
'betty', 'melissa', 'judith', 'charlie']

s = names[0]
for name in names[1:]:
    s += ', ' + name
print(s)

print(', '.join(names))

Updating sequences

names = ['raymond', 'rachel', 'matthew', 'roger',
'betty', 'melissa', 'judith', 'charlie']

del names[0]
names.pop(0)
names.insert(0,'mark')

names = deque(['raymond', 'rachel', 'matthew', 'roger',
'betty', 'melissa', 'judith', 'charlie'])

del names[0]
names.popleft()
names.appendleft('mark')

Decorators and Context Managers

  • Helps separate business logic from administrative logic
  • Clean, beautiful tools for factoring code and improving code reuse
  • Good naming is essential.
  • Remember the Spiderman rule: With great power, comes great respsonsibility!

Using decorators to factor‐out administrative logic

def web_lookup(url, saved={}):
    if url in saved:
        return saved[url]
    page = urllib.urlopen(url).read()
    saved[url] = page
    return page

@cache
def web_lookup(url):
    return urllib.urlopen(url).read()

Caching decorator

def cache(func):
    saved = {}
    @wraps(func)
    def newfunc(*args):
        if args in saved:
            return newfunc(*args)
        result = func(*args)
        saved[args] = result
        return result
    return newfunc

Factor-out temporary contexts

old_context = getcontext().copy()
getcontext().prec = 50
print(Decimal(355) / Decimal(113))
setcontext(old_context)

with localcontext(Context(prec=50)):
    print(Decimal(355) / Decimal(113))

How to open and close files

f = open('data.txt')
try:
    data = f.read()
finally:
    f.close()

with open('data.txt') as f:
    data = f.read()

How to use locks

# Make a lock
lock = threading.Lock()

# Old-way to use a lock
lock.acquire()
try:
    print('Critical section 1')
    print('Critical section 2')
finally:
    lock.release()

# New-way to use a lock
with lock:
    print('Critical section 1')
    print('Critical section 2')

Factor-out temporary contexts with ignored/suppress

# Old and silly way
try:
    os.remove('somefile.tmp')
except OSError:
    pass

# Pythonic way
from contextlib import contextmanager 
 
@contextmanager
def ignored(*exceptions):
    try:
        yield
    except exceptions:
        pass

with ignored(OSError):
    os.remove('somefile.tmp')
    
# In Python 3.4 you can use suppress() context manager
# https://docs.python.org/3/library/contextlib.html#contextlib.suppress
with suppress(OSError):
    os.remove('somefile.tmp')

Factor-out temporary contexts with redirect_stdout

# Weak old stuff
with open('help.txt', 'w') as f:
    oldstdout = sys.stdout
    sys.stdout = f
    try:
        help(pow)
    finally:
        sys.stdout = oldstdout

# Cool new shit
from contextlib import contextmanager 

@contextmanager
def redirect_stdout(fileobj):
    oldstdout = sys.stdout
    sys.stdout = fileobj
    try:
        yield fieldobj
    finally:
        sys.stdout = oldstdout
        
with open('help.txt', 'w') as f:
    with redirect_stdout(f):
        help(pow)

Concise Expressive One-Liners

Two conflicting rules:

  1. Don’t put too much on one line
  2. Don’t break atoms of thought into subatomic particles

Raymond’s rule:

One logical line of code equals one sentence in English

List Comprehensions and Generator Expressions

# Old school
result = []
for i in range(10):
    s = i ** 2
    result.append(s)
print(sum(result))

# New way. Change range() for xrange() in case of Python2
print(sum([i**2 for i in range(10)]))

# Even faster
print(sum(i**2 for i in range(10)))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment