EuroPython 2018

Rust and Python - Oxidize Your Snake ✨

  • Sven-Hendrik Haase / Consultant
  • Video
  • PyO3
  • Why?
    • Speed up your program
    • Escape the GIL
    • Bind to existing system libraries - serde 🤔
  • Rust is...
    • Safe
    • Modern
    • Fast
    • Statically and strongly typed
    • Immutable by default
    • Private by default
    • Amazing tooling
    • Great community
  • You can use Rust to speed up your Python code
    1. Annotate your Rust code
    2. Compile to a .so file
    3. Python supports importing .so files
    4. Optionally add to setup.py
  • You can use Python to slow down your Rust code 😂 (could still be useful - e.g. Lua in Redis)
  • rayon crate for data parallelism
  • That Rust Book

Lies, damned lies, and statistics ✨

  • Marco Bonzanini / Consultant
  • Video
  • Slides
  • In the Vatican City there are 5.88 popes per square mile

  • Even if the statistics are correct, they can be misleading
  • Lurking variable
    • ⬆️Ice Cream → ⬆️Drowning❓
    • ⬆️Temperature → ⬆️Ice Cream, ⬆️Temperature → ⬆️Drowning
  • Correlation between number of firefighters and damage - reduce damage by sending fewer firefighters? 🙃
  • Simpson's paradox
  • Bias - a systematic error
  • Sampling bias - an error in your sampling process
  • Data visualisation can give better insights but also mislead
  • Significant means the results are reliable but not necessarily important (could be a small change)
  • Data dredging / data fishing / p-hacking
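
Simpson's paradox is easier to see with numbers. A minimal sketch, using success counts in the shape of the often-quoted kidney stone treatment example (treat the figures as illustrative; `rate`, `pooled` and the dictionary layout are names chosen here, not from the talk). Treatment A wins inside each stratum, yet B wins once the strata are pooled, because stone size acts as the lurking variable:

```python
# (successes, patients) per treatment and stone size; illustrative figures.
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, patients):
    """Success rate for one group."""
    return successes / patients


def pooled(treatment):
    """Success rate for a treatment with both stone sizes merged."""
    groups = outcomes[treatment].values()
    return rate(sum(s for s, _ in groups), sum(n for _, n in groups))


# Compare the treatments within each stratum, then overall.
a_wins_each_stratum = all(
    rate(*outcomes["A"][size]) > rate(*outcomes["B"][size])
    for size in ("small", "large")
)
b_wins_pooled = pooled("B") > pooled("A")

print(a_wins_each_stratum, b_wins_pooled)
```

A was given mostly to the harder (large stone) cases, which drags its pooled rate down even though it does better on like-for-like patients.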

Trust me, I'm a Data Scientist - ethics for builders of data-based applications

  • Sarah Diot-Girard / PeopleDoc
  • Bag of words vs word embeddings (e.g. Word2Vec)
  • Bias in word embeddings: Man → Developer, Woman → Homemaker 💥
  • Interpretability can be used to get feedback from experts and make sure reasoning is unbiased
  • With GDPR you have to be able to explain decisions
  • Local interpretable model-agnostic explanations (LIME) 🍋
  • Apophenia - seeing patterns that don't exist
  • Illusory causation - confusing correlation and causation

Writing good error messages

  • Paul Keating / Consultant
  • Think about audience - interactive, batch, API
  • Sometimes two audiences - users calling the library and users of the application
  • Did you mean? error 👍
  • Something went wrong error 👎
  • Use logger.exception
  • Avoid identical messages for different error paths
  • Need to test the error path
  • Understandable, explicit, unambiguous, point in the right direction
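
A minimal sketch of two of the points above: a "Did you mean?" message built with the standard library's `difflib`, and `logger.exception` to keep the traceback when an error is handled. The command names, `UnknownCommand`, `run` and `main` are hypothetical and only serve the illustration:

```python
import difflib
import logging

logger = logging.getLogger(__name__)

COMMANDS = ["status", "start", "stop"]  # hypothetical command names


class UnknownCommand(ValueError):
    """Raised when the user asks for a command that does not exist."""


def run(command):
    if command not in COMMANDS:
        message = f"Unknown command {command!r}."
        # Suggest the closest valid command, if there is a plausible one.
        close = difflib.get_close_matches(command, COMMANDS, n=1)
        if close:
            message += f" Did you mean {close[0]!r}?"
        raise UnknownCommand(message)
    return f"running {command}"


def main(command):
    try:
        return run(command)
    except UnknownCommand:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Could not run command %r", command)
        return None
```

The message names the bad input, says what was wrong with it, and points at a likely fix, which is about as far as a library can go without guessing on the user's behalf.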

Get Productive with Python in Visual Studio Code

  • Dan Taylor / Microsoft
  • Snippets and tasks
  • ⌘P then ⌥⏎ to open in split tab
  • ⌘ click to open definition
  • 💡 Learn how to use watch in the debugger
  • Right click and run to cursor (~temporary breakpoint)
  • IntelliCode 🤯
  • Language server is opt-in for now but will be the default soon
  • VS Live Share 🤯

Python 3: ten years later

  • Victor Stinner / Red Hat
  • Inheritance from object 🤢
  • long vs int 🤢
  • Unicode 🤢
  • 2008 - Python 3 was released
  • All dependencies must be Python 3 compatible
  • Some projects were forked to add Python 3 support
  • Python 3 wall of shame
  • ensurepip
  • Don't remove Python 2 support; add Python 3 support
  • u"unicode" re-added in Python 3.3 - doesn't do anything but helps port Python 2 code
  • Python 2 backports
  • 95% of top 200 packages support Python 3
  • Instagram on Python 3
    • ⬇️12% CPU
    • ⬇️30% Memory
  • time.monotonic() added in Python 3.3
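
`time.monotonic()` suits measuring durations because it never goes backwards, even if the system clock is adjusted. A minimal sketch; `timed` is a helper name invented here:

```python
import time


def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.monotonic()
    result = func(*args)
    elapsed = time.monotonic() - start
    return result, elapsed


total, seconds = timed(sum, range(1000))
print(total, seconds)
```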

How to Ignore Most Startup Advice and Build a Decent Software Business by Ines M ✨

  • Ines Montani / Explosion AI
  • Video
  • Slides
  • Explosion AI
  • spaCy
  • Bootstrapped / self-funded / did consulting
  • prodigy
  • 🔴 Misconception #1 - you need to run at a loss
    • Reasons to run at a loss: network effects, scale operations, predatory pricing, enterprise sales
    • Upfront costs
    • Bigger is not necessarily better
    • Most businesses aren't "winner takes all"
    • Optimise for median outcome
  • 🔴 Misconception #2 - you need to hire lots of people
    • 🚌 test
    • Excellence requires authorship / ownership (not design by committee)
    • Building the right thing
    • Specialists - people don't understand what others are working on, more meetings...
    • Generalists
    • Complementary - mix of skills ✨
    • Tree-shaped skills 🌳
  • 🔴 Misconception #3 - you can't make good decisions without testing all of your assumptions
    • Inverse of survivorship bias - We didn't do X and we failed, therefore X would have saved us.
    • http://autopsy.io
    • You can't replace logic with data
    • Build things you think are good
  • 🔴 Misconception #4 - the true value lies in your user data
    • Sell products, not promises
    • Focus on what you can really charge people money for right now
    • Ship value, charge money
    • Profit is the best KPI

Succinct data structures for python

  • Konstantin Ignatov / Qrator
  • bytearray
  • Succinct data structures use knowledge about the data
  • PySDSL
  • SDSL
  • x.bit_compress() picks the most appropriate data type (e.g. int16) for the data
  • Compression but still supports same operations
  • Suffix arrays 🎉
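
The idea behind picking the narrowest type that fits the data can be sketched with the standard library's `array` module. This is not PySDSL's API: `compress_ints` is a name invented here, and it only approximates what `bit_compress()` is described as doing above.

```python
from array import array


def compress_ints(values):
    """Store non-negative ints in the narrowest unsigned array that fits."""
    values = list(values)
    if any(v < 0 for v in values):
        raise ValueError("only non-negative integers are supported")
    top = max(values, default=0)
    # Try unsigned typecodes from narrowest to widest.
    for code in "BHIQ":
        if top < 1 << (8 * array(code).itemsize):
            return array(code, values)
    raise OverflowError("value does not fit in 64 bits")


small = compress_ints([3, 250, 17])     # every value fits in one byte
wide = compress_ints([3, 70_000, 17])   # needs a wider item type
print(small.typecode, small.itemsize, wide.typecode, wide.itemsize)
```

The compressed array still supports indexing, slicing and iteration, which is the point made above: less memory, same operations.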

The Web is Terrifying! Using the PyData stack to spy on the spies.

Mocks, fakes, dummies, stubs and spies: Successfully isolating the snake

  • Mario Corchero / Bloomberg
  • side_effect can be a list (any iterable) - each call returns the next item
  • patch(..., autospec=True)
  • seal(mock) - stops undefined attributes being mocked
  • Mock(wraps=func) - a spy
  • sentinel
  • You can name your mocks - Mock(name="bob")
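
The features above can be tried together in a few lines. A minimal sketch, assuming Python 3.7+ (for `seal`); `fetch` is a hypothetical function standing in for real I/O, and `create_autospec` is used directly here as a stand-in for what `patch(..., autospec=True)` sets up:

```python
from unittest.mock import Mock, create_autospec, seal, sentinel


def fetch(url, timeout=10):
    """Hypothetical function standing in for real I/O."""
    raise RuntimeError("network access not expected in tests")


# side_effect as a list: each call yields the next item; exceptions are raised.
flaky = Mock(side_effect=[TimeoutError("slow"), sentinel.response])
try:
    flaky()
    first_call_raised = False
except TimeoutError:
    first_call_raised = True
second = flaky()

# autospec: the mock enforces the signature of the real function.
spec_fetch = create_autospec(fetch, return_value="ok")
body = spec_fetch("http://example.invalid", timeout=1)
try:
    spec_fetch()  # missing the required url argument
    signature_enforced = False
except TypeError:
    signature_enforced = True

# seal: attributes that were not configured beforehand raise AttributeError.
client = Mock(name="bob")
client.get.return_value = 200
seal(client)
try:
    client.post
    sealed = False
except AttributeError:
    sealed = True

# wraps: a spy that calls through to the real object and records the call.
spy = Mock(wraps=len)
length = spy([1, 2, 3])
spy.assert_called_once_with([1, 2, 3])
```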

asyncio in Practice: We Did It Wrong

  • Lynn Root / Spotify
  • Used in Spotify's chaos monkey service
  • Event driven hostname generation for DNS
  • SLI - Service Level Indicators
  • Be careful to avoid accidentally swallowing exceptions
  • asyncio.create_task
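  • Might want to handle the exceptions raised by cancelled tasks (asyncio.CancelledError)
  • asyncio.gather can hide exceptions - with return_exceptions=True they are returned as ordinary results instead of being raised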
  • http://rogue.ly/aio
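
A minimal sketch of how `asyncio.gather` treats a failing coroutine under both settings of `return_exceptions`, assuming Python 3.7+ for `asyncio.run`. The coroutine names `ok` and `boom` are invented for the example:

```python
import asyncio


async def ok():
    await asyncio.sleep(0)
    return "ok"


async def boom():
    await asyncio.sleep(0)
    raise ValueError("boom")


async def main():
    # return_exceptions=True: the exception comes back as a plain result,
    # so it goes unnoticed unless the caller inspects the list.
    results = await asyncio.gather(ok(), boom(), return_exceptions=True)
    errors = [r for r in results if isinstance(r, Exception)]

    # return_exceptions=False (the default): the first exception is raised
    # in whoever awaits the gather call.
    try:
        await asyncio.gather(ok(), boom())
        propagated = False
    except ValueError:
        propagated = True
    return results, errors, propagated


results, errors, propagated = asyncio.run(main())
print(results, errors, propagated)
```

Either way the exception only surfaces if something awaits the result and looks at it, which is how errors end up swallowed in fire-and-forget code.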

Standardize Testing in Python

  • Bernat Gabor / Bloomberg
  • tox
  • detox - run environments in parallel
  • Testing with different versions of Python, Django etc.

How to Write Deployment-friendly Applications

  • Hynek Schlawack / Variomedia
  • Use environment variables
  • environ_config library
  • Don't put secrets in environment variables 🤔
  • /-/ prefix for internal endpoints (e.g. /-/version)
  • Liveness vs health endpoint
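
A minimal sketch of reading configuration from environment variables using only the standard library (it does not use the environ_config library mentioned above). The variable names `APP_PORT` and `APP_DEBUG`, the defaults, and the `Config` class are assumptions made for the example:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    port: int
    debug: bool


def load_config(env=os.environ):
    """Build the config from a mapping of environment variables."""
    return Config(
        port=int(env.get("APP_PORT", "8000")),
        debug=env.get("APP_DEBUG", "0") == "1",
    )


config = load_config({"APP_PORT": "9000"})
print(config)
```

Passing the mapping in as a parameter keeps the function easy to test without touching the real environment.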

PyPI: Past, Present and Future

  • Nicole Harris / PeopleDoc
  • Pronounced Py-P-I
  • Costs ~$1M/year to run

White Mars: living far away from any form of life

Let’s embrace WebAssembly! ✨

  • Almar Klein / Consultant
  • Video
  • Slides
  • Low-level representation of code
  • Doesn't make too many assumptions about the architecture it will be run on
  • Browsers (or other platforms) turn the WASM into native instructions
  • WASM is a binary format but it has a human readable version
  • It's safe - primarily designed to run in the browser
  • ppci (pure python compiler infrastructure)
  • Can compile a subset of Python to WASM
  • Can compile WASM to native code
  • Therefore Python to native code 🎉
  • Can import WASM modules written in other languages
  • Wow!

ETL pipeline to achieve reliability at scale

  • Isabel Lopez / Smarkets
  • Luigi for building the pipeline
  • Apache Parquet column-oriented data store on Hadoop
  • Spark (Amazon EMR)
  • Columnar (colum-nar)
  • RDD - Resilient Distributed Dataset
  • 8M events (overkill?)

Autism in development ✨

  • Ed Singleton / Consultant
  • Video
  • Repetition, social awkwardness, over-stimulation, stubbornness, meltdowns, "fizzy mind"
  • Over-stimulation - flood of information coming in
  • Stubbornness - unreasonableness
  • Correlation: insomnia, clumsiness, alcoholism, ADD / ADHD, low muscle tone, eating disorders
  • Neurological differences: larger brains, more neurons
  • Benefits: systemising, repetition / obsession, radical honesty, originality of thinking, more spare time, attention to detail
  • Asperger's archetype: obsessive, blunt, intelligent, original thinker, not suited to physical work...
  • "It seems that for success in science or art a dash of autism is essential." - Hans Asperger

  • Social problems: resting "whatever" face, don't force socialising, try to tolerate meltdowns, less likely to be mentored
  • Work patterns: the unknown is scary, Agile / Kanban, prefigure changes
  • Meetings: don't expect people to speak up, actively manage turn-taking

Trio: A pythonic way to do async programming

Leadership of Technical Teams

  • Owen Campbell / Consultant
  • ICI
  • Leadership can be learnt
  • Must be practiced
  • Bounce
  • Find opportunities outside of work to practice
  • Schmooze 'em, bruise 'em or lose 'em. eek 🤔
  • Priorities: customers, investors, other teams, processes, resources, deliverables. Must ensure good communication with these.
  • Leader should work on lower-priority tasks in case they need to be dropped
  • Leadership styles: dictatorial, paternalistic, consensual, democratic, hands off
  • Dictator → Observer
  • Rookie → Expert
  • Dictator + Rookie = Git

Postgres at any scale

  • Craig Kerstiens / Citus
  • psqlrc (\x auto, \timing, history)
  • pgtune
  • Cache hit rate should be >=99%
  • Index hit rate should be >=95%
  • EXPLAIN and EXPLAIN ANALYSE
  • pg_stat_statements (extension)
  • Use GIN index when a column has multiple values (e.g. an array)
  • Use GIST for shapes (geo-spatial) and full text search
  • SP-GIST and BRIN for large tables (e.g. timeseries)
  • Composite, conditional, and functional indexes
  • Safe migrations:
    1. Allow nulls but set default value
    2. Backfill
    3. Add constraint
  • CREATE INDEX CONCURRENTLY - doesn't lock the table
  • Connection pooling - at application layer or daemon (pgBouncer)
  • Replication wal-e/wal-g or barman
  • Sharding with Citus (appears as a single database) - see Instagram talk
  • Logical backup (pg_dump) vs physical backup
  • Use physical backups for larger databases
    • Less load on system 👍
    • Not portable 👎
    • pg_dump won't work for databases >~50GB

Asyncio in Python 3.7 and 3.8

  • Yury Selivanov / EdgeDB
  • asyncio.run
  • serve_forever
  • @coroutine will be deprecated soon
  • Try to avoid using event loop
  • Context variables
  • Trio
  • create_supervisor
  • TaskGroup
  • gather doesn't cancel the other tasks when one fails 💥
  • Tokio - asyncio event loop in Rust
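
A minimal sketch combining two of the Python 3.7 additions above: `asyncio.run` as the entry point (no explicit event loop handling) and context variables that stay separate per task. `request_id` and `handle` are names invented for the example:

```python
import asyncio
import contextvars

# Each task gets its own copy of the context, so values do not leak
# between concurrently running tasks.
request_id = contextvars.ContextVar("request_id", default=None)


async def handle(rid):
    request_id.set(rid)
    await asyncio.sleep(0)  # yield so the other task runs in between
    return request_id.get()


async def main():
    return await asyncio.gather(handle("a"), handle("b"))


# asyncio.run creates the event loop, runs main() and closes the loop.
seen = asyncio.run(main())
print(seen, request_id.get())
```

Each task reads back the value it set itself, and the value outside the tasks is still the default.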

Die Threads
