Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
EuroPython 2018

Rust and Python - Oxidize Your Snake ✨

  • Sven-Hendrik Haase / Consultant
  • Video
  • PyO3
  • Why?
    • Speed up your program
    • Escape the GIL
    • Bind to existing system libraries - serde πŸ€”
  • Rust is...
    • Safe
    • Modern
    • Fast
    • Statically and strongly typed
    • Immutable by default
    • Private by default
    • Amazing tooling
    • Great community
  • You can use Rust to speed up your Python code
    1. Annotate your Rust code
    2. Compile to a .so file
    3. Python supports importing .so files
    4. Optionally add to setup.py
  • You can use Python to slow down your Rust code πŸ˜‚ (could still be useful - e.g. Lua in Redis)
  • rayon crate for data parallelism
  • That Rust Book

Lies, damned lies, and statistics ✨

  • Marco Bonzanini / Consultant
  • Video
  • Slides
  • In the Vatican City there are 5.88 popes per square mile

  • Even if the statistics are correct, they can be misleading
  • Lurking variable
    • ⬆️Ice Cream β†’ ⬆️Drowning❓
    • ⬆️Temperature β†’ ⬆️Ice Cream, ⬆️Temperature β†’ ⬆️Drowning
  • Correlation between number of firefighters and damage - reduce damage by sending fewer firefighters? πŸ™ƒ
  • Simpson's paradox
  • Bias - a systematic error
  • Sampling bias - an error in your sampling process
  • Data visualisation can give better insights but also mislead
  • Significant means the results are reliable but not necessarily important (could be a small change)
  • Data dredging / data fishing / p-hacking

Trust me, I'm a Data Scientist - ethics for builders of data-based applications

  • Sarah Diot-Girard / PeopleDoc
  • Bag of words vs word embeddings (e.g. Word2Vec)
  • Bias in word embeddings: Man β†’ Developer, Woman β†’ Homemaker πŸ’₯
  • Interpretability can be used to get feedback from experts and make sure reasoning is unbiased
  • With GDPR you have to be able to explain decisions
  • Local interpretable model-agnostic explanations (LIME) πŸ‹
  • Apophenia - seeing patterns that don't exist
  • Illusory causation - confusing correlation and causation

Writing good error messages

  • Paul Keating / Consultant
  • Think about audience - interactive, batch, API
  • Sometimes two audiences - users calling the library and users of the application
  • Did you mean? error πŸ‘
  • Something went wrong error πŸ‘Ž
  • Use logger.exception
  • Avoid identical messages for different error paths
  • Need to test the error path
  • Understandable, explicit, unambiguous, point in the right direction

Get Productive with Python in Visual Studio Code

  • Dan Taylor / Microsoft
  • Snippets and tasks
  • ⌘P then βŒ₯⏎ to open in split tab
  • ⌘ click to open definition
  • πŸ’‘ Learn how to use watch in the debugger
  • Right click and run to cursor (~temporary breakpoint)
  • IntelliCode 🀯
  • Language server is opt-in for now but will be the default soon
  • VS Live Share 🀯

Python 3: ten years later

  • Victor Stinner / Red Hat
  • Inheritance from object 🀒
  • long vs int 🀒
  • Unicode 🀒
  • 2008 - Python 3 was released
  • All dependencies must be Python 3 compatible
  • Some projects were forked at add Python 3 support
  • Python 3 wall of shame
  • ensurepip
  • Remove Python 2 support Add Python 3 support
  • u"unicode" re-added in Python 3.3 - doesn't do anything but helps port Python 2 code
  • Python 2 backports
  • 95% of top 200 packages support Python 3
  • Instagram on Python 3
    • ⬇️12% CPU
    • ⬇️30% Memory
  • time.monotonic() added in Python 3.3

How to Ignore Most Startup Advice and Build a Decent Software Business by Ines M ✨

  • Ines Montani / Explosion AI
  • Video
  • Slides
  • Explosion AI
  • spaCy
  • Bootstrapped / self-funded / did consulting
  • prodigy
  • πŸ”΄ Misconception #1 - you need to run at a loss
    • Reasons to run at a loss: network effects, scale operations, predatory pricing, enterprise sales
    • Upfront costs
    • Bigger is not necessarily better
    • Most businesses aren't "winner takes all"
    • Optimise for median outcome
  • πŸ”΄ Misconception #2 - you need to hire lots of people
    • 🚌 test
    • Excellence requires authorship / ownership (not design by committee)
    • Building the right thing
    • Specialists - people don't understand what others are working on, more meetings...
    • Generalists
    • Complementary - mix of skills ✨
    • Tree-shaped skills 🌳
  • πŸ”΄ Misconception #3 - you can't make good decisions without testing all of your assumptions
    • Inverse of survivorship bias - We didn't do X and we failed, therefore X would have saved us.
    • http://autopsy.io
    • You can't replace logic with data
    • Build things you think are good
  • πŸ”΄ Misconception #4 - the true value lies in your user data
    • Sell products, not promises
    • Focus on what you can really charge people money for right now
    • Ship value, charge money
    • Profit is the best KPI

Succinct data structures for python

  • Konstantin Ignatov / Qrator
  • bytearray
  • Succinct data structures use knowledge about the data
  • PySDSL
  • SDSL
  • x.bit_compress() picks the most appropriate data type (e.g. int16) for the data
  • Compression but still supports same operations
  • Suffix arrays πŸŽ‰

The Web is Terrifying! Using the PyData stack to spy on the spies.

Mocks, fakes, dummies, stubs and spies: Successfully isolating the snake

  • Mario Corchero / Bloomberg
  • side_effect can be an array
  • patch(..., autospec=True)
  • seal(mock) - stops undefined attributes being mocked
  • Mock(wraps=func) - a spy
  • sentinel
  • You can name your mocks - Mock(name="bob")

asyncio in Practice: We Did It Wrong

  • Lynn Root / Spotify
  • Used in Spotify's chaos monkey service
  • Event driven hostname generation for DNS
  • SLI - Service Level Indicators
  • Be careful to avoid accidentally swallowing exceptions
  • asyncio.create_task
  • Might want to handle cancelled tasks exceptions
  • asyncio.gather swallows exceptions by default
  • http://rogue.ly/aio

Standardize Testing in Python

  • Bernat Gabor / Bloomberg
  • tox
  • detox - run environments in parallel
  • Testing with different version of Python, Django etc.

How to Write Deployment-friendly Applications

  • Hynek Schlawack / Variomedia
  • Use environment variables
  • environ_config library
  • Don't put secrets in environment variables πŸ€”
  • /-/ prefix for internal endpoints (e.g. /-/version)
  • Liveness vs health endpoint

PyPI: Past, Present and Future

  • Nicole Harris / PeopleDoc
  • Pronounced Py-P-I
  • Costs ~$1M/year to run

White Mars: living far away from any form of life

Let’s embrace WebAssembly! ✨

  • Almar Klein / Consultant
  • Video
  • Slides
  • Low-level representation of code
  • Doesn't make too many assumptions about the architecture it will be run on
  • Browsers (or other platforms) turn the WASM into native instructions
  • WASM is a binary format but it has a human readable version
  • It's safe - primarily designed to run in the browser
  • ppci (pure python compiler infrastructure)
  • Can compile a subset of Python to WASM
  • Can compile WASM to native code
  • Therefore Python to native code πŸŽ‰
  • Can import WASM modules written in other languages
  • Wow!

ETL pipeline to achieve reliability at scale

  • Isabel Lopez / Smarkets
  • Luigi for building the pipeline
  • Apache Parquet column-oriented data store on Hadoop
  • Spark (Amazon EMR)
  • Columnar (colum-nar)
  • RDD - Resilient Distributed Dataset
  • 8M events (overkill?)

Autism in development ✨

  • Ed Singleton / Consultant
  • Video
  • Repetition, social awkwardness, over-stimulation, stubbornness, meltdowns, "fizzy mind"
  • Over-stimulation - flood of information coming in
  • Stubbornness - unreasonableness
  • Correlation: insomnia, clumsiness, alcoholism, ADD / ADHS, low muscle tone, easting disorders
  • Neurological differences: larger brains, more neurons
  • Benefits: systemising, repetition / obsession, radical honesty, originality of thinking, more spare time, attention to detail
  • Aspergers archetype: obsessive, blunt, intelligent, original thinker, not suited to physical work...
  • "It seems that for success in science or art a dash of autism is essential." - Hans Asperger

  • Social problems: reseting "whatever" face, don't force socialising, try to tolerate meltdowns, less likely to be mentored
  • Work patterns: the unknown is scary, Agile / Kanban, prefigure changes
  • Meetings: don't expect people to speak up, actively manage turn-taking

Trio: A pythonic way to do async programming

Leadership of Technical Teams

  • Owen Campbell / Consultant
  • ICI
  • Leadership can be learnt
  • Must be practiced
  • Bounce
  • Find opportunities outside of work to practice
  • Schmooze 'em, bruise 'em or lose 'em. eek πŸ€”
  • Priorities: customers, investors, other teams, processes, resources, deliverables. Must ensure good communication with these.
  • Leader should work on lower priorities tasks in case they need to be dropped
  • Leadership styles: dictatorial, paternalistic, consensual, democratic, hands off
  • Dictator β†’ Observer
  • Rookie β†’ Expert
  • Dictator + Rookie = Git

Postgres at any scale

  • Craig Kerstiens / Citus
  • psqlrc (\x auto, \timing, history)
  • pgtune
  • Cache hit rate should be >=99%
  • Index hit rate should be >=95%
  • EXPLAIN and EXPLAIN ANALYSE
  • pg_stat_statements (extension)
  • Use GIN index when a column has multiple values (e.g. an array)
  • Use GIST for shapes (geo-spatial) and full text search
  • SP-GIST and BRIN for large tables (e.g. timeseries)
  • Composite, conditional, and functional indexes
  • Safe migrations:
    1. Allow nulls but set default value
    2. Backfill
    3. Add constraint
  • CREATE INDEX CONCURRENTLY - doesn't lock the table
  • Connection pooling - at application layer or daemon (pgBouncer)
  • Replication wal-e/wal-g or barman
  • Sharding with Citus (appears as a single database) - see Instragram talk
  • Logical backup (pg_dump) vs physical backup
  • Use physical backups for larger databases
    • Less load on system πŸ‘
    • Not portable πŸ‘Ž
    • pg_dump won't work for databases >~50GB

Asyncio in Python 3.7 and 3.8

  • Yury Selivanov / EdgeDB
  • asyncio.run
  • serve_forever
  • @coroutine will be deprecated soon
  • Try to avoid using event loop
  • Context variables
  • Trio
  • create_supervisor
  • TaskGroup
  • gather doesn't cancel the other tasks when one fails πŸ’₯
  • Tokio - asyncio event loop in Rust

Die Threads

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.