@rupert
Created July 28, 2018 15:22
EuroPython 2018
  • Sven-Hendrik Haase / Consultant
  • Video
  • PyO3
  • Why?
    • Speed up your program
    • Escape the GIL
    • Bind to existing system libraries - serde 🤔
  • Rust is...
    • Safe
    • Modern
    • Fast
    • Statically and strongly typed
    • Immutable by default
    • Private by default
    • Amazing tooling
    • Great community
  • You can use Rust to speed up your Python code
    1. Annotate your Rust code
    2. Compile to a .so file
    3. Python supports importing .so files
    4. Optionally add to setup.py
  • You can use Python to slow down your Rust code 😂 (could still be useful - e.g. Lua in Redis)
  • rayon crate for data parallelism
  • That Rust Book
  • Marco Bonzanini / Consultant
  • Video
  • Slides
  • In the Vatican City there are 5.88 popes per square mile

  • Even if the statistics are correct, they can be misleading
  • Lurking variable
    • ⬆️Ice Cream → ⬆️Drowning❓
    • ⬆️Temperature → ⬆️Ice Cream, ⬆️Temperature → ⬆️Drowning
  • Correlation between number of firefighters and damage - reduce damage by sending fewer firefighters? 🙃
  • Simpson's paradox
  • Bias - a systematic error
  • Sampling bias - an error in your sampling process
  • Data visualisation can give better insights but also mislead
  • Significant means the results are reliable but not necessarily important (could be a small change)
  • Data dredging / data fishing / p-hacking
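Simpson's paradox is easier to believe with numbers in front of you. The sketch below uses illustrative success counts for two hypothetical treatments (the figures are assumptions, not from the talk): A wins inside every subgroup, yet B wins once the subgroups are pooled, because the subgroup sizes differ so much between treatments.

```python
from fractions import Fraction

# Illustrative (successes, trials) counts - hypothetical, not from the talk.
OUTCOMES = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def subgroup_rate(treatment, subgroup):
    """Success rate of one treatment within one subgroup."""
    successes, trials = OUTCOMES[treatment][subgroup]
    return Fraction(successes, trials)


def overall_rate(treatment):
    """Success rate of one treatment with all subgroups pooled."""
    groups = OUTCOMES[treatment].values()
    return Fraction(sum(s for s, _ in groups), sum(t for _, t in groups))


a_wins_each_subgroup = all(
    subgroup_rate("A", g) > subgroup_rate("B", g) for g in ("small", "large")
)
b_wins_overall = overall_rate("B") > overall_rate("A")
print(a_wins_each_subgroup, b_wins_overall)
```

Here the subgroup is the lurking variable: A was mostly tried on the harder "large" cases, which drags its pooled rate down.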
  • Sarah Diot-Girard / PeopleDoc
  • Bag of words vs word embeddings (e.g. Word2Vec)
  • Bias in word embeddings: Man → Developer, Woman → Homemaker 💥
  • Interpretability can be used to get feedback from experts and make sure reasoning is unbiased
  • With GDPR you have to be able to explain decisions
  • Local interpretable model-agnostic explanations (LIME) 🍋
  • Apophenia - seeing patterns that don't exist
  • Illusory causation - confusing correlation and causation
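For the bag of words bullet, a minimal sketch (the helper name `bag_of_words` is made up for illustration) shows what the representation throws away: word order. Two sentences with opposite meanings end up with identical bags, which is one reason embeddings such as Word2Vec are used instead.

```python
from collections import Counter


def bag_of_words(text):
    """Count word occurrences, ignoring case and word order."""
    return Counter(text.lower().split())


first = bag_of_words("The dog bit the man")
second = bag_of_words("The man bit the dog")

# Same words, same counts: the model cannot tell who bit whom.
print(first == second)
```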
  • Paul Keating / Consultant
  • Think about audience - interactive, batch, API
  • Sometimes two audiences - users calling the library and users of the application
  • Did you mean? error 👍
  • Something went wrong error 👎
  • Use logger.exception
  • Avoid identical messages for different error paths
  • Need to test the error path
  • Understandable, explicit, unambiguous, point in the right direction
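A rough sketch of two of the points above, using only the standard library: a "Did you mean?" message built with `difflib.get_close_matches`, and `logger.exception` to record the traceback before re-raising. The command list and function names are hypothetical.

```python
import difflib
import logging

logger = logging.getLogger(__name__)

COMMANDS = ["start", "stop", "status", "restart"]  # hypothetical command set


def unknown_command_message(name, choices=COMMANDS):
    """Build an error message that points the user in the right direction."""
    matches = difflib.get_close_matches(name, choices, n=1)
    if matches:
        return f"Unknown command {name!r}. Did you mean {matches[0]!r}?"
    return f"Unknown command {name!r}. Valid commands: {', '.join(choices)}"


def parse_port(raw):
    """Parse a port number, logging the full traceback on failure."""
    try:
        return int(raw)
    except ValueError:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Could not parse port from %r", raw)
        raise
```

Each failure path gets its own message, so the log tells you which branch was hit, and both paths are small enough to cover with a test.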
  • Dan Taylor / Microsoft
  • Snippets and tasks
  • ⌘P then ⌥⏎ to open in split tab
  • click to open definition
  • 💡 Learn how to use watch in the debugger
  • Right click and run to cursor (~temporary breakpoint)
  • IntelliCode 🤯
  • Language server is opt-in for now but will be the default soon
  • VS Live Share 🤯
  • Victor Stinner / Red Hat
  • Inheritance from object 🤢
  • long vs int 🤢
  • Unicode 🤢
  • 2008 - Python 3 was released
  • All dependencies must be Python 3 compatible
  • Some projects were forked to add Python 3 support
  • Python 3 wall of shame
  • ensurepip
  • Don't remove Python 2 support; add Python 3 support alongside it
  • u"unicode" re-added in Python 3.3 - doesn't do anything but helps port Python 2 code
  • Python 2 backports
  • 95% of top 200 packages support Python 3
  • Instagram on Python 3
    • ⬇️12% CPU
    • ⬇️30% Memory
  • time.monotonic() added in Python 3.3
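A small sketch of the last bullet: `time.monotonic()` never goes backwards, so it suits measuring elapsed time better than the wall clock. The `timed` helper is a made-up name for illustration.

```python
import time


def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.monotonic()  # unaffected by system clock adjustments
    result = func(*args)
    return result, time.monotonic() - start


total, elapsed = timed(sum, range(1000))
print(total, elapsed >= 0)
```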
  • Ines Montani / Explosion AI
  • Video
  • Slides
  • Explosion AI
  • spaCy
  • Bootstrapped / self-funded / did consulting
  • prodigy
  • 🔴 Misconception #1 - you need to run at a loss
    • Reasons to run at a loss: network effects, scale operations, predatory pricing, enterprise sales
    • Upfront costs
    • Bigger is not necessarily better
    • Most businesses aren't "winner takes all"
    • Optimise for median outcome
  • 🔴 Misconception #2 - you need to hire lots of people
    • 🚌 test
    • Excellence requires authorship / ownership (not design by committee)
    • Building the right thing
    • Specialists - people don't understand what others are working on, more meetings...
    • Generalists
    • Complementary - mix of skills ✨
    • Tree-shaped skills 🌳
  • 🔴 Misconception #3 - you can't make good decisions without testing all of your assumptions
    • Inverse of survivorship bias - We didn't do X and we failed, therefore X would have saved us.
    • http://autopsy.io
    • You can't replace logic with data
    • Build things you think are good
  • 🔴 Misconception #4 - the true value lies in your user data
    • Sell products, not promises
    • Focus on what you can really charge people money for right now
    • Ship value, charge money
    • Profit is the best KPI
  • Konstantin Ignatov / Qrator
  • bytearray
  • Succinct data structures use knowledge about the data
  • PySDSL
  • SDSL
  • x.bit_compress() picks the most appropriate data type (e.g. int16) for the data
  • Compression but still supports same operations
  • Suffix arrays 🎉
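The idea behind picking the smallest data type can be sketched in plain Python. This is not the PySDSL API, just an illustration under the assumption that all values are non-negative integers: work out the narrowest bit width that fits every value, pack the values at that width, and still read any element by index.

```python
def bits_needed(values):
    """Smallest bit width that can hold every non-negative value."""
    return max(1, max(values).bit_length())


def pack(values, width):
    """Store the values side by side in one integer, `width` bits each."""
    packed = 0
    for index, value in enumerate(values):
        packed |= value << (index * width)
    return packed


def get(packed, width, index):
    """Random access: read one element without unpacking the rest."""
    return (packed >> (index * width)) & ((1 << width) - 1)


readings = [3, 7, 1, 0, 5, 6]  # hypothetical data
width = bits_needed(readings)
packed = pack(readings, width)
print(width, [get(packed, width, i) for i in range(len(readings))])
```

Six values fit in 18 bits here, and indexing still works, which is the "compression but still supports same operations" point in miniature.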
  • Mario Corchero / Bloomberg
  • side_effect can be an iterable (e.g. a list) - each call returns the next item
  • patch(..., autospec=True)
  • seal(mock) - stops undefined attributes being mocked
  • Mock(wraps=func) - a spy
  • sentinel
  • You can name your mocks - Mock(name="bob")
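The bullets above map onto `unittest.mock` roughly as follows. The `area` function and the mock names are hypothetical. `create_autospec` is used here in place of `patch(..., autospec=True)` so the example needs nothing to patch, and `seal` requires Python 3.7 or later.

```python
from unittest.mock import Mock, create_autospec, seal, sentinel


def area(width, height):  # hypothetical function under test
    return width * height


# side_effect as an iterable: each call returns (or raises) the next item.
fetch = Mock(name="fetch", side_effect=[1, 2, ValueError("boom")])
first, second = fetch(), fetch()

# wraps: a spy that calls the real function and records the call.
spy = Mock(wraps=area)
spied_result = spy(2, 3)
spy.assert_called_once_with(2, 3)

# autospec: the mock enforces the real signature; sentinel gives a
# unique placeholder object to compare against.
strict = create_autospec(area, return_value=sentinel.area)
strict_result = strict(2, 3)

# seal: attributes not configured before sealing raise AttributeError.
client = Mock(name="client")
client.get.return_value = "ok"
seal(client)
```

After sealing, `client.get()` still works, but a typo such as `client.gte` fails loudly instead of silently returning a new mock.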
  • Lynn Root / Spotify
  • Used in Spotify's chaos monkey service
  • Event driven hostname generation for DNS
  • SLI - Service Level Indicators
  • Be careful to avoid accidentally swallowing exceptions
  • asyncio.create_task
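  • Might want to handle asyncio.CancelledError in cancelled tasks
  • asyncio.gather only raises the first exception by default; with return_exceptions=True exceptions come back as results and are easy to swallow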
  • http://rogue.ly/aio
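One way to avoid accidentally swallowing exceptions when awaiting several tasks is to ask `asyncio.gather` for them explicitly and then inspect every result. A minimal sketch, with a made-up `work` coroutine standing in for real tasks (needs Python 3.7+ for `asyncio.run`):

```python
import asyncio


async def work(n):
    """Hypothetical unit of work; task 2 fails."""
    await asyncio.sleep(0)
    if n == 2:
        raise ValueError(f"task {n} failed")
    return n


async def main():
    # return_exceptions=True hands exceptions back as ordinary results,
    # so they must be checked explicitly or they disappear.
    results = await asyncio.gather(
        *(work(n) for n in range(4)), return_exceptions=True
    )
    failures = [r for r in results if isinstance(r, Exception)]
    successes = [r for r in results if not isinstance(r, Exception)]
    return successes, failures


successes, failures = asyncio.run(main())
print(successes, failures)
```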
  • Bernat Gabor / Bloomberg
  • tox
  • detox - run environments in parallel
  • Testing with different versions of Python, Django etc.
  • Hynek Schlawack / Variomedia
  • Use environment variables
  • environ_config library
  • Don't put secrets in environment variables 🤔
  • /-/ prefix for internal endpoints (e.g. /-/version)
  • Liveness vs health endpoint
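A minimal sketch of reading configuration from environment variables with only the standard library. This is a stand-in, not the environ_config API, and the `APP_*` variable names and defaults are assumptions. Note that no secrets are read here, in line with the bullet above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    host: str
    port: int
    debug: bool


def load_config(env=os.environ):
    """Build a typed, immutable config from a mapping of environment variables."""
    return Config(
        host=env.get("APP_HOST", "127.0.0.1"),
        port=int(env.get("APP_PORT", "8000")),
        debug=env.get("APP_DEBUG", "0").lower() in {"1", "true", "yes"},
    )


config = load_config({"APP_PORT": "9000", "APP_DEBUG": "true"})
print(config)
```

Passing the mapping in as a parameter keeps the loader easy to test without touching the real environment.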
  • Nicole Harris / PeopleDoc
  • Pronounced Py-P-I
  • Costs ~$1M/year to run
  • Almar Klein / Consultant
  • Video
  • Slides
  • Low-level representation of code
  • Doesn't make too many assumptions about the architecture it will be run on
  • Browsers (or other platforms) turn the WASM into native instructions
  • WASM is a binary format but it has a human readable version
  • It's safe - primarily designed to run in the browser
  • ppci (pure python compiler infrastructure)
  • Can compile a subset of Python to WASM
  • Can compile WASM to native code
  • Therefore Python to native code 🎉
  • Can import WASM modules written in other languages
  • Wow!
  • Isabel Lopez / Smarkets
  • Luigi for building the pipeline
  • Apache Parquet column-oriented data store on Hadoop
  • Spark (Amazon EMR)
  • Columnar (colum-nar)
  • RDD - Resilient Distributed Dataset
  • 8M events (overkill?)
  • Ed Singleton / Consultant
  • Video
  • Repetition, social awkwardness, over-stimulation, stubbornness, meltdowns, "fizzy mind"
  • Over-stimulation - flood of information coming in
  • Stubbornness - unreasonableness
  • Correlation: insomnia, clumsiness, alcoholism, ADD / ADHD, low muscle tone, eating disorders
  • Neurological differences: larger brains, more neurons
  • Benefits: systemising, repetition / obsession, radical honesty, originality of thinking, more spare time, attention to detail
  • Aspergers archetype: obsessive, blunt, intelligent, original thinker, not suited to physical work...
  • "It seems that for success in science or art a dash of autism is essential." - Hans Asperger

  • Social problems: resting "whatever" face, don't force socialising, try to tolerate meltdowns, less likely to be mentored
  • Work patterns: the unknown is scary, Agile / Kanban, prefigure changes
  • Meetings: don't expect people to speak up, actively manage turn-taking
  • Owen Campbell / Consultant
  • ICI
  • Leadership can be learnt
  • Must be practiced
  • Bounce
  • Find opportunities outside of work to practice
  • Schmooze 'em, bruise 'em or lose 'em. eek 🤔
  • Priorities: customers, investors, other teams, processes, resources, deliverables. Must ensure good communication with these.
  • Leader should work on lower-priority tasks in case they need to be dropped
  • Leadership styles: dictatorial, paternalistic, consensual, democratic, hands off
  • Dictator → Observer
  • Rookie → Expert
  • Dictator + Rookie = Git
  • Craig Kerstiens / Citus
  • psqlrc (\x auto, \timing, history)
  • pgtune
  • Cache hit rate should be >=99%
  • Index hit rate should be >=95%
  • EXPLAIN and EXPLAIN ANALYSE
  • pg_stat_statements (extension)
  • Use GIN index when a column has multiple values (e.g. an array)
  • Use GIST for shapes (geo-spatial) and full text search
  • SP-GIST and BRIN for large tables (e.g. timeseries)
  • Composite, conditional, and functional indexes
  • Safe migrations:
    1. Allow nulls but set default value
    2. Backfill
    3. Add constraint
  • CREATE INDEX CONCURRENTLY - doesn't lock the table
  • Connection pooling - at application layer or daemon (pgBouncer)
  • Replication wal-e/wal-g or barman
  • Sharding with Citus (appears as a single database) - see Instagram talk
  • Logical backup (pg_dump) vs physical backup
  • Use physical backups for larger databases
    • Less load on system 👍
    • Not portable 👎
    • pg_dump won't work for databases >~50GB
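The hit rate thresholds above are plain ratios of hits to total lookups. A small sketch of the arithmetic follows. The counter values are hypothetical, and the assumption that they come from PostgreSQL's statistics views (for example `heap_blks_hit` and `heap_blks_read` in `pg_statio_user_tables` for the cache) is mine, not from the talk.

```python
def hit_rate(hits, misses):
    """Fraction of lookups served without a miss, or None if there were none."""
    total = hits + misses
    return hits / total if total else None


def meets_target(hits, misses, target):
    """True when the hit rate reaches the target fraction."""
    rate = hit_rate(hits, misses)
    return rate is not None and rate >= target


# Hypothetical counters: 9,950 blocks from cache, 50 read from disk.
print(hit_rate(9950, 50), meets_target(9950, 50, 0.99))
```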
  • Yury Selivanov / EdgeDB
  • asyncio.run
  • serve_forever
  • @asyncio.coroutine will be deprecated soon
  • Try to avoid using event loop
  • Context variables
  • Trio
  • create_supervisor
  • TaskGroup
  • gather doesn't cancel the other tasks when one fails 💥
  • Tokio - asynchronous event loop / runtime for Rust
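The gather bullet above can be checked with a short script. The `slow` and `failing` coroutines are hypothetical: when one awaitable fails, `asyncio.gather` raises straight away but leaves the other task running.

```python
import asyncio

finished = []


async def slow():
    await asyncio.sleep(0.05)
    finished.append("slow")


async def failing():
    raise RuntimeError("boom")


async def main():
    slow_task = asyncio.create_task(slow())
    try:
        await asyncio.gather(slow_task, failing())
    except RuntimeError:
        pass
    # gather has already raised, yet slow_task was not cancelled.
    still_running = not slow_task.done()
    await slow_task  # let it finish so nothing is left dangling
    return still_running, slow_task.cancelled()


still_running, was_cancelled = asyncio.run(main())
print(still_running, was_cancelled, finished)
```

This is the gap that supervisor or TaskGroup style APIs aim to close, by cancelling sibling tasks when one of them fails.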