@tonybaloney
Last active September 27, 2024 06:53
PyCon US 2024 Talk Notes

PyCon US 2024 Talk Notes - Unlocking the Parallel Universe: Sub Interpreters and Free-Threading in Python 3.13

Prerequisites

  1. PyCon 2023 – Eric Snow talk on sub interpreters
  2. EuroPython 2022 – Sam Gross talk on free-threading
  3. PyCon 2024 – “Sync vs Async in Python” happening right now
  4. PyCon 2024 – Building a JIT compiler for CPython
  5. PyCon 2024 – Overcoming GIL with sub interpreters and immutability
  6. “Parallelism and Concurrency” chapter from CPython Internals
  7. My Master’s thesis: doi.org/10.25949/23974764.v1

Section 1 - Parallel Execution in Python

Parallel Execution

| Model | Execution | Start-up time | Data Exchange | Best for… |
| --- | --- | --- | --- | --- |
| threads | Parallel * | small | Any | Small, IO-bound tasks that don’t require multiple CPU cores |
| coroutines | Concurrent | smallest | Any | Small, IO-bound tasks that don’t require multiple CPU cores |
| multiprocessing | Parallel | large | Serialization | Larger, CPU or IO-bound tasks that require multiple CPU cores |
| Sub Interpreters | Parallel | medium** | Serialization or Shared Memory | Larger, CPU or IO-bound tasks that require multiple CPU cores |
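
To make the first two rows concrete, here is a minimal sketch that runs the same small, IO-bound job with threads and with coroutines. The `DELAY` constant and the task functions are hypothetical stand-ins for real network or disk waits, not code from the talk.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.05  # hypothetical stand-in for a network or disk wait


def blocking_task(n):
    # Simulates blocking IO; the GIL is released while the thread sleeps
    time.sleep(DELAY)
    return n * 2


async def async_task(n):
    # Simulates non-blocking IO; control returns to the event loop while waiting
    await asyncio.sleep(DELAY)
    return n * 2


def run_threads(items):
    # One OS thread per item; results come back in input order
    with ThreadPoolExecutor(max_workers=len(items)) as pool:
        return list(pool.map(blocking_task, items))


async def _gather(items):
    # All coroutines are scheduled on a single thread
    return await asyncio.gather(*(async_task(n) for n in items))


def run_coroutines(items):
    return asyncio.run(_gather(items))


items = list(range(8))
thread_results = run_threads(items)
coroutine_results = run_coroutines(items)
```

Both versions overlap the waits, so either suits this kind of workload. Neither spreads pure-Python CPU work across cores on a build with the GIL enabled.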

Threading Benchmark

Crude sample:

import numpy

# Create a random array of 100,000 integers
a = numpy.random.randint(0, 100, 100_000)
for x in a:
  abs(x - 50)

Benchmark code to get the "2x slower" figure:

import numpy
import threading

# Create a random array of 100,000 integers between 0 and 100
a = numpy.random.randint(0, 100, 100_000)


def simple_abs_range(vec):
  for x in vec:
    abs(x - 50)


def f_linear():
  # Calculate the distance for each value to 50
  simple_abs_range(a)


def f_threaded():
  threads = []
  # Split array into 100 blocks of 1,000 values and start a thread for each
  for ar in numpy.split(a, 100):
    t = threading.Thread(target=simple_abs_range, args=(ar,))
    t.start()
    threads.append(t)
  for t in threads:
    t.join()
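
The listing above defines the two functions but does not time them. One way to compare them is sketched below with `timeit`. It assumes a plain Python list in place of the NumPy array so it has no dependencies, and the `n_threads`, `number` and `repeat` values are arbitrary choices, not the settings used for the talk.

```python
import random
import threading
import timeit

# Assumption: a plain list replaces the NumPy array so the sketch needs only the stdlib
values = [random.randint(0, 99) for _ in range(100_000)]


def simple_abs_range(vec):
    for x in vec:
        abs(x - 50)


def f_linear():
    simple_abs_range(values)


def f_threaded(n_threads=100):
    # Split the list into n_threads equal chunks and start a thread for each
    size = len(values) // n_threads
    threads = []
    for i in range(n_threads):
        chunk = values[i * size:(i + 1) * size]
        t = threading.Thread(target=simple_abs_range, args=(chunk,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()


# Take the best of three runs, each calling the function five times
linear_s = min(timeit.repeat(f_linear, number=5, repeat=3))
threaded_s = min(timeit.repeat(f_threaded, number=5, repeat=3))
print(f"linear:   {linear_s:.4f}s")
print(f"threaded: {threaded_s:.4f}s")
```

The ratio between the two numbers depends on the machine, the Python version and whether the build is free-threaded, so treat any single run as indicative only.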

Sub Interpreter vs Thread vs multiprocessing benchmark

The Jupyter Notebook for this sample is here.

Demo

The demo code is here

Terms and Conditions

  1. Specializations are not enabled in free threading (yet)
  2. GitHub Actions does not have a free-threaded build
  3. Packaging does not detect the ABI for free-threading correctly
  4. Some benchmarks are slower with free threading
  5. The datetime module is not thread safe
  6. C extensions need to support multi-phase initialization to work in sub interpreters
  7. Most of your 3rd party C extensions aren’t supported yet
  8. Cython is not supported
  9. Django does not work in sub interpreters because of (5)
  10. Did I mention the datetime module is not thread safe?
  11. Most PyPI C extensions are not thread safe
  12. orjson, pydantic-core, httptools and uvloop don’t compile
  13. Nothing using PyO3 is supported (see https://github.com/PyO3/pyo3/releases/tag/v0.22.0)
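
Several of these caveats depend on which build is actually running, so it can help to check at runtime. The sketch below is one way to do that, assuming CPython 3.13 or later for `sys._is_gil_enabled()` and falling back to "GIL enabled" on older versions. `describe_build` is a hypothetical helper name.

```python
import sys
import sysconfig


def describe_build():
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in Python 3.13; assume the GIL is on if it is absent
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(check()) if check is not None else True
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


info = describe_build()
print(info)
```

A free-threaded build can still report `gil_enabled` as true, for example when an imported extension module has not declared that it is safe to run without the GIL.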