@mistercrunch
Created August 3, 2016 17:00

Airflow recap since 1.7.1 (Jun 23rd)

  • 243 PRs by 61 committers
  • 437 messages on the mailing list
  • 40 companies
  • Upcoming unsolicited talk: "A Practical Introduction to Airflow" at PyBay by Matt Davis (Clover Health)

No wonder it's hard to keep up!

Committers with 1+ commits

Author PRs
jlowin 38
Bolke de Bruin 28
Chris Riccomini 21
Dan Davydov 18
Maxime Beauchemin 11
Kengo Seki 11
Siddharth Anand 11
Stanislav Kudriashev 8
Sid Anand 7
Sumit Maheshwari 7
Arthur Wiedmer 6
Li Xuanji 5
Alex Van Boxel 4
Eric Stern 4
Hongbo Zeng 3
Hervé Werner 3
Norman Mu 3
Alexey Ustyantsev 3
Ajay Yadav 2
Rob Froetscher 2
Ilya Rakoshes 2
Tsuyoshi Ozawa 2
John Bodley 2
Joy Gao 2
Junwei Wang 2
Yap Sok Ann 2

Highlights

  • Complete rewrite of the scheduler, with amazing cross-company collaboration
    • The scheduler is threaded and will scale to the next order of magnitude
    • The scheduler is insulated: DAG code is parsed in subprocesses
  • jlowin's Git integration tool makes working within Apache's constraints much easier; less friction when merging is really important
  • @mistercrunch is no longer a bottleneck for the project; many engineers from different companies now know its core
  • More logging & stats collection in critical areas
  • Refactor of the dependency engine (@aoen, pending merge), along with a "Why isn't my task triggered" web page
  • Workload management for Hive
  • SparkSql & EMR operators
  • Apache compliance (removing Highcharts, license headers)
  • Tons of bug fixes, tweaks and polish
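
As a rough illustration of the "insulated scheduler" highlight above, here is a minimal sketch of the general idea of evaluating user-supplied DAG files in a child process so that a crash or hang cannot take down the parent. This is not Airflow's actual implementation; the names `parse_in_subprocess` and `write_temp_file` are hypothetical, and the sketch assumes Python 3.7+ and that simply executing the file stands in for "parsing" it.

```python
import os
import subprocess
import sys
import tempfile


def parse_in_subprocess(path, timeout=30):
    """Execute a DAG-like Python file in a child interpreter.

    Hypothetical helper, not part of Airflow. Returns (ok, stderr_text).
    A crash, non-zero exit, or hang in the file only affects the child
    process, never the caller.
    """
    try:
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out after %s seconds" % timeout
    return result.returncode == 0, result.stderr


def write_temp_file(source):
    """Demo helper (hypothetical): write source to a temporary .py file."""
    handle, path = tempfile.mkstemp(suffix=".py")
    with os.fdopen(handle, "w") as f:
        f.write(source)
    return path


# One well-behaved file and one that kills its own interpreter.
good = write_temp_file("x = 1 + 1\n")
bad = write_temp_file("import os\nos._exit(3)  # simulate a hard crash\n")

good_ok, _ = parse_in_subprocess(good)
bad_ok, _ = parse_in_subprocess(bad)
print(good_ok, bad_ok)  # prints: True False

os.remove(good)
os.remove(bad)
```

The parent process keeps running after the second file hard-exits, which is the property the insulation is meant to provide.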