@pingzh
Last active January 6, 2022 19:32
Airflow perf test for TaskInstance (TI) creation inside DagRun.verify_integrity. The test runs against a database with no other traffic.
import time
import logging
from airflow.utils.db import create_session
from airflow.utils import timezone
from airflow.models import TaskInstance
from airflow.models.serialized_dag import SerializedDagModel
logger = logging.getLogger(__name__)
out_hdlr = logging.FileHandler('./log.txt')
out_hdlr.setFormatter(logging.Formatter('%(asctime)s %(message)s'))
out_hdlr.setLevel(logging.INFO)
logger.addHandler(out_hdlr)
logger.setLevel(logging.INFO)
def create_tis_in_new_dag_run(dag, execution_date, number_of_tis):
    tasks = list(dag.task_dict.values())[0:number_of_tis]
    t1 = time.time()
    success = True
    try:
        with create_session() as session:
            for task in tasks:
                ti = TaskInstance(task, execution_date)
                session.add(ti)
            session.flush()
    except Exception:
        success = False
    t2 = time.time()
    logger.info('Created %s tis. success?: %s, perf: %s', number_of_tis, success, t2 - t1)
    return t2 - t1, success
def perf_tis_creation(dag):
    perf = {}
    for number_of_tis in [1000, 3000, 5000, 10000, 15000, 20000, 25000]:
        next_run_date = timezone.utcnow()
        # Use a distinct name so the result doesn't shadow the perf dict
        duration, success = create_tis_in_new_dag_run(dag, next_run_date, number_of_tis)
        perf[number_of_tis] = (duration, success)
        time.sleep(5)
    return perf
if __name__ == '__main__':
    dag_id = 'a-very-large-dag'
    dm = SerializedDagModel.get(dag_id)
    dag = dm.dag
    perf_tis_creation(dag)
@KevinYang21 commented Dec 21, 2021

Created 3000 tis. success?: True, perf: 3.33885216713
Created 5000 tis. success?: True, perf: 4.80166292191
Created 10000 tis. success?: True, perf: 10.0343668461
Created 15000 tis. success?: True, perf: 15.3140938282
Created 20000 tis. success?: True, perf: 19.5153648853
Created 25000 tis. success?: True, perf: 21.7094979286

@ashb commented Jan 5, 2022

MySQL is shocking. Postgres has near-linear performance:

[2022-01-05 10:57:25,672]  1106663 MainProcess {{__main__ test_ti_creation.py:32}} INFO - Created 1000 tis. success?: False, perf: 0.0025680065155029297
[2022-01-05 10:57:30,678]  1106663 MainProcess {{__main__ test_ti_creation.py:32}} INFO - Created 3000 tis. success?: False, perf: 0.0017096996307373047
[2022-01-05 10:57:35,686]  1106663 MainProcess {{__main__ test_ti_creation.py:32}} INFO - Created 5000 tis. success?: False, perf: 0.0025453567504882812
[2022-01-05 10:57:40,698]  1106663 MainProcess {{__main__ test_ti_creation.py:32}} INFO - Created 10000 tis. success?: False, perf: 0.0053005218505859375
[2022-01-05 10:57:45,706]  1106663 MainProcess {{__main__ test_ti_creation.py:32}} INFO - Created 15000 tis. success?: False, perf: 0.0023920536041259766
[2022-01-05 10:57:50,711]  1106663 MainProcess {{__main__ test_ti_creation.py:32}} INFO - Created 20000 tis. success?: False, perf: 0.0014641284942626953
[2022-01-05 10:57:55,722]  1106663 MainProcess {{__main__ test_ti_creation.py:32}} INFO - Created 25000 tis. success?: False, perf: 0.0048677921295166016

Pay more attention to the output, Ash: it was so quick because it failed.

@potiuk commented Jan 5, 2022

Elephant in the room indeed :)

@potiuk commented Jan 5, 2022

BTW @ashb: is it an accident that you got success?: False? (Maybe that's the reason for the numbers?)

@ashb commented Jan 5, 2022

Oh hmmm, no, didn't notice that.

@ashb commented Jan 5, 2022

Better numbers (had to change it a bit for latest main):

[2022-01-05 12:05:13,647]  1262808 MainProcess {{__main__ test_ti_creation.py:35}} INFO - Created 1000 tis. success?: True, perf: 0.2504695159950643
[2022-01-05 12:05:19,537]  1262808 MainProcess {{__main__ test_ti_creation.py:35}} INFO - Created 3000 tis. success?: True, perf: 0.8673499860014999
[2022-01-05 12:05:25,884]  1262808 MainProcess {{__main__ test_ti_creation.py:35}} INFO - Created 5000 tis. success?: True, perf: 1.3316335039999103
[2022-01-05 12:05:33,485]  1262808 MainProcess {{__main__ test_ti_creation.py:35}} INFO - Created 10000 tis. success?: True, perf: 2.5858788749974337
[2022-01-05 12:05:42,555]  1262808 MainProcess {{__main__ test_ti_creation.py:35}} INFO - Created 15000 tis. success?: True, perf: 4.056032521002635
[2022-01-05 12:05:52,678]  1262808 MainProcess {{__main__ test_ti_creation.py:35}} INFO - Created 20000 tis. success?: True, perf: 5.108857400002307
[2022-01-05 12:06:03,551]  1262808 MainProcess {{__main__ test_ti_creation.py:35}} INFO - Created 21407 tis. success?: True, perf: 5.853602839000814

@ashb commented Jan 5, 2022

MySQL 5.7 numbers on my laptop:

[2022-01-05 13:00:36,973] {test_ti_creation.py:55} INFO - Created 1000 tis. success?: True, perf: 0.3342053010055679
[2022-01-05 13:00:42,990] {test_ti_creation.py:55} INFO - Created 3000 tis. success?: True, perf: 1.0060032569963369
[2022-01-05 13:00:49,775] {test_ti_creation.py:55} INFO - Created 5000 tis. success?: True, perf: 1.7641245650011115
[2022-01-05 13:00:58,649] {test_ti_creation.py:55} INFO - Created 10000 tis. success?: True, perf: 3.860055482000462
[2022-01-05 13:01:09,464] {test_ti_creation.py:55} INFO - Created 15000 tis. success?: True, perf: 5.801471699000103
[2022-01-05 13:01:22,447] {test_ti_creation.py:55} INFO - Created 20000 tis. success?: True, perf: 7.970231813000282
[2022-01-05 13:01:36,190] {test_ti_creation.py:55} INFO - Created 21407 tis. success?: True, perf: 8.734915605004062

So smaller in absolute terms, but still following the same pattern (my large DAG has "only" 21407 tasks).

@ashb commented Jan 6, 2022

My version is here: https://gist.github.com/ashb/f43ab6300f0f84ed91a599923de3a673 (updated to run against main, which needs a DagRun before a TaskInstance can be created).

I also tried out the bulk performance modes mentioned at https://docs.sqlalchemy.org/en/13/faq/performance.html?highlight=bulk%20create#i-m-inserting-400-000-rows-with-the-orm-and-it-s-really-slow, which give about a 15-20% speed-up for bulk_save_objects on MySQL and (unfortunately for you) a 40% speed-up on PostgreSQL: apache/airflow#20722

Edit: 140% on Postgres now!
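For context, the difference between the per-object `session.add()` path and `bulk_save_objects()` can be sketched with plain SQLAlchemy against in-memory SQLite. The `TI` model below is a hypothetical stand-in for Airflow's task_instance table, not the real schema, and absolute timings depend entirely on the backend:

```python
import time
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class TI(Base):
    # Hypothetical stand-in for Airflow's task_instance table
    __tablename__ = "ti"
    id = Column(Integer, primary_key=True)
    task_id = Column(String(250))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

def orm_add(n):
    # One session.add() per row: full unit-of-work bookkeeping per object
    session = Session()
    t0 = time.perf_counter()
    for i in range(n):
        session.add(TI(task_id=f"task_{i}"))
    session.commit()
    return time.perf_counter() - t0

def bulk(n):
    # bulk_save_objects skips most identity-map and cascade bookkeeping
    session = Session()
    t0 = time.perf_counter()
    session.bulk_save_objects([TI(task_id=f"task_{i}") for i in range(n)])
    session.commit()
    return time.perf_counter() - t0

print("orm_add:", orm_add(10000))
print("bulk:   ", bulk(10000))
```

The speed-up comes precisely from the features `bulk_save_objects` drops: no identity map, no relationship cascades, no per-object refresh, which is fine for write-once rows like new TIs.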

@pingzh (author) commented Jan 6, 2022

@ashb nice, we have been using bulk_save_objects for a while; it saved us around 28 seconds in production when creating a large dag run (from ~45 seconds to ~17 seconds).

Wow, looking at those numbers, PostgreSQL is so much better.

@ashb commented Jan 6, 2022

Yeah I was surprised by just how much it sped up!

@potiuk commented Jan 6, 2022

Good stuff @ashb! I remember times when I got even better results just by batching DB writes (telecom billing process). I think there are only two reasons why you should NOT do batching:

  1. When you care about transactional integrity "per object"
  2. When you have risk of deadlocks

I think in this case we have neither the need (1) nor the risk (2).
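As an illustration of that trade-off (not Airflow code): inserting in fixed-size batches keeps each transaction short, which bounds lock hold time and deadlock exposure, at the cost of the per-object atomicity from point 1. A minimal stdlib sqlite3 sketch, with a hypothetical ti table:

```python
import sqlite3

def insert_batched(conn, rows, batch_size=1000):
    # One transaction per batch: a failure only rolls back the current
    # batch, and locks are held for at most batch_size inserts at a time.
    cur = conn.cursor()
    for i in range(0, len(rows), batch_size):
        cur.executemany("INSERT INTO ti (task_id) VALUES (?)", rows[i:i + batch_size])
        conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ti (id INTEGER PRIMARY KEY, task_id TEXT)")
insert_batched(conn, [(f"task_{i}",) for i in range(5000)])
print(conn.execute("SELECT COUNT(*) FROM ti").fetchone()[0])  # 5000
```

Tuning batch_size trades throughput (fewer commits) against how much work is lost or locked per transaction.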
