Hello! I will repeat my issue, since I now have a little bit more info. I have a DAG that uses joblib and runs under the CeleryExecutor:
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from joblib import Parallel, delayed

# default_args placeholder; the actual dict is not important for reproducing the problem
args = {'owner': 'airflow', 'start_date': datetime(2018, 6, 19)}

def joblib_code(*args, **kwargs):
    # Two trivial parallel jobs are enough to trigger the failure
    Parallel(n_jobs=2)(delayed(print)(i) for i in range(2))

dag = DAG(dag_id='test_joblib',
          default_args=args,
          schedule_interval=timedelta(minutes=5))

wiki_task = PythonOperator(task_id='run',
                           provide_context=True,
                           python_callable=joblib_code,
                           dag=dag)
Every time it runs, it fails with this exception:
[2018-06-19 16:09:24,757] {{cli.py:374}} INFO - Running on host cfaef9fb627d
[2018-06-19 16:09:24,942] {{logging_mixin.py:84}} INFO - 1
[2018-06-19 16:09:24,944] {{logging_mixin.py:84}} INFO - 0
[2018-06-19 16:09:25,045] {{models.py:1470}} ERROR - Killing subprocess
[2018-06-19 16:09:25,045] {{logging_mixin.py:84}} WARNING - Process ForkPoolWorker-1:
[2018-06-19 16:09:25,047] {{logging_mixin.py:84}} WARNING - Traceback (most recent call last):
[2018-06-19 16:09:25,047] {{logging_mixin.py:84}} WARNING - File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
[2018-06-19 16:09:25,047] {{logging_mixin.py:84}} WARNING - File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
[2018-06-19 16:09:25,047] {{logging_mixin.py:84}} WARNING - File "/opt/conda/lib/python3.6/multiprocessing/pool.py", line 108, in worker
task = get()
[2018-06-19 16:09:25,047] {{logging_mixin.py:84}} WARNING - File "/opt/conda/lib/python3.6/site-packages/joblib/pool.py", line 364, in get
rrelease()
[2018-06-19 16:09:25,047] {{logging_mixin.py:84}} WARNING - File "/opt/conda/lib/python3.6/site-packages/airflow/models.py", line 1472, in signal_handler
raise AirflowException("Task received SIGTERM signal")
[2018-06-19 16:09:25,047] {{logging_mixin.py:84}} WARNING - airflow.exceptions.AirflowException: Task received SIGTERM signal
I also found an old JIRA issue about this: https://issues.apache.org/jira/browse/AIRFLOW-972, but it is still open and has no details.
Small tasks finish successfully because they have enough time, but large tasks are killed near the end. Could someone please help me figure out how to debug this? I do not know what I could try or check. Is the problem in Airflow or in Celery? The same code runs fine as a plain Python script outside Airflow. Thanks.
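For reference, this is roughly the standalone script I used to confirm that joblib itself works outside Airflow and Celery (the file name is just what I happened to use, and the job count matches the DAG above):

# standalone_joblib_check.py - runs cleanly when executed directly with python
from joblib import Parallel, delayed

if __name__ == '__main__':
    # Same call as in the DAG's python_callable; prints 0 and 1 and exits without errors
    Parallel(n_jobs=2)(delayed(print)(i) for i in range(2))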