@mmattax
Created February 3, 2023 17:22

Installing Airflow on Alma Linux

This guide sets up a simple "standalone" Airflow server that can run Docker tasks.

MySQL

Install MySQL server

sudo dnf upgrade --refresh -y
sudo dnf install -y mysql-server mysql-devel gcc
sudo systemctl enable --now mysqld

Set up Airflow database

Instructions are described here: https://airflow.apache.org/docs/apache-airflow/2.0.0/howto/initialize-database.html

CREATE DATABASE airflow CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE USER 'airflow' IDENTIFIED BY 'airflow';
GRANT ALL PRIVILEGES ON airflow.* TO 'airflow';
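
The statements above are meant to be run in a MySQL shell. One way to apply them non-interactively is sketched below. The file name `airflow-db.sql` and the helper `apply_airflow_sql` are illustrative, not part of the original setup, and the sketch assumes the MySQL root account can still log in locally without a password, which is typical right after a fresh install.

```shell
# Save the statements from above so they can be reviewed and replayed.
SQL_FILE="${SQL_FILE:-airflow-db.sql}"   # hypothetical file name
cat > "$SQL_FILE" <<'SQL'
CREATE DATABASE airflow CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE USER 'airflow' IDENTIFIED BY 'airflow';
GRANT ALL PRIVILEGES ON airflow.* TO 'airflow';
SQL

# Hypothetical helper: feed the file to the local server as the MySQL root user.
apply_airflow_sql() {
  sudo mysql -u root < "$SQL_FILE"
}

echo "Wrote $SQL_FILE; call apply_airflow_sql to apply it."
```

Calling `apply_airflow_sql` once is enough; running it a second time fails on the `CREATE` statements because the database and user already exist.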

Conda

Latest install scripts can be found here: https://docs.conda.io/en/latest/miniconda.html.

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
sudo ./Miniconda3-latest-Linux-x86_64.sh

When prompted for an install path, set it to /opt/miniconda3

Mamba

Conda is slow, so install mamba into the base environment:

sudo /opt/miniconda3/bin/conda install mamba -n base -c conda-forge

Activate the environment

If this wasn't done automatically during the install, we can activate the environment:

/opt/miniconda3/bin/conda init
source ~/.bashrc

Install Airflow

Copy the following conda environment file into airflow.yml:

name: airflow
channels:
  - conda-forge
  - defaults
  - pypy
dependencies:
  - python=3.8
  - pip
  - pip:
    - mysqlclient
    - apache-airflow==2.3.4 -c "https://raw.githubusercontent.com/apache/airflow/constraints-2.3.4/constraints-no-providers-3.8.txt"
    - apache-airflow-providers-docker

Then install:

mamba env create -f airflow.yml
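
Before going further, it can help to confirm that the new environment actually contains a working `airflow` command. The following is a minimal sketch, assuming conda lives at `/opt/miniconda3/bin/conda` as above; `check_airflow_env` is a hypothetical helper name.

```shell
CONDA="${CONDA:-/opt/miniconda3/bin/conda}"

# Hypothetical helper: print the Airflow version installed in the "airflow" env.
check_airflow_env() {
  if [ ! -x "$CONDA" ]; then
    echo "conda not found at $CONDA" >&2
    return 1
  fi
  "$CONDA" run -n airflow airflow version
}

check_airflow_env || echo "Airflow environment is not ready yet"
```

The printed version should match the release named in the constraints URL of the environment file.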

Now let's generate the airflow config (note that we're setting Airflow's home to /opt/airflow):

AIRFLOW_HOME=/opt/airflow /opt/miniconda3/bin/conda run -n airflow airflow db init

Now we'll create an airflow user, and set the permissions for AIRFLOW_HOME:

sudo useradd airflow
sudo chown -R airflow:airflow /opt/airflow

Configure Airflow

Update /opt/airflow/airflow.cfg

Make the following line changes in the Airflow config:

  • Use LocalExecutor:
executor = LocalExecutor
  • Set the DB to MySQL:
sql_alchemy_conn = mysql+mysqldb://airflow:airflow@localhost/airflow
  • Remove the example DAGs:
load_examples = False
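
The three edits above can also be applied with a small script instead of a text editor. This is a minimal sketch rather than part of the original setup: `set_cfg` is a hypothetical helper, and it assumes each key already appears exactly once in the generated airflow.cfg as an uncommented `key = value` line.

```shell
CFG="${CFG:-/opt/airflow/airflow.cfg}"

# set_cfg KEY VALUE: rewrite the existing "KEY = ..." line in $CFG in place.
# "|" is the sed delimiter because the connection string contains slashes;
# values containing "|" or "&" would need escaping first.
set_cfg() {
  sed -i "s|^$1 *=.*|$1 = $2|" "$CFG"
}

if [ -f "$CFG" ]; then
  set_cfg executor LocalExecutor
  set_cfg sql_alchemy_conn "mysql+mysqldb://airflow:airflow@localhost/airflow"
  set_cfg load_examples False
  # Show the resulting lines for a quick visual check.
  grep -E '^(executor|sql_alchemy_conn|load_examples) *=' "$CFG"
else
  echo "$CFG not found; run 'airflow db init' first"
fi
```

Run it as a user that can write the file, for example through `sudo -u airflow` once ownership has been handed to the airflow user.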

Rebuild the Airflow database

Now that we've set Airflow to use MySQL, let's rebuild the database:

AIRFLOW_HOME=/opt/airflow /opt/miniconda3/bin/conda run -n airflow airflow db init

Put Airflow in systemd

Airflow has example unit files for systemd: https://airflow.apache.org/docs/apache-airflow/stable/howto/run-with-systemd.html

  • Set AIRFLOW_HOME in /etc/environment:
AIRFLOW_HOME=/opt/airflow
  • Create /etc/systemd/system/airflow-scheduler.service
[Unit]
Description=Airflow scheduler daemon
After=network.target mysqld.service
Wants=mysqld.service
[Service]
EnvironmentFile=/etc/environment
User=airflow
Group=airflow
Type=simple
ExecStart=/opt/miniconda3/bin/conda run -n airflow airflow scheduler
Restart=always
RestartSec=5s
[Install]
WantedBy=multi-user.target
  • Create /etc/systemd/system/airflow-webserver.service
[Unit]
Description=Airflow webserver daemon
After=network.target mysqld.service
Wants=mysqld.service
[Service]
EnvironmentFile=/etc/environment
User=airflow
Group=airflow
Type=simple
ExecStart=/opt/miniconda3/bin/conda run -n airflow airflow webserver
Restart=on-failure
RestartSec=5s
PrivateTmp=true
[Install]
WantedBy=multi-user.target
  • Enable Airflow:
sudo systemctl daemon-reload
sudo systemctl enable --now airflow-webserver
sudo systemctl enable --now airflow-scheduler
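
Once both units are enabled, a quick status check shows whether they actually stayed up. The sketch below uses a hypothetical helper, `airflow_units_status`, and relies only on `systemctl is-active`, which prints the unit state on stdout.

```shell
# Hypothetical helper: report the systemd state of each Airflow unit.
airflow_units_status() {
  local unit state
  for unit in airflow-scheduler airflow-webserver; do
    # is-active prints the state (active, inactive, failed, ...) and
    # exits non-zero when the unit is not active, hence the "|| true".
    state="$(systemctl is-active "$unit" 2>/dev/null)" || true
    echo "$unit: ${state:-unknown}"
  done
}

airflow_units_status
```

If a unit reports `failed`, `journalctl -u airflow-scheduler -n 50` (or the same with `airflow-webserver`) shows the most recent log lines for it.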

Install Docker

sudo dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
sudo dnf install docker-ce -y
sudo systemctl enable --now docker

Add users to the Docker group

sudo usermod -aG docker $USER
sudo usermod -aG docker airflow
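
Group changes are easy to get wrong silently, so it may be worth confirming them. The following sketch checks the group database directly; `in_docker_group` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the given user is listed in the docker group.
in_docker_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx docker
}

for u in "${USER:-$(id -un)}" airflow; do
  if in_docker_group "$u"; then
    echo "$u: in docker group"
  else
    echo "$u: NOT in docker group"
  fi
done
```

New group membership only applies to processes started after the change, so log out and back in, and restart the Airflow services if they were already running.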

Firewall

Airflow runs on port 8080, so let's open it up:

sudo firewall-cmd --zone=public --add-port 8080/tcp --permanent
sudo firewall-cmd --reload
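
To confirm the webserver is answering, the sketch below polls Airflow's `/health` endpoint with curl. The helper name `webserver_healthy` and the `AIRFLOW_URL` variable are illustrative; replace `localhost` with the server's address to test from another machine through the firewall.

```shell
AIRFLOW_URL="${AIRFLOW_URL:-http://localhost:8080}"

# Hypothetical helper: succeed if the webserver answers on /health within 5 seconds.
webserver_healthy() {
  curl -fsS --max-time 5 "$AIRFLOW_URL/health" >/dev/null 2>&1
}

if webserver_healthy; then
  echo "webserver reachable at $AIRFLOW_URL"
else
  echo "webserver not reachable at $AIRFLOW_URL"
fi
```

On the server itself, `sudo firewall-cmd --zone=public --list-ports` should include `8080/tcp` after the reload.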