
Apache Airflow 1.9 to 1.10 Upgrade

Upgrade or Downgrade Apache Airflow from 1.9 to 1.10 and vice-versa 

  • Check the current version using the airflow version command.
  • Identify the new Airflow version you want to run.
  • Kill all the Airflow containers (webserver, scheduler, workers, etc.).
  • Take a backup of all your DAGs and plugins along with the current airflow.cfg file.
  • Take a backup of your Airflow metadata. For MySQL, use the command below –

       mysqldump --host=MYSQL_HOST_NAME --user=MYSQL_USER_NAME --password=MYSQL_PASSWORD MYSQL_SCHEMA > airflow_metastore_mysql_backup.sql

Example – mysqldump --host=localhost --user=tanuj --password=tanuj airflow_db > airflow_meta_backup.sql

  • Upgrading from version 1.9 to 1.10 requires setting SLUGIFY_USES_TEXT_UNIDECODE=yes or AIRFLOW_GPL_UNIDECODE=yes in your working environment.
  • Install the new version using the pip install apache-airflow[celery]=={new_version} command.
  • Execute airflow initdb to regenerate the metadata tables for the new version. Delete the newly generated airflow.cfg and copy back the one you backed up previously.
  • Update the old airflow.cfg with parameters compatible with the new version, such as the Celery settings for the 1.10 setup I have mentioned above.
  • Run airflow upgradedb to upgrade the schema.
  • Run the show processlist; command in MySQL to watch the changes happening at the MySQL level.
  • Restart all the Airflow containers (webserver, scheduler, workers, etc.) and test that everything is working fine.
  • If any service fails to boot or tasks are not running as usual, roll back to the previous Airflow version. The whole upgrade sequence is sketched after this list.
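Put together, the upgrade steps look roughly like the following shell session. This is a sketch, not a drop-in script – the ~/airflow_backup directory, the MySQL credentials, and the 1.10.0 version pin are placeholder assumptions; substitute your own.

       # Sketch of the 1.9 -> 1.10 upgrade sequence; stop webserver, scheduler,
       # and workers before running it.
       cp -r $AIRFLOW_HOME/dags $AIRFLOW_HOME/plugins ~/airflow_backup/
       cp $AIRFLOW_HOME/airflow.cfg ~/airflow_backup/
       mysqldump --host=localhost --user=tanuj --password=tanuj airflow_db > airflow_meta_backup.sql

       # Required by the 1.10 installer (chooses the non-GPL unidecode dependency).
       export SLUGIFY_USES_TEXT_UNIDECODE=yes

       pip install 'apache-airflow[celery]==1.10.0'
       airflow initdb                                   # regenerates airflow.cfg and metadata tables
       cp ~/airflow_backup/airflow.cfg $AIRFLOW_HOME/   # restore the tuned config
       airflow upgradedb                                # apply the schema migrations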

Issues faced while Upgrading/Downgrading Apache Airflow from 1.9 to 1.10 and vice-versa 

Issue Faced – ImportError: pessimistic_connection_handling

Reason – pessimistic_connection_handling() is part of the Airflow 1.9 source code and has been removed from Airflow 1.10.

Solution – Either move this function's code into our plugin as-is, or find some other way to fulfill the health-detailed API. A sketch of the first option follows.
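For the first option, here is a minimal sketch of carrying the function over into a plugin module, based on the SQLAlchemy pessimistic disconnect-handling recipe that the removed 1.9 function implemented. The module name pessimistic_connection.py is hypothetical.

       # plugins/pessimistic_connection.py - hypothetical module name.
       # Pings the database on every pool checkout and discards dead connections,
       # following the SQLAlchemy "pessimistic disconnect handling" recipe.
       from sqlalchemy import event, exc
       from sqlalchemy.pool import Pool


       def pessimistic_connection_handling():
           @event.listens_for(Pool, "checkout")
           def ping_connection(dbapi_connection, connection_record, connection_proxy):
               cursor = dbapi_connection.cursor()
               try:
                   cursor.execute("SELECT 1")  # cheap liveness check
               except Exception:
                   # Tells SQLAlchemy to invalidate this connection and retry
                   # the checkout with a fresh one.
                   raise exc.DisconnectionError()
               cursor.close()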

Issue Faced – Sensor hierarchy error: ImportError: No module named snakebite.client

Reason – Previously (1.9 setup), all the first-class Airflow sensors lived inside the airflow.operators.sensors package. Now (1.10 setup), the first-class Airflow operators and sensors have been moved to the airflow.operators and airflow.sensors packages respectively, for consistency.

Solution – Import from airflow.sensors instead of airflow.operators.sensors. In addition, airflow.contrib.sensors.hdfs_sensors has been renamed to airflow.contrib.sensors.hdfs_sensor, again for consistency, in case we use it in the future. The import change is shown below.
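Concretely, the import change looks like this. ExternalTaskSensor and HdfsSensorFolder are just example classes; any first-class or contrib sensor follows the same pattern.

       # Airflow 1.9 - sensors under airflow.operators, contrib module hdfs_sensors:
       # from airflow.operators.sensors import ExternalTaskSensor
       # from airflow.contrib.sensors.hdfs_sensors import HdfsSensorFolder

       # Airflow 1.10 - first-class sensors moved, contrib module renamed:
       from airflow.sensors.external_task_sensor import ExternalTaskSensor
       from airflow.contrib.sensors.hdfs_sensor import HdfsSensorFolder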

Issue Faced – ImportError: TIMEZONE

Reason – Applying set global explicit_defaults_for_timestamp=1; while the Airflow 1.10 webserver and scheduler instances are running gives the same error on both; in turn, the worker fails to boot.

Solution – Apply this MySQL setting after killing all the Airflow containers. Command – [SET GLOBAL explicit_defaults_for_timestamp=1;]
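Since the containers must be down when this runs, the flag can also be applied non-interactively from the shell, for example (the credentials are the same placeholders as above):

       # Apply the MySQL flag while Airflow is stopped, then restart the containers.
       mysql --host=localhost --user=tanuj --password=tanuj \
             -e "SET GLOBAL explicit_defaults_for_timestamp = 1;"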
Issue Faced – ResolutionError: No such revision or branch '9635ae0956e7'

Reason – Downgrading the Airflow metadata and then running airflow initdb gives this error.

Solution – In case of a downgrade, restore the backed-up metadata first and then execute the airflow initdb command.
Issue Faced – IntegrityError: (1062, "Duplicate entry")

Reason – Downgrading the Airflow metadata and then running airflow initdb gives this error.

Solution – In case of a downgrade, restore the backed-up metadata first and then execute the airflow initdb command.
Issue Faced – Broken DAG: cannot import name 'TIMEZONE'

Reason – Upgrading the Airflow version while the old version is still running produces this error on the Airflow UI and the webserver as well.

Solution – First kill all the containers, then install the new version.
Issue Faced – ERROR – [0 / 0] some workers seem to have died

Reason – Running airflow upgradedb while the old version is still running produces this error on the webserver and scheduler as well.

Solution – First kill all the containers, then execute the upgrade command.

Rollback Strategy for Airflow 

  • Kill all the Airflow containers (webserver, scheduler, workers, etc.).
  • Restore the MySQL database from the backup taken at the time of the upgrade, using the command below –

            $ mysql --host=MYSQL_HOST_NAME --user=MYSQL_USER_NAME --password=MYSQL_PASSWORD MYSQL_SCHEMA < airflow_metastore_mysql_backup.sql

  • Copy the backed-up airflow.cfg into AIRFLOW_HOME.
  • Reinstall the old Airflow version using pip install apache-airflow[celery]=={OLD_AIRFLOW_VERSION} --upgrade.
  • Finally, restart all the Airflow containers (webserver, scheduler, workers, etc.) and test that everything is working fine. The whole rollback sequence is sketched after this list.
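Put together, the rollback looks roughly like this. As with the upgrade sketch, the backup locations, MySQL credentials, and the 1.9.0 version pin are placeholder assumptions.

       # Sketch of the rollback sequence; stop webserver, scheduler, and workers first.
       mysql --host=localhost --user=tanuj --password=tanuj airflow_db < airflow_metastore_mysql_backup.sql

       cp ~/airflow_backup/airflow.cfg $AIRFLOW_HOME/airflow.cfg
       pip install 'apache-airflow[celery]==1.9.0' --upgrade

       # Restart the containers and verify that DAGs run as before.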

Results 

  1. I performed this Airflow 1.10 installation and the upgrade/downgrade (1.9 to 1.10 and vice-versa) on my local Mac machine, keeping our service in mind.
  2. I was able to run the existing compute DAG and plugins successfully.
