Built to Scale: Running Highly-Concurrent ETL with Apache Airflow (part 1)

Apache Airflow has seemingly taken the data engineering world by storm. It was originally created and maintained by Airbnb, and has been part of the Apache Foundation for several years now. After heavily leveraging it for about a year (almost 2 million ¬°idempotent! ETL tasks later) and seeing its full potential (but numerous drawbacks), I was tasked with streamlining the deployment and operation of the system. The obvious first step? »