Guest appearance on the Astronomer Dataflow Podcast — discussing how I transformed data engineering at Deloitte Digital using Apache Airflow.

From ETL to Airflow: Transforming Data Engineering at Deloitte Digital

Astronomer Dataflow Podcast · April 10, 2025

In this episode, I talk about the journey of modernising legacy ETL pipelines to cloud-native Apache Airflow workflows, building Customer Data Platforms at scale, and the lessons learned architecting data solutions for enterprise clients at Deloitte.

View episode page on Astronomer →

What we covered

Before Airflow, our data orchestration at Deloitte Digital was held together with traditional ETL tools like Talend — rigid, hard to scale, and painful to maintain across cloud environments. The episode starts there: what the problems actually looked like before we made the switch, and why flexibility was the deciding factor when we evaluated alternatives.

A big chunk of the conversation is about managing dynamic DAGs at scale: the patterns that work, the pitfalls to avoid, and how we handled complex workflow configurations across enterprise clients. We also get into hybrid executors, which made a real difference to both performance and operational efficiency once we had them running properly.
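The config-driven approach behind dynamic DAG generation can be sketched in plain Python. The client names, schedules, and table lists below are illustrative, and the dict stands in for what would be an Airflow `DAG` object in a real deployment:

```python
# Hypothetical per-client configs driving DAG generation (all names illustrative).
CLIENT_CONFIGS = {
    "acme": {"schedule": "@daily", "tables": ["orders", "customers"]},
    "globex": {"schedule": "@hourly", "tables": ["events"]},
}


def build_dag_spec(client: str, cfg: dict) -> dict:
    """Return a plain-dict DAG spec; in Airflow this loop would construct
    a DAG object per client inside the DAG file's module scope."""
    return {
        "dag_id": f"etl_{client}",
        "schedule": cfg["schedule"],
        "tasks": [f"extract_{t}" for t in cfg["tables"]]
        + [f"load_{t}" for t in cfg["tables"]],
    }


# One generated spec per client config entry.
specs = {name: build_dag_spec(name, cfg) for name, cfg in CLIENT_CONFIGS.items()}
```

The key property of the pattern is that adding a client is a config change, not a code change: the generation loop stays identical as the number of workflows grows.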

On the testing and monitoring side: I walk through the mocking mechanisms we use for DAG validation, and the observability stack we built around Prometheus, Grafana, and Loki to keep everything visible. Cost is something people don't talk about enough with self-managed Airflow infrastructure — we cover that honestly too.

We wrap up on Airflow 3.0, specifically the hybrid executor changes and what they mean for teams still deciding between self-managed and Astronomer-managed setups.
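For teams weighing that decision, the hybrid executor setup itself is a small configuration change. A minimal sketch, assuming the Kubernetes executor's provider package is installed:

```ini
[core]
# Comma-separated list: the first entry is the environment default,
# and individual tasks can opt into another executor from the list.
executor = LocalExecutor,KubernetesExecutor
```

Lightweight tasks then run locally by default, while heavy or isolated workloads can be routed to Kubernetes on a per-task basis.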

Apache Airflow · Airflow 3.0 · Dynamic DAGs · Hybrid Executors · ETL Modernisation · Talend · Prometheus · Grafana · Loki · Cloud-Agnostic · DAG Testing · Deloitte Digital · Data Orchestration

Watch on YouTube

Listen on Spotify

Shared by Astronomer on LinkedIn