Core Concepts
Understand DAGs, Tasks, Operators, Schedulers, Executors, Hooks, XComs and more with interactive cards.
An interactive guide to Apache Airflow: what it is, how it works, real example DAGs, and an AI assistant to help you decide whether Airflow is right for your project.
See real-world workflow patterns — ETL pipelines, ML training, data quality checks — rendered as interactive DAG diagrams.
Ask the AI assistant anything about Airflow. Get personalized advice on whether to adopt Airflow or build your own system.
Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. Workflows are expressed as Directed Acyclic Graphs (DAGs) of tasks written in Python. Airflow schedules those tasks, tracks their execution, retries failures, and provides a web UI to visualize everything.
You write a Python file that defines a DAG. The DAG contains Tasks, each backed by an Operator (e.g. PythonOperator for a Python function, BashOperator for a shell command, plus SQL, HTTP, and hundreds of provider operators). You declare dependencies between tasks, and Airflow's scheduler runs them in the right order.
Click any card to expand with a detailed explanation and code example.
Real-world workflow patterns shown as Airflow DAG diagrams.
Extracts data from a source API, transforms it, loads to warehouse, then notifies the team. Runs daily at midnight.
Runs weekly: ingest fresh data, engineer features, train model, evaluate against threshold, conditionally deploy to production.
Runs after each data load: checks completeness, referential integrity, business rules. Alerts on failures, auto-quarantines bad rows.
Shows Airflow's BranchPythonOperator: choose which downstream path to take at runtime based on data or business logic.
An honest assessment of where Airflow shines and where it struggles.
How Airflow compares to common workflow orchestration tools and approaches.
| Feature | Airflow | Prefect | Temporal | Celery Beat | Build Your Own |
|---|---|---|---|---|---|
| DAG / Workflow UI | ✓ Rich | ✓ Rich | ◐ Basic | ✗ None | ✗ DIY |
| Scheduling | ✓ Cron/interval | ✓ Cron/interval | ◐ Via timers | ✓ Cron | ◐ Varies |
| Python-native | ✓ | ✓ Decorator-based | ✓ SDK | ✓ | ✓ |
| Retries & backfill | ✓ Excellent | ✓ Good | ✓ Excellent | ◐ Basic | ✗ DIY |
| Dynamic workflows | ◐ Limited | ✓ First-class | ✓ Excellent | ✗ No | ✓ Full control |
| Event-driven | ◐ Sensors | ◐ Partial | ✓ First-class | ✗ No | ✓ Full control |
| Operator ecosystem | ✓ 1000+ | ✓ Good | ◐ Growing | ✗ None | ✗ DIY |
| Ops complexity | ✗ High (metadata DB + scheduler) | ◐ Medium | ✗ High (cluster) | ✓ Low | ✓ Low |
| Managed cloud option | ✓ MWAA, Composer | ✓ Prefect Cloud | ✓ Temporal Cloud | ✗ No | ✗ No |
| Best for | Data/ML pipelines, ETL, large teams | Modern Python workflows, dynamic graphs | Microservice orchestration, long-running processes | Simple job scheduling, small apps | Full control, unique requirements |
If you already use Celery for background tasks, Celery Beat adds scheduling on top. It's much simpler to run (no extra DB schema, no web UI), but you get no DAG visualization, no backfill, no fan-out graph support. Good for <10 scheduled jobs.
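For comparison, a hypothetical Celery Beat configuration sketch: scheduling is a plain `beat_schedule` dict on the app, with no DAG or dependency concept. The broker URL and task names are assumptions for illustration:

```python
from celery import Celery
from celery.schedules import crontab

# Broker URL is an assumption; any Celery-supported broker works.
app = Celery("jobs", broker="redis://localhost:6379/0")

app.conf.beat_schedule = {
    "nightly-cleanup": {
        "task": "jobs.cleanup",                  # name as registered below
        "schedule": crontab(minute=0, hour=0),   # every day at midnight
    },
    "heartbeat-every-5-min": {
        "task": "jobs.heartbeat",
        "schedule": 300.0,                       # plain seconds also work
    },
}

@app.task(name="jobs.cleanup")
def cleanup():
    print("cleaning up")

@app.task(name="jobs.heartbeat")
def heartbeat():
    print("still alive")
```

Each entry fires a single task on a timer; anything beyond that (ordering, retries with backfill, visualization) is up to you.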
Prefect is more Pythonic (decorator-based), supports dynamic DAGs natively, and has a cleaner developer experience. Airflow has a larger ecosystem and more production deployments. If you're starting fresh, Prefect 2.x is worth evaluating.
Ask anything about Apache Airflow — concepts, architecture, best practices, or whether it fits your use case.
Hi! I'm your Airflow expert. Ask me anything — from "what is a DAG?" to "should I use Airflow for my specific use case?"
Try one of the suggested questions on the right, or type your own below.