Airflow 1.10.12 broke branching. Airflow 1.10.13 fixes it.
Airflow 1.10.12 Change
The Airflow Changelog and this Airflow PR describe the following updated functionality.
[AIRFLOW-5391] Do not re-run skipped tasks when they are cleared (#7276)
This PR fixes the following issue:
If a task is skipped by BranchPythonOperator, BaseBranchOperator or ShortCircuitOperator and the user then clears the skipped task, it'll execute.
After this PR:
The NotPreviouslySkippedDep rule will first evaluate if a task has a direct SkipMixin parent that has decided to skip it. This is done by examining the XCom data stored by SkipMixin.skip() or SkipMixin.skip_all_except().
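To make the idea concrete, here is a minimal sketch of that kind of check in plain Python. It is a simplified model, not Airflow's actual code: the xcom dict stands in for real XCom storage, and the function name previously_skipped and the keys "followed" and "skipped" are illustrative assumptions.

```python
# Simplified model of the check; not Airflow's actual implementation.
# The dict stands in for XCom, and the key names are illustrative.

def previously_skipped(task_id, parents, xcom):
    """Return True if a direct branching parent already decided to skip task_id."""
    for parent_id in parents:
        decision = xcom.get(parent_id)
        if decision is None:
            # Parent stored no decision (not a branching task, or it has not run).
            continue
        # A branch operator records which tasks it chose to follow.
        if "followed" in decision and task_id not in decision["followed"]:
            return True
        # A short-circuit style operator records which tasks it skipped.
        if task_id in decision.get("skipped", []):
            return True
    return False


# "branching" chose branch_a on the first run, so branch_b was skipped.
xcom = {"branching": {"followed": ["branch_a"]}}

print(previously_skipped("branch_b", ["branching"], xcom))  # True: stays skipped after clearing
print(previously_skipped("branch_a", ["branching"], xcom))  # False: runs again after clearing
```

The point is that the decision is read back from stored data, so clearing a task does not erase the fact that its branching parent chose a different path.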
At first glance, this is a great update! It lets you clear and re-run tasks downstream of branches while preserving the expected outcome.
Airflow 1.10.13 Change
However, the implementation above introduced another bug: it no longer respects the following rule from the Airflow Documentation.
Paths of the branching task are branch_a, join and branch_b. Since join is a downstream task of branch_a, it will be excluded from the skipped tasks when branch_a is returned by the Python callable.
This issue was raised in the following Airflow PR (cloned issue here):
Make sure SkipMixin.skip_all_except() handles empty branches like this properly. When "branch_a" is followed, "join" must not be skipped even though it is considered to be immediately downstream of "branching".
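The rule is easier to see on a toy model of that DAG. The sketch below uses plain Python with no Airflow imports; the DOWNSTREAM mapping and the helper names (all_downstream, tasks_to_skip, naive_tasks_to_skip) are hypothetical and only illustrate the expected behavior, not how Airflow implements it.

```python
# Toy model of the documentation's example DAG; plain Python, no Airflow imports.
# Each key maps a task to its direct downstream tasks.
DOWNSTREAM = {
    "branching": ["branch_a", "join", "branch_b"],
    "branch_a": ["join"],
    "branch_b": ["join"],
    "join": [],
}


def all_downstream(task_id):
    """Collect every task reachable from task_id, directly or indirectly."""
    seen = set()
    stack = list(DOWNSTREAM[task_id])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(DOWNSTREAM[current])
    return seen


def naive_tasks_to_skip(branching_id, followed_ids):
    """Skip every direct child that was not returned; this wrongly includes join."""
    return sorted(t for t in DOWNSTREAM[branching_id] if t not in followed_ids)


def tasks_to_skip(branching_id, followed_ids):
    """Skip direct children, except followed branches and anything downstream of them."""
    keep = set(followed_ids)
    for followed in followed_ids:
        keep |= all_downstream(followed)
    return sorted(t for t in DOWNSTREAM[branching_id] if t not in keep)


print(naive_tasks_to_skip("branching", ["branch_a"]))  # ['branch_b', 'join']
print(tasks_to_skip("branching", ["branch_a"]))        # ['branch_b']
```

With branch_a returned, only branch_b should be skipped; join survives because it is downstream of the followed branch, even though it is also a direct child of branching.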
So, branching will be fixed once again when Airflow 1.10.13 is rolled out!
Conclusion
Airflow has its growing pains, but it’s good to see the project move in the right direction! At the time of writing this article, Airflow 1.10.12 is the stable version, so the 1.10.13 fix hasn’t been rolled out yet. Keep up to date on changes like this by participating in the Airflow community. You can take a look at Airflow 2.0 planning to see more upcoming changes. Thanks to the PR creators for the images!