Problem
Your DLT pipeline update fails with the following error while trying to execute the apply_changes function.
io.delta.exceptions.ConcurrentDeleteDeleteException: [DELTA_CONCURRENT_DELETE_DELETE] ConcurrentDeleteDeleteException: This transaction attempted to delete one or more files that were deleted (for example XXXX.snappy.parquet) by a concurrent update. Please try the operation again.
Cause
There is a write conflict between the MERGE, UPDATE, or DELETE commands executed by the pipeline's apply_changes function and the OPTIMIZE with ZORDER operation, which runs as part of regular DLT maintenance.
Even when deletion vectors (row-level concurrency) are enabled, they do not fully eliminate such conflicts. This is expected behavior. DLT performs maintenance tasks within 24 hours of a table being updated. By default, the system performs a full OPTIMIZE operation, with ZORDER BY if specified, followed by VACUUM.
For more information, review the Isolation levels and write conflicts on Databricks (AWS | Azure | GCP) documentation.
Solution
Set the table property delta.enableRowTracking to true to enable row tracking on the table. This property helps reduce concurrency errors within DLT pipelines when an OPTIMIZE with ZORDER operation triggered by the DLT maintenance pipeline conflicts with MERGE, UPDATE, or DELETE operations.
Specify the config as a table property in the DLT table definition.
import dlt

dlt.create_streaming_table(
    name="<table-name>",
    table_properties={"delta.enableRowTracking": "true"},
)

dlt.apply_changes(<definition>)
For more information, review the Use row tracking for Delta tables (AWS | Azure | GCP) documentation.
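For Delta tables that are not defined by the pipeline, the same property can be enabled after the fact with an ALTER TABLE statement. This is a sketch using a placeholder table name; note that tables managed by a DLT pipeline should instead set the property in the table definition as shown above, so that pipeline updates do not overwrite it.

```sql
-- Enable row tracking on an existing Delta table (placeholder name).
ALTER TABLE <table-name>
SET TBLPROPERTIES ('delta.enableRowTracking' = 'true');
```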