Problem
When you use Lakeflow Declarative Pipelines (LDP) to ingest data, the pipeline fails with the following error.
Error:
SparkException: Job aborted due to stage failure: com.databricks.sql.io.FileReadException: Error while reading file dbfs:<path-to-file>.
..
Caused by: org.apache.spark.SparkRuntimeException: [CANNOT_READ_ARCHIVED_FILE] Cannot read file at path dbfs:<path-to-file> because it has been archived. Please adjust your query filters to exclude archived files. SQLSTATE: KD003
..
Caused by: java.io.IOException: java.lang.RuntimeException: java.io.IOException: Operation failed: "This operation is not permitted on an archived blob.", 409, GET, <url-with-path-to-file>?"
Cause
The LDP ingestion process uses Auto Loader to read the source files, some of which have been archived. Archived files are moved to a storage class that cannot be accessed directly for processing, so Auto Loader fails when it tries to read them.
Solution
If your S3 object storage class is Glacier, move the files to a bucket whose lifecycle policy does not transition objects to Glacier. For more information, refer to the Archival support in Databricks documentation.
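As an illustration, the following is a minimal boto3 sketch of one way to do this, assuming AWS credentials are already configured. The bucket names and object key are placeholders, and a Glacier object must finish restoring before it can be copied.

import boto3

s3 = boto3.client("s3")

# Request a temporary restore of the archived object (asynchronous;
# the Standard retrieval tier typically completes within hours).
s3.restore_object(
    Bucket="archived-bucket",
    Key="<path-to-file>",
    RestoreRequest={"Days": 7, "GlacierJobParameters": {"Tier": "Standard"}},
)

# Once the restore completes, copy the object to a bucket whose
# lifecycle policy does not transition objects to Glacier.
s3.copy_object(
    Bucket="non-glacier-bucket",
    Key="<path-to-file>",
    CopySource={"Bucket": "archived-bucket", "Key": "<path-to-file>"},
)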
Alternatively, add a timestamp filter option, such as modifiedAfter, to your Auto Loader readStream call and set it to the timestamp from which you want to start reading, as in the sketch below.
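For example, the following is a minimal sketch of an LDP source table using the dlt Python API and the pipeline's predefined spark session. The table name, source path, file format, and cutoff timestamp are placeholders; adjust them for your pipeline.

import dlt

@dlt.table
def ingested_data():
    # Only list and read files modified after the cutoff timestamp,
    # so older, archived files are excluded from the stream.
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("modifiedAfter", "2025-01-01 00:00:00.000000 UTC+0")
        .load("<path-to-source>")
    )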
For more information, refer to the Auto Loader options documentation.