Problem
Your Apache Spark job fails with the following error message while processing a Delta table.
org.apache.spark.sql.AnalysisException: Found duplicate column(s) in the metadata update: col1, col2...
Cause
The Delta table contains duplicate column names. Column names that differ only by case are considered duplicates.
Delta Lake is case preserving, but case insensitive, when storing a schema.
Parquet is case sensitive when storing and returning column information.
Spark can be case sensitive, but it is case insensitive by default.
To avoid potential data corruption or data loss, duplicate column names are not allowed.
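
You can reproduce the behavior with a small example. This is a minimal sketch, assuming a Spark session configured with Delta Lake support (for example, the delta-spark package); the path /tmp/demo_table and the column names are illustrative only.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("duplicate-column-demo").getOrCreate()

# "col1" and "COL1" differ only by case, so Delta Lake treats them as duplicates.
df = spark.createDataFrame([(1, 2)], ["col1", "COL1"])

# With the default spark.sql.caseSensitive=false, writing this schema to a
# Delta table raises org.apache.spark.sql.AnalysisException for duplicate columns.
df.write.format("delta").mode("overwrite").save("/tmp/demo_table")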
Solution
Delta tables must not contain duplicate column names. Rename or drop one of the conflicting columns so that every column name in the table is unique, even when compared case-insensitively.
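
One way to resolve the conflict is to rename one of the case-conflicting columns before writing. This is a minimal sketch that continues the illustrative DataFrame above; the new column name col1_alt is an assumption, not a required convention.

# Rename one of the conflicting columns so the names are unique even
# when compared case-insensitively.
deduped = df.withColumnRenamed("COL1", "col1_alt")

# The write now succeeds because every column name is unique.
deduped.write.format("delta").mode("overwrite").save("/tmp/demo_table")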