Problem
After you build an Iceberg table with dbt in Databricks, a timestamp column shows up as timestamp in the Databricks UI or in DESCRIBE output. However, when you open the Iceberg metadata JSON file in S3, the same column is recorded as timestamptz.
Cause
Apache Spark treats a timestamp as an instant in time, storing microseconds and converting them to the session time zone when displayed.
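As a rough illustration of this behavior, the sketch below formats one fixed instant (0 microseconds after the Unix epoch) under two session time zones. The two time zones are arbitrary choices for the example; timestamp_micros and date_format are built-in Spark SQL functions.

```sql
-- The stored value is the same instant in both queries;
-- only the session time zone used for rendering changes.
SET TIME ZONE 'UTC';
SELECT date_format(timestamp_micros(0), 'yyyy-MM-dd HH:mm:ss') AS rendered;
-- 1970-01-01 00:00:00

SET TIME ZONE 'America/Los_Angeles';
SELECT date_format(timestamp_micros(0), 'yyyy-MM-dd HH:mm:ss') AS rendered;
-- 1969-12-31 16:00:00
```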
Iceberg uses different names for the equivalent types. A Spark timestamp maps to Iceberg timestamptz (timestamp with time zone), and a Spark timestamp_ntz maps to Iceberg timestamp (timestamp without time zone). For more information, refer to the “Spark type to Iceberg type” section of the Apache Iceberg Spark Writes documentation.
The Databricks UI displays the Spark view of the schema, showing timestamp, while the Iceberg metadata file records timestamptz due to the Spark to Iceberg type mapping.
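To see the two Spark type names side by side, a query like the following may help. It is only a sketch and assumes a runtime with TIMESTAMP_NTZ support. The Iceberg names in the comments restate the mapping described above; the query itself does not return them.

```sql
SELECT
  -- Spark reports: timestamp      (Iceberg records: timestamptz)
  typeof(current_timestamp())                        AS instant_type,
  -- Spark reports: timestamp_ntz  (Iceberg records: timestamp)
  typeof(CAST(current_timestamp() AS TIMESTAMP_NTZ)) AS wall_clock_type;
```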
Solution
This behavior is a difference in type mapping, not an issue with data consistency. There is no need for concern.
If desired, you can convert an existing timestamp column to timestamp_ntz, which Iceberg records as timestamp, for visual consistency in Databricks. The following steps use a column named event_time as the example.
- Enable the required table properties.
-- Turn on TIMESTAMP_NTZ support for this Delta table
ALTER TABLE <catalog>.<schema>.<table>
SET TBLPROPERTIES ('delta.feature.timestampNtz' = 'supported');
-- Switch the table to name-based column mapping
ALTER TABLE <catalog>.<schema>.<table>
SET TBLPROPERTIES ('delta.columnMapping.mode' = 'name');
- Add a replacement column with the new type.
ALTER TABLE <catalog>.<schema>.<table>
ADD COLUMN new_event_time TIMESTAMP_NTZ;
- Copy the existing values into the new column.
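-- The cast converts each instant to wall-clock time in the session time zone
UPDATE <catalog>.<schema>.<table>
SET new_event_time = CAST(event_time AS TIMESTAMP_NTZ);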
- Remove the old column.
ALTER TABLE <catalog>.<schema>.<table>
DROP COLUMN event_time;
- Rename the replacement column back to the original name.
ALTER TABLE <catalog>.<schema>.<table>
RENAME COLUMN new_event_time TO event_time;
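The following checks are a sketch and not part of the documented procedure. The first query is meant to run after the copy step and before the old column is dropped; the second runs after the rename. Both assume the example column names event_time and new_event_time used above, and the first assumes the same session time zone that was active during the UPDATE.

```sql
-- Before dropping event_time: count rows where the copied value
-- differs from the original (null-safe comparison). Expect 0.
SELECT COUNT(*) AS mismatched_rows
FROM <catalog>.<schema>.<table>
WHERE NOT (new_event_time <=> CAST(event_time AS TIMESTAMP_NTZ));

-- After the rename: confirm event_time is now reported as timestamp_ntz.
DESCRIBE TABLE <catalog>.<schema>.<table>;
```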
Note
An in-place ALTER TABLE … CHANGE COLUMN … TIMESTAMP → TIMESTAMP_NTZ is not a suggested solution because this type change is not supported.