Updated April 21st, 2023 by sergios.lalas
Field name sorting changes in Apache Spark 3.x
Problem: When using a map transformation on an RDD with Databricks Runtime 9.1 LTS and above, the resulting schema order differs from the order produced by the same map transformation on Databricks Runtime 7.3 LTS.
Cause: Databricks Runtime 9.1 LTS and above incorporate Apache Spark 3.x. Starting with Spark 3.0.0, rows created from named arguments...
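As a rough illustration of the ordering difference, the sketch below uses plain Python only and does not call PySpark. It assumes the behavior described in the Spark 3.0 migration notes: before 3.0, a row built from named arguments had its field names sorted alphabetically, while 3.0 and later keep the order in which the arguments were passed. The helper names `field_order_spark2` and `field_order_spark3` are hypothetical and exist only for this example.

```python
# Plain-Python illustration only. These helpers are hypothetical stand-ins
# for how a row built from named arguments orders its fields; they do not
# use or require PySpark.

def field_order_spark2(**fields):
    """Mimic the pre-3.0 behavior: field names sorted alphabetically."""
    return sorted(fields)


def field_order_spark3(**fields):
    """Mimic the 3.x behavior: field names kept in the order given."""
    return list(fields)


if __name__ == "__main__":
    # Same named arguments, two different resulting field orders.
    print(field_order_spark2(name="Alice", age=30))
    print(field_order_spark3(name="Alice", age=30))
```

Code that relies on column position after a map transformation, rather than on column name, is where a difference like this would tend to surface.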
Updated April 21st, 2023 by sergios.lalas
Decreased performance when using DELETE with a subquery on Databricks Runtime 10.4 LTS
Problem: Auto optimize on Databricks (AWS | Azure | GCP) is an optional set of features that automatically compact small files during individual writes to a Delta table. Paying a small cost during writes offers significant benefits for tables that are queried actively. Although auto optimize can be beneficial in many situations, you can see decreased...