Updated September 12th, 2024 by kuldeep.mishra

Apache Spark reading .gzip files from S3 instead of decompressed data

Problem: When attempting to read .gzip files from S3 using Apache Spark in the Data Engineering environment, you may find the compressed values being read instead of the decompressed data. The symptom is garbled output such as ��% eb���[�.K����Qh�q�h. Further, you may find methods such as .option("compression", "gzip")...

Updated September 23rd, 2024 by kuldeep.mishra

"AnalysisException: Incompatible Format Detected" error when writing to OpenSearch

Problem: You are attempting to write DataFrames into OpenSearch indices using the org.opensearch.client:opensearch-spark-30_2.12 library when you encounter the error AnalysisException: Incompatible format detected. Cause: This error is caused by the presence of a _delta_log folder in the root directory of the Databricks File System (DBFS). When Ap...
