Updated September 12th, 2024 by kuldeep.mishra
Apache Spark reading .gzip files from S3 instead of decompressed data
Problem: When attempting to read .gzip files from S3 using Apache Spark in the Data Engineering environment, you may find that the compressed values are read instead of the decompressed data. The symptom is raw compressed bytes in the output, such as ��% eb���[�.K����Qh�q�h. Further, you may find methods such as .option("compression", "gzip")...
0 min reading time
Updated September 23rd, 2024 by kuldeep.mishra
"AnalysisException: Incompatible Format Detected" error when writing to OpenSearch
Problem: You are attempting to write DataFrames into OpenSearch indices using the org.opensearch.client:opensearch-spark-30_2.12 library when you encounter the error AnalysisException: Incompatible format detected. Cause: This error is caused by the presence of a _delta_log folder in the root directory of the Databricks File System (DBFS). When Ap...
1 min reading time