Object lock error when writing Delta Lake tables to S3
Problem: You are trying to perform a Delta write operation to an S3 bucket and get an error message:
com.amazonaws.services.s3.model.AmazonS3Exception: Content-MD5 HTTP header is required for Put Part requests with Object Lock parameters
Cause: Delta Lake does not support S3 buckets with object lock enabled.
Solution: You should use an S3 bucket that do...
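Before pointing a Delta write at a bucket, it can help to confirm whether object lock is enabled on it. The following is a minimal sketch, not part of the original article. It assumes a boto3 S3 client is passed in, and the helper name object_lock_enabled is hypothetical. It relies on the boto3 call get_object_lock_configuration, which raises a client error with code ObjectLockConfigurationNotFoundError when the bucket has no object lock configuration.

```python
def object_lock_enabled(s3_client, bucket):
    """Return True if the bucket reports S3 Object Lock as enabled.

    `s3_client` is assumed to be a boto3 S3 client (or any object exposing
    the same get_object_lock_configuration method).
    """
    try:
        resp = s3_client.get_object_lock_configuration(Bucket=bucket)
    except Exception as exc:
        # botocore's ClientError exposes the parsed error under `.response`.
        error = getattr(exc, "response", None) or {}
        code = error.get("Error", {}).get("Code")
        if code == "ObjectLockConfigurationNotFoundError":
            # No object lock configuration exists on this bucket.
            return False
        # Anything else (access denied, missing bucket, ...) is re-raised.
        raise
    config = resp.get("ObjectLockConfiguration", {})
    return config.get("ObjectLockEnabled") == "Enabled"
```

With boto3 installed and credentials configured, this could be called as object_lock_enabled(boto3.client("s3"), "my-bucket"), where "my-bucket" is a placeholder bucket name. A result of True suggests the bucket is not suitable as a Delta write target.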
ProtoSerializer stack overflow error in DBConnect
Problem: You are using DBConnect (AWS | Azure | GCP) to run a PySpark transformation on a DataFrame with more than 100 columns when you get a stack overflow error:
py4j.protocol.Py4JJavaError: An error occurred while calling o945.count.
: java.lang.StackOverflowError
    at java.lang.Class.getEnclosingMethodInfo(Class.java:1072)
    at java.lang.Clas...
Error when running MSCK REPAIR TABLE in parallel
Problem: You are trying to run MSCK REPAIR TABLE <table-name> commands for the same table in parallel and are getting java.net.SocketTimeoutException: Read timed out or out of memory error messages.
Cause: When you try to add a large number of new partitions to a table with MSCK REPAIR in parallel, the Hive metastore becomes a limiting factor, a...