Problem
While attempting to create an external table in Databricks using Terraform, you encounter an error creating the table.
databricks_sql_table.table_classifications: Creating... Error: cannot create sql table: statement failed to execute: FAILED
This issue can occur when you use the databricks_sql_table resource and specify a storage location that points to an S3 bucket as the source for the table.
Cause
The error cannot create sql table: statement failed to execute: FAILED is a generic error message. Enabling DEBUG level logging in Terraform (export TF_LOG=DEBUG) provides more detail and can help you identify the specific cause.
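For example, to capture a full debug trace to a file for review (the log file name here is arbitrary), you can set the standard Terraform logging variables before running your plan or apply:

export TF_LOG=DEBUG
export TF_LOG_PATH=terraform-debug.log   # optional: write the trace to a file instead of stderr
terraform apply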
One common cause is the key and source parameters of the aws_s3_object resource both having the same file name. This results in the file not being correctly recognized by the databricks_sql_table Terraform resource.
resource "aws_s3_object" "bucket" {
bucket = "<my-bucket-name>"
key = "<path-to-file-in-S3-my-file>"
source = "<path-to-local-file-my-file>"
}
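As a concrete sketch of the problematic pattern (the bucket and file names here are hypothetical), both key and source end in the same file name:

resource "aws_s3_object" "bucket" {
  bucket = "my-example-bucket"
  key    = "tables/data.csv"   # S3 object name matches the local file name
  source = "./data.csv"        # local file being uploaded
}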
Solution
Specify a different name in the key parameter so that the file is uploaded to the S3 external location with a unique name, then reference it in the databricks_sql_table resource.
Example definition
- Upload the local file to your S3 bucket.
Replace the following with appropriate values:
  - <my-bucket> - The name of the target S3 bucket.
  - <path-to-file-in-S3-my-file> - The full path (including file name) to the location where the file will be stored in the S3 bucket. The file name must differ from the local file name.
  - <path-to-local-file-my-file> - The full path (including file name) to the location where the local file is stored.
resource "aws_s3_object" "bucket" {
bucket = "<my-bucket>"
key = "<path-to-file-in-S3-my-file>"
source = "<path-to-local-file-my-file>"
}
- Retrieve information about the Databricks SQL warehouse.
Replace the following with appropriate values:
  - <warehouse-name> - The name of your Databricks SQL warehouse.
data "databricks_sql_warehouse" "warehouse" {
name = "<warehouse-name>"
}
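If you want to confirm that the lookup resolves before creating the table, you can expose the warehouse ID through an output (a minimal sketch; the output name is arbitrary):

output "warehouse_id" {
  value = data.databricks_sql_warehouse.warehouse.id
}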
- Create an external Databricks SQL table based on the file you uploaded to the S3 bucket.
Replace the following with appropriate values:
  - <my-table> - The name of the SQL table.
  - <my-catalog> - The catalog in which the table will be created.
  - <my-schema> - The schema used for the table.
  - <name-of-column> - The name of the column.
  - <data-type-of-column> - The data type of the column (int, string, etc.).
  - <sql-table> - The name that you want assigned to the resource.
resource "databricks_sql_table" "<sql-table>" {
name = "<my-table>"
catalog_name = "<my-catalog>"
schema_name = "<my-schema>"
table_type = "EXTERNAL"
data_source_format = "CSV"
storage_location = "s3://${aws_s3_object.bucket.bucket}/${aws_s3_object.bucket.key}" # The S3 path where the table data is stored. Values are fetched from the resource "aws_s3_object" section.
warehouse_id = "${data.databricks_sql_warehouse.warehouse.id}" # The ID of the SQL warehouse to use for the table is fetched per the warehouse specified in data "databricks_sql_warehouse".
- Add or remove sections as needed to define the columns.
column {
name = "<name-of-column>"
type = "<data-type-of-column>"
}
column {
name = "<name-of-column>"
type = "<data-type-of-column>"
}
column {
name = "<name-of-column>"
type = "<data-type-of-column>"
}
column {
name = "<name-of-column>"
type = "<data-type-of-column>"
}
comment = "This table is managed by Terraform."
}
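Putting the pieces together, a complete configuration might look like the following sketch. All values (bucket, paths, warehouse, catalog, schema, table, and column names) are hypothetical examples:

resource "aws_s3_object" "bucket" {
  bucket = "my-example-bucket"
  key    = "tables/classifications/classifications-upload.csv"  # unique S3 file name
  source = "./classifications.csv"                              # local file
}

data "databricks_sql_warehouse" "warehouse" {
  name = "my-sql-warehouse"
}

resource "databricks_sql_table" "table_classifications" {
  name               = "classifications"
  catalog_name       = "main"
  schema_name        = "default"
  table_type         = "EXTERNAL"
  data_source_format = "CSV"
  storage_location   = "s3://${aws_s3_object.bucket.bucket}/${aws_s3_object.bucket.key}"
  warehouse_id       = data.databricks_sql_warehouse.warehouse.id

  column {
    name = "id"
    type = "int"
  }
  column {
    name = "label"
    type = "string"
  }

  comment = "This table is managed by Terraform."
}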
Info
You should also ensure that the user or service principal running the Terraform code has the CREATE EXTERNAL TABLE permission on the S3 external location in Unity Catalog.
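If that permission is missing, it can also be granted through Terraform with the databricks_grants resource. A minimal sketch, assuming an existing databricks_external_location resource and a hypothetical principal name:

resource "databricks_grants" "external_location_grants" {
  external_location = databricks_external_location.this.id
  grant {
    principal  = "terraform-sp@example.com"
    privileges = ["CREATE_EXTERNAL_TABLE", "READ_FILES"]
  }
}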