Problem
While querying a shared table with the Delta Sharing client in a notebook, you call the delta_sharing.load_as_spark(table_url) method and receive a 404 error.
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://<region>.<cloud-vendor>/delta_sharing/retrieve_config.html/shares
Cause
The endpoint URL used by the Delta Sharing client is incorrectly configured. It uses “HTTPS” in the path to the external location, which is not supported in the load_as_spark method.
Solution
Check your endpoint URL and use the correct external location URI.
- Confirm that the endpoint URL in the share file matches the expected format. The call to the delta_sharing.load_as_spark(table_url) method from the delta-sharing Python library expects a valid URI that the Spark driver can access remotely, in one of the following formats.
  - AWS S3 (s3a://)
  - Azure Blob Storage (abfss://)
  - GCP GCS (gs://your-bucket/path/to/data)
- Replace the placeholder URI assigned to share_file_path in the following code snippet with your URI, using the format for your cloud platform from the previous step.
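import delta_sharing
# Location of the share file; replace the placeholders with your URI.
share_file_path = '<cloud storage>://<storage-region>/delta-sharing/share/open-datasets.share'
# Table URL format: <share file path>#<share>.<schema>.<table>
table_url = f"{share_file_path}#delta_sharing.default.file_name"
shared_df = delta_sharing.load_as_spark(table_url)
display(shared_df)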
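To catch a wrongly configured path before calling load_as_spark, you could check the URI scheme first. The following is a minimal sketch, not part of the delta-sharing library: the helper names check_share_file_path and build_table_url are hypothetical, and the set of accepted schemes is an assumption based only on the formats listed in this article.

```python
from urllib.parse import urlparse

# Assumption: only the schemes listed in this article are accepted.
SUPPORTED_SCHEMES = {"s3a", "abfss", "gs"}


def check_share_file_path(share_file_path: str) -> str:
    """Return the path unchanged if its scheme is in SUPPORTED_SCHEMES.

    Hypothetical helper; raises ValueError for any other scheme,
    including https and paths with no scheme at all.
    """
    scheme = urlparse(share_file_path).scheme.lower()
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(
            f"Unsupported scheme '{scheme}' in share file path: {share_file_path}. "
            f"Expected one of: {sorted(SUPPORTED_SCHEMES)}"
        )
    return share_file_path


def build_table_url(share_file_path: str, share: str, schema: str, table: str) -> str:
    """Build '<share file path>#<share>.<schema>.<table>' after validating the path."""
    return f"{check_share_file_path(share_file_path)}#{share}.{schema}.{table}"
```

With a helper like this, a path that starts with https:// fails immediately with a descriptive ValueError instead of surfacing later as a 404 from the client. The string returned by build_table_url can then be passed to delta_sharing.load_as_spark as in the snippet above.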
For more information, refer to the Delta Sharing Receiver Quickstart notebook.