Problem
You are setting up a machine learning model from Hugging Face, but the download fails with a warning that there is not enough free disk space.
/databricks/python/lib/python3.11/site-packages/huggingface_hub/file_download.py:1006: UserWarning: Not enough free disk space to download the file. The expected file size is: XXXX MB. The target location /root/.cache/huggingface/hub only has XXX MB free disk space.
Cause
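The root partition on the machine has a fixed size and does not autoscale. As the warning shows, Hugging Face downloads go to /root/.cache/huggingface/hub by default, which is on the root partition. Anything you download to a home directory or temporary folder on the root partition can quickly exhaust the available space.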
Solution
Set the HF_HUB_CACHE environment variable to a local path with available space, for example os.environ['HF_HUB_CACHE'] = "<path-to-local-storage>". Set it early, before importing transformers or any other Hugging Face library.
Autoscaling local storage is under /local_disk0. That is the recommended choice for local storage.
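As a minimal sketch of this step, the helper below creates a cache directory and points HF_HUB_CACHE at it. The function name use_hf_cache_dir and the subdirectory name hf_cache are assumptions made for illustration; they are not part of any Hugging Face or Databricks API.

```python
import os


def use_hf_cache_dir(cache_dir: str) -> str:
    """Create cache_dir if it does not exist and point HF_HUB_CACHE at it.

    Hypothetical helper for illustration only.
    """
    os.makedirs(cache_dir, exist_ok=True)
    os.environ["HF_HUB_CACHE"] = cache_dir
    return cache_dir


# On a cluster, call the helper first, then import the Hugging Face libraries.
# "/local_disk0/hf_cache" is an assumed directory name under the local disk:
#   use_hf_cache_dir("/local_disk0/hf_cache")
#   import transformers
```

The call has to come before the import because the environment variable must be set before the Hugging Face libraries are loaded.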
If you want persistent storage, you can also use /dbfs or /Volumes. This is remote storage, so it may be slower to write and read than local storage.
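Before starting a large download, it can help to confirm that the chosen location has enough room. The sketch below uses the standard library's shutil.disk_usage to pick the first candidate directory with sufficient free space. The helper name pick_cache_root and the candidate list are assumptions for illustration, and the check is most useful for local paths.

```python
import os
import shutil


def pick_cache_root(candidates, required_bytes):
    """Return the first existing directory with at least required_bytes free.

    Returns None if no candidate qualifies. Hypothetical helper for illustration.
    """
    for path in candidates:
        if os.path.isdir(path) and shutil.disk_usage(path).free >= required_bytes:
            return path
    return None


# Example use on a cluster (not executed here): prefer the local disk and
# require roughly 20 GB free, an arbitrary figure chosen for illustration.
#   root = pick_cache_root(["/local_disk0", "/dbfs"], 20 * 1024**3)
#   if root is not None:
#       os.environ["HF_HUB_CACHE"] = os.path.join(root, "hf_cache")
```

Ordering the candidates with local storage first keeps the faster option as the default and falls back to persistent storage only when needed.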