Problem
While using serverless compute, you use %pip install to install a package, then restart your Python environment using %restart_python. When you then try to import a module from a local directory, src, you encounter a ModuleNotFoundError.
You may notice the same import statement works before installing the package or when using a non-serverless Databricks cluster such as Databricks Runtime 16.4 LTS.
Cause
When you execute %pip install and %restart_python on serverless compute, the environment is completely reinitialized. As a result, any temporary files or Python path customizations made before the restart are lost.
Although sys.path appears to be the same before and after the restart, the src/ directory itself is no longer present on disk: files staged in the ephemeral serverless environment do not survive the reinitialization.
This behavior differs from standard clusters, where %pip install and %restart_python may retain the notebook-scoped file system (such as /Workspace/ or /Workspace/Repos/) in memory or through mounted paths, which keeps the src/ package accessible.
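The mismatch described above can be confirmed directly: appending an entry to sys.path never touches the disk, so a stale entry looks valid until import time. A minimal sketch, using a hypothetical path:

```python
import os
import sys

# Hypothetical path; substitute your own workspace location.
stale_path = "/Workspace/Repos/user@example.com/project/src"

# Appending always succeeds: sys.path is just a list of strings,
# consulted only when an import is attempted.
sys.path.append(stale_path)

path_listed = stale_path in sys.path      # the entry is present
path_on_disk = os.path.isdir(stale_path)  # but nothing may exist there
print(path_listed, path_on_disk)
```

If path_on_disk is False after a restart, any import relying on that entry raises ModuleNotFoundError even though sys.path looks unchanged.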
Solution
To resolve this issue on serverless compute:
- After executing %restart_python, re-add the full path to the src/ directory to sys.path. You can adapt and use the following example code. Select one of the sys.path.append() options after running import sys.
import sys
sys.path.append("/Workspace/Repos/<your-email>/<your-project>/src")
# or
sys.path.append("/Workspace/<your-folder>/src")
Adjust the path according to your workspace or repository structure.
- Verify the path you append to sys.path is correct by checking the current working directory. Use os.getcwd(). List its contents with os.listdir() if needed.
- Try installing the package in a separate cell without %restart_python. Some installations work without needing a restart on serverless compute. You can use the following example code.
# Install <package> without restarting Python
%pip install <package>
# Your import statement should work as expected
from src.column_names import ColumnNames as CN
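The verification steps above can be combined into a single diagnostic cell. The layout assumed here (a src/ folder under the current working directory) is illustrative; adjust it to your project:

```python
import os
import sys

# Assumption: the notebook runs from the project root, which contains src/.
cwd = os.getcwd()
print("Working directory:", cwd)
print("Contents:", os.listdir(cwd))

# For `from src.column_names import ...` to resolve, the directory that
# contains src/ (not src/ itself) must be on sys.path.
src_dir = os.path.join(cwd, "src")
if os.path.isdir(src_dir) and cwd not in sys.path:
    sys.path.append(cwd)
```

Running this immediately after %restart_python shows at a glance whether src/ is actually present where your sys.path entry expects it.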
Preventive Measures
- Avoid using %restart_python on serverless compute unless functionality breaks or cannot work correctly without a full interpreter reboot (for example, when you need the interpreter to pick up new package versions and avoid stale dependencies).
- Consider structuring your project as a Python package and installing it in editable mode using %pip install -e /Workspace/<your-folder>/ (if applicable).
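If you adopt the editable-install route, the folder you install from needs packaging metadata. Assuming a project folder that contains a src/ package (with an __init__.py) alongside a pyproject.toml, a minimal, illustrative configuration could look like the following; the project name and layout are assumptions, not part of this article:

```toml
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "my-project"        # hypothetical name; use your own
version = "0.1.0"

[tool.setuptools]
packages = ["src"]         # declare src/ itself as the importable package
```

With this in place, %pip install -e on that folder makes `from src.column_names import ColumnNames` resolve from the installed package rather than from manual sys.path edits.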
For further reading, refer to the Install libraries (AWS | Azure | GCP) and Databricks notebooks (AWS | Azure | GCP) documentation.