Problem
You are attempting to install the lme4 R library on a cluster running Databricks Runtime 15.4 LTS ML or below, and you see an error message indicating that the required version of the Matrix package is not available.
Error: package 'Matrix' 1.5-1 was found, but >= 1.6.2 is required by 'lme4'.
This prevents the successful installation and usage of the lme4 package, which is essential for certain statistical modeling tasks.
Cause
There is a dependency conflict between the version of the Matrix package included in the Databricks Runtime you are using and the requirements of the lme4 package. Specifically, lme4 requires Matrix version 1.6.2 or above, but the runtime provides version 1.5-1.
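To confirm which Matrix version a cluster provides before you install lme4, you can compare the installed version against the required one. The following is a minimal sketch, not an official Databricks tool. It is meant to be run from a notebook shell cell or a terminal on the cluster. The version_ge helper is a hypothetical name introduced here, and the sketch assumes GNU sort (for the -V option) and that Rscript is on the PATH.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when version $1 is at least version $2.
# R treats "1.6-2" and "1.6.2" as the same version, so dashes are
# converted to dots before comparing with GNU sort -V.
version_ge() {
  have="$(printf '%s' "$1" | tr '-' '.')"
  need="$(printf '%s' "$2" | tr '-' '.')"
  lowest="$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n 1)"
  [ "$lowest" = "$need" ]
}

# Minimum Matrix version taken from the error message above
required="1.6.2"

# Ask R for the installed Matrix version; skipped when Rscript is absent
if command -v Rscript >/dev/null 2>&1; then
  installed="$(Rscript -e 'cat(as.character(packageVersion("Matrix")))')"
  if version_ge "$installed" "$required"; then
    echo "Matrix $installed satisfies >= $required"
  else
    echo "Matrix $installed is older than $required"
  fi
fi
```

On a cluster that shows the error above, this check should report the installed Matrix version as older than the required one.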
Solution
You can upgrade to Databricks Runtime 16.0 and above, which includes an updated version of Matrix.
If you do not want to use a newer Databricks Runtime, you can pin the required library versions in the cluster configuration or you can create an init script that installs the libraries when your cluster starts.
Pin library versions
You must upgrade the version of Matrix to 1.6.2 or above before you can successfully install lme4. You can install both libraries as cluster libraries (AWS | Azure | GCP) from the workspace UI.
Select CRAN as the Library Source and enter the specific versions:
Matrix==1.6-2
lme4==1.1-28
Install via init script
- Create an init script with the following content:
#!/bin/bash
# Update the package list
sudo apt-get update
# Install the system libraries needed to build the R packages
sudo apt-get install -y libcurl4-openssl-dev libxml2-dev libssl-dev
# Install the remotes package, which provides install_version()
sudo R -e "install.packages('remotes', repos = 'http://cran.us.r-project.org')"
# Install the Matrix package version 1.6-2 from the CRAN archive
sudo R -e "remotes::install_version('Matrix', version = '1.6-2', repos = 'http://cran.us.r-project.org')"
# Install the lme4 package version 1.1-28
sudo R -e "remotes::install_version('lme4', version = '1.1-28', repos = 'http://cran.us.r-project.org')"
# Optional: uncomment the next line to test loading the lme4 package
# sudo R -e "library('lme4')"
- Save the init script as a workspace file, to a Unity Catalog volume, or cloud storage.
- Configure the init script as a cluster-scoped init script (AWS | Azure | GCP) or a global init script (AWS | Azure | GCP) depending on your use case.
- Restart your cluster to apply the changes. The init script runs during cluster startup and installs the required versions of the Matrix and lme4 packages, along with their system dependencies.
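After the cluster restarts, you may want to confirm that lme4 loads and see which package versions were installed. The following is a minimal sketch under stated assumptions. The check_lme4 function is a hypothetical name introduced here, and the sketch assumes Rscript is on the PATH of the shell where you run it (for example, a notebook shell cell).

```shell
#!/bin/bash
# Hypothetical check: loads lme4 and prints the installed versions of
# Matrix and lme4. Rscript exits with a non-zero status if lme4 fails
# to load, so the function's exit status reflects the result.
check_lme4() {
  Rscript -e 'suppressMessages(library(lme4)); cat("Matrix", as.character(packageVersion("Matrix")), "\n"); cat("lme4", as.character(packageVersion("lme4")), "\n")'
}

# Run the check only when Rscript is available
if command -v Rscript >/dev/null 2>&1; then
  check_lme4 || echo "lme4 did not load; review the init script logs"
fi
```

If the check reports that lme4 did not load, the init script logs for the cluster are the first place to look for a failed package installation.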