R Package management with renv
You may use the renv R package to create a personal, project-specific environment for R packages. Documentation on renv can be found on the RStudio site: https://rstudio.github.io/renv/

Say your R code is in the directory /scratch/midway3/$USER/projects/project1:

```shell
cd /scratch/midway3/$USER/projects/project1
module purge
module load r/gcc/4.3.2
R
```

Automatic deletion of your files

This page describes the installation of packages on /scratch. Remember, though, that files stored in the HPC scratch file system are subject to the HPC scratch old-file purging policy.

By default, renv caches package installation files in your home directory (most likely in ~/.local/share/renv or ~/.cache/R/renv/, or a similar location). To keep your home directory from filling up quickly, we advise setting the path to an alternative cache directory.

Create the directory:

```shell
mkdir -p /scratch/midway3/$USER/.cache/R/renv
```

Put the following into the .Renviron file within the R project directory (/scratch/midway3/$USER/projects/project1 in this example), replacing <user_netid> with your own user name:

```
RENV_PATHS_ROOT=/scratch/midway3/<user_netid>/.cache/R/renv
```

The renv package is already installed for module r/4.2.1. You need to install it yourself if you use another R module version.

```r
## Do this if renv is not available (it is already installed for r/4.2.1)
# install.packages("renv")
## By default this installs the renv package into a subdirectory of your home directory

## Initialize renv in the project's directory
renv::init(".")
```

Restart R for renv to take effect. Once you start R, your renv environment will be loaded automatically.

You can check your library paths with the .libPaths() command:

```r
> .libPaths()
[1] "/scratch/midway3/$USER/projects/project1/renv/library/R-4.2/x86_64-pc-linux-gnu"
```

You can check where the cache is located with the following command. The output shown here is the default location in the home directory; with RENV_PATHS_ROOT set as described above, the reported path should start with the directory you configured instead.

```r
renv::paths$cache()
# [1] "/home/$USER/.cache/R/renv/cache/v5/R-4.2/x86_64-pc-linux-gnu"
```

Add/remove, etc. packages

Install a package, such as reshape2. Below we can see that it is not yet installed, and then we install it:
```r
library(reshape2)
# Error in library(reshape2) : there is no package called ‘reshape2’
install.packages("reshape2")
```

Note: you must be in the project1 directory for renv to load your project and the personal environment that you have created. If you want to copy your environment to a new location, use the renv.lock file (or a bundle file), as described in the section on storing and sharing your project below.

test.R file:

```r
print("hello")
renv::restore()
library(reshape2)
names(airquality) <- tolower(names(airquality))
head(airquality)
aql <- melt(airquality)
print("hello again")
```

For testing, run it as:

```shell
srun --pty /bin/bash
Rscript test.R
```

Note: your .Rprofile file will include the line source("renv/activate.R").

The script will output the following:

```
[1] "hello"
The library is already synchronized with the lockfile.
  ozone solar.r wind temp month day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6
No id variables; using all as measure variables
[1] "hello again"
```

Keep only the packages that you use in this particular project (not all the packages available on the system):

```r
# launch R
renv::clean()  # remove packages not recorded in the lockfile from the target library
```

The general workflow when working with renv is:

1. Call [renv::init()](https://rstudio.github.io/renv/reference/init.html) to initialize a new project-local environment with a private R library.
2. Work in the project as normal, installing and removing R packages as they are needed in the project.
3. Call [renv::snapshot()](https://rstudio.github.io/renv/reference/snapshot.html) to save the state of the project library to the lockfile (called renv.lock). By default, renv::snapshot() only captures packages that are used in the R scripts within the R project. For more options, see the renv::snapshot() documentation.
4. Continue working on your project, installing and updating R packages as needed.
5. If needed, call [renv::restore()](https://rstudio.github.io/renv/reference/restore.html) to revert to the previous state as encoded in the lockfile if your attempts to update packages introduced
some new problems.

The [renv::init()](https://rstudio.github.io/renv/reference/init.html) function attempts to ensure that the newly created project library includes all R packages currently used by the project. It does this by crawling the R files within the project for dependencies with the [renv::dependencies()](https://rstudio.github.io/renv/reference/dependencies.html) function. The discovered packages are then installed into the project library with the [renv::hydrate()](https://rstudio.github.io/renv/reference/hydrate.html) function, which also attempts to save time by copying packages from your user library (rather than reinstalling them from CRAN) where appropriate.

Calling [renv::init()](https://rstudio.github.io/renv/reference/init.html) also writes out the infrastructure needed to automatically load and use the private library in new R sessions launched from the project root directory. This is accomplished by creating (or amending) a project-local .Rprofile with the code needed to load the project when the R session starts.

If you would like to initialize a project without dependency discovery and installation (that is, you prefer to install the packages your project requires yourself), you can use renv::init(bare = TRUE) to initialize a project with an empty project library.

When you launch a job with sbatch, R checks whether there is a renv directory in the current directory, and if renv is active it picks up the packages that were installed there using renv. Before you launch an sbatch job, make sure your project's renv environment is ready, as outlined in the previous section.

Store and share your R project’s R version and R package versions

If you already have a renv.lock file or a bundle file, skip step 1.

1. In the original location (your own laptop, for example), go to the project directory and execute the following (make sure that the full path to the project directory and the names of your script files contain no spaces!):
```r
# install.packages("renv")  ## if needed
renv::init()
renv::snapshot()
```

2. Take the renv.lock file and copy it to the new location for the project.

3. At the new location, restore the environment: go to the project directory and execute the following (make sure the version of R is the same):

```shell
## Reproduce environment
module load r/4.2.1
R
```

```r
renv::restore()
renv::init()
```

renv will install or compile what is needed on any system (Linux, Windows, etc.), so you can share your code with other researchers no matter what system they use. However, you should make sure that the same version of R is used on all systems.

What to save/publish/commit with git

To make your work reproducible by you and by others, save and/or commit your code in git, including renv.lock (which lists all packages and versions that you use, including the version of R).

The renv package has replaced the now deprecated Packrat package. The renv::migrate() function makes it possible to migrate projects from Packrat to renv. See the ?migrate documentation for more details. In essence, calling renv::migrate("<project path>") will be enough to migrate the Packrat library and lockfile so that they can then be used by renv.

https://rstudio.github.io/renv/articles/renv.html
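As a recap of the cache setup described earlier on this page, the two manual steps (creating the cache directory and adding RENV_PATHS_ROOT to the project's .Renviron) could be wrapped in a small shell helper. This is only a sketch: the function name setup_renv_cache is hypothetical and not part of renv, and the paths in the example call simply follow this page's example layout.

```shell
#!/bin/sh
# Sketch: point renv's root/cache at a directory outside $HOME for one project.
# setup_renv_cache is a hypothetical helper name, not an renv command.
setup_renv_cache() {
    project_dir=$1   # R project directory
    cache_dir=$2     # directory that should hold renv's cache

    # Create both directories if they do not exist yet.
    mkdir -p "$project_dir" "$cache_dir" || return 1

    renviron="$project_dir/.Renviron"
    # Drop any earlier RENV_PATHS_ROOT line so the setting is not duplicated,
    # while keeping every other line of an existing .Renviron.
    if [ -f "$renviron" ]; then
        grep -v '^RENV_PATHS_ROOT=' "$renviron" > "$renviron.tmp" || true
        mv "$renviron.tmp" "$renviron"
    fi
    printf 'RENV_PATHS_ROOT=%s\n' "$cache_dir" >> "$renviron"
}

# Example call (paths follow this page's example; adjust to your own layout):
# setup_renv_cache "/scratch/midway3/$USER/projects/project1" \
#                  "/scratch/midway3/$USER/.cache/R/renv"
```

Running the helper a second time leaves a single RENV_PATHS_ROOT line in .Renviron, so it should be safe to call from a project setup script. R reads .Renviron at startup, so restart R afterwards for the new setting to apply.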