Merge pull request #426 from hpcleuven/feature/RStudio_update
OOD RStudio & R package management documentation update
WouterVanAssche authored Sep 16, 2024
2 parents 2ed9bfa + 8bc8d70 commit 0bb0277
Showing 3 changed files with 112 additions and 43 deletions.
30 changes: 20 additions & 10 deletions source/leuven/services/openondemand.rst
@@ -419,14 +419,20 @@ For more general information, please refer to the `official JupyterLab documenta
RStudio Server
--------------

This interactive app allows you to run an RStudio session as a compute job.
You will be running RStudio with R version 4.2.1.
For more information on how to use RStudio, check out the `RStudio official documentation`_.
This interactive app allows you to run an RStudio session on the cluster.
In the 'Toolchain year and R version' drop-down menu, you can choose the version
of the R module that will be loaded for your session (such as `R/4.2.2-foss-2022b`).
Additionally, the `R-bundle-CRAN` and `R-bundle-Bioconductor` modules can be loaded
on top of the base R module to provide easy access to hundreds of preinstalled packages.

The use is very similar to regular RStudio.
It is recommended to install packages in a folder on your ``$VSC_DATA`` instead of the default location though,
to avoid clogging your ``$VSC_HOME``.
You can do this by using the ``lib`` argument for both the ``install.packages`` and the ``library`` function.
It is also possible to use locally installed R packages with RStudio, see :ref:`R package management<r_package_management_standard_lib>`.
RStudio also allows you to create RStudio projects to manage your
R environments. When doing so, we recommend selecting the
`renv <https://rstudio.github.io/renv/articles/renv.html>`_ option
to ensure a completely independent R environment. Without `renv`,
loading an RStudio project may lead to incomplete R library paths.

For more information on how to use RStudio, check out the `official documentation <https://docs.posit.co/ide/user/>`__.

**Remarks:**

@@ -436,16 +442,20 @@ You can do this by using the ``lib`` argument for both the ``install.packages``
You will also notice that you cannot use the same way of navigating after this.
Another solution is to click the three dots on the right (...) and enter your path.
- The 'Tools-Install packages' interface does not allow you to select any other path than the default in your ``$VSC_HOME``.
It is recommended to use the ``install.packages`` function instead.
It is recommended to use the ``install.packages()`` function instead.
- RStudioServer will by default store the RStudio cache in ``$VSC_HOME/.local/share/rstudio``.
This cache can get very large, and cause you to exceed the quota of your home directory.
To avoid this, you can redirect this cache to your data directory by setting ``$XDG_DATA_HOME``
variables in your ``~/.bashrc``.
To avoid this, you can redirect this cache to your data directory by setting the ``$XDG_DATA_HOME``
variable in your ``~/.bashrc``:

.. code-block:: bash

   echo "export XDG_DATA_HOME=$VSC_DATA/.local/share" >> ~/.bashrc

- Additionally, it is advised to change the default behaviour of RStudio so that it does not
restore .RData into the workspace on startup and never saves the workspace to .RData on exit.
You can do this via the RStudio interface:
Tools > Global Options > General > Workspace
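
The ``echo ... >> ~/.bashrc`` command in the remarks above appends a new line every time it is run.
As a minimal sketch (the helper name ``add_line_once`` is hypothetical and not part of any VSC tooling),
the following bash function appends a line only when the file does not contain it yet:

.. code-block:: bash

   # Hypothetical helper: append a line to a file only if it is not already there.
   add_line_once() {
       local line="$1" file="$2"
       touch "$file"    # make sure the file exists before searching it
       # -q: quiet, -x: match whole lines only, -F: treat the pattern as a fixed string
       grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
   }

   # Example usage (commented out so that sourcing this sketch changes nothing):
   # add_line_once "export XDG_DATA_HOME=$VSC_DATA/.local/share" ~/.bashrc

Running the helper repeatedly with the same arguments leaves a single copy of the line in the file.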

Tensorboard
-----------
8 changes: 4 additions & 4 deletions source/software/r_devtools.rst
@@ -28,19 +28,19 @@ approach to use and install devtools.
Installing in a local R library
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you manage your R packages in a :ref:`local R library<r_package_management_standard_lib>` under ``$VSC_DATA/R``
If you manage your R packages in a :ref:`local R library<r_package_management_standard_lib>` under ``$VSC_DATA/Rlibs``
while using a centrally installed R module, you can use the devtools package included in the module.
You will need to execute the following commands in the R console:

.. code-block:: r

   > # First check that the R library path points to your local R library:
   > .libPaths()
   > # Set the R library path if this is not the case.
   > .libPaths("/data/leuven/XXX/vscXXXXX/R/")
   > # Set the R library path if this is not the case. e.g.
   > .libPaths("/data/leuven/XXX/vscXXXXX/Rlibs/rocky8/icelake/R-4.2.2")
   > # Load devtools and e.g. install your package from github:
   > library(devtools)
   > devtools::install_github("Developer/Package")
   > install_github("Developer/Package")

.. note::

Expand Down
117 changes: 88 additions & 29 deletions source/software/r_package_management.rst
@@ -6,51 +6,110 @@ R package management
Introduction
------------

Most of the useful R packages can be installed separately. Some of those are
part of the centrally installed R modules. However, given the astounding number of
packages, it is not sustainable to install each and every one of them system-wide.
Fortunately, it is very easy for users to install R packages themselves.
If you do encounter problems when doing so, do not hesitate to contact support.
There exist thousands of R packages, available from online repositories like CRAN,
Bioconductor or GitHub. Depending on the R version, the more commonly used packages like `ggplot2`, `tidyverse` or `readr`
are either already included in the centrally installed R module or can be accessed by
loading the `R-bundle-CRAN` and `R-bundle-Bioconductor` modules, e.g.:

.. code-block:: bash

   $ module load R-bundle-Bioconductor/3.16-foss-2022b-R-4.2.2

It is possible, however, that these modules do not contain all R packages you need
or that the package versions do not meet your requirements. In this case you will
need to locally install those packages, as will be described below. Do not hesitate
to contact your local support team when encountering issues during these local installations.

.. _r_package_management_standard_lib:

Standard R package installation
-------------------------------

Setting up your own package repository for R is straightforward.
Firstly, it is important to realize that R by default uses the `$VSC_HOME/R` path
to install new packages. Since `$VSC_HOME` has limited quota, it is not
the recommended location to install software. Instead, we recommend using `$VSC_DATA`.

#. Load the appropriate R module, i.e., the one you want the R package
to be available for::
Secondly, it should be kept in mind that R packages often include extensions written in
compiled languages (e.g. C++ or Fortran) and that the centrally installed R modules are
configured to compile these extensions with optimizations for the CPU architecture at hand.
This means that such R packages generally cannot be used on partitions other than the
one they were built on.

$ module load R/3.2.1-foss-2014a-x11-tcl
Thirdly, R packages may also only work with certain versions of R and not with other versions.

#. Start R and install the package (preferably in your $VSC_DATA directory)::
With these three considerations in mind, we recommend using a directory structure that
provides a unique path for each OS version, hardware architecture and R version.
The example below creates such a structure for a Rocky8 OS, Icelake CPU and R version 4.2.2:

> install.packages("DEoptim", lib="/data/leuven/304/vsc30468/R/")
.. code-block:: bash
#. Alternatively you can download the desired package::
# From within an interactive session on an icelake compute node:
$ module load R/4.2.2-foss-2022b
$ mkdir -p ${VSC_DATA}/Rlibs/${VSC_OS_LOCAL}/${VSC_ARCH_LOCAL}/R-${EBVERSIONR}
$ wget cran.r-project.org/src/contrib/Archive/DEoptim/DEoptim_2.0-0.tar.gz
The next step is to ensure such install locations are used by default in the R package installation process.
This can be done by setting the `R_LIBS_USER` variable in the `~/.Renviron` file as follows:

.. code-block:: bash

   $ echo 'R_LIBS_USER=${VSC_DATA}/Rlibs/${VSC_OS_LOCAL}/${VSC_ARCH_LOCAL}/R-${EBVERSIONR}' >> ~/.Renviron

The `${VSC_OS_LOCAL}` and `${VSC_ARCH_LOCAL}` environment variables are predefined
and match the OS version (e.g. `rocky8`) and CPU model (e.g. `icelake`) of the node.
The `${EBVERSIONR}` variable contains the R version (e.g. `4.2.2`) of the currently loaded
R module.
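
To preview the directory this setting resolves to on the current node, the path can be
composed in the shell. The sketch below is illustrative only: the function name `r_lib_path`
and the fallback values (used when the variables are unset, e.g. before loading an R module)
are assumptions, not part of the cluster setup.

.. code-block:: bash

   # Hypothetical helper: compose the per-OS, per-architecture, per-R-version library path.
   r_lib_path() {
       local data_dir="$1" os="$2" arch="$3" r_version="$4"
       printf '%s/Rlibs/%s/%s/R-%s\n' "$data_dir" "$os" "$arch" "$r_version"
   }

   # Use the predefined variables when available; the fallbacks are example values only.
   r_lib_path "${VSC_DATA:-/tmp/vsc_data_example}" "${VSC_OS_LOCAL:-rocky8}" \
              "${VSC_ARCH_LOCAL:-icelake}" "${EBVERSIONR:-4.2.2}"

Because each component is part of the path, libraries built for different operating systems,
CPU architectures or R versions end up in separate directories.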

and install it with::
$ R CMD INSTALL DEoptim_2.0-0.tar.gz -l /$VSC_DATA/R/
#. These packages might depend on the specific R version, so you may
need to reinstall them for the other version.

Some R packages depend on libraries installed on the system. In that case,
you first have to load the modules for these libraries, and only then proceed
to the R package installation. For instance, if you would like to install
the `gsl` R package, you would first have to load the module for the GSL
library, e.g.::

$ module load GSL/2.5-GCC-6.4.0-2.28
R will now use this path as default install path, ensuring you are always installing
your packages in the appropriate R library folder.

.. note::

R packages often depend on the specific R version they were installed
for, so you may need to reinstall them for other versions of R.
This `.Renviron` configuration will also work as expected in Open OnDemand apps
such as RStudio Server.

The next step is to load the appropriate R module and run R.

.. code-block:: bash

   # From within an interactive session on an icelake compute node:
   $ module load R/4.2.2-foss-2022b
   $ R

From here, installing packages can be as simple as:

.. code-block:: r

   > install.packages("DEoptim")

If you are unsure whether R will install your desired package in the correct location, you can first list
the known library locations by executing `.libPaths()`. The first location is the
default one.
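
R only adds directories that already exist to its library paths, so an `R_LIBS_USER`
location that has not been created yet will not show up in `.libPaths()`. A minimal shell
sketch to check this before starting R (the function name `check_r_lib_dir` is hypothetical):

.. code-block:: bash

   # Hypothetical helper: verify that a library directory exists and is writable.
   check_r_lib_dir() {
       local dir="$1"
       if [ -d "$dir" ] && [ -w "$dir" ]; then
           echo "OK: $dir exists and is writable"
       else
           echo "Problem: $dir is missing or not writable" >&2
           return 1
       fi
   }

   # Example usage (after loading an R module, so that EBVERSIONR is set):
   # check_r_lib_dir "${VSC_DATA}/Rlibs/${VSC_OS_LOCAL}/${VSC_ARCH_LOCAL}/R-${EBVERSIONR}"

If the check fails, creating the directory with ``mkdir -p`` as shown earlier and restarting R
should make it appear in `.libPaths()`.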

You can also specify your desired library path as an extra argument in the install command.
This will take precedence over any defaults.

.. code-block:: r

   > Rlibs <- "/path/to/my/R_library"
   > install.packages("DEoptim", lib = Rlibs)

Alternatively, you can download the desired package:

.. code-block:: bash

   $ wget cran.r-project.org/src/contrib/Archive/DEoptim/DEoptim_2.0-0.tar.gz

and install it from the command line with:

.. code-block:: bash

   # From within an interactive session on an icelake compute node:
   $ module load R/4.2.2-foss-2022b
   $ R CMD INSTALL DEoptim_2.0-0.tar.gz -l ${VSC_DATA}/Rlibs/${VSC_OS_LOCAL}/${VSC_ARCH_LOCAL}/R-${EBVERSIONR}

If the installation of a package requires devtools, please consult the :ref:`devtools documentation<r_devtools>`.


.. _r_package_management_conda:
