Installing R, R packages (e.g., tidyverse) and RStudio on Ubuntu Linux
Introduction
The Linux operating system is a great platform for computing. However, it takes some effort for users who migrate from other operating systems (e.g., Windows) to get started. For example, when I migrated to Linux, I spent quite some time trying to understand how to install the latest version of R on the system. For those of you who are in the same situation I was in, I am writing this tutorial to help. I will not only show you how it’s done, but also explain why each step is necessary, so that you can get a better understanding.
This tutorial is based on Ubuntu, which is perhaps the most popular Linux distribution. Before diving in, here’s something you need to know: (1) most Linux distributions, including Ubuntu, include a program called bash that runs various kinds of commands, such as those for software/package management, e.g., apt; (2) you should not confuse package installation using the apt install command in a bash session (which can be opened by pressing CTRL+ALT+T) with package installation using the install.packages() function in an R session (which can be started by entering R in a bash session).
Update: I’ve added an appendix at the end of this post, which provides a simple step-by-step guide on how to compile base R from source.
Installing r-base
You can install the r-base package, which includes the essential components of R, using the apt install command. By default, the command will search for and install the components from a repository called Universe. However, the version of R included in this repository is typically not up-to-date. Alternatively, you can tell apt install to obtain the latest version directly from a CRAN repository.
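To see which version apt would currently install, you can look at the "Candidate:" line printed by apt-cache policy. The helper below is a minimal sketch: the function name candidate_version is my own, and it only assumes the usual layout of the apt-cache policy output.

```shell
# Print the "Candidate:" version from `apt-cache policy` output read on stdin.
candidate_version() {
    awk '$1 == "Candidate:" { print $2; exit }'
}

# Usage (on an Ubuntu machine):
#   apt-cache policy r-base | candidate_version
```

Running it before and after adding the CRAN repository should show whether the candidate version has changed.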
Before adding the repository, you will need to add a key to your system so that apt can check the signature of the repository’s Release file and verify its authenticity. The CRAN repository for Ubuntu is signed with the key of “Michael Rutter” (see https://cloud.r-project.org/bin/linux/ubuntu/fullREADME.html). The key can be added with the following command in bash:
# sudo apt install curl gnupg
curl -fsSL https://<my.favorite.cran.mirror>/bin/linux/ubuntu/marutter_pubkey.asc | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/marutter_pubkey.gpg
Note that you should replace <my.favorite.cran.mirror> with one of the URLs provided by this site, e.g., cloud.r-project.org.
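If you want to double-check the downloaded key, gpg can print its fingerprint, which you can then compare with the one published in the CRAN README linked above. The sketch below is only an illustration: the helper names normalize_fpr and same_fingerprint are my own, and the fingerprints are passed in by you rather than hard-coded.

```shell
# Normalize a fingerprint: drop whitespace and upper-case the hex digits.
normalize_fpr() {
    printf '%s' "$1" | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]'
}

# Succeed only if the two fingerprints are identical after normalization.
same_fingerprint() {
    [ "$(normalize_fpr "$1")" = "$(normalize_fpr "$2")" ]
}

# Usage:
#   gpg --show-keys /etc/apt/trusted.gpg.d/marutter_pubkey.gpg   # prints the key's fingerprint
#   same_fingerprint "<fingerprint from the README>" "<fingerprint printed by gpg>" && echo "key matches"
```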
To add a CRAN repository, append the following entry to /etc/apt/sources.list with an editor of your choice:
deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/marutter_pubkey.gpg] https://<my.favorite.cran.mirror>/bin/linux/ubuntu <code-name-adjective>-cran40/
In addition to <my.favorite.cran.mirror>, also remember to replace <code-name-adjective> with your Ubuntu release code name adjective (see here for a full list). For example, to obtain the latest R 4.x packages, the entry should look like
# for Ubuntu 22.04
deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/marutter_pubkey.gpg] https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/
or
# for Ubuntu 20.04
deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/marutter_pubkey.gpg] https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/
or
# for Ubuntu 18.04
deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/marutter_pubkey.gpg] https://cloud.r-project.org/bin/linux/ubuntu bionic-cran40/
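Instead of typing the entry by hand, you could generate it from the mirror and the code name. This is a minimal sketch: the function name cran_repo_entry is made up for illustration, and it assumes the key was saved under the path used in the command above.

```shell
# Build the sources.list entry for a given mirror and Ubuntu code name.
cran_repo_entry() {
    mirror="$1"      # e.g. cloud.r-project.org
    codename="$2"    # e.g. jammy
    printf 'deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/marutter_pubkey.gpg] https://%s/bin/linux/ubuntu %s-cran40/\n' "$mirror" "$codename"
}

# Usage (lsb_release -cs prints the code name of the running release):
#   cran_repo_entry cloud.r-project.org "$(lsb_release -cs)" | sudo tee -a /etc/apt/sources.list
```

Piping through sudo tee -a appends the line without opening an editor; check the printed entry first if you are unsure about the code name.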
Now that the setup is done, you can install the latest version of R (remember to first update package index files from the repository):
sudo apt update
sudo apt install r-base
Now, you should be able to open R by entering R in bash.
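To confirm the installation without opening an interactive session, you can ask R for its version. The helpers below are a small sketch (have_cmd and report_r_version are names I made up); they only rely on command -v and R --version.

```shell
# Succeed if the given command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print the first line of `R --version`, or a hint if R is missing.
report_r_version() {
    if have_cmd R; then
        R --version | head -n 1
    else
        echo "R not found on PATH"
    fi
}

# Usage:
#   report_r_version
```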
Installing r-base-dev
After installing the core packages, you will typically want to install additional R packages using the install.packages() function in R. However, the function depends on the r-base-dev package to compile source code for some R packages. Therefore, before using install.packages(), you should first install the r-base-dev package. Like r-base, the r-base-dev package can be installed in a bash session:
sudo apt install r-base-dev
Now, you should be able to install most R packages using the install.packages() function in R.
Installing tidyverse
You are perhaps aware that, when you install an R package that depends on other uninstalled R package(s), the install.packages() function will automatically install all the required packages at once, even if you don’t explicitly tell it to.
However, things are different when you install an R package that depends on other uninstalled non-R package(s), i.e., system libraries. In this case, install.packages() cannot install them for you, so you need to install the required package(s) yourself in bash before calling the function.
A good example would be an R package called tidyverse. In case you don’t know, the tidyverse is a set of R packages (e.g., ggplot2, dplyr, …) developed for data science (e.g., data visualization, data manipulation, …) under a tidy and elegant design philosophy. You can visit the home page for more information.
In short, the tidyverse package requires the following system dependencies that should be installed in bash:
sudo apt install libcurl4-openssl-dev libssl-dev libxml2-dev libfontconfig1-dev libharfbuzz-dev libfribidi-dev libfreetype6-dev libpng-dev libtiff5-dev libjpeg-dev
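If an R package later fails to build, it can help to check which of these system packages are actually present. The following is a rough sketch using dpkg-query; the function names pkg_installed and missing_pkgs are my own.

```shell
# Succeed if the Debian/Ubuntu package named $1 is installed.
pkg_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}

# Print each package from the argument list that is not installed, one per line.
missing_pkgs() {
    for pkg in "$@"; do
        pkg_installed "$pkg" || echo "$pkg"
    done
}

# Usage:
#   missing_pkgs libcurl4-openssl-dev libssl-dev libxml2-dev libfontconfig1-dev
```

An empty result means every listed package is installed.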
Now, you should be able to install tidyverse in R using the install.packages() function:
install.packages("tidyverse")
Installing RStudio and RStudio Server
If you are looking for an integrated development environment (IDE) for R, I recommend RStudio. Installing RStudio is simple. First, download an installer for your system from here. Second, double-click the installer (the installer for Ubuntu should have the file extension “.deb”) and follow the instructions. The whole process is straightforward, so I am not going into the details here.
If you want to use R on a remote Linux server, you’ll probably need RStudio Server as well. Briefly speaking, RStudio Server provides a browser-based interface to R running on a remote Linux server. To install RStudio Server, you’ll need to follow the steps below on the remote server.
First of all, since RStudio Server by default does not allow system users (such as root) to authenticate, you need a normal user account with sudo privilege [1] on the server. The following code shows how to create one:
[1] The sudo privilege is needed because you need to install various packages.
useradd -m <user_name> # create a user with a home directory
passwd <user_name> # set password
usermod -aG sudo <user_name> # add the user to the sudo group (-a appends instead of replacing the user's other groups)
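To confirm that the new account really ended up in the sudo group, you can inspect the output of id -nG. The helpers below are a small sketch; group_list_contains and user_in_group are names I made up for illustration.

```shell
# Succeed if the space-separated group list $1 contains the group $2.
group_list_contains() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep -qxF -- "$2"
}

# Succeed if user $1 belongs to group $2 (id -nG lists a user's group names).
user_in_group() {
    group_list_contains "$(id -nG "$1")" "$2"
}

# Usage:
#   user_in_group <user_name> sudo && echo "user can use sudo"
```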
Second, install gdebi [2] and RStudio Server (you can find <deb_package_url> from here):
[2] This package handles the installation of RStudio and its dependencies on Debian/Ubuntu.
sudo apt install gdebi-core
wget <deb_package_url>
sudo gdebi <deb_package_name>
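Here <deb_package_name> is simply the file that wget saves, i.e., the last component of <deb_package_url>. A tiny sketch of deriving one from the other is shown below; the function name deb_name_from_url and the example URL are made up.

```shell
# Print the file name component (everything after the last "/") of a download URL.
deb_name_from_url() {
    printf '%s\n' "${1##*/}"
}

# Usage, with a made-up URL:
#   url="https://example.com/downloads/rstudio-server-example-amd64.deb"
#   wget "$url" && sudo gdebi "$(deb_name_from_url "$url")"
```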
If you installed RStudio Server from a package manager binary (e.g., a Debian package or RPM), it is automatically registered as a daemon that starts along with the rest of the system. However, if you need to manually stop, start, or restart the server, here’s how:
sudo rstudio-server stop
sudo rstudio-server start
sudo rstudio-server restart
There are a number of administrative commands which allow you to see what sessions are active and request suspension of running sessions (note that session data is not lost during a suspend):
sudo rstudio-server active-sessions # list all currently active sessions
sudo rstudio-server suspend-session <pid> # suspend an individual session
sudo rstudio-server force-suspend-session <pid> # a "force" variation of the suspend command which will send an interrupt to the session to request the termination of any running R command
After the above steps, you should be able to access RStudio Server through the web browser of your local machine. To do so, just enter the IP address of your remote machine, including the port number (by default, RStudio Server listens on port 8787), in the address bar.
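As a small illustration, the address is just the host followed by the port. The sketch below builds it and shows one way to test from the command line whether the server answers; the function name rstudio_url is my own, and 203.0.113.10 is a placeholder address reserved for documentation.

```shell
# Print the address to open in the browser for a given host and optional port.
rstudio_url() {
    host="$1"
    port="${2:-8787}"    # fall back to the default port mentioned above
    printf 'http://%s:%s\n' "$host" "$port"
}

# Usage:
#   rstudio_url 203.0.113.10
#   curl -sI "$(rstudio_url 203.0.113.10)" | head -n 1   # an HTTP status line means the server answered
```

If nothing answers, a firewall on the server may be blocking the port.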
Conclusions
In this post, I demonstrated and explained the process of installing R and related components on Ubuntu. Once you are familiar with the concepts, you should be able to install additional R packages with more ease.
Appendix: Compile from source
TODO: figure out how to build R so that graphics (especially those with CJK characters) can be displayed on screen.
For various reasons [3], you may want to compile base R from source. A simple step-by-step guide is given below (tested on Ubuntu 22.04 with R 4.2.0; for more details, see the R Installation and Administration manual):
[3] For example, I upgraded to Ubuntu 22.04 soon after its initial release and I wanted to install R 4.2, which was also released at about the same time. At the time, the official CRAN repo for the new Ubuntu release was not ready, so I had to compile R from source. If you don’t want to go through the hassle, you may try an unofficial Ubuntu PPA provided by Dirk Eddelbuettel, which is usually more up-to-date than the official one: sudo add-apt-repository ppa:edd/misc && sudo apt update && sudo apt install r-base r-base-dev
First, install system dependencies using the following commands [4]:
[4] The dependencies can be found using apt show r-base-core and apt show r-base-dev. Some additional notes: 1. ATLAS (libatlas-base-dev + libatlas3-base) is a better alternative to BLAS (libblas3 + libblas-dev) + LAPACK (liblapack3 + liblapack-dev) in terms of performance. 2. libatlas-base-dev depends on libatlas3-base, and the latter provides libblas.so.3 as well as liblapack.so.3 for r-base-core. 3. Although not mentioned in apt show, libcurl4-gnutls-dev is also required by r-base-core.
# TODO: categorize dependencies by `capabilities()`
# required by r-base-core
sudo apt install zip unzip libpaper-utils xdg-utils libbz2-1.0 libc6 libcairo2 libcurl4 libglib2.0-0 libgomp1 libicu70 libjpeg8 liblzma5 libpango-1.0-0 libpangocairo-1.0-0 libpcre2-8-0 libpng16-16 libreadline8 libtcl8.6 libtiff5 libtirpc3 libtk8.6 libx11-6 libxt6 zlib1g ucf ca-certificates libcurl4-gnutls-dev
# required by r-base-dev
sudo apt install build-essential gcc g++ gfortran libatlas-base-dev libncurses5-dev libreadline-dev libjpeg-dev libpcre2-dev libpcre3-dev libpng-dev zlib1g-dev libbz2-dev liblzma-dev libicu-dev xauth pkg-config
# required for viewing graphics on-screen (configure R with `--with-x=no` if it is not needed)
sudo apt install xorg-dev
# needed by r-base-dev to build PDF help pages (optional; reasons not to install: 1. they are quite heavy; 2. PDF help pages are redundant since html ones are already included; 3. may interfere with tinytex, which is the recommended tex distribution for RMarkdown/Quarto.)
# sudo apt install texlive-base texlive-latex-base texlive-plain-generic texlive-fonts-recommended texlive-fonts-extra texlive-extra-utils texlive-latex-recommended texlive-latex-extra texinfo
Second, download the source tarball of the desired R version (e.g., R-4.2.0.tar.gz) from CRAN, extract the content to a directory and execute the following command inside that directory to prepare for the build:
./configure \
--enable-memory-profiling \
--enable-prebuilt-html \
--enable-R-shlib \
--with-blas
# TODO: how to build with `--with-tcltk`, `--with-cairo`, `--with-tiff` and `--with-libxml`?
# TODO: build with `--with-x` or `--without-x`?
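The download and extraction that precede the configure call can also be done from bash. This is a minimal sketch, assuming the tarball lives under CRAN's src/base/R-<major>/ directory on the cloud.r-project.org mirror; the function name r_source_url is my own.

```shell
# Print the CRAN download URL for the source tarball of R version $1.
r_source_url() {
    version="$1"             # e.g. 4.2.0
    major="${version%%.*}"   # text before the first dot, e.g. 4
    printf 'https://cloud.r-project.org/src/base/R-%s/R-%s.tar.gz\n' "$major" "$version"
}

# Usage:
#   wget "$(r_source_url 4.2.0)"
#   tar -xzf R-4.2.0.tar.gz && cd R-4.2.0
```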
Last, build, check and install (by default, the installation will be located somewhere under the /usr/local/ directory, e.g., /usr/local/lib/R and /usr/local/bin/R; you can change it in the previous configure step using the --prefix option):
make
make check
sudo make install
sudo chmod -R a+rx /usr/local/lib/R/ /usr/local/bin/R # may need to execute this command to grant privileges
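The build can take a while. One option, which I have not benchmarked, is to let make run several jobs in parallel. The sketch below picks a job count from the number of available CPU cores; the function name build_jobs is made up.

```shell
# Print the number of available CPU cores, falling back to 1 if it cannot be determined.
build_jobs() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# Usage (in place of the plain `make` above):
#   make -j"$(build_jobs)"
```

If a parallel build fails with an odd error, rerunning plain make is a simple way to rule out a job-ordering problem.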