The Linux operating system is a great platform for computing. However, it takes some effort for users migrating from other operating systems (e.g., Windows) to get started. For example, when I migrated to Linux, I spent quite some time trying to understand how to install the latest version of R on the system. For those of you who are in the same situation I was in, I am writing this tutorial to help. In this tutorial, I will not only show you how it’s done, but also explain why each step is necessary, so that you can get a better understanding.
This tutorial is based on Ubuntu, which is perhaps the most popular Linux distribution. Before diving in, here’s something you need to know: (1) most Linux distributions, including Ubuntu, ship with a program called bash that runs various kinds of commands, such as those for software/package management (e.g., apt); (2) you should not confuse package installation using the apt install command in a bash session (which can be opened by pressing CTRL+ALT+T) with package installation using the install.packages() function in an R session (which can be started by entering R in a bash session).
You can install the r-base package, which includes the essential components of R, using the apt install command. By default, the command will search for and install the components from a repository called Universe. However, the version of R included in this repository is typically not up to date. Alternatively, you can tell apt install to obtain the latest version from a CRAN repository. This can be achieved by first adding the following entry on a new line in your /etc/apt/sources.list file (you can run a text editor as root, e.g., sudo nano /etc/apt/sources.list, to add the entry):
deb https://<my.favorite.cran.mirror>/bin/linux/ubuntu <code-name-adjective>-cran40/
Note that you should replace <my.favorite.cran.mirror> with one of the URLs provided by this site, and replace <code-name-adjective> with your Ubuntu release code name adjective (see here for a full list). For example, to obtain the latest R 4.0 packages, add an entry like one of the following:
# for Ubuntu 20.04
deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/
# for Ubuntu 18.04
deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran40/
# for Ubuntu 16.04
deb https://cloud.r-project.org/bin/linux/ubuntu xenial-cran40/
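The entry can also be assembled from the release code name rather than typed by hand. The following is a minimal sketch: cran_apt_entry is a hypothetical helper name of my own, and its default mirror is simply the cloud.r-project.org mirror used in the examples above.

```shell
#!/bin/sh
# cran_apt_entry is a hypothetical helper (not part of apt or R).
# $1: Ubuntu release code name adjective, e.g. focal
# $2: optional CRAN mirror URL; defaults to the mirror used in the examples above
cran_apt_entry() {
    printf 'deb %s/bin/linux/ubuntu %s-cran40/\n' "${2:-https://cloud.r-project.org}" "$1"
}

# Print the entry for Ubuntu 20.04
cran_apt_entry focal
```

On a running Ubuntu system, lsb_release -cs prints the code name, so cran_apt_entry "$(lsb_release -cs)" should produce the matching entry, which you can then paste into /etc/apt/sources.list.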
After adding the entry, you will also need to add a key to your system so that apt can check the signature of the Release file for the added repository and verify its authenticity. The CRAN repository for Ubuntu is signed with the key of “Michael Rutter” (see https://cran.r-project.org/bin/linux/ubuntu/README.html). To add the key, enter the following command in bash:
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
Now that the setup is done, you can install the latest version of R (remember to first update package index files from the repository):
sudo apt update
sudo apt install r-base
Now, you should be able to open R by entering R in bash.
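To confirm that the version you just installed is the recent one from CRAN rather than the older one from Universe, you can read the version number from R’s banner. This is a minimal sketch, assuming the first line of the R --version output has the usual form shown in the comment; r_version_from_banner is a hypothetical helper name.

```shell
#!/bin/sh
# r_version_from_banner is a hypothetical helper: it reads the output of
# `R --version` on stdin and prints only the version number from line 1.
# Assumes line 1 looks like: R version 4.0.2 (2020-06-22) -- "Taking Off Again"
r_version_from_banner() {
    sed -n '1s/^R version \([0-9][0-9.]*\).*/\1/p'
}

# Only query R if it is actually on the PATH
if command -v R >/dev/null 2>&1; then
    R --version | r_version_from_banner
fi
```

In addition, apt-cache policy r-base lists the installed version together with the repositories that offer the package, which helps to tell whether the CRAN entry was picked up.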
After installing the core packages, you will typically want to install additional R packages using the install.packages() function in R. However, the function depends on the r-base-dev package to compile source code for some R packages. Therefore, before using the install.packages() function, you should first install the r-base-dev package. Like r-base, the r-base-dev package can be installed in a bash session:
sudo apt install r-base-dev
Now, you should be able to install most R packages using the install.packages() function in R.
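If install.packages() still fails while compiling a package, a quick check is whether the build tools are on your PATH. The sketch below assumes the usual toolchain commands (gcc, g++, gfortran, make), which r-base-dev is expected to pull in; the exact list is my assumption, and missing_tools is a hypothetical helper name.

```shell
#!/bin/sh
# missing_tools is a hypothetical helper: it prints every command name
# from its arguments that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed toolchain for compiling R packages from source;
# no output means every tool was found
missing_tools gcc g++ gfortran make
```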
Installing tidyverse (or other R packages with system dependencies)
You are perhaps aware that, when you install an R package that depends on other R packages that are not yet installed, the install.packages() function will automatically install all the required packages at once, even if you don’t explicitly tell it to.
However, things are different when you install an R package that depends on non-R packages that are not yet installed. In this case, you will always need to install the required package(s) manually before you use the install.packages() function.
A good example would be an R package called tidyverse. In case you don’t know, the tidyverse is a set of R packages (e.g., dplyr, …) developed for data science (e.g., data visualization, data manipulation, …) under a tidy and elegant design philosophy. You can visit the home page for more information.
In short, the tidyverse package requires the following non-R packages: libcurl4-openssl-dev, libssl-dev, and libxml2-dev. Like r-base-dev, you can install them in bash:
sudo apt install libcurl4-openssl-dev libssl-dev libxml2-dev
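You can also check which of these system packages are still missing, either before or after running the command above. The following is a minimal sketch using dpkg-query; missing_debs is a hypothetical helper name, and the DPKG_QUERY variable is an assumption of mine that exists only so the query command can be swapped out for testing.

```shell
#!/bin/sh
# missing_debs is a hypothetical helper: it prints each package from its
# arguments that dpkg does not report as fully installed.
# DPKG_QUERY (an illustrative variable, not a dpkg feature) can point to a
# replacement command; by default the real dpkg-query is used.
missing_debs() {
    for pkg in "$@"; do
        # ${Status} reads "install ok installed" for a fully installed package
        if ! "${DPKG_QUERY:-dpkg-query}" -W -f='${Status}' "$pkg" 2>/dev/null |
            grep -q 'ok installed'; then
            printf '%s\n' "$pkg"
        fi
    done
}

# The system libraries from the command above; run the check only where
# dpkg-query exists. No output means nothing is missing.
if command -v dpkg-query >/dev/null 2>&1; then
    missing_debs libcurl4-openssl-dev libssl-dev libxml2-dev
fi
```

The same check should work for the system dependencies of other R packages, as long as you know their Ubuntu package names.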
Now, you should be able to install tidyverse in R using the install.packages() function.
Installing RStudio and RStudio Server
If you are looking for an integrated development environment (IDE) for R, I recommend RStudio. First, download an installer for your system from here. Second, double-click the installer (the installer for Ubuntu should have the file extension “.deb”) and follow the instructions. The whole process is straightforward, so I am not going into the details here.
If you need to use R on a remote Linux server, you’ll probably need RStudio Server as well. Briefly, RStudio Server provides a browser-based interface to R running on a remote Linux server. To use RStudio Server, you’ll need to follow these steps on the remote server.
First of all, since RStudio Server by default does not allow system users (such as root) to authenticate, you need a normal user account with sudo privileges on the server. The following code shows how to create one:
useradd -m <user_name>         # create a user with a home directory
passwd <user_name>             # set password
usermod -aG sudo <user_name>   # add the user to the sudo group (-a keeps any existing group memberships)
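Once the account exists, it is worth confirming that it really ended up in the sudo group. This is a minimal sketch using the id command; user_in_group is a hypothetical helper name.

```shell
#!/bin/sh
# user_in_group is a hypothetical helper: it succeeds (exit status 0) if
# user $1 belongs to group $2, and fails otherwise.
user_in_group() {
    # `id -nG` prints the user's group names separated by spaces
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

# Example: check the account running this script
if user_in_group "$(id -un)" sudo; then
    echo "current user is in the sudo group"
else
    echo "current user is not in the sudo group"
fi
```

To check the new account instead, pass its name as the first argument in place of "$(id -un)".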
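Next, download and install the RStudio Server package (replace <deb_package_url> and <deb_package_name> with the values for the release you want):
sudo apt install gdebi-core
wget <deb_package_url>
sudo gdebi <deb_package_name>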
If you installed RStudio Server using a package manager binary (e.g., a Debian package or RPM), then it is automatically registered as a daemon which starts along with the rest of the system. However, if you need to manually stop, start, or restart the server, here’s how:
sudo rstudio-server stop
sudo rstudio-server start
sudo rstudio-server restart
There are a number of administrative commands which allow you to see what sessions are active and request suspension of running sessions (note that session data is not lost during a suspend):
sudo rstudio-server active-sessions              # list all currently active sessions
sudo rstudio-server suspend-session <pid>        # suspend an individual session
sudo rstudio-server force-suspend-session <pid>  # "force" variant of suspend: sends an interrupt to the session to request the termination of any running R command
After the above steps, you should be able to access RStudio Server through the web browser of your local machine. To do so, just enter the IP address of your remote machine, including the port number (by default, RStudio Server listens on port 8787), in the browser’s address bar.
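As a small illustration of what to type into the browser, the sketch below builds the address from a host and a port. It assumes the server is reached over plain HTTP, which is an assumption on my part; rstudio_url is a hypothetical helper name, and 192.0.2.10 is only a placeholder address.

```shell
#!/bin/sh
# rstudio_url is a hypothetical helper: it prints the address to enter in
# the browser for host $1 and optional port $2 (8787 is the default port).
rstudio_url() {
    printf 'http://%s:%s\n' "$1" "${2:-8787}"
}

# 192.0.2.10 is a placeholder address reserved for documentation
rstudio_url 192.0.2.10
```

If the page does not load, running curl -I against the same address from your local machine can show whether the port is reachable at all; a firewall on the server may be blocking port 8787.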
In this post, I demonstrated and explained the process of installing R and related components on Ubuntu. Once you are familiar with the concepts, you should be able to install additional R packages with more ease.