1.2 Installing Packages
One of the strengths of R is its packages. R is fairly bare bones on its own, and packages extend its capabilities to almost any task imaginable. They generally come in three forms: those with an official stable release hosted on the Comprehensive R Archive Network (CRAN), those still in development, often found on platforms like GitHub, and those we write ourselves. We will look at each of these in turn.
1.2.1 CRAN
CRAN is the principal repository for stable, official R packages and is essentially a distributed network of servers hosting R distributions, contributed packages, documentation, and related materials. Before a package appears here, it has gone through a review process, including automated checks, that helps ensure that it is stable, free of critical bugs, adheres to certain coding standards, includes essential documentation (such as help pages for its functions), and has a unique version number. Also, some of the packages here come with an accompanying article in journals like The R Journal or the Journal of Statistical Software, which provide the logic behind the package, practical explanations of the code, and illustrative examples.
The standard function for installing packages from CRAN is install.packages(). To install a single package, provide its name within double quotes:
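A minimal sketch of this call. The package name ggplot2 is only an illustrative assumption; substitute the package you need:

```r
# Install a single package from CRAN (ggplot2 is an example name).
# In a script without a configured CRAN mirror, you may need to add
# repos = "https://cloud.r-project.org" to the call.
install.packages("ggplot2")

# Load the package to confirm that the installation worked
library(ggplot2)
```

Installing a package and loading it are separate steps: we install once, but call library() in every session in which we want to use the package.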
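You can install multiple packages at once by passing a character vector of package names, created with c() and separated by commas. Note that we also include dependencies = TRUE here. By default, install.packages() already installs the packages that our package strictly requires (those listed under Depends, Imports, and LinkingTo); setting dependencies = TRUE additionally installs the packages it lists under Suggests, which are often needed to run its examples, tests, and vignettes. We recommend including this to avoid errors caused by missing optional packages, at the cost of a longer installation: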
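As a sketch, the call could look as follows. The three package names are assumptions chosen for illustration only:

```r
# Install several packages at once (the names are examples only);
# dependencies = TRUE is passed explicitly, as described above
install.packages(c("dplyr", "tidyr", "readr"), dependencies = TRUE)
```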
By default, R installs packages in a standard library location on our system. Use .libPaths() to see where this is. The default location is usually fine, but you can specify another location using the lib argument if needed.
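The sketch below shows how we might inspect the library locations and prepare an alternative one. The directory my_lib is a hypothetical location inside R's temporary directory, so it disappears when the session ends; for real use you would choose a permanent folder. The installation lines are commented out so that the block runs without network access:

```r
# Show the library locations R currently searches; the first is the default
lib_dirs <- .libPaths()
print(lib_dirs)

# Hypothetical alternative location, created here in a temporary directory
my_lib <- file.path(tempdir(), "r-library")
dir.create(my_lib, showWarnings = FALSE, recursive = TRUE)

# Uncomment to install into that location with lib,
# and to load the package from it with lib.loc
# install.packages("ggplot2", lib = my_lib)
# library(ggplot2, lib.loc = my_lib)
```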
We should also keep our packages updated, as newer versions often provide us with new features, bug fixes, and performance enhancements. To check for and install updates for all our installed packages, run:
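A minimal sketch of the update call, together with a variant that skips the confirmation step:

```r
# Check CRAN for newer versions of all installed packages
# and offer to install them
update.packages()

# To update everything without being asked about each package:
# update.packages(ask = FALSE)
```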
This will scan all installed packages, compare their versions with those available from CRAN, and present us with a list of packages we can update. In RStudio, we can also click the Update button in the Packages tab, which will open a window listing all packages with available updates. It is good practice to update packages regularly (e.g. monthly) to minimise the chance of encountering bugs already fixed in newer versions. To see which packages are outdated without starting the update process, type old.packages() in the console.
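If we only want to inspect what is outdated, a sketch along these lines may help. The variable name outdated is our own choice:

```r
# List outdated packages without updating them
outdated <- old.packages()

if (is.null(outdated)) {
  # old.packages() returns NULL when nothing needs updating
  message("All packages are up to date.")
} else {
  # Show installed versus available versions
  print(outdated[, c("Package", "Installed", "ReposVer"), drop = FALSE])
}
```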
1.2.2 GitHub
If a package is still under active development or has not yet gone through the CRAN submission process, we can download it from platforms like GitHub. GitHub also often hosts the development versions of CRAN packages before their official release, giving us access to the latest features and bug fixes, with the caveat that they may be less stable or more prone to problems than their CRAN counterparts. As install.packages() cannot fetch packages from GitHub by default, we use the devtools package, which we first install from CRAN and which also allows us to install packages from other sources such as Bioconductor or GitLab:
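Since devtools is itself hosted on CRAN, installing it is a sketch of the same pattern we used before:

```r
# devtools is on CRAN, so the usual installer works
install.packages("devtools")
```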
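As we are now installing packages from “source” (that is, from their raw code rather than from the pre-built binaries that CRAN provides for Windows and macOS), we may need the necessary tools to build them before we can use them (compilers, libraries, etc.). This matters mainly for packages that contain compiled code such as C, C++, or Fortran. What you need varies by operating system: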
Windows: From the CRAN Rtools page, download the latest recommended version of Rtools that matches your R version and run it, making sure that you add Rtools to your system’s PATH.
macOS: You need the Xcode Command Line Tools. To install them, open the Terminal application (found in /Applications/Utilities/) and enter:
xcode-select --install
A software update window should appear. Follow the prompts to install the tools and accept the license agreement.
Linux: Depending on your Linux flavour, you need several development libraries and compilers. On Debian or Ubuntu, run the following in your terminal:
sudo apt-get update
sudo apt-get upgrade
sudo apt install build-essential libcurl4-gnutls-dev libxml2-dev libssl-dev
These commands update the package lists, upgrade existing software, and install essential build tools (build-essential) and libraries (libcurl, libxml2, libssl) needed for compiling R packages from source. The package manager and package names differ between Linux distributions (like Fedora, CentOS, etc.).
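To check from within R whether the build tools were found, one option is the pkgbuild package. This is a sketch under the assumption that pkgbuild is installed; it normally arrives as a dependency of devtools:

```r
# Returns TRUE if a working compiler toolchain is found;
# debug = TRUE prints details of the check
pkgbuild::has_build_tools(debug = TRUE)
```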
Once you have devtools and the necessary build tools, you can use install_github() to install the packages. You can find the github_username and repository_name on the GitHub page that hosts the package:
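As a sketch, the call takes a single string of the form "github_username/repository_name". The repository tidyverse/ggplot2 used below is only an example chosen for illustration, and installing it replaces any CRAN version of ggplot2 with the development version:

```r
# General pattern (placeholders, not a real repository):
# devtools::install_github("github_username/repository_name")

# Concrete example: the development version of ggplot2
devtools::install_github("tidyverse/ggplot2")
```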
1.2.3 Writing Packages
If no package does exactly what we need, we can write it ourselves. This can be a good idea if there are certain pieces of code that we need to run multiple times, or that we want to share with others. As the name suggests, packages are collections of functions, pieces of code that we define using the function() command:
my_function_name <- function(argument1, argument2, ...) {
  # Commands to be executed using the arguments go here
  result <- NULL  # Placeholder: replace NULL with the computed value
  return(result)  # Returns the result
}
Functions are invaluable for making your code modular and repeatable, and you will use them often while using R. However, once you have a collection of related functions (and possibly data) that you use frequently or want to share with others, you should consider organising them into a package. This involves structuring your code and documentation in a particular directory layout and including specific metadata files (such as DESCRIPTION and NAMESPACE) that aim to make your code easily installable, loadable, and discoverable by others. For more information, see Hadley Wickham’s book R Packages or the guide by Hilary Parker.
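To make the function template concrete, here is a small sketch of the kind of helper that might end up in a package of our own. The function rescale01 and its argument names are hypothetical and serve only as an illustration:

```r
# Hypothetical helper: rescale a numeric vector to the range [0, 1]
rescale01 <- function(x, na.rm = TRUE) {
  rng <- range(x, na.rm = na.rm)          # smallest and largest value
  result <- (x - rng[1]) / (rng[2] - rng[1])
  return(result)
}

rescale01(c(2, 4, 6))
# returns 0.0 0.5 1.0
```

Once several such helpers have accumulated, moving them into a package saves us from copying them between scripts.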