The string formatting in R can feel old fashioned because R is an old programming language. R was created in 1995 to be a better version of an even older programming language called S, which was created in 1976. The clunkiness of parts of R when compared to modern languages is due to this heritage.

Fortunately, R is not stuck in the past. R comes with a lot of packages that extend the core functionality, and provide a more modern interface to its most powerful functionality. When we use R with these packages, we are using “modern R”.

For example, the Stringr package provides a set of modern functions for manipulating and formatting strings.

To use Stringr, we first need to install the package. You install packages in R using install.packages, e.g. type into your console;

install.packages("stringr")

and hit return. If your user account has permission to install packages then you should see something like;

trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/stringr_1.4.0.tgz'
Content type 'application/x-gzip' length 210650 bytes (205 KB)
==================================================
downloaded 205 KB


The downloaded binary packages are in
    /var/folders/vg/lyxsq9391fxfm64hfdfr88f40000gq/T//Rtmp57uUY0/downloaded_packages

Notice that this will automatically get the right package for your operating system (in my case macosx). Also note that you only have to do this once, as once installed, this package is available for everyone.

Using a package

You can use a package in your script via the library command. To use stringr you should type;

library(stringr)

into the console. When you press return, nothing should happen. If you see output similar to;

Error in library(stringr) : there is no package called ‘stringr’

then this means that stringr is not installed properly.

To get help on a package type ? before its name, e.g.

?stringr

All of the functions in stringr start with str_ and take a string (or vector/list of strings) as the first argument.

Key functions are;

EXERCISE

Use ? to learn about the above stringr functions and have a play printing different strings to the console.

Answer

CRAN

The power of R comes from its great wealth of excellent packages. These packages are managed in a central repository called CRAN (the Comprehensive R Archive Network). There are very strict protocols to follow to publish a package in CRAN, which includes an external review stage. As such, publishing an R package is a lot like publishing a paper, and so R packages on CRAN are mostly of a high standard, and come complete with documentation and tests. You can get an idea of what is needed to publish a package on CRAN by reading R Packages by Hadley Wickham and Jennifer Bryan. This excellent online book provides complete detail of how to write and publish an R package.

A good way to find the package you want is to use an R search service, such as rseek. You can search for individual package names, or even the kind of thing you want to do. As most R packages come with vignettes (web pages that show examples of how to use the package) this means you can quickly find both the package that achieves your goal, plus really clear documentation and examples.

EXERCISE

Use rseek to look for packages that help you calculate Pearson’s product-moment correlation. Limit the search to vignettes. Can you find a vignette that shows you how to do this? Do not worry that the R in the vignette is more advanced than you’ve seen so far - it won’t be long before it will make sense ;-)

Answer

Updating packages

R’s strength is its packages, and what makes this easier is that package management is handled directly within the language.

You can update a package by running install.packages again, e.g. to update stringr to the newest version, just type;

install.packages("stringr")

You can get a list of all installed packages via installed.packages(), e.g.

installed.packages()

You can get a list of all packages for which new versions are available using old.packages(), e.g.

old.packages()

You can update all packages for which updates are available using update.packages(). Set ask=FALSE to update everything without prompting, e.g.

update.packages(ask=FALSE)

Note that updating all of your packages could take a while if you haven’t done it recently…

Previous | Next