Until now, all the variables we have used have contained a single piece of information. For example, a <- 4 makes a variable a containing a single number, 4. It’s very common in programming (and in real life) to want to refer to collections of things. For example, a shopping list contains a list of items you want to buy, and a car park contains a set of cars.
Create a new file by going to File → New File → R Script. Save it (e.g. using File → Save) to a file called list.R. Inside that file write the following:
my_list <- c("cat", "dog", 261)
print(my_list)
Run this script in the terminal with
Rscript list.R and look at the output.
This will create an R list (also called a vector) with three elements and assign it to the variable my_list. The function c is the “combine” function, which creates a list by combining its arguments. The elements of the list, like all arguments to a function, are separated by commas. As with previous variable types, you can print lists by passing their name to the print function.
You can have as many items in a list as you like, even zero items. An empty list could look like:
my_list <- c()
And a list with six different numbers could look like:
my_list <- c(32, 65, 3, 867, 3, -5)
Edit the list so that it has some more items in it. Try adding some different data types and even rearranging the items. Make sure you save the file and rerun it in the terminal and check that the output matches what you expect.
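If you try mixing data types, it helps to check what R actually stored. The sketch below is only an illustration (the variable names numbers and mixed are not part of the original script). It uses R's class function to show that a list built with c holds a single data type, so numbers combined with strings are converted to strings.

```r
# A list (vector) built with c() holds one data type only.
numbers <- c(32, 65, 3)
mixed <- c("cat", "dog", 261)

print(class(numbers))  # prints [1] "numeric"
print(class(mixed))    # prints [1] "character"

# The number 261 was converted to the string "261" when it was combined.
print(mixed[3])        # prints [1] "261"

# An empty list has zero items.
print(length(c()))     # prints [1] 0
```

This is why 261 appears in quotes whenever the example list is printed.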
The power of R’s lists comes not simply from being able to hold many pieces of data but from being able to get specific pieces of data out. The primary method of doing this is called indexing. Indexing a list in R is done using square brackets, []. To get a single element out of a list you write the name of the variable followed by a pair of square brackets with a single number between them:
my_list <- c("cat", "dog", 261)
my_element <- my_list[1]
print(my_element)
my_list[1] means “give me the number 1 element of the list my_list”. Run this code and see what you get. Is it what you expect?
You’ll probably notice that it prints cat, whereas those of you who’ve used other programming languages may have expected it to print dog. This is because in R you count from one when indexing lists, so index 1 refers to the first item in the list. This is different to languages like Python or C++, which count from 0. This “one-indexing” is relatively uncommon among programming languages.
Try accessing some different elements from the list by putting in different numbers between the square brackets.
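As a starting point for that experiment, here is a small sketch of one-based indexing. The list animals and the variables first and third are made up for this illustration.

```r
# Indexing counts from 1, so index 1 is the first element.
animals <- c("cat", "dog", "fish")

first <- animals[1]
third <- animals[3]

print(first)       # prints [1] "cat"
print(animals[2])  # prints [1] "dog"
print(third)       # prints [1] "fish"
```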
Indexing lists is likely the first time you will see an R NA. NA is short for “not available”, and is returned by R whenever you ask for something that is not available. This can happen when you ask for an item in a list that doesn’t exist. You can test if a value is NA using the is.na function. For example:
my_list <- c("cat", "dog", 261)
my_element <- my_list[4]
print(my_element)
print(is.na(my_element))
Running this you will see the following printed to the screen:
[1] NA
[1] TRUE
Be very careful in R that you check if values are
NA. A common source of bugs is accessing elements in a list that don’t exist.
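One way to act on this advice is to test the value before using it. The sketch below assumes you are comfortable with an if statement, which may not have been covered yet; the variable names and the message text are purely illustrative.

```r
my_list <- c("cat", "dog", "fish")

# Index 5 does not exist in a three-element list, so R returns NA.
my_element <- my_list[5]

# Check for NA before using the value.
if (is.na(my_element)) {
  message_text <- "No element at that index"
} else {
  message_text <- my_element
}

print(message_text)  # prints [1] "No element at that index"
```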
You can get the length of the list using the
length function, e.g.
my_list <- c("cat", "dog", 261)
print(length(my_list))
A valid index in a list must be
1 <= index <= length(list).
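The rule above can be written as a small helper. Note that is_valid_index is a hypothetical function defined here for illustration only; it is not built into R.

```r
# Hypothetical helper: TRUE if index is between 1 and the length of items.
is_valid_index <- function(items, index) {
  index >= 1 && index <= length(items)
}

my_list <- c("cat", "dog", 261)

print(is_valid_index(my_list, 3))  # prints [1] TRUE
print(is_valid_index(my_list, 4))  # prints [1] FALSE
print(is_valid_index(my_list, 0))  # prints [1] FALSE
```

Checking the index first means you find out about a bad index before an NA spreads into the rest of your code.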
You can update a value in a list by putting it on the left-hand side of the assignment operator (<-), e.g.
my_list <- c("cat", "dog", 261)
my_list[1] <- "fish"
print(my_list)
would print
[1] "fish" "dog"  "261"
as the item at index 1 (which was cat) has been replaced by fish.
Note that you can update items at any index, including those beyond the end of the list, e.g.
my_list <- c("cat", "dog", 261)
my_list[8] <- "fish"
print(my_list)
would print
[1] "cat"  "dog"  "261"  NA     NA     NA     NA     "fish"
as fish was assigned to index 8. Since there was nothing previously above index 3, R automatically added NA values to indexes 4, 5, 6 and 7.
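To see how the list grows, you can compare its length before and after assigning beyond the end. This is a minimal sketch with made-up values; it counts the NA padding using sum and is.na.

```r
my_list <- c("cat", "dog", "bird")
length_before <- length(my_list)   # 3

# Assigning to index 6 extends the list; indexes 4 and 5 are filled with NA.
my_list[6] <- "fish"
length_after <- length(my_list)    # 6

print(length_before)               # prints [1] 3
print(length_after)                # prints [1] 6

# Count how many NA values were added as padding.
print(sum(is.na(my_list)))         # prints [1] 2
```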
As well as being able to select individual elements from a list, you can also grab sections of it at once. This process of asking for subsections of a list is called slicing. Slicing starts out the same way as standard indexing (i.e. with square brackets) but, instead of putting a single number between them, you pass in a collection (c) of numbers. For example:
my_list <- c(3, 5, "green", 5.3, "house", 100, 1)
my_slice <- my_list[c(3,4,5)]
print(my_slice)
Run this code and look at the output.
You should see that it printed
[1] "green" "5.3"   "house"
which is index 3 (green), index 4 (5.3) and index 5 (house), as you requested.
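The indexes in a slice do not have to be next to each other, and the result follows the order in which you ask for them. The following sketch uses an illustrative list called colours to show both points.

```r
colours <- c("red", "green", "blue", "pink", "grey")

# Pick out indexes 2 and 4 only.
my_slice <- colours[c(2, 4)]
print(my_slice)    # elements: "green" "pink"

# The result follows the order of the requested indexes.
reordered <- colours[c(5, 1)]
print(reordered)   # elements: "grey" "red"
```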
Writing out each index you want would be hard if you wanted a lot of values, e.g. slicing elements 5 to 50 of a big list. Fortunately, the
seq function generates sequences of numbers, e.g.
my_list <- c(3, 5, "green", 5.3, "house", 100, 1)
my_slice <- my_list[seq(3,5)]
print(my_slice)
In this case,
seq(3,5) generates the sequence of numbers from 3 to 5 inclusive. It is thus the same as
c(3,4,5). You can add an optional third argument to
seq that gives the step, e.g.
seq(1,9,2) will be the sequence of odd numbers between 1 and 9, while
seq(10,1,-1) would be a countdown from 10 to 1.
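You can print the sequences themselves to check what seq generates before using them as a slice. The sketch below does this for the three calls discussed above, then uses a stepped sequence to pick every second item from an illustrative list called items.

```r
print(seq(3, 5))       # prints [1] 3 4 5
print(seq(1, 9, 2))    # prints [1] 1 3 5 7 9

countdown <- seq(10, 1, -1)
print(countdown)       # the numbers 10 down to 1

# Use a stepped sequence to take every second item (indexes 2, 4 and 6).
items <- c("a", "b", "c", "d", "e", "f")
every_second <- items[seq(2, 6, 2)]
print(every_second)    # prints [1] "b" "d" "f"
```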
You can use negative indices to mask out elements from a list. For example:
my_list <- c("red", "green", "blue")
print(my_list[-1])
print(my_list[-2])
print(my_list[-3])
would print
[1] "green" "blue"
[1] "red"  "blue"
[1] "red"   "green"
This is because my_list[-1] means “everything except the element at index 1”, thus meaning "green" "blue". Similarly, my_list[-3] means “everything except the element at index 3”, thus meaning "red" "green".
You can mask out multiple indices at once, e.g.
my_list <- c("red", "green", "blue")
print(my_list[c(-1, -2)])
would print
[1] "blue"
as we have masked out the items at indices 1 and 2.
You can also mask by passing in a list of booleans (TRUE or FALSE values) that has the same size as the list, e.g.
my_list <- c("red", "green", "blue")
print(my_list[c(FALSE, TRUE, FALSE)])
would print
[1] "green"
as this is the only value whose index is TRUE in the list of passed booleans. Equally:
my_list <- c("red", "green", "blue")
print(my_list[c(TRUE, FALSE, TRUE)])
would now print
[1] "red"  "blue"
as the value green is masked by FALSE, while red and blue are both kept by TRUE.
Edit list.R to print various slices of your list. Make sure you understand why you get an NA (if you do).
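The two masking styles can be compared side by side. In the sketch below, the list colours and the mask keep are illustrative names. The last step uses R's ! operator, which flips TRUE and FALSE; it is not covered in the text above and is included only to show that a boolean mask can be inverted.

```r
colours <- c("red", "green", "blue", "pink")

# Negative indices: everything except the listed positions.
without_first <- colours[-1]          # "green" "blue" "pink"
without_ends <- colours[c(-1, -4)]    # "green" "blue"

# Boolean mask: keep only the positions marked TRUE.
keep <- c(TRUE, FALSE, TRUE, FALSE)
kept <- colours[keep]                 # "red" "blue"

# Flipping the mask with ! selects the other elements instead.
inverted <- colours[!keep]            # "green" "pink"

# Masking returns a new list; the original is unchanged.
print(length(colours))                # prints [1] 4
```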
Note that you can also use slices to update items in a list. For example:
my_list <- c("cat", "dog", 261)
my_list[c(1,3)] <- c("fish","horse")
print(my_list)
would print
[1] "fish"  "dog"   "horse"
because we have updated the items at indices 1 and 3 with the items fish and horse. You should make sure that the number of indices matches the number of items that are updated; if they differ, R recycles or discards values rather than stopping with an error, which can hide bugs.
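Slice updates work with any way of building the indexes, including seq. This minimal sketch uses an illustrative list of numbers called scores and keeps the number of indexes equal to the number of new values in each update.

```r
scores <- c(10, 20, 30, 40, 50)

# Two indexes, two replacement values.
scores[c(2, 4)] <- c(0, 0)
print(scores)   # elements: 10 0 30 0 50

# Three indexes generated by seq, three replacement values.
scores[seq(1, 3)] <- c(7, 8, 9)
print(scores)   # elements: 7 8 9 0 50
```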
In addition to updating a list, you can add items to the end of your list by using the append function, e.g.
my_list <- c("cat", "dog")
my_list <- append(my_list, "horse")
print(my_list)
other_list <- c("fish", "zebra")
my_list <- append(my_list, other_list)
print(my_list)
append takes two arguments: the list, and then the item (or items) you want to append to the list. It appends the item(s) onto a copy of the list, which is returned.
The combine (
c) function does a very similar job to
append, and so you may see it used to append items too, e.g.
my_list <- c("cat", "dog")
my_list <- c(my_list, "horse")
print(my_list)
other_list <- c("fish", "zebra")
my_list <- c(my_list, other_list)
print(my_list)
In my opinion, your code should be descriptive, and so describe the intent of what you want to do. Thus, I recommend using
append when you want to append items onto the end of a list, and use
c only when you want to combine or create a collection.
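To check that the two approaches give the same result when adding to the end, you can compare them with R's identical function. The sketch below uses illustrative variable names. It also shows append's optional after argument, which is not discussed above and is included only as an example of something append can express that a plain c call cannot.

```r
animals <- c("cat", "dog")

appended <- append(animals, "horse")
combined <- c(animals, "horse")

# Both approaches build the same list.
print(identical(appended, combined))  # prints [1] TRUE

# append returns a copy, so the original list is unchanged.
print(length(animals))                # prints [1] 2

# The optional after argument inserts the new item after a given index.
inserted <- append(animals, "fish", after = 1)
print(inserted)                       # elements: "cat" "fish" "dog"
```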