It is likely that much of the R code in the vignettes on the previous page does not make sense to you yet. Do not worry! You are new to R and are still near the beginning of your learning journey. Over the last half of this workshop we will explore some of the building blocks of R so that the code will begin to make a little more sense ;-)

A core building block of all programming languages is a function. A function is a reusable block of code that can be used over and over again in your program. A function takes inputs (called arguments), it then does something to those inputs to produce some outputs, which are returned to you.

You’ve already used many functions. For example,

```
library(stringr)
hello <- "Hello R"
length <- str_length(hello)
cat(sprintf("'%s' has %d characters\n", hello, length))
```

will print `'Hello R' has 7 characters`.

This code has four functions:

`library`
: This function loads the library passed as the argument, e.g. `library(stringr)` loads the `stringr` library.

`str_length`
: This function calculates the number of characters in the string passed in as the argument, returning the number of characters. When given the value of `hello` (namely `Hello R`) it returns the number `7`.

`cat`
: This prints its arguments to the screen, returning nothing.

`sprintf`
: This formats a string based on its many input arguments, returning the string that has been created.
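To see the difference between a function that returns a value and one that only prints, it can help to store each result in a variable. The sketch below is illustrative only: it uses base R's `nchar` in place of `str_length` so that it runs without loading any package, and the variable names `n_chars`, `message_text` and `printed` are hypothetical.

```
hello <- "Hello R"

# nchar is the base R counterpart of stringr's str_length
n_chars <- nchar(hello)

# sprintf returns the formatted string; nothing is printed yet
message_text <- sprintf("'%s' has %d characters", hello, n_chars)

# cat prints its arguments; the value it hands back is NULL
printed <- cat(message_text, "\n", sep = "")
is.null(printed)
```

Here `sprintf` does the formatting and `cat` does the printing, which is why the two are nested in the example above.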

You can write your own functions in R! For example, let’s try to write a function that calculates the mean average of a list of numbers.

As input, the function will take a list of numbers. It should output a number which is the mean of those numbers.

There are many ways this function could be written. Here is one possible solution:

```
calculate_mean <- function(values){
    total <- 0.0
    count <- 0
    for (value in values){
        total <- total + value
        count <- count + 1
    }
    return(total / count)
}
```

We can then use this function to calculate, for example, the average height of a group of people by typing:

```
person_heights <- c(1.62, 1.80, 1.56, 1.73, 1.91)
average_height <- calculate_mean(person_heights)
cat(sprintf("The average height is %.2f m\n", average_height))
```

Running this would print:

`The average height is 1.72 m`
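One way to gain confidence in a hand-written function is to compare it against a built-in that does the same job. The sketch below repeats the definition so that it runs on its own, then checks the result against base R's `mean`; `all.equal` is used rather than `==` because floating-point results can differ in the last few digits.

```
calculate_mean <- function(values){
    total <- 0.0
    count <- 0
    for (value in values){
        total <- total + value
        count <- count + 1
    }
    return(total / count)
}

person_heights <- c(1.62, 1.80, 1.56, 1.73, 1.91)

# Compare against the built-in mean, allowing for floating-point rounding
all.equal(calculate_mean(person_heights), mean(person_heights))
```

In everyday code you would normally just call `mean`; writing your own version is useful here because it exposes the parts that every function needs.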

To explain how this worked, we need to look at how the function was defined. There is some scaffolding that is common to all functions. First, we define the function name. In R, this is a variable that holds the code of the function. We define this variable and assign data to it in the same way as we would assign a number to a numeric variable, or a string to a string variable, namely using the syntax `variable <- value`:

```
  Variable  assigned value
      ↓        ↓       ↓
calculate_mean <- function(...
```

Next, we have the keyword `function`, which says that this data is of type function. This means that the data will contain code. The arguments to `function` are the arguments you would like to use as input for your new function:

```
                  keyword arguments
                     ↓        ↓
calculate_mean <- function(values) {
```

Next, you need the body of the function. This body is the lines of code that will be run when your function is called. Just like with `for` loops or `if` statements, the body of code is contained within curly brackets:

```
                 Open curly brackets
                                   ↓
calculate_mean <- function(values) {
    # body of the function is the
    # lines of code within the curly brackets
}
↑
Close curly brackets
```
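Because the function is just a value held in a variable, you can inspect it and copy it like any other value. The following is a minimal sketch: the body is shortened to a single line using base R's `sum` and `length` for brevity, and `average` is a hypothetical second name.

```
calculate_mean <- function(values) {
    return(sum(values) / length(values))
}

# The variable holds a function, just as another might hold a number
class(calculate_mean)
is.function(calculate_mean)

# Like any other value, it can be assigned to a second variable
average <- calculate_mean
average(c(2, 4, 6))
```

Both names now refer to the same code, so calling `average` runs exactly what `calculate_mean` would run.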

The input(s) for the function is/are the argument(s) that are passed to `function`, in this case, `values`. Our code will loop through all of the values in `values` to calculate the mean average. Once we have finished, we reach the final part of the function, which is `return`. This is used to return the output of the function back to the caller.

```
                          Input(s)
                              ↓
calculate_mean <- function(values){
    total <- 0.0
    count <- 0
    for (value in values){
        total <- total + value
        count <- count + 1
    }
    return(total / count)
                 ↑
           Return output
}
```
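A call to `return` also ends the function at that point, so it does not have to be the last line. As a small sketch of this, the hypothetical function `first_negative` below leaves its loop as soon as it finds what it is looking for, and only reaches the final `return` if nothing was found.

```
first_negative <- function(values) {
    for (value in values) {
        if (value < 0) {
            # Leaves the function immediately; later values are never examined
            return(value)
        }
    }
    # Only reached when no value was negative
    return(NA)
}

first_negative(c(3, -1, -5))
first_negative(c(1, 2))
```

The first call hands back `-1` without ever looking at `-5`; the second runs the whole loop and hands back `NA`.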

Finally, when we call the function, the arguments that are passed to the function are used as the input. The output is then returned and assigned to the result variable. So, in this case:

```
                Call function with input(s)
                        ↓              ↓
average_height <- calculate_mean(person_heights)
      ↑        ↑
   Output assigned
```

`calculate_mean` is called with `person_heights`. The data referred to by `person_heights` is passed to `calculate_mean` and, inside the function, is referred to as `values`. This data is looped over and the mean average calculated, resulting in an output that is returned at the end of the function and assigned to the variable `average_height`.
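The name `values` only exists inside the function, and changing it there does not change the caller's data. The sketch below illustrates this with a hypothetical function, `add_offset`, that modifies its argument; the variable `in_shoes` is likewise made up for the example.

```
add_offset <- function(values) {
    # This changes the function's own copy, not the caller's variable
    values <- values + 0.05
    return(values)
}

person_heights <- c(1.62, 1.80, 1.56, 1.73, 1.91)
in_shoes <- add_offset(person_heights)

person_heights[1]   # the original data is unchanged
in_shoes[1]         # the returned data includes the offset
```

This is why the output has to be returned and assigned: it is the only way the result gets back to the code that called the function.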

EXERCISE

Write a function, called `calculate_max`, that returns the largest value. Use this to find the largest height in the list of heights above.

Hint - start by using a variable called `max_value` and setting that equal to `NA`. Then use `if (is.na(max_value) || value > max_value)` to test whether a `value` in `values` is greater. The `||` means “or”.
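The order of the two tests in the hint matters. Without giving away the exercise, this short sketch shows why: comparing a number with `NA` gives `NA`, which `if` cannot use as a condition, whereas `||` stops as soon as its left-hand side is `TRUE`.

```
max_value <- NA
value <- 1.62

# Comparing against NA gives NA rather than TRUE or FALSE
is.na(value > max_value)

# is.na(max_value) is TRUE, so `||` stops there and
# the comparison on the right is never evaluated
is.na(max_value) || value > max_value
```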

## Errors

Your function works well, but what would happen if the wrong arguments were passed? What should we do if someone did this?

`result <- calculate_mean(c("cat", "dog", "horse"))`

If you run this now, you will see that R prints an error:

`Error in total + value : non-numeric argument to binary operator`

This isn’t very descriptive or helpful. You can control how R will behave in an error condition by using `stop` or `warning`.

You use `stop` if you want to stop the function from continuing, and to print an error message for the user. For example, we could use `is.numeric` to check if all of the values are numeric. If not, then we could `stop`:

```
calculate_mean <- function(values){
    total <- 0.0
    count <- 0
    for (value in values){
        if (!is.numeric(value)){
            stop("Cannot calculate average of non-numeric values")
        }
        total <- total + value
        count <- count + 1
    }
    return(total / count)
}
```

(note that `!` means “not”)

Now running

`result <- calculate_mean(c("cat", "dog", "horse"))`

gives the more useful error message:

```
Error in calculate_mean(c("cat", "dog", "horse")) :
Cannot calculate average of non-numeric values
```
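An error raised with `stop` ends the whole calculation unless the caller chooses to handle it. As a sketch of what handling might look like, base R's `tryCatch` can run the call and supply a fallback if an error occurs. The function is repeated so the example runs on its own, and returning `NA` as the fallback is an assumption made for illustration.

```
calculate_mean <- function(values){
    total <- 0.0
    count <- 0
    for (value in values){
        if (!is.numeric(value)){
            stop("Cannot calculate average of non-numeric values")
        }
        total <- total + value
        count <- count + 1
    }
    return(total / count)
}

result <- tryCatch(
    calculate_mean(c("cat", "dog", "horse")),
    error = function(e) {
        # conditionMessage extracts the text that was passed to stop
        cat("Caught:", conditionMessage(e), "\n")
        NA
    }
)
```

After this runs, `result` holds `NA` and the script carries on, rather than halting at the error.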

However, what if instead of stopping, we want to calculate the average of any numeric values, and just warn the user if non-numeric values are passed? We can do this using the `warning` function, e.g.

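```
calculate_mean <- function(values){
    total <- 0.0
    count <- 0
    for (value in values){
        # as.numeric gives NA if the conversion fails; suppressWarnings
        # hides R's own coercion warning so only ours is shown
        number <- suppressWarnings(as.numeric(value))
        if (!is.na(number)){
            total <- total + number
            count <- count + 1
        } else {
            warning("Skipping '", value,
                    "' as it is not numeric")
        }
    }
    return(total / count)
}
```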

In this case, we try to convert the value into a number using the `as.numeric` function. If this fails, it will return `NA`. We then test for `NA` using the `is.na` function, printing a warning that we are skipping this value if it isn’t a number.
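To see the warning in action, you could call the function with a mixture of values that can and cannot be converted. The sketch below is self-contained and uses made-up input; it wraps `as.numeric` in `suppressWarnings` so that R's own coercion warning does not appear alongside ours.

```
calculate_mean <- function(values){
    total <- 0.0
    count <- 0
    for (value in values){
        # suppressWarnings hides R's own "NAs introduced by coercion" message
        number <- suppressWarnings(as.numeric(value))
        if (!is.na(number)){
            total <- total + number
            count <- count + 1
        } else {
            warning("Skipping '", value, "' as it is not numeric")
        }
    }
    return(total / count)
}

# "cat" cannot be converted, so it is skipped with a warning
result <- calculate_mean(c("1.5", "cat", "2.5"))
cat(sprintf("The mean of the numeric values is %.1f\n", result))
```

Unlike `stop`, the warning does not interrupt the function, so `result` still receives the mean of the two values that could be converted.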

EXERCISE

Add error handling to your `calculate_max` function so that it warns when non-numeric values are skipped, and stops when there is no maximum value (i.e. because there are no numeric values passed).