Import the tidyverse
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ ggplot2 3.3.3 ✔ purrr 0.3.4
## ✔ tibble 3.1.1 ✔ dplyr 1.0.5
## ✔ tidyr 1.1.3 ✔ stringr 1.4.0
## ✔ readr 1.4.0 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
Read in the file. Because the columns are separated by runs of whitespace, we need to use read_table. Note that read_delim with delim=" " is the wrong choice, as it splits on every single space character and so treats a run of spaces as several empty fields. read_table is the right choice when columns are separated by multiple whitespace characters.
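To see why splitting on a single space goes wrong, here is a minimal base-R sketch. It uses strsplit as a stand-in for the delimiter logic and does not show what readr does internally; the sample line is made up to resemble a row of the data file.

```r
# A made-up line with three spaces between fields (an assumption, not real data)
line <- "1659   3.0   4.0"

# Splitting on every single space leaves empty strings between the real fields
single <- strsplit(line, " ", fixed = TRUE)[[1]]
single
length(single)

# Treating a run of spaces as one separator recovers the three fields
runs <- strsplit(line, " +")[[1]]
runs
length(runs)
```

The first split returns more pieces than there are columns, which is the kind of mismatch a single-space delimiter produces on padded files.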
Note that we should read the DATE column as an integer, as it is a year.
temperature <- read_table(
"https://chryswoods.com/data_analysis_r/cetml1659on.txt",
skip=6,
na=c("-99.99", "-99.9"),
col_types=cols("DATE"=col_integer())
)
temperature
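The effect of the na and col_types arguments can be checked on a small scale. The sketch below writes a hypothetical four-line file (the values are invented, only the layout mimics the real data) to a temporary path and reads it back with the same options.

```r
library(readr)

# Hypothetical miniature file: columns padded with spaces, missing values coded as -99.99 / -99.9
path <- tempfile(fileext = ".txt")
writeLines(c(
  "DATE     JAN     FEB",
  "1659     3.0     4.0",
  "1660     0.0  -99.99",
  "1661    -1.5   -99.9"
), path)

small <- read_table(
  path,
  na = c("-99.99", "-99.9"),              # treat these strings as missing
  col_types = cols(DATE = col_integer())  # other columns are still guessed
)
small

# DATE is stored as an integer, and the sentinel values have become NA
is.integer(small$DATE)
is.na(small$FEB)
```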
How many years had a negative average temperature in January?
negative_jan <- temperature %>% filter(JAN < 0)
negative_jan
num_negative_jan <- nrow(negative_jan)
num_negative_jan
## [1] 20
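As a small illustration of the filter-then-count pattern, the sketch below uses a made-up tibble (the toy name and its values are assumptions, not taken from the data set). It also shows one possible shortcut that counts directly from the column.

```r
library(dplyr)

# Made-up values, not taken from the real data set
toy <- tibble(
  DATE = c(1684L, 1685L, 1686L, 1687L),
  JAN  = c(-3.0, 0.5, NA, -1.2)
)

# filter() keeps only rows where the condition is TRUE,
# so rows where JAN is NA are dropped as well
cold <- toy %>% filter(JAN < 0)
nrow(cold)

# Same count without building the intermediate tibble
sum(toy$JAN < 0, na.rm = TRUE)
```

Keeping the filtered tibble is useful when you also want to see which years qualified; the sum() form is enough when only the count matters.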
What was the average temperature in June over the years in the data set?
jun_average <- mean(temperature[["JUN"]])
jun_average
## [1] 14.33757
(Note that we have to get the column data itself, not a 1-column tibble, which is why we use [[ ]].)
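The difference between [ ] and [[ ]] can be seen on a made-up tibble (toy and its values are assumptions for illustration). The sketch also passes na.rm = TRUE, which would matter if a column contained missing values; the result printed above suggests JUN had none when the answer was produced.

```r
library(tibble)

# Made-up values, including one missing entry
toy <- tibble(DATE = 2001:2003, JUN = c(14.0, 15.5, NA))

col_tibble <- toy["JUN"]    # still a tibble with one column
col_vector <- toy[["JUN"]]  # plain numeric vector

is.numeric(col_tibble)
is.numeric(col_vector)

# mean() needs the vector; na.rm = TRUE skips missing values
mean(col_vector, na.rm = TRUE)
```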