12 Code for example data
Below you can see pretty much all of the example data we have worked on thus far.
In the next chapter, we will do analyses of these data in an R Markdown document.
If you fell off the wagon at some point, you can copy-paste the entire example data below into RStudio as follows: In RStudio, click from the top menu on File –> New File –> R Script, paste the example data below into that script and save (Save As…) into an R script with the name my_data.R.
my_data.R will be downloaded as the source file of our R Markdown document in the next chapter.
Note! If you don’t copy-paste lines from here, check before moving to the next chapter that there are no View() commands in your R script my_data.R, as they will mess up the “knitting” (or converting) phase of the R Markdown document in the future. I have checked that there are no View() commands in the text below, so the code below is bullet-proof.
# Copy-paste everything below and save it in an R script names my_data.R
# Load dplyr:
library(dplyr)
# Load mtcart into a new dataset:
car_dataset <- mtcars
# Process your dataset with dplyr's commands:
car_dataset2 <- car_dataset %>% # Let's process the dataset car_dataset.
filter(carb > 1) %>% # Keep only cars with more than 1 carburetor.
mutate(fuel_cons = mpg, # New variables with new names are created
horsepower = hp, # on these four lines
mass = wt, # (actually, were making duplicates
gear = am) %>% # of the existing ones).
select(fuel_cons, # Keep only the variables we created,
horsepower, # destroy all others.
mass,
gear)
# Create "broken data" and fix it:
# 1.
#
# Create a car with no mass defined.
# Thus, the mass automatically gets the value NA in R (not applicable, i.e. missing data).
# Store the entire new dataset as the dataset "car_dataset3":
car_dataset3 <- car_dataset2 %>%
add_row(fuel_cons = 30.7,
horsepower = 67,
gear = 1)
#Let's name the last line (i.e. the car) "Lada":
row.names(car_dataset3)[nrow(car_dataset3)] <- "Лада"
# 2.
#
# The next car created will have the name "Trabant" that reflects the history of the GDR (DDR).
# The mass will be given a clearly incorrect value of 0 kilograms.
# The entire car dataset replaces the dataset "car_dataset3":
car_dataset3 <- car_dataset3 %>%
add_row(fuel_cons = 34.6,
horsepower = 23,
mass = 0,
gear = 1)
row.names(car_dataset3)[nrow(car_dataset3)] <- "Trabant"
# Delete any rows with either inadequate data or NA.
# Save a complete dataset by the name "car_dataset4":
car_dataset4 <- car_dataset3 %>%
filter(fuel_cons > 0,
horsepower > 0,
mass > 0,
gear %in% c(0,1))
# Continue practicing with dplyr:
# Let's start from the beginning: let's load mtcars.
# Note! Only when all the pipelined commands below have been run,
# the dataset is saved into a new dataset called "car_dataset5".
car_dataset5 <- mtcars %>%
# Only cars with more than 1 carburetor are selected.
filter(carb > 1) %>%
# Use the mutate command to change the American units to European ones,
# give the variables new names.
mutate(fuel_cons_eu = 235.2 / mpg,
horsepower_eu = hp / 0.9863,
mass_eu = wt * 453.6,
gear = am) %>%
# With the Select command, only the new variables we created are
# selected and thus preserved, all other variables will be destroyed.
select(fuel_cons_eu,
horsepower_eu,
mass_eu,
gear)
# Practice converting a variable to a factor form:
car_dataset5 <- car_dataset5 %>% mutate(gear = as.factor(gear))
# Practice converting a data frame to a tibble:
car_dataset6 <- as_tibble(car_dataset5, rownames = "Car Brands")
R guide by Ville Langén is licensed under Attribution-ShareAlike 4.0 International