10 Transformation variables

In Finnish, a new variable that is based on one or more existing variables is called a muunnosmuuttuja - the closest translation in English might be a combination variable or a transformation variable.

An example could be a new variable hypertension, which would receive the value 1 if the subject’s office blood pressure was 140/90 mmHg or higher or if the subject was taking blood pressure medication, but otherwise the value 0. (In this specific example, the resulting variable would also be a binary or “dummy” variable.)

We already created such transformation variables with the command mutate, but we really only gave a new name to each of the new variables and did not change their values.

For the sake of practice, let’s change the American units in our dataset to a European format. With the following command, take a look at the info of the dataset mtcars that we have been using (you can run the command either in the Console or in your R script):

?mtcars

In the bottom right pane of RStudio, in the “Help” tab, the following content will appear (here you can see a few lines from it):

mtcars {datasets}   R Documentation
Motor Trend Car Road Tests
Description
The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

Usage
mtcars
Format
A data frame with 32 observations on 11 (numeric) variables.

[, 1]   mpg Miles/(US) gallon
[, 2]   cyl Number of cylinders
[, 3]   disp    Displacement (cu.in.)
[, 4]   hp  Gross horsepower


In the Help tab, you’ll notice that the dataset mtcars contains data entered in American units (miles, gallons, hp-type horsepower, etc.).

We now want to change the American units to European ones.

I found the following info with an open internet search:

  • European fuel consumption (l/100 km) = 235.2 / American consumption units (mpg)
  • European horsepower (PS) = American horsepower (hp) / 0.9863
  • European mass (kg) = American mass of the car (expressed in unit: 1000 lbs) * 453.6

Let’s run the code already run in chapter 7 one more time, but this time we will change the units to European. Let’s save the results into a new dataset “car_dataset5”:

# Let's start from the beginning: let's load mtcars.
# Note! Only when all the pipelined commands below have been run,
# the dataset is saved into a new dataset called "car_dataset5".

car_dataset5 <- mtcars %>%
  
# Only cars with more than 1 carburetor are selected.

  filter(carb > 1) %>%
  
# Use the mutate command to change the American units to European ones,
# give the variables new names.

  mutate(fuel_cons_eu = 235.2 / mpg,
         horsepower_eu = hp / 0.9863,
         mass_eu = wt * 453.6,
         gear = am) %>%

# With the Select command, only the new variables we created are
# selected and thus preserved, all other variables will be destroyed.

  select(fuel_cons_eu,
         horsepower_eu,
         mass_eu,
         gear)


Now view your data by typing in the console:

View(car_dataset5)

Now the “biomarkers” of cars look more familiar to a European eye (shown below are the first 4 lines):

fuel_cons_eu horsepower_eu mass_eu gear
Mazda RX4 11.20000 111.5279 1188.432 1
Mazda RX4 Wag 11.20000 111.5279 1304.100 1
Hornet Sportabout 12.57754 177.4308 1560.384 0
Duster 360 16.44755 248.4031 1619.352 0

R guide by Ville Langén is licensed under Attribution-ShareAlike 4.0 International