Data Transformation I

Week 09, Fall 2024

Summary

This week we will dive into data visualization and transformation. In doing so, we will move from using strictly Base R to supplementing Base R with the tidyverse.
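
As a rough illustration of what "supplementing Base R with the tidyverse" can look like, the sketch below does the same row and column selection twice: once with Base R indexing and once with dplyr verbs. The built-in mtcars dataset and the cutoff of 25 are assumptions chosen for illustration and are not taken from the course materials. The native pipe `|>` assumes R 4.1 or later.

```r
library(dplyr)

# Base R: keep rows with mpg above 25 and only the mpg and cyl columns
base_result <- mtcars[mtcars$mpg > 25, c("mpg", "cyl")]

# tidyverse (dplyr): the same selection, written as a pipeline of verbs
tidy_result <- mtcars |>
  filter(mpg > 25) |>
  select(mpg, cyl)

# Both approaches should pick out the same values
all.equal(base_result$mpg, tidy_result$mpg)
```

Neither version is more correct than the other. The dplyr version tends to read as a sequence of steps, which is the style this week's readings use.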

Learning Objectives

After completing this week, you are expected to be able to:

  • Understand the difference between a tibble and a data.frame.
  • Visualize data using ggplot2.
  • Transform data using dplyr, specifically using the single table verbs:
    • select to pick columns (variables) based on their names
    • filter to pick rows (observations) based on their values
    • mutate to add new columns using functions of existing variables
    • summarize to create single number statistical summaries of columns
    • arrange to change the ordering of rows
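
The objectives above can be sketched in a few lines of R. This is a minimal illustration, not the course's own example: it assumes the built-in mtcars dataset, and the object names (cars_df, cars_tbl, heavy_cars, overall, p) and the weight cutoff of 3 are hypothetical choices made only for this sketch.

```r
library(tibble)
library(dplyr)
library(ggplot2)

# --- tibble versus data.frame ---
cars_df  <- mtcars                                  # a plain data.frame
cars_tbl <- as_tibble(mtcars, rownames = "model")   # row names become a column

# Selecting one column with [ , ] drops a data.frame to a vector,
# while a tibble stays a tibble.
class(cars_df[, "mpg"])    # "numeric"
class(cars_tbl[, "mpg"])   # "tbl_df" "tbl" "data.frame"

# --- dplyr single table verbs ---
heavy_cars <- cars_tbl |>
  select(model, mpg, cyl, wt) |>   # pick columns by name
  filter(wt > 3) |>                # pick rows by value (wt is in 1000 lbs)
  mutate(wt_lbs = wt * 1000) |>    # add a column computed from an existing one
  arrange(desc(mpg))               # reorder rows, highest mpg first

# summarize collapses the table to a single row of summary statistics
overall <- cars_tbl |>
  summarize(mean_mpg = mean(mpg), n_cars = n())

# --- ggplot2 ---
# A scatterplot of fuel economy against weight
p <- ggplot(cars_tbl, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
p
```

Printing heavy_cars and overall at the console shows the tibble print method, which lists column types and only the first rows. That is another visible difference from a data.frame.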

Reading

Link                            Source
Introduction                    R4DS
Data Visualization              R4DS
Workflow: Basics                R4DS
Data Transformation             R4DS
Workflow: Code Style            R4DS
Data Tidying                    R4DS
Workflow: Scripts and Projects  R4DS
Data Import                     R4DS
Workflow: Getting Help          R4DS

Additional Reading

Link                Source
tibble Vignette     tibble Documentation
Tidy Data Vignette  tidyr Documentation

Cheatsheets

Link     Source
ggplot2  Posit Cheatsheets
dplyr    Posit Cheatsheets
readr    Posit Cheatsheets
tidyr    Posit Cheatsheets

Data

Video

Title                                  Link           Mirror
9.1 - Welcome to Week 09               9.1 - YouTube  9.1 - ClassTranscribe
9.2 - Data and Tibbles                 9.2 - YouTube  9.2 - ClassTranscribe
9.3 - Data Visualization with ggplot2  9.3 - YouTube  9.3 - ClassTranscribe
9.4 - Data Manipulation with dplyr     9.4 - YouTube  9.4 - ClassTranscribe
9.5 - Lab 06                           9.5 - YouTube  9.5 - ClassTranscribe

Assignments

Assignment  Deadline              Credit
Lab 05      Thursday, October 24  100%
Quiz 05     Thursday, October 24  105%
Lab 06      Thursday, October 31  100%
Quiz 06     Thursday, October 31  105%

Office Hours

See Syllabus!