Importing and recoding data

Lecture 9

Dr. Benjamin Soltoff

Cornell University
INFO 2951 - Spring 2025

February 20, 2025

Announcements

  • Lab 03
  • Homework 03
  • Project proposal

Update your renv.lock file

  1. To install a new package for the project, use renv::install() or install.packages().
  2. Run renv::snapshot(type = "all"). This will update renv.lock to include all packages used in the project library.
  3. Stage, commit, and push the renv.lock file.
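Steps 1 and 2 above can be sketched as a small helper. This is a minimal sketch, assuming the renv package is installed and the working directory is an renv-managed project; the helper name update_lockfile() and the package name in the commented-out call are hypothetical, chosen only for illustration.

```r
# Hypothetical helper wrapping steps 1 and 2; assumes an renv project is active.
update_lockfile <- function(pkg) {
  # Step 1: install the package into the project library
  renv::install(pkg)
  # Step 2: record every package in the project library in renv.lock
  renv::snapshot(type = "all")
  # Report whether the library and the lockfile are now in sync
  invisible(renv::status())
}

# Example call (hypothetical package name), left commented out so that
# sourcing this file does not install anything:
# update_lockfile("janitor")
```

Step 3 still happens in Git: stage, commit, and push renv.lock after the snapshot completes.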

Data “wrangling”

A screenshot of a New York Times article.

A screenshot of 'Data Carpentry' by David Mimno.

Reading data into R

  • Local data files
  • Databases
  • Web scraping
  • Application programming interfaces (APIs)

Reading rectangular data

Application exercise

Powerball Lottery

ae-07

Instructions

  • Go to the course GitHub org and find your ae-07 (repo name will be suffixed with your GitHub name).
  • Clone the repo in RStudio, run renv::restore() to install the required packages, open the Quarto document in the repo, and follow along and complete the exercises.
  • Render, commit, and push your edits by the AE deadline (end of the day).

Wrap up

Recap

  • Simplify your life – get the data in as simple a format as possible
  • Examine the file’s structure before attempting to import into R. Use the RStudio interactive menu as necessary.
  • Ensure all data cleaning is reproducible. Do not replace your raw data files.
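The recap points can be illustrated with a short sketch: inspect the raw file, import it with explicit column types, recode in code, and write the result to a new file so the raw data stays untouched. The file names, column names, values, and the -99 missing-value code are all invented for illustration. Base R's read.csv() is used so the sketch runs without extra packages; readr::read_csv() is a common alternative.

```r
# Hypothetical raw file, written to a temporary directory so the sketch is
# self-contained. All values are made up for illustration.
raw_path <- file.path(tempdir(), "draws_raw.csv")
writeLines(
  c("draw_date,number,prize",
    "2025-02-01,12,NA",
    "2025-02-05,-99,1000",
    "2025-02-08,45,250"),
  raw_path
)

# Examine the file's structure before importing: header row, delimiter, NA codes
readLines(raw_path, n = 3)

# Import with explicit column types rather than relying on guesses
draws <- read.csv(
  raw_path,
  colClasses = c("character", "integer", "numeric"),
  na.strings = c("NA", "")
)

# Recode in code, not by hand: treat the assumed sentinel -99 as missing
draws$number[which(draws$number == -99)] <- NA_integer_

# Convert the date column from text to Date
draws$draw_date <- as.Date(draws$draw_date)

# Write the cleaned data to a separate file; the raw file is never overwritten
clean_path <- file.path(tempdir(), "draws_clean.csv")
write.csv(draws, clean_path, row.names = FALSE)
```

Because every cleaning step lives in the script, rerunning it against the raw file reproduces the cleaned file exactly.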

Jazz or Not Jazz?