INFO 2951: Introduction to Data Science with R

Modified

April 17, 2026

This page outlines the topics, content, and assignments for the semester. The schedule will be updated as the semester progresses, so the timeline of topics and assignments may change.

WEEK DATE TOPIC PREPARE MATERIALS DUE
1 Tue, Jan 20 Welcome to INFO 2951 👩‍💻 Login to Cornell’s GitHub server
📽️ slides 01
⌨️ ae 00 - UN votes


Thu, Jan 22 Grammar of graphics 📗 r4ds - intro
📗 r4ds - ch 1.1-.3, 1.7
📘 ims - ch 1
📽️ slides 02
⌨️ ae 00 - WDI
✅ ae 00 - WDI


Fri, Jan 23 Hello data science!
⌨️ hw 00
2 Tue, Jan 27 Visualizing various types of data 📘 ims - ch 4
📘 ims - ch 5
📽️ slides 03
⌨️ ae 01
✅ ae 01


Wed, Jan 28

✅ hw 00 HW 00 at 11:59pm

Thu, Jan 29 Grammar of data wrangling 📗 r4ds - ch 3 📽️ slides 04
⌨️ ae 02
✅ ae 02


Fri, Jan 30 Data visualization
⌨️ hw 01
3 Tue, Feb 3 Working with relational data 📗 r4ds - ch 19 📽️ slides 05
⌨️ ae 03
✅ ae 03


Wed, Feb 4

✅ hw 01 HW 01 at 11:59pm

Thu, Feb 5 Tidying data 📗 r4ds - ch 5 📽️ slides 06
⌨️ ae 04
✅ ae 04


Fri, Feb 6 Data wrangling 👩‍💻 Complete team preference survey for team project ⌨️ hw 02
4 Tue, Feb 10 Data types and classes 📗 r4ds - ch 4
📗 r4ds - ch 12 (read 12.1-12.2, skim the rest)
📗 r4ds - ch 14.1-.3
📗 r4ds - ch 16
📽️ slides 07
⌨️ ae 05
✅ ae 05


Wed, Feb 11

✅ hw 02 HW 02 at 11:59pm

Thu, Feb 12 Importing and recoding data 📗 r4ds - ch 7
📗 r4ds - ch 17.1-17.3
📽️ slides 08
⌨️ ae 06
✅ ae 06


Fri, Feb 13 Git workflows (basics + merge conflicts) 📄 Happy Git with R - ch 1 ⌨️ Collaborating with Git
5 Tue, Feb 17 No class (February Break)



Thu, Feb 19 Databases + SQL 📗 r4ds - ch 21 📽️ slides 09
⌨️ ae 07
✅ ae 07


Fri, Feb 20 What makes for a good data science project? Read these articles to discuss in class
📄 Are Pop Lyrics Getting More Repetitive?
📄 The Hidden Cost of Digital Consumption
⌨️ hw 03
6 Tue, Feb 24 Getting data from the web: Scraping 📗 r4ds - ch 24
👩‍💻 Install SelectorGadget
📽️ slides 10
⌨️ ae 08
✅ ae 08


Wed, Feb 25

✅ hw 03 HW 03 at 11:59pm

Thu, Feb 26 Functions 📗 r4ds - ch 25.2-.3 📽️ slides 11
⌨️ ae 09
✅ ae 09


Fri, Feb 27 Quiz 01
📜 quiz 01
⌨️ hw 04

7 Tue, Mar 3 Iteration 📗 r4ds - ch 26
📗 r4ds - ch 27 (skim for familiarity with base R syntax)
📽️ slides 12
⌨️ ae 10
✅ ae 10


Wed, Mar 4

✅ hw 04 HW 04 at 11:59pm

Thu, Mar 5 Getting data from the web: APIs 📄 Application Programming Interface
📄 Obtaining World Bank indicators
📄 Securely storing API keys
📽️ slides 13
⌨️ ae 11
✅ ae 11


Fri, Mar 6 Refine project proposals
⌨️ hw 05
8 Tue, Mar 10 Rectangling data 📗 r4ds - ch 23 📽️ slides 14
⌨️ ae 12
✅ ae 12
Project proposal at 11:59pm

Wed, Mar 11

✅ hw 05 HW 05 at 11:59pm

Thu, Mar 12 Reproducible reporting with Quarto 📗 r4ds - ch 28 📽️ slides 15
⌨️ ae 13
✅ ae 13


Fri, Mar 13 Project EDA work session


9 Tue, Mar 17 Hypothesis testing with randomization 📘 ims - ch 11
📘 ims - ch 14
📽️ slides 16
⌨️ ae 14
✅ ae 14


Wed, Mar 18




Thu, Mar 19 Quantifying uncertainty with the bootstrap 📘 ims - ch 12 📽️ slides 17
⌨️ ae 15
✅ ae 15


Fri, Mar 20 Quiz 02
📜 quiz 02
⌨️ hw 06

10 Tue, Mar 24 Linear regression with a single predictor 📘 ims - ch 8
📘 ims - ch 24
📽️ slides 18
⌨️ ae 16
✅ ae 16
Project EDA at 11:59pm

Wed, Mar 25

✅ hw 06 HW 06 at 11:59pm

Thu, Mar 26 Linear regression with multiple predictors 📘 ims - ch 9
📘 ims - ch 25
📽️ slides 19
⌨️ ae 17
✅ ae 17


Fri, Mar 27 No class


11 Tue, Mar 31 No class (Spring Break)



Thu, Apr 2 No class (Spring Break)



Fri, Apr 3 No class (Spring Break)


12 Tue, Apr 7 Models for discrete outcomes 📘 ims - ch 9 📽️ slides 20
⌨️ ae 18
✅ ae 18


Wed, Apr 8




Thu, Apr 9 Introduction to machine learning 📕 tmwr - ch 4-6, 10 📽️ slides 21
⌨️ ae 19
✅ ae 19


Fri, Apr 10 Refine project hypotheses
⌨️ hw 07
13 Tue, Apr 14 Build better training data 📕 tmwr - ch 7-9 📽️ slides 22
⌨️ ae 20
✅ ae 20
Project preregistration at 11:59pm

Wed, Apr 15

✅ hw 07 HW 07 at 11:59pm

Thu, Apr 16 Tree-based inference and hyperparameter optimization 📕 tmwr - ch 12-14 📽️ slides 23
⌨️ ae 21
✅ ae 21


Fri, Apr 17 No class
📜 quiz 03
⌨️ hw 08

14 Tue, Apr 21 An introduction to LLMs

Project draft at 11:59pm

Wed, Apr 22


HW 08 at 11:59pm

Thu, Apr 23 Programming with LLMs



Fri, Apr 24 Quiz 03


15 Tue, Apr 28 Prompt engineering and augmented generation



Thu, Apr 30 Tool calling



Fri, May 1 Project presentations

Project presentations in class
16 Tue, May 5 Wrap-up: Where to go from here

Project report + reproducibility at 11:59pm
Extra credit at 11:59pm

Wed, May 13 Final exam

Final exam at 2pm (location TBD)