Data types and classes

Lecture 7

Dr. Benjamin Soltoff

Cornell University
INFO 2951 - Spring 2026

February 10, 2026

Announcements


  • Homework 02 due tomorrow
  • Complete team project preference survey

Written content outside code chunks

```{r}
#| label: plot-penguins-visible
#| echo: false

ggplot(penguins, 
       aes(x = flipper_len, y = bill_len)) +
  geom_point(aes(color = species, shape = species)) +
  scale_color_manual(values = c("darkorange","purple","cyan4")) +
  labs(
    x = "Flipper length (mm)", y = "Bill length (mm)",
    color = "Penguin species", shape = "Penguin species"
  ) +
  theme_minimal()
```

Generally speaking, larger flipper lengths correspond to larger bill lengths.
However, Gentoo and Chinstrap penguins tend to have longer bills than Adélie
penguins, and Gentoo penguins also have longer flippers than either of the other species.


Written content inside code chunks

```{r}
#| label: plot-penguins-hidden
#| echo: false

ggplot(penguins, 
       aes(x = flipper_len, y = bill_len)) +
  geom_point(aes(color = species, shape = species)) +
  scale_color_manual(values = c("darkorange","purple","cyan4")) +
  labs(
    x = "Flipper length (mm)", y = "Bill length (mm)",
    color = "Penguin species", shape = "Penguin species"
  ) +
  theme_minimal()

# Generally speaking, larger flipper lengths correspond to larger bill lengths.
# However, Gentoo and Chinstrap penguins tend to have longer bills than Adélie
# penguins, and Gentoo penguins also have longer flippers than either of the other species.
```

Learning objectives

  • Define vector objects in R
  • Distinguish between data types and classes
  • Review factors and date classes
  • Practice coercing vectors into appropriate classes

Vector basics

Types of vectors

Vectors are the basic building block for storing data in R.

  • Atomic vectors
  • Lists
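
As a minimal sketch of the difference (the values and names below are made up for illustration): an atomic vector holds elements of a single type, while a list can hold elements of different types and lengths.

```r
# Atomic vector: every element has the same type
heights <- c(1.5, 2.25, 3)
typeof(heights)
is.atomic(heights)

# List: elements may differ in type and length
record <- list(name = "Adelie", counts = c(10L, 12L), tagged = TRUE)
typeof(record)
is.atomic(record)

# Mixing types inside c() does not create a list;
# R coerces everything to one common type instead
mixed <- c(1, "a", TRUE)
typeof(mixed)
```

Note that `c()` never mixes types silently into a list: here every element ends up stored as a character string.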


Types and classes


  • Type is how an object is stored in memory, e.g.,
    • double: a real number stored in double-precision floating-point format
    • integer: a whole number (positive, negative, or zero)
  • Class is metadata about the object that can determine how common functions operate on that object, e.g.,
    • factor
    • Date
    • date-time (POSIXct)
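
One way to see the distinction is to compare the same integer codes with and without a class attached. This is only an illustrative sketch; the object names `codes` and `f` are made up.

```r
# Plain integer vector: type is integer, no class attribute is set
codes <- c(1L, 2L, 2L, 1L)
typeof(codes)
class(codes)
summary(codes)   # numeric summary (min, quartiles, mean, max)

# Same storage type, but now with the "factor" class and labels attached
f <- factor(codes, labels = c("a", "b"))
typeof(f)
class(f)
summary(f)       # counts per level instead of a numeric summary
```

Both objects are stored as integers, yet `summary()` behaves differently because it dispatches on the class.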

Types of vectors

You’ll commonly encounter:

  • logical
  • integer
  • double
  • character

You’ll less commonly encounter:

  • list
  • NULL
  • complex
  • raw

We don’t typically think of them this way, but a data frame is a list in which every element is a vector and all of the vectors have the same length.
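
A small sketch of this idea, using a made-up data frame (the column names and values are illustrative only):

```r
# Hypothetical data frame with two columns
df <- data.frame(
  species = c("Adelie", "Gentoo", "Chinstrap"),
  bill_len = c(39.1, 47.5, 48.8)
)

typeof(df)        # the underlying storage type
class(df)         # the class that makes it print and subset like a table
is.list(df)

lengths(df)       # every column (list element) has the same length

# Columns can be extracted the same way as elements of any list
df[["bill_len"]]
```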

Coercing vectors into different types

Base R functions

  • as.logical()
  • as.integer()
  • as.double()
  • as.character()

{readr} functions

  • parse_logical()
  • parse_integer()
  • parse_double()
  • parse_character()

Coercing vectors into different types

library(readr)

x <- c("3", "5", "alpha")

as.double(x)
Warning: NAs introduced by coercion
[1]  3  5 NA

parse_double(x)
Warning: 1 parsing failure.
row col expected actual
  3  -- a double  alpha
[1]  3  5 NA
attr(,"problems")
# A tibble: 1 × 4
    row   col expected actual
  <int> <int> <chr>    <chr> 
1     3    NA a double alpha 

y <- c("$23", "$17.67", "$123,000")

as.numeric(y)
[1] NA NA NA

parse_number(y)
[1]     23.00     17.67 123000.00
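
The other base R coercion functions follow similar rules. The sketch below uses made-up inputs and only base R, so it does not depend on {readr}.

```r
# Doubles are truncated toward zero, not rounded
ints <- as.integer(c(2.9, -2.9))
ints

# Only recognized spellings are converted to logical; anything else becomes NA
lgl <- as.logical(c("TRUE", "false", "T", "yes"))
lgl

# Logical values become 1 (TRUE) and 0 (FALSE), which makes counting easy
flags <- c(TRUE, FALSE, TRUE)
as.integer(flags)
sum(flags)

# Missing values stay missing after coercion
chr <- as.character(c(1.5, 2, NA))
chr
```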

Factors

A factor is a vector that can contain only predefined values. It is used to store categorical data.

x <- factor(c("a", "b", "b", "a"))

x
[1] a b b a
Levels: a b

typeof(x)
[1] "integer"

attributes(x)
$levels
[1] "a" "b"

$class
[1] "factor"
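
Because a factor is stored as integer codes, coercing it needs some care. The following is a sketch with hypothetical data (the `years` and `sizes` vectors are made up for illustration):

```r
# Hypothetical case: years that were stored as a factor
years <- factor(c("2024", "2026", "2025", "2026"))
levels(years)

# as.integer() returns the underlying codes, not the years themselves
as.integer(years)

# Convert to character first to recover the original values
as.integer(as.character(years))

# Setting levels explicitly controls their order...
sizes <- factor(
  c("small", "large", "medium"),
  levels = c("small", "medium", "large")
)
levels(sizes)

# ...and any value that is not a predefined level becomes NA
checked <- factor(c("small", "huge"), levels = c("small", "medium", "large"))
checked
```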

Other classes

Just a couple of examples…

Date:

today <- Sys.Date()

today
[1] "2026-02-10"

typeof(today)
[1] "double"

attributes(today)
$class
[1] "Date"

Date-time:

now <- Sys.time()

now
[1] "2026-02-10 08:50:15 EST"

typeof(now)
[1] "double"

attributes(now)
$class
[1] "POSIXct" "POSIXt" 
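
Since both classes are stored as doubles, removing the class reveals the underlying number, and arithmetic works on that number. This is a base R sketch with fixed example values; the time zone is set to UTC here as an assumption so the result does not depend on the local machine.

```r
# Coerce a character string into a Date
d <- as.Date("2026-02-10")

unclass(d)                    # number of days since 1970-01-01
d + 7                         # one week later, still a Date
gap <- as.Date("2026-03-01") - d
gap                           # a difftime measured in days

# Date-times are stored as seconds since 1970-01-01 00:00:00 UTC
dt <- as.POSIXct("2026-02-10 08:50:15", tz = "UTC")
as.numeric(dt)
```

The {lubridate} package offers friendlier helpers for the same tasks.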

Application exercise

ae-05

Instructions

  • Go to the course GitHub org and find your ae-05 (repo name will be suffixed with your GitHub name).
  • Clone the repo in Positron, run renv::restore() to install the required packages, open the Quarto document in the repo, and follow along and complete the exercises.
  • Render, commit, and push your edits by the AE deadline (end of the day).

Wrap up

Recap

  • Vectors have types and classes. In general, we don’t need to be too concerned with this distinction, but make sure you use an appropriate type/class for each variable (e.g., don’t store year as a character type).
  • {forcats} is a powerful package for working with factors
  • Check out the {tidyverse} and other packages for working with different classes of vectors
    • {lubridate} for date and date-time objects
    • {forecast} and {zoo} for time-series objects
    • {stringr} for character strings
    • {sf} for spatial objects

Acknowledgments