Skip to content

R Objects

In R, every piece of data is stored as an object. These objects come in different forms depending on the structure and type of the data. Understanding these forms (or "classes") is essential because different functions require or produce specific object types.

Common Object Types

1. Vectors

Vectors are the simplest R objects. They’re one-dimensional arrays that hold elements of the same type (e.g., all numbers, all characters, or all logical values).

  • Examples:
    numeric_vector <- c(1, 2, 3)
    character_vector <- c("apple", "banana", "cherry")
    logical_vector <- c(TRUE, FALSE, TRUE)
    
  • Use Case:
    Vectors form the building blocks of more complex data structures.

2. Matrices

Matrices are two-dimensional arrays with rows and columns where every element is of the same type.

  • Examples:
    my_matrix <- matrix(1:9, nrow = 3)  # 3 rows, 3 columns
    
  • Use Case:
    Ideal for mathematical computations where the data is homogeneous (all numbers, for instance).

3. Data Frames

Data frames are like spreadsheets or tables where each column can be a different type (numeric, character, factor, etc.).

  • Examples:
    my_df <- data.frame(
      Name = c("Alice", "Bob", "Charlie"),
      Age = c(25, 30, 35),
      Married = c(TRUE, FALSE, TRUE)
    )
    
  • Use Case:
    Widely used for storing datasets in statistics and data analysis. They allow you to mix different data types by column.

4. Lists

Lists are very flexible containers that can hold elements of different types and even other lists. Each element of a list can be a vector, matrix, data frame, or another list.

  • Examples:
    my_list <- list(
      numbers = 1:5,
      letters = c("a", "b", "c"),
      data = data.frame(x = 1:3, y = c("A", "B", "C"))
    )
    
  • Use Case:
    Great for bundling diverse outputs, such as results from a complex model.

  • Compare with Data Frames:

    1. Structure: Data frames are inherently 2-dimensional (like tables), whereas lists can be one-dimensional collections of arbitrary objects.
    2. Uniformity: Data frames require each column to be of equal length, but lists can contain elements of varying lengths and types.
    3. Usage: Use data frames when you need a structured table for analysis. Use lists when you need a flexible container for storing assorted objects.

5. Factors

Factors are used for categorical data. They store data as a set of discrete levels, which is useful for statistical modeling and plotting.

  • Examples:
    gender <- factor(c("male", "female", "female", "male"))
    
  • Use Case:
    Helps R understand that a variable is categorical, which influences how functions like modeling or plotting treat the variable.

Conversion Functions: The "as." Family

R provides a variety of functions that start with as. to convert or "coerce" objects from one type to another. Here are some key ones:

  • as.vector(x)
    Ensures that x is a vector.

  • as.matrix(x)
    Converts an object (like a data frame or vector) into a matrix. This is useful when you need a 2D, homogeneous data structure.

  • as.data.frame(x)
    Converts an object into a data frame, which is handy when you need tabular data with potentially mixed data types.

  • as.list(x)
    Converts x into a list.

  • as.numeric(x)
    Converts an object to a numeric vector. This is especially important when numbers are stored as characters.

  • as.character(x)
    Converts an object to a character vector.

  • as.logical(x)
    Converts data to logical values (TRUE/FALSE).

  • as.factor(x)
    Converts a vector into a factor for categorical data.

These functions help ensure that your data is in the correct format for analysis. For example, when using modeling functions like glmnet, you might need to convert your predictors to a matrix with as.matrix().