Learning Objectives

Vector Types

Atomic Vectors

  • Four basic types:

    • Logical: Either TRUE or FALSE
    • Integer:
      • Whole numbers. Write one by appending L to the number (for “long integer”).
      • -1L, 0L, 1L, 2L, 3L, etc…
    • Double:
      • Decimal numbers.
      • 1, 1.0, 1.01, etc…
      • Inf, -Inf, and NaN are also doubles.
    • Character:
      • Anything in quotes:
      • "1", "one", "1 won one", etc…
  • You create vectors with c() for “combine”

    x <- c(TRUE, TRUE, FALSE, TRUE) ## logical
    x <- c(1L, 1L, 0L, 1L) ## integer
    x <- c(1, 1, 0, 1) ## double
    x <- c("1", "1", "0", "1") ## character
  • There are no scalars in R. A “scalar” is just a vector of length 1.

    is.vector(TRUE)
    ## [1] TRUE
  • Integers and doubles are together called “numerics”

    [Figure: atomic vector types]

  • You can determine the type with typeof().

    x <- c(TRUE, FALSE)
    typeof(x)
    ## [1] "logical"
    x <- c(0L, 1L)
    typeof(x)
    ## [1] "integer"
    x <- c(0, 1)
    typeof(x)
    ## [1] "double"
    x <- c("0", "1")
    typeof(x)
    ## [1] "character"
  • The special values Inf, -Inf, and NaN are doubles.

    typeof(c(Inf, -Inf, NaN))
    ## [1] "double"
  • Determine the length of a vector using length()

    length(x)
    ## [1] 2
  • Missing values are represented by NA.

  • NA is technically a logical value.

    typeof(NA)
    ## [1] "logical"
    • This rarely matters because logicals get coerced to other types when needed.

      typeof(c(1L, NA))
      ## [1] "integer"
      typeof(c(1, NA))
      ## [1] "double"
      typeof(c("1", NA))
      ## [1] "character"
    • But if you need missing values of other types, you can use

      NA_integer_ ## integer NA
      NA_real_ ## double NA
      NA_character_ ## character NA
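    • This typically shows up in dplyr::if_else(), which in dplyr versions before 1.1.0 required the return values to all be of the same type. (Since dplyr 1.1.0, if_else() finds a common type, so a plain NA also works.)

      dplyr::if_else(c(TRUE, FALSE), 1, NA) ## errors in dplyr < 1.1.0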
      ## Error in `dplyr::if_else()`:
      ## ! `false` must be a double vector, not a logical vector.
      dplyr::if_else(c(TRUE, FALSE), 1, NA_real_) ## works fine
      ## [1]  1 NA
    • Never use == to test for missingness. Any comparison with NA returns NA, since it is unknown whether two unknown values are equal. Use is.na() instead.

      x <- c(NA, 1)
      x == NA
      ## [1] NA NA
      is.na(x)
      ## [1]  TRUE FALSE
  • You can check the type with is.logical(), is.integer(), is.double(), and is.character().

    is.logical(TRUE)
    ## [1] TRUE
    is.integer(1L)
    ## [1] TRUE
    is.double(1)
    ## [1] TRUE
    is.character("1")
    ## [1] TRUE
  • Attempting to combine vectors of different types coerces them to the same type. The order of preference is character > double > integer > logical.

    typeof(c(1L, TRUE))
    ## [1] "integer"
    typeof(c(1, 1L))
    ## [1] "double"
    typeof(c("1", 1))
    ## [1] "character"
  • Exercise (from Advanced R): Predict the output:

    c(1, FALSE)
    c("a", 1)
    c(TRUE, 1L)
  • Exercise (from Advanced R): Explain these results:

    1 == "1"
    ## [1] TRUE
    -1 < FALSE
    ## [1] TRUE
    "one" < 2
    ## [1] FALSE
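
As a recap of the coercion and missing-value rules above, here is a minimal sketch. The vectors `flags` and `scores` are made up for illustration and do not come from the notes.

```r
# Hypothetical example data (not from the notes)
flags <- c(TRUE, FALSE, TRUE)

# In arithmetic, logicals are coerced to numbers: TRUE is 1, FALSE is 0
sum(flags)
## [1] 2

# Combining a logical vector with a double coerces everything to double
typeof(c(flags, 2.5))
## [1] "double"

# A plain NA is logical, but it is coerced to match the rest of the vector
scores <- c(10, NA, 30)
typeof(scores)
## [1] "double"

# Test for missingness with is.na(), never with ==
is.na(scores)
## [1] FALSE  TRUE FALSE
sum(is.na(scores))   # number of missing values
## [1] 1

# Typed missing values already have the type you ask for
typeof(NA_character_)
## [1] "character"
```

Counting with sum(is.na(...)) works only because of the logical-to-number coercion shown at the top of the block.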

Attributes

Names

  • Names are a character vector the same length as the atomic vector. Each name corresponds to a single element.

  • You could set names using attr(), but you should not.

    x <- 1:3
    attr(x, "names") <- c("a", "b", "c")
    attributes(x)
    ## $names
    ## [1] "a" "b" "c"
  • Names are special enough that there are dedicated ways to create and view them.

    x <- c(a = 1, b = 2, c = 3)
    names(x)
    ## [1] "a" "b" "c"
    x <- 1:3
    names(x) <- c("a", "b", "c")
    names(x)
    ## [1] "a" "b" "c"
  • The proper way to think about names is like this:

    [Figure: names (correct representation)]

    But each name corresponds to a specific element, so Hadley does it like this:

    [Figure: names (intuitive representation)]

  • Names are kept by single-bracket subsetting but dropped by double-bracket subsetting.

    names(x[1])
    ## [1] "a"
    names(x[1:2])
    ## [1] "a" "b"
    names(x[[1]])
    ## NULL
  • Names can be used for subsetting (more in Chapter 4)

    x[["a"]]
    ## [1] 1
  • You can remove names with unname().

    unname(x)
    ## [1] 1 2 3
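
The points above about names can be pulled together in one short sketch. The vector `heights` and its values are hypothetical, chosen only for illustration.

```r
# Hypothetical named vector (not from the notes)
heights <- c(ann = 160, bob = 175, cat = 168)

# Single brackets accept a character vector and keep the names
picked <- heights[c("cat", "ann")]
names(picked)
## [1] "cat" "ann"

# Double brackets extract one element and drop its name
heights[["bob"]]
## [1] 175

# names() can also be used on the left-hand side to rename one element
names(heights)[2] <- "rob"
names(heights)
## [1] "ann" "rob" "cat"

# unname() returns a copy with no names attribute; heights is unchanged
attributes(unname(heights))
## NULL
```

Because unname() returns a modified copy, you must assign the result back (heights <- unname(heights)) if you want to drop the names permanently.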

Dimensions

  • The dim attribute makes a vector into a matrix (a rectangle of numbers) or an array (a block of numbers).

  • Again, you could use attr() to set dim, but you should not.

    x <- 1:6
    attr(x, "dim") <- c(2, 3)
    x
    ##      [,1] [,2] [,3]
    ## [1,]    1    3    5
    ## [2,]    2    4    6
    x <- 1:12
    attr(x, "dim") <- c(2, 2, 3)
    x
    ## , , 1
    ## 
    ##      [,1] [,2]
    ## [1,]    1    3
    ## [2,]    2    4
    ## 
    ## , , 2
    ## 
    ##      [,1] [,2]
    ## [1,]    5    7
    ## [2,]    6    8
    ## 
    ## , , 3
    ## 
    ##      [,1] [,2]
    ## [1,]    9   11
    ## [2,]   10   12
  • You should either use matrix() or array() to create these objects, or set the dimension with dim().

    x <- 1:6
    dim(x) <- c(2, 3)
    dim(x)
    ## [1] 2 3
    x <- matrix(1:6, nrow = 2, ncol = 3)
    dim(x)
    ## [1] 2 3
    y <- 1:12
    dim(y) <- c(2, 2, 3)
    dim(y)
    ## [1] 2 2 3
    y <- array(1:12, dim = c(2, 2, 3))
    dim(y)
    ## [1] 2 2 3
  • length() still works on matrices and arrays, but is less useful

    length(x)
    ## [1] 6
    length(y)
    ## [1] 12
  • Instead, use nrow() and ncol() (for matrices), or dim() (for arrays).

    nrow(x)
    ## [1] 2
    ncol(x)
    ## [1] 3
    dim(y)
    ## [1] 2 2 3
  • Instead of names, arrays and matrices have dimnames. The dimnames of an array is a list with one element per dimension of the array.

    x <- array(1:12, dim = c(2, 2, 3))
    dimnames(x) <- list(first = c("a", "b"),
                        second = c("c", "d"),
                        third = c("e", "f", "g"))
    dimnames(x)
    ## $first
    ## [1] "a" "b"
    ## 
    ## $second
    ## [1] "c" "d"
    ## 
    ## $third
    ## [1] "e" "f" "g"
    x
    ## , , third = e
    ## 
    ##      second
    ## first c d
    ##     a 1 3
    ##     b 2 4
    ## 
    ## , , third = f
    ## 
    ##      second
    ## first c d
    ##     a 5 7
    ##     b 6 8
    ## 
    ## , , third = g
    ## 
    ##      second
    ## first  c  d
    ##     a  9 11
    ##     b 10 12

    This is useful for subsetting, and for bookkeeping when your data are stored in a complicated multidimensional array (e.g. without dimnames it is hard to remember what the first, second, and third dimensions each index).

    x["a", "c", "g"]
    ## [1] 9
  • A vector is not a matrix with 1 dimension. It has NULL dimensions.

    z <- c(1, 2, 3)
    dim(z)
    ## NULL
  • Exercise: What’s the difference between ncol() and NCOL()? Read the help file and demonstrate some code where they provide different results.

  • Exercise (from Advanced R): How would you describe the following three objects? What makes them different from 1:5?

    x1 <- array(1:5, c(1, 1, 5))
    x2 <- array(1:5, c(1, 5, 1))
    x3 <- array(1:5, c(5, 1, 1))
  • Exercise: How do you get rid of the dimensions in the following array?

    x <- array(1:12, dim = c(2, 2, 3))
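
To tie together the points above on dim and dimnames, here is a minimal sketch using a small matrix. The object `m` and its row and column labels are hypothetical, chosen only for illustration.

```r
# Hypothetical 2 x 3 matrix with dimnames set at creation (not from the notes)
m <- matrix(1:6, nrow = 2, ncol = 3,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))

# Values fill column by column, so the third column holds 5 and 6
m[2, 3]
## [1] 6

# The same element, selected by dimnames instead of by position
m["r2", "c3"]
## [1] 6

# A matrix is still an atomic vector underneath, with dim and dimnames
# stored as attributes
typeof(m)
## [1] "integer"
c("dim", "dimnames") %in% names(attributes(m))
## [1] TRUE TRUE
```

Selecting by dimnames gives the same element as selecting by position, but the code stays readable when you no longer remember which index is which.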

S3 Atomic Vectors

Creating Empty Vectors

Lists

Data Frames

NULL

New Functions


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.