Learning Objectives

Components

A function has three parts:

  1. The formals(), the list of arguments that control how you call the function.
  2. The body(), the code inside the function.
  3. The function environment() (sometimes called “enclosing environment”), the data structure that determines how the function finds the values associated with the names.
square <- function(x) {
  x^2
}
formals(square)
## $x
body(square)
## {
##     x^2
## }
environment(square)
## <environment: R_GlobalEnv>
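
As a further illustration, the same three accessors can be applied to a function that has a default argument. This is only a minimal sketch: scale_by() and its arguments are hypothetical names introduced here, not part of the original notes.

```r
# Hypothetical function with one required and one defaulted argument
scale_by <- function(x, factor = 2) {
  x * factor
}

# formals() returns a named list with one entry per argument
arg_names <- names(formals(scale_by))
arg_names
## [1] "x"      "factor"

# Default values are stored in the formals
default_factor <- formals(scale_by)$factor
default_factor
## [1] 2

# body() returns the unevaluated code as a language object
body_class <- class(body(scale_by))
body_class
## [1] "{"

# The function environment is the environment in which the function was created
same_env <- identical(environment(scale_by), environment())
same_env
## [1] TRUE
```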

First class functions

Lexical Scoping

Name Masking

  • Names inside a function mask names defined outside a function.

  • If a name is not defined in a function, R looks up one level.

    x <- 2
    y <- 20
    g03 <- function() {
      y <- 1
      return(c(x, y))
    }
    g03()
    ## [1] 2 1
    # This doesn't change the previous value of y
    y
    ## [1] 20
  • If a function is defined inside another function, R keeps looking up levels until it finds the name.

  • Below, the inner function finds z inside the inner function, finds y in the outer function, and finds x outside both functions.

    x <- 1
    f1 <- function() {
      y <- 2
      f2 <- function() {
        z <- 3
        return(c(x, y, z))
      }
      return(f2())
    }
    f1()
    ## [1] 1 2 3
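
The two rules above also combine: a name defined at an intermediate level masks the same name further out. A small sketch, using the hypothetical names outer_fn() and inner_fn():

```r
x <- "global"

outer_fn <- function() {
  x <- "outer"             # masks the outer x inside outer_fn()
  inner_fn <- function() {
    x                      # not defined here, so R looks one level up
  }
  inner_fn()
}

masked <- outer_fn()
masked
## [1] "outer"

# The x defined outside both functions is untouched
x
## [1] "global"
```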

A fresh start

  • Each time a function is called, it creates a new environment to execute in (called the “execution environment”).

  • This means a function does not remember what happened the last time it was called.

    a <- 2
    g11 <- function() {
      a <- a + 1
      return(a)
    }
    
    g11()
    ## [1] 3
    g11()
    ## [1] 3
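
One way to see the fresh execution environment directly is to check whether a local variable survives between calls. The sketch below assumes a hypothetical function bump() and uses exists() with inherits = FALSE so that only the current execution environment is searched.

```r
bump <- function() {
  # inherits = FALSE restricts the search to this call's execution environment
  if (!exists("counter", inherits = FALSE)) {
    counter <- 0           # this branch runs on every call
  }
  counter <- counter + 1
  counter
}

first <- bump()
second <- bump()
c(first, second)
## [1] 1 1
```

If the local counter had persisted from the first call, the second call would have returned 2.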

Functions versus variables

  • Functions and variables can share names, though this is not a good idea.

    sum <- c(10, 11)
    sum(sum)
    ## [1] 21
  • This works because, when a name is used in a function call, R ignores non-function objects while looking it up. The two objects also live in different environments: the sum vector is in the global environment, while the sum() function is in the package:base environment. More on this in Chapter 7.

    rlang::env_has(env = rlang::global_env(), nms = "sum")
    ##  sum 
    ## TRUE
    rlang::env_has(env = rlang::env_parents()[["package:base"]], nms = "sum")
    ##  sum 
    ## TRUE
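
The function-only lookup can be sketched with another base function. The example below shadows length() with a vector; using get() with mode = "function" is an illustration of a similar lookup, not a claim about how R implements calls internally.

```r
# A numeric vector that shares its name with base::length()
length <- c(3, 4, 5)

# In call position, R skips the vector and finds the function
n <- length(length)
n
## [1] 3

# get() with mode = "function" also skips non-function objects
found_fn <- is.function(get("length", mode = "function"))
found_fn
## [1] TRUE

# Remove the vector so the name refers only to the function again
rm(length)
```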

Dynamic lookup

  • R determines where to look at function creation time (e.g. one level up), but it determines what is there at evaluation time.

    a <- 1
    g11()
    ## [1] 2
    a <- 12
    g11()
    ## [1] 13
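
Dynamic lookup means a function's result can silently depend on objects outside it. The sketch below uses hypothetical names (rate, add_tax(), add_tax_explicit()) to show the effect, together with one possible way to avoid it by passing the value as an argument.

```r
rate <- 0.1

# Depends on the outer `rate`, which is looked up each time the function runs
add_tax <- function(price) {
  price * (1 + rate)
}

before <- add_tax(100)
rate <- 0.2
after <- add_tax(100)
c(before, after)
## [1] 110 120

# Passing the value explicitly removes the dependence on outside state
add_tax_explicit <- function(price, rate) {
  price * (1 + rate)
}
explicit <- add_tax_explicit(100, 0.1)
```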

Advanced R Exercises

  1. What does the following code return? Why? Describe how each of the three c’s is interpreted.

    c <- 10
    c(c = c)
  2. What does the following function return? Make a prediction before running the code yourself.

    f <- function(x) {
      f <- function(x) {
        f <- function() {
          x ^ 2
        }
        f() + 1
      }
      f(x) * 2
    }
    f(10)

Lazy Evaluation

... (dot-dot-dot)

Function exits

Implicit versus explicit returns

  • R will return the last evaluated expression by default:

    f <- function(x, y) {
      y
      x
    }
    f(1, 2)
    ## [1] 1
  • I prefer to explicitly include a return() call:

    f <- function(x, y) {
      y
      return(x)
    }
    f(1, 2)
    ## [1] 1
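
Whatever style is used for the final value, return() is needed to leave a function early. A minimal sketch with a hypothetical safe_log() function that mixes both forms:

```r
safe_log <- function(x) {
  if (x <= 0) {
    return(NA_real_)   # explicit early exit for invalid input
  }
  log(x)               # implicit return of the last evaluated expression
}

safe_log(-1)
## [1] NA
safe_log(1)
## [1] 0
```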

Visible versus invisible returns

  • A visible return prints the result when the function is called at the top level:

    f <- function(x) {
      return(x)
    }
    f(1)
    ## [1] 1
  • You can prevent automatic printing by applying invisible().

    f <- function(x) {
      return(invisible(x))
    }
    f(1)
  • We can still print the value explicitly with print():

    f(1) |>
      print()
    ## [1] 1

    Alternatively, enclose the call in parentheses:

    (f(1))
    ## [1] 1
  • Assignment is a function that returns invisibly:

    x <- 1
    (x <- 1)
    ## [1] 1
  • You might be surprised that assignment is a function, but remember that in R almost every operation is a function call. Prefix notation might make this clearer:

    `<-`(x, 10)
    x
    ## [1] 10
  • invisible() returns are often used for functions whose main purpose is a side effect (like print() or plot()), so that calls can be chained. Because assignment returns its value invisibly, you can chain assignments:

    a <- b <- c <- d <- 2
    a
    ## [1] 2
    b
    ## [1] 2
    c
    ## [1] 2
    d
    ## [1] 2
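
The same idea applies to your own side-effect functions. The sketch below assumes a hypothetical show_n() that prints a summary and returns its input invisibly, so the value can flow into the next call. withVisible() is used only to check the visibility flag.

```r
show_n <- function(x) {
  cat("n =", length(x), "\n")   # side effect: print a short summary
  invisible(x)                  # hand x back without auto-printing it
}

# The invisibly returned value can be passed on to another call
reversed <- show_n(1:3) |> rev()
## n = 3
reversed
## [1] 3 2 1

# withVisible() reports whether a value would be auto-printed
is_visible <- withVisible(show_n(1:3))$visible
## n = 3
is_visible
## [1] FALSE
```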

Exit handlers

  • If you change the global state (e.g. with options() or setwd()), then it is polite to restore it on exit.

  • Use on.exit() to do so, setting add = TRUE to not overwrite previous exit handlers.

    cleanup <- function(dir, code) {
      old_dir <- setwd(dir)
      on.exit(setwd(old_dir), add = TRUE)
    
      # I can now change the working directory with impunity
    
      old_opt <- options(stringsAsFactors = FALSE)
      on.exit(options(old_opt), add = TRUE)
    
      # I can now change the options with impunity
    }
  • I have used this in real life when manipulating the parallelization backend using the {foreach} package:

    oldDoPar <- doFuture::registerDoFuture()
    on.exit(with(oldDoPar, foreach::setDoPar(fun=fun, data=data, info=info)), add = TRUE)
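
The on.exit() pattern above can be sketched as a small, runnable helper. with_digits() is a hypothetical name chosen for this illustration. It temporarily changes the digits option, evaluates some code, and restores the old value even if that code fails.

```r
with_digits <- function(digits, code) {
  old_opt <- options(digits = digits)     # options() returns the previous values
  on.exit(options(old_opt), add = TRUE)   # restore them however the function exits
  force(code)                             # evaluate the lazy argument under the new option
}

digits_before <- getOption("digits")

short_pi <- with_digits(3, format(pi))
short_pi
## [1] "3.14"

# The option is restored even when the code throws an error
try(with_digits(3, stop("boom")), silent = TRUE)
digits_after <- getOption("digits")
identical(digits_before, digits_after)
## [1] TRUE
```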

Function Forms

Advanced R Exercises:

  1. Rewrite the following code snippets into prefix form:

    1 + 2 + 3
    ## [1] 6
    1 + (2 + 3)
    ## [1] 6
    if (length(x) <= 5) x[[length(x)]] else x[[5]]
    ## [1] 5
    • For the last one, make sure to convert if, else, [[, and <= all into prefix form.
  2. Create a replacement function called rmod() that modifies a random location in a vector. E.g.

    set.seed(1)
    x <- 1:10
    rmod(x) <- NA
    x
    ##  [1]  1  2  3  4  5  6  7  8 NA 10
    rmod(x) <- NA
    x
    ##  [1]  1  2  3 NA  5  6  7  8 NA 10
  3. Write your own version of + that pastes its inputs together if they are character vectors but behaves as usual otherwise. In other words, make this code work:

    1 + 2
    ## [1] 3
    "a" + "b"
    ## [1] "ab"
    • Hint: Look at the source code of +.
  4. Create an infix xor() operator. Call it %x|%. E.g.

    c(TRUE, TRUE, FALSE, FALSE) %x|% c(TRUE, FALSE, TRUE, FALSE)
    ## [1] FALSE  TRUE  TRUE FALSE
  5. Create infix versions of the set functions intersect(), union(), and setdiff(). You might call them %n%, %u%, and %/% to match conventions from mathematics.

New functions:


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.