Learning Objectives

Environment Basics

Parents

Working with Objects in an Environment

Hadley’s Advanced R Exercises

  1. Create an environment as illustrated by this picture.

    recursive

  2. Create a pair of environments as illustrated by this picture.

    recursive

Loop through environments

Special Environments

Package environments.

  • Each package attached by library() creates a package environment that becomes an ancestor of the global environment. The most recently attached package is the closest ancestor, and earlier packages sit further up the chain.


  • This order is called the search path because variable names are searched in that order. You can see the search path with rlang::search_envs().

    rlang::search_envs()
    ##  [[1]] $ <env: global>
    ##  [[2]] $ <env: package:rlang>
    ##  [[3]] $ <env: package:stats>
    ##  [[4]] $ <env: package:graphics>
    ##  [[5]] $ <env: package:grDevices>
    ##  [[6]] $ <env: package:utils>
    ##  [[7]] $ <env: package:datasets>
    ##  [[8]] $ <env: package:methods>
    ##  [[9]] $ <env: Autoloads>
    ## [[10]] $ <env: package:base>
  • So when you evaluate a variable or function name, R first searches for it in the global environment, then in the {rlang} package environment, then in the {stats} package environment, and so on down the search path.

  • Attaching a new package with library() makes that package the immediate parent of the global environment.

    library(d)

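  • A minimal sketch of this effect, assuming a fresh session in which {tools} (a package that ships with R, chosen here only for illustration) is not yet attached. After library(tools), the immediate parent of the global environment should be package:tools, and the previous parent sits one step further up.

```r
library(rlang)

# Assumption: 'tools' is installed (it ships with R) and not yet attached.
parent_before <- env_parent(global_env())

library(tools)
parent_after <- env_parent(global_env())

# The newly attached package is now the immediate parent of the global
# environment, and the previous parent has moved one step further away.
environmentName(parent_after)
identical(env_parent(parent_after), parent_before)
```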

Function Environment

  • The function environment is the environment a function uses to look up names: the function has access to every object in that environment and in its ancestors. It is the environment that was current when the function was created, not the one from which the function is called.

  • You can see the function environment via rlang::fn_env()

  • The function environment may or may not be a new environment. E.g. most of the functions you write use the global environment as the function environment.

    x <- 5
    fn <- function() {
      sum(1:x)
    }
    rlang::fn_env(fn)
    ## <environment: R_GlobalEnv>
  • Above, since the function environment is the global environment, fn() has access to x (which is in the global environment) and to sum() (which R finds in package:base, the last environment on the search path).
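  • A small sketch of "created, not called". The function caller() below is a hypothetical helper that has its own local x. Because fn() was created in the environment where this code runs, it ignores the x inside caller().

```r
x <- 5
fn <- function() {
  sum(1:x)
}

# A hypothetical caller with its own local x.
caller <- function() {
  x <- 2
  fn()   # fn() still looks up x where fn() was created
}

caller()
## [1] 15
```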

  • Most functions in a package have the namespace environment (see below) as the function environment.

    rlang::fn_env(base::sum)
    ## <environment: namespace:base>
    rlang::fn_env(stats::lm)
    ## <environment: namespace:stats>
    rlang::fn_env(rlang::fn_env)
    ## <environment: namespace:rlang>
  • The function environment may or may not be different from the environment where the name is bound to the function. That second environment is called the binding environment of the function.

  • In many cases, the function environment is the same as the binding environment. Below, the name f in the global environment is bound to the function (arrow moving from f to the yellow object), so the binding environment is the global environment. The function in turn points to the global environment (arrow moving from the black dot to the global environment), which means the function looks up objects there, so the function environment is also the global environment.

    y <- 1
    f <- function(x) {
      return(x)
    }

    function environment and binding environment of f()

  • Below, the function bound to the name g was created in the global environment, so its function environment is the global environment. But the name g lives in the e environment, so the binding environment is e.

    y <- 1
    e <- rlang::env()
    e$g <- function(x) {
      return(x)
    }

    function environment and binding environment of g()
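  • The distinction can be checked directly. This is only a sketch: it uses rlang::current_env() to stand in for "the environment where this code runs", which is the global environment when the code is run at top level.

```r
library(rlang)

e <- env()
e$g <- function(x) {
  return(x)
}

# The name g is bound in e, not in the environment where this code runs.
env_has(e, "g")
env_has(current_env(), "g")

# The function environment is where the function was created: the current
# environment (the global environment when run at top level).
identical(fn_env(e$g), current_env())
```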

  • Exercise: Does g() still have access to all of the objects in the global environment?

Namespaces

  • The package environment is the external interface for a package. It contains the exported functions of a package.

  • The namespace environment of a package is the internal interface for a package. Functions in the package search for their objects in the namespace environment first.

  • This is what allows us to modify (rather foolishly) var() but still allow sd() to work properly.

    sd
    ## function (x, na.rm = FALSE) 
    ## sqrt(var(if (is.vector(x) || is.factor(x)) x else as.double(x), 
    ##     na.rm = na.rm))
    ## <bytecode: 0x55943244bc68>
    ## <environment: namespace:stats>
    x <- rnorm(10)
    sd(x)
    ## [1] 0.6674
    var <- function(x) 0
    var(x)
    ## [1] 0
    sd(x)
    ## [1] 0.6674
  • The namespace environment acts as the function environment for all functions in a package.

  • An exported function has a binding both in the namespace environment and the package environment.

    namespace vs package environments

  • An internal function only has a binding in the namespace environment.
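  • A rough way to see both kinds of bindings, assuming {stats} is attached (it is by default): compare the names in namespace:stats with the names in package:stats.

```r
library(rlang)

ns_names  <- env_names(ns_env("stats"))   # everything defined in the package
pkg_names <- env_names(pkg_env("stats"))  # what users see after attaching

# sd() is exported, so it has a binding in both environments.
"sd" %in% ns_names
"sd" %in% pkg_names

# Names present only in the namespace are internal to {stats}.
internal_only <- setdiff(ns_names, pkg_names)
length(internal_only) > 0
```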

  • The parent of a namespace environment is an imports environment that contains bindings for all the functions the package imports from other packages.

  • The parent of the imports environment is the base namespace, where all base functions are located (this is why you don’t need to import sum() or use base::sum() in your package code).

  • The parent of the base namespace is the global environment.

    library(rlang)
    env_parents(fn_env(stats::var))
    ## [[1]] $ <env: imports:stats>
    ## [[2]] $ <env: namespace:base>
    ## [[3]] $ <env: global>
    env_parents(fn_env(rlang::fn_env))
    ## [[1]] $ <env: imports:rlang>
    ## [[2]] $ <env: namespace:base>
    ## [[3]] $ <env: global>

    ancestors of namespace environment

  • Below, let the yellow object be the sd() function defined in the package {stats}. Whenever another function in {stats} uses sd() or var(), it finds them in the namespace:stats environment. The package:stats environment also points to those functions, so when the user calls sd() they get the version created by the {stats} authors. If we define our own var() in the global environment, we only mask the binding in package:stats; the binding in namespace:stats is unchanged. This means that {stats} functions keep working. In particular, sd(), which uses var(), still works.

    namespace and package
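  • The same point as a sketch in code, assuming {stats} is attached. The objects ns_var, pkg_var, same_in_both, and ours_differs are illustrative names, not part of any package.

```r
library(rlang)

var <- function(x) 0   # our own var(), defined outside {stats}

# The namespace and package environments still hold the original var().
ns_var  <- env_get(ns_env("stats"), "var")
pkg_var <- env_get(pkg_env("stats"), "var")
same_in_both <- identical(ns_var, pkg_var)   # both are the {stats} version
ours_differs <- !identical(var, ns_var)      # ours is a different function

# sd() looks up var() in namespace:stats, so it is unaffected.
x <- c(1, 2, 3, 4)
sd(x)
sqrt(5 / 3)   # the sample variance of x is 5/3

rm(var)   # remove the masking definition
```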

Execution Environments

  • Each time a function is called, a new environment is created to host execution. This is called the execution environment.

  • This is why a is not saved between calls:

    fn <- function() {
      if (!env_has(current_env(), "a")) {
        a <- 1
      } else {
        a <- a + 1
      }
      return(a)
    }
    fn()
    ## [1] 1
    fn()
    ## [1] 1
  • The execution environment is always the child of the function environment.
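  • Both claims can be checked with a short sketch. Here get_exec_env() is a hypothetical function that simply returns its own execution environment.

```r
library(rlang)

# A hypothetical function that returns its own execution environment.
get_exec_env <- function() {
  current_env()
}

e1 <- get_exec_env()
e2 <- get_exec_env()

identical(e1, e2)                                # two calls, two environments
## [1] FALSE
identical(env_parent(e1), fn_env(get_exec_env))  # parent is the function environment
## [1] TRUE
```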

  • Consider this function

    h <- function(x) {
      # 1.
      a <- 2 # 2.
      x + a
    }
    y <- h(1) # 3.
  • The yellow object is the h() function. The name h is in the global environment (bottom right). The execution environment is at the top left. First x binds to 1, then a is defined and binds to 2. When the function completes, it returns 3, so y binds to 3 in the global environment. The execution environment is then garbage collected.

    execution environment

  • It is possible to explicitly return the execution environment so that it is not garbage collected. But this is rarely done.

    h2 <- function(x) {
      a <- x * 2
      rlang::current_env()
    }
    e <- h2(x = 10)
    rlang::env_print(e)
    ## <environment: 0x559433fd0850>
    ## Parent: <environment: global>
    ## Bindings:
    ## • a: <dbl>
    ## • x: <dbl>
  • More frequently, the execution environment is maintained because it is a function environment of a returned function.

    plus <- function(x) {
      function(y) x + y
    }
    
    plus_one <- plus(1)
    plus_one
    ## function(y) x + y
    ## <environment: 0x559430a38cc8>

    execution environment is function environment

  • Above figure: the global environment is the bottom box. The execution environment is the top box. The plus() function is the right yellow object. Its function environment and its binding environment are both the global environment. When plus() is called with x = 1, it creates the execution environment where x is bound to 1. When plus_one() is defined, its function environment is that execution environment (since that is where it was created), but its binding environment is the global environment (where the name is bound). Note that the parent of plus()'s execution environment is the global environment.

    rlang::fn_env(plus_one)
    ## <environment: 0x559430a38cc8>
    rlang::env_parent(rlang::fn_env(plus_one))
    ## <environment: R_GlobalEnv>
  • When we call plus_one(), its execution environment will have the execution environment of plus() as its parent.

    x <- 20
    plus_one(2)
    ## [1] 3

    execution environment of plus_one()

  • When we call plus_one() with y bound to 2 (execution environment top left), when it tries to find x it first searches in

    rlang::fn_env(plus_one)
    ## <environment: 0x559430a38cc8>

    (execution environment top right) before going to the global environment (bottom).
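  • As a sketch of the same lookup, a second adder (plus_ten, a hypothetical name) gets its own preserved execution environment, and each adder finds its own x there before ever reaching the x in the calling code.

```r
library(rlang)

plus <- function(x) {
  function(y) x + y
}

plus_one <- plus(1)
plus_ten <- plus(10)   # hypothetical second adder

x <- 20                # ignored by both adders
plus_one(2)
## [1] 3
plus_ten(2)
## [1] 12

# Each adder keeps its own x in its function environment.
env_get(fn_env(plus_one), "x")
## [1] 1
env_get(fn_env(plus_ten), "x")
## [1] 10
```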

  • Exercise (Advanced R): Draw a diagram that shows the function environments of this function, along with all bindings for the given execution.

    f1 <- function(x1) {
      f2 <- function(x2) {
        f3 <- function(x3) {
          x1 + x2 + x3
        }
        f3(3)
      }
      f2(2)
    }
    f1(1)
    ## [1] 6

Caller Environment

  • Recall, the function environment is the environment in which the function was created.

  • The caller environment is the environment in which the function was called (aka “used”)

  • You can get this environment inside a function via rlang::caller_env().

    fn1 <- function(x) {
      fn2 <- function(y) {
        rlang::caller_env()
      }
      e0 <- rlang::current_env() ## The execution environment of fn1()
      e1 <- fn2() ## The caller environment of fn2()
      e2 <- rlang::caller_env() ## The caller environment of fn1()
      return(list(e0, e1, e2))
    }
    fn1()
    ## [[1]]
    ## <environment: 0x559432c478c8>
    ## 
    ## [[2]]
    ## <environment: 0x559432c478c8>
    ## 
    ## [[3]]
    ## <environment: R_GlobalEnv>
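  • To contrast the caller environment with the function environment, here is a small sketch with two hypothetical helpers. x_from_caller() reads x from wherever it was called, while x_lexical() uses ordinary scoping and so reads x from where it was created.

```r
library(rlang)

# Hypothetical helper: look up x starting from the caller environment.
x_from_caller <- function() {
  caller <- caller_env()
  env_get(caller, "x", inherit = TRUE)
}

# Hypothetical helper: ordinary lookup through the function environment.
x_lexical <- function() {
  x
}

x <- "top level"
wrapper <- function() {
  x <- "inside wrapper"
  c(caller = x_from_caller(), lexical = x_lexical())
}
wrapper()
##           caller          lexical 
## "inside wrapper"      "top level"
```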

Applications

Numeric Derivatives

  • stats::numericDeriv() numerically evaluates the gradient of an expression at some value. It assumes that the evaluation occurs within some environment you provide.

  • Let’s calculate the gradient of \(x^y\) with respect to \(x\), evaluated at \(x = 3\) and \(y = 2\).

    myenv <- rlang::env(x = 3, y = 2)
    numericDeriv(expr = rlang::expr(x ^ y), theta = "x", rho = myenv)
    ## [1] 9
    ## attr(,"gradient")
    ##      [,1]
    ## [1,]    6
  • From calculus, we know that the derivative of \(x^2\) is \(2x\), and so the gradient evaluated at \(x = 3\) should be \(2\times 3 = 6\).
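  • The same check can be repeated at several values of \(x\). This is a sketch: grad_at() is a hypothetical helper, and the numeric gradient is only expected to match \(2x\) up to a small approximation error.

```r
# Hypothetical helper: numeric gradient of x^y with respect to x at x = x0, y = 2.
grad_at <- function(x0) {
  e <- rlang::env(x = x0, y = 2)
  res <- stats::numericDeriv(quote(x^y), theta = "x", rho = e)
  as.numeric(attr(res, "gradient"))
}

xs <- c(1, 3, 5)
numeric_grad  <- vapply(xs, grad_at, numeric(1))
analytic_grad <- 2 * xs   # derivative of x^2 is 2x

# The largest discrepancy should be tiny.
max(abs(numeric_grad - analytic_grad))
```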

  • rlang::expr() is discussed in Chapter 19. Basically, it captures an expression without evaluating it. This is called “quoting”. We can then evaluate that expression with eval().

    rlang::expr(x^2) 
    ## x^2
    rlang::expr(x^2) |> eval(envir = myenv)
    ## [1] 9
    rlang::expr(sum(c(1, 2, 3)))
    ## sum(c(1, 2, 3))
    rlang::expr(sum(c(1, 2, 3))) |> eval(envir = rlang::global_env())
    ## [1] 6
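  • Because the expression is only captured, the same quoted expression can give different results depending on the environment it is evaluated in. A short sketch (ex and not_run are illustrative names):

```r
ex <- rlang::expr(x^2)

# Same expression, two different environments.
eval(ex, envir = rlang::env(x = 3))
## [1] 9
eval(ex, envir = rlang::env(x = 10))
## [1] 100

# Quoting does not run the code, so this does not raise an error.
not_run <- rlang::expr(stop("never evaluated"))
is.call(not_run)
## [1] TRUE
```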

Managing State

  • An R package should not alter the global environment.

  • Bindings in a package namespace are locked once the package is loaded, so they cannot be reassigned.

  • Say you want to keep track of whether or not a function was run (e.g. to write a message on first use). I have done this to (i) say that a function is defunct or (ii) list out special licenses that cover a method.

  • One way you could keep track of this is to use super assignment (<<-) in a function to change the value of a logical defined outside the function. This works in a script, but inside a package the binding would be locked and the assignment would fail.

    ran_fun <- FALSE ## Global variable
    fun <- function() {
      if (!ran_fun) {
        message("Here is a message")
        ran_fun <<- TRUE ## this alters the variable outside fun()
      }
    }
    fun()
    ## Here is a message
    fun()
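  • What I think is better is having an environment specific to messages. That way you have to worry less about global variables, and the contents of the environment can still be changed even when the name menv itself cannot be reassigned.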

    menv <- rlang::env(ran_fun = FALSE)
    fun <- function() {
      if (!menv$ran_fun) {
        message("Here is a message")
        menv$ran_fun <- TRUE
      }
    }
    fun()
    ## Here is a message
    fun()
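  • A sketch of why the environment approach suits packages. It uses base R's lockBinding() to imitate a locked binding; this is an assumption made for illustration, not how a package is actually built. Reassigning a locked name fails, but changing the contents of an environment stored under a locked name still works.

```r
flag  <- FALSE
state <- rlang::env(flag = FALSE)

# Imitate locked bindings (illustration only).
here <- environment()
lockBinding("flag", here)
lockBinding("state", here)

set_flag  <- function() flag <<- TRUE       # tries to rebind a locked name
set_state <- function() state$flag <- TRUE  # changes the environment's contents

flag_result <- tryCatch({ set_flag(); "ok" }, error = function(e) "error")
set_state()

flag_result
## [1] "error"
state$flag
## [1] TRUE

# Clean up so later code can reuse these names.
unlockBinding("flag", here)
unlockBinding("state", here)
```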
  • This functionality is very popular, so {rlang} has a function dedicated to it.

    fun <- function() {
      rlang::warn("Here is a message", .frequency = "once", .frequency_id = "ran_fun")
    }
    fun()
    ## Warning: Here is a message
    ## This warning is displayed once per session.
    fun()
  • Exercise: Use environments to keep a tally for how many times a function called foo() is run. Create another function called foo_count() that returns that number. E.g.

    foo_count()
    ## [1] 0
    foo()
    foo()
    foo()
    foo_count()
    ## [1] 3

R6 Objects

Hashing

New Functions


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.