Learning Objectives

Motivation

Because S3 is really informal, it is also really easy to make dangerous mistakes.
Example: Class structures are not enforced.
- E.g. you can do
```
mt <- mtcars
class(mt) <- "lm"
```
  and method dispatch will try to use lm methods on mt even though that makes no sense.
- This is because in S3 the class is the class attribute.
- The structure of a class is not enforced, except by perhaps a “gentleman’s agreement” that classes should be of a certain form.
Example: Using function names to identify methods is dangerous. E.g. is row.names.data.frame()
1. The row() method of the names.data.frame class?
2. The row.names() method of the data.frame class?
3. The row.names.data() method of the frame class?
S3 is also limiting because the method is only determined by the class of one object. But what if you have a function f(x, y) and you want the method to be different based on:
- x and y are both class foo.
- x is class foo and y is class bar.
- x is class bar and y is class foo.
- x and y are both class bar
S4
1. Makes creating objects of a class more strict, to enforce structure.
2. Makes method creation more strict, so you don’t have to name a new function generic.class() and just hope for the best.
3. Allows for multiple dispatch, where you can choose the method based on the class of more than one object.

`{methods}` Package

All functions for S4 are in the {methods} package which comes with R by default.
- So in a package you either need to import the {methods} package, or you need to use :: to use these functions create S4 objects.
- Running R in batch mode (e.g. running it on the supercomputer) also does not attach the {methods} package, so you need to do library(methods) if you intend to use S4 in batch mode.

S4 Classes

In S3, any object with a class attribute is an S3 object.
In S4, you need to explicitly run the methods::setClass() function to define the class. This makes the structure explicitly enforced.
- The Class argument is the name of the new class.
- The slots argument contains a named character vector with the name of the fields in the class and the type for each field.
  - Note that it supports "numeric" and not "double".
- The prototype argument contains the default values for each slot. This is typically either
  1. A names list object, or
  2. If inheriting from a basetype vector, you use structure, where the base type is in an implicit slot called .Data and the rest of the elements are the other slots. See below for example usage.
```
setClass(
  Class = "Person",
  slots = c(
    name = "character",
     age = "numeric"
  ),
  prototype = list(
    name = character(),
    age = double()
  )
)
```
Note: In S4, they call fields slots (i.e. data that belongs to the class).
You then use methods::new() to create objects of the given S4 class, specifying the class and filling in the slots.
```
david <- new(Class = "Person", name = "David Gerard", age = 34)
```
david is an S4 object
```
sloop::otype(david)
```
```
## [1] "S4"
```
You use methods::is() to see (via one argument) and test (via two arguments) the S4 class of an object.
```
is(david)
```
```
## [1] "Person"
```
```
is(david, "Person")
```
```
## [1] TRUE
```
You extract the data from a slot with methods::slot()
```
slot(david, "age")
```
```
## [1] 34
```

Equivalently, you can use @ notation to extract slots. (You do not use $ or [[ notation for S4 objects).

david@age

## [1] 34

## will error
david$age

## Error in david$age: $ operator not defined for this S4 class

## will error
david[["age"]]

## Error in david[["age"]]: this S4 class is not subsettable

The @ notation is typically only used internally. If you want folks to access slots, you need to define an accessor function. Accessor’s are usually methods for S4 generics, which we will cover later.

Inheritance

Inheritance is really informal in S3. class is a vector and method dispatch just sequentially goes down the vector. But there is no guarantee that a subclass will have the correct structure of a superclass, or that the class vector is consistent across all members of a class.
In S4, use the contains argument in setClass() to explicitly define the superclass that your class is inheriting from.

E.g. we can create a class called Employee that inherits from Person.

setClass(
  Class = "Employee", 
  contains = "Person",
  slots = c(
    boss = "character",
    years = "numeric"
  ),
  prototype = list(
    boss = NA_character_,
    years = NA_real_
  )
)

Now, we can create an Employee object with all of the slots of Person, along with the new slots from Employee.

david <- new(Class = "Employee", 
             name = "David Gerard", 
             age = 34, 
             boss = "Self", 
             years = 4)
david

## An object of class "Employee"
## Slot "boss":
## [1] "Self"
## 
## Slot "years":
## [1] 4
## 
## Slot "name":
## [1] "David Gerard"
## 
## Slot "age":
## [1] 34

Helper Function

You should not expect users to use new(). Instead you should create a helper function for them. E.g.

Employee <- function(name, 
                     age = NA_real_, 
                     boss = NA_character_, 
                     years = NA_real_) {
  age <- as.double(age)
  boss <- as.character(boss)
  years <- as.double(years)

  return(new("Employee", name = name, age = age, boss = boss, years = years))
}

david <- Employee(name = "David Gerard", age = 34, boss = "Self", years = 4)
david

## An object of class "Employee"
## Slot "boss":
## [1] "Self"
## 
## Slot "years":
## [1] 4
## 
## Slot "name":
## [1] "David Gerard"
## 
## Slot "age":
## [1] 34

Try to make sensible defaults, and try to type-cast as much as you can.

Validator Functions

Suppose we want Person to be a vector-type object. So we should constrain the age and name slots to be the same size.

This is not currently a constraint

profs <- new(Class = "Person", name = c("David Gerard", "Zois Boukouvalas"), age = 34)
profs

## An object of class "Person"
## Slot "name":
## [1] "David Gerard"     "Zois Boukouvalas"
## 
## Slot "age":
## [1] 34

You define S4 validators with setValidity().

Class is the name of the class to create a validator for.
method is the function that should be run on the class.
- It returns TRUE if all validity checks pass.
- It returns a character string describing the issues if some of the validity checks fail.

setValidity(
  Class = "Person", 
  method = function(object) {
    if(length(object@name) != length(object@age)) {
      "@name and @age must be same length"
    } else {
      return(TRUE)
    }
  }
)

## Class "Person" [in ".GlobalEnv"]
## 
## Slots:
##                           
## Name:       name       age
## Class: character   numeric
## 
## Known Subclasses: "Employee"

Now it is impossible to make @name and @age slots of different lengths

    profs <- new(Class = "Person", name = c("David Gerard", "Zois Boukouvalas"), age = 34)

## Error in validObject(.Object): invalid class "Person" object: @name and @age must be same length

Note: Validity checks only run automatically with new(), so you can edit the slot to have invalid inputs.

profs <- new(Class = "Person",
             name = c("David Gerard", "Zois Boukouvalas"),
             age = c(34, NA_real_))
profs@age <- 1:20

But you can manually check validity with validObject().

validObject(profs)

## Error in validObject(profs): invalid class "Person" object: @name and @age must be same length

profs <- new(Class = "Person", 
             name = c("David Gerard", "Zois Boukouvalas"), 
             age = c(34, NA_real_))
validObject(profs)

## [1] TRUE

Exercise: Create a factor2 S4 class that tries to implement the factor class for S4.
- Define the class (with appropriate slots and prototypes)
- Create a helper function. It should be able to take a character vector as input and define the
- Create a validator. It should check that the the maximum integer is less than or equal to the number of levels.

S4 Generics and Methods

You create generic functions with setGeneric().
- The name is the name of the new generic.
- The def is the function of the generic. Recall that S3 generics are really simple and just call UseMethod("generic"). Similarly, S4 generic functions should be really simple and just call standardGeneric("generic").

Below we are defining a generic called age() (for getting the age of individuals) and a generic for age()<- which will act as a replacement function.

setGeneric(name = "age", 
           def = function(x) standardGeneric("age"))

## [1] "age"

setGeneric(name = "age<-", 
           def = function(x, value) standardGeneric("age<-"))

## [1] "age<-"

Literally most S4 generics look like this

BiocGenerics::append

## standardGeneric for "append" defined from package "BiocGenerics"
## 
## function (x, values, after = length(x)) 
## standardGeneric("append")
## <bytecode: 0x5613b2bc4e60>
## <environment: 0x5613b2bbbae8>
## Methods may be defined for arguments: x, values
## Use  showMethods(append)  for currently available ones.

BiocGenerics::dims

## standardGeneric for "dims" defined from package "BiocGenerics"
## 
## function (x) 
## standardGeneric("dims")
## <bytecode: 0x5613b338afc0>
## <environment: 0x5613b33869a0>
## Methods may be defined for arguments: x
## Use  showMethods(dims)  for currently available ones.

BiocGenerics::get

## standardGeneric for "get" defined from package "BiocGenerics"
## 
## function (x, pos = -1L, envir = as.environment(pos), mode = "any", 
##     inherits = TRUE) 
## standardGeneric("get")
## <bytecode: 0x5613b369a580>
## <environment: 0x5613b36919e8>
## Methods may be defined for arguments: x, pos, envir
## Use  showMethods(get)  for currently available ones.

The only other argument you should usually use in setGeneric() is signature.
- By default, method dispatch is determined by all of the arguments in the def function
- You can choose to only do method dispatch based on some arguments via signature.
- E.g. below only x determines method dispatch:
```
setGeneric(name = "age<-", 
           def = function(x, value) standardGeneric("age<-"),
           signature = "x")
```
```
## [1] "age<-"
```

You define a method for a given generic with setMethod()

f: The name of the generic.
signature: The name of the class(es) to which to apply the method.
definition: The method function.

setMethod(f = "age", 
          signature = "Person", 
          definition = function(x) {
            return(x@age)
          }
         )
setMethod(f = "age<-",
          signature = "Person",
          definition = function(x, value) {
            x@age <- value
            return(x)
          }
         )

Let’s look at these

david <- Employee(name = "David Gerard", age = 34)
age(david)

## [1] 34

Notice that this worked with an Employee object even though we only defined the method for Person. This is because Employee inherits from Person.

Let’s reassign age

age(david) <- 21 # ah, young again
age(david)

## [1] 21

The age() method is called an accessor or, more informally, a getter.
The age()<- method is called a replacement function or, more informally, a setter.
Note: in a setter, you need to use the variable name value for the replacement value. Don’t use a different name.

Getting Help

To read the help file of an S4 class, put class? before the name of the class. E.g. the following will open up the help file of the GRanges class from the {GenomicRanges} package from Bioconductor.
```
library(stats4)
```
```
class?mle
```
This is equivalent to
```
?`mle-class`
```

You can see the code of a generic function with getGeneric()

getGeneric(f = "show")

## standardGeneric for "show" defined from package "methods"
## 
## function (object) 
## standardGeneric("show")
## <bytecode: 0x5613af672568>
## <environment: 0x5613af5a82d8>
## Methods may be defined for arguments: object
## Use  showMethods(show)  for currently available ones.
## (This generic function excludes non-simple inheritance; see ?setIs)

You can obtain the method of a generic with getMethod()

getMethod(f = "show", signature = "mle")

## Method Definition:
## 
## function (object) 
## {
##     cat("\nCall:\n")
##     print(object@call)
##     cat("\nCoefficients:\n")
##     print(coef(object))
## }
## <bytecode: 0x5613af76ac18>
## <environment: namespace:stats4>
## 
## Signatures:
##         object
## target  "mle" 
## defined "mle"

You can use args() on both of these to get their arguments

args(getGeneric("show"))

## function (object) 
## NULL

args(getMethod("show", "mle"))

## function (object) 
## NULL

To read the help file of method, put a question mark before the call of a generic. E.g., to see the help file for the coverage() method of a GRanges object, run

## Create mle object
x <- c(1, 2, 3, 4)
gobj <- mle(minuslogl = function(mu) -sum(dnorm(x = x, mean = mu, log = TRUE)), start = 0) 

## opens help file for show method of mle
?show(gobj)

{sloop} is useful for seeing

If an object is an S4 object
```
sloop::otype(david)
```
```
## [1] "S4"
```

If a function is an S4 generic

sloop::ftype(age)

## [1] "S4"      "generic"

A list of all methods associated with an S4 class

sloop::s4_methods_class("Person")

## # A tibble: 3 × 4
##   generic class  visible source     
##   <chr>   <chr>  <lgl>   <chr>      
## 1 age     Person TRUE    R_GlobalEnv
## 2 age<-   Person TRUE    R_GlobalEnv
## 3 coerce  Person TRUE    R_GlobalEnv

A list of all methods associated with an S4 generic

sloop::s4_methods_generic("age")

## # A tibble: 1 × 4
##   generic class  visible source     
##   <chr>   <chr>  <lgl>   <chr>      
## 1 age     Person TRUE    R_GlobalEnv

`show()` method

You should always define a show() method. This is used for printing S4 objects.
You need to use the same arguments of a generic that already exists. E.g. show needs object
```
args(getGeneric("show"))
```
```
## function (object) 
## NULL
```

You usually use the cat() function for printing.

setMethod(
  f = "show", 
  signature = "Person", 
  definition = function(object) {
    cat(is(object)[[1]], "\n",
        "  Name: ", object@name, "\n",
        "   Age: ", object@age, "\n",
        sep = "")
  }
)

Now we get better printing for Person’s and Employee’s.
```
david
```
```
## Employee
##   Name: David Gerard
##    Age: 21
```

Accessors

Above, we defined the accessor age as a generic.
But if you have slots unique to your class, you don’t need to go all out OOP like that.

A basic accessor can be defined as follows:

person_name <- function(x) x@name
person_name(david)

## [1] "David Gerard"

Or we can go through the whole rigmarole so that other classes can use your generic.

setGeneric(name = "name", def = function(x) standardGeneric("name"))

## [1] "name"

setMethod(f = "name", signature = "Person", definition = function(x) x@name)

setGeneric(name = "name<-", def = function(x, value) standardGeneric("name<-"))

## [1] "name<-"

setMethod(
  f = "name<-", 
  signature = "Person",
  definition = function(x, value) {
    x@name <- value
    validObject(x)
    return(x)
  }
)

name(david)

## [1] "David Gerard"

name(david) <- "DG"
david

## Employee
##   Name: DG
##    Age: 21

Exercise: Define a show() method for the factor2 class. It should have similar output as
Exercise: Define a levels2() and levels2<- generic and method for the factor2 class that will set and get the levels slot.

Coercion

Use setAs() to define a method for coercing an object to a given class.
- from: The class you are coercing from.
- to: The class you are coercing to.
- def A function of one argument, of class from that will convert it to class to. Typically the name of the argument is just from.

Let’s coerce a character vector into a Person by assuming that the character vector contains the names.

setAs(
  from = "character",
  to = "Person",
  def = function(from) {
    return(
      new(Class = "Person", 
          name = from, 
          age = rep(NA_real_, length.out = length(from)))
    )
  }
)

You then use methods::as() to apply the coercion.

pvec <- as(object = c("David", "Alice", "John", "Jenny"), Class = "Person")
pvec

## Person
##   Name: DavidAliceJohnJenny
##    Age: NANANANA

Though, just as with new() and is(), you try to hide as() from the user through a generic function

setGeneric(name = "asPerson", def = function(x) standardGeneric("asPerson"))

## [1] "asPerson"

setMethod(
  f = "asPerson", 
  signature = "character", 
  definition = function(x) {
    as(object = x, Class = "Person")
  }
)

pvec <- asPerson(c("David", "Bob", "Megan"))
pvec

## Person
##   Name: DavidBobMegan
##    Age: NANANA

Exercise: Create a coercion for factor2 from character.

Converting from S3 so S4

It’s possible to register an S3 class to be an S4 class by setOldClass(). You do this after calling setClass().

setClass("factor",
  contains = "integer",
  slots = c(
    levels = "character"
  ),
  prototype = structure(
    integer(),
    levels = character()
  )
)
setOldClass("factor", S4Class = "factor")

## Warning in rm(list = what, pos = classWhere): object '.__C__factor' not found

Above, we inherit from the integer base type.
Notice that we used structure() in the prototype, not list(). This is because we are building on factors, which are integers with attributes, not list-like objects.

When you inherit from a base type, it has a slot .Data which is where the base type’s values live. Let’s use this to make new S4 factors.

x <- new(Class = "factor", .Data = c(1, 2, 1, 2, 1), levels = c("A", "B"))
sloop::otype(x)

## [1] "S4"

## Object of class "factor"
## [1] A B A B A
## Levels: A B

x@.Data

## [1] 1 2 1 2 1

It is possible to convert any S3 generic to an S4 generic by simply running
```
setGeneric("mean")
```
```
## [1] "mean"
```

Method Dispatch

Multiple inheritance

It’s possible to create a class that inherits from two or more classes (i.e. have two superclasses). This is called multiple inheritance.
Issue: If the subclass does not not have a method, but both superclasses have methods, then R will choose the method based on alphabetical order. This is dangerous, so we won’t cover it.

Multiple dispatch

A method can be determined by the classes of multiple objects in the generic call. This is called multiple dispatch.
E.g. how should c(x, y) behave if
1. x and y are both factors?
2. x is a factor and y is a character?
3. x and y are both characters?
We will demonstrate this using the Rational class defined next.

Rational Vector Example

I will create a new Rational class that contains numerator and denominator slots:

setClass(
  Class = "Rational", 
  slots = c(
    num = "integer",
    den = "integer"
  ),
  prototype = list(
    num = integer(),
    den = integer()
  )
)

setValidity(
  Class = "Rational", 
  method = function(object) {
    if (length(object@num) != length(object@den)) {
      return("numerator length must equal denominator length")
    } else {
      return(TRUE)
    }
  }
)

## Class "Rational" [in ".GlobalEnv"]
## 
## Slots:
##                       
## Name:      num     den
## Class: integer integer

Let’s create a helper function

Rational <- function(num = integer(), den = integer()) {
  num <- as.integer(num)
  den <- as.integer(den)
  return(new(Class = "Rational", num = num, den = den))
}

And let’s create show(), den(), and num() methods

setMethod(
  f = "show", 
  signature = "Rational", 
  definition = function(object) {
    cat(paste0(object@num, "/", object@den))
  }
)

setGeneric(name = "den", def = function(x) standardGeneric("den"))

## [1] "den"

setGeneric(name = "den<-", def = function(x, value) standardGeneric("den<-"))

## [1] "den<-"

setGeneric(name = "num", def = function(x) standardGeneric("num"))

## [1] "num"

setGeneric(name = "num<-", def = function(x, value) standardGeneric("num<-"))

## [1] "num<-"

setMethod(
  f = "den", 
  signature = "Rational", 
  definition =  function(x) {
    return(x@den)
  }
)

setMethod(
  f = "num", 
  signature = "Rational", 
  definition =  function(x) {
    return(x@num)
  }
)

setMethod(
  f = "den<-", 
  signature = "Rational", 
  definition =  function(x, value) {
    x@den <- as.integer(value)
    validObject(x)
    return(x)
  }
)

setMethod(
  f = "num<-", 
  signature = "Rational", 
  definition =  function(x, value) {
    x@num <- as.integer(value)
    validObject(x)
    return(x)
  }
)

Let’s see how it works

x <- Rational(c(1, 2, 3), c(9, 5, 2))
x

## 1/9 2/5 3/2

den(x)

## [1] 9 5 2

num(x)

## [1] 1 2 3

den(x) <- 11:13
x

## 1/11 2/12 3/13

Methods for Group Generic Functions

Can we add rational objects to each other?

x <- Rational(c(1, 2, 3), c(9, 5, 2))
y <- Rational(c(5, 2, 6), c(11, 1, 2))
x + y

## Error in x + y: non-numeric argument to binary operator

+ is not an S3 generic, it is a group generic (see ?S4groupGeneric for a full list of group generics used by S4).
S4 automatically allows for group generics to be used as S4 generics, and so we do not need to do that conversion via setGeneric("+").
+ takes arguments e1 and e2, which we will need to use when creating a method:
```
args(getGeneric("+"))
```
```
## function (e1, e2) 
## NULL
```

Let’s create a method for Rational addition.

setMethod(
  f = "+", 
  signature = "Rational", 
  definition = function(e1, e2) {
    num <- e1@num * e2@den + e2@num * e1@den
    den <- e1@den * e2@den
    return(new(Class = "Rational", num = num, den = den))
  }
)

Now let’s try it out
```
x + y
```
```
## 56/99 12/5 18/4
```

Now 1 is just 9/9 or 5/5 etc…, But we cannot just add an integer to a Rational?

x + 1L

## Error in x + 1L: trying to get slot "den" from an object of a basic class ("integer") with no slots

We need to define a method based on double dispatch, where one input is a Rational and one is an integer.

In the signature, you tell R which class each argument is.
You need to create two methods if you want commutivity. But the second one just calls the first one.

setMethod(f = "+", 
  signature = c(e1 = "Rational", e2 = "integer"), 
  definition = function(e1, e2) {
    num <- e1@num + e1@den * e2
    den <- e1@den
    return(new(Class = "Rational", num = num, den = den))
  }
)

setMethod(f = "+", 
  signature = c(e1 = "integer", e2 = "Rational"), 
  definition = function(e1, e2) e2 + e1
)

It works!

x + 2L

## 19/9 12/5 7/2

2L  + x

## 19/9 12/5 7/2

A shorthand to define all arithmetic operations is by the Arith generic

setMethod(
  f = "Arith", 
  signature = c(e1 = "Rational", e2 = "integer"), 
  definition = function(e1, e2) {
    op <- .Generic[[1]]
    if (op == "+") {
      num <- e1@num + e1@den * e2
      den <- e1@den
    } else if (op == "*") {
      num <- e1@num * e2
      den <- e1@den
    } else {
      stop("only +/* are defined right now")
    }
    return(new(Class = "Rational", num = num, den = den))
  }
)

setMethod(
  f = "Arith",
  signature = c(e1 = "integer", e2 = "Rational"),
  definition = function(e1, e2) {
    op <- .Generic[[1]]
    switch(op,
           "+" = e2 + e1,
           "*" = e2 * e1,
           stop("only +/* are defined right now")
    )
  }
)

x <- Rational(num = c(1L, 2L, 3L), den = c(4L, 5L, 6L))
x + 1L

## 5/4 7/5 9/6

x * 2L

## 2/4 4/5 6/6

2L * x

## 2/4 4/5 6/6

Above, .Generic is a variable defined inside the evaluation environment of the setMethod() definition. It contains the arithmetic operation applied by the user.

Documenting S4 objects

To document an S4 class, add {roxygen2} comments before setClass().
Use the @slot tag to document each slot, just like you would document each parameter of a function.
Make sure you also provide a @name tag of the form classname-class

Let’s document the Person class.

#' An S4 class representing a person
#' 
#' @slot name The name of the person.
#' @slot age The age of the person.
#' 
#' @name Person-class
setClass(
  Class = "Person",
  slots = c(
    name = "character",
     age = "numeric"
  ),
  prototype = list(
    name = character(),
    age = double()
  )
)

S4 generics are regular functions, so document them like usual.
You can use @describeIn to simplify documentation of S4 methods by combining them with the generic or the class documentation.
In an R package, the files are loaded in alphabetical order. However, for S4 objects, you need to load superclasses before you are allowed to create subclasses.
To force another file to load before the current file, put this at the top of the current file
```
#' @include foo.R bar.R
NULL
```
```
## NULL
```
The above says that foo.R and bar.R should be loaded before the current file.

Some Example Uses of S4 by Others

`{Matrix}`

The {Matrix} package provides classes for many types of matrices:
- sparse (mostly zeros)
- triangular (either the upper or lower triangle elements are all 0).]
- symmetric (upper and lower triangle elements are equal).
```
library(Matrix)
```
Many matrix operations can be more efficiently executed given knowledge of these structures.
The USCounties matrix is a 3111 by 3111 matrix where an element measures a spatial relationship between counties. It is non-zero only if counties are contiguous.
```
data("USCounties", package = "Matrix")
```
This is an S4 object of class dsCMatrix, used for handling sparse matrices (only 0.0094% of the values in this matrix are non-zero).
```
sloop::otype(USCounties)
```
```
## [1] "S4"
```
```
class(USCounties)
```
```
## [1] "dsCMatrix"
## attr(,"package")
## [1] "Matrix"
```
Almost every matrix operation you can imagine has been optimized for this sparsity:
```
nrow(sloop::s4_methods_class("dsCMatrix"))
```
```
## [1] 383
```

This makes things like matrix multiplication much faster:

densemat <- as.matrix(USCounties)
ones <- rep(1, length.out = ncol(densemat))
microbenchmark::microbenchmark(
  densemat %*% ones,
  USCounties %*% ones
)

## Unit: microseconds
##                 expr     min      lq    mean  median      uq   max neval
##    densemat %*% ones 13477.0 18498.5 19398.3 18974.8 19894.7 28210   100
##  USCounties %*% ones   114.8   221.4   327.3   279.9   413.7  1082   100

This application doesn’t really demonstrate its abilities. It can be even more orders of magnitude faster.

`{stats4}`

One application of S4 is in the {stats4} package for likelihood inference.
The likelihood is the kind of like the probability of the data that we observed given some parameter values. (not exactly for continuous data)
E.g., let’s say we observed heights (in inches)
```
x <- c(72, 66, 74, 64, 70)
```
and let’s assume that these data were generated from a normal distribution.
Then, given any two parameters of the mean (mu) and standard deviation (sigma), the likelihood of these parameters at each observation would be from dnorm()
```
dnorm(x = x, mean = mu, sd = sigma)
```
If the individuals that these heights are from are independent (e.g. they aren’t family members), then the likelihood of these parameters given all of the data would be
```
prod(dnorm(x = x, mean = mu, sd = sigma))
```

We can calculate this likelihood for a few values of mu and sigma

mu <- 72
sigma <- 3
prod(dnorm(x = x, mean = mu, sd = sigma))

## [1] 1.031e-07

mu <- 70
sigma <- 4
prod(dnorm(x = x, mean = mu, sd = sigma))

## [1] 1.04e-06

Typically, the parameters are not known (mu and sigma are unknown) and our goal is to estimate them when we only see x.
One way to do this is to find the mu and sigma that maximize this likelihood. The resulting estimators are called the maximum likelihood estimators (MLE’s). This is one of the most common methods for estimating parameters from a statistical model.
Intuition: the MLE’s are the parameters that make our observed data as probable to have observed as possible.

Typically, the likelihood produces very small numbers, so folks maximize the log-likelihood, which results in the same MLE’s since $\log()$ is a monotone increasing function.

ll <- sum(dnorm(x = x, mean = mu, sd = sigma, log = TRUE)) ## log likelihood
like <- prod(dnorm(x = x, mean = mu, sd = sigma)) ## likelihood

## they are the same
log(like)

## [1] -13.78

ll

## [1] -13.78

The {stats4} package provides a bunch of S4 objects and methods for dealing with MLE’s.
```
library(stats4)
```
To use the {stats4} package, first create a function that takes as input your parameters nothing else. It assumes that all data comes from the parent environment. It should output the negative log-likelihood.
```
#' @param mu The mean
#' @param sigma The standard deviation
neg_ll <- function(mu, sigma) {
  return(-sum(dnorm(x = x, mean = mu, sd = sigma, log = TRUE)))
}
```

Then, use the mle() function. Include some starting values, and place some limits on constrained parameters.

mlout <- stats4::mle(minuslogl = neg_ll, 
                     start = c(mu = 72, sigma = 1), 
                     lower = c(mu = -Inf, sigma = 0),
                     upper = c(mu = Inf, sigma = Inf))

This object is S4.

sloop::otype(mlout)

## [1] "S4"

class(mlout)

## [1] "mle"
## attr(,"package")
## [1] "stats4"

What generics are available for the objects of type mle?

sloop::s4_methods_class("mle")

## # A tibble: 9 × 4
##   generic class visible source
##   <chr>   <chr> <lgl>   <chr> 
## 1 coef    mle   TRUE    stats4
## 2 confint mle   TRUE    stats4
## 3 logLik  mle   TRUE    stats4
## 4 nobs    mle   TRUE    stats4
## 5 profile mle   TRUE    stats4
## 6 show    mle   TRUE    stats4
## 7 summary mle   TRUE    stats4
## 8 update  mle   TRUE    stats4
## 9 vcov    mle   TRUE    stats4

coef() will get you the parameter estimates

coef(mlout)

##     mu  sigma 
## 69.200  3.709

vcov() will you you the estimated variance/covariance matrix of the MLE’s (for their sampling distribution).
```
vcov(mlout)
```
```
##              mu     sigma
## mu    2.752e+00 1.085e-07
## sigma 1.085e-07 1.376e+00
```

So the standard errors are from

sqrt(diag(vcov(mlout)))

##    mu sigma 
## 1.659 1.173

confint() will get you confidence intervals

confint(mlout)

## Profiling...

##        2.5 % 97.5 %
## mu    65.210 73.190
## sigma  2.219  8.083

In the normal case, these estimates are the same as the sample mean and the sample variance (not adjusting for degrees of freedom).
```
mean(x)
```
```
## [1] 69.2
```
```
sqrt(var(x) * (length(x) - 1) / length(x))
```
```
## [1] 3.709
```
But the MLE is general to many more complicated applications.
Exercise: The number of heads in $n$ coin flips is binomially distributed with size $n$ and success probability $p$, where $p$ is the probability that a single flip turns up heads. The likelihood, given that x heads turned up, is
```
dbinom(x = x, size = n, prob = p)
```
When $x = 11$ and $n = 31$, calculate the MLE of $p$.

New Functions

methods::setClass(): Define a new S4 class.
methods::new(): Create a new object of a given S4 class.
methods::is(): See the class of an object (if you give it just the object), or test that an object is a certain class (if you give it the object and a class name).
methods::slot(): Extract the data from a slot. Equivalent to @ notation.
methods::setGeneric(): Create a new S4 generic function
methods::setMethod(): Create a method for a S4 generic.
methods::setValidity(): Enforce structure on an S4 object created by new().
methods::validObject(): Check validity of an S4 object based on setValidity() definition.
methods::setOldClass(): Register an S3 class to be an S4 class.
methods::setAs(): Define a coercion method.
methods::as(): Apply a coercion method.
methods::getGeneric(): Get generic function definition.
methods::getMethod(): Get method function definition.
base::args(): Show arguments of a function.
sloop::s4_methods_class(): Show methods of an S4 class.
sloop::s4_methods_generic(): Show methods of an S4 generic.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

S4 Objects

David Gerard

2022-03-22

Learning Objectives

Motivation

`{methods}` Package

S4 Classes

Inheritance

Helper Function

Validator Functions

S4 Generics and Methods

Getting Help

`show()` method

Accessors

Coercion

Converting from S3 so S4

Method Dispatch

Multiple inheritance

Multiple dispatch

Rational Vector Example

Methods for Group Generic Functions

Documenting S4 objects

Some Example Uses of S4 by Others

`{Matrix}`

`{stats4}`

New Functions

S4 Objects

David Gerard

2022-03-22

Learning Objectives

Motivation

{methods} Package

S4 Classes

Inheritance

Helper Function

Validator Functions

S4 Generics and Methods

Getting Help

show() method

Accessors

Coercion

Converting from S3 so S4

Method Dispatch

Multiple inheritance

Multiple dispatch

Rational Vector Example

Methods for Group Generic Functions

Documenting S4 objects

Some Example Uses of S4 by Others

{Matrix}

{stats4}

New Functions

`{methods}` Package

`show()` method

`{Matrix}`

`{stats4}`