Learning Objectives

Armadillo Docs
Learning Objectives:
- Using Armadillo for linear algebra

Integration with R

If you need to use numeric matrices or arrays in C++, you should use the Armadillo C++ library.
Eigen is also a really great option, but appears to be less popular in the R community. You can read the docs here.
Armadillo contains fast linear algebra algorithms, as well as a nice interface for working with matrices.
You can set up an R package to use Armadillo by running the following R:

R
```
usethis::use_rcpp_armadillo()
```
This will add RcppArmadillo to the LinkingTo field in the DESCRIPTION file. It will also create Makevars files to set compilation settings.

At the top of each C++ file, you now include the following (instead of just including Rcpp):

C++

#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
using namespace arma;

Armadillo Vectors

If you are using Armadillo for matrices, you should be using Armadillo vectors instead of NumericVectors, so that it is easier to interact between matrices and vectors.
Armadillo vectors are just like NumericVector’s with some minor differences. That is, there are no character or logical Armadillo vectors.
Armadillo vectors are called Col objects (for “column vector”).
Initialize with

C++
```
arma::vec x(n_elem)
```

You can specify a default value for the vector with

C++

// fill with zeros
arma::vec x(n_elem, arma::fill::zeros);

// fill with ones
arma::vec y(3, arma::fill::ones);

// fill with specified value
arma::vec z(3, arma::fill::value(23.0));

You can create custom vectors with braces

C++
```
arma::vec w = {4.3, 2.2, -9.4};
```

Let’s demonstrate:

// [[Rcpp::export]]
void ex1() {
  vec x(3, fill::zeros);
  Rcpp::Rcout << x << std::endl;

  vec y(3, fill::ones);
  Rcpp::Rcout << y << std::endl;

  vec z(3, fill::value(23.0));
  Rcpp::Rcout << z << std::endl;

  vec w = {4.3, 2.2, -9.4};
  Rcpp::Rcout << w << std::endl;
}

ex1()

##         0
##         0
##         0
## 
##    1.0000
##    1.0000
##    1.0000
## 
##    23.0000
##    23.0000
##    23.0000
## 
##    4.3000
##    2.2000
##   -9.4000

Of course, you typically get Armadillo vectors from a user.

C++

// [[Rcpp::export]]
vec ex2(vec x) {
  return x;
}

ex2(c(1, 4, 2))

##      [,1]
## [1,]    1
## [2,]    4
## [3,]    2

If you return an Armadillo vector, then R will interpret this as a matrix object with one column. But you can pass R vectors as inputs to Col objects.

You can get and set elements using parentheses notation (). You can use brackets as well, but this is more dangerous since Armadillo then won’t do a bounds check. So you should always get and set elements by parentheses.

C++

// [[Rcpp::export]]
void ex3() {
  vec x = {6.2, 5.3, 1.1};
  Rcpp::Rcout << "x(0): " << x(0) << std::endl;
  double y = x(1);
  Rcpp::Rcout << "y: " << y << std::endl;
  x(2) = 8.9;
  Rcpp::Rcout << "x(2): " << x(2) << std::endl;
}

ex3()

## x(0): 6.2
## y: 5.3
## x(2): 8.9

You get the number elements in a vector with

C++
```
x.n_elem
```
You do arithmetic operations on vectors as you would in R, but note that element-wise multiplication is different.

C++
```
x + y
x - y
x % y // element-wise multiplication
x / y
```

You can do lots of miscellaneous element-wise functions (table from Armadillo docs):

Armadillo	Explanation
`exp(A)`	base-e exponential: \(e^x\)
`exp2(A)`	base-2 exponential: \(2^x\)
`exp10(A)`	base-10 exponential: \(10^x\)
`expm1(A)`	compute \(\exp(A)-1\) accurately for values of A close to zero
`log(A)`	natural log: \(\log_e(x)\)
`log2(A)`	base-2 log: \(\log_2(x)\)
`log10(A)`	base-10 log: \(\log_{10}(x)\)
`log1p(A)`	compute \(\log(1+A)\) accurately for values of A close to zero
`pow(A, p)`	raise to the power of p: \(x^p\)
`square(A)`	square: \(x^2\)
`sqrt(A)`	square root: \(\sqrt{x}\)
`floor(A)`	largest integral value that is not greater than the input value
`ceil(A)`	smallest integral value that is not less than the input value
`round(A)`	round to nearest integer, with halfway cases rounded away from zero
`trunc(A)`	round to nearest integer, towards zero
`lgamma(A)`	natural log of the absolute value of gamma function
`sign(A)`	sign function

The usual statistics functions are available, but note the different name for standard deviation: mean(), median(), stddev(), var(), range()
Exercise: Implement cumprod(), cummin(), cummax(), and diff() using Armadillo vectors instead of NumericVectors. You can copy and paste your solutions from the exercises in the Rcpp vectors lecture and then modify them to use armadillo vectors.

Hint: You can make a length zero Col object by

C++
```
vec x(1);
x.clear();
```

Armadillo Matrices

Armadillo calls matrix objects Mat. They contain doubles.
You initialize Armadillo matrices with

C++
```
arma::max A(n_rows, n_cols);
```

You can specify a default value for the vector with

C++

// fill with zeros
arma::mat A1(n_rows, n_cols, arma::fill::zeros);

// fill with ones
arma::mat A2(n_rows, n_cols, arma::fill::ones);

// fill with specified value
arma::mat A3(n_rows, n_cols, arma::fill::value(23.0));

You can create custom matrices with braces. Internal braces are the rows.

C++

// Three rows, two columns
arma::mat A4 = {{4.3, 2.2}, {-9.4, 7.0}, {10.0, 11.0}};

Let’s demonstrate these

C++

// [[Rcpp::export]]
void ex4() {
  int n_rows = 3;
  int n_cols = 2;

  arma::mat A1(n_rows, n_cols, arma::fill::zeros);
  Rcpp::Rcout << A1 << std::endl;

  arma::mat A2(n_rows, n_cols, arma::fill::ones);
  Rcpp::Rcout << A2 << std::endl;

  arma::mat A3(n_rows, n_cols, arma::fill::value(23.0));
  Rcpp::Rcout << A3 << std::endl;

  arma::mat A4 = {{4.3, 2.2}, {-9.4, 7.0}, {10.0, 11.0}};
  Rcpp::Rcout << A4 << std::endl;
}

ex4()

##         0        0
##         0        0
##         0        0
## 
##    1.0000   1.0000
##    1.0000   1.0000
##    1.0000   1.0000
## 
##    23.0000   23.0000
##    23.0000   23.0000
##    23.0000   23.0000
## 
##     4.3000    2.2000
##    -9.4000    7.0000
##    10.0000   11.0000

Element-wise arithmetic operations are the same as for vectors:

C++

A + B;
A - B;
A % B; // element-wise multiplication
A / B;

Element-wise functions (e.g. log() and exp()) are the exact same as in vectors.

You get and set elements by parentheses notation. Always remember that indexing starts at 0.

C++

A(0, 2); // get (0,2)th element
double x = A(4, 2); // get (4,2)th element and assign to x
A(1, 0) = 12.0; // set (1,0)th element

Let’s demonstrate these

C++

// [[Rcpp::export]]
void ex5() {
  mat A = {{1.0, 2.0, 3.0}, {4.0, 5.0, 6.0}};

  Rcpp::Rcout << "A: " << A << std::endl;

  Rcpp::Rcout << "A(0, 2): " << A(0, 2) << std::endl;

  A(1, 0) = 12.0;
  Rcpp::Rcout << "A(1, 0): " << A(1, 0) << std::endl;

  A = A + 2.0;
  Rcpp::Rcout << "A: " << A << std::endl;
}

ex5()

## A:    1.0000   2.0000   3.0000
##    4.0000   5.0000   6.0000
## 
## A(0, 2): 3
## A(1, 0): 12
## A:     3.0000    4.0000    5.0000
##    14.0000    7.0000    8.0000

You can extract columns and rows with A.col() and A.row(s). When you do this, it uses value semantics, not reference semantics, so you can edit any newly defined Armadillo vectors without modifying A. You can also set columns with A.col(col_num) = x.

C++

// [[Rcpp::export]]
void ex6() {
  mat A = {{1.0, 2.0}, {3.0, 4.0}, {5.0, 6.0}};
  Rcpp::Rcout << "A:" << std::endl << A << std::endl;

  vec x = A.col(1); // second column
  Rcpp::Rcout << "x:" << std::endl << x << std::endl;

  x(0) = 10.0;
  Rcpp::Rcout << "x:" << std::endl << x << std::endl; // changed
  Rcpp::Rcout << "A:" << std::endl << A << std::endl; // unchanged

  vec a = {7.0, 8.0, 9.0};
  A.col(0) = a;
  Rcpp::Rcout << "A:" << std::endl << A << std::endl; // unchanged
}

ex6()

## A:
##    1.0000   2.0000
##    3.0000   4.0000
##    5.0000   6.0000
## 
## x:
##    2.0000
##    4.0000
##    6.0000
## 
## x:
##    10.0000
##     4.0000
##     6.0000
## 
## A:
##    1.0000   2.0000
##    3.0000   4.0000
##    5.0000   6.0000
## 
## A:
##    7.0000   2.0000
##    8.0000   4.0000
##    9.0000   6.0000

Number of rows/columns/elements from a Mat object:
- A.n_elem: Number of elements in matrix A.
- A.n_rows: Number of rows in matrix A.
- A.n_cols: Number of columns in matrix A.
Generate matrix of ones.

C++
```
mat A = ones<mat>(n_rows, n_cols);
```
Generate \(n\times n\) identity matrix.

C++
```
mat id = eye(n_rows, n_cols);
```
Diagonal matrix, where v is a Col that you already have defined.

C++
```
mat d = diagmat(v);
```

Let’s demonstrate making these special matrices:

C++

// [[Rcpp::export]]
void ex7() {
  mat A = ones<mat>(2, 3);
  Rcpp::Rcout << "A: " << std::endl << A << std::endl;

  mat id = eye(4, 4);
  Rcpp::Rcout << "id: " << std::endl << id << std::endl;

  vec v = {3.0, 2.2, 5.5};
  mat d = diagmat(v);
  Rcpp::Rcout << "d: " << std::endl << d << std::endl;
}

ex7()

## A: 
##    1.0000   1.0000   1.0000
##    1.0000   1.0000   1.0000
## 
## id: 
##    1.0000        0        0        0
##         0   1.0000        0        0
##         0        0   1.0000        0
##         0        0        0   1.0000
## 
## d: 
##    3.0000        0        0
##         0   2.2000        0
##         0        0   5.5000

Linear Algebra Operations

Rcpp has LogicalMatrix, IntegerMatrix, NumericMatrix, and CharacterMatrix objects that I am not covering.
This is because you mostly use C++ for numerical computing, so it’s less usual to use a CharacterMatrix in C++ (I would say you are crazy for trying to use C++ when you are mostly dealing with strings). And for numerical computing, linear algebra is king, and Armadillo has nice linear algebra routines.
Matrix multiplication:

C++
```
A * B;
```
Matrix transpose:

C++
```
A.t();
```
Solving linear systems: Solving for \(X\) in \[ AX = B \] results in \[ X = A^{-1}B \] solve() will calculate \(A^{-1}B\) numerically stably.

C++
```
solve(A, B);
```
Matrix inverse:

C++
```
inv(A);
```
Moore-Penrose inverse:

C++
```
pinv(A);
```
Extracting diagonal:

C++
```
A.diag();
```
Matrix trace:

C++
```
trace(A);
```
Cholesky Decomposition: Find an upper triangular matrix \(R\) with non-negative diagonal elements such that \(X = R^TR\). Applicable when \(X\) is symmetric.

C++
```
mat R = chol(X);
```
Eigendecomposition: Find orthogonal matrix \(U\) and diagonal matrix \(D\) such that \(X = UDU^T\). Applicable when \(X\) is symmetric.

You need to create (i) a vector containing the main diagonal of \(D\) and (ii) a matrix that will hold the eigenvectors. You do that ahead of time. Then you run eig_sym().

C++
```
vec eigval; // main diagonal of D matrix
mat eigvec; // U matrix
eig_sym(eigval, eigvec, B);
```
Singular Value Decomposition: Find orthogonal matrices \(U\) and \(V\), and diagonal matrix \(D\) such that \(X = UDV^T\).

You need to create (i) a vector containing the main diagonal of \(D\) and (ii) matrices that will hold the left and right singular vectors. You do that ahead of time. Then you run svd().

C++
```
mat U;
vec s;
mat V;
svd(U,s,V,X);
```
QR Decomposition: Find an orthogonal matrix \(Q\) and an upper triangular matrix \(R\) with non-negative diagonal entries, such that \(X = QR\).

C++
```
mat Q;
mat R;
qr(Q, R, X);
```
Sometimes, you want to convert a \(1\times 1\) matrix to a double. This usually happens when you do a dot product. You can do that with arma::as_scalar().

C++
```
vec x = {1.0, 2.0, 3.0};
double ss = as_scalar(x.t() * x);
```

Example: This isn’t the most efficient way to find the OLS estimates from a regression, but we can just calculate

\[ \hat{\beta} = (X^TX)^{-1}X^Ty \]

Let’s do this in C++:

C++

// [[Rcpp::export]]
vec stupid_ols(mat X, vec y) {
  vec betahat = inv(X.t() * X) * X.t() * y;
  return betahat;
}

Let’s verify our results.

## Armadillo Way
X <- model.matrix(mpg ~ wt, data = mtcars)
y <- mtcars$mpg
stupid_ols(X = X, y = y)

##        [,1]
## [1,] 37.285
## [2,] -5.344

## R Way
coef(lm(mpg ~ wt, data = mtcars))

## (Intercept)          wt 
##      37.285      -5.344

I called it stupid_ols() because calculating an inverse is generally a computationally unstable task (but in this scenario, it’s probably OK). In numerical computing, you try to avoid calculating an inverse unless you are returning the inverse to a user. In which case, you should calculate the inverse only during the final step.

It’s better to use solve() here, but folks have thought very hard about this seemingly simple problem (Farebrother 1988):

C++

// [[Rcpp::export]]
vec arma_ols(mat X, vec y) {
  vec betahat = solve(X.t() * X, X.t() * y);
  return betahat;
}

arma_ols(X = X, y = y)

##        [,1]
## [1,] 37.285
## [2,] -5.344

Let’s compare performance when we have a large number of covariates (note: lm() returns a lot more stuff, so this only says that when we only want coefficient estimates we can do better).

X <- model.matrix(mpg ~ ., data = mtcars)
y <- mtcars$mpg

bench::mark(
  coef(lm(mpg ~ wt, data = mtcars)),
  stupid_ols(X = X, y = y),
  arma_ols(X = X, y = y), 
  check = FALSE
)[1:8] |>
  knitr::kable()

expression	min	median	itr/sec	mem_alloc	gc/sec	n_itr	n_gc
coef(lm(mpg ~ wt, data = mtcars))	402.12µs	419.75µs	2262	5.18KB	21.92	1032	10
stupid_ols(X = X, y = y)	5.44µs	6.63µs	149921	18.2KB	0.00	10000	0
arma_ols(X = X, y = y)	7.89µs	9.21µs	107100	6.62KB	10.71	9999	1

Exercise: The fitted values are defined as \[ \hat{y} = X\hat{\beta} = X(X^TX)^{-1}X^Ty. \] The residuals are defined as \[ e = y - \hat{y} = (I_n - X(X^TX)^{-1}X^T)y. \] The MSE is defined as \[ \hat{\sigma}^2 = e^Te / (n - p), \] where \(n\) is the samples size (number of rows in \(X\)) and \(p\) is the number of parameters (number of columns in \(X\)). The estimated covariance matrix of \(\hat{\beta}\) is \[ \hat{\sigma}^2 (X^TX)^{-1}. \] Implement all of these equations in Armadillo in one function and return the result as a list. Compare your calculations to the result of stats::lm().

R to Armadillo Conversion Table

In the below table, A and B are a matrices and Q is a three-dimensional array (or “cube”). This table is based on the Matlab/Octave to Armadillo conversion table here.

R	Armadillo	Notes
`A[1, 1]`	`A(0, 0)`	indexing in Armadillo starts at 0
`A[k, k]`	`A(k-1, k-1)`
`nrow(A)`	`A.n_rows`	read only
`ncol(A)`	`A.n_cols`
`dim(A)[[3]]`	`Q.n_slices`	`Q` is a cube (3D array)
`length(A)`	`A.n_elem`
`A[, k]`	`A.col(k)`	this is a conceptual example only; exact conversion from R to Armadillo syntax will require taking into account that indexing starts at 0
`A[k, ]`	`A.row(k)`
`A[, p:q]`	`A.cols(p,q)`
`A[p:q, ]`	`A.rows(p,q)`
`A[p:q, r:s]`	`A(span(p,q),span(r,s))`	A(span(first_row,last_row), span(first_col,last_col))
`Q[, , k]`	`Q.slice(k)`	`Q` is a cube (3D array)
`Q[, , t:u]`	`Q.slices(t,u)`
`Q[p:q, r:s, t:u]`	`Q(span(p,q),span(r,s),span(t,u))`
`t(A)`	`A.t()`or`trans(A)`	matrix transpose
`A[] <- 0`	`A.zeros()`	Fill a matrix with 0’s
`A[] <- 1`	`A.ones()`	Fill a matrix with 1’s
`A <- matrix(0, nrow = k, ncol = k)`	`A = zeros<mat>(k,k)`	Initialize a 0 matrix
`A <- matrix(1, nrow = k, ncol = k)`	`A = ones<mat>(k,k)`	Initialize a 1 matrix
`A * B`	`A % B`	element-wise multiplication
`A / B`	`A / B`	element-wise division
`solve(A, B)`	`solve(A, B)`	Solve linear equation \(Ax = B\) for \(A\)
`A <- A + 1;`	`A++`
`A <- A - 1;`	`A--`
`A <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)`	`A = {{1,2}, {3,4}}`	element initialization
`X <- c(A)`	`X = vectorise(A)`
`X = cbind(A, B)`	`X = join_horiz(A, B)`
`X = rbind(A, B)`	`X = join_vert(A, B)`
`A`	`cout << A << endl` or `A.print("A=")`