Learning Objectives

Chapters 8, 10, 19, and 24 of Rcpp for Everyone
Chapter 25 of Advanced R
Rcpp Quick Reference Guide
Learning Objectives:
- Rcpp Vectors
- For-loops
- Basic Vector Operations

Rcpp Vectors

{Rcpp} provides classes of C++ vectors which make programming easier for folks used to R.
The classes are called LogicalVector, IntegerVector, NumericVector, and CharacterVector.
Note, in regular C++ (without Rcpp), you typically use arrays or std::vector objects, like std::vector<double>.

If you have a function that accepts and returns vectors, then you use those types for the arguments and the return value.

C++
```
NumericVector fn(IntegerVector x, bool y) {
  NumericVector z(x.size()); // placeholder: code to fill z goes here
  return z;
}
```

Creating Vectors

To create a vector of length n with every element initialized to 0, use

C++
```
NumericVector x(n);
```
To fill the vector with a default value, pass that value as the second argument:

C++
```
NumericVector x(n, 2.0);
```
If you want an IntegerVector from 0 to n - 1, probably the best way is through Rcpp::seq(0, n - 1). The standard library function std::iota() is an alternative.

C++
```
IntegerVector x = Rcpp::seq(0, n - 1);
```
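In plain C++ (no Rcpp), the same 0 to n - 1 sequence can be built with std::iota(). The following is only a minimal sketch: the function name zero_based_seq() is made up for illustration, and it assumes n >= 0.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: returns 0, 1, ..., n - 1 (assumes n >= 0).
std::vector<int> zero_based_seq(int n) {
  std::vector<int> out(n);
  // std::iota fills the range with consecutive values, starting at 0 here.
  std::iota(out.begin(), out.end(), 0);
  return out;
}
```

Calling zero_based_seq(5) gives the same values as Rcpp::seq(0, 4), just stored in a std::vector<int> rather than an IntegerVector.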
If you have specific elements you want to create the vector with, use curly braces (requires C++11):

C++
```
NumericVector x = {9.0, 1.2, 4.9};
```

For older versions of C++, use NumericVector::create().

C++
```
NumericVector x = NumericVector::create(9.0, 1.2, 4.9);
```

Let’s demonstrate these methods

C++
```
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void vec_create() {
  int n = 5;

  NumericVector x1(n);                    // length n, all zeros
  NumericVector x2(n, 2.0);               // length n, all 2.0
  CharacterVector x3(n, "A");             // length n, all "A"
  IntegerVector x4 = Rcpp::seq(0, n - 1); // 0, 1, ..., n - 1
  NumericVector x5 = {9.0, 1.2, 4.9};     // brace initialization (C++11)
  NumericVector x6 = NumericVector::create(9.0, 1.2, 4.9);

  Rcpp::Rcout << "x1: " << x1 << std::endl
              << "x2: " << x2 << std::endl
              << "x3: " << x3 << std::endl
              << "x4: " << x4 << std::endl
              << "x5: " << x5 << std::endl
              << "x6: " << x6 << std::endl;
}
```

R
```
vec_create()
```
```
## x1: 0 0 0 0 0
## x2: 2 2 2 2 2
## x3: "A" "A" "A" "A" "A"
## x4: 0 1 2 3 4
## x5: 9 1.2 4.9
## x6: 9 1.2 4.9
```

Accessing Elements

You get and set individual elements of a vector using brackets [], just like in R.
Note that in C++, indexing starts at 0.

One more time for emphasis, in C++ indexing starts at 0.

C++
```
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void subset_example(NumericVector x) {
  // Read the first three elements (indices 0, 1, 2).
  Rcpp::Rcout << "x[0]:" << x[0] << std::endl
              << "x[1]:" << x[1] << std::endl
              << "x[2]:" << x[2] << std::endl;

  // Overwrite them.
  x[0] = 21;
  x[1] = 22;
  x[2] = 23;

  Rcpp::Rcout << "x[0]:" << x[0] << std::endl
              << "x[1]:" << x[1] << std::endl
              << "x[2]:" << x[2] << std::endl;
}
```

R
```
subset_example(c(14, 38, 29))
```
```
## x[0]:14
## x[1]:38
## x[2]:29
## x[0]:21
## x[1]:22
## x[2]:23
```
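
Since R users are used to counting from 1, an off-by-one slip is an easy mistake to make when indexing. Below is a small sketch in plain C++ (using std::vector<double> instead of an Rcpp vector so that it compiles without Rcpp). The helper get_one_based() is hypothetical, not part of any library: it takes an R-style index, shifts it by one, and uses at(), which throws std::out_of_range instead of silently reading past the end.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: look up element i, where i counts from 1 as in R.
double get_one_based(const std::vector<double>& x, std::size_t i) {
  if (i == 0) {
    throw std::out_of_range("one-based index must be at least 1");
  }
  return x.at(i - 1); // at() checks bounds; [] does not
}
```

With the vector 14, 38, 29 from above, get_one_based(x, 1) returns 14, which is the element that C++ calls x[0].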

Rcpp Vector Methods

A method is a function that is attached to an object. In C++, methods are accessed with a . (just like Python).
Rcpp vectors have a lot of useful methods.
length() or size(): Get total number of elements in the vector. These are equivalent.

C++
```
x.length();
```
fill(): Fill all elements of a vector with a scalar value.

C++
```
x.fill(2.0);
```

sort(): Sort the vector in place and return it.

C++
```
x.sort(false); // ascending order
x.sort(true); // descending order
```
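
For intuition, here is roughly what ascending and descending sorting look like in plain C++ with std::sort(). This is only a sketch on std::vector<double>: sorted_copy() is a made-up name, it works on a copy of its input, and it makes no attempt to handle missing values.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper: returns a sorted copy of x.
// x is passed by value, so the caller's vector is left unchanged.
std::vector<double> sorted_copy(std::vector<double> x, bool decreasing = false) {
  if (decreasing) {
    // std::greater flips the comparison, giving descending order.
    std::sort(x.begin(), x.end(), std::greater<double>());
  } else {
    std::sort(x.begin(), x.end());
  }
  return x;
}
```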

begin() and end() return iterators pointing to the first element and to one past the last element of the vector, respectively (we’ll talk about this later).

Let’s demonstrate these.

C++
```
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void method_example() {
  NumericVector x = {1.0, 3.0, 5.0};
  Rcpp::Rcout << "x            : " << x << std::endl;

  Rcpp::Rcout << "x.length()   : " << x.length() << std::endl;

  Rcpp::Rcout << "x.sort(false): " << x.sort(false) << std::endl;
  Rcpp::Rcout << "x.sort(true) : " << x.sort(true) << std::endl;

  x.fill(2.0);
  Rcpp::Rcout << "x.fill(2.0)  : " << x << std::endl;
}
```

R
```
method_example()
```
```
## x            : 1 3 5
## x.length()   : 3
## x.sort(false): 1 3 5
## x.sort(true) : 5 3 1
## x.fill(2.0)  : 2 2 2
```

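NumericVectors have methods to add and remove values (such as push_back() and erase()), but Dirk Eddelbuettel, the Rcpp maintainer, says this is not a good idea: each call copies the whole vector. If you need to grow or shrink a vector, use a std::vector<double> object, which does this efficiently.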
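
As a sketch of that advice, the function below grows a std::vector<double> with push_back(). The name keep_positive() is hypothetical. When a function like this is exported with Rcpp attributes, std::vector<double> arguments and return values are converted to and from R numeric vectors, at the cost of a copy.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the strictly positive elements of x, in order.
std::vector<double> keep_positive(const std::vector<double>& x) {
  std::vector<double> out;
  out.reserve(x.size()); // optional: avoids reallocations while growing
  for (std::size_t i = 0; i < x.size(); i++) {
    if (x[i] > 0.0) {
      out.push_back(x[i]); // the output length is not known in advance
    }
  }
  return out;
}
```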

For-loops

All for-loops in C++ are of the form

C++
```
for (s1; s2; s3) {
  // code executed each iteration
}
```
- s1 is code that is executed once before the for-loop. Usually this declares an integer index, such as int i = 0.
- s2 is a predicate that, when true, allows the for-loop to execute another iteration. Usually, this is i < n if you want to go through the for-loop n times.
- s3 is code that is evaluated at the end of each iteration. This is usually i++ to add 1 to i.

Almost all for-loops you write will look like this:

C++
```
for (int i = 0; i < n; i++) {

}
```
- We define i as an integer, initializing it to be 0.
- We only exit the for-loop if i >= n.
- At the end of each iteration we add 1 to i.
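
The three parts do not have to be int i = 0, i < n, and i++. As a small sketch (plain C++ on a std::vector<double>, with the made-up name sum_every_other()), here s3 is i += 2, so the loop only visits indices 0, 2, 4, and so on.

```cpp
#include <vector>

// Hypothetical helper: sums the elements at indices 0, 2, 4, ...
double sum_every_other(const std::vector<double>& x) {
  int n = static_cast<int>(x.size());
  double total = 0.0;
  for (int i = 0; i < n; i += 2) { // s3 adds 2 to i instead of 1
    total += x[i];
  }
  return total;
}
```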
Exercise: In the above for-loop, if n = 10, how many iterations of the for-loop will run, 9 or 10?

Let’s recreate sum() using Rcpp.

C++
```
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
double sum2(NumericVector x) {
  int n = x.length();
  double sval = 0.0; // running total
  for (int i = 0; i < n; i++) {
    sval += x[i];
  }
  return sval;
}
```

R
```
x <- runif(100)
sum(x)
```
```
## [1] 50.35
```
```
sum2(x)
```
```
## [1] 50.35
```

Here are the microbenchmarks:

| expression | min | median | itr/sec | mem_alloc | gc/sec | n_itr | n_gc | total_time |
|---|---|---|---|---|---|---|---|---|
| sum(x) | 339.93ns | 387.9ns | 2281833 | 0B | 0 | 10000 | 0 | 4.38ms |
| sum2(x) | 1.72µs | 3.76µs | 308930 | 18.2KB | 0 | 10000 | 0 | 32.37ms |

In R for-loops, you specify the vector you iterate over. In C++, you specify the exit condition and the code to run at the end of each iteration.
In effect, for n >= 1, for(i in 1:n) in R and for(int i = 0; i < n; i++) in C++ both run n iterations, but i goes from 1 to n in R and from 0 to n - 1 in C++.

Aliasing

If you assign an Rcpp vector x to another object y using =, then the value of x is not copied to y. Rather, y becomes an alias for x.
This means that if you edit x, then that will edit y. And if you edit y then that will edit x.

You can use Rcpp::clone() to make a copy.

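C++
```
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void copy_example() {
  NumericVector x = {1, 2, 3};
  Rcpp::Rcout << x << std::endl;

  // cloning: z is an independent copy, so x is unchanged
  NumericVector z = Rcpp::clone(x);
  z[0] = 13;
  Rcpp::Rcout << x << std::endl;

  // aliasing: y refers to the same data, so x changes too
  NumericVector y = x;
  y[1] = 10;
  Rcpp::Rcout << x << std::endl;
}
```

R
```
copy_example()
```
```
## 1 2 3
## 1 2 3
## 1 10 3
```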

This is because x is bound to a pointer to the vector, not to the vector itself. So copying x to y just copies the pointer.
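
The pointer idea can be mimicked in plain C++. The sketch below is an analogy only, not how Rcpp is implemented: it uses std::shared_ptr as a stand-in for the "pointer to the vector", and the names Handle, edit_through_alias(), and edit_through_copy() are made up for illustration.

```cpp
#include <memory>
#include <vector>

// Analogy only: a "handle" is a pointer to the data, not the data itself.
using Handle = std::shared_ptr<std::vector<double>>;

// Copying the handle copies the pointer, so both names see the edit.
double edit_through_alias() {
  Handle x = std::make_shared<std::vector<double>>(3, 1.0); // 1 1 1
  Handle y = x;    // alias: y points at the same vector as x
  (*y)[1] = 10.0;
  return (*x)[1];  // the edit made through y is visible through x
}

// Copying the pointed-to vector gives independent data, like Rcpp::clone().
double edit_through_copy() {
  Handle x = std::make_shared<std::vector<double>>(3, 1.0); // 1 1 1
  Handle z = std::make_shared<std::vector<double>>(*x);     // deep copy
  (*z)[0] = 13.0;
  return (*x)[0];  // x is untouched by the edit to z
}
```

Here edit_through_alias() returns 10 and edit_through_copy() returns 1, matching the behaviour of y and z in copy_example().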

Missing Values and Infinity

Infinity is encoded using R_PosInf and R_NegInf.
Missing values are encoded using NA_REAL, NA_INTEGER, NA_LOGICAL, and NA_STRING.
You can include these in Rcpp vectors without worrying too much.

C++
```
Rcpp::NumericVector x(10, NA_REAL);
```
E.g. if you use Rcpp Sugar then Rcpp will understand how to propagate missing values appropriately.
But if you try to use these missing values as scalars, you have to be very careful.
NA_INTEGER is the smallest value an int can hold (INT_MIN, which is -2147483648 for a 32-bit int), so NA_INTEGER + 1 would no longer be considered missing data, e.g.

R
```
Rcpp::evalCpp("NA_INTEGER")
```
```
## [1] NA
```
```
Rcpp::evalCpp("NA_INTEGER + 1")
```
```
## [1] -2147483647
```
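
One way to stay safe with scalars is to check for the missing value before doing arithmetic. The sketch below is plain C++ so that it compiles without Rcpp: kNaInteger is a stand-in for NA_INTEGER (assuming a 32-bit int), and add_with_na() is a hypothetical helper, not an Rcpp function.

```cpp
#include <limits>

// Stand-in for R's NA_INTEGER (assumption: 32-bit int, so -2147483648).
const int kNaInteger = std::numeric_limits<int>::min();

// Hypothetical helper: add two ints, returning the NA stand-in if either is NA.
int add_with_na(int a, int b) {
  if (a == kNaInteger || b == kNaInteger) {
    return kNaInteger; // propagate missingness by hand
  }
  return a + b; // note: this sketch does not guard against overflow
}
```

Without the check, kNaInteger + 1 is just the ordinary integer -2147483647, as in the evalCpp() output above.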
Coercing NA_LOGICAL to a bool evaluates to true: NA_LOGICAL is stored as a nonzero integer, and bool does not allow for anything except true and false.

R
```
Rcpp::evalCpp("(bool)NA_LOGICAL")
```
```
## [1] TRUE
```
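
So a three-valued logical has to be tested for NA before it is treated as true or false. A minimal plain C++ sketch follows, where kNaLogical is a stand-in for NA_LOGICAL (assuming it is stored as the smallest 32-bit int) and describe_logical() is a made-up name.

```cpp
#include <limits>
#include <string>

// Stand-in for R's NA_LOGICAL (assumption: stored as the smallest int).
const int kNaLogical = std::numeric_limits<int>::min();

// Hypothetical helper: label an R-style logical stored as an int.
std::string describe_logical(int v) {
  if (v == kNaLogical) {
    return "NA"; // must be checked first: NA is nonzero, so it would read as true
  }
  return v ? "TRUE" : "FALSE";
}
```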

Exercises

(from Advanced R) For each of the following functions, read the code and figure out what the corresponding base R function is. You might not understand every part of the code yet, but you should be able to figure out the basics of what the function does.

C++
```
double f1(NumericVector x) {
  int n = x.size();
  double y = 0;

  for(int i = 0; i < n; ++i) {
    y += x[i] / n;
  }
  return y;
}
```

C++
```
NumericVector f2(NumericVector x) {
  int n = x.size();
  NumericVector out(n);

  out[0] = x[0];
  for(int i = 1; i < n; ++i) {
    out[i] = out[i - 1] + x[i];
  }
  return out;
}
```

C++
```
bool f3(LogicalVector x) {
  int n = x.size();

  for(int i = 0; i < n; ++i) {
    if (x[i]) return true;
  }
  return false;
}
```

C++
```
int f4(Function pred, List x) {
  int n = x.size();

  for(int i = 0; i < n; ++i) {
    LogicalVector res = pred(x[i]);
    if (res[0]) return i + 1;
  }
  return 0;
}
```

C++
```
NumericVector f5(NumericVector x, NumericVector y) {
  int n = std::max(x.size(), y.size());
  NumericVector x1 = rep_len(x, n);
  NumericVector y1 = rep_len(y, n);

  NumericVector out(n);

  for (int i = 0; i < n; ++i) {
    out[i] = std::min(x1[i], y1[i]);
  }

  return out;
}
```

Create a function in C++ called fib() that will take as input n and return the first n Fibonacci numbers, where the sequence begins with 0, 1, 1, 2, 3, 5, 8, 13, …

E.g.

R
```
fib(1)
```
```
## [1] 0
```
```
fib(2)
```
```
## [1] 0 1
```
```
fib(10)
```
```
##  [1]  0  1  1  2  3  5  8 13 21 34
```

(from Advanced R) Convert the following functions into C++. For now, assume the inputs have no missing values. Try not to use Rcpp Sugar (e.g. Rcpp::min()).

- all().
- cumprod(), cummin(), cummax().
- diff(). Start by assuming lag 1, and then generalize for lag n.
- range().

all_cpp() output examples:

all_cpp(c(TRUE, TRUE, FALSE))

## [1] FALSE

all_cpp(TRUE)

## [1] TRUE

all_cpp(FALSE)

## [1] FALSE

all_cpp(c(TRUE, TRUE))

## [1] TRUE

cumprod_cpp() output examples

x <- 1:4
cumprod_cpp(x)

## [1]  1  2  6 24

x <- double()
cumprod_cpp(x)

## numeric(0)

x <- 3
cumprod_cpp(x)

## [1] 3

cummin_cpp() output examples:

cummin_cpp(double())

## numeric(0)

cummin_cpp(3)

## [1] 3

cummin_cpp(c(10, 11, 3, 4, 5, 1, 2))

## [1] 10 10  3  3  3  1  1

cummax_cpp() output examples:

cummax_cpp(double())

## numeric(0)

cummax_cpp(3)

## [1] 3

cummax_cpp(c(10, 11, 3, 4, 5, 22, 2))

## [1] 10 11 11 11 11 22 22

diff_cpp() output examples:

diff_cpp(x = c(1, 5, 10, 20), lag = 1)

## [1]  4  5 10

diff_cpp(x = c(1, 5, 10, 20), lag = 2)

## [1]  9 15

diff_cpp(x = c(1, 5, 10, 20), lag = 3)

## [1] 19

diff_cpp(x = c(1, 5, 10, 20), lag = 4)

## numeric(0)

range_cpp() output examples:

range_cpp(c(9, 11, 22, 12, 1))

## [1]  1 22

range_cpp(8)

## [1] 8 8

range_cpp(double())

## [1] -Inf  Inf

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

C++ Vectors

David Gerard

2022-02-23