Compiled C and C++ routines can be called from R through its built-in foreign-function interface.


R objects passed to these routines have type SEXP. A SEXP is a pointer to an encapsulated structure that holds the object’s type, value, and other attributes used by the R interpreter.


The R application programming interface (API) provides a limited set of macros and C routines for manipulating SEXPs and calling R functions.


The R API operates at a low level of abstraction, so even simple tasks may require writing lengthy boilerplate code.

Using the R API from C++ is especially uncomfortable, because it takes no advantage of C++’s features.

Rcpp is an R package that makes it easier to interface R and C++ code. Rcpp does this by providing a set of C++ wrapper classes for common R data types, as well as tools for automating the process of compiling and loading C++ routines for R.


Create a blank text file and enter the code:


#include <Rcpp.h>

// [[Rcpp::export]]
void hello() {
    // Print a greeting on the R console.
    Rprintf("Hello, world!\n");
}


Save the file as hello.cpp.


Rprintf() is part of the R API. Its syntax is the same as that of printf.

Time to test the code! Start R and enter the commands:



library(Rcpp)
sourceCpp("hello.cpp")   # compile and load hello.cpp
hello()


You should see “Hello, world!” printed on the R console.

3.The Rcpp Interface

3.1 Data Structures

Most of Rcpp’s functionality is provided through a set of C++ classes that wrap R data structures. A few of them are:

        • IntegerVector, NumericVector, LogicalVector, CharacterVector


        • List, DataFrame


        • Named, Dimension


        • IntegerMatrix, NumericMatrix


        • Function


        • Environment


Memory management is handled automatically by the class constructors and destructors. These classes also have methods that mimic various R functions. A few of the most useful methods are:

        • isNULL


        • attributeNames, hasAttribute, attr


        • length, nrow, ncol        


The vector and list classes have constructors that accept the number of elements as a parameter, similar to their counterparts in R.


The helper class Dimension can be used to create a multidimensional vector:


        // Create a 2-by-3-by-4 vector.
        NumericVector a = NumericVector( Dimension(2, 3, 4) );

The vector classes also have a static create method for specifying the elements of the new vector. The helper class Named represents named vector elements. For instance,


        IntegerVector q1_days = IntegerVector::create(
                Named("January") = 31,
                Named("February") = 28,
                Named("March") = 31
        );


creates an integer vector with 3 named elements.


3.3 Other Details

Rcpp converts R objects to and from C++ objects with the templated routines as and wrap, respectively. It’s rarely necessary to call these routines explicitly, but since Rcpp makes frequent implicit use of them, it’s important to know what they do.


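The clone routine makes a copy of an Rcpp object. Rcpp objects are thin wrappers around pointers to data managed by R, so copying or assigning one shares the underlying data instead of duplicating it. You must explicitly call clone when you want an independent copy.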
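The need for an explicit copy can be illustrated outside of Rcpp. The sketch below is only an analogy in standard C++: Handle, deep_copy, and double_in_place are hypothetical names, and std::shared_ptr stands in for the way an Rcpp object points at data owned by R. It is not how Rcpp is implemented.

```cpp
#include <memory>
#include <vector>

// Hypothetical handle type used only for illustration; not an Rcpp class.
// Copying a Handle copies the pointer, so both copies see the same data.
using Handle = std::shared_ptr<std::vector<double>>;

// Makes an independent copy of the data behind a handle, playing the
// role that clone plays for Rcpp objects.
Handle deep_copy(const Handle& h) {
    return std::make_shared<std::vector<double>>(*h);
}

// Doubles every element in place through the handle.
void double_in_place(Handle h) {
    for (double& x : *h) {
        x *= 2.0;
    }
}
```

Calling double_in_place on a plain copy of a handle changes the data seen through the original handle as well, while data obtained through deep_copy is left untouched. In Rcpp code, clone gives the same kind of protection before a vector is modified in place.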





Missing values can be specified with the constants NA_INTEGER, NA_REAL, NA_LOGICAL, and NA_STRING. The special values NaN, Inf, and -Inf can be specified with the constants R_NaN, R_PosInf, and R_NegInf. These constants all come from the R API rather than Rcpp.

4.Programming Strategy

Generally speaking, you should write most of your code in R, to take advantage of its high level of abstraction. Then you can profile your code to identify bottlenecks where R is unacceptably slow, and replace those sections with C++ code for a performance boost. The most straightforward way to do this is to rewrite an entire function. As long as your C++ routine has the same call signature as the R function it replaces, the change should be invisible to the rest of your application.


5.Example: Row Maximums

Suppose we want to compute the maximum element of each row in a matrix. To achieve this, we loop over each row of the matrix and use the sugar routine max:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector row_max(NumericMatrix m) {
    int nrow = m.nrow();       // number of rows
    NumericVector max(nrow);   // result vector, one element per row
    for (int i = 0; i < nrow; i++) {
        // Get row i with m(i, _) and store its maximum.
        max[i] = Rcpp::max( m(i, _) );
    }
    return max;
}


Notice that the matrix classes in Rcpp use parentheses ( ) as the subset operator rather than square brackets [ ]. This is due to a limitation in C++: before C++23, operator[] could accept only a single argument.
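To make the role of the parentheses concrete, here is a minimal standard C++ sketch of a matrix type with two-index access through operator(). The Matrix class and this row_max are hypothetical stand-ins written for illustration and do not use Rcpp. The column-by-column storage mirrors the way R lays out matrices.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a numeric matrix; not an Rcpp class.
// Elements are stored column by column, as R stores its matrices.
class Matrix {
public:
    Matrix(std::size_t nrow, std::size_t ncol)
        : nrow_(nrow), ncol_(ncol), data_(nrow * ncol, 0.0) {}

    // Two-index element access. Before C++23, operator[] could not be
    // declared with two parameters, so operator() is used instead.
    double& operator()(std::size_t i, std::size_t j) {
        return data_[i + j * nrow_];
    }
    double operator()(std::size_t i, std::size_t j) const {
        return data_[i + j * nrow_];
    }

    std::size_t nrow() const { return nrow_; }
    std::size_t ncol() const { return ncol_; }

private:
    std::size_t nrow_, ncol_;
    std::vector<double> data_;
};

// Maximum of each row; assumes the matrix has at least one column.
std::vector<double> row_max(const Matrix& m) {
    std::vector<double> result(m.nrow());
    for (std::size_t i = 0; i < m.nrow(); i++) {
        double best = m(i, 0);
        for (std::size_t j = 1; j < m.ncol(); j++) {
            best = std::max(best, m(i, j));
        }
        result[i] = best;
    }
    return result;
}
```

The explicit inner loop does the work that Rcpp::max( m(i, _) ) does in the Rcpp version.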


6.Example: Box Packing

Suppose we want to simulate a discrete box-packing Markov chain. At each time step, an item with weight randomly distributed in {1,...,w} arrives for packing. Items are placed in the same box so long as the box weight does not exceed w. If an item would make the current box’s weight exceed w, a new box is started with that item. We might be interested in the weight of the current box at each time step, as well as which times a new box is started.


A simulation of the box-packing chain can be implemented in R, but suppose we want to run the simulation for a large number of time steps in order to estimate long-run statistics. In that case, the simulation might be unacceptably slow. We can use C++ and Rcpp to write a much faster version.



Create a blank text file and enter the code skeleton:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
List pack_boxes(int n, NumericVector p) {
    // ...
}

The pack_boxes routine will contain our simulation. It needs to sample item weights, add each item weight to the previous time step’s box weight, and then check whether the box is too heavy, starting a new box when necessary. The routine has parameters n, the number of steps to simulate, and p, the probabilities of the item weights. We don’t need to make w a parameter, since w can be inferred from the length of p. The routine has return type List. Rcpp implicitly converts between SEXP and these input/output types.


If we were implementing the simulation in R, we could sample the item weights with the sample function. The R API doesn’t have a corresponding C routine. Fortunately, Rcpp’s Function class makes calling R functions from C++ simple. The constructor takes the name of the desired function as its parameter. After creating a Function object for sample, we can call it with the same parameters as the original R function. A word of caution: calling R functions from C++ code is at least as slow as calling them from R itself, so use them sparingly.
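If the call back into R ever became a bottleneck, one alternative is to draw the item weights in C++ directly. The sketch below swaps in std::discrete_distribution from the standard library for R's sample. The helper sample_weights and its seed parameter are hypothetical, and the draws come from a different random number generator than R's, so they will not match results obtained with set.seed.

```cpp
#include <random>
#include <vector>

// Hypothetical helper, not part of Rcpp or the R API.
// Draws n item weights from {1, ..., w}, where w = p.size() and p[k] is
// proportional to the probability of weight k + 1.
std::vector<int> sample_weights(int n, const std::vector<double>& p,
                                unsigned seed) {
    std::mt19937 gen(seed);
    // discrete_distribution yields indices 0..w-1 with the given weights.
    std::discrete_distribution<int> dist(p.begin(), p.end());
    std::vector<int> item(n);
    for (int i = 0; i < n; i++) {
        item[i] = dist(gen) + 1;  // shift to 1-based weights
    }
    return item;
}
```

Like sample with replace = TRUE, this draws each weight independently. The tutorial's own approach of calling sample through Function keeps the simulation tied to R's random number stream, which may matter more than speed.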

For the rest of the simulation, we need a vector weight of length n to hold the weight of the box at each time step, and another vector, first, to hold the first item times. We also need a variable n_boxes to keep track of how many boxes have been packed.


#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
List pack_boxes(int n, NumericVector p) {
    // Look up R's sample function in the base package.
    Function sample = Environment("package:base")["sample"];

    // Sample item weights; w is inferred from the length of p.
    int w = p.size();
    IntegerVector item = sample(w, n, true, p);

    // Initialize loop variables.
    IntegerVector weight(n);   // box weight at each time step
    weight[0] = item[0];
    IntegerVector first(n);    // times at which a new box is started
    first[0] = 1;
    int n_boxes = 1;           // number of boxes packed so far
    // ...

We don’t know how long first needs to be, but we can ensure it’s long enough by making it length n, as above. Alternatively, if we were concerned about memory usage, we could have used a data structure from C++’s standard template library and converted it to a correctly sized IntegerVector at the end of the simulation with Rcpp’s wrap routine.
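The memory-conscious alternative might look like the following sketch in standard C++. It applies the packing rule described at the start of this example to a given sequence of item weights and grows a std::vector with push_back, so first ends up exactly as long as the number of boxes. The Packing struct and pack_items function are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch; not part of Rcpp.
struct Packing {
    std::vector<int> weight;  // box weight at each time step
    std::vector<int> first;   // 1-based times at which a box was started
};

// Packs the given item weights into boxes of capacity w.
// Assumes item is non-empty and every weight is in {1, ..., w}.
Packing pack_items(const std::vector<int>& item, int w) {
    Packing out;
    out.weight.resize(item.size());
    out.weight[0] = item[0];
    out.first.push_back(1);  // the first box starts at time 1
    for (std::size_t i = 1; i < item.size(); i++) {
        int new_weight = out.weight[i - 1] + item[i];
        if (new_weight <= w) {
            out.weight[i] = new_weight;  // continue with current box
        } else {
            out.weight[i] = item[i];     // start a new box
            out.first.push_back(static_cast<int>(i) + 1);
        }
    }
    return out;
}
```

Inside an Rcpp routine, passing such a std::vector<int> to wrap would yield an integer vector of the right length, with no unused trailing elements to trim.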

The core of our simulation is a for loop. Unlike R, where for loops should be avoided in favor of vectorized code, there’s no penalty for using for loops in C++.

    for (int i = 1; i < n; i++) {
        int new_weight = weight[i - 1] + item[i];
        if (new_weight <= w) {
            // Continue with current box.
            weight[i] = new_weight;
        } else {
            // Start a new box.
            weight[i] = item[i];
            first[n_boxes++] = i + 1;
        }
    }
    // ...

