Rcpp 套件：結合 R 與 C++ 加速程式執行

向量輸入、向量輸出

輸入與輸出都是向量的函數也很常見，這裡我們以一個氣泡排序演算法的排序函數做示範，以下是一個 R 版本的排序函數：

r_sort <- function(x) {
  x.length = length(x)
  for (i in seq(1, x.length-1)) {
    for (j in seq(1, x.length-i)) {
      if (x[j] > x[j+1]) {
        tmp = x[j]
        x[j] = x[j+1]
        x[j+1] = tmp
      }
    }
  }
  x
}

同樣的演算法以 C++ 改寫則變成：

NumericVector cpp_sort(NumericVector x) {
  int x_length = x.size();
  for(int i = 0; i < x_length-1; ++i) {
    for(int j = 0; j < x_length-1-i; ++j) {
      if (x[j] > x[j+1]) {
        double tmp = x[j];
        x[j] = x[j+1];
        x[j+1] = tmp;
      }
    }
  }
  return x;
}

使用 cppFunction 編譯並載入使用：

cppFunction('NumericVector cpp_sort(NumericVector x) {
  int x_length = x.size();
  for(int i = 0; i < x_length-1; ++i) {
    for(int j = 0; j < x_length-1-i; ++j) {
      if (x[j] > x[j+1]) {
        double tmp = x[j];
        x[j] = x[j+1];
        x[j+1] = tmp;
      }
    }
  }
  return x;
}')

在實作這種比較複雜一點的演算法時，通常在程式寫完之後，最好先做一下驗證，確保程式沒有寫錯。這裡我們將自己寫的兩個排序函數跟 R 內建的 sort 函數比較，看看結果是否完全相同：

x <- runif(100)
x.sort <- sort(x)
x.r_sort <- r_sort(x)
x.cpp_sort <- cpp_sort(x)
sum((x.sort - x.r_sort)^2)

[1] 0

sum((x.sort - x.cpp_sort)^2)

[1] 0

確認程式無誤後，使用 microbenchmark 測試執行效能：

library(microbenchmark)
x <- runif(100)
x.benchmark <- microbenchmark(
  sort(x),
  cpp_sort(x),
  r_sort(x)
)
x.benchmark

Unit: microseconds
        expr      min        lq       mean    median        uq      max neval
     sort(x)   11.933   13.1045   21.55053   19.1545   31.8010   44.421   100
 cpp_sort(x)    6.723    7.3220    9.91120    9.7080   12.6855   20.863   100
   r_sort(x) 3626.665 3713.8905 3972.78264 3816.2110 3920.3770 5933.124   100

畫出測試結果的小提琴圖（violin plot）：

library(ggplot2)
autoplot(x.benchmark)

測試結果小提琴圖

在這個例子中，自己使用 Rcpp 寫的 C++ 排序函數，其執行效能還比內建的 sort 函數還要好，而一般的 R 排序函數執行速度一樣是最慢的。若將 x 向量的長度增加一些，結果又會有些不同：

library(microbenchmark)
x <- runif(1000)
x.benchmark <- microbenchmark(
  sort(x),
  cpp_sort(x),
  r_sort(x)
)
x.benchmark

Unit: microseconds
        expr        min          lq        mean      median          uq        max neval
     sort(x)     16.569     18.0955     31.3704     40.2485     43.1295     54.350   100
 cpp_sort(x)    380.192    381.5215    391.0866    388.8055    390.3740    648.092   100
   r_sort(x) 302941.105 304085.3015 314353.2839 304802.2955 306465.2875 751781.425   100

library(ggplot2)
autoplot(x.benchmark)

測試結果小提琴圖

對於長度比較長的 x 向量，內建的 sort 排序函數就會比較快一些，而一般的 R 排序函數執行速度還是一如往常的慢。

氣泡排序是比較簡單的演算法，執行所需時間為 O((n^2))，而內建的 sort 函數所使用的排序演算法絕對比 O((n^2)) 還要好，所以在向量長度較長時，差異會比較明顯。

矩陣輸入、向量輸出

Rcpp 同時也對 R 的幾種矩陣定義了一些 C++ 的類別：

C++ 類別	說明
`NumericMatrix`	浮點數矩陣
`IntegerMatrix`	整數矩陣
`CharacterMatrix`	字元矩陣
`LogicalMatrix`	邏輯矩陣

其使用方式跟向量類似，以下是將 R 的 rowSums 改寫成 C++ 的程式碼：

NumericVector cpp_rowSums(NumericMatrix x) {
  int nrow = x.nrow(), ncol = x.ncol();
  NumericVector result(nrow);
  for (int i = 0; i < nrow; i++) {
    double total = 0;
    for (int j = 0; j < ncol; j++) {
      total += x(i, j);
    }
    result[i] = total;
  }
  return result;
}

NumericMatrix 的 nrow 與 ncol 方法可以用來取得該矩陣的列（row）數與行（column）數，而在提取矩陣內部的元素時，是使用小括號 () 運算子。

使用 cppFunction 編譯並載入使用：

cppFunction('NumericVector rowSumsC(NumericMatrix x) {
  int nrow = x.nrow(), ncol = x.ncol();
  NumericVector result(nrow);
  for (int i = 0; i < nrow; i++) {
    double total = 0;
    for (int j = 0; j < ncol; j++) {
      total += x(i, j);
    }
    result[i] = total;
  }
  return result;
}')

與 R 內建的 rowSums 函數做比較：

set.seed(1)
x <- matrix(sample(100), 10)
rowSums(x)

 [1] 524 494 436 529 423 505 553 451 577 558

cpp_rowSums(x)

 [1] 524 494 436 529 423 505 553 451 577 558

Page: 1 2 3 4 5

G. T. Wang

個人使用 Linux 經驗長達十餘年，樂於分享各種自由軟體技術與實作文章。

下一篇 [開箱] 曼富圖 Manfrotto 190T 四節腳架與 MHXPRO-3W 三向雲台（MK190XPRO4T-3W 套組） »

上一篇 « [DIY] 自製東山椪柑環保酵素

Published by

G. T. Wang

8 年 ago

R 升級版本，自動安裝舊版 R 所有套件教學
本篇介紹如何在安裝新版本的 R...
RSelenium：R 使用 Selenium 操控瀏覽器下載網頁資料
本篇介紹如何在 R 中透過 R...
Red Hat Enterprise Linux 7.3 編譯與安裝 R 語言執行環境教學
本篇介紹如何在 Red Hat...

[開箱] Marttiini MFT G10 芬蘭北歐直刀

本篇是 Marttiini M...

2 年 ago

開箱

[開箱] 一般打火棒、鎂塊打火棒

本篇是鎂塊打火棒與一般普通打火...

2 年 ago

開箱

[開箱] 瑞典 BAHCO 396-Lap 魚牌軍規摺疊鋸

本篇為瑞典 BAHCO 396...

2 年 ago

[DIY] 野炊水煮蛋、紅豆湯記錄

本篇記錄今年過年期間帶阿玄在家...

2 年 ago

生活

[竹北素食] 善菓堂 SHAN GUO TANG

本篇是我最近去竹北善菓堂用餐時...

2 年 ago

Linux

Linux 複製檔案 cp 指令用法教學與範例

本篇介紹如何在 Linux 系...

2 年 ago

Rcpp 套件：結合 R 與 C++ 加速程式執行

向量輸入、向量輸出

矩陣輸入、向量輸出

相關文章

Recent Posts

[開箱] Marttiini MFT G10 芬蘭北歐直刀

[開箱] 一般打火棒、鎂塊打火棒

[開箱] 瑞典 BAHCO 396-Lap 魚牌軍規摺疊鋸

[DIY] 野炊水煮蛋、紅豆湯記錄

[竹北素食] 善菓堂 SHAN GUO TANG

Linux 複製檔案 cp 指令用法教學與範例