Time Measurement Functions in C/C++: A Roundup of Ways to Evaluate Program Performance

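This article collects common ways to measure time in C/C++, with complete example programs, so that developers can easily measure how fast their code runs. Time measurement has many methods and many subtle details, so only the commonly used approaches are covered here.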

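Time in a Program

Before measuring how long a program takes to run, it helps to understand how time itself is measured. Different measurement methods give different results, and those results mean different things.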

Wall-Clock Time

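Wall-clock time is, as the name suggests, real-world time: the time you would compute from a clock on the wall. It is affected by issues such as clock synchronization, time zones, and daylight saving time. For details, see the Wikipedia article on Wall-clock time.

Because wall-clock time is not a monotonically increasing value (it can jump when the system clock is adjusted), it is not a stable time base and should be treated only as a reference. It is not recommended when program performance must be measured very precisely.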

CPU Time

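CPU time is the time the program actually spends executing on a CPU. It is further divided into two kinds:

user time: time spent executing the program's own code (user space).
system time: time spent executing at the operating-system level (kernel space).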

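For details, see the Wikipedia articles on CPU time and User space.

For a multithreaded program, CPU time is the sum of the execution time of every thread, so the CPU time of a parallelized program can be longer than its wall-clock time.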

C Example

The following example estimates pi with the Monte Carlo method:

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
double pi(int n) {
  srand(5);
  int count = 0;
  double x, y;
  for (int i = 0; i < n; ++i) {
    x = (double) rand() / RAND_MAX;
    y = (double) rand() / RAND_MAX;
    if (x * x + y * y <= 1) ++count;
  }
  return (double) count / n * 4;
}
int main() {
  double result = pi(1e8);
  printf("PI = %f\n", result);
}

Compile with gcc:

gcc -o pi pi.c

With an older gcc compiler, add the -std=c99 option:

gcc -std=c99 -o pi pi.c

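The rest of this article uses this program to introduce the different ways of measuring execution time.

The Linux `time` Command

Linux has a time command that measures a program's execution time directly, reporting both the elapsed real time and the CPU time: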

time ./pi

PI = 3.142172

real    0m2.210s
user    0m2.209s
sys     0m0.001s

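The output of time is split into user time, system time, and the real time that actually elapsed.

If other programs are using the CPU at the same time, the results differ somewhat. First I use stress to put the CPU under full load:

stress --cpu 40

Then I measure again: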

time ./pi

PI = 3.142172

real    0m3.706s
user    0m3.600s
sys     0m0.000s

The `time` Function

The time function in the C standard library returns the system's wall-clock time with a resolution of 1 second. Here is an example.

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <time.h> // header required by the time function
double pi(int n) {
  srand(5);
  int count = 0;
  double x, y;
  for (int i = 0; i < n; ++i) {
    x = (double) rand() / RAND_MAX;
    y = (double) rand() / RAND_MAX;
    if (x * x + y * y <= 1) ++count;
  }
  return (double) count / n * 4;
}
int main() {
  // Variables that hold the timestamps
  time_t start, end;

  // Start timing
  start = time(NULL);

  // Main computation
  double result = pi(1e8);

  // Stop timing
  end = time(NULL);

  // Compute the elapsed time in seconds
  double diff = difftime(end, start);

  printf("PI = %f\n", result);
  printf("Time = %f\n", diff);
}

Compile with gcc:

gcc -o pi pi.c

Run:

./pi

PI = 3.142172
Time = 2.000000

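Because the time function only has a resolution of 1 second, this method is not well suited to programs that finish quickly.

Result with the CPU under full load (using stress):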

./pi

PI = 3.142172
Time = 4.000000

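The `clock` Function

The standard C library provides a clock function that returns the processor time used by the program, counted in clock ticks, from which CPU time can be computed. An example: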

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <time.h> // header required by the clock function
double pi(int n) {
  srand(5);
  int count = 0;
  double x, y;
  for (int i = 0; i < n; ++i) {
    x = (double) rand() / RAND_MAX;
    y = (double) rand() / RAND_MAX;
    if (x * x + y * y <= 1) ++count;
  }
  return (double) count / n * 4;
}
int main() {
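  // Variables that hold the tick counts
  clock_t start, end;
  double cpu_time_used;

  // Record the start time
  start = clock();

  // Main computation
  double result = pi(1e8);

  // Record the end time
  end = clock();

  // Convert the tick difference to seconds of CPU time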
  cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;

  printf("PI = %f\n", result);
  printf("Time = %f\n", cpu_time_used);
}

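The value returned by clock is a count of clock ticks, an implementation-defined unit of processor time. It is not a number of CPU cycles, and it is not a time in seconds. The CLOCKS_PER_SEC macro gives the number of these ticks per second, so dividing by it yields the CPU time used by the program in seconds. This time includes both user time and system time.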

Compile with gcc:

gcc -o pi pi.c

Run:

./pi

PI = 3.142172
Time = 2.250000

Result with the CPU under full load (using stress):

./pi

PI = 3.142172
Time = 3.670000

The `clock_gettime` Function

The clock_gettime function can return either wall-clock time or the program's CPU time. The time it returns is stored in a timespec struct:

struct timespec {
  time_t tv_sec;  /* seconds */
  long   tv_nsec; /* nanoseconds */
};

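Because timespec stores the time with a nanosecond field, its resolution can be as fine as one nanosecond (one billionth of a second). To find out the actual resolution of a clock, use the clock_getres function: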

#include <time.h>
#include <stdio.h>
int main() {
  struct timespec t;
  clock_getres(CLOCK_MONOTONIC, &t);
  printf("Resolution: %ld nanosecond\n", t.tv_nsec);
  return 0;
}

gcc -o getres getres.c
./getres

Resolution: 1 nanosecond

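The first argument of clock_getres (and of clock_gettime) selects the type of clock. Common types are:

CLOCK_REALTIME: the system's real time (wall-clock time).
CLOCK_REALTIME_COARSE: the system's real time (wall-clock time); faster to read but with lower resolution (Linux-specific).
CLOCK_MONOTONIC: monotonic time. This clock increases steadily and does not jump when the system time is changed, which makes it suitable for measuring program performance.
CLOCK_MONOTONIC_COARSE: similar to CLOCK_MONOTONIC; faster to read but with lower resolution (Linux-specific).
CLOCK_MONOTONIC_RAW: similar to CLOCK_MONOTONIC, but it is a raw hardware-based time that is not subject to NTP adjustments (Linux-specific).
CLOCK_PROCESS_CPUTIME_ID: the CPU time of the process, including the time spent by all of its threads.
CLOCK_THREAD_CPUTIME_ID: the CPU time consumed by a single thread of the program.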

Here is the pi example timed with clock_gettime and CLOCK_MONOTONIC:

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <time.h> // header required by the clock_gettime function
double pi(int n) {
  srand(5);
  int count = 0;
  double x, y;
  for (int i = 0; i < n; ++i) {
    x = (double) rand() / RAND_MAX;
    y = (double) rand() / RAND_MAX;
    if (x * x + y * y <= 1) ++count;
  }
  return (double) count / n * 4;
}
struct timespec diff(struct timespec start, struct timespec end) {
  struct timespec temp;
  // Borrow one second when the nanosecond difference is negative
  if ((end.tv_nsec - start.tv_nsec) < 0) {
    temp.tv_sec = end.tv_sec-start.tv_sec-1;
    temp.tv_nsec = 1000000000+end.tv_nsec-start.tv_nsec;
  } else {
    temp.tv_sec = end.tv_sec-start.tv_sec;
    temp.tv_nsec = end.tv_nsec-start.tv_nsec;
  }
  return temp;
}
int main() {
  // Variables that hold the timestamps
  struct timespec start, end;
  double time_used;

  // Record the start time
  clock_gettime(CLOCK_MONOTONIC, &start);

  // Main computation
  double result = pi(1e8);

  // Record the end time
  clock_gettime(CLOCK_MONOTONIC, &end);

  // Compute the elapsed time in seconds
  struct timespec temp = diff(start, end);
  time_used = temp.tv_sec + (double) temp.tv_nsec / 1000000000.0;

  printf("PI = %f\n", result);
  printf("Time = %f\n", time_used);
}

Compile with gcc:

gcc -o pi pi.c

Run:

./pi

PI = 3.142172
Time = 2.145160

Result with the CPU under full load (using stress):

./pi

PI = 3.142172
Time = 3.773047

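The `getrusage` Function

The getrusage function returns statistics on the various system resources used by the program, including CPU, memory, and I/O, so it can also be used to measure the program's CPU time: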

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <time.h>
#include <stdint.h>
#include <sys/time.h>
#include <sys/resource.h>
double pi(int n) {
  srand(5);
  int count = 0;
  double x, y;
  for (int i = 0; i < n; ++i) {
    x = (double) rand() / RAND_MAX;
    y = (double) rand() / RAND_MAX;
    if (x * x + y * y <= 1) ++count;
  }
  return (double) count / n * 4;
}
int main(int argc, char *argv[]) {
  struct rusage ru;
  struct timeval utime;
  struct timeval stime;

  // Main computation
  double result = pi(1e8);

  // Get the program's user time and system time
  getrusage(RUSAGE_SELF, &ru);

  printf("PI = %f\n", result);
  utime = ru.ru_utime;
  stime = ru.ru_stime;
  double utime_used = utime.tv_sec + (double) utime.tv_usec / 1000000.0;
  double stime_used = stime.tv_sec + (double) stime.tv_usec / 1000000.0;
  printf("User Time = %f\n", utime_used);
  printf("System Time = %f\n", stime_used);

  return 0;
}

The getrusage function reports the program's user CPU time and system CPU time separately, which gives an effect similar to the time command.

Compile with gcc:

gcc -o pi pi.c

Run:

./pi

PI = 3.142172
User Time = 2.258956
System Time = 0.000999

Result with the CPU under full load (using stress):

./pi

PI = 3.142172
User Time = 3.670714
System Time = 0.001000

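The `gettimeofday` Function

The gettimeofday function returns wall-clock time with microsecond resolution. POSIX.1-2008 marks it as obsolescent and recommends clock_gettime instead.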

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <sys/time.h>
double pi(int n) {
  srand(5);
  int count = 0;
  double x, y;
  for (int i = 0; i < n; ++i) {
    x = (double) rand() / RAND_MAX;
    y = (double) rand() / RAND_MAX;
    if (x * x + y * y <= 1) ++count;
  }
  return (double) count / n * 4;
}
int main(int argc, char *argv[]) {
  struct timeval start, end, diff;

  // Start timing
  gettimeofday(&start, NULL);

  // Main computation
  double result = pi(1e8);

  // Stop timing
  gettimeofday(&end, NULL);

  // Compute the elapsed time
  timersub(&end, &start, &diff);

  double time_used = diff.tv_sec + (double) diff.tv_usec / 1000000.0;
  printf("PI = %f\n", result);
  printf("Time = %f\n", time_used);

  return 0;
}

Compile with gcc:

gcc -o pi pi.c

Run:

./pi

PI = 3.142172
Time = 2.258592

Result with the CPU under full load (using stress):

./pi

PI = 3.142172
Time = 3.760840

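Afterword

I originally hoped to find a measurement method that gives stable results, but judging from the stress tests, almost every method is affected by system load. If I come across a better approach in the future, I will add it here.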
