GPU Notes

Theano

Set GPU Device

device=gpu
device=gpu0
device=gpu1
...
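
As a minimal sketch of how one of these settings might be applied: Theano reads its configuration from the `THEANO_FLAGS` environment variable as comma-separated `key=value` pairs, so the device can be chosen before Theano is imported. The helper name `theano_flags` is hypothetical, and `floatX=float32` is an assumption (a common choice for GPU work) that does not come from these notes.

```python
import os


def theano_flags(device, floatX="float32"):
    """Build a THEANO_FLAGS string such as 'device=gpu,floatX=float32'.

    Hypothetical helper for illustration; floatX=float32 is an assumed default.
    """
    return "device={},floatX={}".format(device, floatX)


# Set the variable before the first `import theano`,
# because Theano reads its configuration at import time.
os.environ["THEANO_FLAGS"] = theano_flags("gpu")

# On a machine with Theano and CUDA installed, the setting could be checked with:
# import theano
# print(theano.config.device)
```

The same string can also be passed on the command line, for example `THEANO_FLAGS='device=gpu,floatX=float32' python script.py`.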

GPU Multi-task Running Model

Every time a new GPU program is submitted to the device, the GPU suspends the ongoing task and executes the new program immediately.

CUDA Benchmark

Purpose

The purpose of this benchmark is to show that parallel computing on a GPU significantly improves program speed, and to estimate the size of that improvement.

Experiment 1

Environment

Alienware 14R2 (i5-6GB-GT650M)
Ubuntu 14.04, CUDA 7.5, GCC 4.8.2

Code

C++11
CUDA

Time Elapsed:

C++11: 5418.464355 ms
CUDA: 229.8250 ms

Performance Evaluation

Speedup (C++11 time / CUDA time) = 23.6
CUDA time as a percentage of C++11 time = 4.2%
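
The two figures above follow directly from the measured times. A short sketch of the arithmetic, using the Experiment 1 timings; the function names `speedup` and `time_percentage` are illustrative and not part of the benchmark code:

```python
def speedup(baseline_ms, accelerated_ms):
    """How many times faster the accelerated run is than the baseline."""
    return baseline_ms / accelerated_ms


def time_percentage(baseline_ms, accelerated_ms):
    """Accelerated run time as a percentage of the baseline run time."""
    return 100.0 * accelerated_ms / baseline_ms


# Timings reported for Experiment 1, in milliseconds.
cpp_ms = 5418.464355
cuda_ms = 229.8250

print("Speedup: %.1fx" % speedup(cpp_ms, cuda_ms))                  # Speedup: 23.6x
print("CUDA time: %.1f%%" % time_percentage(cpp_ms, cuda_ms))       # CUDA time: 4.2%
```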

Experiment 2

Environment

Alienware 14R2 (i5-6GB-GT650M)
Ubuntu 14.04, CUDA 7.5, GCC 4.8.2
TensorFlow 0.9
TensorFlow 0.8

Code

Python 2.7 with TensorFlow 0.9 (CPU)
Python 3.5 with TensorFlow 0.8 (GPU)

Time Elapsed:

CPU: 298.434136 s
GPU: 144.865046 s

Performance Evaluation

Speedup (CPU time / GPU time) = 2.06
GPU time as a percentage of CPU time = 48.5%