Tensorflow Study Note, Part Three


  1. Interactive Session

tf.InteractiveSession() creates a default session, so tensors can be evaluated without passing a session explicitly. This is convenient in an IPython environment.


a = tf.constant(1)
print(a.eval())  # uses the default session installed by tf.InteractiveSession()

  2. Regular Session

A regular session must be used either inside a Python with-block (context manager) or through an explicit session object.

Example 1:

a = tf.constant(1)
sess = tf.Session()
print(sess.run(a))  # evaluate through the session object
sess.close()

Example 2:

a = tf.constant(1)
with tf.Session() as sess:
    print(sess.run(a))  # the session is closed when the block exits


This function is used to add bias to the input tensor (element-wise addition between "bias" vector and the feature vector)

It should be noted that the bias added here is completely different from the usual notion of a bias added to a hidden unit (to the weighted sum before activation).


The run function of tf.Session provides an interface to execute the given TF operations and evaluate tensors.

Feature Preprocessing

Continuous features can be fed directly into the first hidden layer of the neural network. Discrete features should preferably go through an embedding layer first.


This API accepts a [batch_size, doc_length] tensor of type "int32" or "int64"


This API is used to look up an embedding by ID.

Categorical to Embedding Pipeline

Categorical Label ---> Ordinal ID ---> One-hot Embedding ---> Dense Embedding
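
The pipeline above can be sketched in plain Python. This is only a toy walk-through: the labels, the embedding values, and the helper names (one_hot, vec_matmul) are made up for illustration and are not TF APIs. It also shows why a lookup by ID can stand in for the one-hot step: multiplying a one-hot vector by the embedding matrix just selects one row.

```python
# Toy walk-through of: label -> ordinal ID -> one-hot -> dense embedding.
# Labels and embedding values are hypothetical.

labels = ["red", "green", "blue", "green"]

# 1. Categorical label -> ordinal ID (IDs assigned in order of first appearance).
vocab = {}
for label in labels:
    vocab.setdefault(label, len(vocab))
ids = [vocab[label] for label in labels]


# 2. Ordinal ID -> one-hot vector of length len(vocab).
def one_hot(index, depth):
    return [1.0 if i == index else 0.0 for i in range(depth)]


one_hots = [one_hot(i, len(vocab)) for i in ids]

# 3. One-hot -> dense embedding: multiply by an embedding matrix
#    of shape [vocab_size, embedding_dim] (3 x 2 here).
embedding_matrix = [
    [0.1, 0.2],  # row for ID 0
    [0.3, 0.4],  # row for ID 1
    [0.5, 0.6],  # row for ID 2
]


def vec_matmul(vec, matrix):
    """Multiply a row vector by a matrix given as a nested list."""
    n_cols = len(matrix[0])
    return [sum(vec[r] * matrix[r][c] for r in range(len(matrix)))
            for c in range(n_cols)]


dense_via_matmul = [vec_matmul(v, embedding_matrix) for v in one_hots]

# Selecting the row by ID gives the same vectors without building one-hots.
dense_via_lookup = [embedding_matrix[i] for i in ids]
```

In practice the one-hot step is usually skipped and the row is fetched by ID directly, which avoids building large sparse vectors.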


The official Python 3 wheel of the TensorFlow (TF) project is built only for Python 3.4. To install TF on Python 3.5, do not use the official pip installation method.


  1. Download the official wheel to a local directory: wget https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl
  2. Rename the wheel file: mv ~/username/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl ~/username/tensorflow-0.8.0-cp35-cp35m-linux_x86_64.whl
  3. Install the wheel manually with pip from the command line: pip install ~/username/tensorflow-0.8.0-cp35-cp35m-linux_x86_64.whl


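  1. Besides the steps above, installing the GPU build of TF requires confirming in advance that the GPU (graphics card) used for computation has a CUDA Compute Capability of 3.0 or higher. For the table mapping capability levels to card models, see: CUDA Compute Capability official link
  2. Installing the CUDA Driver: the CUDA Driver is included in the CUDA Toolkit, so install the CUDA Toolkit directly, following Nvidia's official documentation (CUDA Driver Instructions official link). If an unofficial Nvidia graphics driver was previously installed in the environment, it must be uninstalled manually. Install the CUDA Toolkit (driver and tools included) using the standalone method, i.e. an offline install from the runfile.
  3. Installing cuDNN (the GPU compute library): the official download requires registration and approval, so a temporary download address is given here: Cudnn Library Baidu Netdisk link. After downloading, extract and install the cuDNN files with the following commands: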
tar xvzf cudnn-7.0-linux-x64-v4.0-prod.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo cp cuda/lib64/libcudnn* /usr/lib/x86_64-linux-gnu/
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
sudo chmod a+r /usr/lib/x86_64-linux-gnu/libcudnn*


  1. Miscellaneous Problems:
    a. Cannot find a CUDA library (for example, cuda.xx.so.4.5): sudo ldconfig /usr/local/cuda/lib64
    (This updates the linker cache so that the linker can find the shared libraries.)
    b. Cannot open a cuDNN library (for example, cudnn.xx): sudo ldconfig /usr/local/cuda/lib64

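  2. Common commands:
    lspci | grep -i nvidia
    nvidia-smi -L  shows GPU hardware information
    nvidia-smi -a  shows GPU usage
    nvidia-smi  shows an overall GPU summary (including currently running GPU processes)

  3. GPU numbering
    The GPU numbers shown by nvidia-smi -L are not necessarily the numbers required in the configuration file of a GPU computing library. The configuration file sometimes needs the GPUs numbered by count, e.g. 0, 1, 2.
    The GPU numbers shown by nvidia-smi -L may start from values such as 4, 5, 6.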

Tensorflow Study Note, Part Two

Creating tensor objects in the TensorFlow framework.

  • Tensors that serve as inputs to a computation graph are created with the tf.constant() and tf.Variable() constructors.
  • Basic arithmetic between tensors in a computation node (op) is done through dedicated operations such as tf.add, tf.mul, etc.
  • The operations on tensors (and the entire graph) are executed on a specified device (CPU or GPU).
  • Executing the computation graph returns a NumPy ndarray.

The programming style of the TensorFlow framework.

A complete computation is defined as a series of computation steps, where the output of one step becomes the input of the next.
In short, TF adopts a functional programming style.

TF does not require a "graph" object to be constructed explicitly; rather, the graph is built through a series of function calls. Every function (op) defined in the graph is carried implicitly by the tensor it returns. In the execution phase, the run method of a session executes all the ops that the tensor passed to it depends on.
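
The idea of building a graph through function calls and running it later can be illustrated with a toy, pure-Python analogy. This is not how TensorFlow is implemented; Node, constant, add, mul, and run are hypothetical names chosen for this sketch only.

```python
# A toy analogy of deferred execution, NOT TensorFlow's implementation.

class Node:
    """A value that remembers the op and the inputs that produce it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value


def constant(value):
    return Node("const", value=value)


def add(x, y):
    return Node("add", (x, y))  # nothing is computed yet


def mul(x, y):
    return Node("mul", (x, y))  # nothing is computed yet


def run(node):
    """Evaluate a node by first evaluating everything it depends on."""
    if node.op == "const":
        return node.value
    args = [run(i) for i in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError("unknown op: " + node.op)


a = constant(2)
b = constant(3)
c = mul(add(a, b), b)  # the graph is built implicitly by these calls
result = run(c)        # execution phase: computes (2 + 3) * 3
```

Note that c holds no value of its own; it only carries the chain of ops that run() walks through when asked for a result.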

Matrix Representation

In TensorFlow, a matrix is represented by a two-dimensional list or NumPy array. A row vector is represented by a list wrapped in an outer list, and a column vector by the transpose of a row vector. A vector must be written in matrix form ([[1, 2]]) instead of as a plain list.

In TF notation, a matrix (tensor) can be read as a column whose entries are row vectors.

The number of rows is the length of the outer list; the number of columns is the length of each inner list.
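
A small plain-Python sketch of this nested-list convention; the transpose helper is a hypothetical name written for this note, not a TF call.

```python
row_vector = [[1, 2]]  # matrix form: 1 row, 2 columns


def transpose(matrix):
    """Swap rows and columns of a nested-list matrix."""
    return [list(col) for col in zip(*matrix)]


column_vector = transpose(row_vector)  # 2 rows, 1 column

matrix = [[1, 2, 3],
          [4, 5, 6]]

n_rows = len(matrix)     # length of the outer list
n_cols = len(matrix[0])  # length of an inner list
```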

Input/Output of TF Operations

The input of a TF operation is typically a tensor (matrix), and so is the output. Even scalars should be wrapped in tensors (matrices). This reflects the flow of data through the operations.

The output of tf.reduce_mean() can be a scalar or a tensor (ndarray), depending on which dimensions are reduced. The output of tf.matmul() is always a tensor, even when it holds a single value (a 1 x 1 matrix rather than a rank-0 scalar).
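
The difference between a reduction and a matrix product can be seen in a pure-Python analogy. The matmul and reduce_mean functions below are hand-written stand-ins on nested lists that only borrow the names of the TF ops; they are not the TF implementations.

```python
def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def reduce_mean(matrix):
    """Mean over all elements, collapsing every dimension."""
    flat = [x for row in matrix for x in row]
    return sum(flat) / len(flat)


row = [[1, 2, 3]]      # shape [1, 3]
col = [[1], [2], [3]]  # shape [3, 1]

product = matmul(row, col)  # still a matrix, of shape [1, 1]
mean = reduce_mean(row)     # a plain scalar
```

Here the product keeps both levels of nesting even though it holds one number, while the mean over all elements drops them.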

Indexing and Item assignment on Tensor Object

Indexing on a tensor object is allowed while item assignment is not supported.

Tensorflow Study Note, Part One

Tensor vs. Matrix vs. Vector

Hierarchy: tensor > matrix > vector

A matrix is a tensor of rank 2; a vector is a tensor of rank 1.
The vector is the basic building block of a multi-dimensional tensor.

Overloaded Terminologies

Nowadays, the exact definitions of terms such as shape, rank, dimension, and degree differ greatly across the general CS community. In other words, "dimension" is a heavily overloaded term.

  1. When referring to a point in a Euclidean space, people use terms such as 3-D or 4-D to give the dimensionality, which is the number of coordinates needed to specify the position of a point in that space. 3-D is a geometric property of a point with three elements (coordinates).

Example: 1-D => (1), 2-D => (1,2), 3-D => (1,2,1), 4-D => (1,2,4,1)

The term "dimension" here has to be interpreted together with the underlying assumption that the subject described is a point in a Euclidean space. The mathematical description of such a point is therefore a tuple (or a list or array, depending on the programming language) of length x, where x is the dimension.

Point is a mathematical object.

  2. When referring to an array, dimension describes the minimum number of indices needed to specify an element in the array. 3-D is a structural property of the matrix (multi-dimensional array).

Example: 1-D => (1,1,2), 2-D => ((1,2),(2,3))

Array is a data structure.

The dimension of a mathematical object and the dimension of a data structure share common characteristics but belong to different domains. They are thus not equivalent.
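
The two meanings can be put side by side in plain Python. The helper array_dim is a hypothetical name for this sketch; it assumes a non-empty, regularly nested tuple or list.

```python
def array_dim(obj):
    """Number of indices needed to reach an element, i.e. the nesting depth.
    Assumes non-empty, regularly nested tuples or lists."""
    depth = 0
    while isinstance(obj, (list, tuple)):
        depth += 1
        obj = obj[0]
    return depth


point = (1, 2, 1)  # a 3-D point: three coordinates

point_dim = len(point)          # dimension of the mathematical object
storage_dim = array_dim(point)  # dimension of the array that stores it

grid = ((1, 2), (2, 3))
grid_dim = array_dim(grid)      # two indices are needed, e.g. grid[1][0]
```

The same tuple is 3-D as a point but 1-D as an array, which is why the two notions should not be mixed.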

  3. A tensor in TensorFlow may not directly represent a geometric object (or other abstract object) such as a point or a plane. In most cases, a tensor represents a set of such objects. The dimension of the object is thus different from the dimension of the tensor as a whole.

When referring to a tensor (in the context of TensorFlow), dimension (1-D, 2-D, 3-D) is the depth of the nested array (also called rank). Each dimension has its own length, which is the number of elements allowed along that dimension (direction). Dimension represents the level of nesting in a tensor: the elements of an object can themselves be other objects.

Shape of A Tensor

In TensorFlow, shape is a compact description of the structure of a tensor. It describes how many layers there are in the multi-dimensional (nested) array and how many elements there are in each layer. The rank of a tensor is the number of layers.

Example: shape = [dim(4), dim(3)] ==> [[2,2,2],[1,1,1],[1,1,1],[2,2,2]]
rank = len(shape) = 2
dim(4): index of this dim = 0, len of this dim = 4
dim(3): index of this dim = 1, len of this dim = 3

In the above case, the length of the first dim is 4, which represents the size of our data set (that we have 4 data points). The length of the second dim is 3, which indicates that each data point in this data set has 3 attributes.
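
The shape and rank of the example above can be recovered from the nested list itself. This is a minimal sketch, assuming a non-empty, regular (non-ragged) nested list; shape_of is a hypothetical helper, not a TF function.

```python
def shape_of(nested):
    """Return the length of each dimension, outermost first.
    Assumes a non-empty, regular (non-ragged) nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return shape


data = [[2, 2, 2],
        [1, 1, 1],
        [1, 1, 1],
        [2, 2, 2]]

shape = shape_of(data)          # length of dim 0, then length of dim 1
rank = len(shape)               # number of dimensions
n_points, n_attributes = shape  # 4 data points, 3 attributes each
```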