Tensorflow Study Note, Part One

Tensor v.s. Matrix v.s. Vector

Hierarchy: tensor > matrix > vector

a matrix is a tensor with rank equals 2, a vector is tensor with rank equals 1.
Vector is the basic building block of a multi-dimensional tensor

Overloaded Terminologies

Current days, the exact definition of terms such as shape, rank, dimension, degree differs greatly across the general CS community. In other words, "dimension" as a term is excessively overloaded.

  1. When referring to a point in an Euclidean space, people may use the term 3-D, 4-D to specify the number of dimensionality, which is number of coordinates that is needed to specify the position of a point in such a space. 3-D is the geometry property of a point with three elements (coordinates).

Example: 1-D => (1) 2-D => (1,2) 3-D => (1,2,1) 4-D => (1,2,4,1)

The term "dimension" here has to be interpreted together with the underlying assumption that the subject described is point in a Euclidean space. Therefore, the mathematic description of this point studied is a tuple (or list, array as in the context of different programming languages) with a length of x (which is the value of dimension).

Point is a mathematical object.

  1. When referring to an array, dimension is used to describe the least number of indices needed to specific an element in the array. 3-D is the structural property of the matrix (multi-dimension array).

Example: 1-D => (1,1,2) 2-D => ((1,2),(2,3))

Array is a data structure.

The dimension of a mathematical object and the dimension of a data structure share common characteristics, but reside in different domain. They are thus not equivalent.

  1. A tensor in tensorflow may not directly represent a geometry object (or other abstract object), like points, plane, etc. In most cases, a tensor is the representation of a set of such objects. The dimension of the object is thus different from the dimension of the tensor as a whole.

When referring to a tensor (in the context of tensorflow), dimension (1-D, 2-D, 3-D) is the depth of nested array (also called rank). Each dimension has its own length (indicates the number of elements allowed along this dimension/direction). Dimension represents the level of object in a tensor, implies that the member/elements of an object can be other objects. In this context, the length refers to the number of elements in a certain dimension).

Shape of A Tensor

In tensorflow, shape is the compact description of the structure of a tensor. It describes how many layers are there in this multi-dimen array (nested array), how many elements are there in each layer. The rank of a tensor is the depth of layers.

Example: shape = [dim(4), dim(3)] ==> [[2,2,2],[1,1,1],[1,1,1],[2,2,2]]
rank = len(shape) = 2
dim(4): index of this dim = 0, len of this dim = 4
dim(3): index of this dim = 1, len of this dim = 3

In the above case, the length of the first dim is 4, which represents the size of our data set (that we have 4 data points). The length of the second dim is 3, which indicates that each data point in this data set has 3 attributes.