Work Journal, May 17 – May 27

May 17

  1. Unable to register docker service inside a docker container
  2. All development headers provided by Python-dev package (Centos 6, maybe including ubuntu) should already be included by anconda2/3 distribution
  3. Floating-number precision lost error was caused by lacking of python package: nose_parameterized

May 18

  1. $PATH, $LD_LIBRARY_PATH are different bash shell variables
  2. "hostname" can be viewed in file /etc/hostnames

May 19

  1. /etc/profile, ~/.profile, ~/.bashrc will be executed by bash shell sequentially
  2. "alias" in bash shell command is purely a replacement for longer commands when put into effects

May 27

  1. Theano run time error:
    Initialisation of device gpu failed! Reason=CNMEM_STATUS_OUT_OF_MEMORY
    Solution: disable CNMeM by deleting the following lines in .theanorc
    cnmem = 1

Python Java and C++,Part One

Entry Point

Java and C++ has the main method as the sole entry point, python has no official definition or requirement for entry point. In python, it is common to use a script called as the entry point of the program (the interpreter starts running the script first).

Arithmetic Evaluation (Division)

Python3 will by default evaluate integral division in a float division manner ( 1/2 = 0.5). Java, C++ and Scala will evaluate integral division in an integral division manner ( 1/2 = 0 )

Import System

  1. Python's real equivalence of java package is called module (which is actually a .py file). Python's package is equivalent to Java's package of package (nested package). Unlike a Python package, a Java package (a namespace) needs to be explicitly declared. Python allows programming at the package level. In Java, package is just a namespace.

  2. By default, in Java the import statement applies to class names in the package name space. Import is followed by the full name of a class (package.package.....class). However, the import statement in python applies to not only class names (from package.module import class), but also other variables defined in the module (functions, constants, etc). In addition, the import mechanism of python allows the import of only package name (while in case the module names can not be referenced via the package name if no corresponding imports are found in the "" file of the package).

  3. After a full name import of module in Python, the imported module has to be referenced by the full name only. In Java, only the class name should be used for reference after the imports. To alleviate this problem, Python supports the use of alias by "as" keyword in import statement.

  4. Python supports defining variables and introducing names at the package level, which makes package level API design available.
    Example: import tensorflow as tf (all APIs are introduced into the tensorflow namespace by using imports in "" file of package tensorflow)

In contrast, a java version of tensorflow API would have to be imported by the following statements: import
(In java, package object can not have methods/functions. Instead, all methods are defined in classes and must be introduced through importing classes)

Python's import mechanism is pretty messed up with lots of inconsistency, partly due to its natural as an interpreting language.

  1. Wildcard imports in python can only be used when "all" magic variable is defined in the "" file of the package. Only names defined by "all" will be imported by the wildcard import statement.In Java, no such definition of wildcard names is required.

  2. Avoid using module attribute: "file" in any cases ( due to inconsistency and ambiguity), all module/package path issue should be resolved through entry point: sys.path[0] or cwd() (1 in 2).
    Note: After py3.4.3, "file" will return the absolute path of the module object when not invoked from "main". Using "file" would result in different behavior between running a script directly under cwd (using relative path) and running a script indirectly from cwd/abs path. Due to the fact that "file" is a module attribute, this different behavior can not be mitigated by patch wrapping.

  3. Good python API design should collect all important APIs in the package namespace through programming "" file. This avoids explicit imports of all sub-modules and class/function names

Class Instantiation

Class instantiation in python does not require the use of new keyword.

Local Scope

All if and loop "block" in python has global scope.

Identity Test

  1. In python, "is" operator is the identity test, "a" is "b" is the equivalent of id(a) == id(b). "==" operator is the value equality test. When in the equality test, if the right operand is a primitive type object (an expression), then the "==" operator is safe to be replaced by "is" operator.

  2. java中的数据类型,可分为两类:
    当他们用(==)进行比较的时候,比较的是他们在内存中的存放地址,所以,除非是同一个new出来的对象,他们的比较后的结果为true,否则比较后结果为false。 JAVA当中所有的类都是继承于Object这个基类的,在Object中的基类中定义了一个equals的方法,这个方法的初始行为是比较对象的内存地 址,但在一些类库当中这个方法被覆盖掉了,如String,Integer,Date在这些类当中equals有其自身的实现,而不再是比较类在堆内存中的存放地址了。

pass by value v.s. pass by reference v.s. pass by object reference

Python and Java and C++ has its own language describes their object model and argument pass mechanism.

Note: Object reference means an object is bound to an identifier (called a variable), all changes to the variable applies to the object referenced as well. (Java: Pass the object reference by value).
A reference to a variable is treated exactly the same as the variable itself, any changes made to the reference are passed through to the argument.

Python is designed to be "pass by object reference", that is for primitive types and non-primitive types argument passing is done through passing of object reference (which is the id of this object, iid, memory address).

Java is designed to be "pass by value", for primitive types it is equal to value copying, for non-primitive types it is the copying and passing of the object's reference by value.

C++ has pass by value and pass by reference mechanism. Pass by value applies to both arithmetic values and object pointers (is itself an object) and class instances, it is essentially a copying of the argument (applies to class instance as well). Pass by reference applies to all types in that all changes to the parameters affects the arguments as well.

Note: The effect of C++ pass by reference cannot be produced in Java and Python (for both the primitive types and the non-primitive types), which is, a function can change the reference of the variable that acts as the argument of this function, iid, change the reference of outer scope variable via operation in the inner scope. (Needs verification).

The above mechanism reflects the justification of "pass reference by copying value".


In C++, a variable is a storage location. "int a = 1", assign value 1 to the address location of a.
In Python, a variable is a purely an identifier (bounded to certain objects). "a = 1", assign/change the object reference (not a reference in c++ context) of a to object 1.

Program Constructions

Java requires that all functions been implemented as methods of a class. Variables take two forms, one is the class members (attributes in JS, property member in python), the other is temp variables that reside in the local scope of methods.

C++ can have variables and functions defined/used in the global scope.
Python technically can only have variables and functions defined/used in the module and all kinds of local scope. "Global Scope" in python usually refers the module scope of the "main" module.

Pointer Declaration

In C++, the pointer is declared by adding a affix to a non-pointer type to construct a pointer type. int* a = &b
In golang, the pointer is declared by prefixing the non-pointer type to construct a pointer type. var a *int = &b

Concurrency and Network IO

Single Threaded Concurrency

Due to the fact that CPython implementation has an embedded GIL, the mainstream python ecosystem is build on the single threading assumption. All major IO libraries are consequently written in a synchronous manner.

In modern python, concurrency is achieved through the use of coroutine, which is implemented via generators and the yield statement (previously iterators).

A function which implements a yield statement become a coroutine, the context switch between the coroutine and the master routine (the caller) is triggered whenever the generator function is called. The yield statement allows the local variables, states, and execution progress of the coroutine to be retained, which makes single threaded concurrency available.

Single-threaded languages have urgent need for asynchronous IO libraries because in a multi-threaded environment, even when the IO of a specific thread is blocked, the overall program is still responsive. The extra burden of single threaded libraries such as python and js is the requirement of asynchronous implementation of IO library. The lack of IO std library is the reason why JS is chosen by Ryan Dahl as the implementation language of NodeJS, largely to achieve asynchronous IO from the bottom. To be noted, NodeJS is a run-time environment packaged with JavaScritpV8 engine rather than a new language. In NodeJS, all IO is performed in the default event-loop (with no event-loop called to be defined/declared explicitly).

The Limitation of Single-threaded Concurrency

Single-threaded concurrency can only solve one type of Blocking: IO-blocking, this is the emphasis of NodeJS and Python tornado. In an environment where all IO is non-blocking, a computationally intensive operation will block the entire server. For the above reason, a single threaded non-blocking IO server must and can only handle computationally intensive operation through Restful Http/RPC call to other service providers (which should be non-blocking). In this chain of service request / providing, except for the first call from client, all the following services should be non-blocking, otherwise the overall service is blocked.

Single-threaded non-blocking IO concurrency can only save CPU resources for the Http Server (The nodejs/tornado server that only handles http requests) in a manner that one thread v.s. multiple threads (may contain suspended threads). This server can only handle tasks that (besides IO operation) will not be CPU intensive enough to block the whole IO chain (a light task).

Blocking/Non-blocking v.s. Synchronous/Asynchronous

Synchronous/Asynchronous describes the identity of the thread who process the IO. In synchronous mode, the main thread process the IO, so all IO must be processed one after another (refers as "blocking") and the main thread cannot do anything until the IO request returns. In asynchronous mode, the main thread assign the IO request to a child thread (or coroutine), so all IO can be processed simultaneously (non-blocking). Synchronous/Asynchronous is used to describe the mechanism of the request processing logic, this terminology is commonly applied from a service provider's viewpoint.

Blocking/Non-blocking is used to describe how requests are processed, this terminology is commonly applied from a IO request caller's viewpoint. Blocking means the service operates in a blocked manner.

All kinds of Blocking

Blocking for single-threaded language includes all executions (computationally or non-computationally) that would suspend the current (main and only) thread. Even sleep(1) in different coroutines (python) will block each other because there is only one real thread running at a given time. In this sense, single-threaded languages tend to be very sensitive to all "blocking" operations.

Single threaded concurrency can not handle cpu blocking (deployment of dl network, syntactic parsing, etc).
When dealing with blockings from other sources: network-io, disk-io, etc, event-loop is more efficient in terms of cpu resources. Event-loop is most suitable for light-weighted http server.


  1. Blocking/Non-blocking Http Client -> Async Tornado Http Server (Thrift: AsyncClient) -> (Thrift) TThreadPoolServer
  2. Blocking/Non-blocking Http Client -> Actor-based Concurrency Playframework Http Server (Thrift: SyncClient) -> (Thrift) TThreadPoolServer

Note: service requester can have blocking io, only service providers should be able to handle concurrency request (through multi-threading or event-loop).


Tensorflow官方有一个CNN图像识别的本地应用Demo 原始链接,现在对这个应用做简单的修改和封装生成一个Web应用,用于演示图像识别效果。



  1. 配置模块

  2. 模型下载模块

  3. 类和对象的引用

  4. 本地测试的入口由改为run_inference_on_image(), 测试图片的路径在if name == "main"下修改,避免更改模块代码。


  1. 图片上传功能
    利用class UploadHandler(tornado.web.RequestHandler)实现了上传文件的命名和本地存储

  2. 图片识别

  3. 识别命名翻译



Context-Free Grammar Study Note, Part One


The grammar formalism applies to many other natural languages despite the fact that all discussions are focused on modern English.

  1. Dependency Grammar (Subject, Predicate, Object)
  2. Constituency Grammar (NP, VP)
  3. Others - sub-categorization, etc (Verb followed by infinitive)

Among them, constituency grammar is universally modeled by context-free grammar (CFG), which is also called phrase-structure grammars. The formalism of CFG is equivalent to Backus-Naur Form (BNF).

Relationships that a Parser Models

  1. Constituency Relationships (Constituent Relation)
  2. Grammatically Relationships (Lexical Relation)

Parsing With CFG (Context-Free Grammars)

  1. Top-down (goal-directed search, rationalist tradition)
  2. Bottom-up (data directed search, empiricist tradition)

Data Structure of Parse Tree (CFG tree)

Implemented as a multi-way tree via OOP mechanism. Each internal node represents a non-terminal of CFG;each leaf node represents a terminal of CFG;The value of each node is the symbol of the non-terminal/terminal it represented. The value of each node is a symbol (NP, VP, prefer, etc), the pointers of each node points to its descendants until the reach of leaf nodes (non-terminal terminal pair). The leaf node is a 2-tuple that represents the production rule of a non-terminal to a terminal.

Serialization of a Parse Tree

parse tree = (ROOT (S (NP (PRP I)) (VP (VBP love) (NP (PRP you)))))

A JSON string that contains only list structure with all comma replaced by white space and double quotation replaced by empty string, and optionally squared bracket replaced by parentheses.

In any internal node (within a bracket), only one symbol ("a non-terminal symbol, without any bracket enclosed") can be presented and it is the value of this node. The child node of any internal node is defined to be all the bracketed blocks that lays in the same level with this node. All bracketed blocks is itself a parse tree (may not origin from root)

Parse Tree and Part-of-Speech Tagging

The non-terminal in the leaf node is the part-of-speech tagging of the terminal.

Why Not Context-sensitive Grammar In NLP Application?

Context-sensitive grammar is non-deterministic, which means the production process may not converge.

CFG Data Structure

  1. Search Space Forest
    Search Space -> Search Space Forest (Search Tree in Manning's book)

Constructed by all possible partial parse trees and complete parse trees, among them, partial parse trees act as the internal node and the complete parse trees act as the leaf node (of the tree data structure -> search space forest).

Note: Search Space Forest is actually a and-or tree

  1. Parse Tree
    Parse Tree -> Syntax Parse Tree (Complete Syntax Tree)
    Partial Parse Tree (Incomplete Sytax Tree)

The function of Part-of-Speech Tagging

PoS has both syntactic and semantic meanings. Syntactically, a noun is usually followed by a verb; semantically, a noun typically refer to people, animals, concepts and things.

Probability and Statistics Reviews, Part One

Classic Probability Problem - Standard Procedures

  1. Define simple events (basic events) based on the given information
  2. Abstract the problem to be solved into a compound event
  3. Solve the problem based on classic probability equations

Random Experiment

The concept of random experiment arise from the assumption that the event to be observed is described on a occurring or not basis. Many real world probability problem, however, does not self-describe in such nature. These types of problems can be conceptually transformed into the occurring&observation pattern.

Random Variable

A random variable is the mapping of the result of a random experiment to a real value.

Statistics and Probability

Statistics is the necessary measure that takes to reach a probability (when the population is known) or probability estimate (when only sampling is available).

Relative Frequency

Phrase "Relative" here refers to the fact that in any experiment that is infinitely repeatable, the absolute frequency (proportion ratio) cannot be obtained.

Simple Event (Elementary Event)

A random experiment with an outcome (coin toss with a head on top).

In contrast, compound events have multiple outcomes. An elementary event can be a process with outcomes that are "complex", for example, toss 3 dices and get 1, 3, 2 on top. In this case, the elementary event A here is constructed by three elementary events in a related probability space (toss only one dice to see the outcome).

Coin toss with a head on top and coin toss with a tail on top are two simple events.
Simple and compound events are relative concepts, when no lower level events are defined, a complex events may be defined as simple events when analyzing certain problems.

Sample Space (One Element of Probability Space)

A sample space is a set, denoted as S, which enumerates each and every possible outcome or simple events.
All possible simple events (event condition together with its outcomes).

Any events (to be observed) covers part of the sample space (the event space is a subset of sample space)

The equation of conditional probability should only be applied when P(A), P(B) and P(A*B) are reflects the probability of events under the same sample space.

Distribution Mixture

N1 = The population size of distribution 1
N2 = The population size of distribution 2
P3(x) = (P1(x)N1 + P2(x)N2)/(N1+N2)

Tensorflow Study Note, Part Two

The creation of tensor object under tensorflow framework.

  • The generation of tensors as the input of a computation graph is through tf.constant() and tf.Variable() constructor.
  • The basic arithmatic operation between tensors in a given computation node (ops) must be done through specialized operations such as tf.add, tf.mul, etc
  • The operations (and the entire graph) on tensors will be executed on a specified device (CPU or GPU).
  • The return of the execution of the computation graph is a ndarray of numpy

The programming style of the tensorflow framework.

A complete computation process is defined by a series of computation steps, the output of one step become the input of the next step.
In short, TF adopts functional programming.

TF does not require a "graph" object to be explicitly constructed, rather, "graph" are built through a series of function calls. All functions(ops) defined in the "graph" is carried implicitly with the tensor returned. The run method of session method in the execution phrase will call all function calls carried with the tensor input.

Matrix Representation

In tensorflow, matrix is represented by a two dimensional list or np array. Row vector is represented by a list, column vector is represented by the transpose of row vector. The vector must be represented in matrix form ([[1, 2]]) in stead of a plain list.

Under the notation of tf, a matrix (tensor) is a column vector which is composed of several row vectors.

N-row is determined by the length of the outer list, N-column is determined by the length of the inner list.

Input/Output of TF Operations

The input of an TF operation is typically tensors (matrix), and the output is typically tensors (matrix) also. Even scalars should be wrapped in tensors (matrices). This implies the flow of datasets.

The output of a tf.reduce_mean() could be a scalar or tensor (ndarray), the output of a tf.matmul() is a tensor (even as a tensor with rank 0).

Indexing and Item assignment on Tensor Object

Indexing on a tensor object is allowed while item assignment is not supported.

Tensorflow Study Note, Part One

Tensor v.s. Matrix v.s. Vector

Hierarchy: tensor > matrix > vector

a matrix is a tensor with rank equals 2, a vector is tensor with rank equals 1.
Vector is the basic building block of a multi-dimensional tensor

Overloaded Terminologies

Current days, the exact definition of terms such as shape, rank, dimension, degree differs greatly across the general CS community. In other words, "dimension" as a term is excessively overloaded.

  1. When referring to a point in an Euclidean space, people may use the term 3-D, 4-D to specify the number of dimensionality, which is number of coordinates that is needed to specify the position of a point in such a space. 3-D is the geometry property of a point with three elements (coordinates).

Example: 1-D => (1) 2-D => (1,2) 3-D => (1,2,1) 4-D => (1,2,4,1)

The term "dimension" here has to be interpreted together with the underlying assumption that the subject described is point in a Euclidean space. Therefore, the mathematic description of this point studied is a tuple (or list, array as in the context of different programming languages) with a length of x (which is the value of dimension).

Point is a mathematical object.

  1. When referring to an array, dimension is used to describe the least number of indices needed to specific an element in the array. 3-D is the structural property of the matrix (multi-dimension array).

Example: 1-D => (1,1,2) 2-D => ((1,2),(2,3))

Array is a data structure.

The dimension of a mathematical object and the dimension of a data structure share common characteristics, but reside in different domain. They are thus not equivalent.

  1. A tensor in tensorflow may not directly represent a geometry object (or other abstract object), like points, plane, etc. In most cases, a tensor is the representation of a set of such objects. The dimension of the object is thus different from the dimension of the tensor as a whole.

When referring to a tensor (in the context of tensorflow), dimension (1-D, 2-D, 3-D) is the depth of nested array (also called rank). Each dimension has its own length (indicates the number of elements allowed along this dimension/direction). Dimension represents the level of object in a tensor, implies that the member/elements of an object can be other objects. In this context, the length refers to the number of elements in a certain dimension).

Shape of A Tensor

In tensorflow, shape is the compact description of the structure of a tensor. It describes how many layers are there in this multi-dimen array (nested array), how many elements are there in each layer. The rank of a tensor is the depth of layers.

Example: shape = [dim(4), dim(3)] ==> [[2,2,2],[1,1,1],[1,1,1],[2,2,2]]
rank = len(shape) = 2
dim(4): index of this dim = 0, len of this dim = 4
dim(3): index of this dim = 1, len of this dim = 3

In the above case, the length of the first dim is 4, which represents the size of our data set (that we have 4 data points). The length of the second dim is 3, which indicates that each data point in this data set has 3 attributes.

Recursion v.s. Dynamic Programming

Recursive Thinking

  • Identifying for the base case: What is the easiest case ? What case(s) can I solve trivially ?
  • Redefining the difficult cases

DP Thinking

  • How can I build a path to solve a difficult case based on the easy case ?

Shared Characteristics Between Recursion and DP

Reasoning is about walking from the starting point, usually some known facts/existing materials, through a path and reach the predefined target.

The solution, which is the goal of problem solving, is the definition or description of such a path. The routine that is necessary to solve the problem is sequential in a forwarding manner because of the temporal nature of computing, however, the measure that takes to find the exact steps of such solutions may require backward reasoning. This is because the threads that leads to the target is often limited in comparison to the possible threads that can be pulled from the start point.

The logic behind recursion and dynamic programming when applied to solve a problem are both based on the inference chain, either through forwarding reasoning or backward reasoning.

When applied in the form of backward reasoning, the logic chain is built through the form of "if the goal is to reach E, then D has to be reached, etc, etc, etc".

When applied as a forward reasoning, the form changed into "From A, we can reach B, the from B we can reach C, etc, etc, etc".

The path can be found by either forwarding or backward reasoning, which are referred to recursive thinking and deduction thinking.

Recursion and DP actually refers more to the how the path is implemented than how it is discovered. Recursion is a style of software implementation as a direct translation from the backward reasoning mental procedure to the source code. DP, in comparison, is a style of algorithm implementation that reflects a way of natural forwarding reasoning.

Algorithms are the representations of logic to solve problems in the form of discrete mathematics in the context of computer science. Algorithm implementation can be strict or loose.

Why Recursion is Useful in Programming ?

Recursive programming avoids the need to write nested loops which is both difficult to understand and debug. It also improves code simplicity, Besides, recursion (make a function call in the body of itself) can implement logic that is nearly impossible to be done in loop structure by human beings.

In recursive programming, the control flow is forwarded through the change of argument values of recursive function calls. This beats loop control structure in cases when it requires that the loop counter changes in a manner other than natural incremental. Additionally, the mechanism of loop counter determines that it is not the best tool in traversalling ADTs.