Hidden Markov Model

The assumption made by HMM

The assumption is that for all random variables in the (conditional) probability chain, the conditions is only made on the previous n variables in the sequence. In another way, the conditional probability of the variable is independent of the variables other than the previous n variables.


The tagging problem can be abstracted as to model the joint probability of two sequences: sentence sequence and tag sequence. In a HMM approach to solve this joint probability. Tag sequence is modeled (approximated) as a Markov sequence. Sentence sequence is modeled as a independent occurring of events that are only conditioned on the tagging of the corresponding position.

Generative or Discriminative

HMM is by definition is generative model because it models the sequence with joint probability rather than conditional probability.

Interpretation from ML Perspective

The training objective of the HMM is a probabilistic model that can not output target labeling directly. Instead, a labeling function has to be defined in addition to the HMM probabilistic model. The training set of the HMM model consists of training samples made by a pair of word sequence and pos tagging sequence.

The training process is essentially a counting process in which the statistical property of the labeling sequences (pos tagging) is estimated. Also, the conditionally probability of word/tagging pair is estimated. These estimates then are used to generate the parameters of the HMM model (transitional probability and emission probability).

In prediction, the labeling function (output function) acquire parameters in the HMM to make predictions of the labeling of the new word sequences.

Origin of Name: Probability Distribution

"Distribution" indicates that the sum of probability "1" is divided and distributed into the probability of each random variable.

Distributed Cluster Study Note

  1. Application dependency v.s. Software dependency
    Application dependency is one application in one container depends on another application on another container.
    Software dependency is one software depends on other software on the same container.

Grammar Study Note, Part One

  1. Pos tagging indicates the property of the word itself.
  2. Dependency tagging indicates the function (or relative information) of the word.
  3. Both N-gram model and PCFG model assumes the a Markov property and then applies the conditional probability.

Dependency Grammar Notions

  1. Subject-Predicate relationship
    subject: The person or thing about whom the statement is made
    predicate: The purpose of the predicate is to complete an idea about the subject

Part of Speech Tagging

  1. Ambiguity:
    Local Preference -> The probability of a part-of-speech tag for a specific word in the vocabulary
    Contextual Preference -> The probability of a part-of-speech tag for a specific word in given context.

Java Multi-threading

How to Write Multi-threading Code ?

The multi-threading programming of Java is achieved through the use of Thread object.
1. Declare a class to be the subclass of Thread class and overrides its run method.
2. Declare a class and implement the Runnable interface, then implement the run method. (Recommended)

The create of a new thread requires that a Thread object to be created with a Runnable object given as the first argument. To start running a new thread, a thread.start() method is provided. Also, thread.join() method is provided to synchronize states between different threads.


Static class members are shared by all threads (atomic w/r). Keyword: Synchronized is provided to ensure the all threads execute certain methods sequentially.

Class method members (as well as local variables in the method definition) is independent for each threads.

Context-Free Grammar Study Note, Part Two

Challenges of Parsing CFG Correctly

Syntactic Ambiguity

  1. Part-of-speech Ambiguity
    : The grammar can assign multiple part-of-speech tags (word categories) to a lexicon
  2. Structural Ambiguity
    : The grammar can assign more than one possible parse to a sentence

    • Attachment Ambiguity
      : A particular constituent can be attached to the parse tree at more than one place
    • Coordination Ambiguity
      : The reference of a lexicon has multiple choices
    • Local Ambiguity
      : Some part of the sentence has more that one parse while the whole sentence is not ambiguous
    • etc
      : ...

Phrasal Nodes

  1. a phrasal node is a non-terminal node

CFG Tree v.s. Dependency Grammar Tree

Strucutrally cfg tree has little difference with dp tree.
1. In cfg tree, the internal nodes are phrasal nodes (non-lexicalized). In dp tree, the internal nodes are by definition lexicalized.
2. Both cfg and dp tree has pos tags as the next-to-leaf node. For dp tree this pos tag can be replaced by synthetic parts-of-speech (subject, object, predicate, etc).

The Meaning of Arithmetic Operations and Mathematical Terms

  1. Addition
    The amount of adder and addee combined

  2. Multiplication
    A repeated addition

  3. Exponentiation
    A repeated multiplication

  4. Multinomial
    An algebraic expression constructed by the sum of multiple mathematical terms (could be exponential with a degree of 1, 2, 3, etc with any base).

  5. Polynomial
    A special case of Multinomial expression. It is constructed by the sum of powers of one or more variables multiplied by coefficients.

  6. Algebraic Expression
    An expression in which a finite number of symbols is combined using only the operations of addition, subtraction, multiplication, division and exponentiation with constant rational exponent.

  7. Term
    The components in an expression connected by the addition or subtraction.

  8. Factor
    The elements in a term, which is connected by multiplication

  9. Coefficient
    The constant in a term, which is also a factor.

图像文法(Image Grammar)


文法在文本领域的应用(比如Context-Free Grammar)可以看成是对自然语言(字符串)在结构上的分解和抽象(分解+抽象=解析),I want to eat a burger => NP VP. 这种分解和抽象后形成的数据结构包含里文本字符串的结构信息(和Metadata相似),文法解析的过程是将自然语言字符串这种非结构化数据结构化的过程,这些结构信息以Tree这种数据结构在计算机里被表示,处理和存储。(非结构化数据=无法抽象出数据模型或者没有整理成预定义的数据模型的数据)



File Management Guidelines

Programming Language Test Files

  1. All programming language test files goes into folder named in the following convention: "python-test1", "python-test2" , except for the case in ruling 2 given below.
  2. The folder naming of the specialized test files should reflect the objective of the test, for example: "stdnlp-test"

Example files and Temp files

  1. Temp files and example files should go into waffle, apollo, and zeus project source tree when they are needed to be versioned.
  2. Temp files should be periodically moved into one target folder or removed.
  3. Example files should be periodically moved into the example project source tree: coding-examples

Tutorial Folder and Examples Folder

  1. Repositories hosting tutorial projects designated to a specific topic should be named xxxx-tutorial.
  2. Repositories hosting loosely organized example files should be named xxxx-examples


No upper-case char should appear in project name and file names. Dash is allowed in project name, underscore is allowed in file name.

  1. Python
    modulename/module_name, packagename, ClassName, method_name, ExceptionName, function_name, GLOBAL_CONSTANT_NAME, global_var_name, instance_var_name, function_parameter_name, local_var_name

  2. Java
    FileName, packagename, ClassName, methodName, CONSTANT_NAME, localVarName, fieldName, parameterName, TypeVarName (T, single char)

  3. C/C++
    file_name, foldername, local_var_name, ClassName, FunctionName, class_data_member_, struct_data_member