- The embedding of a word is the hidden-layer output (a vector) obtained when the word's one-hot vector is fed as input.
- The target value in the training data carries sequential information (the word order in a sentence)
- The hidden-layer weight matrix is the word-vector lookup table
- Word embedding is a by-product of language modelling
- The embedding layer is just a linear layer whose weight matrix is the lookup table: it maps a one-hot input to the corresponding embedding vector
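The equivalence above can be checked directly: multiplying a one-hot vector by the hidden-layer weight matrix selects one row of it, which is exactly a table lookup. A minimal numpy sketch (the vocabulary size, dimensions, and random weights are illustrative):

```python
import numpy as np

vocab_size, embed_dim = 5, 3
rng = np.random.default_rng(0)
W = rng.normal(size=(vocab_size, embed_dim))  # hidden-layer weight matrix

word_id = 2
one_hot = np.zeros(vocab_size)
one_hot[word_id] = 1.0

# A one-hot vector times W just selects row `word_id` of W
embedding = one_hot @ W
print(np.allclose(embedding, W[word_id]))  # True: linear layer == lookup table
```

This is why frameworks implement embedding layers as an indexed lookup rather than an actual matrix multiply; the result is the same but the lookup skips the wasted multiplications by zero.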
- The embedding of categorical data is learned during model training, just like a word embedding.
- Here the target value carries prediction information (the value being predicted), as opposed to the sequential information behind word embeddings
- A pre-trained embedding can be reused when training a new model to improve performance (both accuracy and training speed)
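The categorical-embedding idea above can be sketched end to end: an embedding table and a linear head are trained jointly on a toy supervised task, and only the rows that are looked up receive gradient updates. Everything here (category count, targets, learning rate) is a made-up illustration, written in plain numpy rather than a framework:

```python
import numpy as np

rng = np.random.default_rng(0)
n_cats, dim = 4, 2
E = rng.normal(scale=0.1, size=(n_cats, dim))   # embedding lookup table (trainable)
w = rng.normal(scale=0.1, size=dim)             # linear output head (trainable)

# Toy supervised data: integer category id -> target value to predict
cats    = np.array([0, 1, 2, 3])
targets = np.array([1.0, 2.0, 3.0, 4.0])

lr = 0.05
for _ in range(2000):
    for c, t in zip(cats, targets):
        e = E[c]                      # lookup == one-hot vector times E
        err = e @ w - t               # prediction error
        g_w, g_e = err * e, err * w   # gradients of 0.5 * err**2
        w -= lr * g_w
        E[c] -= lr * g_e              # only the looked-up row is updated

mse = np.mean([(E[c] @ w - t) ** 2 for c, t in zip(cats, targets)])
```

After training, `E` is a dense, learned representation of the categories and can be saved and reused as a pre-trained embedding in another model, exactly as the note above suggests.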
- Ordinal values carry information but take discrete values (by definition); they should be normalized to a 0-1 (or -1 to 1) scale before being used as DNN input
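For the ordinal case, a simple min-max rescaling is enough; a small sketch (the `minmax_scale` helper and the 1-5 rating example are illustrative, not from any particular library):

```python
import numpy as np

def minmax_scale(x, lo=None, hi=None):
    """Rescale ordinal codes linearly onto the [0, 1] range."""
    x = np.asarray(x, dtype=float)
    lo = x.min() if lo is None else lo
    hi = x.max() if hi is None else hi
    return (x - lo) / (hi - lo)

ratings = np.array([1, 2, 3, 4, 5])   # e.g. a 1-5 ordinal rating
scaled = minmax_scale(ratings)
print(scaled)  # [0.   0.25 0.5  0.75 1.  ]
```

Passing explicit `lo`/`hi` bounds (e.g. the known rating range) keeps the scaling consistent between training and inference even when a batch does not contain the extreme values.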