• Sentiment Analysis
• FFN, RNN & LSTM
• Live video commentary using CNN-LSTM
Each sentence is tokenized into words, the sentiment of each word is scored, and the per-word scores are combined to determine the overall sentiment of the sentence.
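A minimal sketch of this word-level approach, assuming a hypothetical hand-built lexicon (the words and scores below are illustrative, not from a real lexicon):

# Toy sentiment lexicon; the words and scores are illustrative assumptions.
LEXICON = {"good": 1.0, "great": 1.0, "bad": -1.0, "terrible": -1.0}

def sentence_sentiment(sentence):
    # Tokenize the sentence into lowercase words.
    words = sentence.lower().split()
    # Sum per-word scores; words not in the lexicon contribute 0.
    score = sum(LEXICON.get(w, 0.0) for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentence_sentiment("the movie was great"))  # positive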
In the learned vector space, words with positive sentiment lie near one another.
Word2vec is a two-layer neural net that
processes text. Its input is a text corpus
and its output is a set of vectors:
feature vectors for words in that corpus.
While Word2vec is not a deep neural
network, it turns text into a numerical
form that deep nets can understand.
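As a quick illustration of this input/output relationship, here is a sketch using the gensim library (gensim is an assumption here, not part of the original pipeline, which builds Word2Vec directly in TensorFlow below):

from gensim.models import Word2Vec  # gensim 4.x API

# A tiny toy corpus: a list of tokenized sentences.
corpus = [["the", "movie", "was", "great"],
          ["the", "film", "was", "terrible"]]

# sg=1 selects the skip-gram architecture; vector_size is the embedding width.
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)

print(model.wv["movie"].shape)         # (50,) feature vector for "movie"
print(model.wv.most_similar("movie"))  # nearest words by cosine similarity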
1. Identify the dataset used to create the Word2Vec relations: http://mattmahoney.net/dc/
2. Read the data into a list of strings.
3. Build the dictionary and replace rare words with a unique UNK token (steps 2–3 are sketched in code after this list).
1. Dictionary – maps words (strings) to their codes (integers)
2. Count – maps words (strings) to their number of occurrences
3. Reverse Dictionary – maps codes (integers) to words (strings)
4. Generate a training batch for the skip-gram model.
5. Build and train a skip-gram model (sketched below, after the skip-gram description).
1. Construct the SGD optimizer using a learning rate of 1.0.
2. Compute the cosine similarity between minibatch examples and all embeddings.
6. Begin training
7. Visualize the embeddings
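Steps 2 and 3 can be sketched as follows, in the style of the classic TensorFlow word2vec tutorial (the helper names read_data and build_dataset, and the use of a zip archive such as text8.zip from the URL above, are assumptions):

import collections
import zipfile

def read_data(filename):
    # Step 2: read the first file in the zip archive into a list of word strings.
    with zipfile.ZipFile(filename) as f:
        return f.read(f.namelist()[0]).decode("utf-8").split()

def build_dataset(words, vocabulary_size):
    # Step 3: keep the most frequent words; all rarer words map to the UNK token.
    count = [["UNK", -1]]
    count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
    dictionary = {word: i for i, (word, _) in enumerate(count)}  # word -> code
    data = [dictionary.get(word, 0) for word in words]           # UNK has code 0
    count[0][1] = data.count(0)                                  # UNK occurrences
    reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))  # code -> word
    return data, count, dictionary, reverse_dictionary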
The skip-gram model is a neural-network implementation that estimates the probability of two words occurring near each other: given a center word, it predicts the surrounding context words.
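Steps 4–7 build and train exactly this model. A sketch of the model graph in the TensorFlow 1.x style used in these notes (batch_size, vocabulary_size, embedding_size, num_sampled, and valid_dataset are assumed to be defined elsewhere; the NCE loss follows the classic tutorial):

import math
import tensorflow as tf  # TensorFlow 1.x graph-mode API

# A minibatch of (center word, context word) index pairs.
train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])

# The embedding matrix: one feature vector per vocabulary word.
embeddings = tf.Variable(
    tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
embed = tf.nn.embedding_lookup(embeddings, train_inputs)

# NCE loss: learn to distinguish the true context word from sampled noise words.
nce_weights = tf.Variable(tf.truncated_normal(
    [vocabulary_size, embedding_size], stddev=1.0 / math.sqrt(embedding_size)))
nce_biases = tf.Variable(tf.zeros([vocabulary_size]))
loss = tf.reduce_mean(tf.nn.nce_loss(
    weights=nce_weights, biases=nce_biases, labels=train_labels,
    inputs=embed, num_sampled=num_sampled, num_classes=vocabulary_size))

# Step 5.1: construct the SGD optimizer using a learning rate of 1.0.
optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)

# Step 5.2: cosine similarity between minibatch examples and all embeddings.
norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keepdims=True))
normalized_embeddings = embeddings / norm
valid_embeddings = tf.nn.embedding_lookup(normalized_embeddings, valid_dataset)
similarity = tf.matmul(valid_embeddings, normalized_embeddings, transpose_b=True)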
import tensorflow as tf  # TF 1.x API; hyperparameters below are assumed defined

def lstm_cell():
    return tf.contrib.rnn.BasicLSTMCell(lstm_size)

stacked_lstm = tf.contrib.rnn.MultiRNNCell(
    [lstm_cell() for _ in range(number_of_layers)])
initial_state = state = stacked_lstm.zero_state(batch_size, tf.float32)
for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    output, state = stacked_lstm(words[:, i], state)
    # The rest of the code uses `output`, e.g. to predict the next word.
Stacking LSTM layers improves performance, but only up to about 8 layers; beyond that depth, performance degrades.