2. How do different Neural Nets
perform on the same task?
• Paper: "Comparative Study of CNN and RNN for Natural
Language Processing"
• NLP tasks: sentiment/relation classification, textual
entailment, answer selection, question-relation matching
and part-of-speech tagging.
6. How well do RNNs and CNNs
work for sentiment analysis?
• CNNs are hierarchical architectures; RNNs are sequential architectures.
• Are CNNs better for sentiment classification, since
sentiment is usually determined by a few key phrases?
• GCNN outperforms comparable LSTM results on the
Google Billion Word benchmark.
• A sentiment analysis of Russian tweets found that GRU
outperforms LSTM and CNN.
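The hierarchical-vs-sequential contrast can be made concrete with a minimal numpy sketch (illustrative only, with random weights; not the paper's implementation): the CNN encoder extracts n-gram features in parallel and max-pools them, while the GRU encoder consumes tokens one at a time through its gated state update.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sentence: T tokens, each a d-dimensional embedding (random here).
T, d, h = 12, 8, 16
x = rng.normal(size=(T, d))

def cnn_encode(x, n_filters=h, width=3):
    """Hierarchical view: convolve filters over n-gram windows,
    then max-pool over time to get a fixed-size sentence vector."""
    T, d = x.shape
    W = rng.normal(size=(n_filters, width * d))
    windows = np.stack([x[t:t + width].ravel() for t in range(T - width + 1)])
    feature_maps = np.maximum(windows @ W.T, 0.0)  # ReLU, shape (T-width+1, n_filters)
    return feature_maps.max(axis=0)                # max-pool -> (n_filters,)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_encode(x, hidden=h):
    """Sequential view: a GRU cell consumes tokens left to right,
    carrying a hidden state; the final state summarizes the sentence."""
    d = x.shape[1]
    Wz, Wr, Wc = (rng.normal(size=(hidden, d + hidden)) for _ in range(3))
    s = np.zeros(hidden)
    for t in range(x.shape[0]):
        xs = np.concatenate([x[t], s])
        z = sigmoid(Wz @ xs)                       # update gate
        r = sigmoid(Wr @ xs)                       # reset gate
        c = np.tanh(Wc @ np.concatenate([x[t], r * s]))
        s = (1 - z) * s + z * c                    # gated state update
    return s

cnn_vec = cnn_encode(x)
gru_vec = gru_encode(x)
print(cnn_vec.shape, gru_vec.shape)  # (16,) (16,)
```

Both encoders map a variable-length sentence to a fixed-size vector for a classifier; the difference is that the CNN only sees local windows, whereas the GRU's state can, in principle, carry information across the whole sentence.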
7. Experiment design
• Always train from scratch with no extra knowledge, e.g., no pretrained
word embeddings.
• Always train using a basic setup, without complex tricks such as
batch normalization.
• Search for optimal hyperparameters.
• Dataset: Stanford Sentiment Treebank (SST) (Socher et al., 2013).
The task is to predict the sentiment (positive or negative) of movie
reviews. Split into 6920 train, 872 dev and 1821 test sentences.
• “Unlike the surreal Leon, this movie is weird but likeable.”
• “Unlike the surreal but likeable Leon, this movie is weird.”
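The two example sentences above contain exactly the same words; only word order flips the sentiment. A quick check (plain Python, with simple lowercase/punctuation-stripping tokenization as an assumption) shows that an order-insensitive bag-of-words representation cannot tell them apart, which is why sequence-aware architectures are compared on this task:

```python
from collections import Counter
import string

# The two example sentences from the slide: same words, different sentiment.
s1 = "Unlike the surreal Leon, this movie is weird but likeable."
s2 = "Unlike the surreal but likeable Leon, this movie is weird."

def bag_of_words(sentence):
    """Order-insensitive representation: lowercase, strip punctuation,
    count word frequencies."""
    cleaned = sentence.lower().translate(str.maketrans("", "", string.punctuation))
    return Counter(cleaned.split())

# Identical bags of words, so any order-insensitive model must
# assign both sentences the same sentiment score.
print(bag_of_words(s1) == bag_of_words(s2))  # True
```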
8. What were the results?
GRU and CNN are comparable when sentence lengths are small, e.g.,
<10 tokens; GRU then gains an increasing advantage over CNN on
longer sentences.
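A length-dependent comparison like this can be produced by bucketing test sentences by length and computing per-bucket accuracy. The helper below sketches that grouping logic; the prediction data is hypothetical, not the paper's results:

```python
from collections import defaultdict

def accuracy_by_length_bucket(examples, bucket_size=10):
    """Group (sentence_length, correct) pairs into length buckets and
    report accuracy per bucket, keyed by the bucket's lower bound."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for length, correct in examples:
        b = (length // bucket_size) * bucket_size  # e.g., 12 -> bucket 10
        totals[b] += 1
        hits[b] += int(correct)
    return {b: hits[b] / totals[b] for b in sorted(totals)}

# Hypothetical predictions: (sentence length, was the model correct?)
preds = [(5, True), (8, True), (12, True), (15, False), (25, False), (27, True)]
print(accuracy_by_length_bucket(preds))
# {0: 1.0, 10: 0.5, 20: 0.5}
```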
9. GRU was better, but how
stable are the results?
Variations in hidden size and batch size cause large oscillations,
but GRU is better on average.
11. References
• "Comparative Study of CNN and RNN for Natural Language
Processing", Wenpeng Yin, Katharina Kann, Mo Yu and
Hinrich Schütze — https://arxiv.org/pdf/1702.01923.pdf
• "Smart Reply: Automated Response Suggestion for Email" —
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45189.pdf
• "The Unreasonable Effectiveness of Recurrent Neural
Networks" — http://karpathy.github.io/2015/05/21/rnn-effectiveness/