Richard Socher, ICML 2011, RNN, deep learning


- 1. Parsing Natural Scenes and Natural Language with Recursive Neural Networks. Richard Socher, Cliff Chiung-Yu Lin, Andrew Y. Ng, Christopher D. Manning. ICML 2011. Presented by Jie Cao.
- 2. Outline • Context • Recursive Neural Network Definition • Input Representation • Output • Greedy Structure Predicting RNNs • Loss Function • Max-Margin Framework • Backpropagation Through Structure • L-BFGS • Experiment and Improved RNN
- 3. Recursive vs. Recurrent Neural Networks: a recursive NN applies the same weights over a tree structure, whereas a recurrent NN applies them along a linear sequence (a chain-structured special case of a tree).
- 4. f: X→Y (Input X )
- 5. Map Phrase into Vector Space
- 6. Word Embedding Matrix. Each word is mapped to a dense vector; the embeddings are learned from co-occurrence statistics. Collobert, R. and Weston, J. A unified architecture for natural language processing: deep neural networks with multitask learning. In ICML, 2008.
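As a concrete illustration, the embedding lookup can be sketched as below. The toy vocabulary, the dimension `n = 4`, and the `embed` helper are illustrative assumptions, not the paper's code (the paper uses pretrained 100-dimensional embeddings).

```python
import numpy as np

np.random.seed(0)

n = 4                                       # embedding dimension (illustrative)
vocab = {"the": 0, "cat": 1, "sat": 2}      # toy vocabulary
L = np.random.randn(n, len(vocab)) * 0.01   # embedding matrix, one column per word

def embed(word):
    """Return the dense n-dimensional vector for a word (a column of L)."""
    return L[:, vocab[word]]

x = embed("cat")
```

In practice `L` would be initialized from the Collobert & Weston embeddings rather than at random, and fine-tuned during training.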
- 7. Input Representation for Scene Images. Each segment i = 1, ..., N_segs in an image has a feature vector F_i (119 features per segment; about 78 segments per image). These features are mapped into an n-dimensional "semantic" space via a_i = f(W^sem F_i + b^sem), where W^sem is the matrix of parameters we want to learn, b^sem is a bias, and f is applied element-wise and can be any sigmoid-like function. Gould, S., Fulton, R., and Koller, D. Decomposing a Scene into Geometric and Semantically Consistent Regions. In ICCV, 2009.
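The mapping into semantic space can be sketched as follows. The semantic dimension `n_sem = 8` and the choice of tanh as the sigmoid-like nonlinearity are assumptions for illustration; 119 features per segment and roughly 78 segments per image follow the slide.

```python
import numpy as np

np.random.seed(0)
n_feats, n_sem = 119, 8   # 119 features per segment; semantic dim is illustrative

W_sem = np.random.randn(n_sem, n_feats) * 0.01
b_sem = np.zeros(n_sem)

def to_semantic(F_i):
    # a_i = f(W_sem F_i + b_sem); tanh stands in for the sigmoid-like f here.
    return np.tanh(W_sem @ F_i + b_sem)

F = np.random.randn(78, n_feats)           # ~78 segments per image (random stand-ins)
A = np.stack([to_semantic(f) for f in F])  # one semantic vector per segment
```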
- 8. f: X→Y (Output Y) • For the visual parser: • A visual tree is correct if all adjacent segments that belong to the same class (all segments are labeled) are merged into one super segment before any merges occur with super segments of different classes. • It does not matter how object parts are internally merged or how complete, neighboring objects are merged into the full scene image. • This yields a set of correct trees. • For the language parser: • Y(x) has only one element, the annotated ground-truth tree: Y(x) = {y}. How do we evaluate the error between a correct tree and a proposed tree y'? (Loss function.)
- 9. Recursive NN Definition. Given the vectors c_i, c_j of two adjacent candidate children, the new representation of parent (i,j) is p(i,j) = f(W [c_i; c_j] + b), and its score is s(i,j) = W^score p(i,j). C is the set of potential adjacent pairs; the newly merged parent is added recursively and the adjacency matrix is updated.
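The composition and scoring step can be sketched as a minimal example; the dimension `n = 8`, the tanh nonlinearity, and random parameter initialization are assumptions for illustration.

```python
import numpy as np

np.random.seed(0)
n = 8  # dimension of child/parent vectors (illustrative)

W = np.random.randn(n, 2 * n) * 0.01    # composition matrix
b = np.zeros(n)
W_score = np.random.randn(1, n) * 0.01  # scoring row vector

def compose(c_i, c_j):
    """Parent vector p(i,j) = f(W [c_i; c_j] + b) and its score s = W_score p."""
    p = np.tanh(W @ np.concatenate([c_i, c_j]) + b)
    s = float(W_score @ p)
    return p, s

p, s = compose(np.random.randn(n), np.random.randn(n))
```

Because the parent has the same dimension as its children, the same `compose` function can be applied recursively all the way up the tree.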
- 10. Greedy Structure Predicting RNNs
- 11. Greedy Structure Predicting RNNs
- 12. Greedy Structure Predicting RNNs
- 13. Parsing a sentence
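The greedy structure-predicting procedure on the preceding slides can be sketched as a loop that repeatedly merges the highest-scoring adjacent pair. The sizes and random parameters are illustrative assumptions; the real parser scores pairs with the trained RNN.

```python
import numpy as np

np.random.seed(0)
n = 8
W = np.random.randn(n, 2 * n) * 0.01
b = np.zeros(n)
W_score = np.random.randn(1, n) * 0.01

def compose(c_i, c_j):
    p = np.tanh(W @ np.concatenate([c_i, c_j]) + b)
    return p, float(W_score @ p)

def greedy_parse(leaves):
    """Repeatedly merge the highest-scoring adjacent pair until one node is left.
    Only adjacent nodes may merge, matching the sentence/segment adjacency
    constraint."""
    nodes = list(leaves)          # current frontier of node vectors
    tree_score = 0.0
    while len(nodes) > 1:
        cands = [compose(nodes[i], nodes[i + 1]) for i in range(len(nodes) - 1)]
        i = max(range(len(cands)), key=lambda k: cands[k][1])
        p, s = cands[i]
        nodes[i:i + 2] = [p]      # replace the merged pair with its parent
        tree_score += s
    return nodes[0], tree_score

root, total = greedy_parse([np.random.randn(n) for _ in range(5)])
```

The total tree score is the sum of the local merge scores, which is the quantity the max-margin objective later operates on.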
- 14. Category Classification in the RNN. Each node of the tree built by the RNN has an associated distributed feature representation. We can leverage this representation by adding a simple softmax layer to each RNN parent node (after removing the scoring layer) to predict class labels.
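A per-node softmax classifier can be sketched as below; the number of classes and the random weights are illustrative assumptions.

```python
import numpy as np

np.random.seed(0)
n, n_classes = 8, 3   # illustrative sizes

W_label = np.random.randn(n_classes, n) * 0.01

def softmax(z):
    e = np.exp(z - z.max())   # shift for numerical stability
    return e / e.sum()

def predict_label(p):
    """Class distribution for a node's distributed representation p."""
    return softmax(W_label @ p)

probs = predict_label(np.random.randn(n))
```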
- 15. Loss Function for Language. For a constituency (phrase-structure) parser, a constituent (non-terminal) is correct only if: 1. it dominates exactly the correct span of words, and 2. it is the correct type of constituent. Two parses of the same sentence: (S[1:7] (NP[1:1] Jim) (VP[2:2] ate) (NP[3:4] the cookies) (PP[5:7] in (NP[6:7] the bowl))) versus (S[1:7] (NP[1:1] Jim) (VP[2:7] ate (NP[3:7] the cookies (PP[5:7] in (NP[6:7] the bowl))))). The loss is the Hamming distance between the trees' sets of constituents.
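The span-plus-label criterion can be sketched by representing each constituent as a (label, start, end) triple; the two trees are taken from the slide's example, and computing precision/recall from the overlap is an illustrative assumption.

```python
# Constituents as (label, start, end); the two parses from the slide.
gold = {("S", 1, 7), ("NP", 1, 1), ("VP", 2, 2), ("NP", 3, 4),
        ("PP", 5, 7), ("NP", 6, 7)}
pred = {("S", 1, 7), ("NP", 1, 1), ("VP", 2, 7), ("NP", 3, 7),
        ("PP", 5, 7), ("NP", 6, 7)}

correct = gold & pred    # correct iff span AND label both match
precision = len(correct) / len(pred)
recall = len(correct) / len(gold)
```

Here the VP and one NP differ in span, so four of the six constituents match; the symmetric difference of the two sets gives the Hamming-distance-style loss.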
- 16. Loss Function for Image. For the visual parser there is a set of correct trees Y(x, l). The loss Δ(x, l, ŷ) for a proposed parse ŷ of input x with labels l penalizes each node of ŷ whose subtree does not occur in any correct tree, scaled by a constant κ.
- 17. RNN for Structure Prediction. Given the training set, we search for a function f with small expected loss on unseen inputs. T(x) is the set of possible trees for input x. We assume this problem can be described in terms of a computationally tractable max over a score function s. How do we define the margin?
- 18. Max-Margin. Hard margin: the score of every correct tree must exceed the score of every incorrect tree by at least the margin Δ. Soft margin: a slack variable is added to handle non-separable data. We minimize the resulting hinge loss, r_i(θ) = max_{ŷ ∈ T(x_i)} [s(x_i, ŷ) + Δ(x_i, l_i, ŷ)] − max_{y ∈ Y(x_i, l_i)} s(x_i, y); the max over the correct trees Y appears because an image has more than one correct tree.
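The structured hinge loss can be sketched with made-up toy scores (in the real system these would come from the RNN over candidate trees):

```python
# Toy sketch of the max-margin risk r_i: the highest (score + margin loss) over
# all candidate trees, minus the highest score among the correct trees.
# All numbers below are illustrative, not real model outputs.
candidates = {"t1": 2.0, "t2": 1.5, "t3": 0.5}   # s(x, y_hat) per candidate tree
delta = {"t1": 0.0, "t2": 1.0, "t3": 2.0}         # margin loss Delta(x, l, y_hat)
correct = {"t1"}                                   # Y(x, l); may hold several trees

augmented = max(candidates[y] + delta[y] for y in candidates)  # loss-augmented max
best_correct = max(candidates[y] for y in correct)
r = max(0.0, augmented - best_correct)
```

When a correct tree outscores every incorrect tree by more than its margin loss, `r` is zero; here the loss-augmented max (2.5) exceeds the best correct score (2.0), so `r` is positive.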
- 19. Max-Margin Framework
- 20. Backpropagation Through Structure
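One step of backpropagation through structure can be sketched for a single parent node: the error signal at the parent is pushed through the nonlinearity, yields gradients for W and b, and splits into signals for the two children. The dimensions, tanh nonlinearity, and random parameters are illustrative assumptions.

```python
import numpy as np

np.random.seed(0)
n = 4
W = np.random.randn(n, 2 * n) * 0.1
b = np.zeros(n)

def forward(c_i, c_j):
    return np.tanh(W @ np.concatenate([c_i, c_j]) + b)

def backward(c_i, c_j, p, delta_p):
    """Given the error signal delta_p at a parent p, compute dW, db and the
    signals passed down to each child."""
    dz = delta_p * (1 - p ** 2)        # through tanh
    children = np.concatenate([c_i, c_j])
    dW = np.outer(dz, children)
    db = dz
    d_children = W.T @ dz              # error routed to the children
    return dW, db, d_children[:n], d_children[n:]

c_i, c_j = np.random.randn(n), np.random.randn(n)
p = forward(c_i, c_j)
dW, db, d_i, d_j = backward(c_i, c_j, p, np.ones(n))
```

At an internal node, the signal arriving from above is summed with that node's own local gradients (score and softmax layers) and the recursion continues down to the leaves; gradients for the shared W accumulate across all nodes of the tree.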
- 21. (figure slide; no recoverable text)
- 22. Experiment in ICML’2011 The ﬁnal unlabeled bracketing F-measure of our language parser is 90.29%, compared to 91.63% for the widely used Berkeley parser (Petrov et al., 2006) (development F1 is virtually identical with 92.06% for the RNN and 92.08% for the Berkeley parser). Unlike most previous systems, our parser does not provide a parent with information about the syntactic categories of its children. This shows that our learned, continuous representations capture enough syntactic information to make good parsing decisions.
- 23. Experiment
- 24. Improvement: allow different W matrices for different pairs of syntactic categories.
- 25. Thanks
