CORPORA-BASED GENERATION OF
DEPENDENCY PARSER MODELS FOR
NATURAL LANGUAGE PROCESSING
by Edmond Lepedus
supervised by Marek Grześ, Christian Kissig and Laura Bocchi
BACKGROUND
Dependency Parsing
• Structure consists of dependencies between words:
Hello world
(head) (dependent)
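A dependency structure like the one above can be represented as a mapping from each word to the index of its head. This is an illustrative sketch only, not from the slides; following the slide's label order, "Hello" is treated as the head and "world" as its dependent, with index 0 as the artificial root.

```python
# A dependency parse as head indices: each word index maps to the index
# of its head; 0 conventionally denotes the artificial root.
sentence = ["Hello", "world"]
heads = {1: 0, 2: 1}  # word 1 ("Hello") attaches to the root; word 2 ("world") depends on word 1

def dependents(head_index, heads):
    """Return the indices of all words whose head is head_index."""
    return [dep for dep, head in heads.items() if head == head_index]
```

For example, `dependents(1, heads)` recovers `[2]`, i.e. "world" is the sole dependent of "Hello".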
Stanford CoreNLP
• Free, open-source NLP toolkit
• Includes a dependency parser backed by a neural network classifier
• Parses 1000 sentences per second at 92.2% accuracy
• Trained on manually annotated text
AIM
Aim
Train a classifier using an unparsed corpus of English-language text
MOTIVATION
Motivation
• Decrease the cost of training data
• Increase the availability of training data
• Increase parsing accuracy
• Enable the parsing of languages with few remaining speakers
APPROACH
Overview
1. Create a ‘blank’ model
2. Parse the corpus with the model & log decisions
3. Extract heuristics from the corpus & parse log
4. Generate training examples by modifying the logged decisions to fit the discovered heuristics
5. Train the model on the new examples
Diagram
Create a ‘blank’ model → Parse & log → Extract heuristics → Generate training examples → Train new model
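The data flow in the diagram can be sketched as a short loop. Everything below is a hypothetical illustration of the workflow, not CoreNLP code: the 'blank' model is a stand-in that always proposes a left arc labelled 'unknown', and the heuristic is the frequent-bigram assumption described under Heuristic Extraction.

```python
from collections import Counter

def create_blank_model():
    # 1. A 'blank' model: always predicts a left arc with an 'unknown' label.
    return lambda w1, w2: ("left-arc", "unknown")

def parse_and_log(model, corpus):
    # 2. Record every decision the model makes on adjacent word pairs.
    log = []
    for sentence in corpus:
        for w1, w2 in zip(sentence, sentence[1:]):
            action, label = model(w1, w2)
            log.append({"pair": (w1, w2), "action": action, "label": label})
    return log

def extract_heuristics(corpus, threshold=2):
    # 3. Frequent bigrams are assumed to indicate a dependency.
    counts = Counter(pair for s in corpus for pair in zip(s, s[1:]))
    return {pair for pair, n in counts.items() if n >= threshold}

def generate_examples(log, heuristics):
    # 4. Keep only logged decisions whose word pair a heuristic supports
    #    (a simplification of "modifying decisions to fit the heuristics").
    return [entry for entry in log if entry["pair"] in heuristics]

corpus = [["the", "cat", "sat"], ["the", "cat", "ran"]]
model = create_blank_model()
log = parse_and_log(model, corpus)
examples = generate_examples(log, extract_heuristics(corpus))
# Step 5 (retraining on `examples`) is omitted here.
```

On this toy corpus, only the bigram ("the", "cat") occurs twice, so only the two logged decisions for that pair survive as training examples.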
IMPLEMENTATION
Blank Model Creation
• Outputs left arcs with a custom ‘unknown’ label
• Supports the creation of new training examples
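A minimal stand-in for the 'blank' model's behaviour, assuming a transition-based parser where each decision looks at the stack top and buffer front (the function and field names are illustrative, not the project's actual code):

```python
# The 'blank' model's decision function: regardless of the input words,
# it proposes a left arc with the custom 'unknown' label, giving the
# training loop an initial set of decisions to later correct.
def blank_decision(stack_top, buffer_front):
    return {"action": "left-arc", "label": "unknown"}
```

Because every decision is identical, the first parse of the corpus is uninformative on its own; its value is in producing a complete, loggable decision sequence to rewrite against the heuristics.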
Parse Decision Logs
• Log every parse decision to YAML, capturing the information required for training
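The slide's YAML example is not preserved in this transcript. As a stand-in, a log entry might be serialised like this; the field names (`step`, `action`, `label`, `stack`, `buffer`) are assumptions, not the project's actual schema, and a real implementation would use a YAML library rather than string formatting:

```python
# Hand-rolled serialisation of one parse decision as a YAML list item.
def log_entry_yaml(step, action, label, stack, buffer):
    lines = [
        f"- step: {step}",
        f"  action: {action}",
        f"  label: {label}",
        f"  stack: [{', '.join(stack)}]",
        f"  buffer: [{', '.join(buffer)}]",
    ]
    return "\n".join(lines)

entry = log_entry_yaml(1, "left-arc", "unknown", ["Hello"], ["world"])
```

Logging the parser configuration (stack and buffer) alongside each action is what makes the entries usable as training examples later.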
Heuristic Extraction
• Count bigram occurrences
• Assume that frequent bigrams indicate a dependency
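The bigram heuristic can be sketched with a plain counter; the frequency threshold here is an illustrative parameter, not a value from the project:

```python
from collections import Counter

def frequent_bigrams(sentences, threshold):
    """Count adjacent word pairs across the corpus; pairs occurring at
    least `threshold` times are taken as evidence of a dependency."""
    counts = Counter()
    for words in sentences:
        counts.update(zip(words, words[1:]))
    return {pair: n for pair, n in counts.items() if n >= threshold}

bigrams = frequent_bigrams([["the", "cat", "sat"], ["the", "cat", "ran"]], 2)
```

Here only ("the", "cat") clears the threshold, so it is the only pair treated as a likely head–dependent relation.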
Training Example Generation
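The examples on these slides are not preserved in the transcript. One plausible reading of step 4 of the Overview ("modifying the logged decisions to fit the discovered heuristics") is sketched below; the field names and the shift/left-arc rewrite rule are assumptions for illustration, not the project's actual logic:

```python
# Rewrite a logged decision: if the word pair matches a discovered
# heuristic, keep/confirm the arc; otherwise replace it with a shift,
# so the generated example no longer asserts an unsupported dependency.
def to_training_example(logged, heuristic_pairs):
    example = dict(logged)
    example["action"] = "left-arc" if logged["pair"] in heuristic_pairs else "shift"
    return example

ex_pos = to_training_example({"pair": ("the", "cat"), "action": "left-arc"},
                             {("the", "cat")})
ex_neg = to_training_example({"pair": ("cat", "sat"), "action": "left-arc"},
                             {("the", "cat")})
```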
RESULTS
Post-training Parses
FURTHER WORK
Further Work
• Improve efficiency to enable the use of larger corpora
• Develop better heuristic analyses
• Implement arc labels
CONCLUSION
Conclusion
• We modified the Stanford CoreNLP toolkit to enable the creation of ‘blank’ parser models
• We developed a workflow for training parser models without using annotated corpora
• We showed that this quickly yields qualitative improvements in parser outputs over the ‘blank’ models
• We proposed three avenues for further research
ANY QUESTIONS?
REFERENCES
[1] C. Manning, M. Surdeanu, J. Bauer, J. Finkel, S. Bethard, and D. McClosky, “The Stanford CoreNLP Natural Language Processing Toolkit,” in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Stroudsburg, PA, USA, 2014, pp. 55–60.
[2] J. Nivre, “Dependency Parsing,” Language and Linguistics Compass, vol. 4, no. 3, pp. 138–152, Mar. 2010.
[3] D. Chen and C. D. Manning, “A Fast and Accurate Dependency Parser using Neural Networks,” in Proceedings of EMNLP, 2014, pp. 740–750.
[4] D. Jurafsky and J. H. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Upper Saddle River, NJ: Prentice-Hall, 2000.
