This document summarizes a presentation in Japanese on deep reinforcement learning for coreference resolution models. The presentation covered:
1) An introduction to coreference resolution and its applications in natural language processing.
2) Challenges in designing effective loss functions for neural coreference models.
3) Prior work using heuristic loss functions and its limitations.
4) A proposed reinforcement learning approach using the REINFORCE algorithm and reward rescaling that directly optimizes coreference metrics and improves over heuristic losses.
5) Experimental results showing the reinforcement learning model makes fewer severe errors.
1. NN論文を肴に酒を飲む会 #5 (a meetup for discussing neural network papers over drinks)
Presenter: Shitian Ni (倪石天)
EMNLP 2016
Deep Reinforcement Learning for Mention-Ranking Coreference Models
Kevin Clark, Christopher D. Manning
Computer Science Department, Stanford University
3. Self-introduction
Shitian Ni (倪石天)
Tokyo Institute of Technology, School of Engineering
• Topcoder blue
• Kaggle silver medalist (Recruit Restaurant Visitor Forecasting)
• Nvidia Deep Learning Institute TA
4. Coreference
• Identify all noun phrases (mentions) that refer to the same real-world entity
• The grammatical relation between two or more words that share a common referent
• Japanese term: 同一指示
5. Coreference
• Example: My university that has TSUBAME 3.0, which is a TOP500 supercomputer that accelerates my research but cost Tokyo Tech a lot of money, is located in Oookayama.
12. Applications
• Full text understanding
• Text summarization
• Information retrieval
• Machine translation
  • I have a dog. It is 2 years old. <-> 2歳の犬を飼っている (literally "I have a two-year-old dog")
• Chatbot question answering
  • I want to eat Japanese food. Where can I find that?
14. Neural Mention-Ranking Model
• m: mention
• c: candidate antecedent
• s(c, m): compatibility score for coreference between c and m
• [Figure: network with an input layer, a hidden layer, and a scoring layer that outputs s(c, m)]
• Trained with heuristic loss functions that are tuned via hyperparameters
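A minimal sketch of the ranking step, assuming the scores s(c, m) have already been produced by the network. The function name `rank_antecedents`, the `"NA"` label for "no antecedent", and all score values are illustrative assumptions, not taken from the slides.

```python
from typing import Dict, Tuple

NA = "NA"  # hypothetical label meaning "no antecedent" (the mention starts a new entity)


def rank_antecedents(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the candidate antecedent c with the highest score s(c, m)."""
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up scores for the mention "It" in "I have a dog. It is 2 years old."
scores_for_it = {NA: -0.3, "I": -1.2, "a dog": 2.1}
best_candidate, best_score = rank_antecedents(scores_for_it)
print(best_candidate, best_score)
```

With these made-up scores the mention "It" is linked to "a dog"; if `NA` had the highest score, the mention would be treated as new.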
17. Challenge
• Finding effective error penalties for the loss calculation
• Some errors are severe, some errors are minor
• Severe error example: Bill's girlfriend is a friend of Michael's wife.
• Minor error example: It is raining. That is my dog.
23. Error types
• False New: a mention refers to something that was already mentioned, but it is treated as appearing for the first time
• False Anaphoric (照応, anaphora): a mention refers to something appearing for the first time, but it is wrongly judged coreferent with another mention
• False Link: a mention refers to something that has appeared two or more times, but it is wrongly judged coreferent with a different mention
• Example sentence (the slide shows one annotated copy per error type): New I bought a gift which is a chocolate for my girlfriend.
24. Prior work: Heuristic Loss Function
• Use a max-margin loss:
$$L(\theta) = \sum_{i} \max_{c \in C(m_i)} h(c, m_i)\,\bigl(1 + s(c, m_i) - s(t_i, m_i)\bigr)$$
• The max ranges over the candidate coreference decisions for $m_i$
• $h(c, m_i)$ is the cost for this coreference decision
• $1 + s(c, m_i) - s(t_i, m_i)$ is the loss for scoring this decision too highly
• $t_i$ := the highest-scoring true antecedent of $m_i$
• Costs for linking $m_i$ to a candidate antecedent $c \in C(m_i)$:
$$h(c, m_i) = \begin{cases}
0 & \text{if } c \in T(m_i) \quad \text{(} c \text{ and } m_i \text{ are coreferent)} \\
\alpha_{FN} & \text{if } c = \mathrm{NA} \wedge T(m_i) \neq \{\mathrm{NA}\} \quad \text{(false new error)} \\
\alpha_{FA} & \text{if } c \neq \mathrm{NA} \wedge T(m_i) = \{\mathrm{NA}\} \quad \text{(false anaphoric error)} \\
\alpha_{WL} & \text{if } c \neq \mathrm{NA} \wedge c \notin T(m_i) \quad \text{(wrong link error)}
\end{cases}$$
• The penalties $\alpha_{FN}$, $\alpha_{FA}$, $\alpha_{WL}$ must be tuned
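The max-margin loss with heuristic costs can be sketched as follows for a single mention. This is an illustration under stated assumptions, not the authors' implementation: the penalty values, the `"NA"` label, the function names, and the scores are all hypothetical. The false anaphoric case is checked before the wrong link case, because the two conditions on the slide overlap when T(m_i) = {NA}.

```python
from typing import Dict, Set

NA = "NA"  # hypothetical label for "no antecedent"

# Hypothetical penalty values; the slides only say that they must be tuned.
ALPHA_FN, ALPHA_FA, ALPHA_WL = 0.8, 0.4, 1.0


def heuristic_cost(c: str, true_antecedents: Set[str]) -> float:
    """Cost h(c, m_i) of linking the mention to candidate c."""
    if c in true_antecedents:
        return 0.0       # correct decision
    if c == NA:
        return ALPHA_FN  # false new: an antecedent exists but NA was chosen
    if true_antecedents == {NA}:
        return ALPHA_FA  # false anaphoric: the mention is new but was linked
    return ALPHA_WL      # wrong link: linked to the wrong antecedent


def mention_loss(scores: Dict[str, float], true_antecedents: Set[str]) -> float:
    """max over c of h(c, m_i) * (1 + s(c, m_i) - s(t_i, m_i)) for one mention."""
    t_score = max(scores[c] for c in true_antecedents)  # s(t_i, m_i)
    return max(
        heuristic_cost(c, true_antecedents) * (1.0 + s - t_score)
        for c, s in scores.items()
    )


# Made-up scores for the mention "It"; the true antecedent is "a dog".
loss = mention_loss({NA: -0.3, "I": 1.5, "a dog": 1.0}, {"a dog"})
print(loss)
```

The full loss L(θ) would sum `mention_loss` over all mentions in the document. Because a true antecedent always contributes a term of zero, the per-mention loss is never negative.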
26. Prior work: Heuristic Loss Function
• Disadvantage: requires a grid search over the hyperparameters $\alpha_{FN}$, $\alpha_{FA}$, $\alpha_{WL}$ in the cost $h(c, m_i)$
• Grid search: automatically optimizes a machine learning model's hyperparameters by trying combinations of candidate values
27. Proposed reinforcement learning methods
• The model takes a sequence of actions, then receives a reward
• REINFORCE algorithm
• Reward rescaling
• [Figure: the example sentence "New I bought a gift which is a chocolate for my girlfriend." annotated with actions a1, a2, a3, a4]
28. REINFORCE algorithm
• Define a probability distribution over actions
• Maximize the expected reward
• Sample trajectories of actions to approximate the gradient (policy gradient)
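The three steps above can be sketched in a few lines. This is a simplified illustration, not the paper's training code: it assumes a softmax over candidate scores as the action distribution, a single sampled trajectory, and a toy reward (the fraction of actions that match a made-up gold sequence). All names and numbers are hypothetical.

```python
import math
import random
from typing import Callable, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into a probability distribution over actions."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_gradient(
    score_lists: List[List[float]],
    reward_fn: Callable[[List[int]], float],
    rng: random.Random,
) -> Tuple[List[int], float, List[List[float]]]:
    """One-sample REINFORCE estimate of d E[reward] / d scores."""
    probs = [softmax(scores) for scores in score_lists]
    # Sample one action per mention to form a trajectory.
    actions = [rng.choices(range(len(p)), weights=p)[0] for p in probs]
    reward = reward_fn(actions)
    # d log p(a) / d score_k = 1[k == a] - p_k, scaled by the trajectory reward.
    grads = [
        [reward * ((1.0 if k == a else 0.0) - p_k) for k, p_k in enumerate(p)]
        for p, a in zip(probs, actions)
    ]
    return actions, reward, grads


GOLD = [2, 0]  # made-up correct action for each of two mentions


def toy_reward(actions: List[int]) -> float:
    """Fraction of sampled actions that match the made-up gold sequence."""
    return sum(a == g for a, g in zip(actions, GOLD)) / len(GOLD)


actions, reward, grads = reinforce_gradient(
    [[0.1, 0.2, 1.5], [0.7, -0.4]], toy_reward, random.Random(0)
)
print(actions, reward)
```

Averaging this estimate over many sampled trajectories approximates the gradient of the expected reward; a gradient-ascent step on the scores then raises the probability of high-reward actions.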
31. REINFORCE algorithm
• Cons:
  • REINFORCE maximizes performance in expectation (it favors actions that give better results on average)
  • Only the highest-scoring action needs to be correct (the model just needs to score the right action highest)
  • The model links the current mention to only a single antecedent (先行詞), but is trained to assign high probability to all correct antecedents
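33. Reward Rescaling
• Incorporate the reward into the max-margin objective's slack rescaling
• The max-margin objective uses the heuristic cost:
$$h(c, m_i) = \begin{cases}
0 & \text{if } c \in T(m_i) \quad \text{(} c \text{ and } m_i \text{ are coreferent)} \\
\alpha_{FN} & \text{if } c = \mathrm{NA} \wedge T(m_i) \neq \{\mathrm{NA}\} \quad \text{(false new error)} \\
\alpha_{FA} & \text{if } c \neq \mathrm{NA} \wedge T(m_i) = \{\mathrm{NA}\} \quad \text{(false anaphoric error)} \\
\alpha_{WL} & \text{if } c \neq \mathrm{NA} \wedge c \notin T(m_i) \quad \text{(wrong link error)}
\end{cases}$$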
35. Reward Rescaling
• Since actions are independent, we can change an action a to a different action a' and see what reward (B³ coreference metric) we would have received instead.
• Example: New I bought a chocolate for my girlfriend.
  • Action a: reward = 1, regret = 99
  • Action a': reward = 35, regret = 65
  • Action a'': reward = 100, regret = 0
• In this example the regret is the best achievable reward (100) minus the reward actually received.
39. Reward Rescaling
• The cost is the regret of taking the action
• It replaces the heuristic cost
• Benefits from the max-margin loss while directly optimizing for coreference metrics
$$h(c, m_i) = \underbrace{\max_{a'} R(a_1, \dots, a', \dots, a_T)}_{\text{reward for best action}} - \underbrace{R(a_1, \dots, (c, m_i), \dots, a_T)}_{\text{reward for current action}}$$
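The regret-based cost can be sketched as follows: swap in each candidate action at one position, keep the other actions fixed, and subtract the resulting reward from the best one. The function name `regret_costs` and the reward table are assumptions for illustration; the table simply reuses the reward values 1, 35, and 100 from the slide example.

```python
from typing import Callable, Dict, List, Sequence


def regret_costs(
    actions: List[int],
    position: int,
    candidates: Sequence[int],
    reward_fn: Callable[[List[int]], float],
) -> Dict[int, float]:
    """Regret of each candidate action at `position`, other actions held fixed."""
    rewards = {}
    for c in candidates:
        alternative = list(actions)  # copy so the original sequence is untouched
        alternative[position] = c
        rewards[c] = reward_fn(alternative)
    best = max(rewards.values())     # reward for the best action
    return {c: best - r for c, r in rewards.items()}


# Rewards from the slide example, indexed 0 (a), 1 (a'), 2 (a'').
REWARD_TABLE = {0: 1.0, 1: 35.0, 2: 100.0}


def slide_reward(actions: List[int]) -> float:
    """Toy reward that depends only on the single action in the sequence."""
    return REWARD_TABLE[actions[0]]


costs = regret_costs([0], position=0, candidates=[0, 1, 2], reward_fn=slide_reward)
print(costs)
```

The resulting costs are 99, 65, and 0, matching the regret values on the slides. These costs take the place of the tuned penalties in the max-margin loss, so no grid search over penalty values is needed.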
40. Experiment
• The B³ coreference metric is used as the reward for an action sequence
• MUC has the flaw of treating all errors equally
• CEAF_φ4 is slow to compute
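For reference, B³ can be sketched as below. This is a simplified version that assumes the gold and predicted clusterings cover exactly the same mentions; the mention identifiers and clusters are made up. For each mention it compares the predicted cluster with the gold cluster, then averages precision and recall over all mentions.

```python
from typing import List, Set, Tuple


def b_cubed(gold: List[Set[str]], predicted: List[Set[str]]) -> Tuple[float, float, float]:
    """B-cubed precision, recall, and F1 over a shared set of mentions."""
    gold_of = {m: cluster for cluster in gold for m in cluster}
    pred_of = {m: cluster for cluster in predicted for m in cluster}
    mentions = list(gold_of)
    # Per-mention overlap between its gold cluster and its predicted cluster.
    precision = sum(
        len(gold_of[m] & pred_of[m]) / len(pred_of[m]) for m in mentions
    ) / len(mentions)
    recall = sum(
        len(gold_of[m] & pred_of[m]) / len(gold_of[m]) for m in mentions
    ) / len(mentions)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up clusterings over four mentions m1..m4.
gold_clusters = [{"m1", "m2", "m3"}, {"m4"}]
predicted_clusters = [{"m1", "m2"}, {"m3", "m4"}]
p, r, f1 = b_cubed(gold_clusters, predicted_clusters)
print(round(p, 4), round(r, 4), round(f1, 4))
```

Here precision is 0.75, recall is 2/3, and F1 is 12/17 (about 0.7059). Because the score is computed per mention, different errors change the reward by different amounts, which is the property the slide contrasts with MUC.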
41. Experiment result
• The reward-rescaling model makes more errors
• However, the errors are less severe
  • ~0.7% lower cost on average
• Compared with the heuristic loss, reward rescaling makes:
  • More false anaphoric (照応) and false new errors
  • Fewer wrong link errors
42. Thank you
• Questions and comments?
References
• Kevin Clark and Christopher D. Manning. Deep Reinforcement Learning for Mention-Ranking Coreference Models. EMNLP 2016.
• Stanford CS224n, Lecture 15: Coreference Resolution. https://www.youtube.com/watch?v=rpwEWLaueRk
• https://github.com/clarkkev/deep-coref