Making natural language processing
robust to sociolinguistic variation
Jacob Eisenstein
@jacobeisenstein
Georgia Institute of Technology
September 9, 2017
Machine reading
From text to structured
representations.
Annotate
and train
New domains of digitized
texts offer opportunities as
well as challenges.
Language data then and now
Then: news text, small set
of authors, professionally
edited, fixed style
Now: open domain,
everyone is an author,
unedited, many styles
Social media has forced
NLP to confront the
challenge of missing
social context
(Eisenstein, 2013):
tacit assumptions
about audience
knowledge
language variation
across social groups
(Gimpel et al., 2011)
(Ritter et al., 2011)
(Foster et al., 2011)
Finding tacit context in the social network
Social media texts lack
context, because it is
implicit between the
writer and the reader.
Homophily: socially
connected individuals tend
to share traits.
Assortativity of entity references
We project embeddings for entities, words, and
authors into a shared semantic space.
“Dirk Novitsky”
“the warriors”
Inner products in this space indicate compatibility.
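A minimal sketch of the compatibility idea, assuming toy three-dimensional vectors. The embedding values and the keys (`author:nba_fan` and the two entity names) are invented for illustration and are not taken from the talk.

```python
import numpy as np

# Hypothetical embeddings in a shared 3-dimensional space (values invented).
embeddings = {
    "author:nba_fan": np.array([0.9, 0.1, 0.0]),
    "entity:Golden_State_Warriors": np.array([0.8, 0.2, 0.1]),
    "entity:The_Warriors_(film)": np.array([0.0, 0.3, 0.9]),
}


def compatibility(a, b):
    """Inner product of two embeddings; a larger value means more compatible."""
    return float(np.dot(embeddings[a], embeddings[b]))


def most_compatible(query, candidates):
    """Return the candidate whose embedding has the largest inner product with the query."""
    return max(candidates, key=lambda c: compatibility(query, c))


candidates = ["entity:Golden_State_Warriors", "entity:The_Warriors_(film)"]
print(most_compatible("author:nba_fan", candidates))
```

With these toy vectors, the author who sits near the basketball team in the space resolves “the warriors” to the team rather than the film.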
Socially-Infused Entity Linking
[Diagram: tweet, entity assignments, author]
‣ g1 is employed to model surface features.
‣ g2 is used to capture two assumptions:
  ‣ Entity homophily
  ‣ Semantically related mentions tend to refer to similar entities
Socially-Infused Entity Linking
$$g_2(x, y_t, u, t; \Theta_2) = {v^{(u)}_u}^{\top} W^{(u,e)} v^{(e)}_{y_t} + {v^{(m)}_t}^{\top} W^{(m,e)} v^{(e)}_{y_t}$$
author embedding: $v^{(u)}_u$
mention embedding: $v^{(m)}_t$
entity embedding: $v^{(e)}_{y_t}$
[Diagram: $g_1(x, y_t, t)$ and $g_2$]
Loss-augmented training
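The scoring function g2 is a sum of two bilinear forms: author against entity, and mention against entity. A minimal NumPy sketch follows; the function name `g2_score`, the two-dimensional vectors, and the identity weight matrices are illustrative assumptions, not the authors' trained parameters.

```python
import numpy as np


def g2_score(v_author, v_mention, v_entity, W_ue, W_me):
    """Bilinear score: v_author^T W_ue v_entity + v_mention^T W_me v_entity."""
    return float(v_author @ W_ue @ v_entity + v_mention @ W_me @ v_entity)


# Toy inputs (invented): identity matrices reduce each term to a dot product.
W_ue = np.eye(2)
W_me = np.eye(2)
v_author = np.array([1.0, 0.0])
v_mention = np.array([0.0, 1.0])
entity_a = np.array([1.0, 1.0])   # agrees with both author and mention
entity_b = np.array([-1.0, 0.0])  # disagrees with the author

print(g2_score(v_author, v_mention, entity_a, W_ue, W_me))
print(g2_score(v_author, v_mention, entity_b, W_ue, W_me))
```

The candidate entity with the higher score is the one favored by the social and semantic terms together.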
Learning
‣ Loss-augmented inference: Hamming loss
‣ Optimization: stochastic gradient descent
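A minimal sketch of loss-augmented inference with Hamming loss followed by a stochastic subgradient step. It assumes a linear model whose scores decompose over positions, so the loss-augmented argmax can be taken position by position; the slides do not state the exact model, and the names `loss_augmented_decode` and `sgd_step` are hypothetical.

```python
import numpy as np


def loss_augmented_decode(scores, gold):
    """Per position, take the argmax of score + Hamming cost.

    The cost is 1 for every label except the gold one.
    scores: (T, num_labels) float array; gold: (T,) int array.
    """
    augmented = scores + 1.0                        # cost 1 everywhere ...
    augmented[np.arange(len(gold)), gold] -= 1.0    # ... except at the gold label
    return augmented.argmax(axis=1)


def sgd_step(theta, x, gold, lr=0.1):
    """One subgradient step on the structured hinge loss for one example.

    theta: (num_labels, dim) float array, updated in place; x: (T, dim).
    """
    scores = x @ theta.T
    pred = loss_augmented_decode(scores, gold)
    for t in range(len(gold)):
        if pred[t] != gold[t]:
            theta[gold[t]] += lr * x[t]   # raise the gold label's score
            theta[pred[t]] -= lr * x[t]   # lower the violating label's score
    return theta


theta = np.zeros((2, 2))
x = np.array([[1.0, 0.0], [0.0, 1.0]])
gold = np.array([0, 1])
for _ in range(20):
    theta = sgd_step(theta, x, gold)
print((x @ theta.T).argmax(axis=1))
```

Adding the cost during decoding makes the update fire until the gold label wins by a margin, not merely until it wins.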
Inference
‣ Non-overlapping structure
In order to link ‘Red Sox’ to a real entity, ‘Red’ and ‘Sox’ should be linked to Nil.
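One common way to enforce a non-overlapping constraint is a dynamic program over candidate spans (weighted interval scheduling). This is a generic sketch, not necessarily the inference procedure used in the talk; the span scores are invented, and any span left unselected is treated as linked to Nil.

```python
def best_non_overlapping(spans):
    """Pick the highest-scoring set of mutually non-overlapping spans.

    spans: list of (start, end, score) token spans, with end exclusive.
    """
    spans = sorted(spans, key=lambda s: s[1])
    # best[i] = (total score, chosen spans) using only the first i spans
    best = [(0.0, [])]
    for i, (start, end, score) in enumerate(spans):
        # j = number of earlier spans that end at or before this span starts
        j = i
        while j > 0 and spans[j - 1][1] > start:
            j -= 1
        take = (best[j][0] + score, best[j][1] + [(start, end, score)])
        skip = best[i]
        best.append(take if take[0] > skip[0] else skip)
    return best[-1][1]


# Tokens: 0 = 'Red', 1 = 'Sox'. Scores are hypothetical.
candidates = [(0, 1, 0.2), (1, 2, 0.3), (0, 2, 0.9)]
print(best_non_overlapping(candidates))
```

Here the two-token span ‘Red Sox’ outscores the two single-token spans combined, so ‘Red’ and ‘Sox’ on their own are left unlinked.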
[Figure: bar chart of F1 (axis range 64 to 78) for Classifier, Struct, Struct+Social, and S-MART on the NEEL and TACL datasets; annotated gains: +3.2 and +2.0.]
Structure prediction improves accuracy.
Social context yields further improvements.
S-MART is the prior state-of-the-art
(Yang & Chang, 2015).
Language variation: a challenge for NLP
“I would like to believe he’s
sick rather than just mean
and evil.”
“You could’ve been getting
down to this sick beat.”
(Yang & Eisenstein, 2017)
Personalization by ensemble
Goal: personalized conditional likelihood,
P(y | x, a), where a is the author.
Problem: We have labeled examples for only a
few authors.
Personalization ensemble
$$P(y \mid x, a) = \sum_k P_k(y \mid x)\, \pi_a(k)$$
$P_k(y \mid x)$ is a basis model
$\pi_a(\cdot)$ are the ensemble weights for author $a$
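A minimal sketch of the ensemble as a weighted average of basis-model predictions. The two basis models, their probabilities for the word “sick”, and the author weights are invented for illustration.

```python
import numpy as np


def personalized_likelihood(basis_probs, author_weights):
    """Mix basis-model predictions with author-specific weights.

    basis_probs: (K, num_labels), row k is P_k(y | x).
    author_weights: (K,), the weights pi_a, assumed to sum to 1.
    Returns a (num_labels,) array holding P(y | x, a).
    """
    basis_probs = np.asarray(basis_probs, dtype=float)
    author_weights = np.asarray(author_weights, dtype=float)
    return author_weights @ basis_probs


# Label order: [negative, positive]. Values are hypothetical.
basis = [[0.9, 0.1],   # basis model 1 reads "sick" as negative
         [0.2, 0.8]]   # basis model 2 reads "sick" as positive
weights = [0.25, 0.75]  # this author mostly resembles basis model 2
print(personalized_likelihood(basis, weights))
```

Because the weights sum to 1 and each row is a distribution, the mixture is itself a distribution over labels.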
Homophily to the rescue?
[Figure: social network diagram in which several nodes say “Sick!”; some nodes have labeled data, others unlabeled data.]
Are language styles assortative on the social
network?
Evidence for linguistic homophily
Pilot study: is classifier accuracy assortative on the Twitter social network?
$$\mathrm{assort}(G) = \frac{1}{\#|G|} \sum_{(i,j) \in G} \left[ \delta(y_i = \hat{y}_i)\,\delta(y_j = \hat{y}_j) + \delta(y_i \neq \hat{y}_i)\,\delta(y_j \neq \hat{y}_j) \right]$$
[Figure: three panels (follow, mention, retweet) plotting assortativity (0.700 to 0.735) against rewiring epochs (0 to 100), comparing the original network with random rewiring.]
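A minimal sketch of the pilot study's measurement, taking assort(G) to be the fraction of edges whose two endpoints are either both classified correctly or both misclassified. The `rewire` helper is an assumed degree-preserving edge swap used as a random baseline; the exact rewiring procedure behind the plots is not given on the slides.

```python
import random


def assort(edges, correct):
    """Fraction of edges whose endpoints are both correct or both incorrect.

    edges: list of (i, j) node pairs; correct: dict mapping node -> bool.
    """
    agree = sum(1 for i, j in edges if correct[i] == correct[j])
    return agree / len(edges)


def rewire(edges, num_swaps, seed=0):
    """Randomly swap endpoints between pairs of edges, preserving node degrees.

    This simple version may create self-loops or duplicate edges.
    """
    rng = random.Random(seed)
    edges = list(edges)
    for _ in range(num_swaps):
        a, b = rng.sample(range(len(edges)), 2)
        (i, j), (k, l) = edges[a], edges[b]
        edges[a], edges[b] = (i, l), (k, j)
    return edges


# Toy network (invented): nodes 0 and 1 are classified correctly, 2 and 3 are not.
correct = {0: True, 1: True, 2: False, 3: False}
edges = [(0, 1), (2, 3), (0, 2)]
print(assort(edges, correct))
print(assort(rewire(edges, num_swaps=10), correct))
```

If the original network scores consistently higher than its rewired versions, classifier accuracy is assortative beyond what the degree distribution alone would explain.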
Network-driven personalization
For each author, estimate a node embedding $e_a$ (Tang et al., 2015).
Nodes that share neighbors get similar embeddings.
$$\pi_a = \mathrm{SoftMax}(f(e_a))$$
$$P(y \mid x, a) = \sum_{k=1}^{K} P_k(y \mid x)\, \pi_a(k)$$
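A minimal sketch of how the ensemble weights could be computed from a node embedding. The slides leave f unspecified; here it is assumed to be an affine map with hypothetical parameters `W` and `b`, and the embedding values are invented.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def ensemble_weights(node_embedding, W, b):
    """pi_a = SoftMax(f(e_a)), with f assumed affine: f(e) = W e + b."""
    return softmax(W @ node_embedding + b)


# Two authors with nearby embeddings (invented values) and K = 3 basis models.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
b = np.zeros(3)
e_alice = np.array([0.5, -0.2, 0.1, 0.9])
e_bob = e_alice + 0.01  # a close neighbor in embedding space

print(ensemble_weights(e_alice, W, b))
print(ensemble_weights(e_bob, W, b))
```

Since f is continuous, authors with similar node embeddings receive similar mixtures of basis models, which is how labeled data from a few authors can inform predictions for their unlabeled neighbors.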
Results
[Figure: F1 improvement over ConvNet on Twitter Sentiment Analysis. Mixture of Experts: +0.10; NLSE: +1.90; Social Personalization: +2.80.]
Improvements over ConvNet baseline:
+2.8% on Twitter Sentiment Analysis
+2.7% on Ciao Product Reviews
NLSE is prior state-of-the-art (Astudillo et al., 2015).
Variable sentiment words
     More positive                        More negative
1    banging loss fever broken fucking    dear like god yeah wow
2    chilling cold ill sick suck          satisfy trust wealth strong lmao
3    ass damn piss bitch shit             talent honestly voting win clever
4    insane bawling fever weird cry       lmao super lol haha hahaha
5    ruin silly bad boring dreadful       lovatics wish beliebers arianators kendall
Summary
Robustness is a key challenge for making NLP effective on
social media data:
Tacit assumptions about shared knowledge; language
variation
Social metadata gives NLP systems the flexibility to
handle each author differently.
The long tail of rare events is the other big challenge.
Word embeddings for unseen words (Pinter et al., 2017)
Lexicon-based supervision (Eisenstein, 2017)
Applications to finding rare events in electronic health
records (ongoing work with Jimeng Sun)
Acknowledgments
Students and collaborators:
Yi Yang (GT → Bloomberg)
Mingwei Chang (Google Research)
See https://gtnlp.wordpress.com/ for more!
Funding: National Science Foundation,
National Institutes of Health, Georgia Tech
References
Astudillo, R. F., Amir, S., Lin, W., Silva, M., & Trancoso, I. (2015). Learning word representations from scarce and noisy data with embedding sub-spaces. In Proceedings of the Association for Computational Linguistics (ACL), Beijing.
Eisenstein, J. (2013). What to do about bad language on the internet. In Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL), (pp. 359–369).
Eisenstein, J. (2017). Unsupervised learning for lexicon-based classification. In Proceedings of the National Conference on Artificial Intelligence (AAAI), San Francisco.
Foster, J., Cetinoglu, O., Wagner, J., Le Roux, J., Nivre, J., Hogan, D., & van Genabith, J. (2011). From news to comment: Resources and benchmarks for parsing the language of web 2.0. In Proceedings of the International Joint Conference on Natural Language Processing (IJCNLP), (pp. 893–901), Chiang Mai, Thailand. Asian Federation of Natural Language Processing.
Gimpel, K., Schneider, N., O’Connor, B., Das, D., Mills, D., Eisenstein, J., Heilman, M., Yogatama, D., Flanigan, J., & Smith, N. A. (2011). Part-of-speech tagging for Twitter: annotation, features, and experiments. In Proceedings of the Association for Computational Linguistics (ACL), (pp. 42–47), Portland, OR.
Pinter, Y., Guthrie, R., & Eisenstein, J. (2017). Mimicking word embeddings using subword RNNs. In Proceedings of Empirical Methods for Natural Language Processing (EMNLP).
Ritter, A., Clark, S., Mausam, & Etzioni, O. (2011). Named entity recognition in tweets: an experimental study. In Proceedings of EMNLP.
Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., & Mei, Q. (2015). LINE: Large-scale information network embedding. In Proceedings of the Conference on World-Wide Web (WWW), (pp. 1067–1077).
Yang, Y. & Chang, M.-W. (2015). S-MART: Novel tree-based structured learning algorithms applied to tweet entity linking. In Proceedings of the Association for Computational Linguistics (ACL), (pp. 504–513), Beijing.
Yang, Y. & Eisenstein, J. (2017). Overcoming language variation in sentiment analysis with social attention. Transactions of the Association for Computational Linguistics (TACL), in press.

 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Recently uploaded

"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 

Recently uploaded (20)

"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 

Jacob Eisenstein, Assistant Professor, School of Interactive Computing, Georgia Institute of Technology at MLconf ATL 2017

  • 1. Making natural language processing robust to sociolinguistic variation Jacob Eisenstein @jacobeisenstein Georgia Institute of Technology September 9, 2017
  • 2. Machine reading From text to structured representations.
  • 3. Annotate and train Machine reading From text to structured representations.
  • 4. Annotate and train Machine reading From text to structured representations. New domains of digitized texts offer opportunities as well as challenges.
  • 5. Language data then and now Then: news text, small set of authors, professionally edited, fixed style
  • 6. Language data then and now Then: news text, small set of authors, professionally edited, fixed style Now: open domain, everyone is an author, unedited, many styles
  • 7. Social media has forced NLP to confront the challenge of missing social context (Eisenstein, 2013): (Gimpel et al., 2011) (Ritter et al., 2011) (Foster et al., 2011)
  • 8. Social media has forced NLP to confront the challenge of missing social context (Eisenstein, 2013): tacit assumptions about audience knowledge language variation across social groups (Gimpel et al., 2011) (Ritter et al., 2011) (Foster et al., 2011)
  • 12. Finding tacit context in the social network Social media texts lack context, because it is implicit between the writer and the reader. Homophily: socially connected individuals tend to share traits.
  • 17. We project embeddings for entities, words, and authors into a shared semantic space. “Dirk Novitsky” “the warriors” Inner products in this space indicate compatibility.
  • 22. Socially-Infused Entity Linking: $g_2(x, y_t, u, t; \Theta_2) = {v_u^{(u)}}^{\top} W^{(u,e)} v_{y_t}^{(e)} + {v_t^{(m)}}^{\top} W^{(m,e)} v_{y_t}^{(e)}$, where $v_u^{(u)}$ is the author embedding, $v_t^{(m)}$ is the mention embedding, and $v_{y_t}^{(e)}$ is the entity embedding.
  • 23. Loss-augmented training. Socially-Infused Entity Linking: $g_2(x, y_t, u, t; \Theta_2) = {v_u^{(u)}}^{\top} W^{(u,e)} v_{y_t}^{(e)} + {v_t^{(m)}}^{\top} W^{(m,e)} v_{y_t}^{(e)}$ (author embedding $v_u^{(u)}$, mention embedding $v_t^{(m)}$, entity embedding $v_{y_t}^{(e)}$).
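The scoring function on slides 22 and 23 adds two bilinear terms: author-entity compatibility and mention-entity compatibility. The following is a minimal sketch of that arithmetic only, not the talk's implementation. The function names, the two-dimensional vectors, the identity matrices, and the candidate entity names are all hypothetical values chosen for illustration.

```python
def bilinear(v_left, W, v_right):
    """Return v_left^T W v_right for plain Python lists."""
    return sum(
        v_left[i] * W[i][j] * v_right[j]
        for i in range(len(v_left))
        for j in range(len(v_right))
    )


def g2(v_author, v_mention, v_entity, W_ue, W_me):
    """Author-entity compatibility plus mention-entity compatibility."""
    return bilinear(v_author, W_ue, v_entity) + bilinear(v_mention, W_me, v_entity)


# Toy, hand-picked values (hypothetical; not taken from the talk).
v_author = [1.0, 0.0]
v_mention = [0.5, 0.5]
# Identity matrices, so each term reduces to a plain inner product.
W_ue = [[1.0, 0.0], [0.0, 1.0]]
W_me = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "basketball_team": [1.0, 0.0],
    "film": [0.0, 1.0],
}

# Score every candidate entity for this (author, mention) pair.
scores = {
    name: g2(v_author, v_mention, v_entity, W_ue, W_me)
    for name, v_entity in candidates.items()
}
best = max(scores, key=scores.get)
```

With these toy values the mention alone is ambiguous (it scores 0.5 against both candidates), and the author term breaks the tie, which is the role the slides assign to social context.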
  • 29. [Bar chart: F1 (axis 64 to 78) on the NEEL and TACL datasets for the classifiers Struct, Struct+Social, and S-MART; annotated gains +3.2 and +2.0.] Structure prediction improves accuracy. Social context yields further improvements. S-MART is the prior state of the art (Yang & Chang, 2015).
  • 30. Social media has forced NLP to confront the challenge of missing social context (Eisenstein, 2013): tacit assumptions about audience knowledge language variation across social groups (Gimpel et al., 2011) (Ritter et al., 2011) (Foster et al., 2011)
  • 32. Language variation: a challenge for NLP “I would like to believe he’s sick rather than just mean and evil.”
  • 33. Language variation: a challenge for NLP “I would like to believe he’s sick rather than just mean and evil.” “You could’ve been getting down to this sick beat.” (Yang & Eisenstein, 2017)
  • 34. Personalization by ensemble Goal: personalized conditional likelihood, P(y | x, a), where a is the author. Problem: We have labeled examples for only a few authors.
  • 35. Personalization by ensemble Goal: personalized conditional likelihood, P(y | x, a), where a is the author. Problem: We have labeled examples for only a few authors. Personalization ensemble: $P(y \mid x, a) = \sum_k P_k(y \mid x)\,\pi_a(k)$, where $P_k(y \mid x)$ is a basis model and $\pi_a(\cdot)$ are the ensemble weights for author $a$.
  • 36. Homophily to the rescue? [Figure labels: "Sick!" (repeated); Labeled data; Unlabeled data.] Are language styles assortative on the social network?
  • 37. Evidence for linguistic homophily Pilot study: is classifier accuracy assortative on the Twitter social network? $\mathrm{assort}(G) = \frac{1}{\#|G|} \sum_{(i,j) \in G} \left[ \delta(y_i = \hat{y}_i)\,\delta(y_j = \hat{y}_j) + \delta(y_i \neq \hat{y}_i)\,\delta(y_j \neq \hat{y}_j) \right]$
  • 38. Evidence for linguistic homophily Pilot study: is classifier accuracy assortative on the Twitter social network? $\mathrm{assort}(G) = \frac{1}{\#|G|} \sum_{(i,j) \in G} \left[ \delta(y_i = \hat{y}_i)\,\delta(y_j = \hat{y}_j) + \delta(y_i \neq \hat{y}_i)\,\delta(y_j \neq \hat{y}_j) \right]$ [Plots: assortativity (y-axis, 0.700 to 0.735) versus rewiring epochs (x-axis, 0 to 100) for the follow, mention, and retweet networks, comparing the original network with random rewiring.]
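The pilot study's metric can be sketched in a few lines, assuming it is read as the fraction of edges whose two endpoints are either both classified correctly or both classified incorrectly. The rewiring routine below is a generic endpoint swap that preserves each node's degree. The talk does not specify its rewiring procedure, so that part is an assumption, as are the function names and the toy graph.

```python
import random


def assortativity(edges, correct):
    """Fraction of edges whose endpoints agree on classifier correctness.

    correct[i] is True when the classifier's prediction for author i is right.
    Both right or both wrong is the same as correct[i] == correct[j].
    """
    agree = sum(1 for i, j in edges if correct[i] == correct[j])
    return agree / len(edges)


def rewire(edges, n_swaps, rng):
    """Randomly swap edge endpoints (assumed baseline, not the talk's procedure).

    Each node keeps its degree; duplicate edges are allowed for simplicity.
    """
    edges = list(edges)
    for _ in range(n_swaps):
        a, b = rng.sample(range(len(edges)), 2)
        (i, j), (k, l) = edges[a], edges[b]
        # Skip swaps that would create a self-loop.
        if i == l or k == j:
            continue
        edges[a], edges[b] = (i, l), (k, j)
    return edges


# Toy graph (hypothetical): five authors and four social ties.
correct = {0: True, 1: True, 2: False, 3: False, 4: True}
edges = [(0, 1), (2, 3), (0, 4), (1, 2)]

original_score = assortativity(edges, correct)
rewired_edges = rewire(edges, n_swaps=10, rng=random.Random(0))
rewired_score = assortativity(rewired_edges, correct)
```

Comparing `original_score` with the scores of many rewired copies is one way to check whether the observed agreement exceeds what the degree sequence alone would produce.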
  • 39. Network-driven personalization For each author, estimate a node embedding $e_a$ (Tang et al., 2015). Nodes that share neighbors get similar embeddings. $\pi_a = \mathrm{SoftMax}(f(e_a))$, $P(y \mid x, a) = \sum_{k=1}^{K} P_k(y \mid x)\,\pi_a(k)$
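A minimal sketch of the network-driven ensemble follows: the author's node embedding is mapped to mixture weights with a softmax, and the weights combine the basis models' predictions. The slide does not define f, so a linear map is assumed here. The function names, the weights `W` and `b`, the embeddings, and the basis-model probabilities are hypothetical toy values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(W, b, e_a):
    """Assumed form of f: one linear score per basis model."""
    return [sum(w * x for w, x in zip(row, e_a)) + b_k for row, b_k in zip(W, b)]


def personalized_probs(basis_probs, e_a, W, b):
    """P(y | x, a) = sum_k P_k(y | x) * pi_a(k), with pi_a = softmax(f(e_a))."""
    pi_a = softmax(linear(W, b, e_a))
    n_labels = len(basis_probs[0])
    return [
        sum(pi_a[k] * basis_probs[k][y] for k in range(len(basis_probs)))
        for y in range(n_labels)
    ]


# Two basis models scoring one text as [negative, positive] (toy values).
basis_probs = [
    [0.9, 0.1],  # basis model 0 reads the text as negative
    [0.2, 0.8],  # basis model 1 reads the text as positive
]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]

neutral_author = personalized_probs(basis_probs, [0.0, 0.0], W, b)
second_author = personalized_probs(basis_probs, [0.0, 2.0], W, b)
```

The same text receives different label distributions for the two authors because their embeddings place different weights on the basis models. Authors without labeled examples still get weights, since only the node embedding is needed.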
  • 40. Results [Bar chart, Twitter Sentiment Analysis, F1 improvement over ConvNet: Mixture of Experts +0.10, NLSE +1.90, Social Personalization +2.80.] Improvements over ConvNet baseline: +2.8% on Twitter Sentiment Analysis; +2.7% on Ciao Product Reviews. NLSE is the prior state of the art (Astudillo et al., 2015).
  • 41. Variable sentiment words (row: more positive | more negative) 1: banging, loss, fever, broken, fucking | dear, like, god, yeah, wow. 2: chilling, cold, ill, sick, suck | satisfy, trust, wealth, strong, lmao. 3: ass, damn, piss, bitch, shit | talent, honestly, voting, win, clever. 4: insane, bawling, fever, weird, cry | lmao, super, lol, haha, hahaha. 5: ruin, silly, bad, boring, dreadful | lovatics, wish, beliebers, arianators, kendall.
  • 42. Summary Robustness is a key challenge for making NLP effective on social media data: Tacit assumptions about shared knowledge; language variation Social metadata gives NLP systems the flexibility to handle each author differently.
  • 43. Summary Robustness is a key challenge for making NLP effective on social media data: Tacit assumptions about shared knowledge; language variation Social metadata gives NLP systems the flexibility to handle each author differently. The long tail of rare events is the other big challenge. Word embeddings for unseen words (Pinter et al., 2017) Lexicon-based supervision (Eisenstein, 2017) Applications to finding rare events in electronic health records (ongoing work with Jimeng Sun)
  • 44. Acknowledgments Students and collaborators: Yi Yang (GT → Bloomberg), Mingwei Chang (Google Research). See https://gtnlp.wordpress.com/ for more! Funding: National Science Foundation, National Institutes of Health, Georgia Tech
  • 45. References I
    Astudillo, R. F., Amir, S., Lin, W., Silva, M., & Trancoso, I. (2015). Learning word representations from scarce and noisy data with embedding sub-spaces. In Proceedings of the Association for Computational Linguistics (ACL), Beijing.
    Eisenstein, J. (2013). What to do about bad language on the internet. In Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL), (pp. 359–369).
    Eisenstein, J. (2017). Unsupervised learning for lexicon-based classification. In Proceedings of the National Conference on Artificial Intelligence (AAAI), San Francisco.
    Foster, J., Cetinoglu, O., Wagner, J., Le Roux, J., Nivre, J., Hogan, D., & van Genabith, J. (2011). From news to comment: Resources and benchmarks for parsing the language of web 2.0. In Proceedings of the International Joint Conference on Natural Language Processing (IJCNLP), (pp. 893–901), Chiang Mai, Thailand. Asian Federation of Natural Language Processing.
    Gimpel, K., Schneider, N., O’Connor, B., Das, D., Mills, D., Eisenstein, J., Heilman, M., Yogatama, D., Flanigan, J., & Smith, N. A. (2011). Part-of-speech tagging for Twitter: annotation, features, and experiments. In Proceedings of the Association for Computational Linguistics (ACL), (pp. 42–47), Portland, OR.
    Pinter, Y., Guthrie, R., & Eisenstein, J. (2017). Mimicking word embeddings using subword RNNs. In Proceedings of Empirical Methods for Natural Language Processing (EMNLP).
    Ritter, A., Clark, S., Mausam, & Etzioni, O. (2011). Named entity recognition in tweets: an experimental study. In Proceedings of EMNLP.
    Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., & Mei, Q. (2015). LINE: Large-scale information network embedding. In Proceedings of the Conference on World-Wide Web (WWW), (pp. 1067–1077).
    Yang, Y. & Chang, M.-W. (2015). S-MART: Novel tree-based structured learning algorithms applied to tweet entity linking. In Proceedings of the Association for Computational Linguistics (ACL), (pp. 504–513), Beijing.
    Yang, Y. & Eisenstein, J. (2017). Overcoming language variation in sentiment analysis with social attention. Transactions of the Association for Computational Linguistics (TACL), in press.