Dr. Catherine Havasi's keynote talk from the AI Community Conference on Natural Language Processing (by NYAI.co) on Thurs, Jun 27th 2019 at Moody's Analytics.
Sponsored by Moody's Analytics, NYU Tandon Future Lab, NYAI.co
For more information & the full talk video, please visit nyai.co
6. We have built models of how people think about the world in 73
languages – called ConceptNet.
7. Languages in ConceptNet: multilingual coverage
English: 6.5 million edges
French: 4.9 million edges
German: 1.6 million edges
Italian: 1.1 million edges
Spanish: 830k edges
Japanese: 740k edges
Russian: 620k edges
Portuguese: 540k edges
Chinese: 500k edges
Finnish: 420k edges
Dutch: 400k edges
Swedish: 300k edges
…plus bg, pl, cs, sh, eo, ms, sl, ar, and more
Total: 24.6 million edges in 70+ languages
[Sources slide truncated in the transcript; the legible fragments mention common-sense knowledge resources and a multilingual WordNet.]
Open code
At http://conceptnet:
• Code on GitHub
• A browsable Web interface
• A Linked Data REST API
All data is available under a Creative Commons Attribution-ShareAlike 4.0 license.
8. Photo: Steve Hopson, CC-By
But then things moved on without us –
users changed how they interacted with search engines
9.
10. We found a new use of this data:
integrating into machine learning to
facilitate adaptability
Photo by: Chris Rodley
11. “I don’t have to actually experience crashing
my car into a wall a few hundred times before I
slowly start avoiding to do so.”
- Andrej Karpathy, OpenAI
17. Retrofitting
• Created by Manaal Faruqui in 2015
• Apply knowledge-based constraints after training
distributional word vectors
• Applying the constraints after training works better than applying them during training, for reasons that aren’t fully understood
18. Retrofitting
• Terms that are connected in the knowledge graph should
have vectors that are closer together
• Many extensions now, such as “antonyms should be farther
apart” (Mrkšić et al., 2016)
[Diagram: word vectors for “oak”, “tree”, and “furniture”]
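The idea above can be sketched in a few lines of NumPy. This is a simplified, hypothetical rendering of the Faruqui et al. (2015) update with uniform edge weights; the function name, the toy vocabulary, and the `alpha` anchoring weight are all illustrative, not the authors’ actual code.

```python
import numpy as np

def retrofit(vectors, edges, alpha=1.0, iters=10):
    """Pull each word's vector toward its knowledge-graph neighbors while
    staying anchored (weight alpha) to its original distributional vector."""
    new = {w: v.astype(float).copy() for w, v in vectors.items()}
    neighbors = {w: [] for w in vectors}
    for a, b in edges:
        if a in neighbors and b in neighbors:
            neighbors[a].append(b)
            neighbors[b].append(a)
    for _ in range(iters):
        for w, nbrs in neighbors.items():
            if not nbrs:
                continue  # no graph information: keep the original vector
            # average of the original vector and the current neighbor vectors
            new[w] = (alpha * vectors[w] + sum(new[n] for n in nbrs)) / (alpha + len(nbrs))
    return new

# Toy example: "oak" and "tree" are linked in the graph, "sofa" is not.
vecs = {"oak": np.array([1.0, 0.0]),
        "tree": np.array([0.0, 1.0]),
        "sofa": np.array([-1.0, 0.5])}
fitted = retrofit(vecs, edges=[("oak", "tree")])
```

After retrofitting, the connected pair ends up closer together than before, while the unconnected word is untouched — which is exactly the constraint the slide states.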
19. Retrofitting just works
• On intrinsic evaluations, the top-performing systems almost
always use retrofitting
– If you see a purely distributional algorithm claim “state of
the art on SimLex”, it may be “state of the art assuming
no knowledge graph”
20. • State-of-the-art word vectors
• Hybrid of ConceptNet and distributional
semantics
• Multilingual by design
• Open source, open data
22. Photo by: David Lapetina CC BY-SA 3.0.
In order to beat a human
player at chess, Google’s
AlphaZero had to play
68 million games against
itself.
23. You cannot simulate your call center
calling itself 68 million times.
Photo by: Nebiyu.s CC BY-SA 4.0.
24. What is domain adaptation?
• Domain-specific data: customer intents, product names, industry jargon, specific issues
• Domain-general data: common words, multiple languages, paraphrases, general sentiment
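One concrete way to picture the gap: a domain-specific term like a product name has no vector in the general-domain model at all. A common back-off (a generic illustration, not necessarily the approach described in the talk) is to give the unseen term the average of the general vectors of the words it co-occurs with; the term list and tiny vectors below are made up.

```python
import numpy as np

# Hypothetical general-domain vectors (tiny, for illustration only).
general = {"reset": np.array([1.0, 0.0]),
           "router": np.array([0.0, 1.0]),
           "password": np.array([0.5, 0.5])}

def embed_oov(context_words, vectors):
    """Back-off for a domain term the general model has never seen:
    average the general vectors of the words it co-occurs with."""
    known = [vectors[w] for w in context_words if w in vectors]
    return np.mean(known, axis=0) if known else None

# A made-up product name appears near these support-ticket words:
product_vec = embed_oov(["reset", "router", "password"], general)
```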
27. What are other examples of transfer learning?
• Pretraining
• Fine-tuning and layer freezing for ELMo and BERT (and GPT-2)
• Fast.ai’s ULMFiT (http://nlp.fast.ai/)
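The layer-freezing idea can be sketched without any deep-learning framework: hold a “pretrained” body fixed and take gradient steps only on a small task head. Everything here — the shapes, the learning rate, the toy batch — is made up for illustration; it is not the ELMo/BERT fine-tuning API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained encoder: frozen, i.e. never updated below.
W_body = rng.normal(size=(4, 8))
# Fresh task-specific head: the only trainable parameters.
W_head = np.zeros((8, 2))

def features(x):
    return np.maximum(x @ W_body, 0.0)  # frozen ReLU feature extractor

def loss(x, y, head):
    return 0.5 * np.sum((features(x) @ head - y) ** 2)

x = rng.normal(size=(3, 4))  # a tiny batch of "domain" examples
y = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])

before = loss(x, y, W_head)
for _ in range(200):  # gradient descent on the head only; W_body never moves
    h = features(x)
    W_head -= 0.005 * h.T @ (h @ W_head - y)
after = loss(x, y, W_head)
```

Because only `W_head` is updated, the general-purpose features survive fine-tuning intact — the property that makes freezing attractive when domain data is scarce.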
30. Could you train networks to modify
networks?
Work Credit: Pedro Colon
31. John Hewitt and Christopher D. Manning: A Structural Probe for Finding Syntax in Word Representations
Andy Coenen, Emily Reif, Ann Yuan, Been Kim, Adam Pearce, Fernanda Viégas, Martin Wattenberg: Visualizing and Measuring the Geometry of BERT
40. One Size Does Not Fit All:
Language
Domain
Expertise
Customer Journey Stage
Engagement Goal
41. In 2017, the NYTimes reported Icelanders bemoaning a
decrease in younger people speaking Icelandic because
“voice activated” systems didn’t speak it.
Image Credit: Andreas Tille, CC-by-SA. NYTimes: Icelanders Seek to Keep Their Language Alive and Out of ‘the Latin Bin’
61. We are creating a new type of media.
Just for you.
We are building an engine for creators and brands to build personalized
experiences at scale.
Image Credit: Anna Dziubinska CC-by-SA
62. Let’s make a promise to our
users about their data.
63. 51% of consumers expect that companies will
anticipate their needs and make relevant
suggestions – even if they’ve never bought
from the brand before.
Source: Salesforce
64. 80% of shoppers are more likely to frequent and buy
from stores with more personalized experiences.
Source: Epsilon
65. Federated or Split Learning
Source: Split learning for health: Distributed deep learning without sharing raw patient data, Praneeth