Towards Universal Natural Language Understanding
Yunyao Li (@yunyao_li)
Senior Research Manager
Scalable Knowledge Intelligence
IBM Research – Almaden
How many languages are there in the world?
7,102 known languages
23 most spoken languages
4.1+ billion people
Source: https://www.iflscience.com/environment/worlds-most-spoken-languages-and-where-they-are-spoken/
Conventional Approach towards Language Enablement
English Text → English NLU → English Applications
German Text → German NLU → German Applications
Chinese Text → Chinese NLU → Chinese Applications
Separate NLU pipeline for each language
Separate application for each language
Universal Semantic Understanding of Natural Languages
English Text, German Text, Chinese Text → Universal NLU → Cross-lingual Applications
Single NLU pipeline for different languages
Develop once for different languages
The Challenges
Models
– Low-frequency exceptions
– Built for one task at a time
Training Data
– High-quality labeled data is required but hard to obtain
Meaning Representation
– Different meaning representations
• for different languages
• for the same language
Our Research
Models
– Instance-based learning [COLING’16]
– Deep learning + instance-based learning [In Submission]
– Human-machine co-creation [ACL’19, EMNLP’20]
Training Data
– Data: Auto-generation + human-in-the-loop [ACL’15, EMNLP’16, EMNLP’17, EMNLP’20 Findings]
– Training: Cross-lingual transfer [EMNLP’20 Findings]
Meaning Representation
– Unified Meaning Representation [ACL’15, ACL’16, ACL-DMR’19]
Semantic Role Labeling (SRL)
Who did what to whom, when, where and how?
John hastily ordered a dozen dandelions for Mary from Amazon’s Flower Shop.
[John]A0 (WHO) [hastily]AM-MNR (HOW) ordered (DID) [a dozen dandelions]A1 (WHAT) [for Mary]A2 (FOR WHOM) [from Amazon’s Flower Shop]A3 (WHERE)
order.02 (request to be delivered)
A0: Orderer
A1: Thing ordered
A2: Benefactive, ordered-for
A3: Source
AM-MNR: Manner
Dirk broke the window with a hammer.
Break.01: A0 A1 A2
The window was broken by Dirk with a hammer.
Break.01: A1 A0
Break.01
A0 – Breaker
A1 – Thing broken
A2 – Instrument
A3 – Pieces
Break.15
A0 – Journalist, exposer
A1 – Story, thing exposed
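The frame definitions above can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the `Frame` class, the `describe` helper, and the hand-written argument lists are assumptions made for the example. It shows that the active and the passive sentence resolve to the same roles even though the word order differs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """A frame (predicate sense) with its role labels. Hypothetical helper type."""
    name: str
    roles: dict  # role label -> human-readable description


# Role descriptions copied from the Break.01 frame on the slide.
BREAK_01 = Frame("break.01", {
    "A0": "Breaker",
    "A1": "Thing broken",
    "A2": "Instrument",
    "A3": "Pieces",
})


def describe(frame, labeled_args):
    """Turn (label, text) pairs into (role description, text) pairs."""
    return [(frame.roles[label], text) for label, text in labeled_args]


# Hand-labeled arguments for the two example sentences (assumed SRL output).
active = [("A0", "Dirk"), ("A1", "the window"), ("A2", "with a hammer")]
passive = [("A1", "The window"), ("A0", "by Dirk"), ("A2", "with a hammer")]

for role, text in describe(BREAK_01, passive):
    print(f"{role}: {text}")
```

Syntax changes between the two sentences (subject versus by-phrase), while the semantic roles stay fixed, which is the property the rest of the deck builds on.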
Syntax vs. Semantic Parsing
What type of labels are valid across languages?
Shared Frames Across Languages
Multilingual input text → Cross-lingual representation
WhatsApp was bought by Facebook: Buy.01, A1 = WhatsApp, A0 = Facebook
Facebook hat WhatsApp gekauft: Buy.01, A0 = Facebook, A1 = WhatsApp
Facebook a acheté WhatsApp: Buy.01, A0 = Facebook, A1 = WhatsApp
buy.01: Buyer (A0) = Facebook, Thing bought (A1) = WhatsApp
Universal Proposition Banks
Generate SRL resources for many other languages
• Shared frame set
• Minimal effort
Corpus of annotated text data
Il faut qu’il y ait des responsables (labels shown: Need.01, A0)
Je suis responsable pour le chaos (labels shown: Be.01, A1, A2, AM-PRD)
Les services postaux ont acheté des … (labels shown: Be.01, A2, A1, Buy.01, A0)
Frame set
Buy.01
A0 – Buyer
A1 – Thing bought
A2 – Seller
A3 – Price paid
A4 – Benefactive
Pay.01
A0 – Payer
A1 – Money
A2 – Being paid
A3 – Commodity
Auto-Generation of Universal Proposition Bank
Our Idea: Annotation projection with parallel corpora
Example: TV subtitles
English subtitles: I would buy that for a dollar!
German subtitles: Das würde ich für einen Dollar kaufen
Labels on the English sentence: BUYER (I), ITEM (that), PRICE (for a dollar)
Projection: the labels are transferred to the aligned German words: BUYER (ich), ITEM (Das), PRICE (für einen Dollar)
Training data
• Semantically annotated
• Multilingual
• Large amount
Resource: https://www.youtube.com/watch?v=u5HOt0ZOcYk
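Annotation projection, as described on this slide, can be sketched in a few lines. This is only an illustration under stated assumptions: the word alignment is written by hand here (a real pipeline would obtain it from a word aligner over the parallel corpus), and `project_labels` is a hypothetical helper name, not part of any released tool.

```python
def project_labels(source_labels, alignment):
    """Copy labels from source tokens to the target tokens they align with.

    source_labels: dict mapping source token index -> label
    alignment: list of (source_index, target_index) pairs
    """
    projected = {}
    for src, tgt in alignment:
        if src in source_labels:
            projected[tgt] = source_labels[src]
    return projected


en = ["I", "would", "buy", "that", "for", "a", "dollar"]
de = ["Das", "würde", "ich", "für", "einen", "Dollar", "kaufen"]

# Labels on the English side, following the slide (BUYER, ITEM, PRICE).
en_labels = {0: "BUYER", 2: "buy.01", 3: "ITEM", 4: "PRICE", 5: "PRICE", 6: "PRICE"}

# Hand-written word alignment (assumption for this example).
alignment = [(0, 2), (1, 1), (2, 6), (3, 0), (4, 3), (5, 4), (6, 5)]

de_labels = project_labels(en_labels, alignment)
for i, token in enumerate(de):
    print(token, de_labels.get(i, "-"))
```

Tokens without a labeled counterpart (here "würde") stay unlabeled. Plain projection like this copies alignment errors and translation shifts into the target annotations, which is what the filtering step on the next slide addresses.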
Filtered Projection & Bootstrapping
Two-step process
– Filters detect translation shift and block the affected projections (higher precision at the cost of recall)
– Bootstrap learning to increase recall
– Generated 7 universal proposition banks from 3 language groups
• Version 1.0: https://github.com/System-T/UniversalPropositions/
• Version 2.0 coming soon
[ACL’15] Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling.
Multilingual Aliasing
• Problem: Target language frame lexicon automatically generated from alignments
– False frames
– Redundant frames
• Expert curation of frame mappings
[COLING’16] Multilingual Aliasing for Auto-Generating Proposition Banks
Crowd-in-the-Loop Curation
Raw text corpus → corpus with predicted annotations → annotation tasks (all) → Task Router → corpus with curated annotations
Easy tasks are curated by crowd
Difficult tasks are curated by experts
[EMNLP’17] CROWD-IN-THE-LOOP: A Hybrid Approach for Annotating Semantic Roles
Effectiveness of Crowd-in-the-Loop
↑9pp F1 improvement over SRL results
↓66.4pp expert efforts
↑10pp F1 improvement over SRL results
↓87.3pp expert efforts
Latest: Filter → Select → Expert
[Findings of EMNLP’20] A Novel Workflow for Accurately and Efficiently Crowdsourcing Predicate Senses and Argument Labels
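The task router idea (easy tasks go to the crowd, difficult tasks go to experts) can be sketched as follows. The slides do not say how difficulty is decided, so this sketch assumes, purely for illustration, that each predicted annotation carries a model confidence score and that a fixed threshold separates easy from difficult. The function name, the `confidence` field, and the threshold value are all hypothetical.

```python
def route_tasks(predictions, threshold=0.8):
    """Split annotation tasks into a crowd queue and an expert queue.

    Assumption: a task is 'easy' when the SRL model's confidence in its
    prediction is at least `threshold`; everything else is 'difficult'.
    """
    crowd, expert = [], []
    for task in predictions:
        if task["confidence"] >= threshold:
            crowd.append(task)
        else:
            expert.append(task)
    return crowd, expert


# Invented example tasks; the confidence values are made up.
tasks = [
    {"id": 1, "predicate": "buy.01", "confidence": 0.95},
    {"id": 2, "predicate": "tell.01", "confidence": 0.40},
    {"id": 3, "predicate": "say.01", "confidence": 0.85},
]

crowd_queue, expert_queue = route_tasks(tasks)
print([t["id"] for t in crowd_queue], [t["id"] for t in expert_queue])
```

Raising the threshold sends more tasks to experts, so it trades expert effort against the risk of crowd errors on hard cases.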
Crosslingual Information Extraction
Task: Extract who bought what
Multilingual input text
1. WhatsApp was bought by Facebook
2. Facebook hat WhatsApp gekauft
3. Facebook a acheté WhatsApp
Cross-lingual representation
buy.01: A0 (Buyer) = Facebook, A1 (Thing bought) = WhatsApp
Crosslingual extraction
Sentence | Verb | Buyer | Thing bought
1 | buy.01 | Facebook | WhatsApp
2 | buy.01 | Facebook | WhatsApp
3 | buy.01 | Facebook | WhatsApp
[NAACL’18] SystemT: Declarative Text Understanding for Enterprise
[ACL’16] POLYGLOT: Multilingual Semantic Role Labeling with Unified Labels
[COLING’16] Multilingual Information Extraction with PolyglotIE https://vimeo.com/180382223
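Because all three languages share the buy.01 frame, a single extractor can produce the table above. The sketch below assumes the SRL output is already available as plain dictionaries; it does not call SystemT or Polyglot, and the `srl_output` structure and `extract_purchases` name are invented for illustration.

```python
# Assumed SRL output for the three example sentences (written by hand).
srl_output = [
    {"sentence": "WhatsApp was bought by Facebook",
     "frame": "buy.01", "args": {"A0": "Facebook", "A1": "WhatsApp"}},
    {"sentence": "Facebook hat WhatsApp gekauft",
     "frame": "buy.01", "args": {"A0": "Facebook", "A1": "WhatsApp"}},
    {"sentence": "Facebook a acheté WhatsApp",
     "frame": "buy.01", "args": {"A0": "Facebook", "A1": "WhatsApp"}},
]


def extract_purchases(parsed):
    """Return (sentence number, verb, buyer, thing bought) for every buy.01 frame."""
    rows = []
    for number, item in enumerate(parsed, start=1):
        if item["frame"] == "buy.01":
            args = item["args"]
            rows.append((number, item["frame"], args.get("A0"), args.get("A1")))
    return rows


for row in extract_purchases(srl_output):
    print(row)
```

The extractor never inspects the language of the sentence: it is written once against the shared frame and role labels.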
Multilingual or Polyglot Training
Goal
• Transfer knowledge and resources from rich-resource languages to low-resource languages
Main Idea
• Combine training data from multiple languages with multilingual word embeddings
• Train a common encoder model to enable parameter sharing
Different Annotations across Languages
Challenge
Different languages have different annotation schemes
EN DE YO . . .
Observation:
Certain argument labels share a common semantic meaning across languages.
Intuition:
Identify and exploit the commonalities between the annotations of different languages.
Know.01
A0: Knower
A1: Thing known
A2: A1 known about
AM: Adjuncts
Knnen.01
A0: Knower
A1: Entity
AM: Adjuncts
Hypothesis
1. Pair Matching: Identify arguments with similar semantic meaning across languages
2. Argument Regularization: Represent them close to each other in the feature space
[Figure: source manifold and target manifold, with argument labels such as A0, ZH-A0, AM-TMP, and ZH-TMP]
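The two steps can be illustrated with a toy penalty term. This is a sketch under strong assumptions, not the CLAR model: it assumes each argument label has a vector representation, that the matched pairs are already known, and that closeness is measured by mean squared Euclidean distance. The vectors and the function name are invented for the example.

```python
def regularization_penalty(vecs_a, vecs_b, matched_pairs):
    """Mean squared distance between the vectors of matched argument labels.

    vecs_a, vecs_b: dicts mapping a label to its vector (list of floats)
    matched_pairs: list of (label_in_a, label_in_b) pairs from pair matching
    """
    if not matched_pairs:
        return 0.0
    total = 0.0
    for label_a, label_b in matched_pairs:
        a, b = vecs_a[label_a], vecs_b[label_b]
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return total / len(matched_pairs)


# Toy two-dimensional label representations (made up for illustration).
lang_a = {"A0": [1.0, 0.0], "AM-TMP": [0.0, 1.0]}
lang_b = {"ZH-A0": [0.5, 0.0], "ZH-TMP": [0.0, 1.0]}

# Step 1 (pair matching) is assumed to have produced these pairs.
pairs = [("A0", "ZH-A0"), ("AM-TMP", "ZH-TMP")]

# Step 2: the penalty shrinks as matched labels move closer together.
penalty = regularization_penalty(lang_a, lang_b, pairs)
print(penalty)
```

In a trained model a term like this would be added to the task loss, so that matched labels are pulled together while the SRL objective is still optimized.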
CLAR Performance
Dataset: CoNLL 2009
Ours is SoTA
– On average performance over all languages
– On 3 out of 5 non-English languages
General approach
– Independent of base model
– Independent of language
– Requires no parallel data
Dependency Parsing vs. SRL
[Figure: bar chart comparing SRL and dependency parsing on WSJ and BROWN; axis from 75 to 100]
What Makes SRL So Difficult?
Heavy-tailed distribution of class labels
– Common frames
• say.01 (8243), have.01 (2040), sell.01 (1009)
– Many uncommon frames
• swindle.01, feed.01, hum.01, toast.01
– Almost half of all frames seen fewer than 3 times in training data
[Figure: Distribution of frame labels; frequency axis from 0 to 9000]
Many low-frequency exceptions → Difficult to capture in models
Low-Frequency Exceptions
Strong correlation of syntactic function of an argument to its role
Example: passive subject
The window was broken by Dirk
Dependency labels: SBJ, VC, PMOD, NMOD. Passive subject: A1
The silver was sold by the man.
Dependency labels: SBJ, VC, PMOD, NMOD. Passive subject: A1
Creditors were told to hold off.
Dependency labels: SBJ, OPRD, VC, IM, PRT
TELL.01
A0: speaker (agent)
A1: utterance (topic)
A2: hearer (recipient)
Instance-based Learning
kNN: k-Nearest Neighbors classification
Find the k most similar instances in training data
Derive class label from nearest neighbors
[Figure: training instances labeled A0, A1, and A2 arranged by distance (1, 2, 3, ... n) from an unlabeled query instance "?"]
Creditors were told to hold off.
Dependency labels: SBJ, OPRD, VC, IM, PRT
Composite feature | Distance
“creditor” passive subject of TELL.01 | 1
noun passive subject of TELL.01 | 2
... | ...
any passive subject of any agentive verb | n
Main idea: Back off to composite feature seen at least k times
[COLING 2016] K-SRL: Instance-based Learning for Semantic Role Labeling
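The back-off idea can be sketched as follows. This is a simplified illustration, not the K-SRL implementation: composite features are represented as plain strings ordered from most specific to most general, the training instances and their counts are invented, and `classify_with_backoff` is a hypothetical name.

```python
from collections import Counter


def classify_with_backoff(training, query_features, k=3):
    """Label a query by backing off through increasingly general features.

    training: list of (set of composite features, role label)
    query_features: the query's composite features, most specific first
    Returns (label, feature used), or (None, None) if nothing is seen k times.
    """
    for feature in query_features:
        labels = [label for feats, label in training if feature in feats]
        if len(labels) >= k:
            # Majority vote among the instances sharing this feature.
            return Counter(labels).most_common(1)[0][0], feature
    return None, None


SPECIFIC = "creditor|passive-subject|tell.01"
MEDIUM = "noun|passive-subject|tell.01"
GENERAL = "any|passive-subject|agentive-verb"

# Invented training instances: passive subjects of tell.01 are hearers (A2),
# while passive subjects of most other agentive verbs are A1.
training = [
    ({MEDIUM, GENERAL}, "A2"),
    ({MEDIUM, GENERAL}, "A2"),
    ({MEDIUM, GENERAL}, "A2"),
    ({GENERAL}, "A1"),
    ({GENERAL}, "A1"),
]

label, used = classify_with_backoff(training, [SPECIFIC, MEDIUM, GENERAL], k=3)
print(label, used)
```

The most specific feature is never seen in this toy data, so the classifier backs off one level and still recovers the verb-specific exception instead of the general passive-subject pattern.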
Results
[Figure: results in-domain and out-of-domain]
• Significantly outperform previous approaches
– Especially on out-of-domain data
• Small neighborhoods suffice (k=3)
• Fast runtime
Latest results (improvement over SoTA with DL + IL)
↑1.4pp F1 in-domain
↑5.1pp F1 out-of-domain
[In Submission] Deep learning + Instance-based Learning
[COLING 2016] K-SRL: Instance-based Learning for Semantic Role Labeling
Transparent Linguistic Models for Contract Understanding
[NAACL-NLLP’19] Transparent Linguistic Models for Contract Understanding and Comparison https://www.ibm.com/cloud/compare-and-comply
Transparent Model Design
Element
Purchaser will purchase the Assets by a cash payment.
Semantic NLP Primitives (Core NLP Understanding)
[Purchaser]ARG0 [will]TENSE-FUTURE purchase [the Assets]ARG1 [by a cash payment]ARGM-MNR
Core NLP Primitives & Operators
Provided by SystemT [ACL ’10, NAACL ’18]
Legal Domain LLEs (Domain Specific Concepts)
LLE1: PREDICATE ∈ DICT Business-Transaction ∧ TENSE = Future ∧ POLARITY = Positive → NATURE = Obligation ∧ PARTY = ARG0
(business transaction verbs in future tense with positive polarity)
LLE2: …........
Model Output
Nature/Party: Obligation for Purchaser
https://www.ibm.com/cloud/compare-and-comply
[NAACL-NLLP’19] Transparent Linguistic Models for Contract Understanding and Comparison
[NAACL’21] Development of an Enterprise-Grade Contract Understanding System
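The LLE1 rule can be read as an ordinary predicate over the semantic primitives. The sketch below is not SystemT or AQL code: it assumes the SRL output is available as a dictionary, and the contents of the business-transaction dictionary are hypothetical, since the slides do not list them.

```python
# Hypothetical dictionary of business-transaction verbs (not from the slides).
BUSINESS_TRANSACTION = {"purchase", "buy", "sell", "pay"}


def apply_lle1(element):
    """LLE1: business-transaction predicate, future tense, positive polarity
    implies NATURE = Obligation and PARTY = ARG0. Returns None otherwise."""
    if (element["predicate"] in BUSINESS_TRANSACTION
            and element["tense"] == "future"
            and element["polarity"] == "positive"):
        return {"nature": "Obligation", "party": element["args"]["ARG0"]}
    return None


# Assumed semantic primitives for the example element on the slide.
element = {
    "predicate": "purchase",
    "tense": "future",
    "polarity": "positive",
    "args": {
        "ARG0": "Purchaser",
        "ARG1": "the Assets",
        "ARGM-MNR": "by a cash payment",
    },
}

result = apply_lle1(element)
print(result)
```

Each condition in the rule is directly readable and can be edited on its own, which is what makes this kind of model transparent to a domain expert.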
Human & Machine Co-Creation
Labeled Data → Deep Learning → Learned Rules (Explainable) → Modify Rules → Evaluation Results → Production
– Machine performs heavy lifting to abstract out patterns
– Humans verify the transparent model
– Evaluation & Deployment
Raises the abstraction level for domain experts to interact with
[EMNLP’20] Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification
User Study: Human+Machine Co-Created Model Performance
User study
– 4 NLP engineers with < 2 years of experience
– 2 NLP experts with 10+ years of experience
Key Takeaways
– Explanation of learned rules: The visualization tool is very effective
– Reduction in human labor: A co-created model built within 1.5 person-hours outperforms a black-box sentence classifier
– Lower requirement on human expertise: The co-created model is on par with the model created by super-experts
[Figure: F-measure (axis from 0.0 to 0.6) for users Ua, Ub, Uc, Ud; RuleNN+Human vs. BiLSTM]
[ACL’19] HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop
Conclusion
Research prototype
Early adoption (EN)
Cross-lingual adaptation
• Watson products
• Customer engagements
• Research projects …
• 10+ languages
• SoTA models
• Papers: 10+ publications
• Patents: 6 patents filed
• Data: ibm.biz/LanguageData
• In progress
To Learn More:
• ibm.biz/ScalableKnowledgeIntelligence
• ibm.biz/SystemT
Data Sets:
• ibm.biz/LanguageData
Follow me:
• LinkedIn: https://www.linkedin.com/in/yunyao-li/
• Twitter: @yunyao_li

Meaning Representations for Natural Languages:  Design, Models and ApplicationsMeaning Representations for Natural Languages:  Design, Models and Applications
Meaning Representations for Natural Languages: Design, Models and Applications
 
Taming the Wild West of NLP
Taming the Wild West of NLPTaming the Wild West of NLP
Taming the Wild West of NLP
 
Towards Deep Table Understanding
Towards Deep Table UnderstandingTowards Deep Table Understanding
Towards Deep Table Understanding
 
Explainability for Natural Language Processing
Explainability for Natural Language ProcessingExplainability for Natural Language Processing
Explainability for Natural Language Processing
 
Explainability for Natural Language Processing
Explainability for Natural Language ProcessingExplainability for Natural Language Processing
Explainability for Natural Language Processing
 
Human in the Loop AI for Building Knowledge Bases
Human in the Loop AI for Building Knowledge Bases Human in the Loop AI for Building Knowledge Bases
Human in the Loop AI for Building Knowledge Bases
 
Explainability for Natural Language Processing
Explainability for Natural Language ProcessingExplainability for Natural Language Processing
Explainability for Natural Language Processing
 
An In-depth Analysis of the Effect of Text Normalization in Social Media
An In-depth Analysis of the Effect of Text Normalization in Social MediaAn In-depth Analysis of the Effect of Text Normalization in Social Media
An In-depth Analysis of the Effect of Text Normalization in Social Media
 
Exploiting Structure in Representation of Named Entities using Active Learning
Exploiting Structure in Representation of Named Entities using Active LearningExploiting Structure in Representation of Named Entities using Active Learning
Exploiting Structure in Representation of Named Entities using Active Learning
 
K-SRL: Instance-based Learning for Semantic Role Labeling
K-SRL: Instance-based Learning for Semantic Role LabelingK-SRL: Instance-based Learning for Semantic Role Labeling
K-SRL: Instance-based Learning for Semantic Role Labeling
 
Coling poster
Coling posterColing poster
Coling poster
 
Coling demo
Coling demoColing demo
Coling demo
 
Natural Language Data Management and Interfaces: Recent Development and Open ...
Natural Language Data Management and Interfaces: Recent Development and Open ...Natural Language Data Management and Interfaces: Recent Development and Open ...
Natural Language Data Management and Interfaces: Recent Development and Open ...
 
Polyglot: Multilingual Semantic Role Labeling with Unified Labels
Polyglot: Multilingual Semantic Role Labeling with Unified LabelsPolyglot: Multilingual Semantic Role Labeling with Unified Labels
Polyglot: Multilingual Semantic Role Labeling with Unified Labels
 
Transparent Machine Learning for Information Extraction: State-of-the-Art and...
Transparent Machine Learning for Information Extraction: State-of-the-Art and...Transparent Machine Learning for Information Extraction: State-of-the-Art and...
Transparent Machine Learning for Information Extraction: State-of-the-Art and...
 
The Power of Declarative Analytics
The Power of Declarative AnalyticsThe Power of Declarative Analytics
The Power of Declarative Analytics
 
Enterprise Search in the Big Data Era: Recent Developments and Open Challenges
Enterprise Search in the Big Data Era: Recent Developments and Open ChallengesEnterprise Search in the Big Data Era: Recent Developments and Open Challenges
Enterprise Search in the Big Data Era: Recent Developments and Open Challenges
 
SystemT: Declarative Information Extraction
SystemT: Declarative Information ExtractionSystemT: Declarative Information Extraction
SystemT: Declarative Information Extraction
 

Towards Universal Language Understanding

  • 1. Towards Universal Natural Language Understanding. Yunyao Li (@yunyao_li), Senior Research Manager, Scalable Knowledge Intelligence, IBM Research – Almaden
  • 3. 7,102 known languages; the 23 most spoken languages account for 4.1+ billion people. Source: https://www.iflscience.com/environment/worlds-most-spoken-languages-and-where-they-are-spoken/
  • 4. Conventional Approach towards Language Enablement: English Text → English NLU → English Applications; German Text → German NLU → German Applications; Chinese Text → Chinese NLU → Chinese Applications. Separate NLU pipeline for each language; separate application for each language.
  • 5. Universal Semantic Understanding of Natural Languages: English Text, German Text, Chinese Text → Universal NLU → Cross-lingual Applications. Single NLU pipeline for different languages; develop once for different languages.
  • 6. The Challenges and Our Research. Models. Challenges: low-frequency exceptions; built for one task at a time. Our research: instance-based learning [COLING’16]; deep learning + instance-based learning [In Submission]; human-machine co-creation [ACL’19, EMNLP’20]. Training Data. Challenge: high-quality labeled data is required but hard to obtain. Our research: Data: auto-generation + human-in-the-loop [ACL’15, EMNLP’16, EMNLP’17, EMNLP’20 Findings]; Training: cross-lingual transfer [EMNLP’20 Findings]. Meaning Representation. Challenge: different meaning representations for different languages and for the same language. Our research: Unified Meaning Representation [ACL’15, ACL’16, ACL-DMR’19].
  • 8. Semantic Role Labeling (SRL): Who did what to whom, when, where and how? Example: [John]A0 [hastily]AM-MNR ordered [a dozen dandelions]A1 [for Mary]A2 [from Amazon’s Flower Shop]A3. Frame order.02 (request to be delivered): A0: Orderer (WHO); A1: Thing ordered (WHAT); A2: Benefactive, ordered-for (FOR WHOM); A3: Source (WHERE); AM-MNR: Manner (HOW).
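One way to picture the output of SRL for this sentence is as a small record holding the predicate, its frame, and a mapping from role labels to text spans. The sketch below is only an illustration: the `Frame` class and the `who_did_what` helper are hypothetical names, and the annotation is written by hand rather than produced by a parser.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One predicate with its labeled arguments (role label -> text span)."""
    predicate: str
    sense: str
    arguments: dict = field(default_factory=dict)


# Hand-written annotation of the slide's example sentence; no parser is run.
frame = Frame(
    predicate="ordered",
    sense="order.02",
    arguments={
        "A0": "John",                  # orderer (who)
        "AM-MNR": "hastily",           # manner (how)
        "A1": "a dozen dandelions",    # thing ordered (what)
        "A2": "Mary",                  # benefactive (for whom)
        "A3": "Amazon's Flower Shop",  # source (where)
    },
)


def who_did_what(f: Frame) -> str:
    """Answer 'who did what' from the core roles A0 and A1."""
    return f"{f.arguments['A0']} {f.predicate} {f.arguments['A1']}"


print(who_did_what(frame))
```

Because the roles are keyed by label rather than by word position, the same lookup works for an active or a passive phrasing of the sentence.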
  • 9. Syntax vs. Semantic Parsing. “Dirk broke the window with a hammer.” and “The window was broken by Dirk with a hammer.” differ in syntax but receive the same roles: Break.01 with A0 (Dirk), A1 (the window), A2 (a hammer). Frame Break.01: A0: Breaker; A1: Thing broken; A2: Instrument; A3: Pieces. Frame Break.15: A0: Journalist, exposer; A1: Story, thing exposed. What types of labels are valid across languages?
  • 10. Shared Frames Across Languages. Multilingual input text: “WhatsApp was bought by Facebook”; “Facebook hat WhatsApp gekauft”; “Facebook a acheté WhatsApp”. Each sentence is labeled Buy.01 with Facebook as A0 and WhatsApp as A1. Cross-lingual representation: buy.01, Buyer (A0): Facebook, Thing bought (A1): WhatsApp.
  • 12. Universal Proposition Banks: generate SRL resources for many other languages with a shared frame set and minimal effort. Corpus of annotated text data: “Il faut qu’il y ait des responsables”; “Je suis responsable pour le chaos”; “Les services postaux ont acheté des …”. Labels shown on the slide: Need.01 A0; Be.01 A1 A2 AM-PRD; Be.01 A2 A1; Buy.01 A0. Frame set: Buy.01: A0: Buyer; A1: Thing bought; A2: Seller; A3: Price paid; A4: Benefactive. Pay.01: A0: Payer; A1: Money; A2: Being paid; A3: Commodity.
  • 13. Auto-Generation of Universal Proposition Bank. Our idea: annotation projection with parallel corpora. Example: TV subtitles. The English subtitle “I would buy that for a dollar!” is labeled BUYER, ITEM and PRICE, and these labels are projected onto the German subtitle “Das würde ich für einen Dollar kaufen”. Resulting training data: semantically annotated, multilingual, large amount. Resource: https://www.youtube.com/watch?v=u5HOt0ZOcYk
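The projection step can be sketched as copying each labeled English token's label to the German token it is aligned with. In this minimal sketch the word alignment is written by hand (a real pipeline would obtain it from an alignment tool), `project_labels` is a hypothetical helper name, and the `buy.01` label on the verb is added for illustration alongside the slide's BUYER, ITEM and PRICE labels.

```python
def project_labels(source_labels, alignment):
    """Copy each source token's label to the target token it is aligned to.

    source_labels: dict of source token index -> label
    alignment:     dict of source token index -> target token index
    """
    projected = {}
    for src_idx, label in source_labels.items():
        tgt_idx = alignment.get(src_idx)
        if tgt_idx is not None:  # unaligned source tokens project nothing
            projected[tgt_idx] = label
    return projected


en = ["I", "would", "buy", "that", "for", "a", "dollar"]
de = ["Das", "würde", "ich", "für", "einen", "Dollar", "kaufen"]

# Labels on the English side (token index -> label).
en_labels = {0: "BUYER", 2: "buy.01", 3: "ITEM", 4: "PRICE", 5: "PRICE", 6: "PRICE"}

# Hand-written word alignment (English index -> German index); an assumption.
alignment = {0: 2, 1: 1, 2: 6, 3: 0, 4: 3, 5: 4, 6: 5}

de_labels = project_labels(en_labels, alignment)
labeled = [(tok, de_labels.get(i, "-")) for i, tok in enumerate(de)]
for tok, label in labeled:
    print(f"{tok}\t{label}")
```

The German sentence ends up annotated without any German-specific annotation effort, which is what makes large parallel corpora such as subtitles attractive as a source.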
  • 14. Filtered Projection & Bootstrapping. Two-step process: (1) filters detect translation shift and block projections (more precision at the cost of recall); (2) bootstrap learning increases recall. Generated 7 universal proposition banks from 3 language groups. Version 1.0: https://github.com/System-T/UniversalPropositions/ (Version 2.0 coming soon). [ACL’15] Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling.
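The filtering step can be illustrated with a single, deliberately simple filter. The slide does not spell out the filters, so the rule used here (block a predicate projection when the aligned target token is not a verb) is an assumption standing in for translation-shift detection, and `filtered_projection` and the toy data are hypothetical. The bootstrapping step is not shown.

```python
def filtered_projection(source_preds, alignment, target_pos):
    """Project predicate frames, blocking those that fail a simple filter.

    A projection is blocked when the source predicate is unaligned or the
    aligned target token is not tagged as a verb (assumed filter rule).
    """
    kept, blocked = {}, []
    for src_idx, frame in source_preds.items():
        tgt_idx = alignment.get(src_idx)
        if tgt_idx is None or target_pos[tgt_idx] != "VERB":
            blocked.append(src_idx)  # favors precision at the cost of recall
            continue
        kept[tgt_idx] = frame
    return kept, blocked


# Toy data: two source predicates, one aligned to a verb, one to a noun.
source_preds = {2: "buy.01", 5: "pay.01"}
alignment = {2: 6, 5: 3}
target_pos = ["PRON", "AUX", "PRON", "NOUN", "DET", "NOUN", "VERB"]

kept, blocked = filtered_projection(source_preds, alignment, target_pos)
print(kept, blocked)
```

Blocked projections leave gaps in the target annotation, which is why a second step is needed to recover recall.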
  • 15. Multilingual Aliasing. Problem: the target-language frame lexicon is automatically generated from alignments, which introduces false frames and redundant frames. Expert curation of frame mappings. [COLING’16] Multilingual Aliasing for Auto-Generating Proposition Banks
  • 16. Crowd-in-the-Loop Curation. Raw text corpus → corpus with predicted annotations → task router over all annotation tasks → corpus with curated annotations. Easy tasks are curated by the crowd; difficult tasks are curated by experts. [EMNLP’17] CROWD-IN-THE-LOOP: A Hybrid Approach for Annotating Semantic Roles
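The routing idea can be sketched as a split of annotation tasks into two queues. The slide does not say how the router decides what counts as easy, so the per-task `difficulty` score and the `threshold` below are assumptions made for illustration, as is the `route_tasks` name.

```python
def route_tasks(tasks, threshold=0.5):
    """Split annotation tasks into a crowd queue and an expert queue.

    Each task carries a hypothetical difficulty score in [0, 1]; tasks
    below the threshold go to the crowd, the rest go to experts.
    """
    crowd, expert = [], []
    for task in tasks:
        queue = crowd if task["difficulty"] < threshold else expert
        queue.append(task["id"])
    return crowd, expert


tasks = [
    {"id": "t1", "difficulty": 0.1},
    {"id": "t2", "difficulty": 0.9},
    {"id": "t3", "difficulty": 0.4},
]
crowd_queue, expert_queue = route_tasks(tasks)
print("crowd:", crowd_queue, "expert:", expert_queue)
```

Lowering the threshold sends more work to experts; raising it trades expert effort for a higher risk of crowd errors on hard tasks.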
  • 17. Effectiveness of Crowd-in-the-Loop: ↑9pp F1 improvement over SRL results, ↓66.4pp expert effort. Latest (Filter → Select → Expert): ↑10pp F1 improvement over SRL results, ↓87.3pp expert effort. [Findings of EMNLP’20] A Novel Workflow for Accurately and Efficiently Crowdsourcing Predicate Senses and Argument Labels
  • 19. Crosslingual Information Extraction. Task: extract who bought what. Multilingual input text: (1) “WhatsApp was bought by Facebook”; (2) “Facebook hat WhatsApp gekauft”; (3) “Facebook a acheté WhatsApp”. Cross-lingual representation: buy.01, Buyer (A0): Facebook, Thing bought (A1): WhatsApp. Crosslingual extraction:
      Sentence   Verb     Buyer      Thing bought
      1          buy.01   Facebook   WhatsApp
      2          buy.01   Facebook   WhatsApp
      3          buy.01   Facebook   WhatsApp
    [NAACL’18] SystemT: Declarative Text Understanding for Enterprise; [ACL’16] POLYGLOT: Multilingual Semantic Role Labeling with Unified Labels; [COLING’16] Multilingual Information Extraction with PolyglotIE https://vimeo.com/180382223
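Once every sentence carries the same frame and role labels, one extraction rule serves all three languages. The sketch below assumes the SRL output is already available as plain dictionaries (written by hand here); `extract_purchases` is a hypothetical helper, not part of SystemT or PolyglotIE.

```python
# Hand-written SRL output for the three input sentences; the shared labels
# (buy.01, A0, A1) are what make a single rule sufficient.
frames = [
    {"text": "WhatsApp was bought by Facebook", "frame": "buy.01",
     "A0": "Facebook", "A1": "WhatsApp"},
    {"text": "Facebook hat WhatsApp gekauft", "frame": "buy.01",
     "A0": "Facebook", "A1": "WhatsApp"},
    {"text": "Facebook a acheté WhatsApp", "frame": "buy.01",
     "A0": "Facebook", "A1": "WhatsApp"},
]


def extract_purchases(srl_frames):
    """Return (sentence number, verb, buyer, thing bought) for buy.01 frames."""
    return [
        (i, f["frame"], f["A0"], f["A1"])
        for i, f in enumerate(srl_frames, start=1)
        if f["frame"] == "buy.01"
    ]


rows = extract_purchases(frames)
for row in rows:
    print(row)
```

The rule never inspects the sentence text, so it is indifferent to word order, voice, and language.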
  • 20. Multilingual or Polyglot Training. Goal: transfer knowledge and resources from resource-rich languages to low-resource languages. Main idea: combine training data from multiple languages (EN, DE, YO, ...) with multilingual word embeddings, and train a common encoder model to enable parameter sharing. Challenge: different languages have different annotation schemes.
  • 21. Different Annotations across Languages. Observation: certain argument labels share a common semantic meaning across languages. Intuition: identify and exploit the commonalities between the annotations of different languages. Know.01: A0: Knower; A1: Thing known; A2: A1 known about; AM: Adjuncts. Knnen.01: A0: Knower; A1: Entity; AM: Adjuncts.
  • 22. Hypothesis. (1) Pair Matching: identify arguments with similar semantic meaning across languages, and (2) Argument Regularization: represent them close to each other in the feature space. [Figure: source manifold and target manifold with argument labels A0, AM-TMP, ZH-A0, ZH-TMP]
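The regularization idea can be sketched as a penalty that grows when matched arguments sit far apart in the feature space. The slide does not give the loss, so the mean squared Euclidean distance used here is an assumption, as are the toy 2-dimensional vectors, the pairing of A0 with ZH-A0 and AM-TMP with ZH-TMP, and the `regularization_loss` name.

```python
def regularization_loss(source_vecs, target_vecs, pairs):
    """Mean squared Euclidean distance between matched argument embeddings."""
    total = 0.0
    for src_label, tgt_label in pairs:
        s, t = source_vecs[src_label], target_vecs[tgt_label]
        total += sum((a - b) ** 2 for a, b in zip(s, t))
    return total / len(pairs)


# Toy embeddings for argument labels in the source and target languages.
source_vecs = {"A0": [1.0, 0.0], "AM-TMP": [0.0, 1.0]}
target_vecs = {"ZH-A0": [0.5, 0.0], "ZH-TMP": [0.0, 0.5]}

# Output of the pair-matching step (assumed here, not computed).
pairs = [("A0", "ZH-A0"), ("AM-TMP", "ZH-TMP")]

loss = regularization_loss(source_vecs, target_vecs, pairs)
print(loss)
```

Adding such a term to the training objective pushes matched labels together while leaving unmatched labels free to differ across languages.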
  • 23. CLAR Performance. Dataset: CoNLL2009. Ours is SoTA on average performance over all languages and on 3 out of 5 non-English languages. General approach: independent of base model; independent of language; requires no parallel data.
  • 25. Dependency Parsing vs. SRL. [Chart: SRL and Dependency Parsing compared on WSJ and BROWN; axis from 75 to 100]
  • 26. What Makes SRL So Difficult? Heavy-tailed distribution of class labels: common frames, e.g. say.01 (8243), have.01 (2040), sell.01 (1009); many uncommon frames, e.g. swindle.01, feed.01, hum.01, toast.01; almost half of all frames are seen fewer than 3 times in training data. Many low-frequency exceptions → difficult to capture in models. [Chart: distribution of frame labels; axis from 0 to 9000]
  • 27. Low-Frequency Exceptions. The syntactic function of an argument correlates strongly with its role. Example: passive subject. In “The window was broken by Dirk” and “The silver was sold by the man.” the passive subject (SBJ) is A1. In “Creditors were told to hold off.” the passive subject is instead the hearer, A2. Dependency labels shown: SBJ, PMOD, VC, NMOD for the first two sentences; SBJ, OPRD, VC, IM, PRT for the third. TELL.01: A0: speaker (agent); A1: utterance (topic); A2: hearer (recipient).
  • 28. Instance-based Learning. kNN (k-Nearest Neighbors classification): find the k most similar instances in training data; derive the class label from the nearest neighbors. Example: “Creditors were told to hold off.” (SBJ, OPRD, VC, IM, PRT). Composite features, ordered by increasing distance: (1) “creditor” passive subject of TELL.01; (2) noun passive subject of TELL.01; ... (n) any passive subject of any agentive verb. Main idea: back off to a composite feature seen at least k times. [COLING 2016] K-SRL: Instance-based Learning for Semantic Role Labeling
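The back-off idea can be sketched as walking a list of composite features from most to least specific and stopping at the first one with at least k training instances. This is a simplified illustration, not the K-SRL implementation: the tuple encoding of features, the toy training counts, the majority vote, and the `classify_with_backoff` name are all assumptions.

```python
from collections import Counter


def classify_with_backoff(features, training, k=3):
    """Pick a role label from the most specific feature seen at least k times.

    features: composite features ordered from most to least specific
    training: dict of composite feature -> role labels observed with it
    """
    for feature in features:
        labels = training.get(feature, [])
        if len(labels) >= k:
            # Majority vote among the instances sharing this feature.
            return Counter(labels).most_common(1)[0][0], feature
    return None, None


# Toy training data; the counts are invented for illustration.
training = {
    ("creditor", "passive-subject", "tell.01"): ["A2"],
    ("NOUN", "passive-subject", "tell.01"): ["A2", "A2", "A2", "A1"],
    ("ANY", "passive-subject", "agentive-verb"): ["A1"] * 50 + ["A2"] * 5,
}

# Features for "Creditors" in "Creditors were told to hold off."
features = [
    ("creditor", "passive-subject", "tell.01"),
    ("NOUN", "passive-subject", "tell.01"),
    ("ANY", "passive-subject", "agentive-verb"),
]

label, used_feature = classify_with_backoff(features, training, k=3)
print(label, used_feature)
```

Stopping at the verb-specific feature keeps the exception (A2 for the hearer) from being overridden by the general pattern that passive subjects are A1.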
  • 29. Results (in-domain and out-of-domain): significantly outperforms previous approaches, especially on out-of-domain data; small neighborhoods suffice (k=3); fast runtime. Latest results with deep learning + instance-based learning (DL + IL), improvement over SoTA: ↑1.4pp F1 in-domain, ↑5.1pp F1 out-of-domain [In Submission]. [COLING 2016] K-SRL: Instance-based Learning for Semantic Role Labeling
  • 31. Transparent Linguistic Models for Contract Understanding. [NAACL-NLLP’19] Transparent Linguistic Models for Contract Understanding and Comparison https://www.ibm.com/cloud/compare-and-comply
  • 32. Transparent Model Design. Element: “Purchaser will purchase the Assets by a cash payment.” [NAACL-NLLP’19] Transparent Linguistic Models for Contract Understanding and Comparison; [NAACL’21] Development of an Enterprise-Grade Contract Understanding System. https://www.ibm.com/cloud/compare-and-comply
  • 33. Transparent Model Design: Core NLP Understanding. Core NLP primitives and operators, provided by SystemT [ACL’10, NAACL’18], yield semantic NLP primitives: [Purchaser]A0 [will]TENSE-FUTURE purchase [the Assets]A1 [by a cash payment]ARGM-MNR
  • 34. Transparent Model Design: Legal Domain LLEs. Domain-specific concepts are built on the semantic NLP primitives, e.g. business transaction verbs in future tense with positive polarity. LLE1: PREDICATE ∈ DICT Business-Transaction ∧ TENSE = Future ∧ POLARITY = Positive → NATURE = Obligation ∧ PARTY = ARG0. LLE2: …
  • 35. Transparent Model Design: Model Output. For [Purchaser]ARG0 [will]TENSE-FUTURE purchase [the Assets]ARG1 [by a cash payment]ARGM-MNR, LLE1 gives Nature/Party: Obligation for Purchaser.
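LLE1 reads naturally as a predicate over the semantic primitives of one element. The sketch below encodes it as a plain function; the contents of the business-transaction dictionary, the dictionary layout of `element`, and the `apply_lle1` name are assumptions for illustration and do not reflect how SystemT expresses such rules.

```python
# Hypothetical stand-in for DICT Business-Transaction.
BUSINESS_TRANSACTION = {"purchase", "buy", "sell", "pay"}


def apply_lle1(element):
    """LLE1: business-transaction predicate, future tense, positive polarity
    implies NATURE = Obligation and PARTY = ARG0."""
    if (
        element["predicate"] in BUSINESS_TRANSACTION
        and element["tense"] == "future"
        and element["polarity"] == "positive"
    ):
        return {"nature": "Obligation", "party": element["args"]["ARG0"]}
    return None  # the rule does not fire


# Semantic primitives for the slide's example, written by hand.
element = {
    "text": "Purchaser will purchase the Assets by a cash payment.",
    "predicate": "purchase",
    "tense": "future",
    "polarity": "positive",
    "args": {
        "ARG0": "Purchaser",
        "ARG1": "the Assets",
        "ARGM-MNR": "by a cash payment",
    },
}

output = apply_lle1(element)
print(output)
```

Because every condition is an explicit check on a named primitive, a reviewer can see exactly why the element was classified as an obligation, which is the transparency the slides argue for.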
  • 36. Human & Machine Co-Creation. Diagram elements: Labeled Data; Deep Learning; Learned Rules (Explainable); Modify Rules; Evaluation Results; Production. Machine performs heavy lifting to abstract out patterns. Humans verify/transparent model. Evaluation & Deployment. Raises the abstraction level for domain experts to interact with. [EMNLP’20] Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification
  • 37. User Study: Human+Machine Co-Created Model Performance. Participants: 4 NLP engineers with < 2 years of experience; 2 NLP experts with 10+ years of experience. Key takeaways: (1) Explanation of learned rules: the visualization tool is very effective. (2) Reduction in human labor: a co-created model created within 1.5 person-hours outperforms a black-box sentence classifier. (3) Lower requirement on human expertise: the co-created model is on par with the model created by Super-Experts. [Chart: F-measure (0.0 to 0.6) for users Ua, Ub, Uc, Ud; RuleNN+Human vs. BiLSTM] [ACL’19] HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop
  • 38. Conclusion. Research prototype: 10+ languages; SoTA models; Papers: 10+ publications; Patents: 6 patents filed; Data: ibm.biz/LanguageData. Early adaption (EN): Watson products; customer engagements; research projects … Cross-lingual adaptation: in progress. To learn more: ibm.biz/ScalableKnowledgeIntelligence; ibm.biz/SystemT. Data sets: ibm.biz/LanguageData. Follow me: LinkedIn: https://www.linkedin.com/in/yunyao-li/; Twitter: @yunyao_li