Natural language understanding is a fundamental task in artificial intelligence. English understanding has reached a mature state and has been successfully deployed in multiple IBM AI products and services, such as Watson Natural Language Understanding and Watson Discovery. However, scaling existing products and services to support additional languages remains an open challenge. In this talk, we will discuss the open challenges in supporting universal natural language understanding and share our work over the past few years in addressing them. We will also showcase how universal semantic representations of natural languages can enable cross-lingual information extraction in concrete domains (e.g., compliance) and present ongoing efforts toward seamlessly scaling existing NLP capabilities across languages with minimal effort.
1. Taming the Wild West of
Natural Language Processing
Yunyao Li
Distinguished Research Staff Member
Senior Manager
Scalable Knowledge Intelligence Department
IBM Research – Almaden
@yunyao_li
yunyaoli@us.ibm.com
NLP4MusA
4. Challenges of NLP in the Wild
Complexity: complex documents
Explainability: explain the rationale behind predictions and reasoning to experts
Small Data: limited labeled data; even unlabeled data may not be available
Customizability: rich domain knowledge; wide spectrum of users
4
6. The Enterprise Challenges
Customizability: rich domain knowledge; wide spectrum of users
Explainability: explain the rationale behind predictions and reasoning to experts
Small Data: limited labeled data; even unlabeled data may not be available
Complexity: complex documents; wide variety of tasks
6
7. Document Conversion
Preserve Reading Order: content should be ordered across multiple columns, tables & images
Table Region & Structure Identification: identify tables & individual cell values through a combination of explicit graphical lines and implicit cell alignment
Formatting Metadata: identify headings and paragraphs from character-level font information and positional information
• Raw PDF information: positional character-level information, graphical lines, font and image data
• Semantic constructs such as table structure and reading order need to be inferred from these low-level features using visual clues and ML algorithms
• Need to handle variations in PDF formats across programmatic PDFs (created by a variety of tools) and scanned PDFs
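As a concrete illustration of the reading-order problem, here is a toy sketch that orders the text blocks of a two-column page column by column. The fixed column threshold and the block coordinates are assumptions for illustration; real systems infer columns from visual clues and ML, as described above.

```python
# Text blocks as (x, y, text), in arbitrary extraction order.
blocks = [
    (320, 50, "right column, first paragraph"),
    (40, 50, "left column, first paragraph"),
    (40, 200, "left column, second paragraph"),
    (320, 200, "right column, second paragraph"),
]

COLUMN_SPLIT = 300  # assumed x-coordinate separating the two columns

# Read the left column top-to-bottom, then the right column.
ordered = sorted(blocks, key=lambda b: (b[0] >= COLUMN_SPLIT, b[1]))

for _, _, text in ordered:
    print(text)
```

Even this toy version shows why naive top-to-bottom extraction fails: sorted purely by y, the two "first paragraph" blocks would interleave across columns.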
8. Why Is It Hard? Variety in PDF Tables
Table with visual clues only
Complex tables – graphical lines can be misleading – is this 1, 2 or 3 tables?
Multi-row, multi-column column headers
Nested row headers
Tables with textual content
Table with graphic lines
Table interleaved with text and charts
9. Data Augmentation via Auto-Data Generation
FinTabNet
>110k Tables
10K reports of S&P 500 companies.
ibm.biz/fintabnet
PubTabNet
568K Tables
PubMed publications
ibm.biz/pubtabnet
[ECCV’20] Image-Based Table Recognition: Data, Model, and Evaluation
Auto-generated table boundary, cell boundary and structure annotations.
10. Global Table Extraction
[WACV’21] Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context.
11. New SOTA on ICDAR 2013 and ICDAR 2019 datasets
Note: Our model significantly outperforms competing models when using metrics better reflecting downstream processing
• Product transfer (2021)
• CORD-19 table data (since May 12, 2020): [ACL-COVID-19,’20] CORD-19: The COVID-19 Open Research Dataset.
[IBM Blog’20] Bringing IBM NLP capabilities to the CORD-19 Dataset. http://ibm.biz/CORD19-IBM
• TableQA: [AAAI’21] KAAPA: Knowledge Aware Answers from PDF Analysis.
[NAACL’21 ] Capturing Row and Column Semantics in Transformer Based Question Answering over Tables.
13. Easy Customization via Adaptive Deep Learning
[IUI’21] TableLab: An Interactive Table Extraction System with Adaptive Deep Learning. (demo) 13
14. The Enterprise Challenges
Customizability: rich domain knowledge; wide spectrum of users
Explainability: explain the rationale behind predictions and reasoning to experts
Complexity: complex documents; wide variety of tasks
Small Data: limited labeled data; even unlabeled data may not be available
14
Data Augmentation
Neural-Symbolic AI
Domain-Specific Languages + Powerful Primitives
Human-in-the-Loop
15. Semantic Role Labeling (SRL)
John hastily ordered a dozen dandelions for Mary from Amazon’s Flower Shop.
order.02 (request to be delivered)
A0: Orderer (WHO)
A1: Thing ordered (WHAT)
A2: Benefactive, ordered-for (FOR WHOM)
A3: Source (WHERE)
AM-MNR: Manner (HOW)
Who did what to whom, when, where and how?
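In code, the predicate-argument structure above boils down to a small mapping from roles to spans; a minimal sketch (the dict layout here is our own illustration, not a standard SRL output format):

```python
# Predicate-argument structure of:
#   "John hastily ordered a dozen dandelions for Mary from Amazon's Flower Shop."
# following the PropBank-style frame order.02 ("request to be delivered").
srl_frame = {
    "predicate": "ordered",
    "sense": "order.02",
    "arguments": {
        "A0": "John",                  # orderer (who)
        "A1": "a dozen dandelions",    # thing ordered (what)
        "A2": "Mary",                  # benefactive, ordered-for (for whom)
        "A3": "Amazon's Flower Shop",  # source (where)
        "AM-MNR": "hastily",           # manner (how)
    },
}

def who_did_what(frame):
    """Summarize a frame as 'who did what'."""
    args = frame["arguments"]
    return f'{args["A0"]} {frame["predicate"]} {args["A1"]}'

print(who_did_what(srl_frame))  # John ordered a dozen dandelions
```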
16. Generate SRL Resources for Many Other Languages
• Shared frame set
• Minimal effort
Example annotations (French):
Il faut qu‘il y ait des responsables — Need.01 (A0)
Je suis responsable pour le chaos — Be.01 (A1, A2, AM-PRD)
Les services postaux ont acheté des … — Buy.01 (A0, A1); Be.01 (A2)
Corpus of annotated text data → Universal Proposition Banks
Frame set:
Buy.01: A0 – Buyer, A1 – Thing bought, A2 – Seller, A3 – Price paid, A4 – Benefactive
Pay.01: A0 – Payer, A1 – Money, A2 – Being paid, A3 – Commodity
17. Our Idea: Annotation Projection with Parallel Corpora
Example: TV subtitles
English subtitles: I would buy that for a dollar! — BUYER, ITEM, PRICE
German subtitles: Das würde ich für einen Dollar kaufen
Projection: I would buy that for a dollar → Das würde ich für einen Dollar kaufen
Training data: semantically annotated, multilingual, large amount
Auto-generation of Universal Proposition Banks
17
Resource: https://www.youtube.com/watch?v=u5HOt0ZOcYk
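The projection step itself can be sketched in a few lines: given SRL labels on the English side and word alignments between the two sentences, each label is copied to the aligned German token. A toy illustration (the alignment is hand-set here; real pipelines obtain it from an automatic word aligner):

```python
# Parallel subtitle pair from the slide.
en = "I would buy that for a dollar".split()
de = "Das würde ich für einen Dollar kaufen".split()

# English-side SRL labels, token index -> role (buyer, predicate, item, price).
en_labels = {0: "A0", 2: "PRED", 3: "A1", 6: "A3"}

# Word alignment, English index -> German index (hand-set for illustration).
alignment = {0: 2, 2: 6, 3: 0, 6: 5}

# Project: copy each label onto its aligned target-side token.
de_labels = {alignment[i]: role for i, role in en_labels.items() if i in alignment}

for idx, role in sorted(de_labels.items()):
    print(de[idx], role)
```

This recovers "ich" as the buyer (A0), "kaufen" as the predicate, "Das" as the item (A1) and "Dollar" as the price (A3), without any German-side annotation.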
18. Filtered Projection & Bootstrapping
Two-step process:
– Filters to detect translation shift and block projections (more precision at the cost of recall)
– Bootstrap learning to increase recall
Generated 7 universal proposition banks from 3 language groups
• Version 1.0: https://github.com/System-T/UniversalPropositions/
• Version 2.0 coming soon
[ACL’15] Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling.
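The filtering step above can be illustrated with a toy version: block a projection when a crude translation-shift signal fires. The POS-mismatch check used here is a simplified stand-in for the paper's actual filters, purely for illustration:

```python
# Toy filtered projection: refuse to project annotations when a crude
# translation-shift signal fires (here: POS mismatch on the predicate).
def filtered_project(en_labels, alignment, en_pos, de_pos):
    """en_labels: src token index -> role; alignment: src -> tgt index;
    en_pos / de_pos: token index -> POS tag."""
    projected = {}
    for i, role in en_labels.items():
        j = alignment.get(i)
        if j is None:
            continue  # unaligned token: nothing to project
        if role == "PRED" and en_pos[i] != de_pos[j]:
            return {}  # suspected translation shift: block the projection
        projected[j] = role
    return projected

# The predicate aligns to a verb on the target side: projection goes through.
print(filtered_project({2: "PRED", 0: "A0"}, {2: 6, 0: 2},
                       {2: "VERB", 0: "PRON"}, {6: "VERB", 2: "PRON"}))
```

Trading recall for precision in this way is what makes the subsequent bootstrap-learning step necessary.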
19. Multilingual Aliasing
• Problem: target-language frame lexicon automatically generated from alignments
  – False frames
  – Redundant frames
• Expert curation of frame mappings
[COLING’16] Multilingual Aliasing for Auto-Generating Proposition Banks
21. Effectiveness of Crowd-in-the-Loop
9% F1 improvement over SRL results — 66.4% expert efforts
10% F1 improvement over SRL results — 87.3% expert efforts
Latest: Filter Select Expert
[EMNLP’20 (Findings)] A Novel Workflow for Accurately and Efficiently Crowdsourcing Predicate Senses and Argument Labels
22. The Enterprise Challenges
Explainability: explain the rationale behind predictions and reasoning to experts
Complexity: complex documents; wide variety of tasks
Small Data: limited labeled data; even unlabeled data may not be available
Customizability: rich domain knowledge; wide spectrum of users
22
23. Targeted Users – A Simplified View
23
AI Engineer: software programming; develop and deploy AI models; understand business needs
Data Scientist: data wrangling; develop AI models; distill insights from data
Subject Matter Expert (SME): communicate business needs; data labeling; use AI models and provide feedback; create simple AI models
24. Dimensions of NLP Customization – An Overview
24
Approaches (labeling efforts from low to high; dev. efforts, constraints and errors also vary):
• Rules – declarative systems
• Simple ML – rule induction
• ML* – supervised learning; AutoML, active learning
25. Dimensions of NLP Customization – An Overview
25
26. SystemT: Declarative Text Understanding for the Enterprise
SystemT Architecture
AQL Language: specify extractor semantics declaratively
Optimizer: choose an efficient execution plan that implements the semantics
Operator Runtime
Example AQL Extractor
create view PersonPhone as
select P.name as person, N.number as phone
from Person P, PhoneNumber N, Sentence S
where
Follows(P.name, N.number, 0, 30)
and Contains(S.sentence, P.name)
and Contains(S.sentence, N.number)
and ContainsRegex(/\b(phone|at)\b/,
SpanBetween(P.name, N.number));
-- Within a single sentence: <Person> followed by <PhoneNum> within 0–30 chars,
-- with "phone" or "at" in between
Fundamental Results & Theorems
Expressivity: The class of extraction tasks expressible in AQL is a strict superset of that expressible through cascaded regular automata.
Performance: For any acyclic token-based finite state transducer T, there exists an operator graph G such that evaluating T and G has the same computational complexity.
[ACL’10] SystemT: An Algebraic Approach to Declarative Information Extraction. ibm.biz/SystemT
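For readers unfamiliar with AQL, the PersonPhone view above can be approximated in plain Python. This is a naive per-sentence sketch, not SystemT: the Person dictionary and phone pattern below are stand-in assumptions (a real system would use proper entity extractors), and there is no optimizer choosing an execution plan.

```python
import re

PERSONS = ["John", "Mary"]  # stand-in for the Person view
PERSON_RE = re.compile(r"\b(?:" + "|".join(PERSONS) + r")\b")
PHONE_RE = re.compile(r"\b\d{3}-\d{4}\b")  # stand-in for the PhoneNumber view
CLUE_RE = re.compile(r"\b(phone|at)\b")

def extract_person_phone(sentence):
    """Person followed by a phone number within 0-30 chars, with 'phone'
    or 'at' somewhere between them (mirrors Follows + ContainsRegex).
    Operating on one sentence mirrors the Contains(S.sentence, ...) joins."""
    results = []
    for p in PERSON_RE.finditer(sentence):
        for n in PHONE_RE.finditer(sentence):
            gap = n.start() - p.end()
            if 0 <= gap <= 30 and CLUE_RE.search(sentence[p.end():n.start()]):
                results.append((p.group(), n.group()))
    return results

print(extract_person_phone("Please call John at 555-1234."))  # [('John', '555-1234')]
```

The contrast with AQL is the point: here the nested-loop evaluation strategy is hard-coded, whereas the declarative view lets the optimizer pick the plan.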
27. Explainable AI for the Enterprise – SystemT: Text Understanding
27
Domain-Specific Models: AI for IT, AI for Customer Care, AI for Compliance, …
Cross-lingual Semantic Abstraction (semantic parsing, table understanding)
Syntactic Abstraction (tokenization, lemmatization, POS, etc.)
Syntactic NLP Operators (HTML operators, regular expressions, dictionaries, span operators, …)
[NAACL’18] SystemT: Declarative Text Understanding for Enterprise.
• 20+ IBM products
• 50+ papers
ibm.biz/SystemT
28. Tooling for Different Users
28
AI Engineers – full-fledged IDE: IBM InfoSphere BigInsights Text Analytics Eclipse Tooling
[ACL’12] WizIE: A Best Practices Guided Development Environment for Information Extraction
AI Engineers/Data Scientists – visual IDE: IBM Watson Knowledge Studio Advanced Rule Editor
[VLDB’15] VINERy: A Visual IDE for Information Extraction http://ibm.biz/VineryIE
[KDD’19] Declarative Text Understanding with SystemT. (hands-on tutorial)
29. Entity Extraction for Watson AIOps
29
Entity Extraction in AIOps https://www.ibm.com/cloud/blog/entity-extraction-in-aiops
30. Dimensions of NLP Customization – An Overview
30
31. Example-Driven Extraction via Pattern Induction
31
[CHI’17] SEER: Auto-Generating Information Extraction Rules from User-Specified Examples
[SIGMOD’17] Synthesizing Extraction Rules from User Examples with SEER.
[AAAI’22] InteractEva: A Simulation-based Evaluation Framework for Interactive AI Systems (to appear)
IBM Watson Discovery (Beta in Plus since Oct. 2021) http://ibm.biz/SEER_IE, https://ibm.biz/WDSPressReleaseNov
32. Example-Driven Extraction via Pattern Induction
32
33. Example-Driven Extraction via Pattern Induction
33
34. Example-Driven Extraction via Pattern Induction
34
35. Dimensions of NLP Customization – An Overview
35
36. Human & Machine Co-Creation
Labeled Data → Deep Learning → Learned Rules (Explainable) → Modify Rules → Evaluation Results → Production
Machine performs the heavy lifting to abstract out patterns
Humans verify/modify the transparent model
Evaluation & Deployment
Raises the abstraction level at which domain experts interact with the model
[EMNLP’20] Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification
37. Learning Rules with Neuro-Symbolic AI
Hides a structure-learning task inside a parameter-estimation task:
– Learnable parameters include distributions expressing which predicate is included
Network architecture encodes soft analogs of conjunction (product) and disjunction (max)
Contains a layer per predicate in the clause
Last layer aggregates scores across actions in the sentence
Multiple clauses in the DNF can be supported by adding more channels
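The soft connectives can be illustrated directly; a minimal numeric sketch (not the actual network from the paper), where a clause score is the product of its predicate scores and the DNF score is the max over clause channels:

```python
import numpy as np

def soft_and(scores):
    """Soft conjunction: product of per-predicate scores in [0, 1]."""
    return float(np.prod(scores))

def soft_or(scores):
    """Soft disjunction: max over clause-channel scores."""
    return float(np.max(scores))

# Two clause channels scored on one sentence (illustrative numbers).
clause1 = soft_and([0.9, 0.8])  # both predicates fire strongly
clause2 = soft_and([0.9, 0.1])  # second predicate barely fires
dnf_score = soft_or([clause1, clause2])
print(round(dnf_score, 2))  # 0.72
```

Because product and max are differentiable (max almost everywhere), the clause-membership parameters can be trained by gradient descent, which is what turns structure learning into parameter estimation.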
38. Human-Machine Co-Creation
38
[ACL’19] HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop.
[EMNLP’20] Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification.
Multiple customer engagements
http://ibm.biz/HEIDL_Demo
39. User Study: Human+Machine Co-Created Model Performance
User study
– 4 NLP engineers with < 2 years of experience
– 2 NLP experts with 10+ years of experience
Key takeaways
– Explanation of learned rules: the visualization tool is very effective
– Reduction in human labor: a co-created model built within 1.5 person-hours outperforms a black-box sentence classifier
– Lower requirement on human expertise: the co-created model is on par with the model created by super-experts
[ACL’19] HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop.
[EMNLP’20] Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification.
40. Dimensions of NLP Customization – An Overview
40
41. AutoAI for Text (AutoText)
[AAAI’21] AutoText: An End-to-End AutoAI Framework for Text. (demo) 41
42. Easy to Use UI: No-Code
42
[AAAI’21] AutoText: An End-to-End AutoAI Framework for Text. (demo)
46. Initial Success Stories: Academic Benchmarks
46
Problem – High barrier: years of training and days of model building & tuning
Solution – No code: simply load datasets into the UI
Results – SOTA or better
Text classification: TREC-6
Sentiment analysis: SST1, SST2
48. Initial Success Stories: Watson NLP Production Model for Text Classification
48
Problem – Expensive: manual weight tuning & classifier selection for the OOB ensemble model
Solution – Automatic weight tuning and classifier selection
Results – >10x speed-up in training at comparable quality, via auto weight tuning and HPO; >30% reduction in combined training and prediction time, via classifier selection
49. Easy to Use UI: No-Code
49
[AAAI’21] AutoText: An End-to-End AutoAI Framework for Text. (demo)
53. SLEUTH: System for Learning to Understand Text with Human-in-the-loop
53
Continuous feedback between model training and SME input
• No need to label 1000 data points to get a first impression of the model’s performance
• Early identification of concept drift and ill-defined classes (garbage in, garbage out)
54. Ongoing Work
Provide a unified and empowering user experience while taking advantage of a wide range of techniques to help users build the best models for their own use cases.
54
SLEUTH: System for Learning to Understand Text with Human-in-the-loop
55. The Enterprise Challenges
Complexity: complex documents; wide variety of tasks
Small Data: limited labeled data; even unlabeled data may not be available
Customizability: rich domain knowledge; wide spectrum of users
Explainability: explain the rationale behind predictions and reasoning to experts
55
56. Who Needs to Know What, When?
56
AI lifecycle touchpoints, the audience at each (whom the AI model interfaces with), and their explainability motivations (information needs):
Initial model building
– Audience: model developers, data scientists, product managers
– Motivations: peeking inside models to understand their inner workings; improving model design (e.g., how the model should be retrained or re-tuned); selecting the right model
Model validation during proof-of-concept
– Audience: domain experts, business owners, business IT operations
– Motivations: characteristics of data (proprietary, public, training data); understanding model design; ensuring ethical model development
Model in-production
– Audience: model developers, data scientists | technical strategists, product managers | design teams, business owners/users, business IT operations
– Motivations: expectation mismatch; augmenting business workflow and business actionability
[DIS’21] Who Needs to Know What, When?: Broadening the Explainable AI (XAI) Design Space by Looking at Explanations Across the AI Lifecycle.
57. Who Needs to Know What, When?
57
58. Who Needs to Know What, When?
58
60. Who Needs to Know What, When?
60
61. Learning Explainable Models with Low Resources
Ensemble maintains quality
Natural language generation offers explainability for SMEs
Only scarce labeled data is used for the entire process
Labeled Data (Scarce) → Transfer Learning (pre-trained language embeddings) → Unlabeled Data Augmented with Weak Labels → Deep Learning → Explainable Model (Rules) → Ensemble
Human-machine co-creation → Natural language explanations for predictions
62. Hybrid Models
Transferred BERT-based model (trained on only 7% of data!):
                           Precision  Recall  F1
Production                 82%        57%     67%
BERT-based                 50%        84%     62%
BERT-based → HEIDL         77%        62%     68%
BERT-based → HEIDL precision is within 5% of the production model’s.
Ensemble:
                           Precision  Recall  F1   Expl.
Production                 79%        68%     73%  89.5
Rules-first-BERT-fallback  59%        91%     71%  51.4
Recall far exceeds anything else; precision almost matches the production model’s.
Trade-off axes: ease of development (E) vs. explainability (X) — hand-crafted ESSP rules vs. BERT-based model → HEIDL.
63. Who Needs to Know What, When?
63
64. Transparent Linguistic Models for Contract Understanding
64
[NAACL’21] Development of an Enterprise-Grade Contract Understanding System, (Industry Track) Watson Discovery Content Intelligence
65. Transparent Model Design
Element: “Purchaser will purchase the Assets by a cash payment.”
[NAACL-NLLP’19] Transparent Linguistic Models for Contract Understanding and Comparison
[NAACL’21] Development of an Enterprise-Grade Contract Understanding System https://www.ibm.com/cloud/compare-and-comply
66. Transparent Model Design
Element: “Purchaser will purchase the Assets by a cash payment.”
Core NLP understanding:
[Purchaser]ARG0 [will]TENSE-FUTURE purchase [the Assets]ARG1 [by a cash payment]ARGM-MNR
Core NLP Primitives & Operators provided by SystemT [ACL’10, NAACL’18]
Semantic NLP Primitives
[NAACL-NLLP’19] Transparent Linguistic Models for Contract Understanding and Comparison
[NAACL’21] Development of an Enterprise-Grade Contract Understanding System https://www.ibm.com/cloud/compare-and-comply
67. Transparent Model Design
Element: “Purchaser will purchase the Assets by a cash payment.”
[Purchaser]ARG0 [will]TENSE-FUTURE purchase [the Assets]ARG1 [by a cash payment]ARGM-MNR
Legal Domain LLEs:
LLE1: PREDICATE ∈ DICT(Business-Transaction) ∧ TENSE = Future ∧ POLARITY = Positive → NATURE = Obligation ∧ PARTY = ARG0
LLE2: …
Domain-specific concepts: business-transaction verbs in future tense with positive polarity
Core NLP Primitives & Operators; Semantic NLP Primitives
https://www.ibm.com/cloud/compare-and-comply
[NAACL-NLLP’19] Transparent Linguistic Models for Contract Understanding and Comparison
[NAACL’21] Development of an Enterprise-Grade Contract Understanding System
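LLE1 above is just a conjunctive test over the semantic primitives; a hypothetical sketch (the dictionary contents and function signature are our own illustration, not the product’s API):

```python
# Illustrative business-transaction dictionary (the DICT in LLE1).
BUSINESS_TRANSACTION_VERBS = {"purchase", "buy", "sell", "pay"}

def lle1(predicate, tense, polarity, args):
    """LLE1: business-transaction predicate, future tense, positive polarity
    -> NATURE = Obligation, PARTY = ARG0."""
    if (predicate in BUSINESS_TRANSACTION_VERBS
            and tense == "Future"
            and polarity == "Positive"):
        return {"NATURE": "Obligation", "PARTY": args.get("ARG0")}
    return None

# "Purchaser will purchase the Assets by a cash payment."
print(lle1("purchase", "Future", "Positive",
           {"ARG0": "Purchaser", "ARG1": "the Assets"}))
# {'NATURE': 'Obligation', 'PARTY': 'Purchaser'}
```

The transparency claim follows directly: every output can be traced back to which dictionary entry and which feature values fired.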
68. Transparent Model Design
Element: “Purchaser will purchase the Assets by a cash payment.”
[Purchaser]ARG0 [will]TENSE-FUTURE purchase [the Assets]ARG1 [by a cash payment]ARGM-MNR
Model output – Nature/Party: Obligation for Purchaser
LLE1: PREDICATE ∈ DICT(Business-Transaction) ∧ TENSE = Future ∧ POLARITY = Positive → NATURE = Obligation ∧ PARTY = ARG0
LLE2: …
Legal Domain LLEs; Domain-specific concepts; Core NLP Primitives & Operators; Semantic NLP Primitives
https://www.ibm.com/cloud/compare-and-comply
[NAACL-NLLP’19] Transparent Linguistic Models for Contract Understanding and Comparison
[NAACL’21] Development of an Enterprise-Grade Contract Understanding System
69. ModelLens: Visual Interactive Tool for Model Improvement
Scalable: through summarization and integrated AI algorithms
Customizable: allows adaptation to different models
Systematic: offers a systematic way of improving models
1. Acquire an overview of errors (identify heavy hitters)
2. Drill down into individual errors
3. Get context about each error (text, location in input document, model provenance, …)
4. Record error root cause
[CSCW’19] ModelLens: An Interactive System to Support the Model Improvement Practices of Data Science Teams.
Currently used by multiple product and research teams
70. Systematic Model Improvement
70
Error analysis: e.g., “$100 is due at signing” is not classified as <Obligation>
Root cause identification: e.g., <Currency> is not considered in the model
Model improvement: e.g., include <Currency> as a feature, or introduce entity-aware attention
[CSCW’19] ModelLens: An Interactive System to Support the Model Improvement Practices of Data Science Teams.
Currently used by multiple product and research teams
71. XNLP – Interactively Explore the Literature on XNLP
71
https://xainlp2020.github.io/xainlp/home
List view: lists the set of papers in a table
Search view: keyword search and faceted search
[KDD’21] Explainability for Natural Language Processing. (tutorial)
[IUI’21] XNLP: A Living Survey for XAI Research in Natural Language Processing. (demo)
[AACL’20] A Survey of the State of Explainable AI for Natural Language Processing.
[AACL’20] Explainability for Natural Language Processing. (tutorial)
[IUI’20] XAIT: An Interactive Website for Explainable AI for Text. (demo)
72. XNLP – Interactively Explore the Literature on XNLP
72
Tree view: categorizes papers in a tree-like structure
Cluster view: groups papers based on explainability and visualization
Citation graph: the evolution of the field and influential works
https://xainlp2020.github.io/xainlp/home
[KDD’21] Explainability for Natural Language Processing. (tutorial)
[IUI’21] XNLP: A Living Survey for XAI Research in Natural Language Processing. (demo)
[AACL’20] A Survey of the State of Explainable AI for Natural Language Processing.
[AACL’20] Explainability for Natural Language Processing. (tutorial)
[IUI’20] XAIT: An Interactive Website for Explainable AI for Text. (demo)
73. Summary
73
Credit: Simon Sinek, 2010
WHY: Why do we need to build this NLP model?
HOW: How will it be used and evaluated?
WHAT: What is the best way to build it?
76. Human-in-the-Loop Throughout the Entire Life Cycle
76
Data Labeling → Model Development → Deployment + Feedback
Scale data labeling with auto-generation + crowd-in-the-loop
Curb data hunger with transfer learning + active learning
Build explainable models directly via an IDE
Human-machine co-creation
Scale model building with AutoML
• End user provides feedback
• Feedback influences the entire AI life cycle
Editor's Notes
Good morning everyone. Today I’m going to talk about our stories on taming the wild west of NLP.
Today’s NLP landscape looks just like the Wild West. There are many players in the wild. As you have seen, the number of peer-reviewed AI publications is increasing fast, as is the number of publications at top NLP conferences.
Similarly, the global NLP market is growing fast, at an annual rate of 19.7%. Meanwhile, the market is highly fragmented.
So what are the unique challenges for NLP in the wild?
First let me briefly introduce semantic role labeling. SRL is a fundamental NLP task that aims to recover the predicate-argument structure of an input sentence. Put simply, given a sentence, it tells you who did what to whom, when, where and how. For example, you find that John is the “orderer” (the who), who did something, namely “ordered”, and the something is the “thing ordered” (the what). A0, A1 and so on are just semantic labels. We follow the PropBank formalism in this presentation.
* Construct interpretable domain-specific models based on the abstraction, with learning & reasoning + HIL tooling + DSL. Capture understanding of text in a language-agnostic abstraction in a DSL.
Build SOTA models + HIL tooling to provide such abstraction with language and domain adaptability
Relevant hyperparameters:
Length of each clause
Length of the DNF
Although the system is still undergoing development and improvement, we can already report some initial success stories. First, we have shown that over multiple academic benchmarks for text classification and sentiment analysis, AutoAI for Text overcomes the high barrier to entry in model building by providing a no-code solution that produces models whose quality is comparable to or better than the state of the art.
We have also successfully deployed AutoAI for Text to help our Watson NLP product team build better out-of-the-box models. In this case, we are given a pre-existing production model for text classification based on an ensemble of multiple base classifiers that are manually combined and tuned. The dual goal here is 1) to replace the expensive manual step of weight tuning with an automatic one, and 2) to determine whether all the classifiers are needed and, if not, to select an appropriate subset of the base classifiers that does the job. Our solution automates both steps. As a result, we obtained a more than 10x speed-up in training without any compromise in quality. We also obtained a more than 30% reduction in combined training and prediction time, via a new meta-learning algorithm for classifier selection.
Explain that SME input does not need to be limited to labeling (the user could also answer additional types of questions asked by the system)
We have walked through in detail the different ways model explanations are sought and offered.
We capture these findings in a visual representation here, with the three high-level areas of model building, proof of concept and model in-production. We include the primary audience and their explanation needs throughout the AI model’s lifecycle.
To recap, the initial model development phase relies heavily on explanations to peek inside AI models to understand their inner workings, improve their designs and ultimately select the ideal one for further stages.
Proof-of-concept demonstration relies heavily on communication about the data on which the model was trained, understanding the high-level mechanics of the AI model and ensuring its design meets ethical and regulatory standards.
When the model is in production, explainability is triggered when expectations are violated. Furthermore, to use an AI model in their main workflow, business stakeholders need assurance that the explanations it generates can prompt business actionability.
We take the case of HEIDL next (covered previously in part 2), to examine how explainability is an integral part of its design.
Demo here
Now I am going to explain how we have designed a model on top of this stack.
The first step is to identify the elements in the contract.
For example, in this contract, "Purchaser will purchase the Assets by a cash payment" is one of the elements.
Each element is then analyzed using the NLP primitives.
For example, the element here is analyzed using the semantic layer, and its arguments are identified ("Purchaser", "the Assets", and "by a cash payment"). The tense is identified as future.
After this step, we have developed LLEs that capture domain knowledge.
Let me walk you through this: in the first step, the predicate is compared with a list of business-transaction verbs, its tense is checked to be future, and its polarity is checked to be positive.
If all the conditions are satisfied for an element, the LLE assigns the corresponding label to the element. In this case, the element is an obligation for the Purchaser.
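As a rough illustration of how such an LLE-style rule could look in code, here is a minimal sketch. The class and names (`SemanticFrame`, the argument-role keys, `BUSINESS_TRANSACTION_VERBS`) are assumptions for illustration, not the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical lexicon of business-transaction verbs (illustrative only).
BUSINESS_TRANSACTION_VERBS = {"purchase", "pay", "deliver", "sell"}

@dataclass
class SemanticFrame:
    """Output of the semantic layer for one contract element (assumed shape)."""
    predicate: str                        # lemma of the main verb
    tense: str                            # e.g. "future", "past", "present"
    polarity: str                         # "positive" or "negative"
    arguments: dict = field(default_factory=dict)  # role -> argument text

def label_element(frame: SemanticFrame) -> Optional[str]:
    """LLE-style rule: business verb + future tense + positive polarity
    => the element is an obligation for the agent argument."""
    if (frame.predicate in BUSINESS_TRANSACTION_VERBS
            and frame.tense == "future"
            and frame.polarity == "positive"):
        agent = frame.arguments.get("A0", "party")
        return f"Obligation({agent})"
    return None

# The running example: "Purchaser will purchase the Assets by a cash payment"
frame = SemanticFrame(
    predicate="purchase",
    tense="future",
    polarity="positive",
    arguments={"A0": "Purchaser", "A1": "the Assets", "AM-MNR": "by a cash payment"},
)
print(label_element(frame))  # Obligation(Purchaser)
```

The rule only fires when every condition holds, so past-tense or negated statements pass through unlabeled.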
Here we show a few screenshots taken from the website. Due to limited time, we won't show a live demo, but we encourage the interested audience to explore the website at their own pace.