Natural language understanding is a fundamental task in artificial intelligence. English understanding has reached a mature state and has been successfully deployed in multiple IBM AI products and services, such as Watson Natural Language Understanding and Watson Discovery. However, scaling existing products and services to support additional languages remains an open challenge. In this talk, we will discuss the open challenges in supporting universal natural language understanding and share our work over the past few years in addressing them. We will also showcase how universal semantic representations of natural languages can enable cross-lingual information extraction in concrete domains (e.g., compliance) and present ongoing efforts toward seamlessly scaling existing NLP capabilities across languages with minimal effort.
1. Taming the Wild West of
Natural Language Processing
Yunyao Li
Distinguished Research Staff Member
Senior Manager
Scalable Knowledge Intelligence Department
IBM Research – Almaden
@yunyao_li
yunyaoli@us.ibm.com
NLP4MusA
4. Challenges of NLP in the Wild
Complexity: complex documents
Explainability: explain the rationale behind predictions and reasoning to experts
Small Data: limited labeled data; even unlabeled data may not be available
Customizability: rich domain knowledge; wide spectrum of users
4
6. The Enterprise Challenges
Customizability: rich domain knowledge; wide spectrum of users
Explainability: explain the rationale behind predictions and reasoning to experts
Small Data: limited labeled data; even unlabeled data may not be available
Complexity: complex documents; wide variety of tasks
6
7. Document Conversion
Preserve Reading Order: content should be ordered across multiple columns, tables & images
Table Region & Structure Identification: identify tables & individual cell values through a combination of explicit graphical lines and implicit cell alignment
Formatting Metadata: identify headings and paragraphs from character-level font information and positional information
• Raw PDF information: positional character-level information, graphical lines, font and image data
• Semantic constructs such as table structure and reading order need to be inferred from these low-level features using visual clues and ML algorithms
• Need to handle variations in PDF formats across programmatic PDFs (created by a variety of tools) and scanned PDFs
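As a concrete illustration of the reading-order problem, here is a toy sketch that orders the text blocks of a two-column page column by column. The fixed column threshold and the block coordinates are assumptions for illustration; real systems infer columns from visual clues and ML, as described above.

```python
# Text blocks as (x, y, text), in arbitrary extraction order.
blocks = [
    (320, 50, "right column, first paragraph"),
    (40, 50, "left column, first paragraph"),
    (40, 200, "left column, second paragraph"),
    (320, 200, "right column, second paragraph"),
]

COLUMN_SPLIT = 300  # assumed x-coordinate separating the two columns

# Read the left column top-to-bottom, then the right column.
ordered = sorted(blocks, key=lambda b: (b[0] >= COLUMN_SPLIT, b[1]))

for _, _, text in ordered:
    print(text)
```

Even this toy version shows why naive top-to-bottom extraction fails: sorted purely by y, the two "first paragraph" blocks would interleave across columns.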
8. Why Is It Hard? Variety in PDF Tables
Table with visual clues only
Complex tables – graphical lines can be misleading – is this 1, 2 or 3 tables?
Multi-row, multi-column column headers
Nested row headers
Tables with textual content
Table with graphic lines
Table interleaved with text and charts
9. Data Augmentation via Auto-Data Generation
FinTabNet
>110k Tables
10K reports of S&P 500 companies.
ibm.biz/fintabnet
PubTabNet
568K Tables
PubMed publications
ibm.biz/pubtabnet
[ECCV’20] Image-Based Table Recognition: Data, Model, and Evaluation
Auto-generated table boundary, cell boundary and structure annotations.
10. Global Table Extraction
[WACV’21] Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context.
11. New SOTA on ICDAR 2013 and ICDAR 2019 datasets
Note: Our model significantly outperforms competing models when using metrics better reflecting downstream processing
• Product transfer (2021)
• CORD-19 table data (since May 12, 2020): [ACL-COVID-19,’20] CORD-19: The COVID-19 Open Research Dataset.
[IBM Blog’20] Bringing IBM NLP capabilities to the CORD-19 Dataset. http://ibm.biz/CORD19-IBM
• TableQA: [AAAI’21] KAAPA: Knowledge Aware Answers from PDF Analysis.
[NAACL’21 ] Capturing Row and Column Semantics in Transformer Based Question Answering over Tables.
13. Easy Customization via Adaptive Deep Learning
[IUI’21] TableLab: An Interactive Table Extraction System with Adaptive Deep Learning. (demo) 13
14. The Enterprise Challenges
Customizability: rich domain knowledge; wide spectrum of users
Explainability: explain the rationale behind predictions and reasoning to experts
Complexity: complex documents; wide variety of tasks
Small Data: limited labeled data; even unlabeled data may not be available
14
Data Augmentation
Neural-Symbolic AI
Domain-Specific Languages + Powerful Primitives
Human-in-the-Loop
15. Semantic Role Labeling (SRL)
John hastily ordered a dozen dandelions for Mary from Amazon’s Flower Shop.
order.02 (request to be delivered)
A0: Orderer (WHO)
A1: Thing ordered (WHAT)
A2: Benefactive, ordered-for (FOR WHOM)
A3: Source (WHERE)
AM-MNR: Manner (HOW)
Who did what to whom, when, where and how?
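In code, the predicate-argument structure above boils down to a small mapping from roles to spans; a minimal sketch (the dict layout here is our own illustration, not a standard SRL output format):

```python
# Predicate-argument structure of:
#   "John hastily ordered a dozen dandelions for Mary from Amazon's Flower Shop."
# following the PropBank-style frame order.02 ("request to be delivered").
srl_frame = {
    "predicate": "ordered",
    "sense": "order.02",
    "arguments": {
        "A0": "John",                  # orderer (who)
        "A1": "a dozen dandelions",    # thing ordered (what)
        "A2": "Mary",                  # benefactive, ordered-for (for whom)
        "A3": "Amazon's Flower Shop",  # source (where)
        "AM-MNR": "hastily",           # manner (how)
    },
}

def who_did_what(frame):
    """Summarize a frame as 'who did what'."""
    args = frame["arguments"]
    return f'{args["A0"]} {frame["predicate"]} {args["A1"]}'

print(who_did_what(srl_frame))  # John ordered a dozen dandelions
```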
16. Generate SRL Resources for Many Other Languages
• Shared frame set
• Minimal effort
Example annotations (French):
Il faut qu‘il y ait des responsables — Need.01 (A0)
Je suis responsable pour le chaos — Be.01 (A1, A2, AM-PRD)
Les services postaux ont acheté des … — Buy.01 (A0, A1); Be.01 (A2)
Corpus of annotated text data → Universal Proposition Banks
Frame set:
Buy.01: A0 – Buyer, A1 – Thing bought, A2 – Seller, A3 – Price paid, A4 – Benefactive
Pay.01: A0 – Payer, A1 – Money, A2 – Being paid, A3 – Commodity
17. Our Idea: Annotation Projection with Parallel Corpora
Example: TV subtitles
English subtitles: I would buy that for a dollar! — BUYER, ITEM, PRICE
German subtitles: Das würde ich für einen Dollar kaufen
Projection: I would buy that for a dollar → Das würde ich für einen Dollar kaufen
Training data: semantically annotated, multilingual, large amount
Auto-generation of Universal Proposition Banks
17
Resource: https://www.youtube.com/watch?v=u5HOt0ZOcYk
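The projection step itself can be sketched in a few lines: given SRL labels on the English side and word alignments between the two sentences, each label is copied to the aligned German token. A toy illustration (the alignment is hand-set here; real pipelines obtain it from an automatic word aligner):

```python
# Parallel subtitle pair from the slide.
en = "I would buy that for a dollar".split()
de = "Das würde ich für einen Dollar kaufen".split()

# English-side SRL labels, token index -> role (buyer, predicate, item, price).
en_labels = {0: "A0", 2: "PRED", 3: "A1", 6: "A3"}

# Word alignment, English index -> German index (hand-set for illustration).
alignment = {0: 2, 2: 6, 3: 0, 6: 5}

# Project: copy each label onto its aligned target-side token.
de_labels = {alignment[i]: role for i, role in en_labels.items() if i in alignment}

for idx, role in sorted(de_labels.items()):
    print(de[idx], role)
```

This recovers "ich" as the buyer (A0), "kaufen" as the predicate, "Das" as the item (A1) and "Dollar" as the price (A3), without any German-side annotation.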
18. Filtered Projection & Bootstrapping
Two-step process:
– Filters to detect translation shift and block projections (more precision at the cost of recall)
– Bootstrap learning to increase recall
Generated 7 universal proposition banks from 3 language groups
• Version 1.0: https://github.com/System-T/UniversalPropositions/
• Version 2.0 coming soon
[ACL’15] Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling.
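The filtering step above can be illustrated with a toy version: block a projection when a crude translation-shift signal fires. The POS-mismatch check used here is a simplified stand-in for the paper's actual filters, purely for illustration:

```python
# Toy filtered projection: refuse to project annotations when a crude
# translation-shift signal fires (here: POS mismatch on the predicate).
def filtered_project(en_labels, alignment, en_pos, de_pos):
    """en_labels: src token index -> role; alignment: src -> tgt index;
    en_pos / de_pos: token index -> POS tag."""
    projected = {}
    for i, role in en_labels.items():
        j = alignment.get(i)
        if j is None:
            continue  # unaligned token: nothing to project
        if role == "PRED" and en_pos[i] != de_pos[j]:
            return {}  # suspected translation shift: block the projection
        projected[j] = role
    return projected

# The predicate aligns to a verb on the target side: projection goes through.
print(filtered_project({2: "PRED", 0: "A0"}, {2: 6, 0: 2},
                       {2: "VERB", 0: "PRON"}, {6: "VERB", 2: "PRON"}))
```

Trading recall for precision in this way is what makes the subsequent bootstrap-learning step necessary.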
19. Multilingual Aliasing
• Problem: target-language frame lexicon automatically generated from alignments
  – False frames
  – Redundant frames
• Expert curation of frame mappings
[COLING’16] Multilingual Aliasing for Auto-Generating Proposition Banks
21. Effectiveness of Crowd-in-the-Loop
9% F1 improvement over SRL results — 66.4% expert efforts
10% F1 improvement over SRL results — 87.3% expert efforts
Latest: Filter Select Expert
[EMNLP’20 (Findings)] A Novel Workflow for Accurately and Efficiently Crowdsourcing Predicate Senses and Argument Labels
22. The Enterprise Challenges
Explainability: explain the rationale behind predictions and reasoning to experts
Complexity: complex documents; wide variety of tasks
Small Data: limited labeled data; even unlabeled data may not be available
Customizability: rich domain knowledge; wide spectrum of users
22
23. Targeted Users – A Simplified View
23
AI Engineer: software programming; develop and deploy AI models; understand business needs
Data Scientist: data wrangling; develop AI models; distill insights from data
Subject Matter Expert (SME): communicate business needs; data labeling; use AI models and provide feedback; create simple AI models
24. Dimensions of NLP Customization – An Overview
24
Approaches (labeling efforts from low to high; dev. efforts, constraints and errors also vary):
• Rules – declarative systems
• Simple ML – rule induction
• ML* – supervised learning; AutoML, active learning
25. Dimensions of NLP Customization – An Overview
25
26. SystemT: Declarative Text Understanding for the Enterprise
SystemT Architecture
AQL Language: specify extractor semantics declaratively
Optimizer: choose an efficient execution plan that implements the semantics
Operator Runtime
Example AQL Extractor
create view PersonPhone as
select P.name as person, N.number as phone
from Person P, PhoneNumber N, Sentence S
where
Follows(P.name, N.number, 0, 30)
and Contains(S.sentence, P.name)
and Contains(S.sentence, N.number)
and ContainsRegex(/\b(phone|at)\b/,
SpanBetween(P.name, N.number));
-- Within a single sentence: <Person> followed by <PhoneNum> within 0–30 chars,
-- with "phone" or "at" in between
Fundamental Results & Theorems
Expressivity: The class of extraction tasks expressible in AQL is a strict superset of that expressible through cascaded regular automata.
Performance: For any acyclic token-based finite state transducer T, there exists an operator graph G such that evaluating T and G has the same computational complexity.
[ACL’10] SystemT: An Algebraic Approach to Declarative Information Extraction. ibm.biz/SystemT
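For readers unfamiliar with AQL, the PersonPhone view above can be approximated in plain Python. This is a naive per-sentence sketch, not SystemT: the Person dictionary and phone pattern below are stand-in assumptions (a real system would use proper entity extractors), and there is no optimizer choosing an execution plan.

```python
import re

PERSONS = ["John", "Mary"]  # stand-in for the Person view
PERSON_RE = re.compile(r"\b(?:" + "|".join(PERSONS) + r")\b")
PHONE_RE = re.compile(r"\b\d{3}-\d{4}\b")  # stand-in for the PhoneNumber view
CLUE_RE = re.compile(r"\b(phone|at)\b")

def extract_person_phone(sentence):
    """Person followed by a phone number within 0-30 chars, with 'phone'
    or 'at' somewhere between them (mirrors Follows + ContainsRegex).
    Operating on one sentence mirrors the Contains(S.sentence, ...) joins."""
    results = []
    for p in PERSON_RE.finditer(sentence):
        for n in PHONE_RE.finditer(sentence):
            gap = n.start() - p.end()
            if 0 <= gap <= 30 and CLUE_RE.search(sentence[p.end():n.start()]):
                results.append((p.group(), n.group()))
    return results

print(extract_person_phone("Please call John at 555-1234."))  # [('John', '555-1234')]
```

The contrast with AQL is the point: here the nested-loop evaluation strategy is hard-coded, whereas the declarative view lets the optimizer pick the plan.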
27. Explainable AI for the Enterprise – SystemT: Text Understanding
27
Domain-Specific Models: AI for IT, AI for Customer Care, AI for Compliance, …
Cross-lingual Semantic Abstraction (semantic parsing, table understanding)
Syntactic Abstraction (tokenization, lemmatization, POS, etc.)
Syntactic NLP Operators (HTML operators, regular expressions, dictionaries, span operators, …)
[NAACL’18] SystemT: Declarative Text Understanding for Enterprise.
• 20+ IBM products
• 50+ papers
ibm.biz/SystemT
28. Tooling for Different Users
28
AI Engineers – full-fledged IDE: IBM InfoSphere BigInsights Text Analytics Eclipse Tooling
[ACL’12] WizIE: A Best Practices Guided Development Environment for Information Extraction
AI Engineers/Data Scientists – visual IDE: IBM Watson Knowledge Studio Advanced Rule Editor
[VLDB’15] VINERy: A Visual IDE for Information Extraction http://ibm.biz/VineryIE
[KDD’19] Declarative Text Understanding with SystemT. (hands-on tutorial)
29. Entity Extraction for Watson AIOps
29
Entity Extraction in AIOps https://www.ibm.com/cloud/blog/entity-extraction-in-aiops
30. Dimensions of NLP Customization – An Overview
30
31. Example-Driven Extraction via Pattern Induction
31
[CHI’17] SEER: Auto-Generating Information Extraction Rules from User-Specified Examples
[SIGMOD’17] Synthesizing Extraction Rules from User Examples with SEER.
[AAAI’22] InteractEva: A Simulation-based Evaluation Framework for Interactive AI Systems (to appear)
IBM Watson Discovery (Beta in Plus since Oct. 2021) http://ibm.biz/SEER_IE, https://ibm.biz/WDSPressReleaseNov
32. Example-Driven Extraction via Pattern Induction
32
33. Example-Driven Extraction via Pattern Induction
33
34. Example-Driven Extraction via Pattern Induction
34
35. Dimensions of NLP Customization – An Overview
35
36. Human & Machine Co-Creation
Labeled Data → Deep Learning → Learned Rules (Explainable) → Modify Rules → Evaluation Results → Production
Machine performs the heavy lifting to abstract out patterns
Humans verify/modify the transparent model
Evaluation & Deployment
Raises the abstraction level at which domain experts interact with the model
[EMNLP’20] Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification
37. Learning Rules with Neuro-Symbolic AI
Hides a structure-learning task inside a parameter-estimation task:
– Learnable parameters include distributions expressing which predicate is included
Network architecture encodes soft analogs of conjunction (product) and disjunction (max)
Contains a layer per predicate in the clause
Last layer aggregates scores across actions in the sentence
Multiple clauses in the DNF can be supported by adding more channels
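The soft connectives can be illustrated directly; a minimal numeric sketch (not the actual network from the paper), where a clause score is the product of its predicate scores and the DNF score is the max over clause channels:

```python
import numpy as np

def soft_and(scores):
    """Soft conjunction: product of per-predicate scores in [0, 1]."""
    return float(np.prod(scores))

def soft_or(scores):
    """Soft disjunction: max over clause-channel scores."""
    return float(np.max(scores))

# Two clause channels scored on one sentence (illustrative numbers).
clause1 = soft_and([0.9, 0.8])  # both predicates fire strongly
clause2 = soft_and([0.9, 0.1])  # second predicate barely fires
dnf_score = soft_or([clause1, clause2])
print(round(dnf_score, 2))  # 0.72
```

Because product and max are differentiable (max almost everywhere), the clause-membership parameters can be trained by gradient descent, which is what turns structure learning into parameter estimation.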
38. Human-Machine Co-Creation
38
[ACL’19] HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop.
[EMNLP’20] Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification.
Multiple customer engagements
http://ibm.biz/HEIDL_Demo
39. User Study: Human+Machine Co-Created Model Performance
User study
– 4 NLP engineers with < 2 years of experience
– 2 NLP experts with 10+ years of experience
Key takeaways
– Explanation of learned rules: the visualization tool is very effective
– Reduction in human labor: a co-created model built within 1.5 person-hours outperforms a black-box sentence classifier
– Lower requirement on human expertise: the co-created model is on par with the model created by super-experts
[ACL’19] HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop.
[EMNLP’20] Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification.
40. Dimensions of NLP Customization – An Overview
40
41. AutoAI for Text (AutoText)
[AAAI’21] AutoText: An End-to-End AutoAI Framework for Text. (demo) 41
42. Easy to Use UI: No-Code
42
[AAAI’21] AutoText: An End-to-End AutoAI Framework for Text. (demo)
46. Initial Success Stories: Academic Benchmarks
46
Problem – High barrier: years of training and days of model building & tuning
Solution – No code: simply load datasets into the UI
Results – SOTA or better
Text classification: TREC-6
Sentiment analysis: SST1, SST2
48. Initial Success Stories: Watson NLP Production Model for Text Classification
48
Problem – Expensive: manual weight tuning & classifier selection for the OOB ensemble model
Solution – Automatic weight tuning and classifier selection
Results – >10x speed-up in training at comparable quality, via auto weight tuning and HPO; >30% reduction in combined training and prediction time, via classifier selection
49. Easy to Use UI: No-Code
49
[AAAI’21] AutoText: An End-to-End AutoAI Framework for Text. (demo)
53. SLEUTH: System for Learning to Understand Text with Human-in-the-loop
53
Continuous feedback between model training and SME input
• No need to label 1000 data points to get a first impression of the model’s performance
• Early identification of concept drift and ill-defined classes (garbage in, garbage out)
54. Ongoing Work
Provide a unified and empowering user experience while taking advantage of a wide range of techniques to help users build the best models for their own use cases.
54
SLEUTH: System for Learning to Understand Text with Human-in-the-loop
55. The Enterprise Challenges
Complexity: complex documents; wide variety of tasks
Small Data: limited labeled data; even unlabeled data may not be available
Customizability: rich domain knowledge; wide spectrum of users
Explainability: explain the rationale behind predictions and reasoning to experts
55
56. Who Needs to Know What, When?
56
AI lifecycle touchpoints, the audience at each (whom the AI model interfaces with), and their explainability motivations (information needs):
Initial model building
– Audience: model developers, data scientists, product managers
– Motivations: peeking inside models to understand their inner workings; improving model design (e.g., how the model should be retrained or re-tuned); selecting the right model
Model validation during proof-of-concept
– Audience: domain experts, business owners, business IT operations
– Motivations: characteristics of data (proprietary, public, training data); understanding model design; ensuring ethical model development
Model in-production
– Audience: model developers, data scientists | technical strategists, product managers | design teams, business owners/users, business IT operations
– Motivations: expectation mismatch; augmenting business workflow and business actionability
[DIS’21] Who Needs to Know What, When?: Broadening the Explainable AI (XAI) Design Space by Looking at Explanations Across the AI Lifecycle.
57. Who Needs to Know What, When?
57
58. Who Needs to Know What, When?
58
60. Who Needs to Know What, When?
60
61. Learning Explainable Models with Low Resources
Ensemble maintains quality
Natural language generation offers explainability for SMEs
Only scarce labeled data is used for the entire process
Labeled Data (Scarce) → Transfer Learning (pre-trained language embeddings) → Unlabeled Data Augmented with Weak Labels → Deep Learning → Explainable Model (Rules) → Ensemble
Human-machine co-creation → Natural language explanations for predictions
62. Hybrid Models
Transferred BERT-based model (trained on only 7% of data!):
                           Precision  Recall  F1
Production                 82%        57%     67%
BERT-based                 50%        84%     62%
BERT-based → HEIDL         77%        62%     68%
BERT-based → HEIDL precision is within 5% of the production model’s.
Ensemble:
                           Precision  Recall  F1   Expl.
Production                 79%        68%     73%  89.5
Rules-first-BERT-fallback  59%        91%     71%  51.4
Recall far exceeds anything else; precision almost matches the production model’s.
Trade-off axes: ease of development (E) vs. explainability (X) — hand-crafted ESSP rules vs. BERT-based model → HEIDL.
63. Who Needs to Know What, When?
63
64. Transparent Linguistic Models for Contract Understanding
64
[NAACL’21] Development of an Enterprise-Grade Contract Understanding System, (Industry Track) Watson Discovery Content Intelligence
65. Transparent Model Design
Element: “Purchaser will purchase the Assets by a cash payment.”
[NAACL-NLLP’19] Transparent Linguistic Models for Contract Understanding and Comparison
[NAACL’21] Development of an Enterprise-Grade Contract Understanding System https://www.ibm.com/cloud/compare-and-comply
66. Transparent Model Design
Element: “Purchaser will purchase the Assets by a cash payment.”
Core NLP understanding:
[Purchaser]ARG0 [will]TENSE-FUTURE purchase [the Assets]ARG1 [by a cash payment]ARGM-MNR
Core NLP Primitives & Operators provided by SystemT [ACL’10, NAACL’18]
Semantic NLP Primitives
[NAACL-NLLP’19] Transparent Linguistic Models for Contract Understanding and Comparison
[NAACL’21] Development of an Enterprise-Grade Contract Understanding System https://www.ibm.com/cloud/compare-and-comply
67. Transparent Model Design
Element: “Purchaser will purchase the Assets by a cash payment.”
[Purchaser]ARG0 [will]TENSE-FUTURE purchase [the Assets]ARG1 [by a cash payment]ARGM-MNR
Legal Domain LLEs:
LLE1: PREDICATE ∈ DICT(Business-Transaction) ∧ TENSE = Future ∧ POLARITY = Positive → NATURE = Obligation ∧ PARTY = ARG0
LLE2: …
Domain-specific concepts: business-transaction verbs in future tense with positive polarity
Core NLP Primitives & Operators; Semantic NLP Primitives
https://www.ibm.com/cloud/compare-and-comply
[NAACL-NLLP’19] Transparent Linguistic Models for Contract Understanding and Comparison
[NAACL’21] Development of an Enterprise-Grade Contract Understanding System
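LLE1 above is just a conjunctive test over the semantic primitives; a hypothetical sketch (the dictionary contents and function signature are our own illustration, not the product’s API):

```python
# Illustrative business-transaction dictionary (the DICT in LLE1).
BUSINESS_TRANSACTION_VERBS = {"purchase", "buy", "sell", "pay"}

def lle1(predicate, tense, polarity, args):
    """LLE1: business-transaction predicate, future tense, positive polarity
    -> NATURE = Obligation, PARTY = ARG0."""
    if (predicate in BUSINESS_TRANSACTION_VERBS
            and tense == "Future"
            and polarity == "Positive"):
        return {"NATURE": "Obligation", "PARTY": args.get("ARG0")}
    return None

# "Purchaser will purchase the Assets by a cash payment."
print(lle1("purchase", "Future", "Positive",
           {"ARG0": "Purchaser", "ARG1": "the Assets"}))
# {'NATURE': 'Obligation', 'PARTY': 'Purchaser'}
```

The transparency claim follows directly: every output can be traced back to which dictionary entry and which feature values fired.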
68. Transparent Model Design
Element: “Purchaser will purchase the Assets by a cash payment.”
[Purchaser]ARG0 [will]TENSE-FUTURE purchase [the Assets]ARG1 [by a cash payment]ARGM-MNR
Model output – Nature/Party: Obligation for Purchaser
LLE1: PREDICATE ∈ DICT(Business-Transaction) ∧ TENSE = Future ∧ POLARITY = Positive → NATURE = Obligation ∧ PARTY = ARG0
LLE2: …
Legal Domain LLEs; Domain-specific concepts; Core NLP Primitives & Operators; Semantic NLP Primitives
https://www.ibm.com/cloud/compare-and-comply
[NAACL-NLLP’19] Transparent Linguistic Models for Contract Understanding and Comparison
[NAACL’21] Development of an Enterprise-Grade Contract Understanding System
69. ModelLens: Visual Interactive Tool for Model Improvement
Scalable: through summarization and integrated AI algorithms
Customizable: allows adaptation to different models
Systematic: offers a systematic way of improving models
1. Acquire an overview of errors (identify heavy hitters)
2. Drill down into individual errors
3. Get context about each error (text, location in input document, model provenance, …)
4. Record error root cause
[CSCW’19] ModelLens: An Interactive System to Support the Model Improvement Practices of Data Science Teams.
Currently used by multiple product and research teams
70. Systematic Model Improvement
70
Error analysis: e.g., “$100 is due at signing” is not classified as <Obligation>
Root cause identification: e.g., <Currency> is not considered in the model
Model improvement: e.g., include <Currency> as a feature, or introduce entity-aware attention
[CSCW’19] ModelLens: An Interactive System to Support the Model Improvement Practices of Data Science Teams.
Currently used by multiple product and research teams
71. XNLP – Interactively Explore the Literature on XNLP
71
https://xainlp2020.github.io/xainlp/home
List view: lists the set of papers in a table
Search view: keyword search and faceted search
[KDD’21] Explainability for Natural Language Processing. (tutorial)
[IUI’21] XNLP: A Living Survey for XAI Research in Natural Language Processing. (demo)
[AACL’20] A Survey of the State of Explainable AI for Natural Language Processing.
[AACL’20] Explainability for Natural Language Processing. (tutorial)
[IUI’20] XAIT: An Interactive Website for Explainable AI for Text. (demo)
72. XNLP – Interactively Explore the Literature on XNLP
72
Tree view: categorizes papers in a tree-like structure
Cluster view: groups papers based on explainability and visualization
Citation graph: the evolution of the field and influential works
https://xainlp2020.github.io/xainlp/home
[KDD’21] Explainability for Natural Language Processing. (tutorial)
[IUI’21] XNLP: A Living Survey for XAI Research in Natural Language Processing. (demo)
[AACL’20] A Survey of the State of Explainable AI for Natural Language Processing.
[AACL’20] Explainability for Natural Language Processing. (tutorial)
[IUI’20] XAIT: An Interactive Website for Explainable AI for Text. (demo)
73. Summary
73
Credit: Simon Sinek, 2010
WHY: Why do we need to build this NLP model?
HOW: How will it be used and evaluated?
WHAT: What is the best way to build it?
76. Human-in-the-Loop Throughout the Entire Life Cycle
76
Data Labeling → Model Development → Deployment + Feedback
Scale data labeling with auto-generation + crowd-in-the-loop
Curb data hunger with transfer learning + active learning
Build explainable models directly via an IDE
Human-machine co-creation
Scale model building with AutoML
• End user provides feedback
• Feedback influences the entire AI life cycle
Editor's Notes
Good morning everyone. Today I’m going to talk about our stories on taming the wild west of NLP.
Today’s NLP landscape looks just like the Wild West. There are many players in the wild. As you have seen, the number of peer-reviewed AI publications is increasing fast, as is the number of publications at top NLP conferences.
Similarly, the global NLP market is growing fast, at an annual rate of 19.7%. Meanwhile, the market is highly fragmented.
So what are the unique challenges for NLP in the wild?
First let me briefly introduce semantic role labeling. SRL is a fundamental NLP task that aims to recover the predicate-argument structure of an input sentence. Put simply, given a sentence, it tells you who did what to whom, when, where and how. For example, you find that John is the “orderer” (the who), who did something, namely “ordered”, and the something is the “thing ordered” (the what). A0, A1 and so on are just semantic labels. We follow the PropBank formalism in this presentation.
* Construct interpretable domain-specific models based on the abstraction, with learning & reasoning + HIL tooling + DSL. Capture understanding of text in a language-agnostic abstraction in a DSL.
Build SOTA models + HIL tooling to provide such abstraction with language and domain adaptability
Relevant hyperparameters:
Length of each clause
Length of the DNF
Although the system is still undergoing development and improvement, we can already report some initial success stories. First, we have shown that over multiple academic benchmarks for text classification and sentiment analysis, AutoAI for Text overcomes the high barrier to entry in model building by providing a no-code solution that produces models whose quality is comparable to or better than the state of the art.
We have also successfully deployed AutoAI for Text to help our Watson NLP product team build better out-of-the-box models. In this case, we are given a pre-existing production model for text classification based on an ensemble of multiple base classifiers that are manually combined and tuned. The dual goal here is 1) to replace the expensive manual step of weight tuning with an automatic one, and 2) to determine whether all the classifiers are needed and, if not, to select an appropriate subset of the base classifiers that does the job. Our solution automates both steps. As a result, we obtained a more than 10x speed-up in training without any compromise in quality. We also obtained a more than 30% reduction in combined training and prediction time, via a new meta-learning algorithm for classifier selection.
Explain that SME input does not need to be limited to labeling (the user could also answer additional types of questions asked by the system)
We have walked through in detail the different ways model explanations are sought and offered.
We capture these findings in a visual representation here, with the three high-level areas of model building, proof of concept and model in-production. We include the primary audience and their explanation needs throughout the AI model’s lifecycle.
To recap, the initial model development phase relies heavily on explanations to peek inside AI models to understand their inner workings, improve their designs and ultimately select the ideal one for further stages.
Proof-of-concept demonstration relies heavily on communication about the data on which the model was trained, understanding the high-level mechanics of the AI model and ensuring its design meets ethical and regulatory standards.
When the model is in production, explainability is triggered when expectations are violated. Furthermore, to use an AI model in their main workflow, business stakeholders need assurance that the explanations it generates can prompt business actionability.
We take the case of HEIDL next (covered previously in part 2), to examine how explainability is an integral part of its design.
Demo here
Now I am going to explain how we have designed a model on top of this stack.
The first step is to identify the elements in the contract.
For example, in this contract, "Purchaser will purchase the Assets by a cash payment" is one of the elements.
Each element is then analyzed using the NLP primitives.
For example, the element here is analyzed using the semantic layer, and its arguments are identified ("Purchaser", "the Assets", and "by a cash payment"). The tense is identified as future.
After this step, we have developed LLEs that capture domain knowledge.
Let me walk you through this: in the first step, the predicate is compared with a list of business-transaction verbs, its tense is checked to be future, and its polarity is checked to be positive.
If all the conditions are satisfied for an element, the LLE assigns the corresponding label to the element. In this case, the element is an obligation for the Purchaser.
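As a rough illustration of how such an LLE-style rule could look in code, here is a minimal sketch. The class and names (`SemanticFrame`, the argument-role keys, `BUSINESS_TRANSACTION_VERBS`) are assumptions for illustration, not the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical lexicon of business-transaction verbs (illustrative only).
BUSINESS_TRANSACTION_VERBS = {"purchase", "pay", "deliver", "sell"}

@dataclass
class SemanticFrame:
    """Output of the semantic layer for one contract element (assumed shape)."""
    predicate: str                        # lemma of the main verb
    tense: str                            # e.g. "future", "past", "present"
    polarity: str                         # "positive" or "negative"
    arguments: dict = field(default_factory=dict)  # role -> argument text

def label_element(frame: SemanticFrame) -> Optional[str]:
    """LLE-style rule: business verb + future tense + positive polarity
    => the element is an obligation for the agent argument."""
    if (frame.predicate in BUSINESS_TRANSACTION_VERBS
            and frame.tense == "future"
            and frame.polarity == "positive"):
        agent = frame.arguments.get("A0", "party")
        return f"Obligation({agent})"
    return None

# The running example: "Purchaser will purchase the Assets by a cash payment"
frame = SemanticFrame(
    predicate="purchase",
    tense="future",
    polarity="positive",
    arguments={"A0": "Purchaser", "A1": "the Assets", "AM-MNR": "by a cash payment"},
)
print(label_element(frame))  # Obligation(Purchaser)
```

The rule only fires when every condition holds, so past-tense or negated statements pass through unlabeled.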
Here we show a few screenshots taken from the website. Due to limited time, we won't show a live demo, but we encourage the interested audience to explore the website at their own pace.