Pre-training Text-to-Text Transformers
for Concept-centric Common Sense
2021
Wangchunshu Zhou*, Dong-Ho Lee*, Ravi Kiran Selvam,
Seyeon Lee, Bill Yuchen Lin, Xiang Ren
* equal contribution
Commonsense Reasoning with PTLMs
What do you fill with ink to write
notes on a piece of copy paper ?
(A) Fountain pen
(B) Pencil case
(C) Printer
(D) Notepad
Current PTLMs fail to reason with concept-centric knowledge:
Base model answers : pencil case
Large model answers : printer
Current PTLMs
… Copy paper is thinner than printer
paper, which doesn’t make a huge
difference when you’re printing text, but
it does when you’re printing large
images. Images require a lot of ink and
because copy paper has a thinner
structure, the ink will need to spread out
more for the paper to absorb it all. ...
The model may be sensitive to the co-occurrence of (ink, copy, paper).
Pre-training corpus → PTLMs (Text Infilling / MLM)
Base : pencil case
Large : printer
How can we teach PTLMs to
write and reason with commonsense concepts?
Concepts in the question : fill, ink, write, copy paper
Answer : Fountain pen
Our idea : Novel Self-supervised Objectives to improve
common sense reasoning ability.
Text-to-Text Transformer
Input : Generate a sentence with the following concepts : Hold Woman Position
Output : She was the first woman to hold the position.
Generative Objective : Concept-to-Sentence Generation (C2S)
Ask the model to recover the original sentence
given only a few unordered keywords of the sentence.
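A minimal sketch of how one C2S training pair could be built, assuming the concepts for a sentence are already extracted. The function name `make_c2s_example` and the variable names are hypothetical; only the prompt wording comes from the slide.

```python
import random

# Prompt wording taken from the slide; everything else is illustrative.
PROMPT = "Generate a sentence with the following concepts : "

def make_c2s_example(sentence, concepts, rng):
    """Return a (source, target) text pair for concept-to-sentence generation."""
    shuffled = list(concepts)
    rng.shuffle(shuffled)              # the model sees the concepts unordered
    source = PROMPT + " ".join(shuffled)
    return source, sentence            # the target is the original sentence

sentence = "She was the first woman to hold the position."
source, target = make_c2s_example(
    sentence, ["woman", "hold", "position"], random.Random(0)
)
print(source)
print(target)
```

Because the pair is derived from raw text alone, the objective stays self-supervised: no labels beyond the sentence itself are needed.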
Input : Correct the order of the following sentence : Tree grows on the apple
Output : Apple grows on the tree
Generative Objective : Concept Order Recovering (COR)
Ask the model to recover the original sentence
given a sentence whose concepts have been shuffled.
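A minimal sketch of the COR corruption, assuming concepts are single lowercase words that appear in the sentence. The helper `make_cor_example` is hypothetical, and the regex tokenizer is a simplification (it would split contractions such as "don't").

```python
import random
import re

# Prompt wording taken from the slide; the rest is illustrative.
PROMPT = "Correct the order of the following sentence : "

def make_cor_example(sentence, concepts, rng):
    """Permute the concept words among their own positions; keep all other tokens."""
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    slots = [i for i, tok in enumerate(tokens) if tok.lower() in concepts]
    original = [tokens[i] for i in slots]
    if len(set(original)) < 2:
        raise ValueError("need at least two distinct concepts to reorder")
    permuted = original[:]
    while permuted == original:        # make sure the order really changes
        rng.shuffle(permuted)
    for i, tok in zip(slots, permuted):
        tokens[i] = tok
    # Re-join tokens and remove the space the join put before punctuation.
    corrupted = re.sub(r"\s+([^\w\s])", r"\1", " ".join(tokens))
    return PROMPT + corrupted, sentence

sentence = "She was the first woman to hold the position."
source, target = make_cor_example(
    sentence, {"woman", "hold", "position"}, random.Random(0)
)
print(source)
print(target)
```

Only the concept slots change, so the model has to rely on which concept fits which role rather than on surface word order.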
Input : Which sentence is correct ? : (1) Tree grows on the apple (2) Apple grows on the tree
Output : Apple grows on the tree
Discriminative Objective : Generative QA
Ask the model to distinguish the real sentence from a concept-distracted sentence.
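A minimal sketch of how a Generative QA pair could be formatted as text-to-text, assuming a true sentence and a distractor are already available. The function `make_qa_example` is hypothetical, and randomizing the option order is an assumption made here so that position carries no signal.

```python
import random

# Prompt wording taken from the slide; the rest is illustrative.
PROMPT = "Which sentence is correct ? : "

def make_qa_example(truth, distractor, rng):
    """Return (source, target): both candidates in the source, the truth as target."""
    options = [truth, distractor]
    rng.shuffle(options)               # assumption: randomize which option comes first
    numbered = " ".join(f"({i}) {s}" for i, s in enumerate(options, 1))
    return PROMPT + numbered, truth

truth = "Apple grows on the tree"
distractor = "Tree grows on the apple"
source, target = make_qa_example(truth, distractor, random.Random(0))
print(source)
print(target)
```

The target is the full sentence rather than an option index, which keeps the discriminative task in the same generate-text format as C2S and COR.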
CALM : Concept-Aware Language Model
(1) Given an input sentence x (“She was the first woman to hold the position.”),
extract the concept set C (woman, hold, position).
(2) Given x and C, produce a corrupted source sentence x’ for either C2S or COR.
(3) The generator, trained with C2S and COR, recovers x’ into a distractor x”.
(4) The discriminator is trained to distinguish the true sentence x from the distractor x”.
Original sentence x : She was the first woman to hold the position.
Concept set C : (woman, hold, position)
C2S input x’ : hold, woman, position
COR input x’ : She was the first position to hold the woman.
Generator output x” (C2S) : Woman holds the position.
Generator output x” (COR) : She was the first woman to position the hold.
Discriminator : Woman holds the position. (distractor) vs. She was the first woman to hold the position. (true sentence)
Weight Sharing (generator and discriminator)
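The data flow of the steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `build_discriminator_example` and `toy_generator` are hypothetical names, only the C2S corruption is shown for brevity, and the toy generator is a stand-in for a trained text-to-text model.

```python
import random

# Prompt wordings taken from the slides; the rest is illustrative.
C2S_PROMPT = "Generate a sentence with the following concepts : "
QA_PROMPT = "Which sentence is correct ? : "

def build_discriminator_example(sentence, concepts, generator, rng):
    """Run steps (2) to (4) for one sentence and its already-extracted concepts."""
    # Step (2): corrupt x into x' by keeping only the shuffled concepts.
    shuffled = list(concepts)
    rng.shuffle(shuffled)
    corrupted = C2S_PROMPT + " ".join(shuffled)
    # Step (3): the generator's attempted recovery of x' becomes the distractor x''.
    distractor = generator(corrupted)
    # Step (4): present truth and distractor in random order; the target is the truth.
    options = [sentence, distractor]
    rng.shuffle(options)
    numbered = " ".join(f"({i}) {s}" for i, s in enumerate(options, 1))
    return QA_PROMPT + numbered, sentence, distractor

def toy_generator(source):
    # Hypothetical stand-in for the trained model: echoes the concepts as a "sentence".
    return source.split(" : ", 1)[1].capitalize() + "."

sentence = "She was the first woman to hold the position."
source, target, distractor = build_discriminator_example(
    sentence, ["woman", "hold", "position"], toy_generator, random.Random(0)
)
print(source)
print(target)
```

With weight sharing, the same model would play both roles: it produces the distractor in step (3) and is then trained on the pair built in step (4).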
Does CALM reason with concepts ? Yes !
Experimental results on commonsense reasoning datasets :
CALM consistently and significantly outperforms
the backbone T5-base model.
Effective in large models.
Performance is consistent with a large model
and across different fractions of the dataset.
(Figure) : Performance of compared models
fine-tuned with different fractions of the dataset
Does CALM write with concepts ? Yes !
(Left) : Comparison between PTLMs
(Below) : Comparison of sentences generated
from the same concept-set
Summary
● Novel self-supervised strategies for concept-centric Common Sense
○ Concept to Sentence
○ Concept Order Recovering
○ Generative QA
● Two-stage training strategy
○ Generator and Discriminator
Text-to-text models can be pre-trained with better parameter and sample efficiency
by using carefully designed self-supervised objectives that
focus on the abilities required by the target tasks.
