SlideShare a Scribd company logo
1 of 15
Contextual Definition
Generation
Jeffrey Yarbro, Andrew Olney
Institute for Intelligent Systems
University of Memphis
Introduction
• This paper explores the idea of generating contextual definitions
for words using a deep-learning model. It does this by accepting a
word and a context for that word and then autoregressively
generating a definition to match the specific context.
Overview
• Created a new dataset with definition and context pairs.
• Trained a GPT-2 model on dataset
• Evaluated the model with human raters
Motivation for work
• Approximately 98% of words must be within a reader’s vocabulary for
optimal reading comprehension to occur.
• Textbooks often attempt to make up for potential vocabulary gaps by
defining key-terms.
• Problems:
• Reader is required to stop reading and lookup definition
• Limited number of terms defined
• Term may have multiple definitions
Motivation for work (cont.)
• Modern software can make the process easier.
• Can use search engine
• Newer tools allow reader to highlight word and have the definition appear in
a pop-up.
• Problems:
• Definitions may be vague and not adequately fit the context.
• Word may have a long list of definitions.
• If the word has multiple definitions, you must pick the most appropriate one.
Data Collection
• All data was required to have definitions and a labeled context paired
with that definition.
• With this in mind, we collected data from the following sources:
• Lexico
• Wikipedia
• Wiktionary
• Wordnet
Data Collection (cont.)
Source:
Dataset:
Definition Modification
• Some definitions contained low information.
• We attempt to expand these definitions by using regular expressions, parts of
speech tags, and word frequency to find the key reference word.
• We then choose the most fitting definition by using word vectors and comparing
each definition for the reference word (e.g., “country”) with the context and
choose the most similar one by performing cosine similarity.
Model
• GPT-2 is an autoregressive model that uses the decoding blocks of the
transformer architecture.
1. Animation sourced from The Illustrated GPT-2 written by Jay Alammar
Model (cont.)
• Trained the model for 1 epoch
• Used GPT-2 Large: 774 parameter model.
• Two special tokens: <CONTEXT> and <DEFINITION>
Human Evaluation
• Posted survey on CloudResearch. Which sources high quality
participants on Mechanical Turks.
• Allowed participants to choose what topic they wanted to evaluate.
The topics available were from the following subjects:
• American Government
• Anatomy and Physiology
• Astronomy
• Psychology
• Three different surveys for the following context types:
1. Model-generated Short-context: Term used in a sentence
2. Model-generated Long-context: Term used in a sentence along with both
the prior and following sentence.
3. Human-generated: Definitions from the training dataset.
• Raters evaluated 50 questions each.
Survey Format
Results
• Short-context performed significantly better than long-context in terms of accuracy (𝑝 = 0.045). We
speculate the reason for this has to do with the training data containing far more shorter-contexts than
long.
• Real definitions performed significantly better than both short-context (𝑝 < 0.001).
• There were no significant differences between fluency.
• Topic was trending but not significant
Short-Context vs Human-Generated Density Plots
Problems with model
• Too much fluctuation depending on context.
• Trouble interpreting some contexts.
• Some tendency to memorize definitions
Q&A

More Related Content

What's hot

Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Parang Saraf
 
Mining Product Reputations On the Web
Mining Product Reputations On the WebMining Product Reputations On the Web
Mining Product Reputations On the Webfeiwin
 
Concurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsConcurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsParang Saraf
 
Real Time Competitive Marketing Intelligence
Real Time Competitive Marketing IntelligenceReal Time Competitive Marketing Intelligence
Real Time Competitive Marketing Intelligencefeiwin
 
Ran zhou poster 2018
Ran zhou poster 2018Ran zhou poster 2018
Ran zhou poster 2018Ran Zhou
 
Survey of natural language processing(midp2)
Survey of natural language processing(midp2)Survey of natural language processing(midp2)
Survey of natural language processing(midp2)Tariqul islam
 
Interactive Analysis of Word Vector Embeddings
Interactive Analysis of Word Vector EmbeddingsInteractive Analysis of Word Vector Embeddings
Interactive Analysis of Word Vector Embeddingsgleicher
 
Report
ReportReport
Reportbutest
 
Question Answering for Machine Reading Evaluation on Romanian and English
Question Answering for Machine Reading Evaluation on Romanian and EnglishQuestion Answering for Machine Reading Evaluation on Romanian and English
Question Answering for Machine Reading Evaluation on Romanian and EnglishFaculty of Computer Science
 
NAACL2015 presentation
NAACL2015 presentationNAACL2015 presentation
NAACL2015 presentationHan Xu, PhD
 
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTION
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTIONTEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTION
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTIONijistjournal
 
Feature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documentsFeature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documentsIJECEIAES
 
Machine translation course program (in English)
Machine translation course program (in English)Machine translation course program (in English)
Machine translation course program (in English)Dmitry Kan
 

What's hot (20)

Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
 
Mining Product Reputations On the Web
Mining Product Reputations On the WebMining Product Reputations On the Web
Mining Product Reputations On the Web
 
Concurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsConcurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector Representations
 
Real Time Competitive Marketing Intelligence
Real Time Competitive Marketing IntelligenceReal Time Competitive Marketing Intelligence
Real Time Competitive Marketing Intelligence
 
Ran zhou poster 2018
Ran zhou poster 2018Ran zhou poster 2018
Ran zhou poster 2018
 
Survey of natural language processing(midp2)
Survey of natural language processing(midp2)Survey of natural language processing(midp2)
Survey of natural language processing(midp2)
 
Mobile Computing
Mobile ComputingMobile Computing
Mobile Computing
 
Interactive Analysis of Word Vector Embeddings
Interactive Analysis of Word Vector EmbeddingsInteractive Analysis of Word Vector Embeddings
Interactive Analysis of Word Vector Embeddings
 
Lect06
Lect06Lect06
Lect06
 
Report
ReportReport
Report
 
Data wrangling week 9
Data wrangling week 9Data wrangling week 9
Data wrangling week 9
 
Question Answering for Machine Reading Evaluation on Romanian and English
Question Answering for Machine Reading Evaluation on Romanian and EnglishQuestion Answering for Machine Reading Evaluation on Romanian and English
Question Answering for Machine Reading Evaluation on Romanian and English
 
NAACL2015 presentation
NAACL2015 presentationNAACL2015 presentation
NAACL2015 presentation
 
Text categorization
Text categorizationText categorization
Text categorization
 
mlss
mlssmlss
mlss
 
Meta-Learning Presentation
Meta-Learning PresentationMeta-Learning Presentation
Meta-Learning Presentation
 
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTION
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTIONTEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTION
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTION
 
Query expansion
Query expansionQuery expansion
Query expansion
 
Feature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documentsFeature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documents
 
Machine translation course program (in English)
Machine translation course program (in English)Machine translation course program (in English)
Machine translation course program (in English)
 

Similar to Contextual Definition Generation

Dice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank TalkDice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank TalkSimon Hughes
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...RajkiranVeluri
 
Using a keyword extraction pipeline to understand concepts in future work sec...
Using a keyword extraction pipeline to understand concepts in future work sec...Using a keyword extraction pipeline to understand concepts in future work sec...
Using a keyword extraction pipeline to understand concepts in future work sec...Kai Li
 
ML slide share.pptx
ML slide share.pptxML slide share.pptx
ML slide share.pptxGoodReads1
 
Personalized Search and Job Recommendations - Simon Hughes, Dice.com
Personalized Search and Job Recommendations - Simon Hughes, Dice.comPersonalized Search and Job Recommendations - Simon Hughes, Dice.com
Personalized Search and Job Recommendations - Simon Hughes, Dice.comLucidworks
 
6.domain extraction from research papers
6.domain extraction from research papers6.domain extraction from research papers
6.domain extraction from research papersEditorJST
 
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Lucidworks
 
Keyword_extraction.pptx
Keyword_extraction.pptxKeyword_extraction.pptx
Keyword_extraction.pptxBiswarupDas18
 
Tutorial on Coreference Resolution
Tutorial on Coreference Resolution Tutorial on Coreference Resolution
Tutorial on Coreference Resolution Anirudh Jayakumar
 
Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsParisa Niksefat
 
Natural Language Generation / Stanford cs224n 2019w lecture 15 Review
Natural Language Generation / Stanford cs224n 2019w lecture 15 ReviewNatural Language Generation / Stanford cs224n 2019w lecture 15 Review
Natural Language Generation / Stanford cs224n 2019w lecture 15 Reviewchangedaeoh
 
Natural Language Processing Advancements By Deep Learning - A Survey
Natural Language Processing Advancements By Deep Learning - A SurveyNatural Language Processing Advancements By Deep Learning - A Survey
Natural Language Processing Advancements By Deep Learning - A SurveyAkshayaNagarajan10
 
DAA Mini Project.pptx
DAA Mini Project.pptxDAA Mini Project.pptx
DAA Mini Project.pptxAkashDudhane4
 
DAA Mini Project.pptx
DAA Mini Project.pptxDAA Mini Project.pptx
DAA Mini Project.pptxAkashDudhane4
 
A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...
A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...
A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...idescitation
 
An Automatic Question Paper Generation : Using Bloom's Taxonomy
An Automatic Question Paper Generation : Using Bloom's   TaxonomyAn Automatic Question Paper Generation : Using Bloom's   Taxonomy
An Automatic Question Paper Generation : Using Bloom's TaxonomyIRJET Journal
 
Dr.saleem gul assignment summary
Dr.saleem gul assignment summaryDr.saleem gul assignment summary
Dr.saleem gul assignment summaryJaved Riza
 
A Gentle Introduction to Text Analysis I
A Gentle Introduction to Text Analysis IA Gentle Introduction to Text Analysis I
A Gentle Introduction to Text Analysis IUNCResearchHub
 

Similar to Contextual Definition Generation (20)

Dice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank TalkDice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank Talk
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...
 
Using a keyword extraction pipeline to understand concepts in future work sec...
Using a keyword extraction pipeline to understand concepts in future work sec...Using a keyword extraction pipeline to understand concepts in future work sec...
Using a keyword extraction pipeline to understand concepts in future work sec...
 
ML slide share.pptx
ML slide share.pptxML slide share.pptx
ML slide share.pptx
 
Final presentation
Final presentationFinal presentation
Final presentation
 
Personalized Search and Job Recommendations - Simon Hughes, Dice.com
Personalized Search and Job Recommendations - Simon Hughes, Dice.comPersonalized Search and Job Recommendations - Simon Hughes, Dice.com
Personalized Search and Job Recommendations - Simon Hughes, Dice.com
 
6.domain extraction from research papers
6.domain extraction from research papers6.domain extraction from research papers
6.domain extraction from research papers
 
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
 
Keyword_extraction.pptx
Keyword_extraction.pptxKeyword_extraction.pptx
Keyword_extraction.pptx
 
Tutorial on Coreference Resolution
Tutorial on Coreference Resolution Tutorial on Coreference Resolution
Tutorial on Coreference Resolution
 
Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation Outputs
 
Natural Language Generation / Stanford cs224n 2019w lecture 15 Review
Natural Language Generation / Stanford cs224n 2019w lecture 15 ReviewNatural Language Generation / Stanford cs224n 2019w lecture 15 Review
Natural Language Generation / Stanford cs224n 2019w lecture 15 Review
 
Natural Language Processing Advancements By Deep Learning - A Survey
Natural Language Processing Advancements By Deep Learning - A SurveyNatural Language Processing Advancements By Deep Learning - A Survey
Natural Language Processing Advancements By Deep Learning - A Survey
 
DAA Mini Project.pptx
DAA Mini Project.pptxDAA Mini Project.pptx
DAA Mini Project.pptx
 
DAA Mini Project.pptx
DAA Mini Project.pptxDAA Mini Project.pptx
DAA Mini Project.pptx
 
A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...
A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...
A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...
 
An Automatic Question Paper Generation : Using Bloom's Taxonomy
An Automatic Question Paper Generation : Using Bloom's   TaxonomyAn Automatic Question Paper Generation : Using Bloom's   Taxonomy
An Automatic Question Paper Generation : Using Bloom's Taxonomy
 
Dr.saleem gul assignment summary
Dr.saleem gul assignment summaryDr.saleem gul assignment summary
Dr.saleem gul assignment summary
 
A Gentle Introduction to Text Analysis I
A Gentle Introduction to Text Analysis IA Gentle Introduction to Text Analysis I
A Gentle Introduction to Text Analysis I
 
Text Mining
Text MiningText Mining
Text Mining
 

More from Sergey Sosnovsky

Harnessing Textbooks for High-Quality Labeled Data: An Approach to Automatic ...
Harnessing Textbooks for High-Quality Labeled Data: An Approach to Automatic ...Harnessing Textbooks for High-Quality Labeled Data: An Approach to Automatic ...
Harnessing Textbooks for High-Quality Labeled Data: An Approach to Automatic ...Sergey Sosnovsky
 
Toward Eliminating Hallucinations: GPT-based Explanatory AI for Intelligent T...
Toward Eliminating Hallucinations: GPT-based Explanatory AI for Intelligent T...Toward Eliminating Hallucinations: GPT-based Explanatory AI for Intelligent T...
Toward Eliminating Hallucinations: GPT-based Explanatory AI for Intelligent T...Sergey Sosnovsky
 
Layout- and Activity-based Textbook Modeling for Automatic PDF Textbook Extra...
Layout- and Activity-based Textbook Modeling for Automatic PDF Textbook Extra...Layout- and Activity-based Textbook Modeling for Automatic PDF Textbook Extra...
Layout- and Activity-based Textbook Modeling for Automatic PDF Textbook Extra...Sergey Sosnovsky
 
Exploring the Content Ecosystem of the First Open-source Adaptive Tutor and i...
Exploring the Content Ecosystem of the First Open-source Adaptive Tutor and i...Exploring the Content Ecosystem of the First Open-source Adaptive Tutor and i...
Exploring the Content Ecosystem of the First Open-source Adaptive Tutor and i...Sergey Sosnovsky
 
Advancing Intelligent Textbooks with Automatically Generated Practice: A Larg...
Advancing Intelligent Textbooks with Automatically Generated Practice: A Larg...Advancing Intelligent Textbooks with Automatically Generated Practice: A Larg...
Advancing Intelligent Textbooks with Automatically Generated Practice: A Larg...Sergey Sosnovsky
 
Creating Session Data from eTextbook Event Streams
Creating Session Data from eTextbook Event StreamsCreating Session Data from eTextbook Event Streams
Creating Session Data from eTextbook Event StreamsSergey Sosnovsky
 
Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions ...
Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions ...Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions ...
Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions ...Sergey Sosnovsky
 
Interactions of reading and assessment activities
Interactions of reading and assessment activitiesInteractions of reading and assessment activities
Interactions of reading and assessment activitiesSergey Sosnovsky
 
Parallel Construction: A Parallel Corpus Approach for Automatic Question Gene...
Parallel Construction: A Parallel Corpus Approach for Automatic Question Gene...Parallel Construction: A Parallel Corpus Approach for Automatic Question Gene...
Parallel Construction: A Parallel Corpus Approach for Automatic Question Gene...Sergey Sosnovsky
 
YAI4Edu: an Explanatory AI to Generate Interactive e-Books for Education
YAI4Edu: an Explanatory AI to Generate Interactive e-Books for EducationYAI4Edu: an Explanatory AI to Generate Interactive e-Books for Education
YAI4Edu: an Explanatory AI to Generate Interactive e-Books for EducationSergey Sosnovsky
 
Automatic Question Generation for Evidence-based Online Courseware Engineering
Automatic Question Generation for Evidence-based Online Courseware EngineeringAutomatic Question Generation for Evidence-based Online Courseware Engineering
Automatic Question Generation for Evidence-based Online Courseware EngineeringSergey Sosnovsky
 
Reading Comprehension Quiz Generation using Generative Pre-trained Transformers
Reading Comprehension Quiz Generation using Generative Pre-trained TransformersReading Comprehension Quiz Generation using Generative Pre-trained Transformers
Reading Comprehension Quiz Generation using Generative Pre-trained TransformersSergey Sosnovsky
 
Transforming Textbooks into Learning by Doing Environments: An Evaluation of ...
Transforming Textbooks into Learning by Doing Environments: An Evaluation of ...Transforming Textbooks into Learning by Doing Environments: An Evaluation of ...
Transforming Textbooks into Learning by Doing Environments: An Evaluation of ...Sergey Sosnovsky
 
Generation of Assessment Questions from Textbooks Enriched with Knowledge Models
Generation of Assessment Questions from Textbooks Enriched with Knowledge ModelsGeneration of Assessment Questions from Textbooks Enriched with Knowledge Models
Generation of Assessment Questions from Textbooks Enriched with Knowledge ModelsSergey Sosnovsky
 
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...Sergey Sosnovsky
 
Dental TutorBot: Exploitation of Dental Textbooks for Automated Learning
Dental TutorBot: Exploitation of Dental Textbooks for Automated LearningDental TutorBot: Exploitation of Dental Textbooks for Automated Learning
Dental TutorBot: Exploitation of Dental Textbooks for Automated LearningSergey Sosnovsky
 
Using Programmed Instruction to Help Students Engage with eTextbook Content
Using Programmed Instruction to Help Students Engage with eTextbook Content Using Programmed Instruction to Help Students Engage with eTextbook Content
Using Programmed Instruction to Help Students Engage with eTextbook Content Sergey Sosnovsky
 
Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...
Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...
Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...Sergey Sosnovsky
 
Interlingua: Linking Textbooks Across Different Languages
Interlingua: Linking Textbooks Across Different Languages Interlingua: Linking Textbooks Across Different Languages
Interlingua: Linking Textbooks Across Different Languages Sergey Sosnovsky
 

More from Sergey Sosnovsky (20)

Harnessing Textbooks for High-Quality Labeled Data: An Approach to Automatic ...
Harnessing Textbooks for High-Quality Labeled Data: An Approach to Automatic ...Harnessing Textbooks for High-Quality Labeled Data: An Approach to Automatic ...
Harnessing Textbooks for High-Quality Labeled Data: An Approach to Automatic ...
 
Toward Eliminating Hallucinations: GPT-based Explanatory AI for Intelligent T...
Toward Eliminating Hallucinations: GPT-based Explanatory AI for Intelligent T...Toward Eliminating Hallucinations: GPT-based Explanatory AI for Intelligent T...
Toward Eliminating Hallucinations: GPT-based Explanatory AI for Intelligent T...
 
Layout- and Activity-based Textbook Modeling for Automatic PDF Textbook Extra...
Layout- and Activity-based Textbook Modeling for Automatic PDF Textbook Extra...Layout- and Activity-based Textbook Modeling for Automatic PDF Textbook Extra...
Layout- and Activity-based Textbook Modeling for Automatic PDF Textbook Extra...
 
Exploring the Content Ecosystem of the First Open-source Adaptive Tutor and i...
Exploring the Content Ecosystem of the First Open-source Adaptive Tutor and i...Exploring the Content Ecosystem of the First Open-source Adaptive Tutor and i...
Exploring the Content Ecosystem of the First Open-source Adaptive Tutor and i...
 
Advancing Intelligent Textbooks with Automatically Generated Practice: A Larg...
Advancing Intelligent Textbooks with Automatically Generated Practice: A Larg...Advancing Intelligent Textbooks with Automatically Generated Practice: A Larg...
Advancing Intelligent Textbooks with Automatically Generated Practice: A Larg...
 
Creating Session Data from eTextbook Event Streams
Creating Session Data from eTextbook Event StreamsCreating Session Data from eTextbook Event Streams
Creating Session Data from eTextbook Event Streams
 
Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions ...
Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions ...Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions ...
Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions ...
 
Interactions of reading and assessment activities
Interactions of reading and assessment activitiesInteractions of reading and assessment activities
Interactions of reading and assessment activities
 
Parallel Construction: A Parallel Corpus Approach for Automatic Question Gene...
Parallel Construction: A Parallel Corpus Approach for Automatic Question Gene...Parallel Construction: A Parallel Corpus Approach for Automatic Question Gene...
Parallel Construction: A Parallel Corpus Approach for Automatic Question Gene...
 
YAI4Edu: an Explanatory AI to Generate Interactive e-Books for Education
YAI4Edu: an Explanatory AI to Generate Interactive e-Books for EducationYAI4Edu: an Explanatory AI to Generate Interactive e-Books for Education
YAI4Edu: an Explanatory AI to Generate Interactive e-Books for Education
 
Automatic Question Generation for Evidence-based Online Courseware Engineering
Automatic Question Generation for Evidence-based Online Courseware EngineeringAutomatic Question Generation for Evidence-based Online Courseware Engineering
Automatic Question Generation for Evidence-based Online Courseware Engineering
 
Reading Comprehension Quiz Generation using Generative Pre-trained Transformers
Reading Comprehension Quiz Generation using Generative Pre-trained TransformersReading Comprehension Quiz Generation using Generative Pre-trained Transformers
Reading Comprehension Quiz Generation using Generative Pre-trained Transformers
 
Transforming Textbooks into Learning by Doing Environments: An Evaluation of ...
Transforming Textbooks into Learning by Doing Environments: An Evaluation of ...Transforming Textbooks into Learning by Doing Environments: An Evaluation of ...
Transforming Textbooks into Learning by Doing Environments: An Evaluation of ...
 
Generation of Assessment Questions from Textbooks Enriched with Knowledge Models
Generation of Assessment Questions from Textbooks Enriched with Knowledge ModelsGeneration of Assessment Questions from Textbooks Enriched with Knowledge Models
Generation of Assessment Questions from Textbooks Enriched with Knowledge Models
 
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
 
Dental TutorBot: Exploitation of Dental Textbooks for Automated Learning
Dental TutorBot: Exploitation of Dental Textbooks for Automated LearningDental TutorBot: Exploitation of Dental Textbooks for Automated Learning
Dental TutorBot: Exploitation of Dental Textbooks for Automated Learning
 
What's in a textbook
What's in a textbookWhat's in a textbook
What's in a textbook
 
Using Programmed Instruction to Help Students Engage with eTextbook Content
Using Programmed Instruction to Help Students Engage with eTextbook Content Using Programmed Instruction to Help Students Engage with eTextbook Content
Using Programmed Instruction to Help Students Engage with eTextbook Content
 
Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...
Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...
Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...
 
Interlingua: Linking Textbooks Across Different Languages
Interlingua: Linking Textbooks Across Different Languages Interlingua: Linking Textbooks Across Different Languages
Interlingua: Linking Textbooks Across Different Languages
 

Recently uploaded

Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfakmcokerachita
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 

Recently uploaded (20)

Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 

Contextual Definition Generation

  • 1. Contextual Definition Generation Jeffrey Yarbro, Andrew Olney Institute for Intelligent Systems University of Memphis
  • 2. Introduction • This paper explores the idea of generating contextual definitions for words using a deep-learning model. It does this by accepting a word and a context for that word and then autoregressively generating a definition to match the specific context. Overview • Created a new dataset with definition and context pairs. • Trained a GPT-2 model on dataset • Evaluated the model with human raters
  • 3. Motivation for work • Approximately 98% of words must be within a reader’s vocabulary for optimal reading comprehension to occur. • Textbooks often attempt to make up for potential vocabulary gaps by defining key-terms. • Problems: • Reader is required to stop reading and lookup definition • Limited number of terms defined • Term may have multiple definitions
  • 4. Motivation for work (cont.) • Modern software can make the process easier. • Can use search engine • Newer tools allow reader to highlight word and have the definition appear in a pop-up. • Problems: • Definitions may be vague and not adequately fit the context. • Word may have a long list of definitions. • If the word has multiple definitions, you must pick the most appropriate one.
  • 5. Data Collection • All data was required to have definitions and a labeled context paired with that definition. • With this in mind, we collected data from the following sources: • Lexico • Wikipedia • Wiktionary • Wordnet
  • 7. Definition Modification • Some definitions contained low information. • We attempt to expand these definitions by using regular expressions, parts of speech tags, and word frequency to find the key reference word. • We then choose the most fitting definition by using word vectors and comparing each definition for the reference word (e.g., “country”) with the context and choose the most similar one by performing cosine similarity.
  • 8. Model • GPT-2 is an autoregressive model that uses the decoding blocks of the transformer architecture. 1. Animation sourced from The Illustrated GPT-2 written by Jay Alammar
  • 9. Model (cont.) • Trained the model for 1 epoch • Used GPT-2 Large: 774 parameter model. • Two special tokens: <CONTEXT> and <DEFINITION>
  • 10. Human Evaluation • Posted survey on CloudResearch. Which sources high quality participants on Mechanical Turks. • Allowed participants to choose what topic they wanted to evaluate. The topics available were from the following subjects: • American Government • Anatomy and Physiology • Astronomy • Psychology • Three different surveys for the following context types: 1. Model-generated Short-context: Term used in a sentence 2. Model-generated Long-context: Term used in a sentence along with both the prior and following sentence. 3. Human-generated: Definitions from the training dataset. • Raters evaluated 50 questions each.
  • 12. Results • Short-context performed significantly better than long-context in terms of accuracy (𝑝 = 0.045). We speculate the reason for this has to do with the training data containing far more shorter-contexts than long. • Real definitions performed significantly better than both short-context (𝑝 < 0.001). • There were no significant differences between fluency. • Topic was trending but not significant
  • 14. Problems with model • Too much fluctuation depending on context. • Trouble interpreting some contexts. • Some tendency to memorize definitions
  • 15. Q&A