Competence-Level Prediction and Resume & Job Description Matching
Using Context-Aware Transformer Models
Changmao Li, Elaine Fisher, Rebecca S. Thomas, Stephen Pittard, Vicki Hertzberg, and Jinho Choi
Emory NLP
Outline
● Dataset
● Tasks
● Approaches
● Experiments
● Error Analysis
● Contributions
Dataset
Source: resumes of applicants for Clinical Research Coordinator (CRC) positions
Each resume has two kinds of annotation:
1. The levels the applicant applied for (an applicant can apply for multiple levels).
2. The level the applicant is qualified for, annotated by human experts, with
inter-annotator agreement measured. There are four levels: CRC1, CRC2, CRC3,
and CRC4. A resume that does not match any level is annotated as
Not Qualified (NQ).
In addition, there is a job description for each level.
Dataset
Preprocessing:
The original resume files are in DOC or PDF format. They are parsed with existing
tools, split into 6 sections, and stored in a JSON file for convenient use.
The existence ratio of each section in the CRC levels
Dataset
Annotation:
Two experts with experience in recruiting applicants for CRC positions of all
levels designed the annotation guidelines over 5 rounds of resume labeling.
Kappa scores measured for inter-annotator agreement (ITA) during the five rounds of guideline development
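As a rough illustration of how an agreement score between two annotators can be computed, here is a minimal sketch. It assumes Cohen's kappa (the slides only say "Kappa"), and the labels below are hypothetical, not taken from the dataset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used a single identical label
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for six resumes, for illustration only.
expert_1 = ["CRC1", "CRC2", "CRC2", "NQ", "CRC3", "CRC1"]
expert_2 = ["CRC1", "CRC2", "CRC3", "NQ", "CRC3", "CRC2"]
kappa = cohen_kappa(expert_1, expert_2)
```

A score of 1 means perfect agreement and 0 means agreement no better than chance, so a rising score across rounds would suggest the guidelines are converging.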
Tasks
Two novel tasks are proposed for this new dataset:
1. Multiclass classification (5 classes): given a resume, decide which level of
CRC position the applicant is suitable for. (Input: the resume. Gold output:
annotation 2.)
2. Binary classification: given a resume and a CRC-level job description,
decide whether the applicant is suitable for that particular level. (Input: the
resume and the job description of each level the applicant applied for. Gold
output: a binary label obtained by combining annotation 1 and annotation 2.)
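One plausible way to combine the two annotations into binary gold labels is sketched below. The combination rule (label 1 only when an applied level equals the expert-annotated level) is an assumption for illustration; the slides do not spell it out, and the function name is hypothetical.

```python
def matching_examples(applied_levels, qualified_level):
    """Build (level, label) pairs for one resume.

    Assumption: the label is 1 when the applied level equals the level
    annotated by the experts, and 0 otherwise.
    """
    return [(level, int(level == qualified_level)) for level in applied_levels]


# Hypothetical applicant who applied for two levels and was annotated CRC2.
pairs = matching_examples(["CRC1", "CRC2"], "CRC2")
```

Under this rule, a resume annotated as NQ yields a negative example for every level it was submitted to.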
Approaches
Baseline Approaches for both tasks
Approaches
Strategies when applying baseline models
● Section trimming for the baseline models, due to the input length limit of
transformer encoders
[Figure panels: Task 1, Task 2]
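The idea of trimming sections to fit one encoder input can be sketched as follows. This is only a sketch under assumptions: the 512-token budget, the even sharing of the budget, and the section names are all hypothetical, and the slides do not state the exact trimming rule.

```python
def trim_sections(sections, max_tokens=512):
    """Trim tokenized sections so that their total length fits max_tokens.

    Assumption: the budget is shared evenly, and budget left unused by
    short sections is passed on to longer ones.
    """
    remaining = max_tokens
    trimmed = {}
    # Visit the shortest sections first so leftover budget flows onward.
    order = sorted(sections, key=lambda name: len(sections[name]))
    for i, name in enumerate(order):
        share = remaining // (len(order) - i)
        trimmed[name] = sections[name][:share]
        remaining -= len(trimmed[name])
    # Return the sections in their original order.
    return {name: trimmed[name] for name in sections}


# Hypothetical tokenized resume with three sections.
resume = {
    "profile": ["tok"] * 10,
    "experience": ["tok"] * 600,
    "education": ["tok"] * 300,
}
trimmed_resume = trim_sections(resume)
```

Because every section is cut to fit a single input, long sections lose most of their content, which is the limitation the context-aware models try to address.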
Approaches
Proposed Models for the multiclass classification task
The context-aware model using section pruning and section encoding
Approaches
Proposed Models for the multiclass classification task
The context-aware model using chunk segmenting and section encoding
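A minimal sketch of chunk segmenting is given below, assuming each section is already tokenized and that a chunk holds at most 512 tokens (a hypothetical limit). Each chunk keeps its section name so that a section encoding could be attached later; the names are illustrative.

```python
def segment_into_chunks(sections, chunk_size=512):
    """Split each tokenized section into consecutive chunks.

    Every chunk has at most chunk_size tokens and is paired with the
    name of the section it came from.
    """
    chunks = []
    for name, tokens in sections.items():
        for start in range(0, len(tokens), chunk_size):
            chunks.append((name, tokens[start:start + chunk_size]))
    return chunks


# Hypothetical tokenized resume with one long and one short section.
resume = {"experience": ["tok"] * 1100, "education": ["tok"] * 40}
chunks = segment_into_chunks(resume)
```

Unlike pruning, no tokens are discarded here: a long section simply produces several chunks, each encoded separately.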
Approaches
Proposed Models for the binary classification task
Approaches
The context-aware models using chunk segmenting + section encoding + job description embedding
and multi-head attention between the resume and the job description
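To make the attention step more concrete, here is a simplified sketch of scaled dot-product attention between a job description vector and resume chunk vectors. It is a single head with no learned projections, so it is not the multi-head layer itself. Treating the job description as the query and the resume chunks as keys and values is an assumption, and the vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention over plain lists."""
    scale = math.sqrt(len(keys[0]))
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        attended.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return attended


# Hypothetical 2-dimensional embeddings, for illustration only.
job_description = [[1.0, 0.0]]
resume_chunks = [[1.0, 0.0], [0.0, 1.0]]
summary = cross_attention(job_description, resume_chunks, resume_chunks)
```

The output is a job-description-aware summary of the resume: chunks that align with the job description receive larger weights.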
Approaches
Strategies when applying models
● Section pruning for the proposed “encoding by sections” models, in case
a section exceeds the input length of the transformer encoder
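Section pruning can be sketched as truncating each section independently, since each section gets its own encoder input. The 512-token limit, the choice to keep the leading tokens, and the section names are assumptions for illustration.

```python
def prune_sections(sections, max_tokens=512):
    """Truncate each tokenized section independently to max_tokens."""
    return {name: tokens[:max_tokens] for name, tokens in sections.items()}


# Hypothetical tokenized resume, for illustration only.
resume = {"experience": ["tok"] * 900, "education": ["tok"] * 120}
pruned = prune_sections(resume)
lengths_before = {name: len(tokens) for name, tokens in resume.items()}
lengths_after = {name: len(tokens) for name, tokens in pruned.items()}
```

Comparing the two length dictionaries shows which sections were actually affected by pruning.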
Analysis on Section Pruning (in Appendix)
Section lengths before section pruning
Section lengths after section pruning
Experiments
Data split for the multiclass classification task (label distributions are preserved):
Data statistics for the competence-level classification task
Experiments
Data split for the binary classification task (label and CRC level distributions are
preserved, with no resume shared between the training set and the dev or test sets):
Data statistics for the resume-to-job description matching task
Algorithm to split the dataset while avoiding overlaps
between the training and evaluation sets (in Appendix)
The key idea is:
1. Split the data by the targeted label distributions, but with a smaller initial
training set ratio than the targeted one.
2. If there are overlapping applicants, move all the overlaps
into the training set, so that the training set ratio grows close to
the targeted ratio while the label distributions are still largely
preserved.
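The two steps above can be sketched as follows. This is a simplified illustration rather than the appendix algorithm itself: it splits into one training set and one evaluation set, stratifies by label only, and the initial ratio of 0.6 and the example data are hypothetical.

```python
import random
from collections import defaultdict


def split_without_overlap(examples, initial_train_ratio=0.6, seed=0):
    """Split (applicant_id, label) pairs with no applicant on both sides."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for example in examples:
        by_label[example[1]].append(example)

    train, evaluation = [], []
    # Step 1: stratified split with a deliberately small training ratio.
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        cut = int(len(group) * initial_train_ratio)
        train.extend(group[:cut])
        evaluation.extend(group[cut:])

    # Step 2: move evaluation examples of applicants already in training.
    train_ids = {applicant for applicant, _ in train}
    overlaps = [ex for ex in evaluation if ex[0] in train_ids]
    evaluation = [ex for ex in evaluation if ex[0] not in train_ids]
    train.extend(overlaps)
    return train, evaluation


# Hypothetical data: 30 applicants, each with one example per label.
examples = [(i // 2, i % 2) for i in range(60)]
train, evaluation = split_without_overlap(examples)
```

Starting below the targeted ratio leaves room for the moved overlaps, so the final training share ends up near the target instead of overshooting it.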
Experiments
Experimented Models
W!: Whole context model + section trimming
P: Context-aware model + section pruning
P⊕I: P + section encoding
C: Context-aware model + chunk segmenting
C⊕I: C + section encoding
Models for the competence-level classification task
W!": Whole context + sec./job_desc. trimming
P⊕I⊕J: P⊕I + job_desc. embedding
P⊕I⊕J⊕A: P⊕I⊕J + multi-head attention
P⊕I⊕J⊕AE: P⊕I⊕J-E#
C⊕I⊕J: C⊕I + job_desc. embedding
C⊕I⊕J⊕A: C⊕I⊕J + multi-head attention
C⊕I⊕J⊕AE: C⊕I⊕J-E#
Models for the resume-to-job description matching task
Experiments
Results for the competence-level classification task.
Experiments
Results for the resume-to-job description matching task.
Experiments
Analysis for the competence-level classification task.
Confusion matrix for the best model of the competence-level classification task
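For readers who want to reproduce this kind of analysis, a confusion matrix can be built as in the sketch below. The gold and predicted labels are hypothetical and do not reflect the reported results.

```python
def confusion_matrix(gold, predicted, labels):
    """Count matrix with rows for gold labels and columns for predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for g, p in zip(gold, predicted):
        matrix[index[g]][index[p]] += 1
    return matrix


# Hypothetical predictions, for illustration only.
labels = ["NQ", "CRC1", "CRC2", "CRC3", "CRC4"]
gold = ["CRC1", "CRC2", "CRC2", "NQ"]
predicted = ["CRC1", "CRC3", "CRC2", "NQ"]
matrix = confusion_matrix(gold, predicted, labels)
```

Counts just off the diagonal correspond to confusions between adjacent CRC levels.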
Experiments
Analysis for the resume-to-job description matching task.
Confusion matrix for the best model of the resume-to-job description matching task
Error Analysis
• The model is unable to identify clinical research experience.
• The model cannot identify dates of experience.
• The model finds it hard to distinguish adjacent CRC positions.
Contributions
● Introduced a new resume classification dataset.
● Proposed two new tasks for this new dataset.
● Proposed novel context-aware transformer approaches for the two tasks.
● Conducted experiments with several proposed models.
● Conducted both quantitative and qualitative analyses for future improvements.
Thank You
Q & A

Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 

Competence-Level Prediction and Resume & Job Description Matching Using Context-Aware Transformer Models

  • 1. Competence-Level Prediction and Resume & Job Description Matching Using Context-Aware Transformer Models Changmao Li, Elaine Fisher, Rebecca S. Thomas, Stephen Pittard, Vicki Hertzberg, and Jinho Choi Emory NLP
  • 2. Outline ● Dataset ● Tasks ● Approaches ● Experiments ● Error Analysis ● Contributions
  • 3. Dataset Source: resumes of Clinical Research Coordinator (CRC) applicants. There are two kinds of annotations: 1. The levels each applicant applied for (an applicant can apply for multiple levels). 2. The level the applicant is qualified for, annotated by human experts following shared annotation guidelines. There are four levels: CRC1, CRC2, CRC3, and CRC4. A resume that does not match any level is annotated as Not Qualified (NQ). In addition, there is a job description for each level.
  • 4. Dataset Preprocessing: The original resume files are in DOC or PDF format. They are parsed, split into 6 sections, and stored in a JSON file for convenient use. The existence ratio of each section in the CRC levels
  • 5. Dataset Annotation: Two experts with experience in recruiting applicants for CRC positions of all levels designed the annotation guidelines over 5 rounds by labeling each resume. Kappa scores measured for inter-annotator agreement (ITA) during the five rounds of guideline development
  • 6. Tasks Two novel tasks are proposed for this new dataset: 1. (Multiclass classification, 5 classes) Given a resume, decide which level of CRC position the corresponding applicant is suitable for. (Use the resume as input and annotation 2 as the gold output.) 2. (Binary classification) Given a resume and a CRC-level job description, decide whether the applicant is suitable for that particular level. (Use both the resume and the job description of each level applied for as input, and combine annotation 1 and annotation 2 to get the binary gold output.)
  • 8. Approaches Strategies when applying baseline models ● Section trimming for baseline models, due to the input length limitation of transformer encoders (figure panels: Task 1, Task 2)
  • 9. Approaches Proposed Models for the multiclass classification task The context-aware model using section pruning and section encoding
  • 10. Approaches Proposed Models for the multiclass classification task The context-aware model using chunk segmenting and section encoding
  • 11. Approaches Proposed Models for the binary classification task The context-aware models using chunk segmenting + section encoding + job description embedding and multi-head attention between the resume and the job description
  • 12. Approaches Strategies when applying models ● Section pruning for the proposed “encoding by sections” models, in case a section exceeds the input length of transformer encoders
  • 13. Analysis on Section Pruning (in Appendix) Section lengths before section pruning Section lengths after section pruning
  • 14. Experiments Data split for the multiclass classification task (keeping label distributions): Data statistics for the competence-level classification task
  • 15. Experiments Data split for the binary classification task (keeping label and CRC distributions, with no resumes shared between the training set and the dev or test set): Data statistics for the resume-to-job description matching task
  • 16. Algorithm to split the dataset while avoiding overlaps between the training and evaluation sets (in Appendix) The key idea is: 1. Split the data by the targeted label distributions, but with a smaller initial training set ratio than the targeted one. 2. If there are overlapping applicants, put all the overlaps into the training set, so that the training set ratio grows close to the targeted ratio while the label distributions are still preserved to a great extent.
  • 17. Experiments Experimented Models. Models for the competence-level classification task: W!: Whole context model + section trimming; P: Context-aware model + section pruning; P⊕I: P + section encoding; C: Context-aware model + chunk segmenting; C⊕I: C + section encoding. Models for the resume-to-job description matching task: W!": Whole context + sec./job_desc. trimming; P⊕I⊕J: P⊕I + job_desc. embedding; P⊕I⊕J⊕A: P⊕I⊕J + multi-head attention; P⊕I⊕J⊕AE: P⊕I⊕J-E#; C⊕I⊕J: C⊕I + job_desc. embedding; C⊕I⊕J⊕A: C⊕I⊕J + multi-head attention; C⊕I⊕J⊕AE: C⊕I⊕J-E#
  • 18. Experiments Results for the competence-level classification task.
  • 19. Experiments Results for the resume-to-job description matching task.
  • 20. Experiments Analysis for the competence-level classification task. Confusion matrix for the best model of the competence-level classification task
  • 21. Experiments Analysis for the resume-to-job description matching task. Confusion matrix for the best model of the resume-to-job description matching task
  • 22. Error Analysis • The model is unable to identify clinical research experience. • It cannot identify dates of experience. • It struggles to distinguish adjacent CRC positions.
  • 23. Contributions ● Introduced a new resume classification dataset. ● Proposed two new tasks for this new dataset. ● Proposed novel context-aware transformer approaches for the two tasks. ● Conducted experiments with several proposed models. ● Conducted both quantitative and qualitative analyses for future improvements.
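
The two-step splitting idea from slide 16 can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the record layout (applicant_id, crc_level, label), the function name split_without_overlap, and the initial ratio of 0.6 are all assumptions made for the example.

```python
import random
from collections import defaultdict


def split_without_overlap(records, initial_train_ratio=0.6, seed=0):
    """Split records into train/evaluation sets with no shared applicants.

    records: list of (applicant_id, crc_level, label) tuples (assumed layout).
    initial_train_ratio: deliberately smaller than the targeted train ratio,
    because step 2 moves more records into the training set.
    """
    rng = random.Random(seed)

    # Step 1: stratified split by (crc_level, label) to keep distributions.
    strata = defaultdict(list)
    for rec in records:
        strata[(rec[1], rec[2])].append(rec)

    train, evaluation = [], []
    for key in sorted(strata):
        group = strata[key]
        rng.shuffle(group)
        cut = int(len(group) * initial_train_ratio)
        train.extend(group[:cut])
        evaluation.extend(group[cut:])

    # Step 2: applicants present on both sides go entirely to training.
    train_ids = {rec[0] for rec in train}
    overlapping = [rec for rec in evaluation if rec[0] in train_ids]
    evaluation = [rec for rec in evaluation if rec[0] not in train_ids]
    train.extend(overlapping)
    return train, evaluation


if __name__ == "__main__":
    # Toy data: 7 applicants, each applying to several CRC levels.
    demo = [(f"a{i % 7}", f"CRC{i % 4 + 1}", i % 2) for i in range(40)]
    tr, ev = split_without_overlap(demo)
    print(len(tr), len(ev))
```

The evaluation portion would still need to be divided into dev and test sets. Because step 2 only ever grows the training set, the initial ratio has to be tuned downward until the final ratio lands near the target.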