Automatic Personality Prediction with Attention-based Neural Networks
Hang Jiang
Jinho D. Choi, PhD
Program in Linguistics
1. Introduction
Objectives
Motivation
Previous Works
Personality Theories
Big Five Personality Inventory (Norman, 1963; Goldberg, 1981)
1. Openness to experience
2. Extraversion
3. Conscientiousness
4. Emotional stability (vs. Neuroticism)
5. Agreeableness
Language Use
Agreeable, Stable, Conscientious, Open, Extraverted
Previous Works
1. Pennebaker and King (1999)
a. Self-report Essays Dataset with 2468 instances
2. Automatic Personality Prediction (Pennebaker et al., 2001) based on text
a. Extracted linguistic features using the Linguistic Inquiry and Word Count (LIWC) text analysis tool
3. Mohammad and Kiritchenko (2013) introduced new linguistic features
4. Tighe et al. (2016) applied feature reduction techniques such as Principal Component Analysis (PCA) and Information Gain (IG)
Motivation
● Applications in daily-life domains
○ Dating websites
○ Anti-terrorism
● Character mining
○ Attribute extraction
Objectives
● Create a new Friends dataset for the task
● Present a novel approach to automatic personality prediction using attention-based neural networks with word embeddings
● Evaluate our models on both datasets (Essays and Friends)
2. Background
Neural Networks
Previous Datasets
LIWC Features
Big Five Theories
Evaluation Metrics
Big Five Theories (John et al., 1991)
Big Five Traits                          Facets
Extraversion vs. introversion            sociable, forceful, energetic, adventurous, enthusiastic, outgoing
Agreeableness vs. antagonism             forgiving, not demanding, warm, not stubborn, not show-off, sympathetic
Conscientiousness vs. lack of direction  efficient, organized, not careless, thorough, not lazy, not impulsive
Neuroticism vs. emotional stability      tense, irritable, not contented, shy, moody, not self-confident
Openness vs. closedness to experience    curious, imaginative, artistic, wide interests, excitable, unconventional
Linguistic Inquiry and Word Count (LIWC)
Categories         Examples
Past tense         walked, were, had
Negations          no, never, not
Swear words        *****
Friends            pal, buddy, coworker
Positive Emotions  happy, pretty, good
Anger              hate, kill, pissed
Assent             agree, OK, yes
Nonfluencies       uh, rr*
Essays
Electronically Activated Recorder (EAR)
Neural Networks for Text Classification
1. Multilayer Perceptrons (MLP)
2. Convolutional Neural Networks (CNN)
3. Bidirectional Long Short-Term Memory Networks (BLSTM)
MLP Model
Word Embeddings (Kim, 2014)
Convolutional Neural Networks (Kim, 2014)
Long Short-Term Memory
Attention Mechanism
Evaluation Metrics
- 10-fold Cross-Validation
- Accuracy
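The evaluation protocol above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (`k_fold_indices`, `majority_fit`, `cross_val_accuracy`) are hypothetical, the folds are contiguous and unshuffled for simplicity, and a majority-class predictor stands in for a real model.

```python
from collections import Counter


def k_fold_indices(n, k=10):
    """Split indices 0..n-1 into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def majority_fit(train_texts, train_labels):
    """Placeholder model (assumption): always predict the most frequent training label."""
    label = Counter(train_labels).most_common(1)[0][0]
    return lambda _text: label


def cross_val_accuracy(texts, labels, fit, k=10):
    """Mean held-out accuracy over k folds; fit(texts, labels) must return a predict function."""
    scores = []
    for fold in k_fold_indices(len(texts), k):
        held_out = set(fold)
        train = [i for i in range(len(texts)) if i not in held_out]
        predict = fit([texts[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(texts[i]) == labels[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)


# Toy data: 20 instances, 12 positive and 8 negative.
toy_labels = [1, 1, 1, 0, 0] * 4
toy_texts = ["text %d" % i for i in range(len(toy_labels))]
toy_accuracy = cross_val_accuracy(toy_texts, toy_labels, majority_fit, k=10)
```

Swapping `majority_fit` for a function that trains an actual classifier keeps the rest of the loop unchanged; the data needs at least `k` instances so that no fold is empty.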
3. Corpus
Final Annotation
Agreement Check
Online Annotation
Sub-scene Extraction
Friends Dataset
- Not domain-specific
- Simple language
Dataset             Essays       Friends      EAR
Source              written      observation  spoken
Structure           monologue    dialogue     dialogue
Report Type         self-report  observation  self-report & observation
Number of words     1.9 million  556,273      97,468
Instances           2,468        3,488        96
Words per Instance  651          161          1,015
Sub-scene Extraction Process
1. Use window technique to find a main speaker’s frequency distribution in each scene
2. Choose peaks in each frequency distribution
3. Use the peaks to find the index range of each sub-scene
4. Extract multiple sub-scenes from each scene, thus increasing our data size
5. Optimize window_size and min_conversation_length
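One possible reading of the steps above, as a rough sketch. The slides do not give the exact algorithm, so the peak rule, the way a peak maps to an index range, and the use of min_conversation_length as a minimum count of the speaker's utterances are all assumptions; the function names are hypothetical.

```python
def speaker_frequency(speakers, target, window_size):
    """Count the target speaker's utterances in each sliding window over a scene."""
    n = len(speakers)
    if n < window_size:
        return []
    return [
        sum(s == target for s in speakers[i:i + window_size])
        for i in range(n - window_size + 1)
    ]


def find_peaks(freq):
    """Window starts whose count is higher than the left neighbor and not lower than the right one."""
    peaks = []
    for i, f in enumerate(freq):
        left = freq[i - 1] if i > 0 else -1
        right = freq[i + 1] if i + 1 < len(freq) else -1
        if f > left and f >= right:
            peaks.append(i)
    return peaks


def extract_sub_scenes(speakers, target, window_size=4, min_conversation_length=2):
    """Return (start, end) utterance index ranges, end exclusive, one per accepted peak."""
    freq = speaker_frequency(speakers, target, window_size)
    return [
        (p, p + window_size)
        for p in find_peaks(freq)
        if freq[p] >= min_conversation_length
    ]


# Speaker of each utterance in one hypothetical scene.
scene = ["Ross", "Rachel", "Ross", "Ross", "Joey",
         "Monica", "Joey", "Ross", "Rachel", "Ross"]
sub_scenes = extract_sub_scenes(scene, "Ross", window_size=4, min_conversation_length=2)
```

In this toy scene the target speaker is most active at the start and at the end, so two sub-scenes are extracted from a single scene, which is how the procedure increases the number of instances.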
Annotation through Crowdsourcing (online annotation)
- Extracted 8738 sub-scenes from the 10-season Friends transcript
- Had 3448 sub-scenes from the first 4 seasons annotated through the Amazon Mechanical Turk platform
What do the questions look like?
Inter-rater Agreement
Personality Trait  Fleiss Kappa*  Observed Agreement  Estimated Agreement
Agreeable          0.053          0.414               0.381
Conscientious      0.017          0.387               0.376
Extraverted        0.016          0.455               0.446
Stable             0.031          0.379               0.359
Open               0.041          0.409               0.383
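For reference, Fleiss' kappa is (P_o - P_e) / (1 - P_e), where P_o is the observed agreement and P_e the agreement expected by chance. The sketch below assumes the "Estimated Agreement" column plays the role of P_e; the reported kappa values are consistent with that reading up to rounding. Function names are hypothetical.

```python
def kappa_from_agreement(p_obs, p_exp):
    """Fleiss' kappa from observed and chance-expected agreement."""
    return (p_obs - p_exp) / (1 - p_exp)


def fleiss_kappa(counts):
    """counts[i][j] = number of raters who assigned item i to category j.

    Every item must be rated by the same number of raters (at least two).
    Returns (kappa, observed agreement, expected agreement).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Observed agreement: per item, the share of rater pairs that agree, averaged over items.
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items
    # Chance agreement: sum of squared overall category proportions.
    p_exp = sum(
        (sum(row[j] for row in counts) / (n_items * n_raters)) ** 2
        for j in range(n_cats)
    )
    return kappa_from_agreement(p_obs, p_exp), p_obs, p_exp


# (reported kappa, observed agreement, estimated agreement) from the table above.
reported = {
    "Agreeable": (0.053, 0.414, 0.381),
    "Conscientious": (0.017, 0.387, 0.376),
    "Extraverted": (0.016, 0.455, 0.446),
    "Stable": (0.031, 0.379, 0.359),
    "Open": (0.041, 0.409, 0.383),
}
recomputed = {t: kappa_from_agreement(po, pe) for t, (_, po, pe) in reported.items()}
```

All five values are close to zero, meaning the three annotators agreed only slightly more often than chance would predict.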
Final Annotation
Steps:
1. Keep the initial annotations unchanged, in [-1, 1]
2. Add the three annotators’ scores, producing 7 classes in [-3, 3]
3. Classes -3 and 3 have too few instances
4. Merge -3 into -2 and 3 into 2, producing 5 classes
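The merging steps above amount to summing and clipping. A minimal sketch, assuming each annotator gives an integer score of -1, 0, or 1 per trait (the slides only state the range); `merge_annotations` is a hypothetical name.

```python
def merge_annotations(scores):
    """Sum three annotators' scores in {-1, 0, 1}, then fold -3 into -2 and 3 into 2."""
    if len(scores) != 3 or any(s not in (-1, 0, 1) for s in scores):
        raise ValueError("expected three scores, each -1, 0, or 1")
    total = sum(scores)            # 7 possible classes: -3 .. 3
    return max(-2, min(2, total))  # 5 classes after merging: -2 .. 2


labels = [merge_annotations(s) for s in [(1, 1, 1), (-1, -1, -1), (1, 0, -1), (0, 0, 1)]]
```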
Three formats of Friends dataset
Original Conversation:
Ross: Hi, Rachel.
Rachel: Hi Ross.
Ross: I have a bad day.
Rachel: Oh.
Ross: How is your day?
Single:
Ross: Hi, Rachel.
Ross: I have a bad day.
Ross: How is your day?
Single+Context:
Ross: Hi, Rachel.
Ross: I have a bad day.
Ross: How is your day?
Rachel: Hi Ross.
Rachel: Oh.
Target:
#Targ# Ross: Hi, Rachel.
#NonTarg# Rachel: Hi Ross.
#Targ# Ross: I have a bad day.
#NonTarg# Rachel: Oh.
#Targ# Ross: How is your day?
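The three formats can be derived from the original conversation with a few lines of code. This is an illustrative sketch based only on the example on this slide; the function names are hypothetical and the actual preprocessing may differ.

```python
def single(utterances, target):
    """Only the target speaker's utterances, in order."""
    return ["%s: %s" % (s, t) for s, t in utterances if s == target]


def single_context(utterances, target):
    """The target speaker's utterances first, followed by everyone else's as context."""
    context = ["%s: %s" % (s, t) for s, t in utterances if s != target]
    return single(utterances, target) + context


def target_format(utterances, target):
    """Original order, with each utterance tagged as target or non-target."""
    return [
        "%s %s: %s" % ("#Targ#" if s == target else "#NonTarg#", s, t)
        for s, t in utterances
    ]


conversation = [
    ("Ross", "Hi, Rachel."),
    ("Rachel", "Hi Ross."),
    ("Ross", "I have a bad day."),
    ("Rachel", "Oh."),
    ("Ross", "How is your day?"),
]
for line in target_format(conversation, "Ross"):
    print(line)
```

Single discards the other speakers, Single+Context keeps them but loses the turn order, and Target keeps both the order and the speaker role.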
4. Approaches
1. Classic classification algorithms with Weka
2. Multilayer Perceptrons (MLP)
3. Convolutional Neural Networks (CNN)
4. Bidirectional Long Short-Term Memory Networks (BLSTM)
5. Attention-based CNN (ABCNN)
6. Attention-based LSTM (ABLSTM)
5. Experiment
Modeling on Friends Dataset
Modeling on Essays Dataset
LIWC vs. Word Embeddings
LIWC Features vs. Word Embeddings
Trait              Majority  MLP with LIWC  MLP with Word Embeddings
Agreeableness      53.08     57.90          55.51
Conscientiousness  50.81     56.62          58.59
Extraversion       51.74     55.96          56.69
Neuroticism        50.04     56.72          56.93
Openness           51.54     57.62          59.44
Results on Essays Dataset
10-fold CV / accuracy
Baseline: majority class. Weka: Tighe et al., 2016. MLP, CNN, ABCNN, BLSTM, ABLSTM: Jiang & Choi, 2018.
Trait              Baseline  Weka   MLP    CNN    ABCNN  BLSTM  ABLSTM
Agreeableness      53.08     57.54  55.51  57.38  57.82  56.64  58.85
Conscientiousness  50.81     56.04  58.59  57.74  60.13  57.83  59.55
Extraversion       51.74     55.75  56.69  56.28  58.75  59.18  59.32
Neuroticism        50.04     58.31  56.93  57.09  58.51  57.70  59.56
Openness           51.54     61.95  59.44  63.49  63.65  63.02  63.99
Results on Friends Dataset
6. Conclusion
● A new Friends dataset is created, and it shows the challenges of annotating dialogue text data
● A novel approach to automatic personality prediction is presented
● A new benchmark is achieved on the Essays dataset
● All models fail to work on the Friends dataset, suggesting that the annotations are not very consistent
Future Works
● LIWC-integrated CNN/LSTM with Attention Mechanism
● A platform to support the human annotation process by providing multimodal information
● The use of the Big Five Inventory Questionnaire
Q&A Session
The End.
