SlideShare a Scribd company logo
1 of 15
Download to read offline
Personality Driven Differences in Paraphrase
Preference
Daniel Preot¸iuc-Pietro
Joint work with
Jordan Carpenter (Kenan Institute for Ethics, Duke)
Lyle Ungar (Computer & Information Science, UPenn)
3 August 2017
Motivation
User attribute prediction from text is successful:
Age (Rao et al. 2010 ACL)
Gender (Burger et al. 2011 EMNLP)
Location (Eisenstein et al. 2010 EMNLP)
Personality (Schwartz et al. 2013 PLoS One)
Impact (Lampos et al. 2014 EACL)
Political Orientation (Volkova et al. 2014 ACL)
Mental Illness (Coppersmith et al. 2014 ACL)
Occupation (Preot¸iuc-Pietro et al. 2015 ACL)
Income (Preot¸iuc-Pietro et al. 2015 PLoS One)
However...
Most text prediction methods uncover topical differences
relative frequency
a aa
correlation strength
Openness to Experience
However...
Most text prediction methods uncover topical differences
relative frequency
a aa
correlation strength
Extraversion
Stylistic differences
We need to be aware of style differences, rather than topical
Not useful for many practical applications that adapt to traits:
machine translation (Mirkin et al. 2015 EMNLP, Rabinovich et al 2017 EACL)
agents (e.g. customer service, tutoring)
controlling for gender or racial bias
Stylistic differences
One type of stylistic difference is phrase choice in context.
Splendid
Excellent
Remarkable
Source: https://wikispaces.psu.edu/display/P5PFL/TRAIT+Theory+Page
Openness
Magnificent
Fabulous
Tremendous
Source: http://inwilmingtonde.com/events/thanksgiving-eve-karaoke
Extraversion
Data
We study the Big Five personality traits:
115,312 Facebook users
Personality scores obtained through the MyPersonality
app (Kosinski et al, 2013)
For each trait, take top and bottom 20% of users
Paraphrasing
Paraphrases – alternative ways to convey the same information
Paraphrase Database (PPDB) 2.0 (Pavlick et al. 2015 ACL):
annotated with type and confidence (filter ‘equivalent’
paraphrases with >.2 confidence)
>6M automatically derived paraphrase pairs
we use only 1–3 grams
difference in a pair more than just change of stopwords or
root form of word
Prediction
.603
.551
.519
.551 .549
.573
.589
.578
.553
.590
.623
.639
.597 .593
.631
0.50
0.55
0.60
0.65
Openness Conscientiousness Extraversion Agreeableness Neuroticism
Paraphrases only Phrases w/o paraphrases All Phrases
Accuracy, Naive Bayes, 90-10 training-testing, balanced data
Quantifying Preference
Straightforward measure:
Extraversion(w) = log
Extravert(w)
Introvert(w)
(1)
Within a paraphrase pair (w1, w2), the difference
Extraversion(w1) − Extraversion(w2) is the stylistic distance.
Used previously to study paraphrase preference across age,
gender and occupational class (Preot¸iuc-Pietro, Xu & Ungar, AAAI 2016).
Linguistic Theories
Study which attributes of words in a pair are preferred by one
group:
Word Length in Characters
Word Length in Syllables
Simple proxies for word complexity
Affective Norms: Valence, Arousal, Dominance
14k rated words
Valence: suicide (0.15) → bacon (0.70) → laughter (1)
Concreteness
40k rated words: spirituality (1) → morning (3.44) → tiger (5)
Age of Acquisition
30k rated words: great (5.05) → splendid (7.22) → tremendous (10.63)
More in the paper ...
Linguistic Theories
.163
-.068
-.043
-.012
-.041
.067
.182
-.002
-.014
.036
-.001
.050
.045
.097
-.060
.010
.031
.028
.050
.047
.080
-.032
-.007
.030
.005
.040
.016
.010
-.014
.023
.000
-.024
.004
-.020
-.065
-.200 -.150 -.100 -.050 .000 .050 .100 .150 .200
Age of Acquisition
Concreteness
Dominance
Arousal
Happiness
#Syllables
Word Length
Openess Conscientiousness Extraversion Agreeableness Neuroticism
Correlation coefficients between paraphrase pair preference
and user group usage.
Linguistic Theories
.163
-.068
-.043
-.012
-.041
.067
.182
-.002
-.014
.036
-.001
.050
.045
.097
-.060
.010
.031
.028
.050
.047
.080
-.032
-.007
.030
.005
.040
.016
.010
-.014
.023
.000
-.024
.004
-.020
-.065
-.200 -.150 -.100 -.050 .000 .050 .100 .150 .200
Age of Acquisition
Concreteness
Dominance
Arousal
Happiness
#Syllables
Word Length
Openess Conscientiousness Extraversion Agreeableness Neuroticism
Correlation coefficients between paraphrase pair preference
and user group usage.
Take Aways
Stylistic difference between user groups have important
applicability
Paraphrase choice contains valuable information
Shed light on psycholinguistic theories
Potential way to generate text perceived to be from a
different user trait
See our EMNLP 2017 paper (Preot¸iuc-Pietro, Guntuku, Ungar - Controlling
Human Perception of Basic User Traits)
Thank you!
Questions?

More Related Content

Similar to Daniel Preotiuc-Pietro - 2017 - Personality Driven Differences in Paraphrase Preference

601-CriticalEssay-2-Portfolio Edition
601-CriticalEssay-2-Portfolio Edition601-CriticalEssay-2-Portfolio Edition
601-CriticalEssay-2-Portfolio Edition
Jordan Chapman
 
College of Doctoral StudiesExpanded Comparison.docx
                College of Doctoral StudiesExpanded Comparison.docx                College of Doctoral StudiesExpanded Comparison.docx
College of Doctoral StudiesExpanded Comparison.docx
joyjonna282
 
Nikolay Karpov - Single-sentence readability prediction in russian
Nikolay Karpov - Single-sentence readability prediction in russianNikolay Karpov - Single-sentence readability prediction in russian
Nikolay Karpov - Single-sentence readability prediction in russian
AIST
 

Similar to Daniel Preotiuc-Pietro - 2017 - Personality Driven Differences in Paraphrase Preference (20)

Franz et al evol 2016 representing phylogeny as a logically tractable variable
Franz et al evol 2016 representing phylogeny as a logically tractable variable  Franz et al evol 2016 representing phylogeny as a logically tractable variable
Franz et al evol 2016 representing phylogeny as a logically tractable variable
 
Franz et al evol 2016 aligning multipe incongruent phylogenies with the euler...
Franz et al evol 2016 aligning multipe incongruent phylogenies with the euler...Franz et al evol 2016 aligning multipe incongruent phylogenies with the euler...
Franz et al evol 2016 aligning multipe incongruent phylogenies with the euler...
 
Application Of An Automatic Plagiarism Detection System In A Large-Scale Asse...
Application Of An Automatic Plagiarism Detection System In A Large-Scale Asse...Application Of An Automatic Plagiarism Detection System In A Large-Scale Asse...
Application Of An Automatic Plagiarism Detection System In A Large-Scale Asse...
 
The Social Impact of NLP
The Social Impact of NLPThe Social Impact of NLP
The Social Impact of NLP
 
ugc list of approved journals 02 nov.pdf
ugc list of approved journals 02 nov.pdfugc list of approved journals 02 nov.pdf
ugc list of approved journals 02 nov.pdf
 
klafter PRAISE poster
klafter PRAISE posterklafter PRAISE poster
klafter PRAISE poster
 
601-CriticalEssay-2-Portfolio Edition
601-CriticalEssay-2-Portfolio Edition601-CriticalEssay-2-Portfolio Edition
601-CriticalEssay-2-Portfolio Edition
 
Rule-based Prosody Calculation for Marathi Text-to-Speech Synthesis
Rule-based Prosody Calculation for Marathi Text-to-Speech SynthesisRule-based Prosody Calculation for Marathi Text-to-Speech Synthesis
Rule-based Prosody Calculation for Marathi Text-to-Speech Synthesis
 
College of Doctoral StudiesExpanded Comparison.docx
                College of Doctoral StudiesExpanded Comparison.docx                College of Doctoral StudiesExpanded Comparison.docx
College of Doctoral StudiesExpanded Comparison.docx
 
Validity and reliability
Validity and reliabilityValidity and reliability
Validity and reliability
 
Analyzing Arguments during a Debate using Natural Language Processing in Python
Analyzing Arguments during a Debate using Natural Language Processing in PythonAnalyzing Arguments during a Debate using Natural Language Processing in Python
Analyzing Arguments during a Debate using Natural Language Processing in Python
 
DETECTING OXYMORON IN A SINGLE STATEMENT
DETECTING OXYMORON IN A SINGLE STATEMENTDETECTING OXYMORON IN A SINGLE STATEMENT
DETECTING OXYMORON IN A SINGLE STATEMENT
 
Data Mining in Rediology reports
Data Mining in Rediology reportsData Mining in Rediology reports
Data Mining in Rediology reports
 
Feb20 mayo-webinar-21feb2012
Feb20 mayo-webinar-21feb2012Feb20 mayo-webinar-21feb2012
Feb20 mayo-webinar-21feb2012
 
Subjective Probabilistic Knowledge Grading and Comprehension
Subjective Probabilistic Knowledge Grading and ComprehensionSubjective Probabilistic Knowledge Grading and Comprehension
Subjective Probabilistic Knowledge Grading and Comprehension
 
20120140506009
2012014050600920120140506009
20120140506009
 
Detecting Paraphrases in Marathi Language
Detecting Paraphrases in Marathi LanguageDetecting Paraphrases in Marathi Language
Detecting Paraphrases in Marathi Language
 
Nikolay Karpov - Single-sentence readability prediction in russian
Nikolay Karpov - Single-sentence readability prediction in russianNikolay Karpov - Single-sentence readability prediction in russian
Nikolay Karpov - Single-sentence readability prediction in russian
 
Possible Word Representation
Possible Word RepresentationPossible Word Representation
Possible Word Representation
 
Ricardo santos poster iopi [rome14]
Ricardo santos poster iopi [rome14]Ricardo santos poster iopi [rome14]
Ricardo santos poster iopi [rome14]
 

More from Association for Computational Linguistics

More from Association for Computational Linguistics (20)

Muis - 2016 - Weak Semi-Markov CRFs for NP Chunking in Informal Text
Muis - 2016 - Weak Semi-Markov CRFs for NP Chunking in Informal TextMuis - 2016 - Weak Semi-Markov CRFs for NP Chunking in Informal Text
Muis - 2016 - Weak Semi-Markov CRFs for NP Chunking in Informal Text
 
Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...
Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...
Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...
 
Castro - 2018 - A Crowd-Annotated Spanish Corpus for Humour Analysis
Castro - 2018 - A Crowd-Annotated Spanish Corpus for Humour AnalysisCastro - 2018 - A Crowd-Annotated Spanish Corpus for Humour Analysis
Castro - 2018 - A Crowd-Annotated Spanish Corpus for Humour Analysis
 
Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...
Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...
Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...
 
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future DirectionsDaniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
 
Elior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
Elior Sulem - 2018 - Semantic Structural Evaluation for Text SimplificationElior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
Elior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
 
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future DirectionsDaniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
 
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
 
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
 

Recently uploaded

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
Chris Hunter
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
MateoGardella
 

Recently uploaded (20)

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 

Daniel Preotiuc-Pietro - 2017 - Personality Driven Differences in Paraphrase Preference