TEXT
PERSONALIZATION
By
Eng. Joud Khattab
MBTI PERSONALITY TEST
(MYERS-BRIGGS TYPE INDICATOR)
By Joud Khattab 2
“It’s so incredible to finally be understood”
MBTI FOUR FUNCTIONAL DICHOTOMIES
• Thinking (T) vs. Feeling (F)
• Extraversion (E) vs. Introversion (I)
• Sensing (S) vs. iNtuition (N)
• Judging (J) vs. Perceiving (P)
MBTI 16 PERSONALITY TYPES
Analysts
1. INTJ (Architect)
2. INTP (Logician)
3. ENTJ (Commander)
4. ENTP (Debater)
Diplomats
5. INFJ (Advocate)
6. INFP (Mediator)
7. ENFJ (Protagonist)
8. ENFP (Campaigner)
Sentinels
9. ISTJ (Logistician)
10. ISFJ (Defender)
11. ESTJ (Executive)
12. ESFJ (Consul)
Explorers
13. ISTP (Virtuoso)
14. ISFP (Adventurer)
15. ESTP (Entrepreneur)
16. ESFP (Entertainer)
WHY PERSONALITY PREDICTION?
• Areas directly affected by a user’s personality:
1. Marketing.
2. Recommendation systems.
3. Customized web pages, advertisements and products.
4. Customized search engines and user experience.
5. Understanding criminal and psychopathic behavior.
6. Sentiment analysis and clustering of text.
LITERATURE SURVEY
1) Understanding Personality through Social Media:
• Y. Wang et al. (2016), Department of Computer Science, Stanford University.
2) Detection of MBTI via Text Based Computer-Mediated Communication:
• D. Brinks et al. (2012), Department of Electrical Engineering, Stanford University.
3) Personality Traits on Twitter:
• B. Plank et al. (2015), Center for Language Technology, University of Copenhagen.
4) Identifying Personality Types Using Document Classification Methods:
• M. Komisin et al. (2012), Department of Computer Science, University of North Carolina Wilmington.
UNDERSTANDING PERSONALITY
THROUGH SOCIAL MEDIA
Y. Wang et al. (2016)
Department of Computer Science
Stanford University
DATA SET
(Y. WANG, 2016)
• Twitter dataset:
  • Collected through the GNIP APIs.
  • Around 90,000 users.
  • All personality-related tweets from 2006 to 2015 were extracted and filtered.
  • The most recent tweets of all 90,000 users.
  • 1.7 million tweets that contain the personality codes.
DATA CLEANING
(Y. WANG, 2016)
1. Positive tweets:
• @ProfCarol Just wondering, what’s your type? I’m an ENFJ
• @whitneyhess that’s an interesting test.. I got ENTP and it seems pretty accurate IMO
• @megfowler I’m INTP according to this http://similarminds.com/jung.html
2. Negative tweets:
• I’ll bet that Jeremiah @jowyang is an ESTJ
• @mark ENTJ You should have known... http://typelogic.com/entj.html
• I love my wife. Even though she’s INFP
• 120K tweets were retained out of the 1.7M tweets with personality codes.
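The positive and negative examples above suggest a first-person filter. The following is a minimal sketch, assuming that "positive" means the author reports their own type; the `SELF_REPORT` pattern and the `self_reported_type` helper are hypothetical illustrations, not the rules used by Wang et al.

```python
import re
from typing import Optional

# Any of the 16 four-letter MBTI codes.
MBTI_CODE = r"[EI][NS][TF][JP]"

# Hypothetical first-person cues ("I'm", "I am", "I got", "my type is").
# This is an assumed heuristic, not the filter from the paper.
SELF_REPORT = re.compile(
    rf"\b(?:i['’]?m|i am|i got|my type is)\s+(?:an?\s+)?({MBTI_CODE})\b",
    re.IGNORECASE,
)


def self_reported_type(tweet: str) -> Optional[str]:
    """Return the MBTI code the author claims for themselves, or None."""
    match = SELF_REPORT.search(tweet)
    return match.group(1).upper() if match else None
```

On the six example tweets above, this sketch accepts the three positive ones and rejects the three negative ones; a real filter would need many more patterns.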
SOCIAL MEDIA DATA DISADVANTAGES
(Y. WANG, 2016)
• Language on social media has richer content, which makes linguistic analysis tools perform poorly.
• Each tweet is limited to 140 characters and may contain hashtags, at-mentions, URLs and emoticons.
• People tend to use shortened versions of phrases, e.g. “iono” for “I don’t know”.
• Lack of conventional orthography.
• Collecting personality data is costly.
PERSONALITY DISTRIBUTION IN DATASET
(Y. WANG, 2016)
Analysts        Diplomats       Sentinels       Explorers
INTJ (12,247)   INFJ (12,885)   ISTJ (3,446)    ISTP (1,874)
INTP (7,446)    INFP (11,706)   ISFJ (3,267)    ISFP (2,492)
ENTJ (4,921)    ENFJ (6,812)    ESTJ (2,006)    ESTP (1,132)
ENTP (4,386)    ENFP (10,400)   ESFJ (2,364)    ESFP (2,164)
Sum = 89,548
FEATURE SELECTION
(Y. WANG, 2016)
1) Bag of n-grams.
2) Part-of-speech tags.
3) Word vectors.
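As an illustration of the first feature type, a bag of n-grams can be built by counting every run of 1 to n consecutive tokens. This is a minimal sketch; the `ngram_counts` name and the whitespace tokenization in the usage note are assumptions, not details from the paper.

```python
from collections import Counter


def ngram_counts(tokens, n_max=3):
    """Count all n-grams of length 1..n_max in a token list."""
    counts = Counter()
    for n in range(1, n_max + 1):
        # Slide a window of width n over the token list.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts
```

For example, `ngram_counts("i got entp".split())` yields three unigrams, two bigrams and one trigram.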
N-GRAM
(Y. WANG, 2016)
[Figure: top correlated unigrams for Thinking and for Feeling; top correlated bigrams for Introversion and for Extroversion]
POS TAGGING
(Y. WANG, 2016)
• A Twitter POS tagger with 25 distinct tag types was used.
• Common nouns are a good indicator of personality.
• People who use common nouns more often tend to be Extroversion, Intuition, Thinking, or Judging types.
• Introverted people use more pronouns but fewer common nouns.
• Interjections (“lol”, “haha”, “FTW”, “yea”) are more likely to be used by Sensing and Perceiving types.
• Emoticons are more likely to be used by Sensing and Feeling types.
• Numbers are more likely to be used by Sensing and Thinking types.
• Extroverted people are more likely to use hashtags.
WORD VECTORS
(Y. WANG, 2016)
1) Average word vectors:
• Average the vectors of all words that appear in a user’s tweets; the result is that user’s vector representation.
2) Weighted average word vectors:
• Take a weighted average of the vectors of the words in a user’s tweets, using their TF-IDF values as weights.
• The weighted vector is then used as that user’s vector representation.
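The two user representations can be sketched as follows. This is an illustration under stated assumptions: word vectors are given as a plain dict of lists, each user's tweets count as one document for IDF, and the "+ 1" IDF variant is chosen here for convenience; none of these details are confirmed by the paper.

```python
import math
from collections import Counter


def inverse_document_frequency(user_tokens):
    """IDF per word, treating each user's tweets as one document."""
    n_users = len(user_tokens)
    doc_freq = Counter()
    for tokens in user_tokens.values():
        doc_freq.update(set(tokens))
    # "+ 1" keeps words used by every user at weight 1 instead of 0
    # (an assumed variant, not taken from the paper).
    return {word: math.log(n_users / df) + 1.0 for word, df in doc_freq.items()}


def average_vector(tokens, word_vectors, idf=None):
    """Average the vectors of known words; weight by TF-IDF if idf is given."""
    term_freq = Counter(t for t in tokens if t in word_vectors)
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word, count in term_freq.items():
        # Plain average: weight = term frequency. Weighted: tf * idf.
        weight = count * (idf.get(word, 0.0) if idf else 1.0)
        weight_sum += weight
        for k, value in enumerate(word_vectors[word]):
            total[k] += weight * value
    if weight_sum == 0.0:
        return total  # no known words: return the zero vector
    return [x / weight_sum for x in total]
```

Calling `average_vector(tokens, word_vectors)` gives the plain average; passing `idf=inverse_document_frequency(user_tokens)` gives the TF-IDF weighted variant.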
MODEL SELECTION
(Y. WANG, 2016)
1. Logistic regression model with 10-fold cross-validation.
2. Random forest and SVM.
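The 10-fold cross-validation mentioned above partitions the users into ten folds and holds each one out in turn. The sketch below shows only the index bookkeeping; the `k_fold_indices` name and the unshuffled round-robin assignment are assumptions for illustration, and any classifier could be trained on each `train` split.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists, holding out each of k folds once."""
    # Round-robin assignment: sample i goes to fold i % k (no shuffling).
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test
```

The reported accuracy would then be the mean of the per-fold test accuracies.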
MODEL RESULTS
(Y. WANG, 2016)
Classifier                     E vs I   N vs S   T vs F   P vs J   Average
Word vector                    67.9%    64.3%    67.3%    60.8%    65.1%
Bag of n-grams                 63.1%    58.8%    62.1%    58.8%    60.7%
Unigram                        61.7%    58.1%    60.9%    58.2%    59.7%
Bigram                         60.9%    56.9%    60.7%    57.3%    59.0%
Trigram                        61.3%    56.7%    59.3%    57.0%    58.6%
POS tag                        59.3%    57.5%    60.3%    56.9%    58.5%
POS + n-grams                  62.8%    60.7%    63.3%    59.6%    61.6%
POS + n-grams + word vector    69.1%    65.3%    68.0%    61.9%    66.1%
DETECTION OF MBTI VIA TEXT BASED
COMPUTER-MEDIATED COMMUNICATION
D. Brinks et al. (2012)
Department of Electrical Engineering
Stanford University
DATA SET
(D. BRINKS, 2012)
• The Twitter API was used to retrieve tweets that include an MBTI abbreviation.
• 6,358 users with 960,715 tweets.
• Multiple levels of data elimination were applied to remove improper data.
DATA CLEANING
(D. BRINKS, 2012)
• Many users labeled “INTP” weren’t referencing their MBT; instead, they had simply misspelled “into”.
• Any user whose tweets contained two or more different MBTs was rejected.
• Numbers, links, @<user>, and MBTs were replaced with “NUMBER”, “URL”, “AT_USER”, and “MBT”.
• Contractions were replaced by their expanded form.
• Words were converted to lowercase.
• Finally, all of a user’s tweets were aggregated into a single text block.
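The substitution, expansion, lowercasing and aggregation steps above can be sketched as follows. The regular expressions, the tiny `CONTRACTIONS` table and the function names are hypothetical; the paper's exact rules and their order are not given in the slides.

```python
import re

# Tiny illustrative table; a real one would be much larger (assumption).
CONTRACTIONS = {"i'm": "i am", "don't": "do not", "it's": "it is", "can't": "cannot"}

URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"@\w+")
MBT_RE = re.compile(r"\b[ei][ns][tf][jp]\b")       # applied after lowercasing
NUMBER_RE = re.compile(r"\b\d+(?:[.,]\d+)*\b")


def normalize_tweet(tweet):
    """Lowercase a tweet and replace URLs, mentions, MBTI codes and numbers."""
    text = tweet.lower().replace("’", "'")
    # URLs first, because they may contain digits and "@".
    text = URL_RE.sub("URL", text)
    text = USER_RE.sub("AT_USER", text)
    text = MBT_RE.sub("MBT", text)
    text = NUMBER_RE.sub("NUMBER", text)
    # Expand contractions token by token (punctuation-attached forms are missed).
    return " ".join(CONTRACTIONS.get(word, word) for word in text.split())


def aggregate_user(tweets):
    """Join all of a user's normalized tweets into a single text block."""
    return " ".join(normalize_tweet(t) for t in tweets)
```

Lowercasing happens before substitution so that the uppercase placeholders stay distinguishable from ordinary words. The "into"/"INTP" misspelling check is not covered by this sketch.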
PERSONALITY DISTRIBUTION IN DATASET
(D. BRINKS, 2012)
Analysts      Diplomats     Sentinels     Explorers
INTJ (650)    INFJ (714)    ISTJ (183)    ISTP (105)
INTP (423)    INFP (449)    ISFJ (181)    ISFP (128)
ENTJ (279)    ENFJ (336)    ESTJ (101)    ESTP (95)
ENTP (237)    ENFP (448)    ESFJ (151)    ESFP (122)
Sum = 4,602
PROCESSING PARAMETERIZATION
(D. BRINKS, 2012)
1) Porter stemming.
2) Emoticon substitution.
3) Minimum token frequency.
4) Minimum user frequency.
5) Term frequency transform.
6) Inverse document frequency transform.
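Parameters 3 and 4 can be read as vocabulary cut-offs. The sketch below assumes that "minimum token frequency" is a threshold on a token's total count across the corpus and "minimum user frequency" a threshold on the number of distinct users who use it; both readings, and the `build_vocabulary` name, are assumptions rather than definitions from the paper.

```python
from collections import Counter


def build_vocabulary(user_tokens, min_token_freq=2, min_user_freq=2):
    """Keep tokens that are frequent enough overall and across users."""
    token_freq = Counter()   # total occurrences of each token
    user_freq = Counter()    # number of distinct users using each token
    for tokens in user_tokens.values():
        token_freq.update(tokens)
        user_freq.update(set(tokens))
    return sorted(
        t for t in token_freq
        if token_freq[t] >= min_token_freq and user_freq[t] >= min_user_freq
    )
```

Raising either threshold shrinks the feature set, which is one way to reduce the number of features fed to the classifier.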
TRAINING ACCURACY BY CLASSIFIER
(D. BRINKS, 2012)
Classifier                                          E vs I   N vs S   T vs F   P vs J   Average
Multinomial Event Model Naive Bayes                 96.0%    83.4%    84.6%    75.9%    85.0%
L2-regularized logistic regression (primal)         99.8%    99.8%    100.0%   99.8%    99.9%
L2-regularized L2-loss SV classification (dual)     99.8%    99.9%    99.9%    99.9%    99.9%
L2-regularized L2-loss SV classification (primal)   99.8%    99.9%    99.9%    99.9%    99.9%
L2-regularized L1-loss SV classification (dual)     99.9%    99.9%    99.9%    99.9%    99.9%
SV classification by Crammer and Singer             100.0%   100.0%   100.0%   100.0%   100.0%
L1-regularized L2-loss SV classification            100.0%   100.0%   100.0%   100.0%   100.0%
L1-regularized logistic regression                  99.9%    99.9%    99.8%    99.9%    99.9%
L2-regularized logistic regression (dual)           100.0%   100.0%   100.0%   100.0%   100.0%
HIGH VARIANCE SOLUTIONS
(D. BRINKS, 2012)
1. Get more data:
• Unfortunately, Twitter places a cap on data retrieval requests.
• Even after tripling the number of collected tweets, performance remained constant.
2. Decrease the feature set size:
• The preprocessing steps were modified.
• The number of features fed to the classifier was parameterized to determine the optimal feature set.
• Several transforms were added to the classifier.
• The algorithm was modified to use confidence metrics in its classification and instructed to decide only for users about which it had a strong degree of certainty.
• However, none of these options improved testing behavior to any significant degree.
PERFORMANCE BY CLASSIFIER
(D. BRINKS, 2012)
Classifier                                          E vs I   N vs S   T vs F   P vs J   Average
Multinomial Event Model Naive Bayes                 63.9%    74.6%    60.8%    58.5%    64.5%
L2-regularized logistic regression (primal)         60.3%    70.7%    59.4%    56.1%    61.6%
L2-regularized L2-loss SV classification (dual)     56.9%    67.5%    59.3%    54.1%    59.5%
L2-regularized L2-loss SV classification (primal)   58.8%    69.5%    59.0%    55.9%    61.0%
L2-regularized L1-loss SV classification (dual)     56.8%    67.6%    59.6%    54.5%    59.7%
SV classification by Crammer and Singer             56.8%    67.7%    59.4%    54.5%    59.6%
L1-regularized L2-loss SV classification            59.4%    68.3%    56.8%    56.1%    60.2%
L1-regularized logistic regression                  60.9%    70.5%    58.5%    56.3%    61.6%
L2-regularized logistic regression (dual)           59.2%    69.6%    59.0%    55.0%    60.7%
DATA PROBLEM
(D. BRINKS, 2012)
• The machine classifier did not achieve better performance because a large portion of tweets are noise with respect to MBTI:
  • Twitter imposes a 140-character limit on each tweet, so users are forced to express themselves succinctly.
  • A large percentage of tokens in tweets are not English words but Twitter handles being retweeted or URLs. Thus, while a user’s tweet set may contain a thousand tokens, a significant subset is unique to that individual user and cannot be used for correlation.
  • Due to retweeting, a user’s tweet may not express his or her own thoughts.
COMPARISON WITH HUMAN EXPERTS
(D. BRINKS, 2012)
Spectrum   Human 1   Human 2   MNEMNB
E vs I     50.0%     40.0%     55.0%
N vs S     50.0%     90.0%     90.0%
T vs F     80.0%     65.0%     55.0%
P vs J     60.0%     50.0%     65.0%
Average    60.0%     61.3%     66.3%
PERSONALITY TRAITS ON TWITTER
B. Plank et al. (2015)
Center for Language Technology
University of Copenhagen
DATA SET
(B. PLANK, 2015)
• Corpus of 1.2M tweets.
• 1,500 users who self-identify with an MBTI type.
• Open-source code and data set.
PERSONALITY DISTRIBUTION IN DATASET
(B. PLANK, 2015)
Analysts      Diplomats     Sentinels    Explorers
INTJ (193)    INFJ (257)    ISTJ (75)    ISTP (22)
INTP (111)    INFP (175)    ISFJ (77)    ISFP (51)
ENTJ (102)    ENFJ (106)    ESTJ (36)    ESTP (15)
ENTP (70)     ENFP (148)    ESFJ (36)    ESFP (26)
Sum = 1,500
MBTI DISTRIBUTION IN TWITTER CORPUS VS
GENERAL US POPULATION
(B. PLANK, 2015)
[Figure: bar chart of the MBTI distribution in the Twitter corpora vs the general US population, one group of bars per type (ISTP, ESFP, ESFJ, ESTJ, ESTP, ENFJ, ENTJ, ISTJ, ISFP, ENTP, ISFJ, INTP, ENFP, INFJ, INFP, INTJ); series: US Population, Paper 3, Paper 2, Paper 1; axis range 0 to 18]
CLASSIFIER
(B. PLANK, 2015)
Accuracy for four discrimination tasks
Classifier   E vs I   N vs S   T vs F   P vs J   Average
Majority     64.1%    77.5%    58.4%    58.8%    64.7%
System       72.5%    77.4%    61.2%    55.4%    66.6%
Prediction performance for four discrimination tasks, controlled for gender
Classifier   E vs I   N vs S   T vs F   P vs J   Average
Majority     64.9%    79.6%    51.8%    59.4%    63.9%
System       72.1%    79.5%    54.0%    58.2%    66.0%
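The "Majority" rows can be related to the per-type counts listed for the Plank et al. corpus. The sketch below assumes that "Majority" means always predicting the more frequent pole of each dichotomy; the `majority_baseline` helper is hypothetical. Under that assumption the counts give 64.1%, 77.5%, 58.4% and 58.8%, matching the first "Majority" row.

```python
from collections import Counter

# Per-type user counts from the Plank et al. (2015) distribution slide.
COUNTS = {
    "INTJ": 193, "INTP": 111, "ENTJ": 102, "ENTP": 70,
    "INFJ": 257, "INFP": 175, "ENFJ": 106, "ENFP": 148,
    "ISTJ": 75, "ISFJ": 77, "ESTJ": 36, "ESFJ": 36,
    "ISTP": 22, "ISFP": 51, "ESTP": 15, "ESFP": 26,
}


def majority_baseline(counts, position):
    """Return (majority letter, accuracy) for one letter position 0..3."""
    totals = Counter()
    for code, n in counts.items():
        totals[code[position]] += n
    letter, n = totals.most_common(1)[0]
    return letter, n / sum(totals.values())


for position, task in enumerate(["E vs I", "N vs S", "T vs F", "P vs J"]):
    letter, accuracy = majority_baseline(COUNTS, position)
    print(f"{task}: always predict {letter}, accuracy {accuracy:.1%}")
```

A system is only informative on a dichotomy where it beats this baseline, which in the first table holds for E vs I and T vs F.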
PREDICTIVE FEATURES
(B. PLANK, 2015)
INTROVERT
• someone (91%)
• probably (89%)
• favorite (83%)
• stars (81%)
• b (81%)
• writing (78%)
• , the (77%)
• status count < 5000 (77%)
• lol (74%)
• but i (74%)
EXTROVERT
• pull (96%)
• mom (81%)
• travel (78%)
• don’t get (78%)
• when you’re (77%)
• posted (77%)
• #HASHTAG is (76%)
• comes to (72%)
• tonight ! (71%)
• join (69%)
THINKING
• must be (95%)
• drink (95%)
• red (91%)
• from the (89%)
• all the (88%)
• business (85%)
• to get a (81%)
• hope (81%)
• june (78%)
• their (77%)
FEELING
• out to (88%)
• difficult (87%)
• the most (85%)
• couldn’t (85%)
• me and (80%)
• in @USER (80%)
• wonderful (79%)
• what it (79%)
• trying to (79%)
• ! so (78%)
IDENTIFYING PERSONALITY TYPES USING
DOCUMENT CLASSIFICATION METHODS
M. Komisin et al. (2012)
Department of Computer Science
University of North Carolina Wilmington
DATA SET
(M. KOMISIN, 2012)
• Data collected as part of a graduate course:
  • Students took the MBTI Step II.
  • Students completed a Best Possible Future Self (BPFS) exercise.
  • Over 3 semesters, data was collected from 40 subjects.
• Best Possible Future Self (BPFS) writing exercise:
  • This essay contains elements of self-description, present and future, as well as various contexts.
  • “Think about your life in the future. Imagine everything has gone as well as it possibly could. You have succeeded in accomplishing all your life goals. Think of this as the realization of all your dreams. Now, write about it.”
• Many existing data sets consist of written essays, which usually contain highly canonical language, often on a specific topic.
• Such controlled settings inhibit the expression of individual traits much more than spontaneous language.
PREPROCESSING
(M. KOMISIN, 2012)
1. Word stemming.
2. Stop-word removal.
3. Multiple data smoothing techniques:
• Lidstone smoothing.
• Good-Turing smoothing.
• Witten-Bell smoothing.
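Of the smoothing techniques listed, Lidstone smoothing is the simplest to write down: a constant λ is added to every word count, giving P(w) = (count(w) + λ) / (N + λ·V) for a vocabulary of V words and N observed tokens. The sketch below is illustrative only; the function name and the default λ = 0.5 are assumptions, not values from the paper.

```python
from collections import Counter


def lidstone_probabilities(tokens, vocabulary, lam=0.5):
    """Lidstone-smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(t for t in tokens if t in vocabulary)
    # N + lambda * V: observed in-vocabulary tokens plus the added mass.
    denominator = sum(counts.values()) + lam * len(vocabulary)
    return {word: (counts[word] + lam) / denominator for word in vocabulary}
```

With λ = 1 this reduces to Laplace (add-one) smoothing; unseen words receive a small nonzero probability, which matters for a Naïve Bayes classifier trained on only 40 essays.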
MODEL SELECTION
(M. KOMISIN, 2012)
1. Naïve Bayes.
2. SVM.
3. Linguistic Inquiry and Word Count (LIWC).
LIWC FEATURES
(PENNEBAKER, 2001)
• STANDARD COUNTS:
  • Word count, words per sentence, type/token ratio, words captured, words longer than 6 letters, negations, assents, articles, prepositions, numbers.
  • Pronouns: 1st person singular, 1st person plural, total 1st person, total 2nd person, total 3rd person.
• PSYCHOLOGICAL PROCESSES:
  • Affective or emotional processes: positive emotions, positive feelings, optimism and energy, negative emotions, anxiety or fear, anger, sadness.
  • Cognitive processes: causation, insight, discrepancy, inhibition, tentative, certainty.
  • Sensory and perceptual processes: seeing, hearing, feeling.
  • Social processes: communication, other references to people, friends, family, humans.
• RELATIVITY:
  • Time, past tense verb, present tense verb, future tense verb.
  • Space: up, down, inclusive, exclusive.
  • Motion.
• PERSONAL CONCERNS:
  • Occupation: school, work and job, achievement.
  • Leisure activity: home, sports, television and movies, music.
  • Money and financial issues.
  • Metaphysical issues: religion, death.
  • Physical states and functions: body states and symptoms, sexuality, eating and drinking, sleeping, grooming.
• OTHER DIMENSIONS:
  • Punctuation: period, comma, colon, semi-colon, question, exclamation, dash, quote, apostrophe, parenthesis, other.
  • Swear words, nonfluencies, fillers.
TEXT FEATURES OF BPFS ESSAYS
(M. KOMISIN, 2012)
Myers-Briggs preference   Word tokens   Unique words   Word tokens per document   Unique word types per document
Extraversion              10,428        1,859          401                        72
Introversion              5,275         1,140          377                        81
Sensing                   7,913         1,455          377                        69
Intuition                 7,790         1,594          410                        84
Thinking                  6,879         1,348          362                        71
Feeling                   8,824         1,685          420                        80
Judging                   6,210         1,389          388                        87
Perceiving                9,493         1,649          396                        69
TEXT FEATURES OF BPFS ESSAYS AFTER
PORTER AND STOP-WORD FILTERING
(M. KOMISIN, 2012)
Myers-Briggs preference   Word tokens   Unique words   Word tokens per document   Unique word types per document
Extraversion              5,631         1,376          217                        53
Introversion              2,834         846            202                        60
Sensing                   4,335         1,067          206                        51
Intuition                 4,130         1,178          217                        62
Thinking                  3,718         1,015          196                        53
Feeling                   4,747         1,224          226                        58
Judging                   3,312         1,030          207                        64
Perceiving                5,153         1,207          215                        50
CLASSIFICATION RESULTS
(M. KOMISIN, 2012)
[Figure: summary of results with leave-one-out cross-validation, full sample (n = 40)]
[Figure: summary of results with leave-one-out cross-validation, reduced sample (n = 30) with the lowest clarity scores removed]
Research paper: Y. Wang, 2016
• Data set kind: Twitter dataset
• Data set size: 1.7M tweets for 90,000 users; 120K tweets after preprocessing
• Features and pre-processing: n-grams, POS tags, word vectors (average word vectors, weighted average word vectors)
• Prediction models: logistic regression (10-fold cross-validation), random forest, SVM
• Evaluation metrics: highest average is 66.1% for combined features
Research paper: D. Brinks, 2012
• Data set kind: Twitter dataset
• Data set size: 960K tweets for 6,000 users
• Features and pre-processing: Porter stemming, emoticon substitution, minimum token frequency, minimum user frequency, term frequency transform, inverse document frequency transform
• Prediction models: Naïve Bayes, multi-variate event model, confidence metrics, SVM, logistic regression
• Evaluation metrics: highest average is 64.5%
Research paper: B. Plank, 2015
• Data set kind: Twitter dataset
• Data set size: 1.2M tweets for 1,500 users
• Features and pre-processing: gender, n-grams, count statistics, tweets count, followers, statuses, favorites
• Prediction models: logistic regression
• Evaluation metrics: highest average is 66.6% (T–F predicted with high reliability, while others are very hard to model)
Research paper: M. Komisin, 2012
• Data set kind: MBTI test and BPFS exercise
• Data set size: 4800 text
• Features and pre-processing: specific word choices, semantic category words; Porter stemming, stop-word removal, smoothing techniques
• Prediction models: Naïve Bayes, SVM, LIWC
• Evaluation metrics: highest average is 65%
RESEARCH GAP
• Twitter vs. document.
• Language on social media has richer content, which makes linguistic analysis tools perform poorly.
• Each tweet is limited to 140 characters and may contain hashtags, at-mentions, URLs and emoticons.
• Due to retweeting, a user’s tweet may not express his or her own thoughts.
• The stop-word removal problem.
• Collecting personality data is costly.
• The MBTI distribution on Twitter, as discussed in the fourth paper.
PROPOSED WORK
1. Research
2. Data Collection: Twitter corpus, letter corpus, text corpus
3. Data Cleaning
4. Data Preprocessing: Snowball stemmer, Porter stemmer, lemmatization, stop words, emoji
5. Model Selection: n-gram, POS tagger, Naïve Bayes
6. Validation
MODEL SELECTION (TEXT CORPUS)
1) NAÏVE BAYES
Pipeline: cleaned version → Naïve Bayes → gain function for every two letters
Data set   E / I     T / F       S / N
50 / 20    0.6       0.95        0.525
70 / 30    0.5 (↓)   0.96 (↑)    0.616 (↑)
Pipeline: cleaned version → stop words → Naïve Bayes → gain
Data set   E / I     T / F       S / N
50 / 20    0.6       0.975       0.525
70 / 30    0.5 (↓)   0.983 (↑)   0.57 (↑)
Pipeline: cleaned version → Snowball stemmer → Naïve Bayes → gain
Data set   E / I     T / F       S / N
50 / 20    0.6       0.975       0.525
70 / 30    0.5 (↓)   0.967 (↑)   0.583 (↑)
MODEL SELECTION (LETTER CORPUS)
2) N-GRAM
1. cleaned version → 1-gram → first 20%
2. cleaned version → 2-gram → first 20%
3. cleaned version → 3-gram → first 20%
4. cleaned version → Snowball stemmer → 1-gram → first 20%
5. cleaned version → Snowball stemmer → 2-gram → first 20%
6. cleaned version → Snowball stemmer → 3-gram → first 20%
7. cleaned version → stop words → 1-gram → first 20%
8. cleaned version → stop words → 2-gram → first 20%
9. cleaned version → stop words → 3-gram → first 20%
MODEL SELECTION (TWITTER CORPUS)
3) POS TAGGING
THANK YOU

74nqk8xf
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Enterprise Wired
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 

Recently uploaded (20)

Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 

Personality Detection via MBTI Test

  • 2. MBTI PERSONALITY TEST (MYERS-BRIGGS TYPE INDICATOR): “It’s so incredible to finally be understood”
  • 3. MBTI FOUR FUNCTIONAL DICHOTOMIES
      • Extraversion / Introversion
      • Sensing / iNtuition
      • Thinking / Feeling
      • Judging / Perceiving
  • 5. WHY PERSONALITY PREDICTION? Areas that are directly affected by a user’s personality:
      1. Marketing.
      2. Recommendation systems.
      3. Customized web pages, advertisements, and products.
      4. Customized search engines and user experience.
      5. Understanding criminal and psychopathic behaviors.
      6. Sentiment analysis and clustering of text.
  • 6. LITERATURE SURVEY
      1) Understanding Personality through Social Media: Y. Wang et al. (2016), Department of Computer Science, Stanford University.
      2) Detection of MBTI via Text Based Computer-Mediated Communication: D. Brinks et al. (2012), Department of Electrical Engineering, Stanford University.
      3) Personality Traits on Twitter: B. Plank et al. (2015), Center for Language Technology, University of Copenhagen.
      4) Identifying Personality Types Using Document Classification Methods: M. Komisin et al. (2012), Department of Computer Science, University of North Carolina Wilmington.
  • 7. UNDERSTANDING PERSONALITY THROUGH SOCIAL MEDIA: Y. Wang et al. (2016), Department of Computer Science, Stanford University.
  • 8. DATA SET (Y. WANG, 2016)
      • Twitter dataset collected through the GNIP APIs.
      • Around 90,000 users.
      • All personality-related tweets from 2006 to 2015 were extracted and filtered.
      • The most recent tweets of all 90,000 users.
      • 1.7 million tweets that contain the personality codes.
  • 9. DATA CLEANING (Y. WANG, 2016)
      1. Positive tweets (the author states their own type):
          • @ProfCarol Just wondering, what’s your type? I’m an ENFJ
          • @whitneyhess that’s an interesting test.. I got ENTP and it seems pretty accurate IMO
          • @megfowler I’m INTP according to this http://similarminds.com/jung.html
      2. Negative tweets (the type code refers to someone or something else):
          • I’ll bet that Jeremiah @jowyang is an ESTJ
          • @mark ENTJ You should have known... http://typelogic.com/entj.html
          • I love my wife. Even though she’s INFP
      • 120K tweets were retained out of the 1.7M tweets with personality codes.
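The positive and negative examples on this slide differ mainly in whether the type code follows a first-person cue. The sketch below illustrates that idea with a regular expression. The cue list and the function name `self_reported_type` are assumptions made for illustration; the slides do not describe the filter that Wang et al. actually used.

```python
import re

# Four-letter MBTI code: one letter from each dichotomy.
MBTI = r"[EI][NS][TF][JP]"

# Hypothetical first-person cues ("I'm", "I am", "I got") directly followed
# by a type code. This pattern is an illustration, not the authors' filter.
SELF_REPORT = re.compile(
    rf"\b(?:i['’]?m|i am|i got)\s+(?:an?\s+)?({MBTI})\b",
    re.IGNORECASE,
)


def self_reported_type(tweet):
    """Return the self-reported MBTI code in a tweet, or None if there is none."""
    match = SELF_REPORT.search(tweet)
    return match.group(1).upper() if match else None
```

Applied to the six example tweets above, this heuristic returns a type for the three positive tweets and `None` for the three negative ones. A real filter would need many more cues and would still miss or misread some tweets.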
  • 10. SOCIAL MEDIA DATA DISADVANTAGE (Y. WANG, 2016)
      • Language on social media has richer content, which makes linguistic analysis tools perform poorly.
      • Each tweet is limited to 140 characters and may contain hashtags, at-mentions, URLs, and emoticons.
      • People tend to use shortened versions of phrases, e.g. “iono” for “I don’t know”.
      • Lack of conventional orthography.
      • Collecting personality data is costly.
  • 11. PERSONALITY DISTRIBUTION IN DATASET (Y. WANG, 2016)
      | Analysts      | Diplomats     | Sentinels    | Explorers    |
      |---------------|---------------|--------------|--------------|
      | INTJ (12,247) | INFJ (12,885) | ISTJ (3,446) | ISTP (1,874) |
      | INTP (7,446)  | INFP (11,706) | ISFJ (3,267) | ISFP (2,492) |
      | ENTJ (4,921)  | ENFJ (6,812)  | ESTJ (2,006) | ESTP (1,132) |
      | ENTP (4,386)  | ENFP (10,400) | ESFJ (2,364) | ESFP (2,164) |
      Sum: 89,548
  • 12. FEATURES SELECTION (Y. WANG, 2016)
      1) Bag of n-grams.
      2) Part-of-speech tags.
      3) Word vectors.
  • 13. N-GRAM (Y. WANG, 2016) [Figure panels: top correlated unigrams for Thinking; top correlated unigrams for Feeling; top correlated bigrams for Introversion; top correlated bigrams for Extroversion]
  • 14. POS TAGGING (Y. WANG, 2016)
      • A Twitter POS tagger with 25 distinct tag types was used.
      • Common nouns are a good indicator of personality.
      • People who use common nouns more often tend to be of the Extroversion, Intuition, Thinking, or Judging type.
      • Introverted people use more pronouns but fewer common nouns.
      • Interjections (“lol”, “haha”, “FTW”, “yea”) are more likely to be used by the Sensing and Perceiving types.
      • Emoticons are more likely to be used by the Sensing and Feeling types.
      • Numbers are more likely to be used by the Sensing and Thinking types.
      • Extroverted people are more likely to use hashtags.
  • 15. WORD VECTORS (Y. WANG, 2016)
      1) Average word vectors:
          • The vectors of all words that appear in a user’s tweets are averaged, and the result is used as the vector representation of that user.
      2) Weighted average word vectors:
          • The vectors of the words that appear in a user’s tweets are averaged with weights given by their TF-IDF values.
          • The weighted vector is then used as the vector representation of that user.
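The two user representations can be sketched in a few lines. This is a minimal illustration, assuming a toy `vectors` dictionary that maps each word to a list of floats and a smoothed TF-IDF weight; the slides do not state which embeddings or which TF-IDF variant the paper used, so both are assumptions.

```python
import math
from collections import Counter


def average_vector(tokens, vectors):
    """Plain average of the vectors of all known words in a user's tweets."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def tfidf_weighted_vector(tokens, corpus, vectors):
    """TF-IDF weighted average; `corpus` holds one token list per user."""
    n_users = len(corpus)
    # Document frequency: number of users whose tweets contain the word.
    df = Counter(t for doc in corpus for t in set(doc))
    tf = Counter(t for t in tokens if t in vectors)
    # Smoothed IDF (an assumed variant) keeps every weight positive.
    weights = {
        t: count * (math.log((1 + n_users) / (1 + df[t])) + 1)
        for t, count in tf.items()
    }
    total = sum(weights.values())
    if total == 0:
        return None
    dim = len(next(iter(vectors.values())))
    return [
        sum(w * vectors[t][i] for t, w in weights.items()) / total
        for i in range(dim)
    ]
```

With the weighted variant, words that few other users write pull the user vector more strongly than words that everyone uses.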
  • 16. MODEL SELECTION (Y. WANG, 2016)
      1. Logistic regression model with 10-fold cross-validation.
      2. Random forest and SVM.
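For readers unfamiliar with 10-fold cross-validation, the sketch below shows how the sample indices can be split into folds. It is a generic illustration, not the authors' code: the function name `k_fold_indices` is hypothetical, the split is not shuffled, and fitting the classifier on each fold is left out.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

The reported accuracy is then the mean of the per-fold test accuracies. In practice the samples would be shuffled before splitting.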
  • 17. MODEL RESULTS (Y. WANG, 2016)
      | Features                   | E vs I | N vs S | T vs F | P vs J | Average |
      |----------------------------|--------|--------|--------|--------|---------|
      | Word Vector                | 67.9%  | 64.3%  | 67.3%  | 60.8%  | 65.1%   |
      | Bag of n-grams             | 63.1%  | 58.8%  | 62.1%  | 58.8%  | 60.7%   |
      | Unigram                    | 61.7%  | 58.1%  | 60.9%  | 58.2%  | 59.7%   |
      | Bigram                     | 60.9%  | 56.9%  | 60.7%  | 57.3%  | 59.0%   |
      | Trigram                    | 61.3%  | 56.7%  | 59.3%  | 57.0%  | 58.6%   |
      | POS Tag                    | 59.3%  | 57.5%  | 60.3%  | 56.9%  | 58.5%   |
      | POS + n-grams              | 62.8%  | 60.7%  | 63.3%  | 59.6%  | 61.6%   |
      | POS + n-gram + Word Vector | 69.1%  | 65.3%  | 68.0%  | 61.9%  | 66.1%   |
  • 18. DETECTION OF MBTI VIA TEXT BASED COMPUTER-MEDIATED COMMUNICATION: D. Brinks et al. (2012), Department of Electrical Engineering, Stanford University.
  • 19. DATA SET (D. BRINKS, 2012)
      • The Twitter API was used to retrieve tweets that include an MBTI abbreviation.
      • 6,358 users with 960,715 tweets.
      • Multiple levels of data elimination were applied to remove improper data.
  • 20. DATA CLEANING (D. BRINKS, 2012)
      • Many users labeled “INTP” weren’t referencing their MBT; instead, they had simply misspelled “into”.
      • Any user whose tweets contained two or more different MBTs was rejected.
      • Numbers, links, @<user>, and MBTs were replaced with “NUMBER”, “URL”, “AT_USER”, and “MBT”.
      • Contractions were replaced by their expanded form.
      • Words were converted to lowercase.
      • Finally, all of a user’s tweets were aggregated into a single text block.
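The cleaning steps on this slide can be sketched as a small normalization routine. This is an illustration under stated assumptions: the regular expressions, the tiny `CONTRACTIONS` table, and the choice to keep the placeholder tokens in upper case are all hypothetical, since the slides give only the step names.

```python
import re

PLACEHOLDERS = {"NUMBER", "URL", "AT_USER", "MBT"}
# Tiny illustrative subset; a real table would be much larger.
CONTRACTIONS = {"i'm": "i am", "don't": "do not", "it's": "it is", "can't": "cannot"}
MBT_CODE = re.compile(r"\b[EI][NS][TF][JP]\b")  # upper-case codes only


def mentioned_types(tweets):
    """Distinct MBT codes in a user's tweets; more than one means reject the user."""
    return {code for tweet in tweets for code in MBT_CODE.findall(tweet)}


def normalize_tweet(text):
    text = text.replace("’", "'")
    text = re.sub(r"https?://\S+", " URL ", text)  # links first, so their digits survive
    text = re.sub(r"@\w+", " AT_USER ", text)
    text = MBT_CODE.sub(" MBT ", text)
    text = re.sub(r"\b\d+(?:[.,]\d+)*\b", " NUMBER ", text)
    tokens = []
    for tok in text.split():
        if tok in PLACEHOLDERS:
            tokens.append(tok)
        else:
            low = tok.lower()
            tokens.extend(CONTRACTIONS.get(low, low).split())
    return " ".join(tokens)


def aggregate_user(tweets):
    """Join all normalized tweets of one user into a single text block."""
    return " ".join(normalize_tweet(t) for t in tweets)
```

For example, `normalize_tweet("@megfowler I’m INTP according to this http://similarminds.com/jung.html")` gives `"AT_USER i am MBT according to this URL"`.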
  • 21. PERSONALITY DISTRIBUTION IN DATASET (D. BRINKS, 2012)
      | Analysts   | Diplomats  | Sentinels  | Explorers  |
      |------------|------------|------------|------------|
      | INTJ (650) | INFJ (714) | ISTJ (183) | ISTP (105) |
      | INTP (423) | INFP (449) | ISFJ (181) | ISFP (128) |
      | ENTJ (279) | ENFJ (336) | ESTJ (101) | ESTP (95)  |
      | ENTP (237) | ENFP (448) | ESFJ (151) | ESFP (122) |
      Sum: 4,602
  • 22. PROCESSING PARAMETERIZATION (D. BRINKS, 2012)
      1) Porter stemming.
      2) Emoticon substitution.
      3) Minimum token frequency.
      4) Minimum user frequency.
      5) Term frequency transform.
      6) Inverse document frequency transform.
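The two frequency thresholds in this list can be read as vocabulary filters. The sketch below assumes that "token frequency" means the total number of occurrences across all users and that "user frequency" means the number of users who wrote the token at least once; the slides do not define either term, and the function name and default thresholds are hypothetical.

```python
from collections import Counter


def build_vocabulary(user_docs, min_token_freq=2, min_user_freq=2):
    """Keep tokens that are frequent enough overall and across users.

    `user_docs` holds one token list per user (all tweets aggregated).
    """
    token_freq = Counter()
    user_freq = Counter()
    for tokens in user_docs:
        token_freq.update(tokens)      # every occurrence counts
        user_freq.update(set(tokens))  # each user counts once per token
    return {
        t for t in token_freq
        if token_freq[t] >= min_token_freq and user_freq[t] >= min_user_freq
    }
```

Raising either threshold shrinks the feature set, which is one of the levers a study can use against overfitting.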
  • 23. TRAINING ACCURACY BY CLASSIFIER (D. BRINKS, 2012)
      | Classifier                                        | E vs I | N vs S | T vs F | P vs J | Average |
      |---------------------------------------------------|--------|--------|--------|--------|---------|
      | Multinomial Event Model Naive Bayes               | 96.0%  | 83.4%  | 84.6%  | 75.9%  | 85.0%   |
      | L2-regularized logistic regression (primal)       | 99.8%  | 99.8%  | 100.0% | 99.8%  | 99.9%   |
      | L2-regularized L2-loss SV classification (dual)   | 99.8%  | 99.9%  | 99.9%  | 99.9%  | 99.9%   |
      | L2-regularized L2-loss SV classification (primal) | 99.8%  | 99.9%  | 99.9%  | 99.9%  | 99.9%   |
      | L2-regularized L1-loss SV classification (dual)   | 99.9%  | 99.9%  | 99.9%  | 99.9%  | 99.9%   |
      | SV classification by Crammer and Singer           | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%  |
      | L1-regularized L2-loss SV classification          | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%  |
      | L1-regularized logistic regression                | 99.9%  | 99.9%  | 99.8%  | 99.9%  | 99.9%   |
      | L2-regularized logistic regression (dual)         | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%  |
  • 24. HIGH VARIANCE SOLUTIONS (D. BRINKS, 2012)
      1. Get more data:
          • Unfortunately, Twitter places a cap on data retrieval requests.
          • Even after tripling the number of collected tweets, performance remained constant.
      2. Decrease the feature set size:
          • The preprocessing steps were modified.
          • The number of features fed to the classifier was parameterized to determine the optimal feature set.
          • Several transforms were added to the classifier.
          • The algorithm was modified to use confidence metrics in its classification and instructed to decide only for users about which it had a strong degree of certainty.
          • However, none of these options improved testing behavior to any significant degree.
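The confidence-metric idea, deciding only when the classifier is sufficiently certain, can be sketched as a thresholded decision rule. The threshold value, the label pair, and the function name below are assumptions for illustration; the slides do not say which confidence metric the authors used.

```python
def decide_with_confidence(p_first, labels=("E", "I"), threshold=0.8):
    """Return a label only if its estimated probability reaches the threshold.

    `p_first` is the classifier's probability for labels[0]; the function
    returns None (abstain) when neither side is certain enough.
    """
    if p_first >= threshold:
        return labels[0]
    if 1.0 - p_first >= threshold:
        return labels[1]
    return None
```

Accuracy is then measured only over users that receive a label, so a higher threshold trades coverage for precision.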
  • 25. PERFORMANCE BY CLASSIFIER (D. BRINKS, 2012)
      | Classifier                                        | E vs I | N vs S | T vs F | P vs J | Average |
      |---------------------------------------------------|--------|--------|--------|--------|---------|
      | Multinomial Event Model Naive Bayes               | 63.9%  | 74.6%  | 60.8%  | 58.5%  | 64.5%   |
      | L2-regularized logistic regression (primal)       | 60.3%  | 70.7%  | 59.4%  | 56.1%  | 61.6%   |
      | L2-regularized L2-loss SV classification (dual)   | 56.9%  | 67.5%  | 59.3%  | 54.1%  | 59.5%   |
      | L2-regularized L2-loss SV classification (primal) | 58.8%  | 69.5%  | 59.0%  | 55.9%  | 61.0%   |
      | L2-regularized L1-loss SV classification (dual)   | 56.8%  | 67.6%  | 59.6%  | 54.5%  | 59.7%   |
      | SV classification by Crammer and Singer           | 56.8%  | 67.7%  | 59.4%  | 54.5%  | 59.6%   |
      | L1-regularized L2-loss SV classification          | 59.4%  | 68.3%  | 56.8%  | 56.1%  | 60.2%   |
      | L1-regularized logistic regression                | 60.9%  | 70.5%  | 58.5%  | 56.3%  | 61.6%   |
      | L2-regularized logistic regression (dual)         | 59.2%  | 69.6%  | 59.0%  | 55.0%  | 60.7%   |
  • 26. DATA PROBLEM (D. BRINKS, 2012)
      • The machine classifier did not achieve better performance because a large portion of tweets is noise with respect to MBTI.
      • Twitter imposes a 140-character limit on each tweet, so users are forced to express themselves succinctly.
      • A large percentage of tokens in tweets are not English words but Twitter handles being retweeted or URLs. Thus, while a user’s tweet set may contain a thousand tokens, a significant subset is unique to that individual user and cannot be used for correlation.
      • Due to retweeting, a user’s tweet may not express his or her own thoughts.
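  • 27. COMPARISON WITH HUMAN EXPERTS (D. BRINKS, 2012)
      | Spectrum | Human 1 | Human 2 | MNEMNB (Multinomial Event Model Naive Bayes) |
      |----------|---------|---------|----------------------------------------------|
      | E vs I   | 50.0%   | 40.0%   | 55.0%                                        |
      | N vs S   | 50.0%   | 90.0%   | 90.0%                                        |
      | T vs F   | 80.0%   | 65.0%   | 55.0%                                        |
      | P vs J   | 60.0%   | 50.0%   | 65.0%                                        |
      | Average  | 60.0%   | 61.3%   | 66.3%                                        |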
  • 28. PERSONALITY TRAITS ON TWITTER: B. Plank et al. (2015), Center for Language Technology, University of Copenhagen.
  • 29. DATA SET (B. PLANK, 2015)
      • Corpus of 1.2M tweets.
      • 1,500 users who self-identify with an MBTI type.
      • Open-source code and data set.
  • 30. PERSONALITY DISTRIBUTION IN DATASET (B. PLANK, 2015)
      | Analysts   | Diplomats  | Sentinels | Explorers |
      |------------|------------|-----------|-----------|
      | INTJ (193) | INFJ (257) | ISTJ (75) | ISTP (22) |
      | INTP (111) | INFP (175) | ISFJ (77) | ISFP (51) |
      | ENTJ (102) | ENFJ (106) | ESTJ (36) | ESTP (15) |
      | ENTP (70)  | ENFP (148) | ESFJ (36) | ESFP (26) |
      Sum: 1,500
  • 31. MBTI DISTRIBUTION IN TWITTER CORPUS VS GENERAL US POPULATION (B. PLANK, 2015)
  • 32. [Bar chart: MBTI distribution in Twitter corpus vs general US population. Series: US Population, Paper 1, Paper 2, Paper 3. Types shown: ISTP, ESFP, ESFJ, ESTJ, ESTP, ENFJ, ENTJ, ISTJ, ISFP, ENTP, ISFJ, INTP, ENFP, INFJ, INFP, INTJ. Value axis from 0 to 18.]
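  • 33. CLASSIFIER (B. PLANK, 2015)
      Accuracy for four discrimination tasks:
      | Classifier | E vs I | N vs S | T vs F | P vs J | Average |
      |------------|--------|--------|--------|--------|---------|
      | Majority   | 64.1%  | 77.5%  | 58.4%  | 58.8%  | 64.7%   |
      | System     | 72.5%  | 77.4%  | 61.2%  | 55.4%  | 66.6%   |
      Prediction performance for four discrimination tasks, controlled for gender:
      | Classifier | E vs I | N vs S | T vs F | P vs J | Average |
      |------------|--------|--------|--------|--------|---------|
      | Majority   | 64.9%  | 79.6%  | 51.8%  | 59.4%  | 63.9%   |
      | System     | 72.1%  | 79.5%  | 54.0%  | 58.2%  | 66.0%   |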
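The "Majority" row of the first table is the accuracy of always predicting the more frequent pole of each dichotomy. As a check, the sketch below recomputes it from the per-type user counts on slide 30; the helper name `majority_baseline` is hypothetical.

```python
from collections import Counter

# Per-type user counts from slide 30 (B. Plank, 2015); they sum to 1,500.
COUNTS = {
    "INTJ": 193, "INTP": 111, "ENTJ": 102, "ENTP": 70,
    "INFJ": 257, "INFP": 175, "ENFJ": 106, "ENFP": 148,
    "ISTJ": 75, "ISFJ": 77, "ESTJ": 36, "ESFJ": 36,
    "ISTP": 22, "ISFP": 51, "ESTP": 15, "ESFP": 26,
}


def majority_baseline(position, counts=COUNTS):
    """Return (majority letter, accuracy in %) for one letter position 0..3."""
    by_letter = Counter()
    for mbti_type, n in counts.items():
        by_letter[mbti_type[position]] += n
    letter, n = by_letter.most_common(1)[0]
    return letter, round(100 * n / sum(counts.values()), 1)
```

The four positions give I at 64.1%, N at 77.5%, F at 58.4%, and J at 58.8%, which matches the Majority row. It also shows that the system beats the baseline clearly only on E vs I and T vs F.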
  • 34. PREDICTIVE FEATURES (B. PLANK, 2015)
      • INTROVERT: someone (91%); probably (89%); favorite (83%); stars (81%); b (81%); writing (78%); “, the” (77%); status count < 5000 (77%); lol (74%); but i (74%)
      • EXTROVERT: pull (96%); mom (81%); travel (78%); don’t get (78%); when you’re (77%); posted (77%); #HASHTAG is (76%); comes to (72%); tonight ! (71%); join (69%)
      • THINKING: must be (95%); drink (95%); red (91%); from the (89%); all the (88%); business (85%); to get a (81%); hope (81%); june (78%); their (77%)
      • FEELING: out to (88%); difficult (87%); the most (85%); couldn’t (85%); me and (80%); in @USER (80%); wonderful (79%); what it (79%); trying to (79%); ! so (78%)
  • 35. IDENTIFYING PERSONALITY TYPES USING DOCUMENT CLASSIFICATION METHODS: M. Komisin et al. (2012), Department of Computer Science, University of North Carolina Wilmington.
  • 36. DATA SET (M. KOMISIN, 2012)
      • Data collected as part of a graduate course:
          • Students took the MBTI Step II.
          • Students completed a Best Possible Future Self (BPFS) exercise.
          • Over 3 semesters, data was collected from 40 subjects.
      • Best Possible Future Self (BPFS) writing exercise:
          • This essay contains elements of self-description, present and future, as well as various contexts.
          • “Think about your life in the future. Imagine everything has gone as well as it possibly could. You have succeeded in accomplishing all your life goals. Think of this as the realization of all your dreams. Now, write about it.”
      • Many existing data sets consist of written essays, which usually contain highly canonical language, often on a specific topic.
      • Such controlled settings inhibit the expression of individual traits much more than spontaneous language does.
  • 37. PREPROCESSING (M. KOMISIN, 2012)
      1. Word stemming.
      2. Stop-word removal.
      3. Multiple data smoothing techniques:
          • Lidstone smoothing.
          • Good-Turing smoothing.
          • Witten-Bell smoothing.
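Of the three smoothing techniques, Lidstone smoothing is the simplest to write down: add a constant λ to every word count, so that P(w) = (count(w) + λ) / (N + λ·|V|), where N is the total token count and |V| the vocabulary size. The sketch below is a generic illustration; the function name and the default λ are assumptions, and the slides do not state which λ the paper used.

```python
def lidstone_probability(word, counts, vocab_size, lam=0.5):
    """Lidstone-smoothed probability of `word` given per-class word counts.

    `counts` maps word -> count within one class; `vocab_size` is |V|,
    including words that never occur in this class.
    """
    total = sum(counts.values())
    return (counts.get(word, 0) + lam) / (total + lam * vocab_size)
```

With λ = 1 this is Laplace (add-one) smoothing. Unseen words receive a small non-zero probability, which keeps a Naïve Bayes product from collapsing to zero on the short essays used here.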
  • 38. MODEL SELECTION (M. KOMISIN, 2012)
      1. Naïve Bayes.
      2. SVM.
      3. Linguistic Inquiry and Word Count (LIWC).
  • 39. LIWC FEATURES (PENNEBAKER, 2001)
      • STANDARD COUNTS:
          • Word count, words per sentence, type/token ratio, words captured, words longer than 6 letters, negations, assents, articles, prepositions, numbers.
          • Pronouns: 1st person singular, 1st person plural, total 1st person, total 2nd person, total 3rd person.
      • PSYCHOLOGICAL PROCESSES:
          • Affective or emotional processes: positive emotions, positive feelings, optimism and energy, negative emotions, anxiety or fear, anger, sadness.
          • Cognitive processes: causation, insight, discrepancy, inhibition, tentative, certainty.
          • Sensory and perceptual processes: seeing, hearing, feeling.
          • Social processes: communication, other references to people, friends, family, humans.
  • 40. LIWC FEATURES (PENNEBAKER, 2001)
      • RELATIVITY:
          • Time, past tense verb, present tense verb, future tense verb.
          • Space: up, down, inclusive, exclusive.
          • Motion.
      • PERSONAL CONCERNS:
          • Occupation: school, work and job, achievement.
          • Leisure activity: home, sports, television and movies, music.
          • Money and financial issues.
          • Metaphysical issues: religion, death, physical states and functions, body states and symptoms, sexuality, eating and drinking, sleeping, grooming.
  • 41. LIWC FEATURES (PENNEBAKER, 2001)
      • OTHER DIMENSIONS:
          • Punctuation: period, comma, colon, semi-colon, question, exclamation, dash, quote, apostrophe, parenthesis, other.
          • Swear words, nonfluencies, fillers.
  • 42. TEXT FEATURES OF BPFS ESSAYS (M. KOMISIN, 2012)
      | Myers-Briggs Preference | Word Tokens | Unique Words | Word Tokens Per Document | Unique Word Types Per Document |
      |-------------------------|-------------|--------------|--------------------------|--------------------------------|
      | Extraversion            | 10,428      | 1,859        | 401                      | 72                             |
      | Introversion            | 5,275       | 1,140        | 377                      | 81                             |
      | Sensing                 | 7,913       | 1,455        | 377                      | 69                             |
      | Intuition               | 7,790       | 1,594        | 410                      | 84                             |
      | Thinking                | 6,879       | 1,348        | 362                      | 71                             |
      | Feeling                 | 8,824       | 1,685        | 420                      | 80                             |
      | Judging                 | 6,210       | 1,389        | 388                      | 87                             |
      | Perceiving              | 9,493       | 1,649        | 396                      | 69                             |
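Dividing each preference's total word tokens by its tokens per document gives a rough idea of how many essays fall on each side of a dichotomy. The sketch below does that arithmetic on the figures from this slide. The document counts are inferred here rather than stated on the slide, so they should be read as an estimate.

```python
# (word tokens, word tokens per document) from slide 42.
BPFS_STATS = {
    "Extraversion": (10428, 401), "Introversion": (5275, 377),
    "Sensing": (7913, 377), "Intuition": (7790, 410),
    "Thinking": (6879, 362), "Feeling": (8824, 420),
    "Judging": (6210, 388), "Perceiving": (9493, 396),
}


def implied_documents(stats=BPFS_STATS):
    """Estimate essays per preference as tokens / tokens-per-document."""
    return {pref: round(tokens / per_doc) for pref, (tokens, per_doc) in stats.items()}
```

Each pair of opposite preferences adds up to 40 essays, consistent with the 40 subjects reported for the data set, and the split is far from balanced (for example, about 26 Extraversion essays against 14 Introversion essays).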
  • 43. TEXT FEATURES OF BPFS ESSAYS AFTER PORTER STEMMING AND STOP-WORD FILTERING (M. KOMISIN, 2012)
      | Myers-Briggs Preference | Word Tokens | Unique Words | Word Tokens Per Document | Unique Word Types Per Document |
      |-------------------------|-------------|--------------|--------------------------|--------------------------------|
      | Extraversion            | 5,631       | 1,376        | 217                      | 53                             |
      | Introversion            | 2,834       | 846          | 202                      | 60                             |
      | Sensing                 | 4,335       | 1,067        | 206                      | 51                             |
      | Intuition               | 4,130       | 1,178        | 217                      | 62                             |
      | Thinking                | 3,718       | 1,015        | 196                      | 53                             |
      | Feeling                 | 4,747       | 1,224        | 226                      | 58                             |
      | Judging                 | 3,312       | 1,030        | 207                      | 64                             |
      | Perceiving              | 5,153       | 1,207        | 215                      | 50                             |
  • 44. CLASSIFICATION RESULTS (M. KOMISIN, 2012) [Figures: summary of results with leave-one-out cross-validation and sample size n = 40; summary of results with leave-one-out cross-validation and reduced sample size n = 30, with the lowest clarity scores removed]
  • 45. Summary of the surveyed research papers:
      • Y. Wang, 2016
          • Data set kind: Twitter dataset.
          • Data set size: 1.7M tweets for 90,000 users; 120K tweets after preprocessing.
          • Features and pre-processing: n-grams, POS tags, word vectors (average word vectors, weighted average word vectors).
          • Prediction models: logistic regression (10-fold cross-validation), random forest, SVM.
          • Evaluation metrics: highest average is 66.1% for combined features.
      • D. Brinks, 2012
          • Data set kind: Twitter dataset.
          • Data set size: 960K tweets for 6,000 users.
          • Features and pre-processing: Porter stemming, emoticon substitution, minimum token frequency, minimum user frequency, term frequency transform, inverse document frequency transform.
          • Prediction models: Naïve Bayes, multi-variate event model, confidence metrics, SVM, logistic regression.
          • Evaluation metrics: highest average is 64.5%.
      • B. Plank, 2015
          • Data set kind: Twitter dataset.
          • Data set size: 1.2M tweets for 1,500 users.
          • Features and pre-processing: gender, n-grams, count statistics, tweets count, followers, statuses, favorites.
          • Prediction models: logistic regression.
          • Evaluation metrics: highest average is 66.6% (T–F predicted with high reliability, while others are very hard to model).
      • M. Komisin, 2012
          • Data set kind: MBTI test and BPFS exercise.
          • Data set size: 4800 text.
          • Features and pre-processing: specific word choices, semantic categories of words, Porter stemming, stop-word removal, smoothing techniques.
          • Prediction models: Naïve Bayes, SVM, LIWC.
          • Evaluation metrics: highest average is 65%.
  • 46. RESEARCH GAP
      • Twitter vs. documents.
      • Language on social media has richer content, which makes linguistic analysis tools perform poorly.
      • Each tweet is limited to 140 characters and may contain hashtags, at-mentions, URLs, and emoticons.
      • Due to retweeting, a user’s tweet may not express his or her own thoughts.
      • The stop-word removal problem.
      • Collecting personality data is costly.
      • The MBTI distribution on Twitter, as discussed in the third paper (B. Plank, 2015).
  • 47. PROPOSED WORK [Pipeline diagram, stages in order:]
      1. Research.
      2. Data collection: Twitter corpus, letter corpus, text corpus.
      3. Data cleaning.
      4. Data preprocessing: Snowball stemmer, Porter stemmer, lemmatization, stop words, emoji.
      5. Model selection: n-gram, POS tagger, Naïve Bayes.
      6. Validation.
  • 48. MODEL SELECTION (TEXT CORPUS): NAÏVE BAYES
      | Data Set                                                            |         | E / I | T / F   | S / N   |
      |---------------------------------------------------------------------|---------|-------|---------|---------|
      | cleaned version → naive bayes → gain function for every two letters | 50 / 20 | 0.6   | 0.95    | 0.525   |
      |                                                                     | 70 / 30 | 0.5 ↓ | 0.96 ↑  | 0.616 ↑ |
      | cleaned version → stop word → naive bayes → gain                    | 50 / 20 | 0.6   | 0.975   | 0.525   |
      |                                                                     | 70 / 30 | 0.5 ↓ | 0.983 ↑ | 0.57 ↑  |
      | cleaned version → snow stemmer → naive bayes → gain                 | 50 / 20 | 0.6   | 0.975   | 0.525   |
      |                                                                     | 70 / 30 | 0.5 ↓ | 0.967 ↑ | 0.583 ↑ |
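The Naïve Bayes step in these pipelines can be illustrated with a small multinomial classifier over token lists. This is a generic textbook sketch with add-α smoothing, not the implementation behind the numbers on this slide; the class name, the toy training data, and the omission of the "gain" step (which the slides do not define) are all assumptions.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over token lists with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {t for counts in self.word_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(n / n_docs)  # class prior
            for t in tokens:
                if t in self.vocab:  # ignore words never seen in training
                    p = (self.word_counts[label][t] + self.alpha) / (
                        total + self.alpha * len(self.vocab)
                    )
                    score += math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

One such binary classifier would be trained per dichotomy (E / I, T / F, S / N, and so on), each on the same cleaned, stemmed, or stop-word-filtered tokens.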
  • 49. MODEL SELECTION (LETTER CORPUS): N-GRAM
      1. cleaned version → 1-gram → first 20%
      2. cleaned version → 2-gram → first 20%
      3. cleaned version → 3-gram → first 20%
      4. cleaned version → snow stemmer → 1-gram → first 20%
      5. cleaned version → snow stemmer → 2-gram → first 20%
      6. cleaned version → snow stemmer → 3-gram → first 20%
      7. cleaned version → stop words → 1-gram → first 20%
      8. cleaned version → stop words → 2-gram → first 20%
      9. cleaned version → stop words → 3-gram → first 20%
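Each configuration above extracts n-grams of a fixed order and keeps the "first 20%". The sketch below shows n-gram extraction and one possible reading of that cut, namely keeping the most frequent fifth of the distinct n-grams. That interpretation, and both function names, are assumptions; the slides do not say how the 20% is selected.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_fraction(tokens, n, fraction=0.2):
    """Keep the most frequent `fraction` of distinct n-grams (at least one)."""
    counts = Counter(ngrams(tokens, n))
    keep = max(1, int(len(counts) * fraction)) if counts else 0
    return [gram for gram, _ in counts.most_common(keep)]
```

For the stemmed and stop-word variants, the same functions would simply be applied to the token list produced by the corresponding preprocessing step.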
  • 50. MODEL SELECTION (TWITTER CORPUS): POS TAGGING