Subconscious Crowdsourcing:
A Feasible Data Collection Mechanism for Mental
Disorder Detection on Social Media
Chun-Hao Chang, Elvis Saravia, and Yi-Shin Chen
Institute of Information Systems and Applications
National Tsing Hua University
Hsinchu, Taiwan 30013, R.O.C.
Email: { ccha97u, ellfae, yishin}@gmail.com
Introduction
➔ One in three people reports meeting the criteria for at least one form of
mental disorder at some point in their life.
➔ 16% of people in the US suffer from some form of mental disorder.
Mental disorders are the leading cause of disability worldwide.
➔ Problem: The majority of cases remain undetected, and diagnosis is
difficult.
➔ Solution: Social networks provide a venue for mental disorder research.
Source: Wikipedia
Background
Bipolar Disorder:
- Unstable and impulsive emotions
- Cycling between mania and depression
Borderline Personality Disorder:
- Unstable and impulsive emotions
- Impaired social interactions
Motivation
➔ Open access to patients' data
from social websites.
➔ Build a real-time mental health
assessment tool to assist in
diagnosis.
Related Work
➔ Predicting Depression via Social Media - Microsoft (M De Choudhury, M
Gamon, S Counts, E Horvitz - ICWSM, 2013)
1. Collected data using a crowdsourcing platform, Amazon Mechanical Turk.
2. Purchased Twitter data.
3. Predicted depression before diagnosis.
➔ Quantifying Mental Health Signals in Twitter - Johns Hopkins University
(Coppersmith, G., Dredze, M., & Harman, C., 2014)
1. Automatically identified patients by keyword matching (e.g., “I was diagnosed with X”).
2. Predicted 4 different kinds of mental disorders.
Limitation: The data are not easily accessible or reproducible.
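The keyword-matching step described for the second study can be illustrated with a regular expression over tweet text. This is only a sketch: the pattern, the `DISORDERS` list, and the `find_self_reports` helper are assumptions made for illustration, not the cited authors' code.

```python
import re

# Hypothetical list of disorder names to look for (an assumption for illustration).
DISORDERS = ["bipolar disorder", "borderline personality disorder", "depression", "ptsd"]

# Matches self-reports such as "I was diagnosed with bipolar disorder".
PATTERN = re.compile(
    r"\bI\s+(?:was|have\s+been|am)\s+diagnosed\s+with\s+("
    + "|".join(map(re.escape, DISORDERS))
    + r")\b",
    re.IGNORECASE,
)


def find_self_reports(tweets):
    """Return (tweet, disorder) pairs for tweets containing a diagnosis statement."""
    hits = []
    for text in tweets:
        match = PATTERN.search(text)
        if match:
            hits.append((text, match.group(1).lower()))
    return hits
```

A tweet that merely mentions a disorder without the first-person diagnosis phrase is not matched, which is the point of anchoring on "I was diagnosed with".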
Challenges
➔ How to identify online patients?
➔ How to efficiently collect patients' data?
➔ How to avoid selection bias: is the predictive model detecting patients with
mental illnesses or just people who talk about them?
Objectives
➔ To build predictive models for the purpose of mental disorder
detection.
➔ To extract features which alleviate the selection bias problem.
➔ To standardize features for mental disorder detection.
Methodology
Data Collection
➔ Subconscious crowdsourcing - a reliable and efficient mechanism to
gather patients' data. Community is the key element.
Therapist
Patients
Preprocessing
➔ Kept only Twitter accounts with more than 100 posts.
➔ Accounts with more than 50% of posts containing hyperlinks were also removed.
Purpose: Getting rid of spam accounts.
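The two filtering rules above can be sketched as follows. The account representation (a list of tweet texts per user), the URL regex, and the names `keep_account`, `MIN_POSTS`, and `MAX_LINK_RATIO` are assumptions for illustration. The thresholds come from the slide, and "50% hyperlinks" is interpreted here as the share of posts that contain a link.

```python
import re

MIN_POSTS = 100        # keep accounts with more than this many posts (from the slide)
MAX_LINK_RATIO = 0.5   # drop accounts where more than half the posts contain a link

URL_RE = re.compile(r"https?://\S+")  # simple URL matcher (an assumption)


def link_ratio(tweets):
    """Fraction of tweets that contain at least one hyperlink."""
    if not tweets:
        return 0.0
    with_link = sum(1 for t in tweets if URL_RE.search(t))
    return with_link / len(tweets)


def keep_account(tweets):
    """Apply both filters: enough posts, and not dominated by hyperlinks."""
    return len(tweets) > MIN_POSTS and link_ratio(tweets) <= MAX_LINK_RATIO


def filter_accounts(accounts):
    """accounts: dict mapping user id -> list of tweet texts."""
    return {user: tweets for user, tweets in accounts.items() if keep_account(tweets)}
```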
Feature Extraction
➔ Overall, we are interested in linguistic and behavioral
features.
➔ Information that reveals a user’s personality and behavior: emotion
transition, social interactions, age, gender, etc.
➔ TF-IDF, LIWC, and Pattern of Life Features
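As a rough illustration of the TF-IDF features, the sketch below builds TF-IDF vectors over word n-grams (here unigrams and bigrams) in plain Python, treating each user's concatenated tweets as one document. The whitespace tokenizer, the smoothed IDF formula, and all function names are assumptions; the slides do not specify the exact weighting used.

```python
import math
from collections import Counter


def ngrams(text, max_n=2):
    """Lowercase, split on whitespace, and return all n-grams up to max_n words."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, max_n + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


def tfidf(documents, max_n=2):
    """Return one {term: weight} dict per document.

    tf  = term count / total terms in the document
    idf = log((1 + N) / (1 + df)) + 1   (smoothed; an assumed variant)
    """
    counts = [Counter(ngrams(doc, max_n)) for doc in documents]
    # Document frequency: iterating a Counter yields each distinct term once.
    df = Counter(term for c in counts for term in c)
    n_docs = len(documents)
    vectors = []
    for c in counts:
        total = sum(c.values()) or 1
        vectors.append({
            term: (freq / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, freq in c.items()
        })
    return vectors


def top_terms(vector, k=3):
    """Highest-weighted terms in one document vector."""
    return sorted(vector, key=vector.get, reverse=True)[:k]
```

Terms that appear in many users' documents receive a lower weight than terms concentrated in a few, which is why community-specific hashtags and handles tend to rank highly.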
Features
➔ TF-IDF Model:
◆ Unigrams and bigrams
➔ LIWC (Linguistic Inquiry and Word Count):
◆ Thoughts, feelings, personality, and motivation
➔ Pattern of Life:
◆ Emotional scores, age, and gender
◆ Polarity features (negative ratio, positive ratio, positive combo,
negative combo, and flips ratio)
◆ Social features (tweeting frequency, mention ratio, frequent
mentions, and unique mentions)
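The polarity features listed under Pattern of Life can be sketched from a chronological sequence of per-tweet polarity labels. The slides name the features but do not define them, so the definitions below are assumptions: a "combo" is taken to be a run of at least `min_run` consecutive tweets with the same polarity, and a "flip" is a direct switch between positive and negative in adjacent tweets. The function name `polarity_features` is hypothetical.

```python
from itertools import groupby


def polarity_features(labels, min_run=2):
    """labels: chronological per-tweet polarity, +1 (positive), -1 (negative), 0 (neutral)."""
    n = len(labels)
    if n == 0:
        return {"positive_ratio": 0.0, "negative_ratio": 0.0,
                "positive_combo": 0, "negative_combo": 0, "flips_ratio": 0.0}

    # Runs of identical consecutive labels, e.g. [1, 1, -1] -> [(1, 2), (-1, 1)]
    runs = [(label, len(list(group))) for label, group in groupby(labels)]

    # A flip is a direct switch between positive and negative in adjacent tweets.
    flips = sum(1 for a, b in zip(labels, labels[1:]) if a * b == -1)

    return {
        "positive_ratio": labels.count(1) / n,
        "negative_ratio": labels.count(-1) / n,
        "positive_combo": sum(1 for label, size in runs if label == 1 and size >= min_run),
        "negative_combo": sum(1 for label, size in runs if label == -1 and size >= min_run),
        "flips_ratio": flips / (n - 1) if n > 1 else 0.0,
    }
```

Because these features describe how emotion changes over time rather than which words are used, they do not depend on whether a user explicitly talks about a disorder.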
Experiments: Data
Group              Users   Tweets   Average Tweets per User
Random Samples       548   796957                    1454.3
Bipolar Patients     278   347774                   1250.99
BPD Patients         203   225774                   1112.19
Bipolar Experts       11    14056                   1611.67
BPD Experts            9    19696                   1790.55
Experiments: Evaluation
➔ Three predictive models (Random Forest) for each mental disorder
◆ Pattern of Life Model
◆ TF-IDF Model
◆ LIWC Model
➔ Three experiments
◆ 10-Fold Cross Validation Test
◆ Selection Bias Test
◆ Limited Data Test
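The 10-fold cross-validation protocol can be sketched independently of the classifier. The example below is a minimal sketch under stated assumptions: `k_fold_indices`, `cross_validate`, and the toy majority-class classifier are hypothetical names and stand-ins, and accuracy is an assumed metric. In the study, the model plugged in would be a Random Forest trained on one of the three feature sets.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    if n_samples < k:
        raise ValueError("need at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(X, y, train_fn, predict_fn, k=10, seed=0):
    """Return the per-fold accuracy; each fold is held out exactly once."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for held_out in folds:
        test_set = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in test_set]
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict_fn(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Toy stand-in classifier (an assumption): always predicts the majority training label.
def train_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def predict_majority(model, x):
    return model
```

Averaging the returned per-fold scores gives a single cross-validated figure for each model.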
10-Fold Cross Validation
Pattern of Life 0.90
LIWC 0.91
TF-IDF 0.96
Pattern of Life 0.91
LIWC 0.90
TF-IDF 0.96
Selection Bias Test
Is the model detecting users who suffer from a mental
disorder or users who just talk about it?
Top TF-IDF terms:
Bipolar          BPD
mentalhealth     dbt
meds             feeling
blog             borderline
therapy          helps
anxiety          self harm
thoughts         psychiatrist
feel better      cpn
electroboyusa    disorder
health           bpdchat
bipolarblogger   depression
Data Limitation
What if a user has only a few tweets?
Conclusion
➔ We proposed an efficient and accessible mechanism for collecting
patients' data.
➔ We improved the Pattern of Life Model to produce better predictions.
➔ We addressed the selection bias problem, which prior work had not addressed.
Future work: Support more mental illnesses
Demonstration
