Semi-Supervised Approach to Monitoring Clinical Depressive Symptoms in Social Media

Slides from our ASONAM 2017 presentation on depression detection in social media. The full paper can be found in our library: http://knoesis.org/node/2867

  1. Modeling Social Behavior for Health care Utilization in Depression: Semi-Supervised Approach to Monitoring Clinical Depressive Symptoms in Social Media
     Amir Hossein Yazdavar∗, Hussein S. Al-Olimat∗, Monireh Ebrahimi∗, Goonmeet Bajaj∗, Tanvi Banerjee∗, Krishnaprasad Thirunarayan∗, Jyotishman Pathak∗∗ and Amit Sheth∗
     ∗ Kno.e.sis Research Center, Wright State University, OH, USA
     ∗∗ Division of Health Informatics, Cornell University, New York, NY, USA
     Project funded by: NIH R01 Grant #: MH105384-01A1. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
     rebrand.ly/depressionProject @knoesis_mdd
  2. The Importance of Studying Clinical Depression
     ● November 2011: "Teen Tweets Before Committing Suicide: The Importance of 'Cyber-helping'." http://www.adweek.com/digital/teen-tweets-before-comitting-suicide-the-importance-of-cyber-helping/
     ● September 2015: "Jim Carrey's Girlfriend: Her Last Tweet Before Committing Suicide 'Signing Off'." http://hollywoodlife.com/2015/09/29/jim-carrey-girlfriend-suicide-note-twitter-break-up/
     Clinical depression is one of the most common mental illnesses:
     ● 350 million people affected.
     ● 40 million adults in the USA, age 18 and older, suffered from depression.
     ● Only 1/3 of those suffering receive treatment.
     ● $42 billion spent on depression treatment in a year.
     ● Over 90% of people who commit suicide have been diagnosed with clinical depression.
  3. The Importance of Studying Clinical Depression
     ● In 2015, there were more than 44K reported suicide deaths.
     ● "Suicide claims more lives than war, murder, and natural disasters combined."
     ● American Foundation for Suicide Prevention:
       ○ Every 12 minutes a person dies by suicide in the US.
       ○ Every day, 121 Americans take their own life.
       ○ 500K people visited a hospital for injuries due to self-harm.
       ○ Suicide was the second leading cause of death for people between the ages of 10 and 34.
  4. Previous Efforts to Address Clinical Depression
     The global effort to manage depression involves detecting it through survey-based methods via phone or online questionnaires. These methods suffer from:
     ● Under-representation
     ● Sampling bias
     ● Incomplete information
     ● Large temporal gaps between data collection and dissemination of findings
     ● Cognitive bias
  5. Unique Opportunity to Study Clinical Depression
     Social media platforms are a valuable resource for learning about users' feelings, emotions, behaviors, and decisions that reflect their mental health as they experience ups and downs.
     ● How well can textual content in social media be harnessed to reliably capture the clinical depression symptoms of a user over time?
     ● Are there any underlying common themes among depressed users?
  6. Previous Efforts to Study Depression
     ● Supervised approach
       ○ Features: language style, emotion, ego-network, user engagement
       ○ Drawback: labor-intensive annotations
     ● Lexicon-based approach
       ○ Drawbacks: low recall, high dependency on the quality of the lexicon
  7. Previous Efforts to Study Depression
     Simply using the keyword "depression" to find depression-indicative tweets runs into three problems:
     ● Ambiguity: "economic depression", "great depression", "depression era", "tropical depression"
     ● Transient sadness: "I am depressed, my paper got rejected again"
     ● Context sensitivity: "...sleep forever..."
  8. Can we develop more specific evaluations rooted in current clinical protocols?
     ● According to the DSM (Diagnostic and Statistical Manual of Mental Disorders), clinical depression can be diagnosed through the presence of a set of symptoms over a period of time.
     ● Patient Health Questionnaire, 9 items (PHQ-9): a nine-item depression scale used by clinicians to screen, diagnose, and measure the severity of depression.
     ● PHQ-9 rates the frequency of each symptom, and the ratings are combined into a severity score.
     ● PHQ-9 is completed by the patient and scored by the clinician.
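As a minimal sketch of how PHQ-9 frequency ratings turn into a severity index, the snippet below sums nine item scores (each 0 to 3) and maps the total to the commonly published severity bands. The function name `phq9_severity` is hypothetical and not part of the slides; the bands are the standard PHQ-9 cut-offs, not something stated in this deck.

```python
from typing import Sequence, Tuple

# Commonly published PHQ-9 severity bands: (upper bound of total score, label).
SEVERITY_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
]


def phq9_severity(item_scores: Sequence[int]) -> Tuple[int, str]:
    """Sum the nine item scores (0-3 each) and return (total, severity label)."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 has exactly nine items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item is rated 0, 1, 2, or 3")
    total = sum(item_scores)
    for upper_bound, label in SEVERITY_BANDS:
        if total <= upper_bound:
            return total, label
    raise AssertionError("unreachable: total is always within 0-27")
```

For example, a respondent who rates every item "several days" (1) gets a total of 9, which falls in the mild band.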
  9. Top-down definition of depressive disorder
  10. PHQ-9 Symptoms in Tweets
      ● S3, Sleep disorder: "havent slept in 8 days.."; "why can't I sleep"
      ● S4, Lassitude: "So tired..always so tired."; "0 energy to do anything"; "just burst out in tears because I'm that tired :'( :'("
      ● S5, Obsessed with weight: "so depressed with my weight can I just want be skinny"; "94lbs, urgh I disgust myself."
  11. PHQ-9 Symptoms in Tweets
      Symptoms shown: Feeling bad about yourself (S6), Self harm (S9), Suicidal thought (S9).
      Example tweets:
      ● "I've never been so sure about suicide"
      ● "Cut again tonight, couldnt do it as bad as I have therepy tomorrow"
      ● "I feel like a failure"
      ● "I swear every night I tell myself 'one more cut and that's it' It's never just one more."
      ● "I'm so done. **** recovery. **** life. Just want to die."
      ● "Couldn't care less what happens to me anymore. What a sick life I lead.."
  12. How to Study Clinical Depression in Twitter?
      We built a model that emulates the PHQ-9 questionnaire to detect clinical depression symptoms in Twitter profiles. By analyzing a user's topic preferences and word usage, we can monitor depression symptoms:
      ● What are they talking about?
      ● How do they express themselves?
  13. How to Study Clinical Depression in Twitter?
      ● Bottom-up processing: LDA
        ○ A document is a mixture of latent topics, where a topic is a distribution of co-occurring words.
        ○ Problem: the learned topics were not granular and specific enough to correspond to depressive symptoms.
      ● Top-down processing: lexicon-based
        ○ Simply counts the number of depression-indicative terms, using a dictionary of terms for each depressive symptom.
        ○ Problem: ambiguity and context sensitivity.
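The top-down, lexicon-based counting described on this slide can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function name `count_symptom_terms` and the two-symptom toy lexicon are hypothetical, and underscores in lexicon entries are assumed to stand for spaces.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List


def count_symptom_terms(tweets: Iterable[str],
                        lexicon: Dict[str, List[str]]) -> Counter:
    """Count whole-phrase matches of each symptom's lexicon terms in the tweets."""
    counts: Counter = Counter()
    for tweet in tweets:
        text = tweet.lower()
        for symptom, terms in lexicon.items():
            for term in terms:
                # Assumption: "bad_sleep" in the lexicon means the phrase "bad sleep".
                phrase = term.replace("_", " ").lower()
                pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
                counts[symptom] += len(re.findall(pattern, text))
    return counts


# Toy lexicon (hypothetical subset) and tweets.
toy_lexicon = {
    "Sleep Disorder": ["bad_sleep", "awake"],
    "Lack of Energy": ["drained", "dog-tired"],
}
toy_tweets = [
    "So tired, so drained, so done",
    "Still awake after another bad sleep",
    "tropical depression heading north",
]
toy_counts = count_symptom_terms(toy_tweets, toy_lexicon)
```

Note that "So tired" is not counted because the toy lexicon has no matching entry, which illustrates why pure dictionary counting depends so heavily on lexicon quality.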
  14. How to Study Clinical Depression in Twitter? Hybrid Processing
      We add supervision to LDA by using terms that are strongly related to the nine depression symptoms as seeds of the topical clusters, guiding the model to aggregate semantically related terms into the same cluster.
  15. Generating Seed Terms
      Starting from the PHQ-9 symptoms, we built a lexicon of 1,620 depression-related symptom terms categorized into nine clinical depression symptom categories.
  16. Generating Seed Terms
      ● Lack of Interest: no_interest, no_motivation, no_desire, acedia, incuriosity
      ● Feeling Down: agony, all_torn_up, anguish, atrabilious, bad_day, be_alone
      ● Sleep Disorder: awake, bad_sleep, be_up_late, can't_fall_back_asleep
      ● Lack of Energy: did_nothing, dog-tired, don't_have_the_strength, drained
      ● Eating Disorder: as_big_as_a_mountain, as_fat_as_an_elephant, avoirdupois, binge
      ● Low Self-esteem: i_am_a_burden, abhorrence, am_useless, atelophobia
      ● Concentration Problems: absent_minded, absent-minded, absentminded, absorbed
      ● Hyper/Lower Activity: nervous, obtuse, overwrought, panic, panic_attack, panic-stricken
      ● Suicidal Thoughts: benumb, better_be_dead, better_off_death, bleed_to_death
  17. Generating Seed Terms: Challenges
      Social media users employ creative, descriptive, and metaphorical phrases to describe their symptoms.
      ● Lack of Energy: "I'm so exhausted all time"; "So tired, so drained, so done"
  18. Generating Seed Terms: Challenges
      The language of social media contains polysemous words:
      ● "Cut my finger opening a can of fruit"
      ● "scars don't heal when you keep cutting"
      Word Sense Disambiguation (WSD): we disambiguate a polysemous word based on the sentiment polarity of its enclosing sentence.
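The polarity-based disambiguation idea can be illustrated with a deliberately tiny sketch. Everything here is an assumption for illustration: the slides do not say which sentiment analyzer was used, so the hand-made `POSITIVE` and `NEGATIVE` word sets and the function names are hypothetical stand-ins for a real sentiment model.

```python
import re

# Hypothetical toy polarity word lists; a real system would use a sentiment model.
POSITIVE = {"love", "great", "happy", "fun"}
NEGATIVE = {"scars", "hate", "worthless", "alone", "hurt"}


def sentence_polarity(sentence: str) -> int:
    """Return (#positive words - #negative words) for the sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    positives = sum(1 for token in tokens if token in POSITIVE)
    negatives = sum(1 for token in tokens if token in NEGATIVE)
    return positives - negatives


def is_symptom_sense(sentence: str) -> bool:
    """Treat a polysemous term as symptom-related only if its sentence is negative."""
    return sentence_polarity(sentence) < 0
```

With these toy lists, "scars don't heal when you keep cutting" is scored as negative and kept as a self-harm signal, while "Cut my finger opening a can of fruit" is scored as neutral and treated as the literal sense.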
  19. Framework
      ● We divide each user's collection of preprocessed tweets into a set of tweet buckets using a specific time interval of d days.
      ● Φ is the distribution of words per symptom; θ is the distribution of symptoms over buckets.
      ● We restrict the topic s_i of each user-specific seed term to a single corresponding value.
      ● Each term w_i is then assigned to the symptom with the largest probability for it in Φ.
      Reference: Andrzejewski, D. and Zhu, X. (2009). Latent Dirichlet Allocation with Topic-in-Set Knowledge. NAACL 2009 Workshop on Semi-supervised Learning for NLP (NAACL-SSLNLP 2009).
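The bucketing step can be sketched as below. This is a minimal illustration, assuming consecutive non-overlapping windows of d days anchored at the user's earliest tweet; the slides do not specify how bucket boundaries are chosen, and the name `bucket_tweets` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Tuple


def bucket_tweets(tweets: List[Tuple[datetime, str]],
                  d: int = 14) -> Dict[int, List[str]]:
    """Group (timestamp, text) pairs into consecutive d-day buckets.

    Bucket 0 starts at the earliest tweet; empty buckets are simply absent.
    """
    if not tweets:
        return {}
    ordered = sorted(tweets, key=lambda pair: pair[0])
    start = ordered[0][0]
    buckets: Dict[int, List[str]] = defaultdict(list)
    for timestamp, text in ordered:
        index = (timestamp - start).days // d
        buckets[index].append(text)
    return dict(buckets)


example = bucket_tweets([
    (datetime(2017, 1, 1), "a"),
    (datetime(2017, 1, 10), "b"),
    (datetime(2017, 1, 15), "c"),
    (datetime(2017, 2, 20), "d"),
])
```

Each bucket then plays the role of one document for the topic model.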
  20. Framework
      Pipeline: user texts → buckets of 14 days of texts → run topic modeling, with seed topics from the lexicon used to change the Z-values in the multinomial distribution.
      Outputs:
      ● θ: distribution of symptoms (topics) over documents (buckets)
      ● Φ: distribution of words over symptoms (topics)
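The two mechanics named in the framework, restricting a seed term's topic and assigning each term to its most probable symptom, can be sketched as follows. This is an illustration in the spirit of topic-in-set knowledge, not the authors' implementation; the function names and the list-based representation of Φ are assumptions.

```python
from typing import Dict, List


def constrain_to_seed(word: str,
                      topic_probs: List[float],
                      seed_topic: Dict[str, int]) -> List[float]:
    """Return the distribution used to sample a topic (Z-value) for `word`.

    Seed words are forced onto their symptom's topic; other words keep
    their (normalized) model probabilities.
    """
    if word in seed_topic:
        forced = seed_topic[word]
        return [1.0 if k == forced else 0.0 for k in range(len(topic_probs))]
    total = sum(topic_probs)
    return [p / total for p in topic_probs]


def most_likely_symptom(phi: List[Dict[str, float]], word: str) -> int:
    """Return the index of the symptom (topic) with the largest p(word | symptom)."""
    return max(range(len(phi)), key=lambda k: phi[k].get(word, 0.0))
```

Inside a Gibbs sampler, `constrain_to_seed` would be applied each time a token's topic is resampled, so seed terms never drift away from their symptom.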
  21. Quantitative Analysis
      ● Topic coherence measures score a single topic by the degree of semantic similarity between the high-scoring words in that topic.
      ● UMass: based on D(wi, wj), the number of documents containing both words wi and wj, and D(wi), the number of documents containing wi.
      ● UCI: based on word probabilities calculated by counting word co-occurrences in a sliding window over an external dataset such as Wikipedia.
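A minimal sketch of a UMass-style coherence score is shown below. It assumes the commonly used form, the sum of log((D(wi, wj) + 1) / D(wi)) over ordered pairs of a topic's top words where wi is ranked above wj; the smoothing constant 1 and the function name `umass_coherence` are assumptions, since the exact formula is not reproduced in this transcript.

```python
import math
from typing import List, Set


def umass_coherence(top_words: List[str], documents: List[Set[str]]) -> float:
    """UMass-style coherence of one topic over a corpus of token sets.

    Assumes every word in `top_words` occurs in at least one document.
    """
    def doc_count(*words: str) -> int:
        # Number of documents containing all the given words.
        return sum(1 for doc in documents if all(w in doc for w in words))

    score = 0.0
    for j in range(1, len(top_words)):
        for i in range(j):
            w_i, w_j = top_words[i], top_words[j]  # w_i is ranked above w_j
            score += math.log((doc_count(w_i, w_j) + 1) / doc_count(w_i))
    return score


toy_docs = [{"tired", "sleep"}, {"tired", "drained"}, {"sleep"}]
```

Scores closer to zero indicate that a topic's top words tend to appear in the same documents.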
  22. Dataset
      ● We created a dataset containing 45,000 Twitter users who self-declared their depression and another 2,000 "undeclared" users collected randomly.
      ● What is self-declared depression?
  23. Dataset
      ● Data preparation
        ○ After removing the profiles with fewer than 100 tweets, we obtained 7,046 users with 21 million timestamped tweets (at most 3,200 per user).
        ○ Next, we randomly sampled 2,000 self-reported profiles and 2,000 random users.
      ● Text preprocessing
      Our model discovers depressive symptoms as latent topics from a sliding window over buckets of timestamped tweets posted by users.
  24. Qualitative Analysis
      [Table: sample of topics learned by the ssToT and LDA models, with p(w|s).]
  25. Qualitative Analysis
      Interesting observations:
      ● Captures acronyms related to each symptom
        ○ Symptom 5: "Mfp" for "More Food Please"; "Ugw" stands for "Ultimate Goal Weight"
        ○ Symptom 2: "idec" for "I Don't Even Care"
      ● Has excessive usage of expressive interjections
        ○ "aw" (indicative of disappointment), "feh" (indicative of feeling underwhelmed), "ew" (denoting disgust), "argh" (showing frustration)
      ● Has family and friends-related topics
        ○ family, hugs, attention, parents, competition, daddy, mums, sigh, grandma, losing
  26. Visualizing Depressive Symptoms
      ● We keep only the topics that contain at least a certain number (threshold) of seed terms among their dominant words.
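The threshold filter can be sketched in a few lines. This is an assumption-laden illustration: the slides do not give the threshold value or how many dominant words are inspected, so `threshold` and `top_n` are hypothetical parameters, as is the function name.

```python
from typing import Dict, List, Set


def keep_seeded_topics(topics: Dict[str, List[str]],
                       seed_terms: Set[str],
                       threshold: int,
                       top_n: int = 10) -> Dict[str, List[str]]:
    """Keep topics whose top_n ranked words include at least `threshold` seed terms."""
    kept: Dict[str, List[str]] = {}
    for name, ranked_words in topics.items():
        hits = sum(1 for word in ranked_words[:top_n] if word in seed_terms)
        if hits >= threshold:
            kept[name] = ranked_words
    return kept


toy_topics = {
    "t1": ["drained", "tired", "sleep"],
    "t2": ["family", "hugs", "drained"],
}
toy_kept = keep_seeded_topics(toy_topics, {"drained", "tired", "bad_sleep"}, threshold=2)
```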
  27. Quantitative Analysis
      ● Average coherence of different models vs. our model.
      ● The higher the coherence score, the more interpretable the topics are.
  28. Symptom Prediction (Multi-label Classification)
      ● We try to predict the correct set of labels (depressive symptoms) for each bucket of tweets.
      ● We built a ground truth dataset of 10,400 tweets in 192 buckets. Each bucket contains tweets posted by the user within a span of 14 days (in compliance with PHQ-9).
      ● Tweets are selected from a randomly sampled subset of both self-reported depressed users and random users.
      ● Three human judges manually annotated each tweet using the nine PHQ-9 categories as labels.
  29. Our Semi-supervised Approach vs. Supervised Approaches
      We test our method on symptom prediction (multi-label classification) against the supervised binary relevance (BR) and classifier chains (CC) methods.
      [Figure: F-score per PHQ-9 symptom (S1 to S9) for BR-MNB, CC-MNB, BR-SVM, CC-SVM, and ssToT.]
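A per-symptom F-score of the kind plotted in this comparison can be computed as sketched below. This is a generic illustration rather than the authors' evaluation script; it assumes each bucket's gold and predicted labels are sets of symptom identifiers, and the names `per_label_f1` and `macro_f1` are hypothetical.

```python
from typing import Dict, List, Set


def per_label_f1(y_true: List[Set[str]],
                 y_pred: List[Set[str]],
                 labels: List[str]) -> Dict[str, float]:
    """Compute the F-score of each label over paired gold/predicted label sets."""
    scores: Dict[str, float] = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
        denominator = 2 * tp + fp + fn
        # Convention: a label that never appears in gold or predictions scores 0.0.
        scores[label] = 2 * tp / denominator if denominator else 0.0
    return scores


def macro_f1(scores: Dict[str, float]) -> float:
    """Average the per-label F-scores."""
    return sum(scores.values()) / len(scores)


gold = [{"S3", "S4"}, {"S4"}, {"S9"}]
predicted = [{"S3"}, {"S4", "S3"}, set()]
toy_scores = per_label_f1(gold, predicted, ["S3", "S4", "S9"])
```

Averaging the per-label values with `macro_f1` gives a single summary number across the nine symptoms.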
  30. Limitations
      Our model works well with less descriptive symptoms. For a symptom such as Concentration Problems, tweets tend to be descriptive and metaphoric and do not contain any depression-indicative term:
      ● "Over thinking always destroy my mood"
      ● "this essay is dragging so much, can't deal with essays and revisions any more :("
      ● "my head is such a mess right now"
      ● "I need a break from my thoughts"
  31. Conclusion and Future Work
      ● We showed the potential of social media data for extracting depression symptoms.
      ● We developed a semi-supervised statistical model using a hybrid approach that combines a lexicon with topic modeling.
      ● Our approach complements current questionnaire-driven diagnostic tools by gleaning depression symptoms in a continuous and unobtrusive manner.
      ● Our model yields promising results that are competitive with a fully supervised approach: an average F-score of 70.6% for capturing the nine depression symptoms.
      ● In the future, we plan to use the system to detect depressive behavior at the community level, including the study of demographics.
  32. Thank You!
      For more information: rebrand.ly/depressionProject, @halolimat, hussein@knoesis.org, @knoesis_mdd