Utilize Digital & Social 
Media Data to Inform 
Your Research 
Workshop 4 of the Digital Scholar Training Series 
Katja Reuter¹, PhD; Audun Utengen², MBA; Thomas Lee², BS, NHA
¹Southern California Clinical and Translational Science Institute (SC CTSI) at the University of Southern California and Children's Hospital Los Angeles
²Symplur LLC, a healthcare social media consultancy
Today’s Goals 
1. Understand how to use social media data 
in support of research and study participant 
recruitment 
2. Understand how to use Symplur Signals
Workshop Outline 
Introduction: Web-Based Disease 
Communities, PHI, Online Recruitment, 
Online Consent 
Using Social Media Data from Twitter 
The Healthcare Hashtag Project 
Using Symplur Signals
The Opportunity 
Research professionals are able to …
…learn about 
emerging health/ 
disease topics 
…identify active 
online disease 
communities & 
start a dialogue 
…identify, and 
engage/recruit 
potential research 
participants 
ONLINE
Users Search, View, and Contribute Online
Internet and Social Media Usage 
 80%+ of Americans seek health information online 
 Nearly 70% of all Internet users in the U.S. use 
(Pew Research Center, 2013, 2014) 
digital and social media 
 40% of 18-29 year old African Americans who use 
the Internet say that they use Twitter 
 Latinos go online from mobile devices and use 
social networking sites at similar – and sometimes 
higher – rates than do other Americans
L.A. County 
Medical 
Enterprise 
Research 
Enterprise 
Internet/Web 
Social Media 
Mobile Technologies
L.A. County – Digital Divide 
Medical 
Enterprise 
Research 
Enterprise 
Internet/Web 
Social Media 
Mobile Technologies 
CAVEAT: DIGITAL DIVIDE 
>35% of Angelenos either do not have 
access to broadband or cannot afford it. 
BUT: High Penetration of mobile 
technologies among low-income 
and underserved populations (e.g., 
more than 75% of Latinos). 
89% of U.S. Hispanics log on to 
Facebook every day, 78% via mobile 
device (US Hispanics and Facebook, The 
Generation of Growth, 2014)
L.A. Users on Twitter 
June 2012 - Oct 2014 
13,372 users who self-identified
as located in Los 
Angeles 
Sent 35,295 messages on 
Twitter using the word 
“diabetes” 
In more than a dozen 
languages, including English, 
Spanish, Tagalog, Haitian, Korean, 
and Vietnamese 
(Symplur Signals database)
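A count like the one on this slide could be reproduced along the following lines. This is a minimal sketch, not the Symplur Signals method: the record fields (`user`, `text`, `profile_location`, `lang`) and the sample records are hypothetical, and simple substring matching on a self-reported location is an assumption.

```python
def count_keyword_activity(tweets, keyword, place):
    """Count messages, distinct users, and languages for a keyword and location.

    `tweets` is a list of dicts with hypothetical fields:
    user, text, profile_location (self-reported), lang.
    """
    keyword, place = keyword.lower(), place.lower()
    users, languages = set(), set()
    messages = 0
    for t in tweets:
        # Case-insensitive substring match on text and self-reported location
        if keyword in t["text"].lower() and place in t["profile_location"].lower():
            messages += 1
            users.add(t["user"])
            languages.add(t["lang"])
    return {"users": len(users), "messages": messages, "languages": sorted(languages)}


if __name__ == "__main__":
    # Made-up records for illustration only
    sample = [
        {"user": "a", "text": "Managing diabetes day to day", "profile_location": "Los Angeles, CA", "lang": "en"},
        {"user": "a", "text": "Vivir con Diabetes", "profile_location": "Los Angeles, CA", "lang": "es"},
        {"user": "b", "text": "diabetes research news", "profile_location": "New York", "lang": "en"},
    ]
    print(count_keyword_activity(sample, "diabetes", "los angeles"))
```

Self-reported locations are free text, so a real analysis would likely need more careful matching than a substring test.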
Data Source: Twitter 
https://blog.twitter.com/2014/introducing-twitter-data-grants
http://www.socialwebcafe.com/wp-content/uploads/2014/07/teksocialAnatomyofTweet.png
Research Studies that Successfully 
Leverage Digital Data 
Group 1 
Questions
How did they engage research participants? 
What did they ask participants to do? 
How long did they recruit? 
Describe the online consent method. 
Group 2 
Describe the digital data the research team 
used. 
What did they learn?
Online Data Usage in Research 
Things to keep in mind… 
• Protected health information (PHI)
• De-identification vs. Anonymization
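As an illustration of one step toward de-identifying public posts before analysis, the sketch below replaces account handles with keyed hashes, masks @mentions in the text, and coarsens timestamps to year and month. It is pseudonymization under stated assumptions: the record fields and the key are hypothetical, and this step alone does not meet any regulatory de-identification standard or make the data anonymous.

```python
import hashlib
import hmac
import re

MENTION = re.compile(r"@\w+")


def pseudonym(handle: str, key: bytes) -> str:
    """Keyed hash of a handle: stable per user, not reversible without the key."""
    digest = hmac.new(key, handle.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def deidentify(record: dict, key: bytes) -> dict:
    """Return a copy with the handle replaced, @mentions masked, and date coarsened.

    `record` uses hypothetical fields: user, text, created_at (ISO 8601 string).
    """
    return {
        "user": pseudonym(record["user"], key),
        "text": MENTION.sub("@user", record["text"]),
        "created_at": record["created_at"][:7],  # keep year-month only
    }


if __name__ == "__main__":
    key = b"replace-with-a-secret-key"  # placeholder; keep the real key private
    sample = {
        "user": "ExampleUser",
        "text": "Thanks @friend for the #diabetes tips",
        "created_at": "2014-10-03T12:00:00",
    }
    print(deidentify(sample, key))
```

Whoever holds the key can link pseudonyms back to handles, and quoted text can often be found again by searching the platform, so the output remains re-identifiable in practice.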
Online Recruitment 
Things to keep in mind… 
1. The creator of a social media page is responsible for the content. 
Link to site with more information (Institutional Clinical Trials Directory or 
ClinicalTrials.gov) 
2. Abide by institution’s media guidelines 
3. Consult IRB (meet institutional and state definitions of advertising) 
4. Refrain from providing significant details of any trial – focus on 
basic study information 
5. Beware of proprietary information 
6. Avoid making claims of treatment efficacy or side effects. Use 
disclaimers to reduce risk. 
7. Establish monitoring mechanism to protect against HIPAA violations 
and inappropriate posting 
8. Avoid disclosure of preliminary results or non-public information 
9. Bloggers involved in study conduct should not write about trial or 
drug (could be viewed as advertising) 
http://www.ncbi.nlm.nih.gov/pubmed/24857086
Online Consent & Signatures 
Things to keep in mind… 
• In minimal risk research, online consent requires IRB approval
• Office for Human Research Protections (OHRP): “If properly obtained, an electronic signature can be considered ‘original’ for the purposes of recordkeeping.”
• Federal Electronic Signatures in Global and National Commerce Act (eSIGN) and California’s Uniform Electronic Transactions Act (UETA): require that subjects agree to use the electronic format (by clicking a “You agree” icon), and that subjects be informed of their right to obtain the electronic consent in non-electronic form and of any procedures that must be followed to withdraw their agreement to use an electronic record.
• FDA-regulated research: electronic documents are subject to a specialized set of requirements found at 21 CFR Part 11.
Symplur 
A healthcare social media consultancy, creator of “The Healthcare 
Hashtag Project” 
Audun Utengen, MBA, Co-founder, Symplur LLC
Thomas Lee, BS, NHA, Co-founder, Symplur LLC
The Start of a Healthcare Communications Revolution 
• 2006 - Twitter created
• 2007 - First use of hashtags on Twitter
• 2009 - First healthcare-related tweet chat - #hcsm
• 2010 - The Healthcare Hashtag Project started
The Healthcare Hashtag Project 
• A structured organization of healthcare social media
• An open social project
• Millions of tweets – thousands of untold stories
The Rise of Patient Communities on Twitter 
• 22-month study
• 100 million tweets
• A surprise discovery
http://www.symplur.com/shorts/the-rise-of-patient-communities-on-twitter-visualized/
The Dynamics of a Twitter Patient Community 
• A look at a Rheumatoid Arthritis community - #rheum
• Dynamic network centrality analysis
• 1-month timeline
http://www.symplur.com/shorts/dynamics-twitter-patient-community-network-centrality-analysis/
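To make the idea of network centrality concrete, here is a small sketch that computes degree centrality from a list of "who mentioned whom" pairs. The slide does not say which centrality measure was used, so degree centrality is an assumption, and the account names are made up. A dynamic analysis would repeat the calculation for successive time windows.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality for an undirected network.

    `edges` is an iterable of (account, mentioned_account) pairs.
    Each node's score is its number of distinct neighbors divided by (n - 1),
    where n is the number of nodes in the network.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-mentions
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


if __name__ == "__main__":
    # Hypothetical mentions within one time window
    mentions = [("patient_a", "doctor_b"), ("patient_a", "patient_c"), ("patient_a", "advocate_d")]
    scores = degree_centrality(mentions)
    for account, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(account, round(score, 2))
```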
The True Audience of Healthcare Conferences 
• 3,190 healthcare conferences in the database
• The physical attendees vs. the whole audience
• Dissemination of healthcare information
A Dive into Healthcare Social Media 
• Very high growth: 1M tweets a day
• Types of hashtags:
  • Tweet Chats: 377
  • Diseases: 390
  • Conferences: 3,190
  • Others: 1,402
Demo: http://www.symplur.com/healthcare-hashtags/
Where and How to Find Digital and Social Media Data? 
• Public vs. Private Social Networks
• Platform APIs
• Aggregate sources:
  • Gnip
  • Datasift
• Analytics Providers (1,000+)
  • Most well known: Radian6 (Salesforce)
  • Symplur Signals: the only one focused on healthcare
Open Social Network Data 
• Twitter
• Tumblr
• Foursquare
• WordPress
• Disqus, IntenseDebate
Partly-Open Social Network Data 
• Facebook
• Instagram
• Flickr
• Google+
• YouTube, Vimeo
Limitations of Digital Data 
• Biases
  • Population adoption of networks
  • Personal Identity (Facebook vs. Twitter)
  • Functional Identity (Doctors vs. Patients)
  • Patient role (open vs. private, job-to-be-done of network)
• Historic access
Symplur Signals 
• Requests for research and data
• A treasure trove of insights and stories
• Third parties need access
Symplur Signals Data 
• Not only 5,000+ healthcare hashtags
• 10,000+ healthcare topics
• Thousands of high impact stakeholders
  • doctors
  • pharma
  • hospitals
• Drug names, infectious disease terms, etc.
Example Insights to Extract: 
• What are the top healthcare articles being shared by radiation oncologists this week?
• Who are the most influential bloggers talking about statin drugs?
• When did the topic of “flu” begin to trend and peak over the last two years?
• What are the trending topics today in the diabetes communities?
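A question such as when “flu” began to trend and peak can be approached by counting keyword mentions per month. The sketch below is not the Symplur Signals implementation; the record fields (`text`, `created_at` as an ISO 8601 string) and the sample data are hypothetical.

```python
from collections import Counter


def monthly_mentions(tweets, keyword):
    """Count tweets containing `keyword` per calendar month (YYYY-MM)."""
    keyword = keyword.lower()
    counts = Counter(
        t["created_at"][:7]  # first seven characters of an ISO date: YYYY-MM
        for t in tweets
        if keyword in t["text"].lower()
    )
    return dict(sorted(counts.items()))


def peak_month(counts):
    """Return the month with the most mentions, or None if there are none."""
    return max(counts, key=counts.get) if counts else None


if __name__ == "__main__":
    # Made-up records for illustration only
    sample = [
        {"text": "Flu season is starting", "created_at": "2013-11-20T09:00:00"},
        {"text": "Got my flu shot", "created_at": "2014-01-05T14:30:00"},
        {"text": "Everyone has the flu", "created_at": "2014-01-12T08:15:00"},
        {"text": "Diabetes tweet chat tonight", "created_at": "2014-01-12T18:00:00"},
    ]
    counts = monthly_mentions(sample, "flu")
    print(counts)
    print(peak_month(counts))
```

Substring matching also catches words that merely contain the keyword (for example “influence” for “flu”), so real analyses usually match on whole words or hashtags.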
Symplur Signals Demo: Diabetes 
http://signals.symplur.com
Symplur Signals Hands-On Training 
http://signals.symplur.com 
Try It!
30-Minute Individual Consulting Session 
Start the process by contacting us at: 
www.symplur.com/contact/
Contact Us 
SC CTSI | www.sc-ctsi.org 
Phone: (323) 442-4032 
Email: info@sc-ctsi.org 
Twitter: @SoCalCTSI 
Symplur LLC | http://www.symplur.com 
Twitter: @symplur 
Audun Utengen, MBA: @audvin 
Thomas M. Lee, BS, NHA: @tmlfox