Computational Social Science
as the Ultimate Web Intelligence
Kno.e.sis Projects at the Intersection of Big Data, AI, Social Good and Health
Panel at Web Intelligence 2018
Prof. Amit Sheth
LexisNexis Ohio Eminent Scholar
Executive Director, Kno.e.sis - Ohio Center of Excellence in
Knowledge-enabled Computing & BioHealth Innovation
Presentation template by SlidesCarnival
Photographs by Unsplash
Icons by thenounproject
Big Data | Social Media | AI
Harnessing Twitter ‘Big Data’ for
Automatic Emotion Identification
2.5 M Tweets with Machine
Learning algorithms
Trends
Emotions
eDrugTrends - Identify emerging trends in
cannabis and synthetic cannabinoid use in the
U.S.
Web Forum Data & Tweets with
NLP, ML & Semantic Web
Technologies
Intents
Sentiments
Hazards SEES - Cross-modal aggregation
of Multi-modal & Multi-disciplinary
Data to support human efforts in disaster
management
Extracting Diverse Sentiment Expressions
with Target-Dependent Polarity from
Twitter
Opinions
400 000 Tweets with an
Optimization Model
People
Places
Times
Gender-Based Violence in
140 Characters or Fewer: A
#BigData Case Study of
Twitter
14 million tweets
collected from Twitter
over a period of 10
months
1. Gender-based violence in 140 characters or fewer: A #BigData case study of Twitter, Hemant Purohit, Tanvi Banerjee, Andrew Hampton, Valerie L. Shalin, Nayanesh Bhandutia, and Amit
Sheth, First Monday, Volume 21, Number 1 - 4 January 2016
Outcomes of Analysis
◎ Trends of GBV tweets across 5 countries: USA,
India, Philippines, Nigeria, South Africa.
◎ Three thematic groups of GBV tweets: physical
violence, sexual violence, and harmful practices.
◎ Nigeria has the highest percentage of tweets with URLs in
comparison to other countries.
◎ Several possible explanations:
○ Literacy
○ Credibility of the public press
○ Possibility that reliance on external resources somehow reduces
the threat of being identified as the responsible party
Context-Aware
Harassment Detection
on Social Media
24 000 tweets collected
Supervised ML methods
used
1. Mohammadreza Rezvan, Saeedeh Shekarpour, Lakshika Balasuriya, Krishnaprasad Thirunarayan, Valerie L. Shalin, Amit Sheth. A Quality Type-aware Annotated Corpus and
Lexicon for Harassment Research. Web Science, WebSci 2018, Amsterdam, The Netherlands, May 27-30, 2018
2. Mohammadreza Rezvan, Saeedeh Shekarpour, Thirunarayan, K., Valerie L. Shalin, Sheth, A. (2018). Analyzing and learning the language for different types of harassment
Knoesis wiki for Context-Aware Harassment Detection on Social
Media
Outcomes and Insights
Lexicon
Covering different types of harassment content
● Sexual
● Political
● Racial
● Intellectual
● Appearance-related
● General
Tweets
24 000 non-redundant annotated
tweets, of which 3000 are labeled as
harassing
Features
Combination of features resulted in best
accuracy
○ TFIDF
○ word2vec
○ paragraph2vec
○ LIWC vector
ML Methods
Gradient Boosting Machine (GBM)
outperformed SVM, KNN and NB
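The feature combination listed above can be illustrated with a minimal sketch. It computes TF-IDF weights in plain Python and concatenates them with a second feature vector. The toy tweets, the `extra_features` values, and the specific TF-IDF variant (term frequency times log(N/df)) are assumptions for illustration and are not taken from the cited corpus or paper.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per document)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [
            (counts[term] / len(toks)) * math.log(n_docs / df[term])
            for term in vocab
        ]
        vectors.append(vec)
    return vocab, vectors

# Hypothetical toy tweets, not drawn from the annotated corpus.
tweets = ["you are great", "you are awful", "great game today"]
vocab, tfidf = tfidf_vectors(tweets)

# Hypothetical stand-in for a second feature family (e.g. a lexicon count).
extra_features = [[0.0], [1.0], [0.0]]

# Combine feature families by concatenating them per tweet.
combined = [t + e for t, e in zip(tfidf, extra_features)]
```

The combined vectors could then be passed to any classifier; the choice of classifier is left open here.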
1. Gaur, Manas, Ugur Kursuncu, Amanuel Alambo, Amit Sheth, Raminta Daniulaityte, Krishnaprasad Thirunarayan, and Jyotishman Pathak. "Let Me Tell You About Your
Mental Health!: Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention." In Proceedings of the 27th ACM CIKM 2018.
Patient
Clinician
EMR
Insight
DSM-5 & Drug Abuse
Ontology
Improved
Healthcare
Classification of Reddit
Content to DSM-5 for
Web-based
Intervention
3 Million Posts from 270K
Reddit Users collected From
2005-2015 with zero shot
learning
Provide clinicians with insights into their patients
Knoesis wiki for Modeling Social Behavior for Healthcare
Utilization in Depression
Outcomes & Insights
Our sophisticated methods have
reduced the false alarm rate to 3%
- 5% by incorporating domain
knowledge and slang terms in
social media data
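For reference, a false alarm rate is commonly computed as the share of actual negatives that are wrongly flagged. The sketch below assumes that definition; the confusion counts are hypothetical and are not results from the project.

```python
def false_alarm_rate(false_positives, true_negatives):
    """False positive rate: FP / (FP + TN)."""
    negatives = false_positives + true_negatives
    if negatives == 0:
        raise ValueError("no negative examples")
    return false_positives / negatives

# Hypothetical counts: 40 of 1000 non-relevant posts flagged by mistake.
rate = false_alarm_rate(false_positives=40, true_negatives=960)
print(f"{rate:.1%}")  # prints 4.0%
```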
Views: People - Content - Network
Information in tweets by a user displays
an intent based on the user type:
Personal accounts share opinions, Retail
accounts promote related products for
sale, Media accounts disseminate
information.
Proper incorporation
of each view is
essential to
better represent
characteristics
of users.
User Modeling in Marijuana-related Communications
Multimodality
- The information shared in different
formats contributes to the meaning:
Text, Image, Emoji, Interactions
- Translation of image and emoji to textual
representation using state-of-the-art tools
such as EmojiNet.
People: user description, emoji,
profile pictures.
Content: text, emoji
Network: interactions with other
users: retweets and mentions.
🏈
😉
🍔
1. Ugur Kursuncu, Manas Gaur, Usha Lokala, Anurag Illendula, Krishnaprasad Thirunarayan, Raminta Daniulaityte, Amit Sheth, and I. Budak Arpinar. "'What's ur type?'
Contextualized Classification of User Types in Marijuana-related Communications using Compositional Multiview Embedding." In Proceedings of IEEE International
Conference on Web Intelligence, 2018
Knoesis wiki for eDrugTrends
Outcomes & Insights
◎ Incorporation of multimodal data,
specifically profile pictures and network
interactions, contributes significantly to
the classification of users.
◎ Multimodality significantly improves
classification performance in the case of
an imbalanced dataset, e.g., profile pictures
of users.
◎ Composition of the embeddings of views
(e.g., person, content, network) provides
a more coherent representation of users.
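One simple way to compose per-view embeddings into a single user representation is concatenation. The sketch below assumes that choice purely for illustration; the cited paper may compose views differently, and the vectors and the `compose_views` helper are hypothetical.

```python
def compose_views(person, content, network):
    """Concatenate per-view embeddings into one user vector."""
    return list(person) + list(content) + list(network)

# Hypothetical low-dimensional embeddings for one user.
person_vec = [0.2, 0.7]        # e.g. from description, profile picture
content_vec = [0.1, 0.4, 0.9]  # e.g. from tweet text and emoji
network_vec = [0.5]            # e.g. from retweets and mentions

user_vec = compose_views(person_vec, content_vec, network_vec)
print(len(user_vec))  # prints 6
```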
F-Scores for each user type

#  Features          Personal  Media  Retail
1  Tweet + Desc      0.95      0.42   0.73
2  w/ Composition    0.94      0.18   0.71
3  w/ Metadata       0.94      0.17   0.72
4  w/ Image          0.97      0.72   0.87
5  w/ Network        0.98      0.73   0.91
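To compare feature sets at a glance, the per-type F-scores in the table can be averaged. The sketch below computes an unweighted (macro) mean for each row; the averaging is an added convenience and is not a figure reported on the slide.

```python
# F-scores copied from the table: (Personal, Media, Retail).
f_scores = {
    "Tweet + Desc": (0.95, 0.42, 0.73),
    "w/ Composition": (0.94, 0.18, 0.71),
    "w/ Metadata": (0.94, 0.17, 0.72),
    "w/ Image": (0.97, 0.72, 0.87),
    "w/ Network": (0.98, 0.73, 0.91),
}

# Unweighted mean across the three user types for each feature set.
macro_f = {name: sum(vals) / len(vals) for name, vals in f_scores.items()}

best = max(macro_f, key=macro_f.get)
for name, score in macro_f.items():
    print(f"{name:15s} {score:.3f}")
print("best:", best)
```

Under this averaging, the row that adds network features scores highest, driven mainly by the gain on Media accounts.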
Fusing Visual, Textual and
Connectivity Clues for Studying
Mental Health
Knoesis wiki for Modeling Social Behavior for Healthcare Utilization in Depression
Develop a multimodal framework that
employs statistical techniques to
fuse heterogeneous sets of features
obtained by processing visual, textual
and user interaction data, in order to
identify depressive behavior and infer
demographics.
1. Amir Hossein Yazdavar, Mohammad Saied Mahdavinejad, Goonmeet Bajaj, Krishnaprasad Thirunarayan, Jyotishman Pathak and Amit Sheth. Fusing Visual, Textual and
Connectivity Clues for Studying Mental Health in Population. In: 30th International Conference on World Wide Web (Submitted WWW-2019)
◎ How well does the content of posted images (colors,
aesthetic and facial presentation) reflect depressive
behavior?
◎ Does the choice of profile picture show any psychological
traits of a depressed online persona? Are profile pictures
reliable enough to represent demographic information such as
age and gender?
◎ Are there any underlying common themes among
depressed individuals, generated using multimodal
content, that can be used to detect depression reliably?
Outcomes & Insights
Characterizing Linguistic Patterns in two aspects:
Depressive-behavior and Age Distribution
Gender biases and depressive behavior association (chi-square test; color code: blue = association, red = repulsion; size = amount of each cell’s contribution)
The age distribution for depressed and control users in the ground-truth dataset
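The association plot described above can be read in terms of Pearson residuals from a chi-square test: a positive residual indicates association, a negative one repulsion, and the squared residual is the cell's contribution to the statistic. The sketch below assumes that reading, and the 2x2 counts are hypothetical rather than taken from the study.

```python
import math

def pearson_residuals(table):
    """Return (chi-square statistic, residuals) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    residuals = []
    chi2 = 0.0
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            r = (observed - expected) / math.sqrt(expected)
            res_row.append(r)
            chi2 += r * r  # each cell's contribution to the statistic
        residuals.append(res_row)
    return chi2, residuals

# Hypothetical counts: rows = gender, columns = (depressed, control).
counts = [[30, 20],
          [20, 30]]
chi2, residuals = pearson_residuals(counts)
```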
Outcomes & Insights
The explanation of the log-odds prediction of outcome (0.31) for
a sample user (y-axis shows the outcome probability (depressed
or control), the bar labels indicate the log-odds impact of each
feature)
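A log-odds value converts to a probability through the logistic function. As a sketch, and assuming the 0.31 in the caption is a log-odds value (the caption does not make this explicit), the conversion is:

```python
import math

def log_odds_to_probability(log_odds):
    """Logistic function: p = 1 / (1 + exp(-log_odds))."""
    return 1.0 / (1.0 + math.exp(-log_odds))

p = log_odds_to_probability(0.31)
print(round(p, 3))  # prints 0.577
```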
Ranking Features obtained from Different Modalities with
Boruta Algorithm
Create value from data that supports action
Big Data & AI
What can we do that
is unique?
Emotions
Sentiments
Intentions Derive Insights
Scale to identify important & relevant
issues for humankind
Floods Earthquake
Wildfires Tsunami
Derive insights from data
Do more exercise
Reduce sugar intake
Increase water intake
More at: http://knoesis.org/projects, http://bit.ly/Kapproach
Thank You!
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
 
0x01 - Newton's Third Law: Static vs. Dynamic Abusers
0x01 - Newton's Third Law:  Static vs. Dynamic Abusers0x01 - Newton's Third Law:  Static vs. Dynamic Abusers
0x01 - Newton's Third Law: Static vs. Dynamic Abusers
 
Media as a Mind Controlling Strategy In Old and Modern Era
Media as a Mind Controlling Strategy In Old and Modern EraMedia as a Mind Controlling Strategy In Old and Modern Era
Media as a Mind Controlling Strategy In Old and Modern Era
 

Computational Social Science as the Ultimate Web Intelligence

  • 1. Computational Social Science as the Ultimate Web Intelligence Kno.e.sis Projects at the Intersection of Big Data, AI, Social Good and Health Panel at Web Intelligence 2018 Prof. Amit Sheth LexisNexis Ohio Eminent Scholar Executive Director, Kno.e.sis - Ohio Center of Excellence in Knowledge-enabled Computing & BioHealth Innovation Presentation template by SlidesCarnival Photographs by Unsplash Icons by thenounproject
  • 2. Big Data | Social Media | AI. Harnessing Twitter ‘Big Data’ for Automatic Emotion Identification: 2.5 M tweets with machine learning algorithms. eDrugTrends: identify emerging trends in cannabis and synthetic cannabinoid use in the U.S.; web forum data and tweets with NLP, ML and Semantic Web technologies. Hazards SEES: cross-modal aggregation of multi-modal and multi-disciplinary data to support human efforts in disaster management. Extracting Diverse Sentiment Expressions with Target-Dependent Polarity from Twitter: 400 000 tweets with an optimization model. Keywords: Trends, Emotions, Intents, Sentiments, Opinions, People, Places, Times.
  • 3. Gender-Based Violence in 140 Characters or Fewer: A #BigData Case Study of Twitter. 14 million tweets collected from Twitter over a period of 10 months. Reference: Hemant Purohit, Tanvi Banerjee, Andrew Hampton, Valerie L. Shalin, Nayanesh Bhandutia, and Amit Sheth. Gender-based violence in 140 characters or fewer: A #BigData case study of Twitter. First Monday, Volume 21, Number 1, 4 January 2016.
  • 4. Outcomes of Analysis ◎ Trends of GBV tweets across 5 countries: USA, India, Philippines, Nigeria, South Africa. ◎ Three thematic groups of GBV tweets: physical violence, sexual violence, and harmful practices. ◎ Nigeria has the highest percentage of tweets with URLs in comparison to other countries. ◎ Numerous explanations: ○ Literacy ○ Credibility of the public press ○ Possibility that reliance on external resources somehow reduces the threat of being identified as the responsible party.
  • 5. Context-Aware Harassment Detection on Social Media. 24 000 tweets collected; supervised ML methods used. References: 1. Mohammadreza Rezvan, Saeedeh Shekarpour, Lakshika Balasuriya, Krishnaprasad Thirunarayan, Valerie L. Shalin, Amit Sheth. A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research. Web Science, WebSci 2018, Amsterdam, The Netherlands, May 27-30, 2018. 2. Mohammadreza Rezvan, Saeedeh Shekarpour, Krishnaprasad Thirunarayan, Valerie L. Shalin, Amit Sheth (2018). Analyzing and learning the language for different types of harassment. Knoesis wiki for Context-Aware Harassment Detection on Social Media
  • 6. Outcomes and Insights. Lexicon: covers different types of harassment content ● Sexual ● Political ● Racial ● Intellectual ● Appearance-related ● General. Tweets: 24 000 non-redundant annotated tweets, of which 3000 are labeled as harassing. Features: a combination of features resulted in the best accuracy ○ TFIDF ○ word2vec ○ paragraph2vec ○ LIWC vector. ML Methods: Gradient Boosting Machine (GBM) outperformed SVM, KNN and NB.
  • 7. Classification of Reddit Content to DSM-5 for Web-based Intervention. 3 million posts from 270K Reddit users, collected from 2005 to 2015, with zero-shot learning. Provide clinicians with insights into their patients. Patient, Clinician, EMR, Insight; DSM-5 & Drug Abuse Ontology; Improved Healthcare. Reference: Gaur, Manas, Ugur Kursuncu, Amanuel Alambo, Amit Sheth, Raminta Daniulaityte, Krishnaprasad Thirunarayan, and Jyotishman Pathak. "Let Me Tell You About Your Mental Health!: Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention." In Proceedings of the 27th ACM CIKM 2018. Knoesis wiki for Modeling Social Behavior for Healthcare Utilization in Depression
  • 8. Outcomes & Insights. Our methods have reduced the false alarm rate to 3%-5% by incorporating domain knowledge and slang terms in social media data.
  • 9. User Modeling in Marijuana-related Communications. Views: People - Content - Network. Information in tweets by a user displays an intent based on the user type: personal accounts share opinions, retail accounts promote related products for sale, media accounts disseminate information. Proper incorporation of each view is essential to better represent characteristics of users. Multimodality: the information shared in different formats contributes to the meaning (text, image, emoji, interactions). Image and emoji are translated to a textual representation using state-of-the-art tools such as EmojiNet. People: user description, emoji, profile pictures. Content: text, emoji. Network: interactions with other users (retweets and mentions). 🏈 😉 🍔 Reference: Ugur Kursuncu, Manas Gaur, Usha Lokala, Anurag Illendula, Krishnaprasad Thirunarayan, Raminta Daniulaityte, Amit Sheth, and I. Budak Arpinar. "'What's ur type?' Contextualized Classification of User Types in Marijuana-related Communications using Compositional Multiview Embedding." In Proceedings of IEEE International Conference on Web Intelligence, 2018. Knoesis wiki for eDrugTrends
  • 10. Outcomes & Insights ◎ Incorporation of multimodal data, specifically profile pictures and network interactions, significantly contributes to the classification of users. ◎ Multimodality significantly improves the classification performance in the case of an imbalanced dataset, e.g., profile pictures of users. ◎ Composition of embeddings of views (e.g., person, content, network) provides a more coherent representation of users.
      F-scores for each user type:
      Features            Personal   Media   Retail
      1  Tweet + Desc     0.95       0.42    0.73
      2  w/ Composition   0.94       0.18    0.71
      3  w/ Metadata      0.94       0.17    0.72
      4  w/ Image         0.97       0.72    0.87
      5  w/ Network       0.98       0.73    0.91
  • 11. Fusing Visual, Textual and Connectivity Clues for Studying Mental Health. Develop a multimodal framework that employs statistical techniques for fusing heterogeneous sets of features, obtained by processing visual, textual and user interaction data, to identify depressive behavior and infer demographics. ◎ How well does the content of posted images (colors, aesthetics and facial presentation) reflect depressive behavior? ◎ Does the choice of profile picture show any psychological traits of a depressed online persona? Are profile pictures reliable enough to represent demographic information such as age and gender? ◎ Are there any underlying common themes among depressed individuals, generated using multimodal content, that can be used to detect depression reliably? Reference: Amir Hossein Yazdavar, Mohammad Saied Mahdavinejad, Goonmeet Bajaj, Krishnaprasad Thirunarayan, Jyotishman Pathak and Amit Sheth. Fusing Visual, Textual and Connectivity Clues for Studying Mental Health in Population. In: 30th International Conference on World Wide Web (Submitted WWW-2019). Knoesis wiki for Modeling Social Behavior for Healthcare Utilization in Depression
  • 12. Outcomes & Insights. Characterizing linguistic patterns in two aspects: depressive behavior and age distribution. Figure: Gender biases and depressive behavior association (chi-square test; color code: blue = association, red = repulsion; size = amount of each cell’s contribution). Figure: The age distribution for depressed and control users in the ground-truth dataset.
  • 13. Outcomes & Insights. Figure: The explanation of the log-odds prediction of outcome (0.31) for a sample user (the y-axis shows the outcome probability, depressed or control; the bar labels indicate the log-odds impact of each feature). Figure: Ranking features obtained from different modalities with the Boruta algorithm.
  • 14. Big Data & AI: What can we do that is unique? Create value from data that supports action. Derive insights: emotions, sentiments, intentions. Scale to identify important and relevant issues to humankind: floods, earthquakes, wildfires, tsunamis. Derive insights from data: do more exercise, reduce sugar intake, increase water intake. More at: http://knoesis.org/projects, http://bit.ly/Kapproach
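
The F-score table in slide 10 reports one score per user type, which matters when the classes are imbalanced. As a rough illustration only, here is a minimal Python sketch of how per-class F-scores can be computed from true and predicted labels. The helper name `per_class_f1` and the ten toy labels are made up for this example; they are not the authors' data or code.

```python
from collections import Counter

def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class gains a false positive
            fn[truth] += 1      # true class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores

# Hypothetical, imbalanced toy labels (assumption, not from the slides)
y_true = ["personal"] * 6 + ["media", "media", "retail", "retail"]
y_pred = ["personal"] * 6 + ["personal", "media", "retail", "media"]

scores = per_class_f1(y_true, y_pred)
for label, f1 in scores.items():
    print(f"{label}: {f1:.2f}")
```

Reporting the score per class, instead of a single overall accuracy, keeps a weak minority class (such as Media in rows 1 to 3 of the table) from being hidden by a dominant majority class.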

Editor's Notes

  1. Opinions - "Time for dabs": Analyzing Twitter data on butane hash oil use.
  2. Sharing behavior analysis. Social media provide the opportunity to distribute information, potentially reflecting both the senders’ judgment of information importance, and reliance on the voice of others. Sharing functions as an amplification of these voices, often through the voices of influential celebrities. We analyze two types of sharing behavior in the social media community surrounding GBV events: direct content resharing as a retweet (RT), and indirect sharing via references to external resources, such as news, blogs, articles, and multimedia, using URLs. The low retweeting frequency in Nigeria is particularly remarkable (see Table 5). One might hypothesize that a low-literacy country such as Nigeria, in which senders are less able to compose messages, would have the highest retweet ratio. The adjacent analysis of the proportion of URL references with respect to the total corpus suggests a different sociocultural phenomenon at work concerning the identifiability of the responsible party. Among GBV tweets, Nigeria has the highest percentage of tweets with URLs in comparison to other countries. Numerous explanations can be tested, including literacy, credibility of the public press, and the possibility that reliance on external resources somehow reduces the threat of being identified as the responsible party.
  3. Goal: understand an individual's mental health situation and provide clinicians with insights into their patients.
  4. Not all the Reddit content types (main posts, comments, and replies) are informative. Identification of features that represent users on Reddit: vertical linguistic features (e.g., inter-subreddit similarity); horizontal linguistic features (e.g., subordinate conjunction); fine-grained features (e.g., readability scores); word embedding with/without modulation. Coherence-based topic selection that associates subreddits with DSM-5. Enrichment of the DAO ontology (which we created) with the DSM-5 lexicon and slang terms: DSM-5 knowledge hierarchy.
  5. A sophisticated method allowed us to greatly reduce the false alarm rate. Explain the optimization effort in one sentence: 25% reduction in the false alarm rate (2-5%), while the other methods have higher false alarm rates (). Takeaway: incorporation of domain knowledge and slang terms in social media data.
  6. 1) Analysis of the content of posted images in terms of colors, aesthetics and facial presentation, and their associations with depressive behavior; 2) Uncovering the underlying relationships between the visual and contextual content of likely depressed profiles obtained using a demographic inference process, which can facilitate community-level management of depression.
  7. Top left: Our findings from social media are consistent with the findings in the medical literature: according to the third National Health and Nutrition Examination Survey [29], more women than men were given a diagnosis of depression. Bottom left: shows that young people aged below 24 tend to be more depressed, suggesting that either the likely depressed user population is younger, or youngsters are more likely to disclose their age, say with the intention of connecting to their peers (social homophily).
  8. Right: The waterfall charts represent how the probability of being depressed changes with the addition of each feature variable. Left: illustrates feature importance obtained by the Boruta algorithm.
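
The waterfall reading described in note 8 can be sketched in a few lines of Python: each feature adds a log-odds contribution to a running total, and the logistic function turns that total into a probability. This is only an illustrative sketch under stated assumptions. The feature names and contribution values below are invented, chosen so that the total equals the 0.31 shown on slide 13, and 0.31 is treated here as a log-odds value; none of this is taken from the authors' model.

```python
import math

def sigmoid(z):
    """Convert a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def waterfall(base_log_odds, contributions):
    """Accumulate per-feature log-odds contributions step by step.

    Returns a list of (step name, running log-odds, probability) tuples.
    """
    running = base_log_odds
    steps = [("baseline", running, sigmoid(running))]
    for name, delta in contributions:
        running += delta
        steps.append((name, running, sigmoid(running)))
    return steps

# Hypothetical feature contributions (assumption, for illustration only)
contributions = [
    ("text_feature", 0.40),
    ("image_feature", -0.15),
    ("network_feature", 0.06),
]

steps = waterfall(0.0, contributions)
for name, log_odds, prob in steps:
    print(f"{name:16s} log-odds={log_odds:+.2f}  probability={prob:.3f}")
```

Because contributions are additive in log-odds space but not in probability space, the same contribution moves the probability less when the running total is already far from zero.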