Thesis Proposal:
A Quantitative Analysis of the Spread
of Information in Social Networks




Joshua S. White
Advisor: Jeanna N. Matthews, PhD


                                    03/05/13
Outline
 •   Problem
 •   To Date
 •   Recently Completed Work
 •   Current Work
 •   Inspiration
 •   Unanswered Questions
 •   Current Tool Kits
 •   Our Approach
 •   Schedule of Completion
Problem
 • Social media networks are the fastest
   growing, and make up the largest portion, of
   Internet content today [1]

 • These networks have only recently (2010-
   Present) been studied in any level of detail

 • Most work has been in sampling small
   portions of the network and trying to predict
   outcomes (predicting politics)

                              1. Tom Pick. "102 Compelling Social Media and
                              Online Marketing Stats and Facts for 2012 (and
                              2013)." Business 2 Community. January 2, 2013
Problem (continued)
[Chart: ACM Digital Library Search Results (sampled Dec. 2012, total = 20796).
 Categories: Social Networks and Political Analysis; Using Social Networks as
 Datasets for Machine Learning; Twitter; Actor Types in Any Network; Social
 Network Graphing; Malware and Social Networks; Social Network Memes; Botnets
 and Social Networks; Individuals' Influence on Social Networks; Social Network
 Analysis Tools; Actor Types in Twitter]
To Date: Coalmine
• The basis for a Social Network Analysis Tool




                                      Coalmine: an experience in building a
                                      system for social media analytics
                                      JS White, JN Matthews, JL Stacy
                                      SPIE Defense, Security, and
                                      Sensing, 84080A-84080A-11
To Date: Coalmine
• Coalmine
  – Method scales well based on initial tests
  – Manual and automated detection
  – Configurable data collection capabilities
  – Trial and error filter design tool
• At the Time (Major Future Work)
  – Rebuild of the tool:
     • Fix scaling limitations
     • More extensible Map/Reduce method
         – Solve map-piping issue
     • Inclusion of multi-job support
     • New storage and distribution method
         – Solve replication and state issues
Coalmine: Data Set Overview
• Over the course of 2012 we collected 165 TB of
  Twitter data (uncompressed)
  – 147 “Full Days”, 100 “Partial Days”
     • Estimated 65 billion tweets
  – Twitter traffic estimated at 175 million tweets per day in 2012 [1]
  – Collection rates between 50% and 80% for “Full Days”
  – Data in JSON format using Twitter's REST API




                                     1. Shea Bennett. "Just How Big Is twitter In 2012
                                     [INFOGRAPHIC]," All Twitter - The Unofficial Twitter
                                     Resource, February 2013
Coalmine: Data Set Overview
• Basic observable patterns
   – Twitter has a lot of outages
   – Posting rates follow predictable patterns
To Date: Phishing Analysis




A method for the automated detection phishing websites through both site characteristics and image analysis
JS White, JN Matthews, JL Stacy
SPIE Defense, Security, and Sensing, 84080B-84080B-11
To Date: Phishing Analysis
• Phash Process:
   – Reduce image size to 32 x 32 pixels
   – Reduce the color to greyscale
   – Calculate the DCT (creates frequency scalars)
   – Reduce the DCT to 8 x 8
   – Second DCT reduction: set each bit to 1 or 0 depending on whether the
     value is above or below the average DCT value
   – Take hash
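
The Phash steps above can be sketched in pure Python. This is only an illustration under stated assumptions, not the code used in the paper: the input is assumed to be a list of rows of (r, g, b) tuples, the resize is nearest-neighbour, the greyscale weights are the common luma weights, and the function names (`phash`, `hamming`, and the helpers) are hypothetical.

```python
import math

def to_grey(pixel):
    """Convert an (r, g, b) tuple to a single luma value (assumed weights)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def resize_nearest(pixels, size=32):
    """Nearest-neighbour resize of a 2D pixel grid to size x size."""
    h, w = len(pixels), len(pixels[0])
    return [[pixels[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

def dct_low_block(grey, keep=8):
    """Top-left keep x keep block of the unnormalised 2D DCT-II of a square grid."""
    n = len(grey)
    # Cosine basis for the lowest `keep` frequencies only.
    basis = [[math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n)]
             for u in range(keep)]
    # The 2D DCT is separable: transform along rows, then along columns.
    rows = [[sum(grey[y][x] * basis[v][x] for x in range(n)) for v in range(keep)]
            for y in range(n)]
    return [[sum(rows[y][v] * basis[u][y] for y in range(n)) for v in range(keep)]
            for u in range(keep)]

def phash(pixels):
    """64-bit perceptual hash: 32x32 resize, greyscale, DCT, 8x8 block, threshold."""
    small = resize_nearest(pixels, 32)
    grey = [[to_grey(p) for p in row] for row in small]
    coeffs = [c for row in dct_low_block(grey, 8) for c in row]
    mean = sum(coeffs) / len(coeffs)
    bits = 0
    for c in coeffs:
        bits = (bits << 1) | (1 if c > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes; small means similar images."""
    return bin(a ^ b).count("1")
```

Two page screenshots would then be compared with `hamming(phash(a), phash(b))`. Many implementations exclude the first (DC) coefficient when computing the average; the slide does not say which variant was used, so this sketch averages all 64 values.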




To Date: Phishing Analysis

  • Two Methods:
     – Page characteristic analysis
     – Image similarity analysis


  • Proof of concept system


  • Need for a generically customizable filter
Recently Completed Work:
• BEK Infection Vector Analysis
  – Finished dev. of a filter for detection of suspect accounts
     • Submitted to the IEEE CNS (Communications and Network Security)
         – “It's you on photo?: Automatic Detection of Twitter Accounts Infected
           With the Blackhole Exploit Kit”
Recently Completed Work:
[Figure: panels labeled “Normal” and “Infectious”]
Current Work:
• KONY2012 Meme Analysis
  – Finished extraction of relevant data, identification of tag
    variants, directed graphs of information flow
     • Preparing for submission to ASONAM (Advances in Social Network
       Analysis and Mining)
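
The two extraction steps named above, identifying tag variants and building a directed graph of information flow, might look roughly like the following. It is a sketch under assumptions rather than the analysis actually performed: the tweet records are simplified to hypothetical `user` and `text` fields, a "variant" is taken to be any hashtag containing "kony", and an edge is drawn only for the "RT @user" retweet convention.

```python
import re
from collections import defaultdict

# Assumed definition of a tag variant: any hashtag containing "kony",
# e.g. #KONY2012 or #StopKony.
TAG_VARIANT = re.compile(r"#\w*kony\w*", re.IGNORECASE)
# Assumed retweet convention: the text starts with "RT @user".
RETWEET = re.compile(r"^RT @(\w+)")

def tag_variants(text):
    """Return the set of lower-cased tag variants found in a tweet's text."""
    return {tag.lower() for tag in TAG_VARIANT.findall(text)}

def build_flow_graph(tweets):
    """Return {source_user: {retweeting_user: count}} for tagged retweets."""
    graph = defaultdict(lambda: defaultdict(int))
    for tweet in tweets:
        text = tweet.get("text", "")
        if not tag_variants(text):
            continue  # not part of the meme
        match = RETWEET.match(text)
        if match:
            source = match.group(1).lower()
            target = tweet.get("user", "").lower()
            graph[source][target] += 1  # information flowed source -> target
    return {src: dict(dst) for src, dst in graph.items()}
```

Each edge points from the account being retweeted to the account that retweeted it, so accounts with many outgoing edges are candidate origins of the spread.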
Current Work:
• Actor Types Analysis
  – Literature review completed, started identifying statistical and
    temporal characteristics of each type
     • Planned for submission to LEET'13 (Large-Scale Exploits and
       Emergent Threats)




Inspiration
 • Our work was inspired in part by Malcolm
   Gladwell’s book, The Tipping Point [1]
   – Life as an epidemic


 • Thinking this way led us to consider the
   spread of information and trends in terms of
   an outbreak where key people (Mavens,
   Connectors, and Salesmen) are primarily
   responsible.


                           1. Gladwell, M. (2000). The tipping point.
                           Boston: Little, Brown and Company.
Some Unanswered Questions
• Automatic classification of actor types in social networks.
   – Do Gladwell's classifications apply?
       • Connectors, mavens and salesmen
   – Who are the opinion leaders?
• Privacy related implications of social network analysis
• Do social networks have the level of impact on public
  opinion/mass media that some believe?
   – Can we predict changes in public or individual opinion using
     social network datasets as a base?
   – Can we predict how memes/news will spread?
   – Are individuals covertly manipulating mass media through social
     networks?
• Is there a generally applicable way to identify major events like
  natural disasters as they happen?
Current Tool Kits
 • Tool Kits and Methods:
    – Only one well developed tool kit:
       • NodeXL [1]
                 – Small Datasets (Under 5000 Nodes)
                 – Built In statistics and data collection
                   capabilities
                 – Built on MS Excel
                 – Allows exploration of group relationships
                 – Highest usage seems to be for political
                   related research

  1. Smith, M., Milic-Frayling, N., Shneiderman, B., Mendes Rodrigues, E., Leskovec, J., Dunne, C., (2010).
  NodeXL: a free and open network overview, discovery and exploration add-in for Excel 2007/2010,
  http://nodexl.codeplex.com/ from the Social Media Research Foundation, http://www.smrfoundation.org
Approach
• Borrow from traditional “Social Network Analysis” as it
  relates to the study of Sociology


• Most tools can't handle extremely large datasets
   – We employ the MapReduce methodology as our core for data
     analysis


• Treat the analysis system like a filtering system and build
  “rules” for how the data should be processed
      • Each rule is essentially constrained to a single Mapper


• Use case studies based on available data to develop
  individual statistics and rules
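
The idea of a rule constrained to a single Mapper can be illustrated with a small, in-memory stand-in for a MapReduce job. This is a minimal sketch and not the Coalmine implementation: `hashtag_rule` and `run_rule` are hypothetical names, the input is assumed to be one JSON tweet per line with a `text` field, and the reducer simply sums the values per key.

```python
import json
from collections import Counter

def hashtag_rule(record):
    """Hypothetical rule (mapper): emit (hashtag, 1) for each hashtag in the text."""
    for word in record.get("text", "").split():
        if word.startswith("#"):
            tag = word.lower().rstrip(".,!?:;")
            if len(tag) > 1:
                yield tag, 1

def run_rule(rule, lines):
    """Apply one rule to JSON lines, then sum the emitted values per key."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed records rather than abort the job
        if not isinstance(record, dict):
            continue
        for key, value in rule(record):
            totals[key] += value
    return dict(totals)
```

Because each rule is just a function from one record to zero or more key/value pairs, a new filter can be added without touching the driver; in a real deployment the same function body would run inside the framework's mapper.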
Schedule of Completion:
Questions:

More Related Content

What's hot

Social Media Analytics: Concepts, Models, Methods, & Tools - Ravi Vatrapu
Social Media Analytics: Concepts, Models, Methods, & Tools - Ravi VatrapuSocial Media Analytics: Concepts, Models, Methods, & Tools - Ravi Vatrapu
Social Media Analytics: Concepts, Models, Methods, & Tools - Ravi Vatrapu
CBS Competitiveness Platform
 
Social network analysis course 2010 - 2011
Social network analysis course 2010 - 2011Social network analysis course 2010 - 2011
Social network analysis course 2010 - 2011
guillaume ereteo
 
Social Network Analysis for Competitive Intelligence
Social Network Analysis for Competitive IntelligenceSocial Network Analysis for Competitive Intelligence
Social Network Analysis for Competitive Intelligence
August Jackson
 
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Lauri Eloranta
 
Social Network Analysis Introduction including Data Structure Graph overview.
Social Network Analysis Introduction including Data Structure Graph overview. Social Network Analysis Introduction including Data Structure Graph overview.
Social Network Analysis Introduction including Data Structure Graph overview.
Doug Needham
 
Mining the Social Web - Lecture 2 - T61.6020
Mining the Social Web - Lecture 2 - T61.6020Mining the Social Web - Lecture 2 - T61.6020
Mining the Social Web - Lecture 2 - T61.6020
Michael Mathioudakis
 
Mobile Social Computing
Mobile Social ComputingMobile Social Computing
Social Networks and Social Capital
Social Networks and Social CapitalSocial Networks and Social Capital
Social Networks and Social Capital
Giorgos Cheliotis
 
20111103 con tech2011-marc smith
20111103 con tech2011-marc smith20111103 con tech2011-marc smith
20111103 con tech2011-marc smith
Marc Smith
 
Seams2016 presentation calikli_et_al
Seams2016 presentation calikli_et_alSeams2016 presentation calikli_et_al
Seams2016 presentation calikli_et_al
Gul Calikli
 
LSS'11: Charting Collections Of Connections In Social Media
LSS'11: Charting Collections Of Connections In Social MediaLSS'11: Charting Collections Of Connections In Social Media
LSS'11: Charting Collections Of Connections In Social Media
Local Social Summit
 
Final social network_analysis
Final social network_analysisFinal social network_analysis
Final social network_analysis
Tarvinder Singh
 
Privacy Dynamics: Learning Privacy Norms for Social Software
Privacy Dynamics: Learning Privacy Norms for Social SoftwarePrivacy Dynamics: Learning Privacy Norms for Social Software
Privacy Dynamics: Learning Privacy Norms for Social Software
Arosha Bandara
 
Social network analysis
Social network analysisSocial network analysis
Social network analysis
prasadkulkarnigit
 
Researching Social Media – Big Data and Social Media Analysis
Researching Social Media – Big Data and Social Media AnalysisResearching Social Media – Big Data and Social Media Analysis
Researching Social Media – Big Data and Social Media Analysis
Farida Vis
 
Social Media Mining - Chapter 8 (Influence and Homophily)
Social Media Mining - Chapter 8 (Influence and Homophily)Social Media Mining - Chapter 8 (Influence and Homophily)
Social Media Mining - Chapter 8 (Influence and Homophily)
SocialMediaMining
 
Social Network Analysis and Partnerships SNA presentation Guevara 2015
Social Network Analysis and Partnerships SNA presentation Guevara 2015Social Network Analysis and Partnerships SNA presentation Guevara 2015
Social Network Analysis and Partnerships SNA presentation Guevara 2015
Sophia Guevara
 
Accessing and Using Big Data to Advance Social Science Knowledge
Accessing and Using Big Data to Advance Social Science KnowledgeAccessing and Using Big Data to Advance Social Science Knowledge
Accessing and Using Big Data to Advance Social Science Knowledge
Josh Cowls
 
Introduction to Computational Social Science
Introduction to Computational Social ScienceIntroduction to Computational Social Science
Introduction to Computational Social Science
Premsankar Chakkingal
 
Introduction to the Responsible Use of Social Media Monitoring and SOCMINT Tools
Introduction to the Responsible Use of Social Media Monitoring and SOCMINT ToolsIntroduction to the Responsible Use of Social Media Monitoring and SOCMINT Tools
Introduction to the Responsible Use of Social Media Monitoring and SOCMINT Tools
Mike Kujawski
 

What's hot (20)

Social Media Analytics: Concepts, Models, Methods, & Tools - Ravi Vatrapu
Social Media Analytics: Concepts, Models, Methods, & Tools - Ravi VatrapuSocial Media Analytics: Concepts, Models, Methods, & Tools - Ravi Vatrapu
Social Media Analytics: Concepts, Models, Methods, & Tools - Ravi Vatrapu
 
Social network analysis course 2010 - 2011
Social network analysis course 2010 - 2011Social network analysis course 2010 - 2011
Social network analysis course 2010 - 2011
 
Social Network Analysis for Competitive Intelligence
Social Network Analysis for Competitive IntelligenceSocial Network Analysis for Competitive Intelligence
Social Network Analysis for Competitive Intelligence
 
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
 
Social Network Analysis Introduction including Data Structure Graph overview.
Social Network Analysis Introduction including Data Structure Graph overview. Social Network Analysis Introduction including Data Structure Graph overview.
Social Network Analysis Introduction including Data Structure Graph overview.
 
Mining the Social Web - Lecture 2 - T61.6020
Mining the Social Web - Lecture 2 - T61.6020Mining the Social Web - Lecture 2 - T61.6020
Mining the Social Web - Lecture 2 - T61.6020
 
Mobile Social Computing
Mobile Social ComputingMobile Social Computing
Mobile Social Computing
 
Social Networks and Social Capital
Social Networks and Social CapitalSocial Networks and Social Capital
Social Networks and Social Capital
 
20111103 con tech2011-marc smith
20111103 con tech2011-marc smith20111103 con tech2011-marc smith
20111103 con tech2011-marc smith
 
Seams2016 presentation calikli_et_al
Seams2016 presentation calikli_et_alSeams2016 presentation calikli_et_al
Seams2016 presentation calikli_et_al
 
LSS'11: Charting Collections Of Connections In Social Media
LSS'11: Charting Collections Of Connections In Social MediaLSS'11: Charting Collections Of Connections In Social Media
LSS'11: Charting Collections Of Connections In Social Media
 
Final social network_analysis
Final social network_analysisFinal social network_analysis
Final social network_analysis
 
Privacy Dynamics: Learning Privacy Norms for Social Software
Privacy Dynamics: Learning Privacy Norms for Social SoftwarePrivacy Dynamics: Learning Privacy Norms for Social Software
Privacy Dynamics: Learning Privacy Norms for Social Software
 
Social network analysis
Social network analysisSocial network analysis
Social network analysis
 
Researching Social Media – Big Data and Social Media Analysis
Researching Social Media – Big Data and Social Media AnalysisResearching Social Media – Big Data and Social Media Analysis
Researching Social Media – Big Data and Social Media Analysis
 
Social Media Mining - Chapter 8 (Influence and Homophily)
Social Media Mining - Chapter 8 (Influence and Homophily)Social Media Mining - Chapter 8 (Influence and Homophily)
Social Media Mining - Chapter 8 (Influence and Homophily)
 
Social Network Analysis and Partnerships SNA presentation Guevara 2015
Social Network Analysis and Partnerships SNA presentation Guevara 2015Social Network Analysis and Partnerships SNA presentation Guevara 2015
Social Network Analysis and Partnerships SNA presentation Guevara 2015
 
Accessing and Using Big Data to Advance Social Science Knowledge
Accessing and Using Big Data to Advance Social Science KnowledgeAccessing and Using Big Data to Advance Social Science Knowledge
Accessing and Using Big Data to Advance Social Science Knowledge
 
Introduction to Computational Social Science
Introduction to Computational Social ScienceIntroduction to Computational Social Science
Introduction to Computational Social Science
 
Introduction to the Responsible Use of Social Media Monitoring and SOCMINT Tools
Introduction to the Responsible Use of Social Media Monitoring and SOCMINT ToolsIntroduction to the Responsible Use of Social Media Monitoring and SOCMINT Tools
Introduction to the Responsible Use of Social Media Monitoring and SOCMINT Tools
 

Similar to Clarkson - Joshua White - Research Proposal Presentation

Practical Applications for Social Network Analysis in Public Sector Marketing...
Practical Applications for Social Network Analysis in Public Sector Marketing...Practical Applications for Social Network Analysis in Public Sector Marketing...
Practical Applications for Social Network Analysis in Public Sector Marketing...
Mike Kujawski
 
2010 Catalyst Conference - Trends in Social Network Analysis
2010 Catalyst Conference - Trends in Social Network Analysis2010 Catalyst Conference - Trends in Social Network Analysis
2010 Catalyst Conference - Trends in Social Network Analysis
Marc Smith
 
Survey of data mining techniques for social
Survey of data mining techniques for socialSurvey of data mining techniques for social
Survey of data mining techniques for social
Firas Husseini
 
Lecture4 Social Web
Lecture4 Social Web Lecture4 Social Web
Lecture4 Social Web
Marieke van Erp
 
Social Media Mining - Chapter 4 (Network Models)
Social Media Mining - Chapter 4 (Network Models)Social Media Mining - Chapter 4 (Network Models)
Social Media Mining - Chapter 4 (Network Models)
SocialMediaMining
 
Exploring social theory through enterprise social media (muller, ibm research)
Exploring social theory through enterprise social media (muller, ibm research)Exploring social theory through enterprise social media (muller, ibm research)
Exploring social theory through enterprise social media (muller, ibm research)
Michael Muller
 
Lecture 5: Mining, Analysis and Visualisation
Lecture 5: Mining, Analysis and VisualisationLecture 5: Mining, Analysis and Visualisation
Lecture 5: Mining, Analysis and Visualisation
Marieke van Erp
 
01-introduction.ppt the paper that you can unless you want to join me because...
01-introduction.ppt the paper that you can unless you want to join me because...01-introduction.ppt the paper that you can unless you want to join me because...
01-introduction.ppt the paper that you can unless you want to join me because...
teodroscampaus
 
Data mining for social media
Data mining for social mediaData mining for social media
Data mining for social media
rangesharp
 
Knowledge Extraction from Social Media
Knowledge Extraction from Social MediaKnowledge Extraction from Social Media
Knowledge Extraction from Social Media
Seth Grimes
 
User Behaviour Pattern Recognition On Twitter Social Network
User Behaviour Pattern Recognition On Twitter Social NetworkUser Behaviour Pattern Recognition On Twitter Social Network
User Behaviour Pattern Recognition On Twitter Social Network
George Konstantakopoulos
 
Opportunities and methodological challenges of Big Data for official statist...
Opportunities and methodological challenges of  Big Data for official statist...Opportunities and methodological challenges of  Big Data for official statist...
Opportunities and methodological challenges of Big Data for official statist...
Piet J.H. Daas
 
Characterizing Data and Software for Social Science Research
Characterizing Data and Software for Social Science ResearchCharacterizing Data and Software for Social Science Research
Characterizing Data and Software for Social Science Research
Micah Altman
 
Social Network Analysis Using Gephi
Social Network Analysis Using Gephi Social Network Analysis Using Gephi
Social Network Analysis Using Gephi
Goa App
 
Social Network Analysis with NodeXL Part 1
Social Network Analysis with NodeXL Part 1Social Network Analysis with NodeXL Part 1
Social Network Analysis with NodeXL Part 1
Dr Wasim Ahmed
 
Social Computing: From Social Informatics to Social Intelligence
Social Computing: From Social Informatics to Social IntelligenceSocial Computing: From Social Informatics to Social Intelligence
Social Computing: From Social Informatics to Social Intelligence
Teklu_U
 
SOCIAM: The Theory and Practice of Social Machines
SOCIAM: The Theory and Practice of Social MachinesSOCIAM: The Theory and Practice of Social Machines
SOCIAM: The Theory and Practice of Social Machines
SOCIAM Project
 
Intelligent Data Processing for the Internet of Things
Intelligent Data Processing for the Internet of Things Intelligent Data Processing for the Internet of Things
Intelligent Data Processing for the Internet of Things
PayamBarnaghi
 
Dm sei-tutorial-v7
Dm sei-tutorial-v7Dm sei-tutorial-v7
Dm sei-tutorial-v7
CS, NcState
 
Citizen Sensor Data Mining, Social Media Analytics and Applications
Citizen Sensor Data Mining, Social Media Analytics and ApplicationsCitizen Sensor Data Mining, Social Media Analytics and Applications
Citizen Sensor Data Mining, Social Media Analytics and Applications
Amit Sheth
 

Similar to Clarkson - Joshua White - Research Proposal Presentation (20)

Practical Applications for Social Network Analysis in Public Sector Marketing...
Practical Applications for Social Network Analysis in Public Sector Marketing...Practical Applications for Social Network Analysis in Public Sector Marketing...
Practical Applications for Social Network Analysis in Public Sector Marketing...
 
2010 Catalyst Conference - Trends in Social Network Analysis
2010 Catalyst Conference - Trends in Social Network Analysis2010 Catalyst Conference - Trends in Social Network Analysis
2010 Catalyst Conference - Trends in Social Network Analysis
 
Survey of data mining techniques for social
Survey of data mining techniques for socialSurvey of data mining techniques for social
Survey of data mining techniques for social
 
Lecture4 Social Web
Lecture4 Social Web Lecture4 Social Web
Lecture4 Social Web
 
Social Media Mining - Chapter 4 (Network Models)
Social Media Mining - Chapter 4 (Network Models)Social Media Mining - Chapter 4 (Network Models)
Social Media Mining - Chapter 4 (Network Models)
 
Exploring social theory through enterprise social media (muller, ibm research)
Exploring social theory through enterprise social media (muller, ibm research)Exploring social theory through enterprise social media (muller, ibm research)
Exploring social theory through enterprise social media (muller, ibm research)
 
Lecture 5: Mining, Analysis and Visualisation
Lecture 5: Mining, Analysis and VisualisationLecture 5: Mining, Analysis and Visualisation
Lecture 5: Mining, Analysis and Visualisation
 
01-introduction.ppt the paper that you can unless you want to join me because...
01-introduction.ppt the paper that you can unless you want to join me because...01-introduction.ppt the paper that you can unless you want to join me because...
01-introduction.ppt the paper that you can unless you want to join me because...
 
Data mining for social media
Data mining for social mediaData mining for social media
Data mining for social media
 
Knowledge Extraction from Social Media
Knowledge Extraction from Social MediaKnowledge Extraction from Social Media
Knowledge Extraction from Social Media
 
User Behaviour Pattern Recognition On Twitter Social Network
User Behaviour Pattern Recognition On Twitter Social NetworkUser Behaviour Pattern Recognition On Twitter Social Network
User Behaviour Pattern Recognition On Twitter Social Network
 
Opportunities and methodological challenges of Big Data for official statist...
Opportunities and methodological challenges of  Big Data for official statist...Opportunities and methodological challenges of  Big Data for official statist...
Opportunities and methodological challenges of Big Data for official statist...
 
Characterizing Data and Software for Social Science Research
Characterizing Data and Software for Social Science ResearchCharacterizing Data and Software for Social Science Research
Characterizing Data and Software for Social Science Research
 
Social Network Analysis Using Gephi
Social Network Analysis Using Gephi Social Network Analysis Using Gephi
Social Network Analysis Using Gephi
 
Social Network Analysis with NodeXL Part 1
Social Network Analysis with NodeXL Part 1Social Network Analysis with NodeXL Part 1
Social Network Analysis with NodeXL Part 1
 
Social Computing: From Social Informatics to Social Intelligence
Social Computing: From Social Informatics to Social IntelligenceSocial Computing: From Social Informatics to Social Intelligence
Social Computing: From Social Informatics to Social Intelligence
 
SOCIAM: The Theory and Practice of Social Machines
SOCIAM: The Theory and Practice of Social MachinesSOCIAM: The Theory and Practice of Social Machines
SOCIAM: The Theory and Practice of Social Machines
 
Intelligent Data Processing for the Internet of Things
Intelligent Data Processing for the Internet of Things Intelligent Data Processing for the Internet of Things
Intelligent Data Processing for the Internet of Things
 
Dm sei-tutorial-v7
Dm sei-tutorial-v7Dm sei-tutorial-v7
Dm sei-tutorial-v7
 
Citizen Sensor Data Mining, Social Media Analytics and Applications
Citizen Sensor Data Mining, Social Media Analytics and ApplicationsCitizen Sensor Data Mining, Social Media Analytics and Applications
Citizen Sensor Data Mining, Social Media Analytics and Applications
 

More from Joshua S. White, PhD josh@securemind.org

Presentation - Hybrid Sentiment Analysis Utilizing Multiple Indicators To Det...
Presentation - Hybrid Sentiment Analysis Utilizing Multiple Indicators To Det...Presentation - Hybrid Sentiment Analysis Utilizing Multiple Indicators To Det...
Presentation - Hybrid Sentiment Analysis Utilizing Multiple Indicators To Det...
Joshua S. White, PhD josh@securemind.org
 
Presentation - Social Relevance Toward Understanding the Impact of the Indivi...
Presentation - Social Relevance Toward Understanding the Impact of the Indivi...Presentation - Social Relevance Toward Understanding the Impact of the Indivi...
Presentation - Social Relevance Toward Understanding the Impact of the Indivi...
Joshua S. White, PhD josh@securemind.org
 
Presentation - Application of Actor Level Social Characteristic Indicator Sel...
Presentation - Application of Actor Level Social Characteristic Indicator Sel...Presentation - Application of Actor Level Social Characteristic Indicator Sel...
Presentation - Application of Actor Level Social Characteristic Indicator Sel...
Joshua S. White, PhD josh@securemind.org
 
Supraja_SMS_presentation
Supraja_SMS_presentationSupraja_SMS_presentation
ase-social-informatics (6)
ase-social-informatics (6)ase-social-informatics (6)
ase-social-informatics (6)
Joshua S. White, PhD josh@securemind.org
 
Social Network Analysis Applications and Approach
Social Network Analysis Applications and ApproachSocial Network Analysis Applications and Approach
Social Network Analysis Applications and Approach
Joshua S. White, PhD josh@securemind.org
 
Clarkson joshua white - ids testing - spie 2013 presentation - jsw - d1
Clarkson   joshua white - ids testing - spie 2013 presentation - jsw - d1Clarkson   joshua white - ids testing - spie 2013 presentation - jsw - d1
Clarkson joshua white - ids testing - spie 2013 presentation - jsw - d1
Joshua S. White, PhD josh@securemind.org
 
Malware bek slides 20131023 final
Malware bek slides 20131023 finalMalware bek slides 20131023 final
Malware bek slides 20131023 final
Joshua S. White, PhD josh@securemind.org
 
CSIAC - Social Media Analysis and Privacy
CSIAC - Social Media Analysis and PrivacyCSIAC - Social Media Analysis and Privacy
CSIAC - Social Media Analysis and Privacy
Joshua S. White, PhD josh@securemind.org
 
Coalmine spie 2012 presentation - jsw -d3
Coalmine   spie 2012 presentation - jsw -d3Coalmine   spie 2012 presentation - jsw -d3
Coalmine spie 2012 presentation - jsw -d3
Joshua S. White, PhD josh@securemind.org
 
Phishing spie 2012 presentation - jsw - d2
Phishing   spie 2012 presentation - jsw - d2Phishing   spie 2012 presentation - jsw - d2
Phishing spie 2012 presentation - jsw - d2
Joshua S. White, PhD josh@securemind.org
 
Physical Layer Optical Network Security Thesis Presentation To The CNY ISSA C...
Physical Layer Optical Network Security Thesis Presentation To The CNY ISSA C...Physical Layer Optical Network Security Thesis Presentation To The CNY ISSA C...
Physical Layer Optical Network Security Thesis Presentation To The CNY ISSA C...
Joshua S. White, PhD josh@securemind.org
 

More from Joshua S. White, PhD josh@securemind.org (12)

Presentation - Hybrid Sentiment Analysis Utilizing Multiple Indicators To Det...
Clarkson - Joshua White - Research Proposal Presentation

  • 1. Thesis Proposal: A Quantitative Analysis of the Spread of Information in Social Networks Joshua S. White Advisor: Jeanna N. Matthews, PhD 03/05/13
  • 2. Outline • Problem • To Date • Recently Completed Work • Current Work • Inspiration • Unanswered Questions • Current Tool Kits • Our Approach • Schedule of Completion
  • 3. Problem • Social media networks are the fastest-growing part of Internet content today, and make up its largest portion 1 • These networks have only recently (2010-present) been studied in any detail • Most work has sampled small portions of the network and tried to predict outcomes (e.g., predicting politics) 1. Tom Pick. "102 Compelling Social Media and Online Marketing Stats and Facts for 2012 (and 2013)." Business 2 Community. January 2, 2013
  • 4. Problem (continued) [Chart: ACM Digital Library Search Results (sampled Dec. 2012, total = 20,796). Categories shown: Social Networks and Political Analysis; Using Social Networks as Datasets for Machine Learning; Twitter; Actor Types in Any Network; Social Network Graphing; Malware and Social Networks; Social Network Memes; Botnets and Social Networks; Individuals' Influence on Social Networks; Social Network Analysis Tools; Actor Types in Twitter]
  • 5. To Date: Coalmine • The basis for a Social Network Analysis Tool Coalmine: an experience in building a system for social media analytics JS White, JN Matthews, JL Stacy SPIE Defense, Security, and Sensing, 84080A-84080A-11
  • 7. To Date: Coalmine • Coalmine – Method scales well based on initial tests – Manual and automated detection – Configurable data collection capabilities – Trial and error filter design tool • At the Time (Major Future Work) – Rebuild of the tool: • Fix scaling limitations • More extensible Map/Reduce method – Solve map-piping issue • Inclusion of multi-job support • New storage and distribution method – Solve replication and state issues
  • 8. Coalmine: Data Set Overview • Over the course of 2012 we collected 165 TB of Twitter data (uncompressed) – 147 “Full Days”, 100 “Partial Days” • Estimated 65 billion tweets 1 – Twitter traffic at est. 175 million tweets per day in 2012 – Collection rates between 50% and 80% for “Full Days” – Data in JSON format using Twitter's REST API 1. Shea Bennett. "Just How Big Is Twitter In 2012 [INFOGRAPHIC]," All Twitter - The Unofficial Twitter Resource, February 2013
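As a rough sanity check on the figures quoted on this slide, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate only: it assumes a constant 175 million tweets per day across all 366 days of 2012, and it ignores the 100 "Partial Days" because the slide gives no collection rate for them. The constant names are illustrative.

```python
# Figures taken from the slide; everything derived from them is an estimate.
TWEETS_PER_DAY = 175_000_000   # estimated Twitter traffic per day in 2012
DAYS_IN_2012 = 366             # 2012 was a leap year
FULL_DAYS = 147                # days with a "full" collection

# Estimated total Twitter traffic for the year (assumes a constant daily rate).
yearly_total = TWEETS_PER_DAY * DAYS_IN_2012

# Estimated tweets captured on full days at the 50% and 80% collection rates.
full_day_traffic = FULL_DAYS * TWEETS_PER_DAY
collected_low = full_day_traffic * 50 // 100
collected_high = full_day_traffic * 80 // 100

print(f"Estimated yearly traffic:  {yearly_total:,}")
print(f"Collected on full days:    {collected_low:,} to {collected_high:,}")
```

Under these assumptions the yearly traffic comes to about 64 billion tweets, which is in line with the 65 billion estimate on the slide.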
  • 9. Coalmine: Data Set Overview • Basic observable patterns – Twitter has a lot of outages – Posting rates follow predictable patterns
  • 10. To Date: Phishing Analysis A method for the automated detection of phishing websites through both site characteristics and image analysis JS White, JN Matthews, JL Stacy SPIE Defense, Security, and Sensing, 84080B-84080B-11
  • 11. To Date: Phishing Analysis • pHash process: – Reduce image size to 32 x 32 pixels – Reduce the color to greyscale – Calculate the DCT (creates frequency scalars) – Reduce the DCT to 8 x 8 – Second DCT reduction: set each bit to 1 or 0 depending on whether the value is above or below the average DCT value – Take the hash
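The pHash steps listed on this slide can be sketched roughly as follows. This is a minimal illustration using only NumPy, not the implementation used in the paper: the nearest-neighbour resize, the luma weights, the choice of the top-left 8 x 8 DCT block, and all function names are assumptions made for the example. Some pHash variants also exclude the DC term from the average; here the plain mean of the 8 x 8 block is used, as the slide describes.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)
    return m

def resize_nearest(img, size=32):
    """Nearest-neighbour resize to size x size (stand-in for an image library)."""
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[np.ix_(rows, cols)]

def to_greyscale(rgb):
    """Luma-weighted greyscale conversion of an H x W x 3 array (assumed weights)."""
    return rgb[..., :3] @ np.array([0.299, 0.587, 0.114])

def phash(rgb):
    """Return a 64-bit perceptual hash of an H x W x 3 image array."""
    small = resize_nearest(np.asarray(rgb, dtype=float), 32)  # 32 x 32 pixels
    grey = to_greyscale(small)                                # greyscale
    d = dct_matrix(32)
    freq = d @ grey @ d.T                                     # 2-D DCT
    low = freq[:8, :8]                                        # keep 8 x 8 block
    bits = (low > low.mean()).flatten()                       # above/below average
    return int("".join("1" if b else "0" for b in bits), 2)   # pack bits into hash

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")
```

Two images can then be compared by the Hamming distance between their hashes; a small distance suggests visually similar pages, with the threshold left as a tuning choice.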
  • 12. To Date: Phishing Analysis • Two Methods: – Page characteristic analysis – Image similarity analysis • Proof of concept system • Need for a generically customizable filter
  • 13. Recently Completed Work: • BEK Infection Vector Analysis – Finished dev. of a filter for detection of suspect accounts • Submitted to the IEEE CNS (Communications Network Security) – “It's you on photo?: Automatic Detection of Twitter Accounts Infected With the Blackhole Exploit Kit”
  • 14. Recently Completed Work: Normal = Infectious
  • 15. Current Work: • KONY2012 Meme Analysis – Finished extraction of relevant data, identification of tag variants, directed graphs of information flow • Preparing for submission to ASONAM (Advances in Social Network Analysis and Mining)
  • 16. Current Work: • Actor Types Analysis – Literature review completed, started identifying statistical and temporal characteristics of each type • Planned for submission to LEET'13 (Large Scale Emerging Exploits and Threats)
  • 17. Inspiration • Our work was inspired in part by Malcolm Gladwell’s book, The Tipping Point 1 – Life as an epidemic • Thinking this way led us to consider the spread of information and trends in terms of an outbreak where key people (Mavens, Connectors, and Salesmen) are primarily responsible. 1. Gladwell, M. (2000). The tipping point. Boston: Little, Brown and Company.
  • 18. Some Unanswered Questions • Automatic classification of actor types in social networks. – Do Gladwell's classifications apply? • Connectors, mavens and salesmen – Who are the opinion leaders? • Privacy-related implications of social network analysis • Do social networks have the level of impact on public opinion/mass media that some believe? – Can we predict changes in the public's or individuals' opinions using social network datasets as a base? – Can we predict how memes/news will spread? – Are individuals covertly manipulating mass media through social networks? • Is there a generally applicable way to identify major events like natural disasters as they happen?
  • 19. Current Tool Kits • Tool Kits and Methods: – Only one well-developed tool kit: • NodeXL 1 – Small datasets (under 5000 nodes) – Built-in statistics and data collection capabilities – Built on MS Excel – Allows exploration of group relationships – Highest usage seems to be for politics-related research 1. Smith, M., Milic-Frayling, N., Shneiderman, B., Mendes Rodrigues, E., Leskovec, J., Dunne, C., (2010). NodeXL: a free and open network overview, discovery and exploration add-in for Excel 2007/2010, http://nodexl.codeplex.com/ from the Social Media Research Foundation, http://www.smrfoundation.org
  • 20. Approach • Borrow from traditional “Social Network Analysis” as it relates to the study of Sociology • Most tools can't handle extremely large datasets – We employ the MapReduce methodology as our core for data analysis • Treat the analysis system like a filtering system and build “rules” for how the data should be processed • Each rule is essentially constrained to a single Mapper • Use case studies based on available data to develop individual statistics and rules
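The idea of a "rule" constrained to a single Mapper could look something like the sketch below. It is a toy, single-process illustration of the map/group/reduce flow, not Coalmine's actual code: the `hashtag_rule` rule, the `run_job` driver, and the assumption that each input line is one JSON tweet with a `text` field are all hypothetical.

```python
import json
from collections import defaultdict

def hashtag_rule(tweet):
    """Hypothetical rule (mapper): emit (hashtag, 1) for each hashtag in the text."""
    for word in tweet.get("text", "").split():
        if word.startswith("#") and len(word) > 1:
            yield word.lower(), 1

def run_job(lines, rule):
    """Apply one rule as the map step, group by key, then sum values per key."""
    grouped = defaultdict(list)
    for line in lines:
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed records rather than failing the job
        if not isinstance(tweet, dict):
            continue  # skip records that are valid JSON but not tweet objects
        for key, value in rule(tweet):       # map
            grouped[key].append(value)       # shuffle / group by key
    return {key: sum(values) for key, values in grouped.items()}  # reduce

if __name__ == "__main__":
    sample = [
        '{"text": "Stop #Kony2012 now #kony2012"}',
        '{"text": "#KONY2012"}',
        "not json",
    ]
    print(run_job(sample, hashtag_rule))
```

Keeping each rule as a plain function of one record means new rules can be added without changing the driver, which is one way to read the "filtering system" framing on the slide.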