CSIAC - Social Media Analysis and Privacy

Published in Technology

Transcript

  • 1. Social Media Analysis and Privacy
      • Unclassified // Public Release
      • Joshua White, whitej@ainfosec.com
      • Senior Computer Engineer, Assured Information Security, http://ainfosec.com
      • PhD Student of Engineering Science, Clarkson University
      • Date: Oct 31, 2012
      • Release: Unclassified // Public
      • 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.com
      • Copyright 2012 Assured Information Security, Inc.
  • 2. About: Company
      • AIS (Assured Information Security)
          • Research and development of technologies and capabilities to support effective operations within the entirety of the cyber domain.
          • Leading pioneers in the disciplines of Information Operations, including Network Operations, Electronic Warfare, and Computer Network Operations of all types.
      • Located in:
          • Rome, NY (Corporate Headquarters)
          • Portland, OR
          • Baltimore, MD
          • Beavercreek, OH
          • San Antonio, TX
          • Colorado Springs, CO
  • 3. About: Speaker
      • Joshua White
      • Education:
          • AAS Computer Network Technology (FLCC)
          • BS / MS Telecommunications (SUNYIT)
          • PhD Student of Engineering Science (Clarkson University)
      • Experience:
          • 7+ years Government Contracting in Information Security and Telecommunications Engineering
      • Areas of Study:
          • Intrusion Detection Systems
          • Optical Network Security
          • Large Dataset Analysis
          • Distributed Processing
  • 4. Overview: The Big Questions
      • Introduction
      • The Big Research Questions:
          • What are social media networks?
          • What is the privacy problem relating to them?
          • Who would want this data and why?
          • What rights of privacy must I protect?
          • What regulations regarding privacy exist?
          • What happens if I don't protect the privacy?
      • Conclusions
      • References
  • 5. What are social media networks?
  • 6. Definition
      • Social Media Networks
          • DHS identified multiple categories [1]
              • Search
              • Video
              • Maps
              • Photos
              • Blog aggregates
              • Micro-blogs
              • Traditional social networks
  • 7. What's the privacy problem as it relates to these social media networks?
  • 8. Problem
      • Two-Part Problem
          • End Users
              • Unsure or unaware of how to properly protect their privacy
          • Data Collectors
              • Don't know how to properly maintain the privacy of their datasets
  • 9. Problem
      • Social media networking sites
          • Provide a communications method thought by many to be at least somewhat private
          • Many users never change the default security settings associated with their accounts
          • Example: Percentage of Facebook users by age who change their account security settings to anything other than the default (no security) setting [2]
              • 18-19 years old = 71%
              • 30-39 years old = 67%
              • 50-64 years old = 55%
          • 80% of all users fall within the 18-64 age range
          • An estimated 20+ million users have no security but must still have a basic expectation of privacy
          • Provide the largest “Social Network” datasets available for study
  • 10. Who would want this data and why?
  • 11. Problem Focus: Data Collectors
      • The problem of user expectations and knowledge of privacy settings is for another discussion
      • Let's focus on the “larger” problem
          • Data Collection
              • What can we collect?
              • What can we do with the data?
              • How must we protect the privacy of an individual’s PII contained within the datasets?
  • 12. Social Media Networks Awareness
      • Benefits:
          • Government
              • Track locations of persons of interest with reasonable accuracy
              • “Bad guys” may have protected posts
                  • Sometimes accessible by simply looking at their friends’ posts, or even other sites that they have allowed access within their accounts
              • Track trends
                  • Who said what, who repeated it?
                  • Is it going to cause a riot or, worse yet, a war?
              • News before “official” reports
                  • Natural disasters, shootings, etc.
  • 13. Social Media Networks Awareness
      • Benefits:
          • Businesses
              • Directed advertising
                  • Track locations of consumers with reasonable accuracy
                  • Track buying habits and interests
              • Track trends
                  • Who said what, who repeated it, is something going to affect a brand?
              • News before “official” reports
                  • Did something happen that will affect the market rapidly?
                  • Natural disasters, news reports, etc.
  • 14. Social Media Networks Awareness
      • Benefits:
          • Academia
              • Research
                  • Track locations of subjects with reasonable accuracy
                  • Track habits, interests and moods over time
              • Track trends
                  • Who said what, who repeated it (graph theory)?
              • Study social networks with the largest datasets ever created
              • Collaborate with millions
              • Build prediction models
  • 15. Social Media Networks Awareness
      • It Concerns Groups Differently
          • Persons of interest
              • Don't want to be incriminated in things that they may not have done
          • Consumers
              • Don't want others to know things about their buying habits that can be used against them
          • Subjects
              • Don't want information released that might cause them to be judged by their peers
      • Some Concerns Everyone Shares
          • Discrimination
          • A feeling of (privacy) violation
  • 16. Case Study: Twitter
      • A real-time social media network of microblogs
      • Various APIs
          • Search, Live, Historical
      • Highly accessible
          • Example: NodeXL offers an MS Excel plugin for quickly grabbing a few thousand samples a day from multiple sites
      • Large user base
          • 65+ million “tweets” per day
          • 750+ “tweets” per second
      • International community
          • At least 27 languages represented
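The two volume figures on slide 16 are consistent with each other. A quick check, assuming tweets are spread evenly across a 24-hour day (an assumption made here for the arithmetic, not a claim from the slides):

```python
# Convert the quoted daily tweet volume into an average per-second rate.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def tweets_per_second(tweets_per_day: int) -> float:
    """Average rate, assuming a perfectly even spread over the day."""
    return tweets_per_day / SECONDS_PER_DAY


rate = tweets_per_second(65_000_000)
print(f"{rate:.0f} tweets per second")  # prints "752 tweets per second"
```

Real traffic is bursty, so peak rates would be higher than this average.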
  • 17. Case Study: Twitter
      • Twitter is used by:
          • People
              • Everyday individuals
              • Politicians
              • Celebrities
              • Professionals
              • Bad guys
          • Objects
              • Gadgets that tweet (sensors, bots, computers, spammers)
          • Labeled nefarious groups
              • Lulzsec
              • Anonymous
              • Others
  • 18. Case Study: Twitter
      • What's accessible:
          • Posts contain far more than what's shown in the http://www.twitter.com web interface
          • Data is accessible as XML or in its native JSON form
          • Data includes:
              • Location (Geo fields)
              • User names / real names
              • Threading
                  • Track conversations using replies
                  • Track re-tweets
              • Twitter client software data
              • Time stamping
              • Tweet text
              • And so much more
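To make the list on slide 18 concrete, here is a minimal sketch of pulling those fields out of one JSON record. The record is invented for illustration, and the field names (`coordinates`, `in_reply_to_status_id`, `retweeted_status`, `source`, and so on) are assumptions modelled on Twitter's JSON format of that period; they may not match what a given API version returns.

```python
import json

# Invented sample record; field names are assumed, not taken from the slides.
SAMPLE_TWEET = """
{
  "id": 1001,
  "created_at": "Wed Oct 31 14:05:00 +0000 2012",
  "text": "RT @example_user: storm surge on Main St",
  "source": "web",
  "user": {"screen_name": "sample_account", "name": "Sample Person"},
  "coordinates": {"type": "Point", "coordinates": [-75.45, 43.21]},
  "in_reply_to_status_id": null,
  "retweeted_status": {"id": 1000, "user": {"screen_name": "example_user"}}
}
"""


def summarize(tweet: dict) -> dict:
    """Collect the fields that are potentially identifying."""
    coords = tweet.get("coordinates") or {}          # may be null or absent
    retweeted = tweet.get("retweeted_status") or {}  # present only on retweets
    return {
        "screen_name": tweet["user"]["screen_name"],
        "real_name": tweet["user"].get("name"),
        "timestamp": tweet.get("created_at"),
        "client": tweet.get("source"),
        "lon_lat": coords.get("coordinates"),
        "reply_to": tweet.get("in_reply_to_status_id"),
        "retweet_of": retweeted.get("id"),
        "text": tweet.get("text"),
    }


summary = summarize(json.loads(SAMPLE_TWEET))
for field, value in summary.items():
    print(f"{field}: {value}")
```

Only the text and the screen name would normally be visible on the web page; the rest travels with the record.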
  • 19. Case Study: Twitter
  • 20. Case Study: Twitter
      • What can be done with all of this:
          • NYC company Dataminr [3]
              • Reported the death of Bin Laden:
                  • 25 minutes after he was killed
                  • 13 minutes before the president's address
              • They saw the first message regarding this only 19 minutes after it happened
              • They were able to trace even earlier messages that, with the right algorithms, would have shown something going on before the initial military strike
                  • Reports of a US helicopter flying overhead
  • 21. Case Study: Twitter
      • Consequences
          • Data on sites like Twitter can be used to:
              • Predict Social Security numbers with reasonable accuracy [4]
              • Deduce the gender of an individual from nothing but the message text [5]
              • Track a person's physical location and create predictable pattern maps
              • Deny services based on views and opinions expressed
              • Use posts, even those that were deleted, as evidence in court [6]
              • So much more
  • 22. What rights of privacy must I protect? & What laws regarding privacy regulation exist?
  • 23. Privacy Protection / Regulation
      • First we need a strict definition for what is and isn't PII (Personally Identifiable Information)
          • PII is any information that can be used to identify a specific individual
          • This includes data that can be combined with other sources to identify an individual
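The second half of the definition on slide 23 (data that becomes identifying once combined with other sources) can be sketched as a simple join. Everything below is invented for illustration: both datasets, the choice of ZIP code, birth year, and gender as linking attributes, and the `link` helper are assumptions, not material from the talk.

```python
from collections import defaultdict

# Invented "released" records with names removed.
released = [
    {"record": "A", "zip": "13441", "birth_year": 1985, "gender": "F"},
    {"record": "B", "zip": "13441", "birth_year": 1990, "gender": "M"},
    {"record": "C", "zip": "13699", "birth_year": 1990, "gender": "M"},
]

# Invented second source that does carry names.
roster = [
    {"name": "Person One", "zip": "13441", "birth_year": 1985, "gender": "F"},
    {"name": "Person Two", "zip": "13441", "birth_year": 1990, "gender": "M"},
    {"name": "Person Three", "zip": "13441", "birth_year": 1990, "gender": "M"},
]

# Attributes that are harmless alone but identifying in combination.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def link(released_rows, roster_rows):
    """Return released records that match exactly one named person."""
    names_by_key = defaultdict(list)
    for person in roster_rows:
        key = tuple(person[attr] for attr in QUASI_IDENTIFIERS)
        names_by_key[key].append(person["name"])

    matches = {}
    for row in released_rows:
        key = tuple(row[attr] for attr in QUASI_IDENTIFIERS)
        names = names_by_key.get(key, [])
        if len(names) == 1:  # a unique match re-identifies the record
            matches[row["record"]] = names[0]
    return matches


matches = link(released, roster)
print(matches)  # {'A': 'Person One'}
```

Record B matches two people and record C matches nobody, so only record A is re-identified; none of the released rows contained a name.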
  • 24. Privacy Protection / Regulation
      • You decided to use this data, what's next?
          • Protecting the PII of individuals within the dataset is key, and to some extent dependent on who you are
          • We're back to:
              • Government
              • Businesses
              • Academics
      • Let's concentrate on US law during the rest of this talk
  • 25. Privacy Protection / Regulation
      • The US Government
          • Must protect the privacy of its citizens
              • Federal: Cannot collect data on citizens without a warrant
              • States: Cannot collect data on citizens without just cause
          • Cannot deny citizens the right to use social media networks
          • Cannot enforce privacy on the individual
          • Can enforce regulations on the social media companies and those who use the data
  • 26. Privacy Protection / Regulation
      • Businesses in the US
          • Must protect the privacy of consumers
          • Must abide by regulations imposed by the government that the site is located within
          • While not required by law, it's good practice to let consumers know what is being done with their data
  • 27. Privacy Protection / Regulation
      • Academics in the US
          • Must protect the privacy of subjects
              • This applies even in instances where data is gathered without consent, such as from social media network sites
              • Consent is not required for the collection of information from these sites
          • Depending on the specific site's EULA, datasets may:
              • Not be shared with other researchers outside of the organization
              • Not be duplicated within a publication
                  • Summation through statistics and results is OK
  • 28. What happens if I don't protect the privacy of the individuals within my datasets?
  • 29. Consequences
      • There are obvious legal ramifications for not protecting the privacy of individuals within a dataset
          • Legal (Federal / State)
          • Legal personal injury
      • Not so obvious
          • Loss of consumer trust / support
          • Loss of position through ethics violation clauses
  • 30. Consequences: Example
      • Ethics can be tied closely to privacy
      • Harvard researchers accessed complete Facebook profiles of 1700 students [7]
          • Data consisted of public profiles collected within the university
          • Researchers outside the university had to apply for access to the data
          • The data manual contained statistics about the dataset and did not require the application to be filled out
          • These statistics were used to identify individuals
      • Consequently, the researchers lost funding, and the university found that opinion of the school had fallen
      • Researchers were put before the ethics board
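The Harvard example turns on summary statistics that were specific enough to single people out. One way to screen for this before publishing is a k-anonymity style check on group sizes; this is a swapped-in technique that the slides do not mention, and the rows, attribute names, and `small_groups` helper below are all hypothetical.

```python
from collections import Counter


def small_groups(rows, attributes, k):
    """Return attribute combinations shared by fewer than k rows."""
    counts = Counter(tuple(row[attr] for attr in attributes) for row in rows)
    return {group: size for group, size in counts.items() if size < k}


# Invented rows standing in for a per-student dataset.
rows = [
    {"major": "Economics", "home_state": "NY"},
    {"major": "Economics", "home_state": "NY"},
    {"major": "Economics", "home_state": "NY"},
    {"major": "Physics", "home_state": "MT"},
    {"major": "Physics", "home_state": "NY"},
    {"major": "Physics", "home_state": "NY"},
]

risky = small_groups(rows, ("major", "home_state"), k=2)
print(risky)  # {('Physics', 'MT'): 1}
```

A published table that reports "one Physics major from MT" describes exactly one person, so that cell would need to be suppressed or merged into a broader category.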
  • 31. Conclusion
      • Social media network datasets contain PII
      • PII is not just profile data; it's also unseen fields such as geo-location and data that can be derived from the messages posted
      • Datasets cannot be shared outside an organization without prior permission if required by the EULA
      • If the EULA allows for sharing of the data, it still must be properly anonymized
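As a rough illustration of the anonymization step named in the conclusion, the sketch below replaces the account name with a keyed hash, drops the real name, and coarsens the coordinates. The record layout, the key, and the one-decimal rounding are assumptions chosen for the example; the talk does not prescribe a method, and these steps alone do not guarantee anonymity.

```python
import hashlib
import hmac

# Hypothetical key; it must be stored separately from the shared dataset.
SECRET_KEY = b"replace-with-a-key-kept-outside-the-dataset"


def pseudonym(screen_name: str) -> str:
    """Stable pseudonym from a keyed hash (HMAC-SHA256), truncated for brevity."""
    digest = hmac.new(SECRET_KEY, screen_name.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymize(record: dict) -> dict:
    """Keep only the fields needed for analysis, in reduced form."""
    lon, lat = record["lon_lat"]
    return {
        "user": pseudonym(record["screen_name"]),   # same user, same pseudonym
        "lon_lat": [round(lon, 1), round(lat, 1)],  # coarsen the location
        "text": record["text"],                     # real_name is dropped entirely
    }


record = {
    "screen_name": "Sample_Account",
    "real_name": "Sample Person",
    "lon_lat": [-75.4512, 43.2128],
    "text": "example post",
}
cleaned = anonymize(record)
print(cleaned)
```

The message text is passed through unchanged here, so anything derivable from it remains exposed; a real release would need to treat the text as well.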
  • 32. References
      • [1] DHS, Office of Operations Coordination and Planning, “Publicly Available Social Media Monitoring and Situational Awareness Initiative,” June 22, 2010.
      • [2] Vaidhyanathan, S., “Welcome to the surveillance society,” IEEE Spectrum, vol. 48, no. 6, pp. 48-51, June 2011. doi: 10.1109/MSPEC.2011.5779791
      • [3] Brodkin, Jon, “Bin Laden death-detecting analytics services signs partnership with Twitter,” Ars Technica, Apr. 9, 2012.
      • [4] Acquisti, Alessandro, and Gross, Ralph, “Predicting Social Security Numbers from public data,” Proceedings of the National Academy of Sciences, vol. 106, no. 27, July 7, 2009.
      • [5] Burger, John, et al., “Discriminating Gender on Twitter,” Mitre Corp, Nov. 2011.
      • [6] Smith, “No warrant needed, no privacy: Judge rules even deleted tweets can be used in court,” Network World, Apr. 24, 2012.
      • [7] Parry, Marc, “Harvard Researchers Accused of Breaching Students’ Privacy,” The Chronicle of Higher Education, July 10, 2011.
  • 33. Questions