CSIAC - Social Media Analysis and Privacy
Presentation Transcript

  • Unclassified // Public Release
    Social Media Analysis and Privacy
    Joshua White, whitej@ainfosec.com
    Senior Computer Engineer, Assured Information Security (http://ainfosec.com)
    PhD Student of Engineering Science, Clarkson University
    Date: Oct 31, 2012
    Release: Unclassified // Public
    153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.com
    Copyright 2012 Assured Information Security, Inc.
  • About: Company
    • AIS (Assured Information Security)
      • Research and development of technologies and capabilities to support effective operations within the entirety of the cyber domain.
      • Leading pioneers in the disciplines of Information Operations, including Network Operations, Electronic Warfare, and Computer Network Operations of all types.
    • Located in:
      • Rome, NY (Corporate Headquarters)
      • Portland, OR
      • Baltimore, MD
      • Beavercreek, OH
      • San Antonio, TX
      • Colorado Springs, CO
  • About: Speaker
    • Joshua White
    • Education:
      • AAS, Computer Network Technology (FLCC)
      • BS / MS, Telecommunications (SUNYIT)
      • PhD Student of Engineering Science (Clarkson University)
    • Experience:
      • 7+ years of government contracting in information security and telecommunications engineering
    • Areas of Study:
      • Intrusion Detection Systems
      • Optical Network Security
      • Large Dataset Analysis
      • Distributed Processing
  • Overview: The Big Questions
    • Introduction
    • The Big Research Questions:
      • What are social media networks?
      • What is the privacy problem relating to them?
      • Who would want this data and why?
      • What rights of privacy must I protect?
      • What regulations regarding privacy exist?
      • What happens if I don’t protect the privacy?
    • Conclusions
    • References
  • What are social media networks?
  • Definition
    • Social Media Networks
      • DHS identified multiple categories [1]
        • Search
        • Video
        • Maps
        • Photos
        • Blog aggregates
        • Micro-blogs
        • Traditional social networks
  • What’s the privacy problem as it relates to these social media networks?
  • Unclassified // Public ReleaseProblem  Two-Part Problem  End Users  Unsure or unaware on ways to properly protect their privacy  Data Collectors  Dont know how to properly maintain the privacy of their datasets 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
  • Unclassified // Public ReleaseProblem Social Media-Networking Sites  Provide a communications method thought by many to be at least somewhat private  Many never change the default security settings associated with their accounts  Example: Percentage of Facebook users by age that change their account security settings to anything other then the default (no security) setting [2]  18-19 years old = 71%  30-39 years old = 67%  50-64 years old = 55%  80% of all users fall within the 18-64 age range  Estimated 20+ million users have no security but must still have a basic expectation of privacy  Provides the largest “Social Network” datasets available for study 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
  • Who would want this data and why?
  • Problem Focus: Data Collectors
    • The problem of user expectations and knowledge of privacy settings is for another discussion
    • Let’s focus on the “larger” problem
      • Data Collection
        • What can we collect?
        • What can we do with the data?
        • How must we protect the privacy of an individual’s PII contained within the datasets?
  • Social Media Networks Awareness
    • Benefits:
      • Government
        • Track locations of persons of interest with reasonable accuracy
          • “Bad guys” may have protected posts
          • These are sometimes accessible by simply looking at their friends’ posts, or even at other sites that they have allowed access to within their accounts
        • Track trends
          • Who said what, who repeated it?
          • Is it going to cause a riot or, worse yet, a war?
        • News before “official” reports
          • Natural disasters, shootings, etc.
  • Social Media Networks Awareness
    • Benefits:
      • Businesses
        • Directed advertising
          • Track locations of consumers with reasonable accuracy
          • Track buying habits and interests
        • Track trends
          • Who said what, who repeated it, is something going to affect a brand?
        • News before “official” reports
          • Did something happen that will affect the market rapidly?
          • Natural disasters, news reports, etc.
  • Social Media Networks Awareness
    • Benefits:
      • Academia
        • Research
          • Track locations of subjects with reasonable accuracy
          • Track habits, interests and moods over time
        • Track trends
          • Who said what, who repeated it (graph theory)?
          • Study social networks with the largest datasets ever created
        • Collaborate with millions
        • Build prediction models
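The "who said what, who repeated it" question can be framed as a directed graph in which an edge runs from the original author to each user who repeated the message. The following is a minimal sketch of that idea, not the speaker's method; the function names and the sample (repeater, author) pairs are hypothetical and invented for illustration.

```python
from collections import defaultdict


def build_repeat_graph(repeats):
    """Map each original author to the set of users who repeated them."""
    graph = defaultdict(set)
    for repeater, author in repeats:
        graph[author].add(repeater)
    return graph


def most_repeated(graph):
    """Rank authors by the number of distinct users who repeated them."""
    counts = [(author, len(repeaters)) for author, repeaters in graph.items()]
    # Sort by descending count, then by name so the order is deterministic.
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical (repeater, original author) pairs, not real data.
sample_repeats = [
    ("bob", "alice"),
    ("carol", "alice"),
    ("dave", "bob"),
    ("carol", "alice"),  # a duplicate repeat is only counted once
]

graph = build_repeat_graph(sample_repeats)
ranking = most_repeated(graph)
print(ranking)
```

Counting distinct repeaters per author is only the simplest measure; the same graph could feed other graph-theoretic analyses.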
  • Unclassified // Public ReleaseSocial Media Networks Awareness  It Concerns Groups Differently  Persons of interest  Dont want to be incriminated in things that you may not have done  Consumers  Dont want others to know things about their buying habits that can be used against them  Subjects  Dont want information released that might cause them to be judged by their peers  Some Concerns Everyone Shares  Discrimination  A feeling of (privacy) violation 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.com Copyright 2012 Assured Information Security, Inc.
  • Unclassified // Public ReleaseCase Study: Twitter  A real-time social media network of microblogs  Various APIs  Search, Live, Historical  Highly accessible  Example: NodeXL offers a MS Excel plugin for quickly grabbing a few thousand samples a day from multiple sites  Large user base  65+ million “tweets” per day  750+ “tweets” per second  International Community  At least 27 languages represented 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
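The two volume figures on this slide agree with each other: dividing the daily count by the number of seconds in a day gives an average rate just above 750 per second. A short sketch of that arithmetic, using the slide's lower bound of 65 million as the input (the function name is hypothetical):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_per_second(per_day):
    """Convert a daily message count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


tweets_per_day = 65_000_000  # lower bound quoted on the slide
rate = average_per_second(tweets_per_day)
print(f"{rate:.0f} tweets per second on average")
```

This is an average only; the peak rate during major events would be higher.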
  • Unclassified // Public ReleaseCase Study: Twitter  Twitter is used by:  People  Every Day Individuals  Politicians  Celebrities  Professionals  Bad Guys  Objects  Gadgets that tweet (Sensors, bots, computers, spammers)  Labeled Nefarious Groups  Lulzsec  Anonymous  others 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
  • Unclassified // Public ReleaseCase Study: Twitter  Whats accessible:  Posts contain far more then whats shown in the http://www.twitter.com web interface  Data is accessible as XML or in its native JSON form  Data includes:  Location (Geo fields)  User names / real names  Threading  Track conversations using replies  Track re-tweets  Twitter client software data  Time stamping  Tweet text  And so much more 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
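To illustrate how much of this can be read directly from a post's JSON form, here is a minimal sketch that parses one hand-written sample record with Python's standard `json` module. The sample is fabricated for illustration, and its field names (`geo`, `source`, `in_reply_to_screen_name`, `retweeted_status`, and so on) are assumptions modeled on the kind of fields listed above; check the API documentation for the exact schema before relying on them.

```python
import json

# Hand-written, hypothetical sample; field names are assumptions.
SAMPLE_TWEET_JSON = """
{
  "created_at": "Wed Oct 31 14:05:00 +0000 2012",
  "text": "@example_friend see you at the conference",
  "source": "web",
  "in_reply_to_screen_name": "example_friend",
  "user": {"screen_name": "example_user", "name": "Example User"},
  "geo": {"type": "Point", "coordinates": [43.21, -75.47]},
  "retweeted_status": null
}
"""


def summarize_tweet(raw_json):
    """Pull out the fields that go beyond the visible message text."""
    tweet = json.loads(raw_json)
    user = tweet.get("user") or {}
    geo = tweet.get("geo") or {}
    return {
        "screen_name": user.get("screen_name"),
        "real_name": user.get("name"),
        "coordinates": geo.get("coordinates"),
        "reply_to": tweet.get("in_reply_to_screen_name"),
        "is_retweet": tweet.get("retweeted_status") is not None,
        "client": tweet.get("source"),
        "timestamp": tweet.get("created_at"),
        "text": tweet.get("text"),
    }


summary = summarize_tweet(SAMPLE_TWEET_JSON)
for field, value in summary.items():
    print(f"{field}: {value}")
```

Every field in the summary except `text` is metadata that a reader of the web interface may never notice, which is the point the slide makes.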
  • Unclassified // Public ReleaseCase Study: Twitter  What can be done with all of this:  NYC Company DataMinr  Report the death of Bin Laden: 25 minutes after he was killed   13 minutes before the presidents address  They saw the first message regarding this only 19 minutes after it happened  They were able to trace even earlier messages that with the right algorithms would have shown something going on before the initial military strike  Reports of US helicopter flying over head 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
  • Unclassified // Public ReleaseCase Study: Twitter  Consequences  Data on sites like twitter can be used to:  Predict Social Security numbers with reasonable accuracy [4]  Deduce the gender of an individual from nothing but the message text [5]  Track a persons physical location and create predictable pattern maps  Deny services based on views and opinions expressed  Use posts, even those that were deleted as evidence in court [6]  So much more 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
  • What rights of privacy must I protect? & What laws regarding privacy regulation exist?
  • Unclassified // Public ReleasePrivacy Protection / Regulation  First we need a strict definition for what is and isnt PII (Personally Identifiable Information)  PII is any information that can be used to identify a specific individual  This includes data that can be combined with other sources to identify an individual 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
  • Unclassified // Public ReleasePrivacy Protection / Regulation  You decided to use this data, whats next?  Protecting the PII of individuals within the dataset is key, and to some extent dependent on who you are  Were back to:  Government  Businesses  Academics  Lets concentrate on US law during the rest of this talk 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
  • Unclassified // Public ReleasePrivacy Protection / Regulation  The US Government  Must protect the privacy of its citizens  Federal: Cannot collect data on citizens without a warrant  States: Cannot collect data on citizens without just cause  Cannot deny citizens the right to use social media networks  Cannot enforce privacy on the individual  Can enforce regulations on the social media companies and those who use the data 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
  • Unclassified // Public ReleasePrivacy Protection / Regulation  Businesses in the US  Must protect the privacy of consumers  Must abide by regulations imposed by the government that the site is located within  While not required by law, its good practice to let consumers know what is being done with their data 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
  • Unclassified // Public ReleasePrivacy Protection / Regulation  Academics in the US  Must protect the privacy of subjects  This applies even in instances where data is gathered without consent, such as from social media network sites  Consent is not required for the collection of information from these sites  Depending on the specific sites EULA, datasets may:  Not be shared with other researchers outside of the organization  Not be duplicated within a publication  Summation through statistics and results is OK 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
  • What happens if I don’t protect the privacy of the individuals within my datasets?
  • Unclassified // Public ReleaseConsequences  There are obvious legal ramifications for not protecting the privacy of individuals within a dataset  Legal (Federal / State)  Legal personal injury  Not so obvious  Loss of consumer trust / support  Loss of position through ethics violation clauses 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.comCopyright 2012 Assured Information Security, Inc.
  • Consequences: Example
    • Ethics can be tied closely to privacy
    • Harvard researchers accessed complete Facebook profiles of 1700 students [7]
      • Data consisted of public profiles collected within the university
      • Researchers outside the university had to apply for access to the data
      • The data manual contained statistics about the dataset, and reading it did not require the application to be filled out
      • These statistics were used to identify individuals
      • Consequently, the researchers lost funding and the University found that opinion of the school had fallen
      • Researchers were put before the ethics board
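The Harvard example shows that summary statistics alone can single people out. One common way to check for this before releasing anything is to count how many records share each combination of indirectly identifying attributes (a k-anonymity style check); any combination held by exactly one record points at one person. The sketch below is illustrative only and is not the analysis used in the case above; the function names, field names, and records are hypothetical.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of attribute values."""
    return Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )


def singled_out(records, quasi_identifiers):
    """Return the records whose attribute combination is unique."""
    sizes = group_sizes(records, quasi_identifiers)
    return [
        record
        for record in records
        if sizes[tuple(record[q] for q in quasi_identifiers)] == 1
    ]


# Hypothetical records; field names and values are invented for illustration.
records = [
    {"id": 1, "major": "Economics", "home_state": "NY"},
    {"id": 2, "major": "Economics", "home_state": "NY"},
    {"id": 3, "major": "Economics", "home_state": "OH"},
    {"id": 4, "major": "Physics", "home_state": "NY"},
    {"id": 5, "major": "Physics", "home_state": "NY"},
]

quasi_identifiers = ["major", "home_state"]
k = min(group_sizes(records, quasi_identifiers).values())
exposed = singled_out(records, quasi_identifiers)
print(f"k = {k}; records singled out: {[r['id'] for r in exposed]}")
```

A smallest group size of 1 means at least one person is uniquely described by the published attributes, even though no name appears in the data.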
  • Conclusion
    • Social media network datasets contain PII
      • PII is not just profile data; it’s also unseen fields such as geo-location and data that can be derived from the messages posted
    • Datasets cannot be shared outside an organization without prior permission if required by the EULA
    • If the EULA allows for sharing of the data, it still must be properly anonymized
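As a rough illustration of what a first anonymization pass might look like, the sketch below replaces the user name with a keyed hash, drops the real name, and coarsens the coordinates. It uses only Python's standard `hmac` and `hashlib` modules; the record layout, the function name, and the key are hypothetical. This is pseudonymization plus coarsening, which is an assumption about a reasonable starting point and not a guarantee of anonymity: the message text can still contain or imply PII.

```python
import hashlib
import hmac


def pseudonymize(record, secret_key, geo_decimals=1):
    """Return a copy of the record with direct identifiers removed or coarsened."""
    # Keyed hash: the same user always maps to the same token, but the
    # token cannot be recomputed from the user name without the key.
    token = hmac.new(
        secret_key, record["screen_name"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    cleaned = {"user_token": token[:16], "text": record["text"]}
    coords = record.get("coordinates")
    if coords is not None:
        # Round coordinates to reduce location precision.
        cleaned["coordinates"] = [round(c, geo_decimals) for c in coords]
    # The real name is intentionally not copied into the output.
    return cleaned


# Hypothetical record and key, invented for illustration.
record = {
    "screen_name": "example_user",
    "real_name": "Example User",
    "coordinates": [43.21, -75.47],
    "text": "see you at the conference",
}
key = b"replace-with-a-secret-key"

cleaned = pseudonymize(record, key)
print(cleaned)
```

Keeping a stable token per user preserves the ability to study conversations and trends, at the cost of leaving the dataset open to the kind of re-identification described in the Harvard example.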
  • References
    [1] DHS, Office of Operations Coordination and Planning, “Publicly Available Social Media Monitoring and Situational Awareness Initiative,” June 22, 2010.
    [2] Vaidhyanathan, S., “Welcome to the surveillance society,” IEEE Spectrum, vol. 48, no. 6, pp. 48-51, June 2011. doi: 10.1109/MSPEC.2011.5779791
    [3] Brodkin, Jon, “Bin Laden death-detecting analytics service signs partnership with Twitter,” Ars Technica, Apr. 9, 2012.
    [4] Acquisti, Alessandro, and Gross, Ralph, “Predicting Social Security Numbers from public data,” Proceedings of the National Academy of Sciences, vol. 106, no. 27, July 7, 2009.
    [5] Burger, John, et al., “Discriminating Gender on Twitter,” MITRE Corp., Nov. 2011.
    [6] Smith, “No warrant needed, no privacy: Judge rules even deleted tweets can be used in court,” Network World, Apr. 24, 2012.
    [7] Parry, Marc, “Harvard Researchers Accused of Breaching Students’ Privacy,” The Chronicle of Higher Education, July 10, 2011.
  • Questions