Analyzing the Privacy of Smartphone Apps, for CMU Cylab Talk on April 2013

This is a talk I gave in April 2013 at Carnegie Mellon University's CyLab weekly seminar. It describes some of our team's latest work on combining crowdsourcing with static and dynamic analysis to understand the privacy and security behaviors of smartphone apps.

Upload Details

Uploaded as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

  • image sources: http://expatlingo.com/2012/10/04/phone-addicts-hong-kong-welcomes-you/ and http://news.cnet.com/8301-1035_3-57534132-94/worldwide-smartphone-user-base-hits-1-billion/
  • Start out with a statement that probably won’t be controversial, which is that smartphones are pervasive. About 40% of all mobile phones sold today are smartphones, and the number is rapidly growing. What’s also interesting are trends in how people use these smartphones.
    http://blog.sciencecreative.com/2011/03/16/the-authentic-online-marketer/
    http://www.generationalinsights.com/millennials-addicted-to-their-smartphones-some-suffer-nomophobia/
    In fact, Millennials don’t just sleep with their smartphones. 75% use them in bed before going to sleep and 90% check them again first thing in the morning. Half use them while eating and a third use them in the bathroom. A third check them every half hour. Another fifth check them every ten minutes. A quarter of them check them so frequently that they lose count.
    http://www.androidtapp.com/how-simple-is-your-smartphone-to-use-funny-videos/
    Pew Research Center: Around 83 percent of those 18- to 29-year-olds sleep with their cell phones within reach.
    http://persquaremile.com/category/suburbia/
  • Smartphones are an intimate part of our lives: location, call logs, SMS, pics, and more. They can capture human behavior at unprecedented fidelity and scale.
  • We know these relationships, but computers have an overly simplified model of our relationships, usually just “friend”. Can we do better?
  • Image adapted from Real Life Social Network, by Paul Adams
  • http://mashable.com/2012/10/31/pipsqueek-bluetooth-smartphone-kids/
  • http://www.google.com/intl/en/chrome/webstore/apps.html
  • Screenshot from AppBrain: http://www.appbrain.com/stats/libraries/ad
  • http://www.micklerandassociates.com/10-apps-everybody-should-have/
  • cbsnews.com/8301-505263_162-57560825/smartphone-snoops-how-your-phone-data-is-being-shared/
  • DARPA, Google, CMU CyLab
  • http://www.flickr.com/photos/robby_van_moor/478725670/
  • http://about-google-android.blogspot.com/2012/12/an-android-powered-car-infotainment.html

Analyzing the Privacy of Smartphone Apps, for CMU Cylab Talk on April 2013 Presentation Transcript

  • ©2009CarnegieMellonUniversity:1 Analyzing the Privacy of Smartphone Apps. Apr 22, 2013. Shah Amini, Jialiu Lin, Prateek Sachdeva, Jason Hong, Janne Lindqvist, Norman Sadeh, Joy Zhang. Computer Human Interaction: Mobility Privacy Security
  • ©2013CarnegieMellonUniversity:2 How to Manage Smartphone Privacy? • Lots of smart devices – 1B smartphones worldwide • Lots of apps – ~700k apps and 40B+ downloads for each of Android and iOS • Highly intimate • Lots of rich data • Lots of inferences
  • ©2013CarnegieMellonUniversity:3 Smartphones are Intimate Mobile phones and millennials (Pew 2012): • 75% use in bed before going to sleep • 83% sleep with their mobile phones • 90% check first thing in the morning • Half use them while eating • A third use them in the bathroom (!) • A fifth check them every ten minutes
  • ©2013CarnegieMellonUniversity:4 Smartphone Data is Rich • Who we know (contact list, social networking) • Who we call (call log) • Who we text (sms log, Kakao, social networking)
  • ©2013CarnegieMellonUniversity:5 Smartphone Data is Rich • Where we go (gps, foursquare) • Photos (some geotagged) • Sensors (accel, sound, light)
  • ©2013CarnegieMellonUniversity:6 Inferences from Data Example: Modeling Social Relationships • If you were in a jail in Mexico, which of the 500+ “friends” in your phone contact list would come and get you out?
  • ©2013CarnegieMellonUniversity:7 Inferences from Data Example: Modeling Social Relationships • Can we build a richer augmented social graph? – models tie strength, group, role
  • ©2013CarnegieMellonUniversity:8 Inferences from Data Example: Modeling Social Relationships
  • ©2013CarnegieMellonUniversity:9 Inferences from Data Example: Modeling Social Relationships
  • ©2013CarnegieMellonUniversity:10 Inferences from Data Example: Modeling Social Relationships
  • ©2013CarnegieMellonUniversity:11 Inferences from Data Example: Modeling Social Relationships • Friend or not – 92% accuracy – Using just GPS co-location data • Life facet {family, social, work} – 90% • Tie strength {low, med, high} – 75% – Using just contacts, call logs, SMS logs
    Cranshaw et al, Bridging the Gap Between Physical Location and Online Social Networks, Ubicomp 2010.
    Min et al, Mining Smartphone Data to Classify Life-Facets of Social Relationships, CSCW 2013.
  • ©2013CarnegieMellonUniversity:12 Inferences from Data Example: Sleep • Sensor data • Sleep data (self-reported ground truth)
  • ©2013CarnegieMellonUniversity:13 Smartphone Data for Depression • Social Relationships – Isolation – Lack of close family or friends • Physical Activities – Mobility – Consistency – Places you go to • Sleep Patterns – Excessive sleep – Too little sleep – Change over time • Cognitive Behaviors – Multitasking – Lots of phone use
  • ©2013CarnegieMellonUniversity:14 How to Manage Smartphone Privacy? • Lots of smart devices – 1B smartphones worldwide • Lots of apps – ~700k apps and 40B+ downloads for each of Android and iOS • High intimacy • Lots of rich data • Lots of inferences
  • ©2013CarnegieMellonUniversity:15 What are your apps really doing? • Shares your location, gender, unique phone ID, phone# with advertisers • Uploads your entire contact list to their server (including phone #s)
  • ©2013CarnegieMellonUniversity:16 Many Smartphone Apps Have “Unusual” Permissions
    App                      Permissions Used
    Tiny Flashlight + LED    Internet Access, phone#
    Backgrounds              Contact List
    Dictionary               Location
    Bible Quotes             Location
    • Advertising, malware, bootstrapping social networks, future permissions
  • ©2013CarnegieMellonUniversity:17 Android • What do these permissions mean? • Why does app need this permission? • When does it use these permissions?
  • ©2013CarnegieMellonUniversity:18 Two Threads of Work • Works in progress, feedback appreciated • CrowdScanning – Crowdsourcing approach to understand coarse-grain privacy perceptions of apps • Gort – Tool for analysts to understand fine-grain app behaviors
  • ©2013CarnegieMellonUniversity:19 CrowdScanning Core Ideas • Idea 1: find the gap between what people expect an app to do and what it actually does • Idea 2: use crowdsourcing to do this (crowdsource privacy) Lin et al, Expectation and Purpose: Understanding User’s Mental Models of Mobile App Privacy thru Crowdsourcing. Ubicomp 2012.
  • ©2013CarnegieMellonUniversity:20 Nissan Maxima Gear Shift
  • ©2013CarnegieMellonUniversity:21 Privacy as Expectations • Apply this same idea of mental models for privacy – Compare what people expect an app to do vs what an app actually does – Emphasize the biggest gaps, misconceptions that many people had App Behavior (What an app actually does) User Expectations (What people think the app does)
  • ©2013CarnegieMellonUniversity:22 Crowdsourcing Privacy • Few people read privacy policies – We want to install the app – Reading policies not part of main task – Complexity of these policies (the pain!!!) – Clear cost (time) for unclear benefit • Crowdsourcing can mitigate these problems
  • ©2013CarnegieMellonUniversity:23 10% users were surprised this app wrote contents to their SD card. 25% users were surprised this app sent their approximate location to dictionary.com for searching nearby words. 85% users were surprised this app sent their phone’s unique ID to mobile ads providers. 0% users were surprised this app could control their audio settings. See all 90% users were surprised this app sent their precise location to mobile ads providers. 95% users were surprised this app sent their approximate location to mobile ads providers. 95% users were surprised this app sent their phone’s unique ID to mobile ads providers. 0% users were surprised this app can control camera flashlight.
  • ©2013CarnegieMellonUniversity:24 Our Study on App Privacy • Showed crowd workers screenshots and description of app (from Google Play) – 56 of top 100 Android Apps • Showed permissions one at a time – Only those related to privacy • Expectation Condition – Why they think the app uses permission – How comfortable they were with it • Purpose Condition – We gave an explanation (based on our analysis) – How comfortable they were with it
  • ©2013CarnegieMellonUniversity:25 Our Study on App Privacy • Participants – Recruited from Mturk, US people only – Asked what version of Android OS they used – Between-subjects (one condition only) • Method – Only 56 of top 100 apps requested use of unique phone ID, contact list, or location • Led to a total of 134 app-resource pairs – 20 participants per pair per condition • 2*20*134 = 5360 tasks
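The task count on this slide follows from the between-subjects design: every app-resource pair is rated by a separate group of participants in each condition. A minimal sketch of that arithmetic, with a function name and parameter names that are illustrative and not from the talk:

```python
def total_tasks(conditions: int, participants_per_pair: int, app_resource_pairs: int) -> int:
    """Crowd tasks needed when each app-resource pair gets its own
    group of participants in every condition (between-subjects)."""
    return conditions * participants_per_pair * app_resource_pairs


# Values from the slide: 2 conditions, 20 participants per pair, 134 pairs.
print(total_tasks(2, 20, 134))  # 5360
```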
  • ©2013CarnegieMellonUniversity:26 Results for Location Data (N=20 per app, Expectations Condition)
    App                          Comfort Level (-2 to 2)
    Maps                          1.52
    GasBuddy                      1.47
    Weather Channel               1.45
    Foursquare                    0.95
    TuneIn Radio                  0.60
    Evernote                      0.15
    Angry Birds                  -0.70
    Brightest Flashlight Free    -1.15
    Toss It                      -1.2
  • ©2013CarnegieMellonUniversity:27 Most Unexpected Uses (N=20 per app, Expectations Condition) • Found strong correlation between expectations & comfort level (r=0.91)
    Apps using Contact List      Comfort Level (-2 to 2)
    Backgrounds HD Wallpaper     -1.35
    Pandora                      -0.70
    GO Launcher EX               -0.75
  • ©2013CarnegieMellonUniversity:28 Showing Purpose Lowers Concerns • All differences statistically significant • Big increases for dictionary, Shazam, Air Control Lite, and others (> 1.0)
    Resource            Comfort w/ Purpose    Comfort w/o Purpose
    Device ID           0.47 (σ=0.30)         -0.10 (σ=0.41)
    Contact List        0.66 (σ=0.22)          0.16 (σ=0.54)
    Network Location    0.90 (σ=0.53)          0.65 (σ=0.55)
    GPS Location        0.72 (σ=0.62)          0.35 (σ=0.73)
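The r reported on this slide is a correlation coefficient between how expected a data use was and how comfortable participants were with it. Assuming it is a Pearson correlation (the slide does not say which kind), it could be computed as in the sketch below. The two data lists are made up purely for illustration and are not the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-app values: fraction of participants who expected the
# data use, and mean comfort on the -2..2 scale.
expected_fraction = [0.95, 0.80, 0.55, 0.30, 0.10]
mean_comfort = [1.5, 1.0, 0.2, -0.6, -1.2]
r = pearson_r(expected_fraction, mean_comfort)
```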
  • ©2013CarnegieMellonUniversity:29 Scaling Up CrowdScanning • It took ~2 wks to crowdsource 56 apps • 700k+ apps for iOS & Android markets • Idea: Use static & dynamic analysis + clustering for privacy models of apps – Ex. “Games uses location” -1.3 – Ex. “Uses location for map” +0.5
  • ©2013CarnegieMellonUniversity:30 Scaling Up CrowdScanning Crawled Data Set • Crawled 171k apps from Google Play – App name – Category (Arcade, Finance, etc) – Number of downloads – Average user rating (1-5) – Rating distribution – Price – Content Rating – 13M user reviews
  • ©2013CarnegieMellonUniversity:31
  • ©2013CarnegieMellonUniversity:32
  • ©2013CarnegieMellonUniversity:33 Scaling Up CrowdScanning Static Analysis of Apps • Starting assumptions: – Most apps use third-party libraries – When sensitive data is used, it is b/c of libraries • Ex. Location sent to ad server via library • Ex. Location sent to Google for maps • Understanding what libraries an app uses and how they are used can offer us richer semantics and explanations
  • ©2013CarnegieMellonUniversity:34 Scaling Up CrowdScanning Libraries are Major Point of Leverage
  • ©2013CarnegieMellonUniversity:35 Scaling Up CrowdScanning Static Analysis of Apps • Features extracted: – Libraries used – Network conn (in library or in main code) – Permissions (in library or main code) • 124k apps processed – Uses PyDev (Python for Eclipse) and AndroGuard (reverse eng apps) – 5 Amazon EC2 instances, 30 secs / app • Will crowdsource core set of 400 apps and build models to predict privacy
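The features listed on this slide (libraries used, and whether a sensitive resource is touched in library code or main code) can be illustrated with a small sketch. It assumes class names and sensitive call sites have already been pulled out of the app by a reverse-engineering tool; it does not call AndroGuard itself. The prefix table, function names, and example app are all hypothetical.

```python
from collections import defaultdict

# Illustrative prefix table (an assumption, not the talk's actual list).
KNOWN_LIBRARY_PREFIXES = {
    "com.google.ads": "ads",
    "com.flurry": "analytics/ads",
    "com.google.android.maps": "maps",
}


def library_for(class_name):
    """Return the longest known library prefix matching class_name, or None."""
    best = None
    for prefix in KNOWN_LIBRARY_PREFIXES:
        if class_name == prefix or class_name.startswith(prefix + "."):
            if best is None or len(prefix) > len(best):
                best = prefix
    return best


def extract_features(class_names, sensitive_calls):
    """class_names: fully qualified class names found in the app.
    sensitive_calls: (calling_class, resource) pairs, e.g. ("a.b.C", "location").
    """
    libraries = set()
    for name in class_names:
        lib = library_for(name)
        if lib is not None:
            libraries.add(lib)
    origins = defaultdict(set)
    for calling_class, resource in sensitive_calls:
        origins[resource].add(library_for(calling_class) or "main code")
    return {
        "libraries": sorted(libraries),
        "resource_origins": {res: sorted(src) for res, src in origins.items()},
    }


# Hypothetical app: a dictionary app bundling two third-party libraries.
classes = [
    "com.example.dict.MainActivity",
    "com.google.ads.AdView",
    "com.flurry.android.FlurryAgent",
]
calls = [
    ("com.google.ads.AdView", "location"),
    ("com.example.dict.MainActivity", "location"),
]
features = extract_features(classes, calls)
```

Knowing that location is read inside an ad library, rather than only in the app's own code, is the kind of distinction that lets a result be explained as "location for ads" instead of just "uses location".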
  • ©2013CarnegieMellonUniversity:36 Scaling Up CrowdScanning Tangent: Analyzing App Comments • Linear regression of most common words to 5-star ratings – Out of 1M comments, 8% of dataset – Only 0.09% comments related to privacy
  • ©2013CarnegieMellonUniversity:37 Two Threads of Work • CrowdScanning – Crowdsourcing approach to understand coarse-grain privacy perceptions of apps • Gort – Tool for analysts to understand fine-grain app behaviors
  • ©2013CarnegieMellonUniversity:38 Gort App Analysis Tool • Goal of Gort is to help analysts understand and vet behaviors of apps – Journalists – Privacy advocates – Three letter agencies
  • ©2013CarnegieMellonUniversity:39 Example Comparison • CrowdScanning: Yelp uses location • Gort: When (what screens) and why?
  • ©2013CarnegieMellonUniversity:40 Gort v1 • Control Flow Graph • Current Screen • Servers contacted • HTTP details • HTTP requests • Market description • Permissions used • Personal data sent
  • ©2013CarnegieMellonUniversity:41 Gort v2 Envisioned Workflow • Start with a pool of apps • Use heuristics to flag unusual behaviors to direct analyst’s attention – Static and dynamic heuristics • See overview of apps, view individual apps, check odd behaviors and context (screens)
  • ©2013CarnegieMellonUniversity:42 Gort v2 Heuristics for Apps • Interviewed 13 experts – Asked what characteristics and behaviors they would check to vet an app – Got ~100 heuristics, still organizing them • Network – Sends password w/o SSL – Connects to fixed IP address • Permissions – Contact List – Location but not for maps or ads – Uses mic • Phone / SMS – SMS to fixed / premium num – Forwards SMS to server
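Two of the network heuristics on this slide are easy to express as checks over captured HTTP requests. The sketch below is one possible encoding, not Gort's implementation: the function names, the set of password field names, and the form-encoded request format are assumptions made for illustration.

```python
import ipaddress
from urllib.parse import parse_qs, urlsplit

# Hypothetical parameter names that suggest a password is being sent.
PASSWORD_FIELDS = {"password", "passwd", "pwd"}


def is_fixed_ip(host):
    """True if host is a literal IPv4/IPv6 address rather than a DNS name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def flag_request(url, body=""):
    """Return the heuristics triggered by one request.
    body is assumed to be form-encoded (key=value&key=value)."""
    parts = urlsplit(url)
    flags = []
    if is_fixed_ip(parts.hostname or ""):
        flags.append("connects to fixed IP address")
    params = dict(parse_qs(parts.query))
    params.update(parse_qs(body))
    sends_password = any(key.lower() in PASSWORD_FIELDS for key in params)
    if sends_password and parts.scheme != "https":
        flags.append("sends password w/o SSL")
    return flags


# 192.0.2.10 is a reserved documentation address, used here as a stand-in.
print(flag_request("http://192.0.2.10/login", "user=a&password=b"))
```

Checks like these would only direct an analyst's attention; a flagged request still needs a human to look at the surrounding context.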
  • ©2013CarnegieMellonUniversity:43 Traversing Screens in Apps • Have to traverse app for some heuristics – Ex. when exactly does the app use location? – Also want to capture screenshots
  • ©2013CarnegieMellonUniversity:44 Traversing Screens in Apps • General case is fairly easy – Breadth-first-search from home screen – Uses TEMA to get widgets on screen – Use Android’s MonkeyRunner to simulate input and get screenshots • But lots of exception cases…
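The general case described on this slide is a breadth-first search over the app's screens. A minimal sketch follows, with the widget discovery and input simulation (TEMA and MonkeyRunner in the talk) abstracted behind a hypothetical `get_transitions` callback so that the example runs on a toy screen graph:

```python
from collections import deque


def traverse_screens(home, get_transitions):
    """Breadth-first traversal from the home screen.

    get_transitions(screen) yields (widget, next_screen) pairs.
    Returns a dict mapping each reachable screen to the shortest
    list of widgets to activate to get there from home.
    """
    paths = {home: []}
    queue = deque([home])
    while queue:
        screen = queue.popleft()
        for widget, next_screen in get_transitions(screen):
            if next_screen not in paths:  # visit each screen only once
                paths[next_screen] = paths[screen] + [widget]
                queue.append(next_screen)
    return paths


# Toy screen graph standing in for a real app (hypothetical).
APP = {
    "home": [("Search", "search"), ("Settings", "settings")],
    "search": [("Result", "detail"), ("Back", "home")],
    "settings": [("Back", "home")],
    "detail": [("Map", "map")],
    "map": [],
}
paths = traverse_screens("home", lambda screen: APP.get(screen, []))
```

Recording the widget path to each screen makes it possible to replay the route later, for example to capture a screenshot at the point where location is used. The sketch assumes screens can be identified and compared reliably, which is exactly what the exception cases on the following slides break.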
  • ©2013CarnegieMellonUniversity:45 Some Hard Cases for Traversal • Dialogs w/ side effects • Text Inputs • Logins
  • ©2013CarnegieMellonUniversity:46 Some Hard Cases for Traversal • Changes to system env • App Updates • Randomized dialogs
  • ©2013CarnegieMellonUniversity:47 Scaling Up CrowdScanning Making the Results Public • What will we do with all these results? • Basic idea: deploy a web site – Let public see results of our scans – Show privacy scores (and explanations) – Tell app developers how to fix their apps • Awareness, Knowledge, Motivation • Still early stages here, should have first iteration of site out end of May
  • ©2013CarnegieMellonUniversity:48 Public Feedback to Date • Slate • Yahoo News • MSNBC • Pittsburgh Tribune Review
  • ©2013CarnegieMellonUniversity:49 Thanks! More info at cmuchimps.org or email jasonh@cs.cmu.edu Special thanks to: • Army Research Office • National Science Foundation • Alfred P. Sloan Foundation • Google • CMU Cylab Join our community for researchers at: www.reddit.com/r/pervasivecomputing
  • ©2013CarnegieMellonUniversity:50
  • ©2013CarnegieMellonUniversity:51 The Opportunity • We are creating a worldwide sensor network with these smartphones • We can now capture and analyze human behavior at unprecedented fidelity and scale
  • ©2013CarnegieMellonUniversity:52 Summary • Smartphones offer big opportunity to understand human behavior at unprecedented fidelity and scale • Augmented Social Graph • Urban Analytics • CrowdScanning
  • ©2013CarnegieMellonUniversity:53 Reach of Apps Growing • Finances • Automobiles • Homes
  • ©2013CarnegieMellonUniversity:54 Reach of Apps Growing