Fostering an Ecosystem for
Smartphone Privacy
Jason Hong
jasonh@cs.cmu.edu
New Kinds of Guidelines and Regulations
US Federal Trade Commission guidelines
California Attorney General recommendations
European Union General Data Protection Regulation (GDPR)
My Research Focused on Smartphones
and Privacy
• Over 1B smartphones
sold every year
– Perhaps most widely
deployed platform
• Well over 100B apps
downloaded on each of
Android and iOS
• Incredibly intimate devices
Smartphones are Intimate
Fun Facts about Millennials
• 83% sleep with phones
• 90% check first thing in morning
• 1 in 3 use in bathroom
Smartphone Data is Intimate
Who we know
(contacts + call log)
Sensors
(accel, sound, light)
Where we go
(gps, photos)
The Opportunity and the Risk
• There are all these
amazing things we
could do
– Healthcare
– Urban analytics
– Sustainability
• But only if we can
legitimately address
privacy concerns
– Spam, misuse, breaches
http://www.flickr.com/photos/robby_van_moor/478725670/
My Main Point: We Need to Foster a
Better Ecosystem of Privacy
• Today, too much burden is on end-users
– Should I install this app?
– What are all the settings I need to know?
– What are all the terms and conditions?
– Trackers, cookies, VPNs, anonymizers, etc
• We need a better ecosystem for privacy
– Push burden from end-users onto rest of ecosystem
– Analogy: Spam email
– Other players: OS, app stores, developers, services,
crowds, policy makers, journalists
Today’s Talk
• Why is privacy hard?
• Our research in smartphone privacy
– PrivacyGrade.org for grading app privacy
– Studies on what developers know about privacy
– Helping developers
– Helping app stores
• What you can do to help with privacy
Why is Privacy Hard?
#1 Privacy is a broad and fuzzy term
• Privacy is a broad umbrella term that captures
concerns about our relationships with others
– The right to be left alone
– Control and feedback over one’s data
– Anonymity (popular among researchers)
– Presentation of self (impression management)
– Right to be forgotten
– Contextual integrity (take social norms into account)
• Each leads to different way of handling privacy
– Right to be left alone -> do not call list, blocking
– Right to be forgotten -> delete from search engines
Today, Will Focus on One Form of Privacy
Data Privacy
• Data privacy is primarily about how orgs collect,
use, and protect sensitive data
– Focuses on Personally Identifiable Information (PII)
• Ex. Name, street address, unique IDs, pictures
– Rules about data use, privacy notices
• Led to the Fair Information Practices
– Notice / Awareness
– Choice / Consent
– Access / Participation
– Integrity / Security
– Enforcement / Redress
Some Comments on Data Privacy
• Data privacy tends to be procedurally-oriented
– Did you follow this set of rules?
– Did you check off all of the boxes?
– Somewhat hard to measure too (Better? Worse?)
– This is in contrast to outcome-oriented
• Many laws embody the Fair Information Practices
– GDPR, HIPAA, Financial Privacy Act, COPPA, FERPA
– But, enforcement is a weakness here
• If an org violates, can be hard to detect
• In practice, limited resources for enforcement
Why is Privacy Hard?
#2 No Common Set of Best Practices for Privacy
• Security has lots of best practices + tools for devs
– Use TLS/SSL
– Devices should not have common default passwords
– Use firewalls to block unauthorized traffic
• For privacy, not so much
– Choice / Consent: Best way of offering choice?
– Access / Participation: Best way of offering access?
– Notice / Awareness: Typically privacy policies, useful?
• New York Times privacy policy
• Still state of the art for privacy notices
• But no one reads these
Why is Privacy Hard?
#3 Technological Capabilities Rapidly Growing
• Data gathering easier and more pervasive
– Everything on the web (Google + FB)
– Sensors (smartphones, IoT)
• Data storage and querying bigger and faster
• Inferences more powerful
– Some examples shortly
• Data sharing more widespread
– Social media
– Lots of companies collecting and sharing with each
other, hard to explain to end-users (next slide)
• 2010 diagram of ad tech ecosystem
• Most of these are collecting and using
data about you
Built a logistic regression
to predict sexuality based
on what your friends on
Facebook disclosed, even
if you didn’t disclose
Inferences about people more powerful
“[An analyst at Target] was able to identify about
25 products that… allowed him to assign each
shopper a ‘pregnancy prediction’ score. [H]e
could also estimate her due date to within a small
window, so Target could send coupons timed to
very specific stages of her pregnancy.” (NYTimes)
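The friend-based inference above can be illustrated with a toy sketch: a one-feature logistic regression that scores an undisclosed attribute from the fraction of a user's friends who disclosed it. The feature choice and weights here are invented for illustration, not taken from the actual study.

```java
// Toy sketch of inference from friends' data: a one-feature logistic
// regression predicting an undisclosed attribute from the fraction of a
// user's friends who disclosed it. Parameters are made up for illustration.
public class FriendInference {
    // Hypothetical trained parameters.
    static final double WEIGHT = 6.0;
    static final double BIAS = -3.0;

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // fractionOfFriendsDisclosing is in [0, 1].
    static double predict(double fractionOfFriendsDisclosing) {
        return sigmoid(WEIGHT * fractionOfFriendsDisclosing + BIAS);
    }

    public static void main(String[] args) {
        // A user who disclosed nothing can still be scored via their friends.
        System.out.printf("10%% of friends disclose -> p = %.2f%n", predict(0.1));
        System.out.printf("80%% of friends disclose -> p = %.2f%n", predict(0.8));
    }
}
```

The point of the sketch: the person being scored never provided the data; their friends did.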
Why is Privacy Hard?
#4 Multiple Use of the Same Data
• The same data can help as well as harm (or
creep people out) depending on use and re-use
Recap of Why Privacy is Hard
• Privacy is a broad and fuzzy term
• No common set of best practices
• Technological capabilities rapidly growing
• Same data can be used for good and for bad
• Note that these are just a few reasons,
there are many, many more
– But enough so that we have common ground
Some Smartphone Apps Use Your Data in
Unexpected Ways
Shared your location,
gender, unique phone ID,
phone# with advertisers
Uploaded your entire
contact list to their server
(including phone #s)
More Unexpected Uses of Your Data
(three example apps, shown as screenshots, each accessing:)
• Location data + unique device ID
• Location data + network access + unique device ID
• Location data + microphone + unique device ID
PrivacyGrade.org
• Improve transparency
• Assign privacy grades to all
1M+ Android apps
• Does not help devs directly
Expectations vs Reality
Privacy as Expectations
Use crowdsourcing to compare what people expect
an app to do vs what an app actually does
App Behavior
(What an app
actually does)
User Expectations
(What people think
the app does)
How PrivacyGrade Works
• We crowdsourced people’s expectations of a
core set of 837 apps
– Ex. “How comfortable are you with
Drag Racing using your location for ads?”
• We inferred purposes by examining
which third-party libraries the app used
• Created a model to predict people’s likely
privacy concerns and applied to 1M Android apps
How PrivacyGrade Works
• Long tail distribution of libraries
• We focused on the top 400 libraries, which cover
the vast majority of cases
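The long-tail observation above can be sketched as a coverage computation: sort libraries by how often they access sensitive data and count how many are needed to reach a target share of all accesses. The counts below are made up.

```java
import java.util.Arrays;

// Sketch of the long-tail analysis: given per-library counts of sensitive
// data accesses, how many of the most popular libraries are needed to
// cover a target share of all accesses? Counts below are invented.
public class LibraryCoverage {
    static int librariesNeeded(long[] accessCounts, double targetShare) {
        long[] sorted = accessCounts.clone();
        Arrays.sort(sorted);                 // ascending
        long total = 0;
        for (long c : sorted) total += c;
        long covered = 0;
        int used = 0;
        for (int i = sorted.length - 1; i >= 0; i--) {  // most popular first
            covered += sorted[i];
            used++;
            if ((double) covered / total >= targetShare) break;
        }
        return used;
    }

    public static void main(String[] args) {
        long[] counts = {5000, 3000, 1000, 50, 30, 10, 5, 3, 1, 1};
        System.out.println(librariesNeeded(counts, 0.9) + " libraries cover 90%");
    }
}
```

With a heavy-tailed distribution like this, a handful of libraries covers most accesses, which is why analyzing the top few hundred was enough.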
Impact of PrivacyGrade
• Popular Press
– NYTimes, CNN, BBC, CBS, more
• Government
– Earlier work helped lead to FTC fines
• Google
– Google has something like PrivacyGrade internally
• Developers
Market Failure for Privacy
• Let’s say you want to purchase a web cam
– Go into store, can compare price, color, features
– But can’t easily compare security (hidden feature)
– So, security does not influence customer purchases
– So, devs not incentivized to improve
• Same is true for privacy
– This is where things like PrivacyGrade can help
– Improve transparency, address market failures
– More broadly, what other ways to incentivize?
Study 1
What Do Developers Know about Privacy?
• A lot of privacy research is about end-users
– Very little about developers
• Interviewed 13 app developers
• Surveyed 228 app developers
– Got a good mix of experiences and size of orgs
• What knowledge? What tools used? Incentives?
• Are there potential points of leverage?
Balebako et al, The Privacy and Security Behaviors
of Smartphone App Developers. USEC 2014.
Study 1 Summary of Findings
Third-party Libraries Problematic
• Use ads and analytics to monetize
• Hard to understand their behaviors
– A few didn’t know they were using libraries
(based on inconsistent answers)
– Some didn’t know the libraries collected data
– “If either Facebook or Flurry had a privacy policy that
was short and concise and condensed into real
English rather than legalese, we definitely would
have read it.”
– In a later study we did on apps, we found 40% of apps
used sensitive data only because of libraries [Chitkara 2017]
Study 1 Summary of Findings
Devs Don’t Know What to Do
• Low awareness of existing privacy guidelines
– Fair Information Practices, FTC guidelines, Google
– Often just ask others around them
• Low perceived value of privacy policies
– Mostly protection from lawsuits
– “I haven’t even read [our privacy policy]. I mean, it’s
just legal stuff that’s required, so I just put in there.”
Study 2
How do developers address privacy when coding?
• Interviewed 9 Android developers
• Semi-structured interview probing about their
three most recent apps
– Their understanding of privacy
– Any privacy training they received
– What data collected in app and how used
• Libraries used?
• Was data sent to cloud server?
• How and where data stored?
– We also checked against their app if on app store
Study 2 Findings
Inaccurate Understanding of Their Own Apps
• Some data practices they claimed didn’t match
app behaviors
• Lacked knowledge of library behaviors
• Fast iterations led to changes in data collection
and data use
• Team dynamics
– Division of labor, don’t know what other devs doing
– Turnover, use of sensitive data not documented
Study 2 Findings
Lack of Knowledge of Alternatives
• Many alternatives exist, but often went with first
solution found (e.g. StackOverflow)
• Example: Many apps use some kind of identifier,
and different identifiers have tradeoffs
– Hardware identifiers (riskiest since persistent)
– Application identifier (email, hashcode)
– Advertising identifier
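The identifier tradeoffs above point to the lowest-risk option where feasible: an app-generated identifier rather than a persistent hardware ID. A minimal sketch (in a real app this value would be persisted to app-private storage, which is omitted here):

```java
import java.util.UUID;

// Sketch of an app-generated identifier, the lowest-risk alternative on
// the slide: unlike a hardware ID, it identifies only this install of
// this app, and the user can reset it by clearing app data.
public class AppScopedId {
    private static String cachedId;  // real apps: persist to app-private storage

    static synchronized String getOrCreateId() {
        if (cachedId == null) {
            cachedId = UUID.randomUUID().toString();
        }
        return cachedId;
    }

    public static void main(String[] args) {
        System.out.println("install id: " + getOrCreateId());
    }
}
```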
Study 2 Findings
Lack of Motivation to Address Privacy Issues
• Might ignore privacy issues if not required
– Ex. Get location permission for one reason (maps),
but also use for other reasons (ads)
– Ex. Get name and email address, only need email
– Ex. Get device ID because no permission needed
• Android permissions and Play Store requirements
useful in forcing devs to improve
– In Android, have to declare use of most sensitive data
– Google Play has requirements too (ex. privacy policy)
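The Android declaration requirement mentioned above lives in the app's manifest; a minimal sketch, using two real permission names:

```xml
<!-- AndroidManifest.xml: sensitive data use must be declared up front -->
<manifest xmlns:android="http://schemas.android.com/apk/res/android">
    <!-- The app cannot touch location or contacts without these lines -->
    <uses-permission android:name="android.permission.ACCESS_FINE_LOCATION" />
    <uses-permission android:name="android.permission.READ_CONTACTS" />
    <application android:label="Example" />
</manifest>
```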
How to Get People to Change Behaviors?
Security Sensitivity Stack
• Awareness – Does person know of existing threat?
  Can person identify attack / problem?
• Knowledge – Does person know tools, behaviors,
  strategies to protect? Can person use tools,
  behaviors, strategies?
• Motivation – Does person care?
Security Sensitivity Stack Adapted for
Developers and Privacy
• Awareness – Are devs aware of privacy problem?
  Ex. Identifier tradeoffs, library behavior
• Knowledge – Do devs know how to address?
  Ex. Might not know right API call
• Motivation – Do devs care?
  Ex. Sometimes ignore issues if not required
Privacy-Enhanced Android
• A large DARPA project to improve privacy
• Key idea: have devs declare in their apps the
purpose for which sensitive data is used
– Devs select from a small set of defined purposes
• Today: “Uses location”
• Tomorrow: “Uses location for advertising”
– Use these purposes to help developers
• Managing data better, generate privacy policies, etc
– … to check app behaviors throughout ecosystem
– … for new kinds of GUIs explaining app behaviors
Helping Developers
PrivacyStreams Programming Model
• Observations
– Most apps don’t need raw data (GPS vs City location)
– Many ancillary issues (threads, format, different APIs)
• PrivacyStreams works like Unix pipes on streams
– Easier for developers (threading, uniform API + format)
– Devs never see raw data, only final outputs
– Also easier to analyze, since one line of code
• “This app uses your microphone only to get loudness”
UQI.getData(Audio.recordPeriodic(DURATION, INTERVAL), // sample mic periodically
        Purpose.HEALTH("monitor sleep"))              // purpose declared in code
    .setField("loudness", calcLoudness(Audio.AUDIO_DATA)) // derive loudness only
    .forEach("loudness", callback);                   // app gets loudness, never raw audio
Helping Developers
Coconut IDE Plugin
• Developers add some Java annotations for each
use of sensitive data
• Can offer alternatives
• Can aggregate all data use in one place
• (Future) Can auto-generate privacy policies
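Coconut's actual annotation names aren't shown on the slide, so the sketch below uses a hypothetical @SensitiveData annotation to illustrate the idea: mark each use of sensitive data with what is accessed and why, so tooling can aggregate every data use in one place.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical sketch of the Coconut idea: annotate each use of sensitive
// data so an IDE plugin (or this reflection code) can list all uses.
public class AnnotatedDataUse {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface SensitiveData {
        String dataType();
        String purpose();
    }

    @SensitiveData(dataType = "location", purpose = "store-finder map")
    static void showNearbyStores() { /* ... */ }

    @SensitiveData(dataType = "contacts", purpose = "friend suggestions")
    static void suggestFriends() { /* ... */ }

    public static void main(String[] args) {
        // Aggregate all declared data uses: raw material for a privacy policy.
        for (Method m : AnnotatedDataUse.class.getDeclaredMethods()) {
            SensitiveData d = m.getAnnotation(SensitiveData.class);
            if (d != null) {
                System.out.println(m.getName() + ": " + d.dataType()
                        + " for " + d.purpose());
            }
        }
    }
}
```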
Helping App Stores
• Ways of checking the behavior of apps
– Ex. When devs upload to app store
• Decompile the app and examine the text
– If app uses location data and we see strings in the
app like exif, photo, or tag, it is probably geotagging
• Check network data of apps
– Similar to above, except for network traffic
• Add safety checks to apps
– If app has well-defined policy, can add extra checks
to app to make sure it does the right thing
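The geotagging check described above can be sketched as a keyword scan over an app's decompiled strings; the keywords come from the slide, while the matching logic is a guess at one plausible implementation.

```java
import java.util.List;
import java.util.Locale;

// Sketch of the string-based check: if an app requests location and its
// decompiled strings mention photo/EXIF terms, flag likely geotagging.
public class GeotagHeuristic {
    static final String[] KEYWORDS = {"exif", "photo", "tag"};

    static boolean likelyGeotagging(boolean usesLocation, List<String> decompiledStrings) {
        if (!usesLocation) return false;
        for (String s : decompiledStrings) {
            String lower = s.toLowerCase(Locale.ROOT);
            for (String k : KEYWORDS) {
                if (lower.contains(k)) return true;  // location + photo terms
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(likelyGeotagging(true, List.of("ExifInterface", "savePhoto")));
        System.out.println(likelyGeotagging(true, List.of("highScore", "levelUp")));
    }
}
```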
Addressing Market Failure (Work in Progress)
Who Knows What About Us and Why
ProtectMyPrivacy (PMP) for Making
Decisions
• For jailbroken iOS and Android
– Intercept calls to sensitive data
– Over 200k people using iOS PMP
– Over 6M decisions (+ stack traces)
– 20 data types protected
• Recommender system too
– Have recs for 97% of top 10k apps
• User study with 1321 people
– 1 year with old model (by app)
– 1 month with new model (by library)
Long Tail of Third-party Libraries
Chitkara, S. et al. Why does this app need my Location? Context aware
Privacy Management on Android. In IMWUT 1(3). 2017.
Most Popular 30 Libraries Account for
Over Half of all Sensitive Data Access
About 40% Apps Use Sensitive Data Only
because of Third-party Libraries
Allowing or Denying Access by Library
(vs by App) Reduces #Decisions Made
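The reduction above can be sketched with back-of-envelope arithmetic: per-app prompts scale with the number of installed apps, while per-library prompts scale with the much smaller number of distinct libraries those apps share. The numbers below are made up.

```java
// Back-of-envelope sketch of why per-library decisions scale better than
// per-app decisions: decision count is bounded by apps x data types in
// one model, but by distinct libraries x data types in the other.
public class DecisionCount {
    static int byApp(int numApps, int dataTypesPerApp) {
        return numApps * dataTypesPerApp;
    }

    static int byLibrary(int distinctLibraries, int dataTypesPerLibrary) {
        return distinctLibraries * dataTypesPerLibrary;
    }

    public static void main(String[] args) {
        // 50 installed apps sharing the same handful of ad/analytics libraries:
        System.out.println("by app:     " + byApp(50, 3));      // 150 prompts
        System.out.println("by library: " + byLibrary(10, 3));  // 30 prompts
    }
}
```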
How You Can Help with Privacy
Some Opportunities
• Imagine a gigantic blob of privacy work
– This is amount of work needed for “good” privacy
– Right now, most of this blob is managed by end-users
– What are useful ways of slicing up this blob so that
other parts of ecosystem can manage better?
• Better decision making by crowds or by experts
– How good are decisions? Ways of making better ones?
– How to easily share these decisions?
How You Can Help with Privacy
Some Opportunities
• Economics of privacy
– GDPR; other ways of addressing market failures?
• Ex. Consumer Reports really interested in this area
– Third-party services and libraries a major problem
• Incentives for privacy
– Improving awareness, knowledge, motivation for devs
– User attention for privacy
• Special cases of privacy law
– Privacy for children, healthcare, finances
How can we create
a connected world we
would all want to live in?
Thanks!
More info at cmuchimps.org
or email jasonh@cs.cmu.edu
Special thanks to:
• DARPA Brandeis
• Google
• Yuvraj Agarwal
• Shah Amini
• Rebecca Balebako
• Mike Czapik
• Matt Fredrikson
• Shawn Hanna
• Haojian Jin
• Tianshi Li
• Yuanchun Li
• Jialiu Lin
• Song Luan
• Swarup Sahoo
• Mike Villena
• Jason Wiese
• Alex Yu
• And many more…
• CMU Cylab
• NQ Mobile

Editor's Notes

  • #3 Every week, there are headline news articles like these, capturing people’s growing concerns about technology and privacy.
  • #4 There are also a growing number of guidelines and regulations about how these technologies should be designed and be operated. So even if you don’t personally believe privacy is an issue, it’s still something that has to be addressed in the design and operation of systems we build. https://www.ftc.gov/sites/default/files/documents/reports/mobile-privacy-disclosures-building-trust-through-transparency-federal-trade-commission-staff-report/130201mobileprivacyreport.pdf https://oag.ca.gov/sites/all/files/agweb/pdfs/privacy/privacy_on_the_go.pdf
  • #5 Will just focus on smartphones for now, since they are the most pervasive devices we have today Representative of many of the problems and opportunities we will be grappling with in the future Smartphones are everywhere http://marketingland.com/report-us-smartphone-penetration-now-75-percent-117746 http://www.pewinternet.org/fact-sheets/mobile-technology-fact-sheet/ http://www.androidauthority.com/google-play-store-vs-the-apple-app-store-601836/
  • #6 These devices are also incredibly intimate, perhaps the most intimate computing devices we’ve ever created. From Pew Internet and Cisco 2012 study. Main stats on this page are from: http://www.cisco.com/c/en/us/solutions/enterprise/connected-world-technology-report/index.html#~2012 Additional stats about mobile phones: http://www.pewinternet.org/fact-sheets/mobile-technology-fact-sheet/ What’s also interesting are trends in how people use these smartphones: http://blog.sciencecreative.com/2011/03/16/the-authentic-online-marketer/ http://www.generationalinsights.com/millennials-addicted-to-their-smartphones-some-suffer-nomophobia/ In fact, Millennials don’t just sleep with their smartphones. 75% use them in bed before going to sleep and 90% check them again first thing in the morning. Half use them while eating and a third use them in the bathroom. A third check them every half hour. Another fifth check them every ten minutes. A quarter check them so frequently that they lose count. http://www.androidtapp.com/how-simple-is-your-smartphone-to-use-funny-videos/ Pew Research Center: around 83 percent of 18- to 29-year-olds sleep with their cell phones within reach. http://persquaremile.com/category/suburbia/
  • #7 From Cisco report
  • #8 Also from Cisco report
• #9 But it’s not just the devices that are intimate; the data is also intimate. Location, call logs, SMS, photos, and more.
  • #10 A grand challenge for computer science http://www.flickr.com/photos/robby_van_moor/478725670/
  • #13 Data privacy and personal privacy
• #14 In contrast to personal privacy, which is mostly about what you do to manage your persona. There are many formulations of the Fair Information Practices (FIPs); the ones shown here are from the FTC.
  • #15 In contrast to personal privacy, which is mostly about what you do to manage your persona
  • #16 Hash user passwords
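The “hash user passwords” advice from this slide can be sketched with Python’s standard library. This is a minimal illustration using salted PBKDF2; the iteration count and salt size are illustrative assumptions, not a prescribed standard:

```python
# Minimal sketch of salted password hashing with PBKDF2 (Python stdlib).
# Iteration count and salt size are illustrative assumptions.
import hashlib
import hmac
import os

ITERATIONS = 600_000  # work factor; tune for your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)  # random per-user salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(candidate, digest)
```

The point of the slide holds either way: store only the salt and digest, never the plaintext password.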
• #17 Written at a grade 12.5 reading level; about 10 minutes to read. Based on Lorrie Cranor and Aleecia McDonald’s work, it would take about 25 full days to read all the privacy policies of all the web sites you visit. But this assumes people read them at all. It is rational behavior not to read privacy policies: we want to use the service, reading them is painful, and the cost is clear while the benefit is unclear.
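The back-of-envelope arithmetic behind the “25 full days” figure can be reproduced. The constants below are assumptions roughly in line with McDonald and Cranor’s estimates, not exact values from the study:

```python
# Back-of-envelope estimate of annual privacy-policy reading time.
# Both constants are assumptions, roughly following McDonald & Cranor.
SITES_PER_YEAR = 1462      # unique web sites visited per year (assumption)
MINUTES_PER_POLICY = 10    # average reading time per policy (assumption)

total_minutes = SITES_PER_YEAR * MINUTES_PER_POLICY
total_hours = total_minutes / 60
workdays = total_hours / 10  # "full days" of roughly 10 waking-reading hours

print(f"{total_hours:.0f} hours, roughly {workdays:.0f} full days per year")
```

With these assumptions the total comes out to roughly 244 hours a year, which is on the order of the 25 full days quoted on the slide.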
  • #19 https://adexchanger.com/venture-capital/luma-partners-ad-tech-ecosystem-map-the-december-2010-update/ 2010 diagram
  • #20 http://firstmonday.org/article/view/2611/2302
• #21 http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html “As Pole’s computers crawled through the data, he was able to identify about 25 products that, when analyzed together, allowed him to assign each shopper a ‘pregnancy prediction’ score.” Later, the article describes how one father accidentally discovered his daughter was pregnant because of these ads.
  • #22 But, Many Smartphone Apps Access this Sensitive Data in Surprising Ways
  • #25 Moto Racing / https://play.google.com/store/apps/details?id=com.motogames.supermoto
• #31 On the left is the Nissan Maxima gear shift. It turns out my brother drove in 3rd gear for over a year before I pointed out to him that 3 and D are separate positions. The older Nissan Maxima gear shift on the right makes it hard to make this mistake.
• #33 Lin et al., Modeling Users’ Mobile App Privacy Preferences: Restoring Usability in a Sea of Permission Settings. SOUPS 2014. INTERNET, READ_PHONE_STATE, ACCESS_COARSE_LOCATION, ACCESS_FINE_LOCATION, CAMERA, GET_ACCOUNTS, SEND_SMS, READ_SMS, RECORD_AUDIO, BLUETOOTH, and READ_CONTACTS
• #35 INTERNET, READ_PHONE_STATE, ACCESS_COARSE_LOCATION, ACCESS_FINE_LOCATION, CAMERA, GET_ACCOUNTS, SEND_SMS, READ_SMS, RECORD_AUDIO, BLUETOOTH, and READ_CONTACTS
  • #38 http://www.cmuchimps.org/publications/the_privacy_and_security_behaviors_of_smartphone_app_developers_2014/pub_download
  • #39 Separate study is Chitkara, S., N. Gothoskar, S. Harish, J.I. Hong, Y. Agarwal. Does this App Really Need My Location? Context aware Privacy Management on Android. PACM on Interactive, Mobile, Wearable, and Ubiquitous Technologies (IMWUT) 1(3). 2017. http://www.cmuchimps.org/publications/does_this_app_really_need_my_location_context-aware_privacy_management_for_smartphones_2017
  • #53 Agarwal, Y., and M. Hall. ProtectMyPrivacy: Detecting and Mitigating Privacy Leaks on iOS Devices Using Crowdsourcing. Mobisys 2013.
• #57 The number of decisions is reduced by 26% in the far-right condition.
• #60 https://www.flickr.com/photos/johnivara/536856713 https://creativecommons.org/licenses/by-nc-nd/2.0/ Today, we are at a crossroads. There is only one time in human history when a global network of computers is created, and that time is now. And there is only one time in human history when computation, communication, and sensing are woven into our everyday world, and that time is now. Now, I’ve avoided using the term Internet of Things because, as you may remember from yesterday, I don’t really like the term. But regardless of what it’s called, it’s coming, and coming soon. And it will offer tremendous benefits to society in terms of safety, sustainability, transportation, health care, and more, but only if we can address the real privacy problems that these same technologies pose. So I’ll end with a question for you to consider:
• #61 (Same notes as slide #60.)
  • #62 DARPA Google CMU CyLab