1) The document discusses privacy challenges posed by personal data and big data, focusing on data collected from smartphones.
2) It notes that big data analysis is not always beneficial and can result in incorrect or harmful inferences being made about individuals from their data.
3) The document argues that privacy considerations need to be built into the design of platforms and systems from the start to help users maintain control over their data and context.
2. Outline
• From Personal Data to Big Data
• Demystify Privacy?
• Privacy issues around smartphone sensing
• UX for privacy: data and context
• Privacy rethink
14. So What Happened in that Scenario?
• Inferences made from both data instances and patterns
• Inferences might be incorrect
• Data used for one purpose might also be used for another purpose
• Inferences might be harmful
16. Challenges
• What if my data in Big Data is incomplete?
• Do I have control over which parts of my data get involved in any Big Data analysis?
• What could be the harms?
– social groups
– insurance
– work
25. First Step to Improve Privacy on Smartphones
• Transparency
– What kinds of data are collected by the apps?
– Where are they sent to?
– How will the data be used?
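The three transparency questions above can be read as the fields of a per-app disclosure record. The following is only a minimal sketch under that reading: `DataFlow`, `describe`, and the example app and host names are hypothetical and do not come from the talk or from any real platform API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    """One disclosed data flow (hypothetical structure, not a real API)."""
    data_type: str    # what kind of data is collected
    destination: str  # where it is sent
    purpose: str      # how it will be used


def describe(app_name, flows):
    """Render a plain-text transparency summary for one app."""
    lines = [f"{app_name} collects:"]
    for flow in flows:
        lines.append(
            f"- {flow.data_type}, sent to {flow.destination}, "
            f"used for {flow.purpose}"
        )
    return "\n".join(lines)
```

A summary like this could be shown to the user before an app is installed or when it first requests a data type; where to show it is a UX decision the slide leaves open.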
38. E.g. Privacy-impacting Behavior
Revealing Privacy-Impacting Behavior Patterns of Smartphone Applications, Gokhan Bal, 2012
39. Research topic 2: UX for Privacy
• One of the three pillars: self-determination (control)
• Do we feel some places are more private than others?
• Privacy in public places?
• Not a dichotomy, but involves various factors
• Depends on the situation
40. Research problems
• How do we design fine-grained control for people to disclose their data to applications?
41. Research problems
• Could we give more fine-grained control for people to disclose their data to applications?
• How do people create a policy?
– what are the factors that affect their decisions?
42. Research problems
• Could we give more fine-grained control for people to disclose their data to applications?
• How do people create a policy?
– what are the factors that affect their decisions?
• Control of what?
– what should be the appropriate data abstractions for control?
– e.g. Google+ Circles: a better abstraction for sharing in social networks?
43. Study Flow
• Logs various types of sensors
• Prompts the user with a notification
1. annotations (location, situation, timestamp)
2. sensor data (ambient sound, accelerometer, Bluetooth, GPS, Wi-Fi, gyroscope, cellular info, etc.)
[Diagram labels: generate survey; Databases]
44. Survey
3 different types of data consumers (apps) (academic, local stores, online companies) are selected randomly for each question
The local store label is customized for each user
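The survey generation described in the study flow can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's actual implementation: it assumes one consumer type is drawn at random per question, and the names `Annotation`, `consumer_label`, and `generate_survey`, as well as the question wording, are hypothetical.

```python
import random
from dataclasses import dataclass

# The three consumer types named on the slide.
CONSUMER_TYPES = ("academic", "local store", "online company")


@dataclass(frozen=True)
class Annotation:
    """A user annotation logged alongside the sensor data."""
    location: str
    situation: str
    timestamp: str


def consumer_label(consumer_type, local_store_name):
    """Use the per-user store name for the 'local store' consumer."""
    if consumer_type == "local store":
        return local_store_name
    return consumer_type


def generate_survey(annotations, data_types, local_store_name, rng):
    """Build one question per (annotation, data type) pair.

    Assumption: a single consumer type is picked at random per question.
    """
    questions = []
    for ann in annotations:
        for data_type in data_types:
            consumer = rng.choice(CONSUMER_TYPES)
            label = consumer_label(consumer, local_store_name)
            questions.append({
                "annotation": ann,
                "data_type": data_type,
                "consumer_type": consumer,
                "text": (
                    f"At {ann.timestamp} you were at {ann.location} "
                    f"({ann.situation}). Would you disclose your "
                    f"{data_type} to {label}?"
                ),
            })
    return questions
```

Passing an explicit `random.Random` instance as `rng` keeps the random assignment reproducible, which helps when survey responses are analyzed afterwards.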
45. Recall context then respond
• Without interrupting the user to think about *privacy* questions at the moment, we help the user reconstruct their situation context later
"So I was talking to my colleagues this morning in my office, this app asked for my location": Yes / No
[*policy*] I don't want my locations at work to be disclosed to any app
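A policy such as the one on this slide ("no location disclosure at work, to any app") could be represented as a context-matching rule. The sketch below is one possible encoding, not the study's design: `Request`, `Policy`, `decide`, and the `"*"` wildcard convention are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """An app asking for one data type in a given context."""
    app: str
    data_type: str
    location: str   # user-annotated place label, e.g. "work"
    situation: str  # user-annotated situation, kept for later analysis


@dataclass(frozen=True)
class Policy:
    """A disclosure rule; "*" matches any value (assumed convention)."""
    data_type: str
    location: str
    app: str
    allow: bool


def _matches(pattern, value):
    return pattern == "*" or pattern == value


def decide(policies, request, default=None):
    """Return the first matching policy's decision, else `default`.

    A `None` result means no policy applies and the user should be asked.
    """
    for policy in policies:
        if (_matches(policy.data_type, request.data_type)
                and _matches(policy.location, request.location)
                and _matches(policy.app, request.app)):
            return policy.allow
    return default


# The slide's example policy: never disclose location while at work.
policies = [Policy(data_type="location", location="work", app="*", allow=False)]
```

Returning `None` when nothing matches keeps "deny" distinct from "no rule yet", which is the case where prompting the user (or deferring the question, as the study does) makes sense.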
46. Conclusion
• We found that context does affect people's decisions about data disclosure
• People actually made unreasonable decisions when they were ignorant of the privacy implications
– "I am willing to give my situation to Google because I figure that they might already know everything about me"
47. Rethinking the state of data today
Networked, inferred and public by default
49. Where are we heading?
• Big Data Analysis + application platforms
– Electronic invoices (電子發票)
– Smart City
• To maximize the benefit of data, privacy should be built into system design
– security/anonymity*
– user experience
– access control
– audit logs
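Two of the design elements listed above, access control and audit logs, can be combined so that every request for personal data is checked against the purposes the user agreed to and recorded either way. This is a minimal sketch under that assumption; `PersonalDataStore` and its methods are hypothetical, and the DriveSmart example is borrowed from the scenario in the speaker notes.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PersonalDataStore:
    """Purpose-bound access control with an audit trail (illustrative)."""
    # Maps (consumer, data_type) to the set of purposes the user granted.
    consents: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def grant(self, consumer, data_type, purpose):
        """Record that the user allows this consumer/data/purpose triple."""
        self.consents.setdefault((consumer, data_type), set()).add(purpose)

    def request(self, consumer, data_type, purpose):
        """Check a request against granted purposes and log the outcome."""
        allowed = purpose in self.consents.get((consumer, data_type), set())
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "consumer": consumer,
            "data_type": data_type,
            "purpose": purpose,
            "allowed": allowed,
        })
        return allowed
```

For example, if the user grants DriveSmart access to location for "routing" only, a later request for location with the purpose "recommendation" is refused, and both requests appear in `audit_log` for the user to review.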
52. 30 seconds takeaway
• Big Data is not necessarily good data
• "Privacy" (隱私) is actually not a good term for issues around personal data in Big Data
• Platform designers need to think about usage of data, not applications
• Think about context and data together whenever designing UX for privacy
53. 30 seconds takeaway
• Big Data is not necessarily good data
• "Privacy" (隱私) is actually not a good descriptor for issues around personal data in Big Data
• Platform designers need to think about usage of data, not applications
• Think about context and data together whenever designing UX for privacy
54. 30 seconds takeaway
• Big Data is not necessarily good data
• "Privacy" (隱私) is actually not a good term for issues around personal data in Big Data
• Platform designers also need to think about usage of data and its impact on privacy, not just application features
• Think about context and data together whenever designing UX for privacy
Imagine a world where all information about you is captured! The places you go, the emails you send, the websites you visit, the people you talk to, the things you do, your tweets, your Facebook updates, your everything. This is the Holy Grail for many people in research. They believe that our lives can be improved by allowing a machine to help us by observing the signals we create. Using these signals, machines can predict the actions we are likely to take and automate the decision-making process for us. Machines can make inferences about you from the data collected about you, and these inferences can be useful in service automation. But these inferences can also be "misleading". The canonical example is taking someone's words "out of context", which leads to an incorrect interpretation of the data (the words they spoke). Interpreting data in the wrong context raises privacy issues, and we will talk more about privacy and context in later slides. The main message of this talk is to ask: Is it possible to preserve the benefits of using personal data for automating decision making, while giving some amount of control to the user?
One-year proposal. a. Specify what I will accomplish (not seen yet). b. Must convince people that I am not doing situation detection (people are very confused; they think I am doing situation awareness). b1. Give a diagram showing that situation/context-awareness is the input, and what the challenges are then. c. Research challenges: predictions could still have errors (point out: if situation detection is perfect, what if there are errors?). c1. Usability problem. c2. d. Plan for how to solve problem 2.
Siri is one of the most representative AI programs today; on the one hand, it is an AI algorithm
Here is an example of how companies (applications) make inferences from personal information. Bob drives to work every day. He has an iPhone and an application called DriveSmart. Bob lets DriveSmart access his location so that the app can give him the best route every morning. Bob is happy with that, and he drives almost the same route every day. DriveSmart has now been updated to version 2.0: it reads Bob's driving pattern and provides a location-based service that recommends stores for breakfast while Bob is on his way to the office. Bob chooses fast food restaurant M and uses an e-coupon to buy a cup of coffee every day. Later that week, he gets an email from MASS RMV, entitled "Drive safely, do not eat while driving". Later he gets another email, from "Eat-healthy-america.org", saying "Be healthy, Don't eat fast food for breakfast."
* We see that DriveSmart can calculate a route for Bob given his current condition and destination. We also see that the location-based service uses Bob's location pattern to recommend fast food stores on his way to work.
** Inferences made about you can be inaccurate. For example, Eat-healthy-america.org inferred that Bob eats fast food while driving to work, but in fact he only gets a cup of coffee from M every day.
** Data used for one purpose might also be used for another purpose. Bob's data is collected by DriveSmart for routing purposes, and is later used for service recommendation and by the RMV.
Inform and consent: requires not only the information needed to make a decision but also the actual literacy with which to make that decision. What does it mean to be informed about what is happening in a Big Data algorithm? What does it mean to consent: the challenge of agency (knowledge + control). Self-determination (access control). Personally identifiable information.
Pew survey: 54% of people uninstall their app because of privacy issues
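Although we can collect a great deal of information today, some data is still incomplete or not collected at all
I am a layperson when it comes to the previous item
So far I see only three kinds of people: businesspeople, journalists, and lawyers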
For the first item, I need to look up danah boyd's talk. Privacy is not a good term for describing issues around personal data in big data; people will then ask: what are the good terms?