1
TODAY’S WEBINAR:
Privacy Secrets Your Systems May Be Telling
Presented by: Kevin Poniatowski
2
About Security Innovation
• Authority in Software Security
• 15+ years research on software vulnerabilities
• Platform Centers of Excellence for specialization
• Authors of 18 books, 11 with Microsoft
• Named to the Gartner Magic Quadrant 5 years in a row
• Helping organizations minimize risk, regardless of problem complexity
3
What is Privacy?
4
Defining Privacy
• Control over one’s own data
• The ability to hide all or part of one’s data from one or more
parties
“The notion of privacy remains out of the grasp of every
academic chasing it. Even when it is cornered by such
additional modifiers as "our" privacy, it still finds a way to
remain elusive.” – Serge Gutwirth
5
Seven Types of Privacy
1. Privacy of the Person
2. Privacy of behavior and action
3. Privacy of Communication
4. Privacy of data and images
5. Privacy of thoughts and feelings
6. Privacy of location and space
7. Privacy of association (including group privacy)
6
Privacy vs. Security
• Privacy: Empowering users to
control collection, use, and
distribution of their personal
information.
• Security: Establishing protective measures that defend against hostile acts or influences and provide assurance of defense.
Source: Microsoft, Privacy in Software Development
7
Why Invest in Privacy: Key Drivers
Minimizes potential for legal/PR
issues
• High stakes, lowers overall risk
• COPPA, GLBA, HIPAA, CFAA, EU, FTC
Increases User Satisfaction and Trust
• Competitive differentiator
• Loyalty goes up with choice and control
• Powerful emotional factor, “Right Thing” to do
“Privacy concerns do have an
influence on user behavior.”
Pew Research Center’s Internet &
American Life Project
8
Threats to Privacy
• Big Data / Machine Learning
• Cookie Proliferation
• Seizure of Cloud Data
• Location and sensor data leakage and
betrayal
• Meta-data analysis
“Make no mistake, everything we
touch that is digital in the future
will be a data source….
The one thing that can threaten big
data is getting privacy wrong and
screwing up consumer trust. The
companies that miss that message
are going to suffer.”
- Trevor Hughes
9
Privacy Regulation
• HIPAA
• Healthcare data
• GDPR
• European online protections
• GLB Act
• Financial data
10
Privacy Regulation
• Children’s Online Privacy Protection Act (COPPA) of 1998
• Requires parental consent before collecting data relating to minors
• Specific to California:
• Confidentiality of Medical Information Act
• Protects Confidentiality of Personally Identifiable Medical Information
• Online Privacy Protection Act of 2003
• Commercial websites and services must include a privacy policy
• California Consumer Privacy Act (2018), A.B. 375
• Becomes effective in 2020
• Enforces transparency of data collection and processing
• Includes Right to be Forgotten
11
Understanding the Data
12
Data Types
• PII – Personal Data
• PHI – Health Data
• PCI – Cardholder Data
• Anonymous Data
• Pseudonymous Data
• De-identified data
• Health
• Financial
• Age Restricted (minors)
• Protected classes
• Race, religion, age, origin, sex,
pregnancy, etc.
13
Anonymous Data
• Goal – retain only data that does not allow the original source to be identified. Irreversible.
• Example – Identifying individuals from
cellphone location data
• Issues
• De-anonymization
• Process: Remove direct identifiers [name, address, post code, etc.]
• Additionally remove any data that, combined with other information, could identify the origin
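To make the removal step concrete, a minimal Python sketch (the field names are assumed for illustration and are not from the presentation):

    # Anonymization sketch: drop direct identifiers and quasi-identifiers outright.
    DIRECT_IDENTIFIERS = {"name", "address", "post_code", "phone", "email"}
    QUASI_IDENTIFIERS = {"date_of_birth", "device_id", "ip_address"}  # examples only

    def anonymize(record: dict) -> dict:
        """Keep only fields that cannot identify the original source.
        Irreversible by design: the removed values are not stored anywhere."""
        return {
            key: value
            for key, value in record.items()
            if key not in DIRECT_IDENTIFIERS and key not in QUASI_IDENTIFIERS
        }

    # anonymize({"name": "Ann", "post_code": "02139", "visit_count": 4}) -> {"visit_count": 4}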
14
Pseudonymous Data
• Goals – Replace PII fields with artificial identifiers. Allow original data
to be restored through additional
processing
• Example – AOL released private
“anonymous” data. Researchers quickly
de-anonymized it
• Issues – Statistically useful re-identification
from pseudonymized data
• Process –
• Identify fields to be replaced
• Research ID generation process
• Replace fields with IDs
• Maintain mapping in a secure location
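A minimal sketch of that process in Python, assuming illustrative PII field names and an in-memory mapping; a real system would keep the token-to-value mapping in a separately secured store:

    import uuid

    PII_FIELDS = ("name", "email", "ssn")  # assumed fields to replace

    def pseudonymize(record: dict, vault: dict) -> dict:
        """Replace PII fields with artificial identifiers; record the mapping."""
        out = dict(record)
        for field in PII_FIELDS:
            if field in out:
                token = str(uuid.uuid4())   # artificial identifier
                vault[token] = out[field]   # mapping allows later restoration
                out[field] = token
        return out

    def restore(record: dict, vault: dict) -> dict:
        """Reverse the substitution using the stored mapping."""
        return {key: (vault[value] if isinstance(value, str) and value in vault else value)
                for key, value in record.items()}

    # vault: dict = {}
    # masked = pseudonymize({"name": "Ann", "plan": "gold"}, vault)
    # original = restore(masked, vault)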
15
De-identified Data
• Goals – Remove identifiers, but keep core data
• Example – Remove names and SSNs from Health Records to
analyze rate of infection without naming individuals
• Issues – re-identification
• Process –
• Delete or mask personal identifiers [name, SSN, etc.]
• Suppress or generalize quasi-identifiers [date of birth, zip code]
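A minimal Python sketch of these two steps (assumed field names; the masking rules would depend on the data set and applicable regulation):

    def deidentify(record: dict) -> dict:
        """Delete direct identifiers, then generalize quasi-identifiers."""
        out = dict(record)
        for field in ("name", "ssn"):                 # direct identifiers
            out.pop(field, None)
        if "date_of_birth" in out:                    # "1962-07-04" -> "1962"
            out["birth_year"] = out.pop("date_of_birth")[:4]
        if "zip_code" in out:                         # "02139" -> "021**"
            out["zip_code"] = out["zip_code"][:3] + "**"
        return out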
16
Exercise
• What’s missing? What other kinds
of data exist?
• How are these items categorized or
regulated differently?
17
Data Lifecycle and Data Flow
18
Example Data Lifecycle
• Capture – new data
• Maintenance – process/transfer
• Synthesis – creation through analysis
• Usage – backup, partner
• Publication – make available elsewhere
• Archival – remove data from the cycle
• Purging – remove data from everywhere
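If the lifecycle stages need to be tracked in code (for example, to tag data stores or records), a small Python sketch; the stage names simply mirror the example above and would be adapted to each organization's own lifecycle:

    from enum import Enum

    class LifecycleStage(Enum):
        CAPTURE = "new data"
        MAINTENANCE = "process/transfer"
        SYNTHESIS = "creation through analysis"
        USAGE = "backup, partner"
        PUBLICATION = "make available elsewhere"
        ARCHIVAL = "removed from the active cycle"
        PURGING = "removed from everywhere"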
19
Data Flow Analysis
• How data flows through an application and is stored within the
system
• As part of a GDPR compliance project, organizations will need
to map their data and information flows in order to assess their
privacy risks
20
Data Flow Analysis Steps
• 1. Understand the information flow
• Example: From customer to organization to 3rd party supplier
• 2. Describe the information flow
• Walk through flow to identify unintended uses of data
• 3. Identify key elements
• Data Item
• Formats
• Transfer Method
• Locations (offices, Cloud, 3rd party suppliers, etc.)
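One row of such a data-flow map could be captured as a small record; a Python sketch with hypothetical example values:

    from dataclasses import dataclass

    @dataclass
    class DataFlow:
        data_item: str        # e.g. "customer email address"
        data_format: str      # e.g. "CSV export"
        transfer_method: str  # e.g. "SFTP"
        source: str           # e.g. "customer web form"
        destination: str      # e.g. "3rd party email provider (cloud)"

    flows = [
        DataFlow("customer email address", "CSV export", "SFTP",
                 "customer web form", "3rd party email provider (cloud)"),
    ]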
21
Example Data Flow Analysis
Input → Filtering → Processing → Masking → Storage
Retrieval → Encoding → Display → Edit → Storage
22
Data Transmission
• How is data sent?
• Protocol
• Encryption
• What properties do you expect from the transmission of the
data?
• Privacy
• Speed
• What are the threats targeting the transmission?
• Network Sniffing
• Attacks on SSL
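A transmission sketch in Python using the third-party requests library (the endpoint URL and payload are hypothetical). TLS with certificate verification addresses the sniffing and SSL-attack threats above; it does not hide the data from the receiving party:

    import requests  # any HTTP client that verifies TLS certificates works

    payload = {"user_id": "token-1234", "event": "login"}   # hypothetical data
    resp = requests.post(
        "https://api.example.com/events",  # assumed endpoint; HTTPS, not HTTP
        json=payload,
        timeout=10,
        verify=True,   # the default, shown explicitly: reject invalid certificates
    )
    resp.raise_for_status()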
23
Data in Other Systems
• When data flows out of your control, what obligation for
protection do you have?
• How will you process the data before transmission?
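One common answer to the second question is data minimization before transmission; a Python sketch with a hypothetical allow-list of the fields the partner actually needs:

    PARTNER_ALLOWED_FIELDS = {"order_id", "zip_prefix", "order_total"}  # hypothetical

    def prepare_for_partner(record: dict) -> dict:
        """Send a third party only the fields it needs; everything else stays home."""
        return {key: value for key, value in record.items()
                if key in PARTNER_ALLOWED_FIELDS}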
24
Questions?
25
Thank You!
www.securityinnovation.com
Everyone who attended today’s session will receive:
• Webinar recording
• Copy of the presentation
• Free 14-Day Trial of our Data Privacy Courses
Join us February 12th for the next webinar in this Privacy in the
SDL series: Creating an Effective Application Privacy Policy


Editor's Notes

  • #5 Page 2 of Seven Types of Privacy. Lab, 10-20 min. Do a quick lab: have people pair up or form small groups to discuss their own definitions of privacy. What do they want protected? Do different pieces of data have different sensitivity attached to them? How personal are privacy definitions? If their phone number, address, email address, SSN, or credit card number were lost, how would they feel? What could be done by the user to prevent this? What obligation does the user have to protect their data, and what obligation does the developer have? Keep this information around for the next lab; they link together.
  • #6 Page 4 of Seven types of Privacy Privacy of the person encompasses the right to keep body functions and body characteristics (such as genetic codes and biometrics) private. According to Mordini, the human body has a strong symbolic dimension as the result of the integration of the physical body and the mind and is “unavoidably invested with cultural values”.22 Privacy of the person is thought to be conducive to individual feelings of freedom and helps to support a healthy, well-adjusted democratic society. This aspect of privacy is shared with Clarke’s categorisation. We extend Clarke’s notion of privacy of personal behaviour to privacy of behaviour and action. This concept includes sensitive issues such as sexual preferences and habits, political activities and religious practices. However, the notion of privacy of personal behaviour concerns activities that happen in public space, as well as private space, and Clarke makes a distinction between casual observation of behaviour by a few nearby people in a public space with the systematic recording and storage of information about those activities.23 The ability to behave in public, semi-public or one’s private space without having actions monitored or controlled by others contributes to “the development and exercise of autonomy and freedom in thought and action”.24 Privacy of communication aims to avoid the interception of communications, including mail interception, the use of bugs, directional microphones, telephone or wireless communication interception or recording and access to e-mail messages. This right is recognised by many governments through requirements that wiretapping or other communication interception must be overseen by a judicial or other authority. This aspect of privacy benefits individuals and society because it enables and encourages a free discussion of a wide range of views and options, and enables growth in the communications sector. We expand Clarke’s category of privacy of personal data to include the capture of images as these are considered a type of personal data by the European Union as part of the 1995 Data Protection Directive as well as other sources. This privacy of data and image includes concerns about making sure that individuals’ data is not automatically available to other individuals and organisations and that people can “exercise a substantial degree of control over that data and its use”.25 Such control over personal data builds self-confidence and enables individuals to feel empowered. Like privacy of thought and feelings, this aspect of privacy has social value in that it addresses the balance of power between the state and the person. Our case studies reveal that new and emerging technologies carry the potential to impact on individuals’ privacy of thoughts and feelings. People have a right not to share their thoughts or feelings or to have those thoughts or feeling revealed. Individuals should have the right to think whatever they like. Such creative freedom benefits society because it relates to the balance of power between the state and the individual.26 This aspect of privacy may be coming under threat as a direct result of new and emerging technologies.27 Privacy of thought and feelings can be distinguished from privacy of the person, in the same way that the mind can be distinguished from the body. Similarly, we can (and do) distinguish between thought, feelings and behaviour. Thought does not automatically translate into behaviour. 
Similarly, one can behave thoughtlessly (as many people often do). According to our conception of privacy of location and space, individuals have the right to move about in public or semi-public space without being identified, tracked or monitored. This conception of privacy also includes a right to solitude and a right to privacy in spaces such as the home, the car or the office. Such a conception of privacy has social value. When citizens are free to move about public space without fear of identification, monitoring or tracking, they experience a sense of living in a democracy and experiencing freedom. Both these subjective feelings contribute to a healthy, well-adjusted democracy. Furthermore, they encourage dissent and freedom of assembly, both of which are essential to a healthy democracy. This categorisation of privacy was also not as obviously under threat when Clarke was writing in 1997; however, this has changed with technological advances. The final type of privacy that we identify, privacy of association (including group privacy), is concerned with people’s right to associate with whomever they wish, without being monitored. This has long been recognised as desirable (necessary) for a democratic society as it fosters freedom of speech, including political speech, freedom of worship and other forms of association. Society benefits from this aspect of privacy in that a wide variety of interest groups will be fostered, which may help to ensure that marginalised voices, some of whom will press for more political or economic change, are heard. This aspect of privacy was not considered by Clarke, and a number of new technologies outlined below could negatively impact upon individuals’ privacy of association. One might question what the difference is between privacy of location and space or privacy of association and privacy of behaviour. Privacy of location means that a person is entitled to move through physical space, to travel where she wants without being tracked and monitored. Privacy of behaviour means the person has a right to behave as she wants (to sleep in class, to wear funny clothes) so long as the behaviour does not harm someone else. Privacy of behaviour does not necessarily have anything to do with a person travelling through space, driving to work, going shopping or whatever. One can behave as one wants in private, separately from others. Privacy of association differs from privacy of behaviour because it is not only about groups or organisations (e.g., political parties, trade unions, religious groups, etc.) to which we choose to belong, privacy of association also connects to groupings or profiles over which we have no control – for example, DNA testing can reveal that we are members of a particular ethnic group or a particular family. Privacy of association directly relates to other fundamental rights such as freedom of religion, freedom of assembly, etc., from which privacy of behaviour and action (as we define it) are a step removed. Our typology of privacy (or, rather, our expansion of Clarke’s typology) offers various benefits to a range of stakeholders. It is important above all in policy terms, i.e., policy-makers should ensure that these different types of privacy are adequately protected in legislation, i.e., it is not sufficient to protect only personal data and personal communications (e.g., against interception). 
This typology is also of instrumental value in the development of a privacy impact assessment methodology in Europe (as is being done in the EC-funded PIAF28, PRESCIENT29 and SAPIENT30 projects, for example). Similarly, organisations that carry out privacy impact assessments should be concerned not only about privacy of personal data and privacy of communications, but also the other types of privacy as well. We also believe our typology provides academics and other privacy experts with a useful, logical, well-structured and coherent typology in which to frame their privacy studies. Our typology is similarly useful for privacy advocates. Although a widely accepted definition of privacy has proven elusive, this typology, firmly building on that established by Clarke, should be widely accepted.
  • #8 Children's Online Privacy Protection Act (COPPA); Computer Fraud and Abuse Act (CFAA)
  • #13 Depending on data type and data classification, different treatments and regulations may be applicable.
  • #14 We use an anonymization process with some of our reports if they need to be shared or analyzed publicly. One rule of thumb we use is "would the client be able to recognize themselves in this report?"; if so, it has not been anonymized. In this case de-anonymization may occur through data analysis: what data was leaked? Who has access to that data? What level of authorization was used to collect data? Match AuthZ to data. http://news.mit.edu/2013/how-hard-it-de-anonymize-cellphone-data?mod=article_inline "Researchers at MIT and the Université catholique de Louvain, in Belgium, analyzed data on 1.5 million cellphone users in a small European country over a span of 15 months and found that just four points of reference, with fairly low spatial and temporal resolution, was enough to uniquely identify 95 percent of them. In other words, to extract the complete location information for a single person from an "anonymized" data set of more than a million people, all you would need to do is place him or her within a couple of hundred yards of a cellphone transmitter, sometime over the course of an hour, four times in one year. A few Twitter posts would probably provide all the information you needed, if they contained specific information about the person's whereabouts."
  • #15 https://en.wikipedia.org/wiki/AOL_search_data_leak https://techcrunch.com/2006/08/06/aol-proudly-releases-massive-amounts-of-user-search-data/ While the AOL username has been changed to a random ID number, the ability to analyze all searches by a single user will often lead people to easily determine who the user is, and what they are up to. The data includes personal names, addresses, social security numbers and everything else someone might type into a search box. Through clues revealed in the search queries, The New York Times successfully uncovered the identities of several searchers. With her permission, they exposed user #4417749 as Thelma Arnold, a 62-year-old widow from Lilburn, Georgia.[9] This privacy breach was widely reported, and led to the resignation of AOL's CTO, Maureen Govern, on August 21, 2006. The media quoted an insider as saying that two employees had been fired: the researcher who released the data, and his immediate supervisor, who reported to Govern.[10][11]
  • #16 Read this for ways this can go wrong: https://en.wikipedia.org/wiki/Data_Re-Identification#Consequences Also note: https://en.wikipedia.org/wiki/Differential_Privacy In the mid-1990s, a government agency in Massachusetts called the Group Insurance Commission (GIC), which purchased health insurance for employees of the state, decided to release records of hospital visits to any researcher who requested the data, at no cost. GIC assured that patient privacy was not a concern since it had removed identifiers such as names, addresses, and social security numbers. However, information such as zip codes, birth date and sex remained untouched. The GIC assurance was reinforced by the then governor of Massachusetts, William Weld. Latanya Sweeney, a graduate student at the time, put her mind to picking out the governor's records in the GIC data. By combining the GIC data with the voter database of the city of Cambridge, which she purchased for 20 dollars, Governor Weld's record was discovered with ease.[10]
  • #17 Start the conversation by asking the class: what could be used from this image to identify this person? In the image: shoes, rings, hands, scars, ethnicity, background. Meta: image headers and metadata – location, camera type, etc. Bring up the information that the class discovered during the defining-privacy lab. Start with the types of data they began to think about in that lab; what else is there? What data is unique to their industry, software, or team? 20-30 min
  • #19 Note this is an example of how data might move through an organization; theirs may be different. Do a quick whiteboarding session to map out their own lifecycle.