GDPR for developers

GDPR for developers by Janne Kalliola, Exove
2nd November 2018 at DrupalCamp Baltics


  1. 1. Janne Kalliola Exove GDPR for Developers Tallinn, November 2, 2018
  2. 2. Agenda § GDPR in detail § Rights of individuals § Data transfers § GDPR and CMS platforms § Existing systems § Future systems § Work surrounding technical platforms
  3. 3. About Exove § Digital design and development company in Finland, Estonia, the UK, and Singapore § Full service portfolio from business consulting and service design to development and care § We serve both multinational giants and new start-ups alike We deliver digital growth More about us: § www.exove.com § www.exove.com/gdpr § @exove
  4. 4. About Janne Kalliola § Founder and CEO of Exove § Continuent, First Hop, SSH, Helsinki University of Technology § Been coding since 1983, first web stuff in 1994 § Worked with web publishing and content management systems since 1999 § I’ve written three CMSs in the past § Worked with open source since 1998, with Drupal from 2007 More about me: § www.kallio.la § linkedin.com/in/jannekalliola § @plastic
  5. 5. GDPR in Detail
  6. 6. General Data Protection Regulation § The new EU data protection act that harmonises the use of private data across the EEA § The regulation has been heavily lobbied and it took several years to negotiate the final version § Transition period ended in May 2018 § The GDPR replaced the national laws and regulations based on the EU Data Protection Directive (95/46/EC) § The GDPR is directly applicable in each member state § Will lead to a greater degree of data protection harmonization across EU nations § Member States have retained significant rights to legislate in certain areas
  7. 7. Key Concepts § Data Controller – company managing personal data § Data Processor – company handling data for a data controller § Data Subject – an individual person § Private Data – very broad definition of data that can be used to identify a person directly or indirectly § Name, email, user account, phone number, address, IP address § Private data can be processed if and only if it is required to provide the service § If the service can be provided to anonymous users, it cannot ask for private data
  8. 8. Two Data Handling Roles Controller § The company collecting the data and controlling its usage § Responsible for and able to demonstrate compliance with the regulation § Including also work done by processors Processor § A company that processes personal data on behalf of a controller § Must be contractually bound to the controller and follow written orders § Must return or delete data when contract ends
  9. 9. Key Concepts – Special Category § Data that reveals racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person’s sex life or sexual orientation § Data in special category has stricter rules than the generic private data § It can be processed, but there needs to be reason to do so
  10. 10. Children § Children are identified as vulnerable individuals that require specific protection § Consent given by person with parental responsibility for the child § Also national laws about children making contracts, etc.
  11. 11. Key Principles – Controllers and Processors § Accountability § Demonstrating compliance § Increased documentation obligations § Risk-based approach § Privacy by design and default § Privacy Impact Assessment and prior consultation where risk is high § Data Protection Officers § New breach reporting obligations § Detailed prescription of what must be included in outsourcing contracts
  12. 12. Key Principles – Individuals § Transparency and consent – The individuals need to know how and why their data is used, and companies need to have valid reason for the data usage § More extensive data subject rights § Restriction § Erasure § Portability § "Profiling" § Changing consent requirements (including in relation to children)
  13. 13. Key Concepts – Risk Based Approach § Authorities have few resources to control many companies with growing amounts of data § Thus § The company is made accountable § The measures need to be in proportion to the risk involved, for example: § Appropriate § Effective § By design and default
  14. 14. Accountability § Organisations must be able to prove that they are following the regulation, i.e. reversed burden of proof § Requires process documentation, paper trails of decisions, and in some cases privacy impact assessments
  15. 15. Key Concepts – Applicability § The regulation applies to the private data of an EEA national § Notwithstanding the location of the person, data, or processing § Only one EEA national is enough to make the data processing regulated by GDPR
  16. 16. Fines § There has been a lot of talk about "fines" in GDPR, or administrative sanctions § The maximum fines are high – 20M€ or 4% of global turnover, whichever is higher § In reality, big fines are probably exceptions and one needs to show utter disregard of GDPR to get such sanctions § The scale of sanctions starts from a notification and turns into monetary sanctions somewhere down the road § But the sanctions have made sure that everybody has taken GDPR seriously
  17. 17. Rights of the Individuals
  18. 18. The Rights of the Individuals (by GDPR article) § Articles 13/14: Transparency, right to be informed § Article 15: Access to personal data § Article 16: Rectification of inaccurate data § Article 17: Right to be forgotten § Article 18: Right to restrict processing § Article 20: Data portability § Article 22: Automated decision making and right for human intervention
  19. 19. Rights Explained (1/2) § Access to data – The individuals must be able to see the data collected about them § The request must be fulfilled within a month, with extensions in some cases, and the data delivered in a commonly used electronic format § First copy must be free of charge § Rectification of inaccurate data – The individuals can ask for inaccurate data to be corrected § Right of erasure – The individuals can ask for data to be removed § Objection to processing – The individuals can stop specific kinds of processing, for example, direct marketing
  20. 20. Rights Explained (2/2) § Portability – The individuals have the right to have their data ported to them or to another service § Restricting processing – The individuals can ask to stop processing their data for a period of time § Data can also be temporarily removed in this case § Profiling and automated decision-taking – Profiling based on sensitive data requires explicit consent and the individuals can request manual intervention in automated decision-taking that has significant effects on them
  21. 21. Lawfulness of Processing § Data subject has given consent § Necessary for the performance of contract or to take steps prior to entering into a contract § Necessary to protect vital interests of data subject § Necessary for legitimate interests of controller or 3rd party § Necessary for compliance with legal obligation to which the controller is subject § Necessary for task carried out in the public interest or exercise of official authority
  22. 22. Consent § Consent must be § Actively given § Separable from other written agreements § Clearly presented § As easily revoked as given § Additional requirements include an effective prohibition on "bundled" consents and the offering of services which are contingent on consent to processing § Where consent is relied on controllers should be able to demonstrate that consent was given by the data subject to the processing § In practice, consent metadata is necessary
  23. 23. Consent – Implications for UX § Consent is more regulated than before § Needs to be specific and unambiguous, cannot be part of other written agreements § Must be active – i.e. no pre-ticked checkboxes § Must be reversible – in other words, must be available in user profile or similar § Record of the given consent is required § Consent cannot be required for a service that works also without processing personal data § Privacy policy is more important than before § Data has to have storage times, and a lot of other tidbits
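
The consent metadata mentioned on these slides could be modelled roughly as follows. This is only a minimal sketch: the class `ConsentRecord` and all of its field names are assumptions made for illustration, not something prescribed by the GDPR or by the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ConsentRecord:
    """Hypothetical consent metadata; the field names are illustrative."""
    subject_id: str        # internal identifier of the data subject
    purpose: str           # one specific purpose, e.g. "newsletter"
    policy_version: str    # privacy policy version shown when consenting
    given_at: datetime     # moment of the active opt-in
    withdrawn_at: Optional[datetime] = None

    def is_active(self) -> bool:
        return self.withdrawn_at is None

    def withdraw(self) -> None:
        # Keep the record itself so the consent history stays demonstrable.
        if self.withdrawn_at is None:
            self.withdrawn_at = datetime.now(timezone.utc)


record = ConsentRecord("user-42", "newsletter", "2018-05",
                       datetime.now(timezone.utc))
record.withdraw()
```

Storing one record per purpose keeps consents separable, and keeping the withdrawn record (instead of deleting it) is one way to show later that consent was given and revoked.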
  24. 24. Legitimate Interest § Consent is rather difficult to achieve & demonstrate § Other grounds for processing relatively narrow § Legitimate interests likely to be one of the most important grounds
  25. 25. Legitimate Interest § Controllers that rely on "legitimate interests" should maintain a record of the assessment to demonstrate that they have given proper consideration to the rights and freedoms of data subjects § When relying on "legitimate interests" – this must be set out in the information notices § Recommendation: perform risk assessment and documentation Examples of legitimate interest: § Processing for direct marketing purposes or preventing fraud § Transmission of personal data within a group of undertakings for internal administrative purposes, including client and employee data § Processing for the purposes of ensuring network and information security, including preventing unauthorised access to e-communications networks and stopping damage to computer and e-communication systems § Reporting possible criminal acts or threats to public security to a competent authority
  26. 26. Data Transfers
  27. 27. Data Transfers – Basic Principles § Transfers outside EEA (European Economic Area) are restricted, but not forbidden § Transfers require adequate level of data protection, such as following EU model clauses or binding corporate rules inside a group of companies § Safe Harbor is now replaced with Privacy Shield, a new deal to self-certify US companies to allow hosting data regulated by the GDPR § Number of safe countries whose regulation provides similar protection of personal data as GDPR § Andorra, Argentina, Canada (only commercial organisations), Faroe Islands, Guernsey, Israel, Isle of Man, Jersey, New Zealand, Switzerland, Uruguay and USA (if the recipient belongs to the Privacy Shield) § Updated from time to time by European Commission
  28. 28. Data Transfers – Hidden Complexity § Modern IT architectures are complex and they are designed in a layered fashion § Thus the complexity of the underlying systems may easily escape § The data flows should be designed and documented clearly § And this documentation must be kept up to date all the time § Reducing privacy complexity by restricting the data to essentials, using encryption, hashes, pseudonymisation, etc. makes perfect sense
  29. 29. Data Transfers – APIs and Integrations § Be aware what is sent over API and/or integrations with other systems § As the definition of private data is very broad, it is too easy to send also private data through an integration point § If you provide the API end points, check the API thoroughly to see whether it inadvertently provides some private data § There are no technical measures to control the flow or the destination of the data after it has left the system § Users must be kept informed about the potential of their private data being handled outside of the system, including also the locations
  30. 30. Data Transfers – They Are Needed § You cannot avoid data transfers in the modern networked economy § Cloud services and serverless paradigm multiply the interconnectivity § And each interconnecting point might be a source of data transfer § There is no point fighting back and trying to do everything by yourself § You will be so inefficient in rolling out new features that competition will crush you § Instead, try to minimise the risks while reaping most of the benefits
  31. 31. CMS Specific Considerations
  32. 32. Structured vs. Unstructured Data § Most of the data processed by computers is structured, in other words it contains named fields that might have types § Structured data is easy to put into spreadsheets § Content management systems handle a lot of unstructured data – the content § Unstructured data is easy to put into documents § This data is also under GDPR
  33. 33. Content and GDPR § Content easily contains a lot of personal information, such as names, email addresses, phone numbers, and images of people § These cannot easily be exported from the system to satisfy end user rights § Thus, one needs to be diligent § Best solutions are to make suitable content types and other structures that move a lot of repeating data into structured data § For example, a staff listing implemented as a list of persons and not as a freely editable page
  34. 34. Content and Consent § Remember also to have consent from people to use their personal information § Discussion forums, blog comments, etc. § This applies to your own personnel, too § Using names and photos in a staff listing needs consent or legitimate interest § It does not help if you use company provided email addresses or phone numbers, as people can still be identified using them – thus they are also personal information
  35. 35. Analytics § Using analytics is ok in general § It is good to check what kind of data goes into analytics and how the system processes it § Even if it does not store the data, it might temporarily be accessible by the personnel of the analytics provider § And this needs to be covered in the contract between you and them § IP address is a typical piece of data transferred to analytics § Some solutions – such as Google Analytics – offer anonymisation of IP address before sending it to the analytics
  36. 36. Access and Error Logs § Content management systems generate various logs for administrative and error management purposes § These logs have at least the IP address of the user and thus are also full of personal data § The procedures for such logs need to be checked § Who has access to them § Whether they are exported to an analysis system § Also own or third party extensions to the CMS may write their own log files § Debug mode may cause more personal data to be written to the files
  37. 37. URLs § Your system may transfer personal data in URLs, such as § https://example.com/person/?name=Janne+Kalliola&birthdate=... § All systems storing that URL – logs, analytics, etc. – suddenly may contain way more personal data than you know and have defined in your processes § Also transaction ids and other pieces of data that identify a single user are considered personal data
  38. 38. Staging and Development Environments § GDPR affects all systems, including also staging and development environments § In case of requests from users, the data in these systems needs to be included in erasure, rectification, etc. § When data is copied from production to staging or development – typically to debug issues – special care is needed § As people tend to have a more relaxed attitude towards these systems, the probability of data leaks increases
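
As an illustration of the IP anonymisation mentioned on the Analytics slide, an address can be truncated before it is passed on. The sketch below uses Python's standard `ipaddress` module; the function name `anonymise_ip` and the prefix lengths (24 bits for IPv4, 48 bits for IPv6) are assumptions chosen for the example, not a legal threshold.

```python
import ipaddress


def anonymise_ip(address: str) -> str:
    """Zero the host part of an IP address (prefix lengths are assumptions)."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymise_ip("203.0.113.77"))  # 203.0.113.0
```

Whether a truncated address still counts as personal data depends on the context, so this reduces the risk rather than removing it.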
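
One way to limit the URL problem described above is to redact known personal parameters before a URL is written to logs or analytics. This is a sketch only: `scrub_url` and the `SENSITIVE_PARAMS` list are hypothetical, and the real parameter names depend on the system.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list; every system needs its own.
SENSITIVE_PARAMS = {"name", "birthdate", "email"}


def scrub_url(url: str) -> str:
    """Replace the values of sensitive query parameters with a marker."""
    parts = urlsplit(url)
    query = [
        (key, "REDACTED" if key.lower() in SENSITIVE_PARAMS else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


print(scrub_url("https://example.com/person/?name=Janne+Kalliola&page=2"))
# https://example.com/person/?name=REDACTED&page=2
```

The more robust fix is not to put personal data into URLs at all; scrubbing is a safety net for the systems that already do.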
  39. 39. Digital Marketing Specific Considerations
  40. 40. Compliance § Digital marketing platforms must be GDPR compliant § This should not be a problem with any of the major platform providers, as without compliance they would be quickly out of business § But it is a good thing to check § Your processes need to be compliant, too § This is typically harder § And also connections between platforms need to be compliant
  41. 41. Mass Mailing § It is still allowed to send cold emails to people under GDPR, with the following requirements: § The recipient address is a business address § The recipients are targeted based on your business – the mail should benefit the recipient § You need to inform recipient how their personal data is processed § You need to include instructions how to remove or change their data § The personal data is not processed any longer than it is necessary
  42. 42. Subscribers Added before GDPR § If you have asked permission at the very beginning and you have received their consent, there is no need to ask the permission again § If the purpose of the processing has changed or will change, they need to be informed and given an easy way to decide if they want to allow processing their data or not § If you have bought subscriber lists, you need to know how the data was obtained and be able to explain to individuals, how and why you got their data § This applies also to cases that you outsource address collection to a partner
  43. 43. Tool Chains § Digital marketing tools are typically chained § The source of the data is in CRM § Then there are marketing automation systems, websites, etc. § When data must be removed or changed, it has to be done through the whole chain § Or the systems should be implemented so that they do not store anything – just use the data when it is received and then discard it right away § It is very important to define retention times for the personal data that did not lead into a business relationship
  44. 44. GDPR and Existing Systems
  45. 45. Documentation Can Mislead § If the system’s documentation is from the era before GDPR, it does not focus on data privacy much or at all § Further, the documentation is typically a somewhat simplified view of the architecture § Sometimes very simplified § Finally, it is most probably also outdated
  46. 46. Example Architecture Diagram
  47. 47. The Same System, Zoomed in
  48. 48. Data Storage § Data is stored in modern systems into multiple locations and multiple times § Performance, scalability, error management, data security needs, etc. § Without thorough and detailed understanding of the architecture, some data storages may not be known by anyone § But the data needs to be expunged from those, too, when requested or when the data is not needed anymore
  49. 49. Auditing Storage § For each existing system, find out: § Where the personal data is stored § What are the retention times and criteria § If these have not been specified, start the work § "Forever" is not a retention policy and it must change § Why the data is stored – there needs to be legitimate reason for keeping the data § Also the metadata of consent needs to be stored
  50. 50. Data Deletion – Real or Not? § Deletion of data is a complex task in a networked data model § Removing something may leave dangling pointers or otherwise render part of the data unusable § Thus, deletion might have been implemented by marking the item deleted or hidden § The user cannot see it and considers it removed § This, of course, does not work with GDPR – unless you have valid legal reasons to keep the data
  51. 51. Residual Data § Modern architectures duplicate data frequently – also private data § Some of these duplications are not deleted when they are no longer needed technically § Log files, especially audit and debug logs § Synchronization files § This is called residual data § And there can be plenty of it
  52. 52. Our Architecture, Again
  53. 53. § Varnish or CDN in the front § Web server logs § Platform logs § Local caches § Uploaded binary files § Maillog of all the sent emails § Backups of the servers
  54. 54. § SQL logs § Binary logs on all servers § Backups of binary logs § Database dumps made by developers § Production dumps to staging environment
  55. 55. § Integration platform logs and local caches § Integration platform document DB oplogs § SaaS messaging platform logs and internal database
  56. 56. § Finally the actual data master, its logs, backups and development environment
  57. 57. What to Do? § Data flow mapping is crucial § The natural starting point is the data entry, typically a website or a mobile application § Map the flow of the data from the source to the storage § Also external integrations need to be documented § Reduce data, if possible § Tune log levels, synchronisation frequencies, etc. § Mark down or define retention policies for residual data § Log rotation, cron based removals, etc. § Have proper policies for the rest § For example, how to make database dump for testing, how to handle it, when to remove it, etc.
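
The "cron based removals" for residual data could look something like the sketch below, run periodically by a scheduler. The function name `remove_expired`, the `*.log` pattern, and the use of file modification time as the age criterion are all assumptions for the example.

```python
import time
from pathlib import Path


def remove_expired(directory: Path, max_age_days: int,
                   pattern: str = "*.log") -> list:
    """Delete files older than max_age_days and return the removed paths."""
    cutoff = time.time() - max_age_days * 86400
    removed = []
    for path in directory.glob(pattern):
        # Age is judged by modification time; adjust if your policy differs.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Returning the removed paths makes it easy to log what the job did, which in turn supports the documentation duty discussed earlier.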
  58. 58. Special Categories § The private data falling under special categories – health, religion, union membership, etc. – needs to be handled with extra care § Proper access control over who can see and manipulate the data § Audit trail of all actions § Also, use tight scrutiny to check whether the special category data is actually needed or not § It adds extra burden that might not bring good enough benefits § Or ask for it when needed, use, and discard – no storing at all
  59. 59. Privacy Policies § The privacy policies of the systems need to be constantly upgraded when the system, the processes, or the purpose of the processing changes § This is a surprisingly frequent activity, if the system is under active development § Of course, the first step with existing systems is to check that the policies actually exist and they are compliant with GDPR § This is more of a territory for lawyers § Just make sure that the document is not written in hard to understand legalese, so that also a layman can understand it
  60. 60. Data Security and GDPR § Focus in the past has been in data security § GDPR is not about data security and it does not define data security requirements § It requires adequate security § Adequacy depends on the situation, and no hard and fast rules can be given § Data security procedures have not taken data privacy into account that much
  61. 61. GDPR and Future Systems
  62. 62. By Design and by Default § Data protection by design and data protection by default is still very much undefined § We will have new clues flowing in as there are more guidelines from authorities and actual cases § Requirements for processes and daily handling of personal data are not defined, nor have they gotten much focus in GDPR preparations
  63. 63. Architecture Planning under GDPR § When planning architectures of new systems, take the following into consideration: § Allowing data subject rights in new services § Personal data design § Risk-based security built-in to new services § Data protection and security in maintaining new services
  64. 64. Personal Data Design § Create a personal data design for the new service § Do not collect anything that you cannot design a use for § Do not collect anything that can be considered a high risk § Limit technical data collection
  65. 65. Example High Level Personal Data Design § Before use: unregistered usage tracking based on cookies, email address if on mailing list, technical data § While using: full contact details, profile, usage tracking, purchase history, mailing list actions, technical data § After usage: email address for mailing list and re-contact, purchase history for 2 years
  66. 66. Example Use-Case Level Personal Data Design § Registration: full name, address, email, gender, IP number, user-agent, anonymous cookies connected, phone number § Update Profile: avatar image, preferences, hobbies, age, household income, children, marital status § Purchase: product details, cost, discounts, path to purchase § Contact to Customer Care: full call record, call transcript, phone number, product reference, internal comms regarding support case
  67. 67. Minimising Use of Private Data § The amount of private data collected can – and should – be minimised § Requires good architectural skills § Several strategies, such as § Collect, use, discard – do not store for later use, works well with background checks § Encryption – when data is passed through a system that is not using it § Hashing – storing one way hashes instead of real information, for example, banned accounts
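
The hashing strategy for banned accounts could be sketched as follows. The use of a keyed hash (HMAC-SHA256), the `SECRET_KEY` placeholder, and the function names are assumptions for the example; the slide only says that one-way hashes are stored instead of the real information.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would come from a secrets store.
SECRET_KEY = b"replace-with-a-real-secret"


def fingerprint(email: str) -> str:
    """Keyed one-way hash of a normalised email address."""
    normalised = email.strip().lower().encode("utf-8")
    return hmac.new(SECRET_KEY, normalised, hashlib.sha256).hexdigest()


# Only fingerprints are stored, never the addresses themselves.
banned = {fingerprint("troll@example.com")}


def is_banned(email: str) -> bool:
    return fingerprint(email) in banned
```

A plain unkeyed hash of an email address is easy to reverse by guessing, which is why the sketch uses a key. Even so, such a fingerprint is pseudonymised rather than anonymous and may still need to be treated as personal data.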
  68. 68. Risk-Based Approach to Security § Data security should be built in accordance with risk § Risk to the rights and freedoms of data subjects § Risk is not based on data only, but also context of the service § Risk should be knowingly analysed with the Product Owner and the technical people § Risk analysis should be documented § Data security should be documented as functional requirements and non-functional requirements; otherwise it does not happen
  69. 69. Risk-Based Approach to Security, Example § Limit the completeness of data sets § Denormalisation for performance – in other words, copying the same data to several places to speed up data reading § Leakage of full or individually usable data set has higher impact than partial data – for example, leaking addresses vs. leaking addresses and names § Risk of unencrypted data in transit § For example, email notifications – the risk grows when the service has higher impact on individuals, such as banks, stock brokers, or dating services § Leaking data via user friendly features § For example, login boxes that inform whether an account exists or not
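
The login box example can be illustrated with a handler that answers identically for unknown accounts and wrong passwords. Everything here is a hypothetical sketch: the in-memory `USERS` store, the helper names, and the PBKDF2 parameters are assumptions, not a recommended production setup.

```python
import hashlib
import hmac
import os

USERS = {}  # hypothetical in-memory store: email -> (salt, password hash)
GENERIC_ERROR = "Invalid email or password."


def _hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(email: str, password: str) -> None:
    salt = os.urandom(16)
    USERS[email] = (salt, _hash(password, salt))


# Dummy credentials so that unknown accounts cost the same hashing work.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = _hash("dummy", _DUMMY_SALT)


def login(email: str, password: str) -> str:
    salt, stored = USERS.get(email, (_DUMMY_SALT, _DUMMY_HASH))
    matches = hmac.compare_digest(_hash(password, salt), stored)
    # Same message whether the account is missing or the password is wrong.
    return "Welcome." if matches and email in USERS else GENERIC_ERROR
```

The same idea applies to password reset and registration forms, which can leak account existence just as easily as the login box.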
  70. 70. Documentation § Risk analysis & technical security decisions with reasoning § Data protection impact assessment § Record of processing activities § Privacy policy
  71. 71. Privacy Related Metadata § GDPR requires some metadata about private data, such as a record of consent being given § The more you know about the allowed usage of the data, the more benefits and possibilities it offers § When drafting personal data designs, discuss and document also the needs of the metadata § Keep in mind that the metadata will most probably be also private data and it must be treated accordingly
  72. 72. Managing Consents § As consent must be reversible at will and at any time, it requires extra thinking to get it right § Also, parts of the service might use another legal basis and they should continue to operate even if consent is withdrawn § Further, there might be several consents asked throughout the service lifecycle § If possible, unify consent checking in the code into a library § Document the consent checking to keep the system internally uniform – when and what
  73. 73. Aggregation § Collecting all private data under a single service helps to tackle the individuals’ right to check their data § Implementing this is somewhat straightforward § When an individual wants to change or remove data, things become trickier § Deletion is straightforward if there is a single identifier for the individual across systems – this is rarely the case § Changing is a more complex operation, especially if the data has almost but not quite duplicate fields – for example, shipping address, billing address, address, registered address, etc. § The typical choices are § Do the changes manually, in other words add the request to a queue and handle it later § Require other systems to expose an API to control changes and deletion
  74. 74. Automation § If some task occurs frequently, it might make sense to automate it § If your organisation receives only a few GDPR related requests per year, documentation might be a better choice § The level of automation defines the cost § Simple scripts to clean an individual database vs. § One button to remove all personal data from all systems § Automation is not a silver bullet, use it only when it makes sense
  75. 75. Good Development Practices § Peer reviews – helps to raise quality on other matters, too § Auditing of third party components – must be based on risk § Automated, controlled, and repeatable process for deployments § Remove all manual work § Encryption of data at rest and in transit § Automatic anonymisation when moving data from production to staging or development § If not possible, have good and thorough processes that are also followed
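
A unified consent check, as suggested on the Managing Consents slide, might be sketched like this. The `PURPOSES` mapping, the `LegalBasis` enum, and the `ConsentRegistry` class are hypothetical names; the point is only that each purpose declares its legal basis once and every caller asks the same function.

```python
from enum import Enum


class LegalBasis(Enum):
    CONSENT = "consent"
    CONTRACT = "contract"
    LEGITIMATE_INTEREST = "legitimate_interest"


# Hypothetical mapping: each processing purpose documents its legal basis.
PURPOSES = {
    "newsletter": LegalBasis.CONSENT,
    "order_delivery": LegalBasis.CONTRACT,
}


class ConsentRegistry:
    def __init__(self) -> None:
        self._granted = set()

    def grant(self, subject_id: str, purpose: str) -> None:
        self._granted.add((subject_id, purpose))

    def withdraw(self, subject_id: str, purpose: str) -> None:
        self._granted.discard((subject_id, purpose))

    def may_process(self, subject_id: str, purpose: str) -> bool:
        basis = PURPOSES[purpose]  # KeyError for undocumented purposes
        if basis is LegalBasis.CONSENT:
            return (subject_id, purpose) in self._granted
        # Other legal bases keep working when consent is withdrawn.
        return True
```

Raising an error for an unknown purpose is a deliberate choice in this sketch: it forces new processing to be documented before it can run.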
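
Automatic anonymisation when copying production data to staging could be approached as below. The field list `PERSONAL_FIELDS`, the row layout with an `id` key, and the replacement format are assumptions for the example; real schemas need their own mapping.

```python
import hashlib

# Hypothetical list of columns that hold personal data.
PERSONAL_FIELDS = ("name", "email", "phone")


def anonymise_row(row: dict) -> dict:
    """Return a copy of the row with personal fields replaced by placeholders."""
    out = dict(row)
    # A stable token derived from the id keeps rows distinguishable in tests.
    token = hashlib.sha256(str(row["id"]).encode("utf-8")).hexdigest()[:8]
    for field in PERSONAL_FIELDS:
        if field in out:
            out[field] = f"{field}-{token}"
    if "email" in out:
        out["email"] = f"user-{token}@example.invalid"
    return out
```

Because the placeholder depends only on the row id, repeated exports produce the same staging data, which helps when reproducing bugs.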
  76. 76. Work Surrounding Technical Platforms
  77. 77. Privacy Policies § Privacy policy is the first and foremost tool to show your compliance with GDPR § It must be included in every service processing private data § Privacy policy must be kept up to date § Consider versioning it § Checking its validity should be in a release checklist § Also, all changes to private data handling should be documented – for example, written in the change log § Based on these changes, it should be relatively easy to see whether the privacy policy needs updates § The simpler the policy, the easier the update procedure § You cannot automate this
  78. 78. Privacy Policy – Contents § You need to define § Who is collecting the data § What information is collected and processed § Why it is collected – the purpose and legal basis for processing § Are there any transfers to third parties, and if yes to whom and where § How long the data is processed § How the individual can fulfil her rights and raise complaints
  79. 79. Deployments § Badly done deployments lead to increased security and privacy risks § Automate everything that is humanly possible § Remove every need for human interaction § If possible, make sure that the deployment can be rolled back
  80. 80. Maintenance § Maintenance process of digital services should be governed by data protection policies § Data security in maintenance is usually directed at attack vectors on a platform – not preventing data leaks § Data security should also focus on preventing data leaks instead of penetration protection § Of course, systems implemented well from privacy standpoint need to be compromised before a leak can take place § Keep privacy debt in discussion when doing small-scale development § Quick fixes may have a very long and tedious tail
  81. 81. Backups § Data in backups is also under GDPR § There are no clear instructions how to deal with backups § One solution is to have a backup cycle shorter than 30 days – the limit of responding to queries of users § The integrity of the backups must be kept § In other words, they should not be tampered with when removing user’s data from the system § Backups should have similar retention period as other data § And if you need to do a restore after removing or correcting user’s data, you need to replay the changes
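
Replaying changes after a restore presupposes that erasure requests are recorded somewhere outside the backed-up data. The sketch below assumes a hypothetical `erasure_log` and a user store keyed by subject identifier; both are illustrations only.

```python
from datetime import datetime, timezone

# Hypothetical log kept separately from the data it refers to.
erasure_log = []


def erase(users: dict, subject_id: str) -> None:
    """Remove a subject and remember the request for later replays."""
    users.pop(subject_id, None)
    erasure_log.append(
        {"subject_id": subject_id, "erased_at": datetime.now(timezone.utc)}
    )


def replay_erasures(restored_users: dict) -> int:
    """Re-apply logged erasures to freshly restored data; returns the count."""
    replayed = 0
    for entry in erasure_log:
        if restored_users.pop(entry["subject_id"], None) is not None:
            replayed += 1
    return replayed
```

Note that the log itself contains identifiers and so needs its own retention period, for example the same one as the backups it protects.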
  82. 82. Data Portability § GDPR requires the controller to provide the data in an interchangeable format, should one exist § Currently, there are few cases that provide interchangeable formats § The world might move towards more uniformity in the future § This requires, of course, a first mover that sees business benefits of having interchangeable format § Or an open source project that does this with “the right thing to do” mentality § Until then, it is sufficient to provide the data in machine readable format
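
Until interchange formats exist, a machine readable export can be as simple as a JSON document collected from every system that holds data about the person. In this sketch the `sources` mapping and the function name `export_subject_data` are assumptions; each source is a callable that returns that system's data for one subject.

```python
import json


def export_subject_data(subject_id: str, sources: dict) -> str:
    """Collect one subject's data from every source into a JSON document."""
    payload = {
        "subject_id": subject_id,
        "data": {name: fetch(subject_id) for name, fetch in sources.items()},
    }
    return json.dumps(payload, indent=2, ensure_ascii=False, sort_keys=True)


# Hypothetical sources standing in for a CRM and a web shop.
sources = {
    "crm": lambda sid: {"email": "jane@example.com"},
    "shop": lambda sid: {"orders": [1001, 1002]},
}
document = export_subject_data("user-42", sources)
```

Registering every data store as a source also doubles as a checklist: a system missing from the mapping is a system missing from the export.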
  83. 83. Thank You! Questions? Comments?
  84. 84. drupalcampbaltics.com
