The Application Economy forces IT to move at the same speed as the business. At the same time, the entry into force of strict new security regulations requires adapting the Software Delivery LifeCycle so that these requirements can be implemented and tested from the earliest phases of development, optimizing delivery times and minimizing time to market.
Pronti per la legge sulla data protection GDPR? No Panic! - Stefano Sali, Domenico Maracci - Codemotion Milan 2016
1. Pronti per la legge sulla data protection
GDPR? No Panic!
Stefano Sali
Domenico Maracci
MILAN 25-26 NOVEMBER 2016
2. 1 > What is GDPR
2 > Highlights & Key Impacts
3 > How to approach GDPR from an IT Security Perspective
4 > Demo
3. • Repeals and replaces the original Data Protection Directive
• A single set of rules will apply to all EU member states
GDPR
General Data Protection Regulation 2016/679
4. DIRECTIVE
A "directive" is a legislative act that sets out a goal that all EU countries must
achieve. However, it is up to the individual countries to devise their own laws on
how to reach these goals.
REGULATION
A "regulation" is a binding legislative act. It must be applied in its entirety
across the EU.
REGULATION vs DIRECTIVE
What is the difference between a Regulation (e.g. GDPR) and a Directive (e.g. PSD2)?
5. PRIMARY OBJECTIVES OF GDPR
DATA SUBJECTS RIGHTS
to give citizens back the control of their personal data
HARMONISATION
to simplify the regulatory environment for international
business by unifying the regulation within the EU
6. GDPR DEFINITIONS
• Any information relating to an identified or identifiable
natural person 'data subject'; an identifiable person is
one who can be identified, directly or indirectly
• Name
• ID number
• Location or address
• Physical (gender, color, age, stature, etc.)
• Genetic (includes inherited or acquired characteristics, health data/PII, race)
• Physiological (disability, mental health)
• Economic, creed or social identity
• May include online identifiers including IP address,
cookies if they can be easily linked back to the data
subject.
• No distinction between personal data about individuals
in their private, public or work roles
PERSONAL DATA
7. GDPR DEFINITIONS
Personal Data Breach means a breach of
security leading to the accidental or unlawful
destruction, loss, alteration, unauthorised
disclosure of, or access to, personal data
transmitted, stored or otherwise processed
PERSONAL DATA BREACH
Data controllers must notify most data breaches to the DPA. This must be done without undue delay and, where feasible, within 72 hours of becoming aware of the breach; a reasoned justification must be provided if this timeframe is not met (Art. 33).
In some cases, the data controller must also notify the affected data subjects without undue delay (Art. 34)
8. The GDPR establishes a tiered approach to penalties for breach, which enables the DPAs to impose fines of up to €20M or 4% of annual worldwide turnover (whichever is higher) if full compliance cannot be demonstrated (Art. 83)
GDPR FINES
ARTICLE 83
9. Regulation applies to the processing of personal data
in the context of the activities of an establishment of
a controller or a processor in the Union, regardless of
whether the processing takes place in the Union or not.
(Art. 3)
Article 5.1(f) needs to be taken into account because it
literally states: “Personal data should be processed in a
manner that ensures appropriate security of personal
data, including protection against unauthorized or
unlawful processing and against accidental loss,
destruction or damage, using appropriate technical or
organizational measures (‘integrity and
confidentiality’).”
10. Excerpt
One of the most important topics included in this Regulation is the chapter devoted to the rights of the data subject. The bar has been raised and new rights have been included that will profoundly impact the way IT will need to process and control personal data. While the traditional rights of access (Art. 15), rectification (Art. 16), erasure (Art. 17) and objection (Art. 21) remain largely the same, a new right has been introduced: the right to data portability (Art. 20), together with changes to the right to erasure, which now includes the concept of the right to be forgotten (Art. 17), and the inclusion of the right to restriction of processing (Art. 18). A minimal sketch of a portability export follows below.
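To make the right to data portability concrete, here is a minimal sketch of a portability export. It assumes a recent .NET runtime with System.Text.Json; the DataSubjectProfile type and its fields are hypothetical stand-ins for whatever personal data the organization actually holds about the subject.

using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical container for the personal data held about one data subject.
public class DataSubjectProfile
{
    public string SubjectId { get; set; }
    public string FullName { get; set; }
    public string Email { get; set; }
    public List<string> ConsentedPurposes { get; set; } = new List<string>();
    public DateTime CollectedOn { get; set; }
}

public static class PortabilityExport
{
    // Produce a structured, machine-readable export the subject can take to another controller.
    public static string ToPortableJson(DataSubjectProfile profile)
    {
        return JsonSerializer.Serialize(profile, new JsonSerializerOptions { WriteIndented = true });
    }
}

The only point of the sketch is that the export must be structured and machine-readable; the format (JSON here) and the fields are an assumption of the example, not a requirement of the Regulation.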
11. Excerpt
Art. 25 “The controller shall implement appropriate technical and organisational
measures for ensuring that, by default, only personal data which are necessary for each
specific purpose of the processing are processed. That obligation applies to the amount of
personal data collected, the extent of their processing, the period of their storage and
their accessibility. In particular, such measures shall ensure that by default personal data
are not made accessible without the individual's intervention to an indefinite number of
natural persons”. And article 30 mandates the recording of processing activities.
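As an illustration of what "recording of processing activities" can look like from an IT perspective, the following data structure is loosely modelled on Art. 30(1). It is purely illustrative (not a product API); a real registry would persist such entries and keep them up to date.

using System;
using System.Collections.Generic;

// Illustrative record of one processing activity, loosely following Art. 30(1) GDPR.
public class ProcessingActivityRecord
{
    public string ControllerName { get; set; }                        // controller (and DPO contact, where applicable)
    public string PurposeOfProcessing { get; set; }                   // e.g. "payroll", "newsletter delivery"
    public List<string> CategoriesOfDataSubjects { get; set; } = new List<string>();
    public List<string> CategoriesOfPersonalData { get; set; } = new List<string>();
    public List<string> CategoriesOfRecipients { get; set; } = new List<string>();
    public string ThirdCountryTransfers { get; set; }                 // transfers outside the EU, if any
    public TimeSpan? RetentionPeriod { get; set; }                    // envisaged time limit for erasure
    public string SecurityMeasures { get; set; }                      // technical and organisational measures, in brief
}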
12. Excerpt
Data can only be used if: explicit consent has been given for its use for the specific purpose; it is necessary for legal purposes (e.g. to fulfil a contract, the subject's vital interests); it is necessary for the public interest; or for a legitimate interest of the controller.
Organizations need to mask personal data and other sensitive data, or work with a subset of production data for testing.
To realize the full benefits of better test data management, you should strongly consider implementing synthetic data generation, as well as how you store, manage and provision data (a minimal pseudonymisation sketch follows below).
Anonymisation and
Pseudonymisation
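As a minimal sketch of the pseudonymisation mentioned above (not a CA tool; the helper and its key handling are invented for the example), a direct identifier can be replaced with a keyed hash: records remain linkable for whoever holds the key, while the raw value is not exposed in test or analytics copies. The secret key must live somewhere else, e.g. in a vault.

using System;
using System.Security.Cryptography;
using System.Text;

public static class Pseudonymiser
{
    // Replace a direct identifier (e.g. an e-mail address or fiscal code) with a deterministic pseudonym.
    // HMAC-SHA256 with a secret key keeps the mapping reproducible for the key holder
    // and infeasible to reverse for anyone else.
    public static string Pseudonymise(string identifier, byte[] secretKey)
    {
        using (var hmac = new HMACSHA256(secretKey))
        {
            byte[] digest = hmac.ComputeHash(Encoding.UTF8.GetBytes(identifier.Trim().ToLowerInvariant()));
            return Convert.ToBase64String(digest);
        }
    }

    // Simple masking for display or test copies: keep only the last few characters visible.
    public static string Mask(string value, int visibleChars = 2)
    {
        if (string.IsNullOrEmpty(value) || value.Length <= visibleChars) return "****";
        return new string('*', value.Length - visibleChars) + value.Substring(value.Length - visibleChars);
    }
}

Note that pseudonymised data is still personal data under the GDPR (Recital 26); only data that has been truly anonymised falls outside its scope.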
13. Key Impacts for IT Organizations: Summary
A Few Words to Review
1. DISCOVER PERSONAL DATA ACROSS YOUR ORGANIZATION AND PROTECT IT FROM UNAUTHORIZED ACCESS - PREVENT DATA LOSS
2. CENTRALIZE USER IDENTITY MANAGEMENT AND ACCESS CONTROL, IN PARTICULAR (BUT NOT EXCLUSIVELY) FOR PRIVILEGED USERS
3. MANAGE AND OPTIMIZE THE USE OF TEST DATA IN YOUR SOFTWARE DEVELOPMENT LIFECYCLE AND CONSIDER IMPLEMENTING SYNTHETIC DATA GENERATION
4. EXPOSE PERSONAL DATA TO THE DATA SUBJECT IN A SECURE AND AUDITABLE WAY BY USING ENTERPRISE API GATEWAYS / PORTALS
14. How to approach GDPR from
an IT Security Perspective
• New environments
• Existing environments
• Tools available for Application Developers
• Demo
15. New environments
Main goals: contain data leakage & control data use
• Use a single repository
• Access through APIs
• Technical Access through Password Vault
• Agents on end user workstations and email servers for
Data Protection
• Data Classification Engine usable by Applications
• Big Data Lens
16. Existing environments
in addition to “new env” activities: Identify GDPR data
• GDPR Data Identification and Classification (a simplified discovery sketch follows after this list)
• Data at rest
• Database
• File System
• Cloud Storage
• Usage Monitoring
• Email
• Workstation
• Network
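As a deliberately simplified illustration of identifying GDPR data at rest on a file system (see the forward reference in the list above): the two patterns below are invented for the example, and a real deployment would rely on a classification engine such as the CCS described later, with validated, locale-aware policies.

using System;
using System.IO;
using System.Text.RegularExpressions;

class PiiFileScanner
{
    // Tiny illustrative pattern set: an e-mail address and an Italian fiscal code.
    static readonly (string Name, Regex Pattern)[] Patterns =
    {
        ("Email",      new Regex(@"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
        ("FiscalCode", new Regex(@"\b[A-Z]{6}\d{2}[A-Z]\d{2}[A-Z]\d{3}[A-Z]\b"))
    };

    static void Main(string[] args)
    {
        string root = args.Length > 0 ? args[0] : ".";
        // Walk the tree and report files that look like they contain personal data.
        foreach (string file in Directory.EnumerateFiles(root, "*.txt", SearchOption.AllDirectories))
        {
            string content = File.ReadAllText(file);
            foreach (var (name, pattern) in Patterns)
            {
                int hits = pattern.Matches(content).Count;
                if (hits > 0)
                    Console.WriteLine($"{file}: {hits} match(es) for {name}");
            }
        }
    }
}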
17. Existing environments
in addition to “new env” activities: Identify GDPR data
• Transformation for managing access to GDPR Data
• Data Centralization
• Data accessible through APIs
• API/SOA Transformation Layer for minimal impact on
Applications (that already use WS/APIs)
• API Gateway layer with Data Classification Engine
• Technical and Privileged Accounts through Password Vault
• Synthetic Data for test and pre-production environments
18. • Identifying key data to make decisions on how it is stored, transmitted, used and secured across distributed systems
• Data discovery and control across multiple channels
• At rest
• In motion
• In Use
• At Access
• Active prevention of data breaches working on different access methods
• Email / Email Server
• Web
• Desktop (save/print/usb/cd/…)
• Mobile
• Network
• Cloud Storage
• Dynamic Data Classification and Customizable Active Policy Design
Discover and Prevent Data Loss
19. • Content Classification Service (CCS) provides the Data Classification
Engine to Application Developers
• Automatic Classification of sensitive content within on-premise,
virtual and cloud environments
• Provide visibility into the location and sensitivity of data
• Real-time visibility into sensitive data classifications is consumable by 3rd party software through web services APIs
• Personally Identifiable Information (PII) Policies already available (for
Italian language too)
Discover and Prevent Data Loss
for Application Developers
20. CCS APIs High Level Overview
• Authentication:
• Windows Client Authentication over HTTP or HTTPS
• Mutual authentication using certificates over HTTPS.
• Message level security using the SOAP WS* standards.
• API Interfaces:
• CCSDictionary: access to the list of Dictionary Classifications that are available for the calling
application. These classifications are available to be applied to content that CCS has
classified.
• CCSClassify: classify new content, and to return a list of existing classifications for content
that was previously classified.
• Methods:
• GetDictionaryClassifications: returns the full list of Dictionary Classifications that are
currently defined in the CCS.
• DictionaryVersion: returns the current Dictionary API version.
• Classes
• DictionaryArgs, DictionaryResult, DictionaryClassificationsList, DictionaryClassificationItem
• Classify, ClassifyVersion
• …
The information in these sections describes the methods and options that the CCS web service provides. You can obtain the interface definition as a Web Services Description Language (WSDL) document from a running instance of the CA Data Protection service.
21. Example: Retrieve Full List of Dictionary Classifications*
// Create the proxy class that connects to the remote service.
CCS.CCSDictionaryClient proxyDictionary = new CCS.CCSDictionaryClient();
// Create the arguments object and set options
CCS.DictionaryArgs dictionaryArgs = new CCS.DictionaryArgs();
// Set the locale
dictionaryArgs.Locale = "en-gb";
// Call the Dictionary service to get the dictionary classifications
// which are returned in a DictionaryResult instance
CCS.DictionaryResult dictionaryResult =
proxyDictionary.GetDictionaryClassifications(dictionaryArgs);
// A real client would check the error status here
Console.WriteLine("Severity: {0} ErrorCode: {1} = {2}",
dictionaryResult.Severity.ToString(),
dictionaryResult.ErrorCode.ToString(),
dictionaryResult.ErrorMessage);
// Process the classifications from the list
foreach (var classItem in dictionaryResult.DictionaryClassifications)
{
Console.WriteLine("ClassificationID: {0}", classItem.ClassificationID);
}
// The proxy instance can be used for multiple calls but should
// be closed when no longer required.
proxyDictionary.Close();
* Example based on consuming the WSDL from a C# client application in Visual Studio 2010 with the .NET 4.0 Framework
22. Example: Get Existing Classifications and Classify if Required*
// Create the proxy class that connects to the remote service.
CCS.CCSClassifyClient proxyClassify = new CCS.CCSClassifyClient();
// Create the arguments object and set options
CCS.ClassificationArgs classifyArgs = new CCS.ClassificationArgs();
// Set the locale if required
classifyArgs.Locale = "en-gb";
// Set action so to classify if required
classifyArgs.AnalyzeDataAction = CCS.AnalyzeDataActionType.MayAnalyze;
// Should the CCS time out and return early if it doesn't have a result?
classifyArgs.TimeOutMilliseconds = 0;
// We may not always want to know the classification results
classifyArgs.ReturnClassifications = true;
// Is the data passed by reference or included in this object
classifyArgs.DataLocation = CCS.DataLocationType.Reference;
// Add identifier
CCS.ContentIdentifierList itemList = new CCS.ContentIdentifierList();
CCS.ContentIdentifier item = new CCS.ContentIdentifier();
item.IdentifierType = CCS.ContentIdentifierType.URL;
item.Identifier = "http://myserver.com/Shared Documents/important.doc";
item.CanAccessContent = true;
item.CanRetrieveLastModifiedDate = false;
item.DoNotCacheIdentifier = false;
itemList.Add(item);
// Add the item list to the args object
classifyArgs.IdentifierList = itemList;
// Call the classifier
CCS.ClassifyResult csResult = proxyClassify.Classify(classifyArgs);
// Process the results
Console.WriteLine("Severity: {0} ErrorCode: {1} = {2}",
csResult.Severity.ToString(), csResult.ErrorCode.ToString(),
csResult.ErrorMessage);
if (csResult.ErrorCode.Equals(0))
{
// Process the classifications
foreach (var classItem in csResult.Classifications)
{
Console.WriteLine("ClassificationID: {0}", classItem.ClassificationID);
}
}
// The proxy instance can be used for multiple calls but should
// be closed when no longer required.
proxyClassify.Close();
* Example based on consuming the WSDL from a C# client application in Visual Studio 2010 with the .NET 4.0 Framework
23. Discover the Content and Protect
Data in Mainframe Environment
• Identifying key data to make decisions on how it is stored, transmitted and secured
• Classifying data that is considered as sensitive
• Alerting and preventing when sensitive data is accessed or when it
leaves a secured environment
• Providing audit information: Classification and Accesses
• Finding and notifying potential data breaches as mentioned in GDPR Articles 21, 32 and 33
• Using ad-hoc reports to demonstrate compliance.
• Using CA Data Content Discovery predefined PII policies to match the
code of conduct and certification in GDPR (Article 38a) and privacy by
design (Article 25)
24. Network Perimeter
[Diagram: external and internal threats against the network perimeter. A threat actor or trusted insider gains/expands access, performs lateral movement and reconnaissance, elevates privilege, wreaks havoc, and exfiltrates data/IP to C&C. Typical weaknesses:]
• Weak authentication / default passwords
• Stolen/compromised credentials
• Poor password/key management
• Shared accounts / lack of attribution
• Authentication = access control: no limits on lateral movement, no limits on commands
• Lack of monitoring/analysis
Use Technical and Privileged
Accounts without Security Pains
• Privileged accounts are always
the target for any attacker
• All PII data can be at risk if a
privileged user is
compromised
• PAM with Threat Analytics can detect privileged account anomalies or misuse and act according to security rules
• Technical accounts can be managed too, without impact on applications (a hypothetical A2A sketch follows after the diagram notes below)
- NO external element within the service chain for any JDBC request
- NO single point of failure or performance/scalability/availability issues
- NO changes for application developers or datasource definitions
- Roll-back always available: Plug-n-Run / Un-Plug-n-Run
[Diagram: Application / App Server with configuration files issuing a JDBC call to the RDBMS; PAM SC Agent and A2A Agent sit alongside the app server]
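The point about managing technical accounts "without impact on applications" can be illustrated with a hypothetical application-to-application (A2A) call: instead of reading a database password from a configuration file, the application asks a vault for it at runtime. The endpoint, credential alias and connection string below are invented for the example and do not describe CA PAM's actual API.

using System;
using System.Net.Http;
using System.Threading.Tasks;

class VaultBackedConnection
{
    // Hypothetical vault endpoint and credential alias; a real A2A agent would also
    // authenticate the calling application (e.g. by executable fingerprint) before answering.
    private const string VaultUrl = "https://vault.example.internal/api/credentials/billing-db";

    static async Task<string> GetDbPasswordAsync()
    {
        using (var client = new HttpClient())
        {
            // The password is fetched just-in-time and never written to disk or config files.
            return await client.GetStringAsync(VaultUrl);
        }
    }

    static async Task Main()
    {
        string password = await GetDbPasswordAsync();
        string connectionString = $"Server=db01;Database=billing;User Id=svc_billing;Password={password};";
        Console.WriteLine("Connection string assembled without a hard-coded secret.");
        // ... open the connection with ADO.NET as usual ...
    }
}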
25. Securely Expose Personal Data to the Data Subject
• API Management integrated with CCS makes it possible to comply with the GDPR without changing current applications
• API Live Creator might be used to build new APIs that include the appropriate controls and expose the information needed to third parties as APIs (a hedged sketch follows after the diagram notes below)
[Architecture diagram: an API Management layer mediates between consumers outside the enterprise (end users, applications, developers) and assets within the enterprise (generic storage, RDBMS, web services, Big Data Lens, generic user directory). A Data Classification Engine provides GDPR data classification: real-time classification of PII data, API-driven classification, event-triggered actions, and content-aware access control. The API layer integrates and creates APIs: easily connects SOA, ESB and legacy applications, aggregates data including NoSQL up to 10x faster, automatically creates data APIs with live business logic, and performs protocol transformation (SOA, REST, JSON, SQL, ...).]
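As a toy stand-in for what a gateway policy does when exposing personal data to the data subject (authenticate the caller, return only that subject's own data, audit every access), the sketch below is purely illustrative and is not CA API Management; the store, the identifiers and the token handling are assumptions of the example.

using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical handler standing in for a gateway-protected "GET /subjects/{id}" API.
class SubjectDataApi
{
    // In-memory stand-in for the systems of record behind the gateway.
    static readonly Dictionary<string, object> Store = new Dictionary<string, object>
    {
        ["42"] = new { FullName = "Mario Rossi", Email = "mario.rossi@example.com" }
    };

    // callerSubjectId would normally come from a validated OAuth/OpenID token.
    static string HandleGet(string callerSubjectId, string requestedSubjectId)
    {
        // Content-aware access control: a subject may only read their own record.
        if (callerSubjectId != requestedSubjectId)
        {
            Audit("DENIED", callerSubjectId, requestedSubjectId);
            return "403 Forbidden";
        }
        if (!Store.TryGetValue(requestedSubjectId, out var record))
        {
            Audit("NOT_FOUND", callerSubjectId, requestedSubjectId);
            return "404 Not Found";
        }
        Audit("ALLOWED", callerSubjectId, requestedSubjectId);
        return JsonSerializer.Serialize(record);
    }

    // Every access to personal data leaves an auditable trace.
    static void Audit(string outcome, string caller, string subject) =>
        Console.WriteLine($"{DateTime.UtcNow:o} caller={caller} subject={subject} outcome={outcome}");

    static void Main()
    {
        Console.WriteLine(HandleGet("42", "42"));   // allowed: subject reads their own data
        Console.WriteLine(HandleGet("42", "99"));   // denied and audited
    }
}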
26. Manage Test Data in SDLC
It will be much harder to use production data for testing and
development
The GDPR will strengthen existing legislation forbidding the
use of personal data for reasons other than why it was
given
Data can only be used if:
explicit consent has been given for its use for the specific purpose
necessary for legal purposes (e.g. to fulfil a contract, the subject's vital interests)
it is necessary for the public interest, or for a legitimate interest of the controller
Data shall not be retained “beyond the minimum
necessary, in terms of amount of the data and time of their
storage”, and shall not be made accessible to an indefinite
number of individuals
27. CA Test Data Manager
enables testers to get test data fit for purpose, on time, with guaranteed consistency, better coverage, and compliance with regulations
28. As-Is: What most people do
[Diagram: App 1 and App 2 use a Production DB; production data is (subset &) masked and copied into the Test DBs used by App 1 and App 2]
30. What is synthetic data ?
• Data generated respecting all format and integrity
constraints of the target system
• Create accurate combinations of high-quality and up-to-date test data
• Purpose is to cover more scenarios with less data
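A minimal sketch of what "generated respecting format and integrity constraints" can mean in practice. This toy generator is not CA Test Data Manager; it only shows the idea of valid value pools, a reproducible seed and unique keys, so that no real customer ever appears in the test set.

using System;
using System.Collections.Generic;

class SyntheticCustomerGenerator
{
    static readonly string[] FirstNames = { "Anna", "Luca", "Marta", "Paolo" };
    static readonly string[] LastNames = { "Bianchi", "Conti", "Ferrari", "Russo" };

    // A fixed seed makes the data set reproducible across test runs.
    static IEnumerable<(int Id, string Name, string Email, DateTime BirthDate)> Generate(int count, int seed = 1)
    {
        var rnd = new Random(seed);
        for (int id = 1; id <= count; id++)
        {
            string first = FirstNames[rnd.Next(FirstNames.Length)];
            string last = LastNames[rnd.Next(LastNames.Length)];
            // Format constraints: the e-mail shape is valid, birth dates fall in a plausible range,
            // and Id is unique so that foreign keys in dependent tables stay consistent.
            yield return (
                id,
                $"{first} {last}",
                $"{first}.{last}.{id}@example.test".ToLowerInvariant(),
                new DateTime(1950, 1, 1).AddDays(rnd.Next(0, 20000)));
        }
    }

    static void Main()
    {
        foreach (var c in Generate(5))
            Console.WriteLine($"{c.Id};{c.Name};{c.Email};{c.BirthDate:yyyy-MM-dd}");
    }
}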
31. Constraints solved by synthetic data provided by CA
CONSTRAINT → RESOLUTION
• Regulations & compliance → 0% risk, as data is generated
• Data functional coverage → measure & get 100% coverage; capacity to identify data ‘holes’ by comparison between environments, or directly identify data from test cases
• Test DB size → just the most efficient set of data
• Delay to provision → provisioning in minutes, on-demand, through the web portal; capacity to book data for each tester
32. CA Test Data Manager
[Diagram: CA Test Data Manager workflow, with numbered steps (1-6) covering production data/files, a data model, secure data subsets, substitution variables and combinable functions, synthetic data generation, generation and bulking scripts, and a Test Data Warehouse that provisions test/dev environments in multiple formats (XML, XLS, SQL, CSV, HTML, TXT and FD files, APIs, NoSQL)]
Editor's Notes
Un "regolamento" è un atto legislativo vincolante. Si deve essere applicato nella sua interezza in tutta l'UE
Una "direttiva" è un atto legislativo che prevede un obiettivo che tutti i paesi dell'UE devono raggiungere. Tuttavia, spetta ai singoli paesi per elaborare le proprie leggi su come raggiungere questi obiettivi.
By using modern software such as CA API Management, organizations can include a front end that permits compliance with the regulation without the need to change current applications. In addition, CA API Live Creator might be used to build new APIs that include the appropriate controls and expose the information needed to third parties.
Simply comparing the cost of modifying every application that currently manages personal data inside your organization with the cost of putting in place one single, standardized interface (which can also be reused to comply with other industry regulations) is enough to understand the benefits of this approach.
For deeper info, visit http://transform.ca.com/beyond-masking-subsetting.html
Data can only be used if: explicit consent has been given for its use for the specific purpose; it is necessary for legal purposes (e.g. to fulfil a contract, the subject's vital interests); it is necessary for the public interest; or for a legitimate interest of the controller.
You need to mask personal data and other sensitive data, or get a subset of production data for testing.
Organizations wishing to realize the full benefits of better test data management must strongly consider implementing synthetic data generation, as well as how they store, manage and provision data.
Synthetic data generation is not only more effective in terms of time, quality and money, but also often proves to be easier and more secure than fully masking production data - with the right technology, processes and structural team changes
So, why does this not work in the real world?
Story to tell:
1) Profile and model the existing data: build a multi-dimensional cube/model
2) Apply sophisticated data coverage techniques and data visualization; find missing or invalid data enterprise-wide, etc.
3) Synthetically generate/enhance the data based on this model so that it can satisfy every possible test