This document discusses how Index Engines technology can help organizations comply with the General Data Protection Regulation (GDPR). It provides an overview of the GDPR requirements and highlights Index Engines' capabilities such as data classification, reporting, disposition, and automated monitoring that allow organizations to know, manage, and govern their enterprise data in accordance with the GDPR. Index Engines provides a single platform to classify petabytes of data, enable flexible search and disposition of personal data, and demonstrate ongoing compliance through automated policy monitoring and auditing.
Enterprise Data Classification and Disposition for GDPR Compliance
1. Enterprise Data Classification and Disposition Technology Introduction
Supporting GDPR Compliance through Data Classification
2. Index Engines Introduction
▪ Enterprise Class Information Management Platform
▪ Purpose-built, high-speed indexing for global data centers
▪ Scalable platform that supports petabytes of unstructured data and email
▪ Only solution to support both network and backup data sources
▪ Find, manage and govern data based on policies
▪ Corporate Profile
▪ Private company headquartered in Holmdel, NJ
▪ Founded in 2004
▪ Partnered with Dell EMC, Amazon/AWS, EY, FTI
▪ Patented technology
▪ Clients include: JPMC, Citi, Barclays, TIAA-CREF, State of CA, DOJ, Catholic Health, Cincinnati Children’s, Qualcomm, Merck
Copyright Index Engines Inc. 2017 All rights reserved. 2
3. The GDPR Overview
▪ Changes how organizations manage personal data
▪ Puts ownership of data back in the hands of citizens
▪ Rights to access, rectify, erase, restrict, migrate, etc.
▪ Significant penalties (up to 4% of annual worldwide turnover) and restrictions based on non-compliance
▪ Requires detailed knowledge and management of content containing personal information
▪ Home address, email address, license plate, credit cards, etc.
4. The Articles of the GDPR
Nine key articles focus on areas where the right technology platform can make easy work of the regulation:
▪ Article 15: Right of access by the data subject
▪ Article 16: Right to rectification/correction of data
▪ Article 17: Right to erasure (right to be forgotten)
▪ Article 18: Right to restriction of processing
▪ Article 20: Right to data portability
▪ Article 21: Right to object
▪ Article 22: Automated individual decision-making, including profiling
▪ Article 25: Data protection by design and default
▪ Article 35: Data protection impact assessments
5. Index Engines Support for the GDPR
Know IT
▪ Data Classification & Profiling
▪ Enterprise class indexing software
▪ Metadata, full text, pattern/regex/PII, security ACLs, activity logs
▪ Reporting & classification on user files and email
Manage IT
▪ Defensible Disposition
▪ Delete, copy or migrate
▪ Integrated archiving & preservation
▪ Defensible audit trails and logs
Govern IT
▪ Automation and Monitoring
▪ Ongoing monitoring
▪ Automated management based on policy
▪ Instant access to personal data
6. Classify Data for Simplified Access and Management
Classify data by Active Directory group membership
Example: Client Services, HR
Use metadata to filter on data that typically contains personal information
Example: Documents, email, etc.
Tag this content for easy access and future queries
Example: PII/RegEx or person's name
▪ Create an automated data map based on a range of criteria
▪ Classify content to focus on areas that are highly likely to contain personal information
▪ Allows for more targeted and simplified search and audits
7. Classification for ROT Analysis and Clean Up
▪ Clean Redundant, Obsolete & Trivial (ROT) content to simplify data management & prepare for the GDPR
▪ ROT can comprise up to 40% of network data
Classify Data
• Duplicate content
• Aged data, not accessed in more than X years
• Abandoned data, owned by ex-employees and not accessed
• Non-Business Multimedia files: photos, videos, audio (iTunes)
• Trivial files: log files, iTunes music, personal vacation photos, etc.
Defensible Disposition
• Migrate non-active data that should be preserved to cloud
• Delete content with no business value, maintaining full audit trail
• Archive non-active data with personal information for further investigation
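The ROT categories above can be expressed as simple rules over file metadata. The following is a minimal sketch and not Index Engines' implementation: the `FileRecord` fields, the extension list, the 3-year obsolescence threshold, and the 1-year threshold for abandoned data are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FileRecord:
    """Hypothetical metadata record for one indexed file."""
    path: str
    content_hash: str      # identical hashes indicate duplicate content
    last_accessed: date
    owner_active: bool     # False if the owner is no longer in the directory
    extension: str


# Illustrative list only; a real policy would be agreed with the business.
TRIVIAL_EXTENSIONS = {".log", ".mp3", ".m4a", ".mov", ".mp4", ".jpg"}


def classify_rot(files, today, max_age_years=3):
    """Label each file as redundant, trivial, abandoned, obsolete, or keep."""
    seen_hashes = set()
    labels = {}
    for f in files:
        age_years = (today - f.last_accessed).days / 365.25
        if f.content_hash in seen_hashes:
            label = "redundant"   # a copy of content already seen
        elif f.extension.lower() in TRIVIAL_EXTENSIONS:
            label = "trivial"     # logs and personal multimedia
        elif not f.owner_active and age_years > 1:
            label = "abandoned"   # ex-employee data that is no longer accessed
        elif age_years > max_age_years:
            label = "obsolete"    # aged data past the chosen threshold
        else:
            label = "keep"
        seen_hashes.add(f.content_hash)
        labels[f.path] = label
    return labels
```

The first copy of a duplicated file is kept and later copies are marked redundant; each label can then drive one of the disposition options listed above (migrate, delete, or archive).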
8. ROT Analysis: Classification of Redundant, Obsolete & Trivial Content
Capacity by Department (Total Capacity 5,000TB)
Department      Percentage   Capacity (TB)   # Files (B)
Legal           17%          850             42.5
Marketing       18%          900             45
Finance         12%          600             30
HR              28%          1,400           70
Operations      8%           400             20
Manufacturing   17%          850             42.5
Redundant Content (TBs): Legal 248, Mktg 270, Fin 240, HR 560, Oper 123, Mfg 170
[Chart: Active Data, Last Accessed in Last Year, by department (Legal, Mktg, Fin, HR, Oper, Mfg); series "1 Year" and "> 1 Year"; plotted values 8, 2, 7, 12, 22, 5 and 92, 98, 93, 88, 78, 95]
[Chart: Abandoned Data, Ex-Employee based on Active Directory (TBs), by department; series "Accessed in Past Year" and "Not Accessed in > 1 Year"]
[Chart: Obsolete Content by Last Accessed (TBs); bands < 1 Year, 2 - 3 Years, 3 - 4 Years, 4+ Years]
[Chart: Trivial Files (TBs); categories Logs, Video, Photos, Music, Other]
[Chart: Email Audit, # of PSTs on Shared Network by Department (Legal, Mktg, Fin, HR, Oper, Mfg); plotted values 1,256, 35, 989, 768, 49, 88]
Note: Charts generated using 3rd party software based on Index Engines data
9. Index Engines Reporting and Classification Features
▪ Supports petabyte-class data center environments, 1% index footprint for metadata
▪ Federated search, reporting and archiving for large scale, distributed data
▪ High speed indexing, reaching up to 1TB/hour/node
▪ Active Directory integration to group data by departments
▪ Tagging to classify data based on any criteria
▪ Flexible queries and reporting on:
▪ Metadata
▪ Full text and keyword
▪ Boolean search including proximity
▪ Pattern/PII including credit cards, bank routing, social security, etc.
▪ Regular expression, POSIX basic and extended
▪ Conceptual search (coming soon)
▪ Security ACLs, read/write/browse permissions
▪ Activity logs reporting on user access to specific files
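To put the throughput and footprint figures in perspective, the sketch below works through the arithmetic for a 1 PB environment. It assumes decimal units (1 PB = 1,000 TB), treats the quoted "up to 1TB/hour/node" as a sustained rate, and assumes perfect scaling across nodes, so the result is a best-case estimate rather than a sizing guarantee. The function names are illustrative.

```python
def indexing_hours(data_tb, nodes, tb_per_hour_per_node=1.0):
    """Best-case hours to index data_tb, assuming linear scaling across nodes."""
    return data_tb / (nodes * tb_per_hour_per_node)


def metadata_index_tb(data_tb, footprint=0.01):
    """Approximate metadata index size at the quoted 1% footprint."""
    return data_tb * footprint


# 1 PB (taken as 1,000 TB) indexed by 10 nodes at the peak rate.
hours = indexing_hours(1000, nodes=10)
index_size = metadata_index_tb(1000)
print(f"{hours:.0f} hours, about {index_size:.0f} TB of metadata index")
```

Under these assumptions, 1 PB takes roughly 100 hours on 10 nodes and produces a metadata index of about 10 TB; real-world rates below the peak would lengthen the indexing time proportionally.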
10. Search and Reporting Interface
11. The GDPR Requirements for Personal Data
▪ The GDPR defines personal data as:
“any information relating to an identified or identifiable natural person; an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person”
▪ Requires flexible and comprehensive search and reporting
▪ Personal Data: any information relating to an identified person, such as a name, an identification number, location data, online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that person.
▪ Sensitive Personal Data: personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade-union membership, and data concerning health, sex life, or sexual orientation
▪ Examples of Index Engines queries:
▪ Keyword: Name, address, TaxID, etc.
▪ Pattern: Bank routing, social security, credit card numbers, etc.
▪ RegEx: Phone numbers, postal code, license plate, UK bank sort code, email address, etc.
▪ Concept: train the query engine to identify personal data
▪ Queries can be combined with Boolean parameters (e.g. NEAR)
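The pattern, RegEx, and proximity query types listed above can be illustrated with Python's standard `re` module. This is a simplified sketch, not Index Engines' query syntax: the patterns are deliberately loose, the Luhn checksum is a common way to cut false positives on card numbers, and `near` is a hypothetical stand-in for a NEAR-style proximity operator.

```python
import re

# Illustrative patterns only; production patterns need country-specific tuning.
PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "uk_sort_code": re.compile(r"\b\d{2}-\d{2}-\d{2}\b"),
    # 13 to 19 digits, optionally separated by single spaces or hyphens.
    "card_candidate": re.compile(r"\b\d(?:[ -]?\d){12,18}\b"),
}


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:        # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text):
    """Return {pattern name: [matches]} for every pattern that hits."""
    hits = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if name == "card_candidate":
            matches = [m for m in matches if luhn_valid(re.sub(r"\D", "", m))]
        if matches:
            hits[name] = matches
    return hits


def near(text, term_a, term_b, max_words=5):
    """True if term_a and term_b occur within max_words words of each other."""
    words = re.findall(r"\w+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(i - j) <= max_words for i in pos_a for j in pos_b)
```

Combining a pattern hit with a proximity check (for example, a card number near the word "card") is one way to raise precision before a document is tagged as containing personal data.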
12. Departmental Drill Down
[Figure: one panel per department (Legal, Oper, HR, Fin, Mktg, Mfg). Each panel shows Last Accessed (TBs) by age band (< 1 Year, 2 - 3 Years, 3 - 4 Years, 4+ Years), Location of Data (TBs) by server (Server X, Server Y, Server Z, Other), Data Types (TBs) (Document, Spreadsheet, Presentation, Other), Documents Containing PII, and Location of PII (Files) by server. Documents Containing PII counts shown across the six panels: 17, 22, 857, 1,232, 5, 217.]
13. Security Assessments
▪ The GDPR mandates strict notifications and potential fines for data breaches
▪ Security assessments will find sensitive data and manage it proactively
▪ Examples of Index Engines capabilities:
▪ Find documents containing personal information
▪ Use Activity Logs to determine who has accessed these files to find potential rogue employees
▪ Use Access Control Lists (ACLs) to determine who has read/write/browse permission for sensitive files
▪ Proactively clean up sensitive data so it does not leak outside the firewall
▪ Proactively determine unusual access to sensitive content
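The ACL and activity-log checks above amount to joining three data sets: files known to contain personal information, their permissions, and the access log. The sketch below assumes simple in-memory structures (a set of paths, a dict of per-file permissions, and a list of `(user, path)` access events); these shapes and the function names are hypothetical and do not reflect an Index Engines export format.

```python
from collections import defaultdict


def permissions_on_sensitive(pii_files, acls):
    """Map each sensitive file to the users and permissions recorded in its ACL."""
    return {path: acls.get(path, {}) for path in sorted(pii_files)}


def unexpected_accesses(pii_files, activity_log, approved_users):
    """Group accesses to sensitive files by users outside the approved set."""
    flagged = defaultdict(list)
    for user, path in activity_log:
        if path in pii_files and user not in approved_users:
            flagged[user].append(path)
    return dict(flagged)
```

A sensitive file with an empty ACL entry is itself worth reviewing, since it means no permission data was captured for it.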
14. Defensible Disposition
▪ The GDPR gives citizens the right to access, rectify, erase, restrict or migrate their personal information
▪ The regulation defines timeframes for fulfilling these requests; missing them can lead to fines
▪ Examples of Index Engines integrated disposition capabilities:
▪ Find it for editing
▪ Find and export
▪ Migration/archiving to cloud storage
▪ Deletion
▪ Archive and restrict
15. The GDPR Workflow with Index Engines
1. First pass metadata classification.
2. Include metadata classification in data mapping interviews.
3. Defensible deletion: purge everything you can.
4. Second pass classification: identify personal data zones.
5. Deep analysis using conceptual search to identify PII.
6. Tag personal data for easy access.
16. Ongoing Monitoring and Automation
▪ The GDPR requires “monitoring compliance with the GDPR and other Union or Member State data protection laws, including managing internal data protection activities, training data processing staff, and conducting internal audits.”
▪ Organizations will need to show compliance with the GDPR through the use of technology and sound policies
▪ Examples of Index Engines capabilities:
▪ Store queries and policies
▪ Automated policy search and identification (email notifications)
▪ Automated policy search and identification (email notifications)
▪ Automated reporting with csv/text file for use in 3rd party reporting tools
▪ Automated indexing and preservation/archiving
▪ Activity logs/audit trails
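The idea of stored policies feeding an automated CSV report can be sketched as follows. This is an illustration under assumptions, not the product's policy format: `Policy`, its fields, and the report columns are hypothetical, and documents are represented as a plain dict of path to text.

```python
import csv
import io
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical stored policy: a named regex and the action to report."""
    name: str
    pattern: str
    action: str


def run_policies(policies, documents):
    """Return one report row per (policy, document) match."""
    rows = []
    for policy in policies:
        regex = re.compile(policy.pattern)
        for path, text in documents.items():
            if regex.search(text):
                rows.append({"policy": policy.name, "path": path,
                             "action": policy.action})
    return rows


def to_csv(rows):
    """Serialize report rows as CSV text for use in third-party reporting tools."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["policy", "path", "action"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Running the same stored policies on a schedule and keeping each CSV output gives a simple, dated record that can support an internal audit.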
17. Legacy Backup Data and GDPR
▪ The GDPR defines processing of data as: “any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction;”
▪ Personal data shall be kept in a form which permits identification of data subjects for no longer than is necessary for the purposes for which the personal data are processed; personal data may be stored for longer periods insofar as the personal data will be processed solely for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes
▪ Backup is an automated operation that processes personal data.
▪ Backup is not an archive! How can you find and purge data on old backup tapes?
▪ Index Engines delivers a solution to migrate data of value from tape to disk/cloud
▪ Go tapeless and eliminate the risk and liability of inaccessible backup data
▪ Examples of Index Engines capabilities:
▪ The only solution that provides direct access to backup data
▪ No need for the original backup software
▪ Profile and classify backup data
▪ Migrate data of value to disk/cloud with full search capability
▪ Go tapeless
18. Index Engines Key Advantages for the GDPR
Know IT: Enterprise data insight
• The only enterprise class indexing platform on the market today
• Supports all classes of data from primary storage to backup content
Manage IT: Streamlined disposition
• Classify and report on content across the data center
• Flexible access and disposition options to manage effectively
Govern IT: Take control of data
• Integrated archiving and preservation
• Support for legal and security policies
19. Next Steps…
▪ Get your complimentary GDPR Guide here.
▪ Contact us about our GDPR assessment service
▪ Index Engines
www.indexengines.com
jim.mcgann@indexengines.com