Securing & Protecting Data in DevOps…or any way else
Karen Lopez
Data Evangelist
InfoAdvisors
www.datamodel.com
Karen Lopez - www.datamodel.com - @datachick
www.spotthestation.nasa.gov
Karen Lopez
• Karen has 20+ years of data and information architecture experience on large, multi-project programs
• She is a space fan
• She wants you to love your data
POLL: Who Are You?
Why this topic?
• Because
• We
• Love
• Our
• Data
Let’s Go!
About this session
• Interactive*
• Some Questions
• Several Answers
• I’m a DATA person
• “At another company”
• Sharing data tools & approaches
• SQL Server as example
• Love Your Data
Protecting data
• OVERVIEW
• DISCOVER
• CATEGORIZE
• PROTECT
• MONITOR & ASSESS
• MORE THOUGHTFUL STUFF
Let’s Chat
How did the most recent data breaches happen?
Which are the most embarrassing for the IT profession?
…for the developer profession?
What about Dev, Ops, Data, and Security?
Is it after 25 May?
What year?
Data Model Driven
[Diagram: Requirements; Data Model; Database*; More requirements / changes / tuning / whims; + Non Model Stuff]
Data Models
• Karen’s Preference
• Track all kinds of metadata
• Advanced Compare features
• Support DevOps and Iterative development
• Support Conceptual, Logical and Physical design
Normal Conflicts…
Developers vs. Data Quality
Data Professionals vs. Development Speed
Data vs. Code
Data vs. Metadata
Software Defined vs. Data Defined
Karen’s Data Governance Position
Data security at the data level
Models & catalogs capture security/privacy needs
Design security from the start
Measurement & monitoring
In other words, governance
Typical DevOps Security Focuses
• Code Reviews
• Auditing
• Key and Secrets Management
• Repositories
• …what else?
Typical DevOps Data Security Misses
What’s actually in that database?
What’s actually in that JSON/XML?
Where did the test data come from?
What’s actually in that _____?
What did I just post to GitHub?
Discovery
What do we have?
Where is it?
How do we know?
Data Classification / Categorization
Syntax-based
Semantic-based
AI-based
Data Profiling vs. Data Naming
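The difference between naming-based and profiling-based discovery can be sketched in T-SQL. This is a minimal illustration, not a complete classifier: the name patterns are assumptions chosen for the example, and the Membership table is the one used in the masking example in this deck.

```sql
-- Data naming: find candidate sensitive columns by what they are called.
-- The LIKE patterns are illustrative assumptions, not an exhaustive list.
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME LIKE '%email%'
   OR COLUMN_NAME LIKE '%phone%'
   OR COLUMN_NAME LIKE '%ssn%'
   OR COLUMN_NAME LIKE '%birth%'
ORDER BY TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME;

-- Data profiling: look at the values themselves, whatever the column is called.
-- Counts how many rows hold something shaped like an email address.
SELECT COUNT(*) AS RowsThatLookLikeEmail
FROM Membership
WHERE Email LIKE '%_@_%._%';
```

Naming catches well-labelled columns quickly; profiling catches sensitive values hiding in columns named Notes or Field7.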
Data Curation
Related to Data Stewardship
Covers more than Data Categorization
Important part of Data Governance
New-ish term going into GDPR and other protection concepts
One more time…
Every Design Decision must be based on Cost, Benefit and Risk
Data Curation
• Takes time, but…
• Builds on other efforts
• Contributes to future efforts
Catalog Data Assets
Every compliance effort starts with inventory
Capture the hard work of every project
Build incrementally
Start with what exists physically
Azure Data Catalog
Azure Data Catalog is a fully managed cloud service whose users can discover the data sources they need and understand the data sources they find. At the same time, Data Catalog helps organizations get more value from their existing investments.
Azure Data Catalog
App
Data Source: Microsoft, Oracle, Hadoop, DB2, Teradata, MySQL, HANA, Salesforce, …and more
Categorization
Sensitive, Confidential, PII and Special Data
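One way to record these categories next to the data itself is SQL Server's sensitivity classification metadata. The sketch below assumes SQL Server 2019 or later (or Azure SQL Database) and the Membership table from the masking example in this deck; the label and information type strings are illustrative choices, not a required vocabulary.

```sql
-- Tag columns with a label and an information type (both strings are assumptions).
ADD SENSITIVITY CLASSIFICATION TO dbo.Membership.Email
    WITH (LABEL = 'Confidential', INFORMATION_TYPE = 'Contact Info');

ADD SENSITIVITY CLASSIFICATION TO dbo.Membership.Phone#
    WITH (LABEL = 'Confidential', INFORMATION_TYPE = 'Contact Info');

-- Review what has been classified so far.
SELECT OBJECT_NAME(major_id)        AS TableName,
       COL_NAME(major_id, minor_id) AS ColumnName,
       label,
       information_type
FROM sys.sensitivity_classifications;
```

The classification is metadata only; it does not protect the column by itself, but it gives audits and catalogs something to read.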
But really, who?
• End Users
• Self-Serve BI Users
• DBAs
• Developers
• Ops
• Data Architects
Other Options
• Informatica
• IBM Watson
• Erwin Data Governance
• Data Modeling Tool Portal
• ???
Assess
What sorts of data do we steward?
How should we protect it?
Issues
• Data Pros spend 80% of their time sourcing, prepping and cleansing data
• Likely everyone else has these issues
• We are lousy at documenting data and metadata
• This makes Karen sad
Auditing and Threat Detection
Dynamic Data Masking
Data Masking Examples
XXXX XXXX XXXX 1234
kxxxxxx@ixxxxx.com
$99,9999
June, 99, 9999
KXXXXX Lopez
Privacy - Dynamic Data Masking
CREATE TABLE Membership(
    MemberID int IDENTITY PRIMARY KEY,
    -- Show only the first character, then a fixed padding string
    FirstName varchar(100) MASKED WITH (FUNCTION = 'partial(1,"XXXXXXX",0)') NULL,
    LastName varchar(100) NOT NULL,
    -- Full mask according to the column's data type
    Phone# varchar(12) MASKED WITH (FUNCTION = 'default()') NULL,
    -- Show the first character and a constant .com suffix
    Email varchar(100) MASKED WITH (FUNCTION = 'email()') NULL);

-- Sample rows; the stored values are not masked
INSERT Membership (FirstName, LastName, Phone#, Email) VALUES
    ('Roberto', 'Tamburello', '555.123.4567', 'RTamburello@contoso.com'),
    ('Janice', 'Galvin', '555.123.4568', 'JGalvin@contoso.com.co'),
    ('Zheng', 'Mu', '555.123.4569', 'ZMu@contoso.net');
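To see the masks take effect, query the table as a user who lacks the UNMASK permission. The following is a minimal sketch against the Membership table above; the user name MaskedReader is a hypothetical name introduced for the illustration.

```sql
-- A database user with no login, used only to demonstrate masking (hypothetical name).
CREATE USER MaskedReader WITHOUT LOGIN;
GRANT SELECT ON Membership TO MaskedReader;

-- Run the query in that user's security context: masked columns come back masked.
EXECUTE AS USER = 'MaskedReader';
SELECT MemberID, FirstName, LastName, Phone#, Email FROM Membership;
-- First row: 1  RXXXXXXX  Tamburello  xxxx  RXXX@XXXX.com
REVERT;

-- Granting UNMASK lets the same user see the stored values.
GRANT UNMASK TO MaskedReader;
EXECUTE AS USER = 'MaskedReader';
SELECT MemberID, FirstName, LastName, Phone#, Email FROM Membership;
REVERT;

-- Clean up the demonstration user.
REVOKE UNMASK FROM MaskedReader;
DROP USER MaskedReader;
```

Because the mask is part of the column definition, every application that connects as a non-privileged user gets the same masked view without any code changes.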
Dynamic Data Masking
Column level
Data in the database, at rest, is not masked
Meant to complement other methods
Performed at the end of a database query, right before data is returned
Performance impact is small
Security – Dynamic Data Masking in SQL Server
4 functions available today
• Default
• Email
• Custom String
• Random
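Each of the four functions can be applied to an existing column with ALTER TABLE. A minimal sketch, assuming the Membership table from the earlier slide; the DiscountCode column is a hypothetical addition so that the random mask has a numeric column to work on.

```sql
-- Custom string: keep the first 2 characters, pad with a fixed string, expose no suffix.
ALTER TABLE Membership
    ALTER COLUMN LastName ADD MASKED WITH (FUNCTION = 'partial(2,"xxxx",0)');

-- Random: numeric columns only; returns a random value in the given range.
-- DiscountCode is a hypothetical column added for this illustration.
ALTER TABLE Membership
    ADD DiscountCode smallint MASKED WITH (FUNCTION = 'random(1, 100)') NULL;

-- Default and email are applied the same way to existing columns.
ALTER TABLE Membership
    ALTER COLUMN Phone# ADD MASKED WITH (FUNCTION = 'default()');
ALTER TABLE Membership
    ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');

-- A mask can be removed again without touching the stored data.
ALTER TABLE Membership
    ALTER COLUMN LastName DROP MASKED;
```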
Dynamic Data Masking
Data in database is not changed
Ad-hoc queries *can* expose data
Does not aim to prevent users from exposing pieces of sensitive data
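The ad-hoc query exposure is easiest to see with a predicate. Masking is applied to the result set, but WHERE clauses are evaluated against the stored values, so a user without UNMASK can still probe for them. A small sketch, assuming the Membership table from the earlier slide; ProbeUser is a hypothetical user name.

```sql
-- Hypothetical low-privilege user without the UNMASK permission.
CREATE USER ProbeUser WITHOUT LOGIN;
GRANT SELECT ON Membership TO ProbeUser;

EXECUTE AS USER = 'ProbeUser';
-- The Email values returned are masked, but the set of rows returned
-- reveals which stored addresses start with 'R'.
SELECT MemberID, Email
FROM Membership
WHERE Email LIKE 'R%';
REVERT;

DROP USER ProbeUser;
```

Repeating this with narrower patterns recovers values piece by piece, which is why masking is meant to complement other protections such as permissions and auditing.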
Why would a Data Pro love it?
• Allows central, reusable design for standard masking
• Offers more reliable masking and more usable masking
• Applies across applications
• Removes whining about “we can do that later”
Security – Row Level Security
Security – Row Level Security
Filtering result sets (predicate-based access)
Predicates applied when reading data
Can be used to block write access
User-defined policies tied to inline table functions
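A filter predicate and a block predicate tied to one inline table function might look like the sketch below. The dbo.Orders table, the Security schema, and the function and policy names are all hypothetical, introduced only for the illustration; GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical table: each row belongs to the sales rep named in SalesRep.
CREATE TABLE dbo.Orders (
    OrderID  int IDENTITY PRIMARY KEY,
    SalesRep sysname NOT NULL,
    Amount   money   NOT NULL
);
GO
CREATE SCHEMA Security;
GO
-- Inline table function: returns a row only when the row's SalesRep
-- matches the current database user.
CREATE FUNCTION Security.fn_OrderAccess (@SalesRep AS sysname)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
    SELECT 1 AS fn_OrderAccess_result
    WHERE @SalesRep = USER_NAME();
GO
-- The policy binds the function to the table:
--   FILTER silently hides other reps' rows on read,
--   BLOCK ... AFTER INSERT raises an error if a user inserts a row for another rep.
CREATE SECURITY POLICY Security.OrderPolicy
    ADD FILTER PREDICATE Security.fn_OrderAccess(SalesRep) ON dbo.Orders,
    ADD BLOCK PREDICATE Security.fn_OrderAccess(SalesRep) ON dbo.Orders AFTER INSERT
    WITH (STATE = ON);
```

The rule lives in one place in the database, so every application and ad-hoc query goes through it.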
Row Level Security
No indication that results have been filtered
If all rows are filtered, then an empty (null) set is returned
For block predicates, an error is returned
Works even if you are dbo or in the db_owner role
Why would a Data Pro love it?
• Allows a designer to do this sort of data protection IN THE DATABASE, not just rely on code.
• Many, many pieces of code
• Applies across applications
Always!
Security – Always Encrypted
Security – Always Encrypted
ENABLED AT COLUMN LEVEL
PROTECTS DATA AT REST *AND* IN MEMORY
USES COLUMN MASTER KEY (CLIENT) AND COLUMN ENCRYPTION KEY (SERVER)
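At the column level, the design decision shows up directly in the table definition. The sketch below assumes a column encryption key named CEK_Demo already exists in the database, created beforehand together with its column master key (for example through the SSMS wizard); the dbo.Patients table and the key name are hypothetical.

```sql
-- CEK_Demo is an assumed, pre-existing column encryption key.
CREATE TABLE dbo.Patients (
    PatientID int IDENTITY PRIMARY KEY,
    -- Deterministic: equal plaintexts give equal ciphertexts, so equality
    -- lookups and joins still work. Character columns need a BIN2 collation.
    SSN char(11) COLLATE Latin1_General_BIN2
        ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = CEK_Demo,
                        ENCRYPTION_TYPE = DETERMINISTIC,
                        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NOT NULL,
    -- Randomized: stronger protection, but the column cannot be searched or joined on.
    BirthDate date
        ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = CEK_Demo,
                        ENCRYPTION_TYPE = RANDOMIZED,
                        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NOT NULL
);
```

Choosing deterministic or randomized per column is the "how" part of the design: it trades query capability against resistance to pattern analysis.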
Always Encrypted
Security – Always Encrypted
Foreign keys must match encryption types
Client code needs to support AE (currently this means .NET 4.x)
Security – Always Encrypted Wizard
Why would a Data Pro love it?
• Always Encrypted, yeah.
• Allows designers to not only specify which columns need to be protected, but how.
• Parameters are encrypted as well
• Built in to the engine, easier for Devs
What should we STOP doing?
Nobody ever talks about this….
SQL Injection
• WE ARE STILL DOING THIS!
• IT’S STILL THE #1 (but unsecured storage is getting more popular)
• TEST. TEST SOME MORE
• Automated Testing
• Governance is important
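The core of the problem, and the usual fix, fits in a few lines of T-SQL. This is a minimal sketch assuming the Membership table from the masking slides; the input value is a made-up attack string.

```sql
-- Attacker-controlled input (hypothetical value) that tries to widen the WHERE clause.
DECLARE @Input nvarchar(100) = N'x'' OR 1 = 1 --';

-- Risky: the input is concatenated into the statement text, so it becomes code.
DECLARE @Unsafe nvarchar(max) =
    N'SELECT MemberID, LastName FROM Membership WHERE LastName = ''' + @Input + N'''';
PRINT @Unsafe;   -- inspect the statement that would run; do not EXEC it

-- Safer: the statement text is fixed and the input travels as a typed parameter,
-- so it is only ever compared as a value.
EXEC sys.sp_executesql
    N'SELECT MemberID, LastName FROM Membership WHERE LastName = @LastName',
    N'@LastName nvarchar(100)',
    @LastName = @Input;
```

Automated tests can feed strings like the one above to every input path and fail the build if the result set changes shape.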
Unprotected “buckets”
“I’LL DELETE IT ONCE YOU GRAB THAT FILE”
SHARING WITH 2ND PARTIES
SHARING WITH 3RD PARTIES
Trusting good people
Good people don’t always stay that way
People mess up
Monitoring
Checking
Automatic alerting
Karen’s Rant Topic for 2019
Test Data
• Restoring Production to Development
• Restoring Production, with Masking
• Restoring Production, with Randomizing
• Restoring Production…anywhere
• Design Test Data
• Lorem Ipsum for Data
• Really, Design Test Data
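Designed test data can be as simple as generating rows that match the schema but never came from a real person. A minimal sketch, assuming the Membership table from the masking slides; the row count and value patterns are arbitrary choices for the illustration.

```sql
-- Generate 100 synthetic members; nothing here is copied from production.
WITH Numbers AS (
    SELECT TOP (100) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects          -- any catalog view with at least 100 rows will do
)
INSERT INTO Membership (FirstName, LastName, [Phone#], Email)
SELECT
    CONCAT('First', n),
    CONCAT('Last', n),
    CONCAT('555.000.', RIGHT(CONCAT('0000', n), 4)),   -- fits varchar(12)
    CONCAT('member', n, '@example.com')                -- reserved example domain
FROM Numbers;
```

Real designed test data would also cover edge cases on purpose (NULLs, maximum lengths, odd characters), which a restored production copy only contains by accident.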
Building a Culture of Data Security & Privacy
• Reward identification of threats
• Reward identification of risks
• Trust, but always cut the deck
• Monitor, test, monitor, test, monitor…
• Be a customer with your data in there
• Don’t use production data for anything other than production and support
Thank You
• @DataChick
• karenlopez@infoadvisors.com
