Data-driven products in the cloud are now delivered faster, and IT resources have become more responsive and productive, with lower costs and higher performance for data operations.
This, in turn, introduces cybersecurity risks around access to sensitive data, along with regulatory compliance requirements.
15. Impact
• Lack of detection
• Long-term abuse of organization assets
• Cloud Account takeover
16. Mitigation
• Maintain a strong and long password policy
• Enable Multi-Factor Authentication (MFA)
• Delete inactive identities and empty groups
• Rotate access keys
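The mitigation checklist above lends itself to a periodic automated check. The following is a minimal sketch, assuming identity records have already been exported from the cloud provider into plain dictionaries. The field names (`mfa_enabled`, `last_used`, `key_created`) and the 90-day thresholds are illustrative assumptions, not any provider's API.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; tune them to your own policy.
MAX_KEY_AGE = timedelta(days=90)
MAX_INACTIVITY = timedelta(days=90)


def audit_identities(identities, now):
    """Return a list of (identity name, finding) tuples.

    Each identity is a dict with the assumed keys:
    name, mfa_enabled, last_used, key_created.
    """
    findings = []
    for identity in identities:
        name = identity["name"]
        if not identity["mfa_enabled"]:
            findings.append((name, "enable MFA"))
        if now - identity["last_used"] > MAX_INACTIVITY:
            findings.append((name, "delete inactive identity"))
        if now - identity["key_created"] > MAX_KEY_AGE:
            findings.append((name, "rotate access key"))
    return findings


def empty_groups(groups):
    """Return the names of groups that have no members."""
    return [name for name, members in groups.items() if not members]
```

Running such a check on a schedule turns each bullet into a finding that can be tracked, rather than a one-off cleanup.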
31. DevOops
DevOps as a team, not as a mindset
Misconfigurations and Change Management
Cost of security-as-a-patch can be high
Security has to be bolted into the process!
32. From DevOps to DevSecOps
● Shift-left
● Incremental changes
● Automation
● Security is embedded into the process
Source: https://meming.world
33. So… What’s DataSecOps?
An agile, holistic, security-embedded approach to the coordination of ever-changing data and its users, aimed at delivering quick data-to-value while keeping data private, safe and well-governed.
34. DataSecOps Principles
● Security as a continuous part of data operations, not an afterthought
● Ad-hoc → continuous
● Separation of environments, testing & automation
● Prioritization is key - mostly sensitive data
● Data is clearly owned
● Simplified & deterministic data access
38. Security vs. Productivity
"We are a billion-dollar company, but anyone can run a SQL query and get a million email addresses."
VP Data Engineering, SaaS Company
"I have an army of people creating users, roles and views. By the time they are done, it's already outdated."
CDO, Financial Services Company
40. But this is AS IMPORTANT…
Between 60% and 85% of data projects fail.
DevOps + Data engineering teams experience a 20%-30% loss of productivity.
41. Or looking at it from another perspective…
62% say security & compliance slow down data projects
71%-79% of data leaders deal with PII
42. Key benefits of Just-in-Time Automated Access
Automated Compliance: always know where data is, who has access to it, and what they are doing with it
Tight Security: users can only access the data they need, when they need it
Productivity: central governance, distributed operations, with no restrictions on data architecture
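To make the just-in-time idea concrete, here is a toy in-memory sketch of time-boxed grants with an audit trail. The `Grant` and `JitAccess` names, the one-hour default TTL, and the log layout are all hypothetical and chosen for illustration; a real deployment would sit on top of the data platform's own permission system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Grant:
    user: str
    dataset: str
    expires_at: datetime


class JitAccess:
    """Toy in-memory just-in-time access store (hypothetical, for illustration)."""

    def __init__(self):
        self._grants = []
        self.audit_log = []  # entries: (timestamp, user, dataset, event)

    def request(self, user, dataset, reason, now, ttl=timedelta(hours=1)):
        """Grant time-boxed access and record why it was requested."""
        grant = Grant(user, dataset, now + ttl)
        self._grants.append(grant)
        self.audit_log.append((now, user, dataset, f"granted: {reason}"))
        return grant

    def can_access(self, user, dataset, now):
        """Access holds only while an unexpired grant exists."""
        return any(
            g.user == user and g.dataset == dataset and now < g.expires_at
            for g in self._grants
        )

    def revoke_expired(self, now):
        """Drop expired grants so no permission persists by default."""
        expired = [g for g in self._grants if now >= g.expires_at]
        self._grants = [g for g in self._grants if now < g.expires_at]
        for g in expired:
            self.audit_log.append((now, g.user, g.dataset, "revoked: expired"))
        return len(expired)
```

The audit log covers the compliance side (who had access, when, and why), while the expiry check covers the "only when they need it" side.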
46. What To Automate?
● Whatever:
○ has the most effect on security & compliance
○ is taking its toll
● Meaning:
○ Log processing
○ Data access (Authentication & Authorization)
○ Security policies
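Log processing is often the easiest of these to start with. The sketch below counts reads of sensitive tables per user; the log line layout, the `SENSITIVE_TABLES` set, and the threshold are assumptions made for the example, since real audit logs differ by platform.

```python
from collections import Counter

# Hypothetical set of tables tagged as sensitive (for example, holding PII).
SENSITIVE_TABLES = {"app.users", "billing.cards"}


def sensitive_reads(log_lines):
    """Count reads of sensitive tables per user.

    Assumes each line looks like: '<ISO timestamp> <user> <action> <table>'.
    Malformed lines are skipped.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 4:
            continue
        _timestamp, user, action, table = parts
        if action == "SELECT" and table in SENSITIVE_TABLES:
            counts[user] += 1
    return counts


def flag_heavy_readers(counts, threshold):
    """Return users whose sensitive reads meet or exceed the threshold, sorted."""
    return sorted(user for user, n in counts.items() if n >= threshold)
```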
48. The Journey to Access Automation

| | Level 1 | Level 2 | Level 3 |
|---|---|---|---|
| Data Access Model | Ad-hoc Access | Basic Access Management | Just-in-Time Access |
| Provisioning | Employees get access upfront when they join or ad-hoc when requested. | Basic RBAC framework. | Employees get access Just-in-Time based on business needs. |
| Permissions Persistence | 100% | High 90% | Based on business needs (~20%) |
| Automation | Fully manual | Role provisioning, some policies | Fully automated |
| Typical Time | 1-3 months | 6-9 months | 12-18 months |
51. DevOps: Access To Production
● Productivity was NOT top concern.
● 25% of DevOps time was spent on granting/revoking permissions, etc.
● Moving to JIT → several headcounts are now working on MEANINGFUL things.
● Factors: # data users, grant time, revoke time, monitoring time, pager duties
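The factors listed above can be combined into a rough back-of-the-envelope estimate of manual overhead. This is only a sketch: the formula, the added `requests_per_user` parameter, and every number in the usage example are hypothetical and should be replaced with your own measurements.

```python
def monthly_access_overhead_hours(
    data_users,
    requests_per_user,
    grant_minutes,
    revoke_minutes,
    monitoring_hours,
    pager_hours,
):
    """Estimate hours per month spent on manual access handling.

    requests_per_user is an assumed extra input: the average number of
    access requests each data user raises per month.
    """
    per_request_minutes = grant_minutes + revoke_minutes
    request_hours = data_users * requests_per_user * per_request_minutes / 60
    return request_hours + monitoring_hours + pager_hours


# Hypothetical inputs: 200 users, 2 requests each, 10 min to grant,
# 5 min to revoke, 20 h of monitoring and 10 h of pager duty per month.
estimate = monthly_access_overhead_hours(200, 2, 10, 5, 20, 10)
print(estimate)  # 130.0
```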
52. Data Engineers: DWH
● Project initiated by the data team (DIY)
● Tail chasing:
○ Masking, RLS
○ Managing RBAC, ABAC
○ Moving targets
● # data users, time to set policies (which keeps getting longer), role management/explosion
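For readers less familiar with the terms, masking and row-level security (RLS) can be illustrated in a few lines. This is a simplified sketch over in-memory rows, assuming a `region` column drives row visibility and an `email` column holds the PII; in practice these policies are defined in the warehouse itself.

```python
def mask_email(email):
    """Keep the first character and the domain; hide the rest of the local part."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "***"
    return f"{local[0]}***@{domain}"


def apply_policies(rows, user_region, can_see_pii):
    """Apply row-level security, then masking.

    RLS: keep only rows from the caller's region.
    Masking: hide emails unless the caller may see PII.
    """
    visible = []
    for row in rows:
        if row["region"] != user_region:
            continue
        out = dict(row)  # copy so the source rows stay untouched
        if not can_see_pii:
            out["email"] = mask_email(row["email"])
        visible.append(out)
    return visible
```

Each new role or attribute multiplies the combinations such policies must cover, which is where the "moving targets" and role explosion come from.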
89. No one except for the application runtime has access to the DEK
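90. Use Tink for encryption on the application side

    import tink
    from tink import daead
    from tink.integration import gcpkms

    daead.register()  # register the deterministic AEAD key types

    # Decrypt the wrapper (the encrypted keyset) with the KEK held in Cloud KMS.
    # gcp_credential_path is the path to a credentials file, defined elsewhere.
    # Newer Tink releases offer tink.read_keyset_handle for this step.
    keyset_handle = tink.KeysetHandle.read(
        tink.JsonKeysetReader('{"encryptedKeyset":"Ci..g=",...}'),  # wrapper
        gcpkms.GcpKmsClient('', gcp_credential_path)
        .get_aead('gcp-kms://projects/…/kek'))  # KEK URI

    # Create the cipher object
    cipher = keyset_handle.primitive(daead.DeterministicAead)

    # Encrypt / decrypt using the DEK
    ciphertext = cipher.encrypt_deterministically(b'plaintext', b'')
    plaintext = cipher.decrypt_deterministically(ciphertext, b'')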
91. On-demand encrypt in BigQuery

    -- KEK in Cloud KMS that protects the keyset
    DECLARE KMS_RESOURCE_NAME STRING DEFAULT
      'gcp-kms://projects/aead-poc/locations/us-central1/keyRings/poc-keyring/cryptoKeys/kek';
    -- Wrapper: the keyset (DEK) encrypted with the KEK
    DECLARE WRAPPER BYTES DEFAULT
      FROM_BASE64("CiQA14LE......................brY9fZ3U=");

    CREATE TABLE `aead-poc.app.users_encrypted` AS
    SELECT
      DETERMINISTIC_ENCRYPT(
        KEYS.KEYSET_CHAIN(KMS_RESOURCE_NAME, WRAPPER),
        email, "") AS email
    FROM `aead-poc.app.users`;
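94. Performance
Test data: 100M records, 64 bytes

| Query | Plain text: elapsed time | Plain text: slot time | Decrypt first: elapsed time | Decrypt first: slot time |
|---|---|---|---|---|
| Substring + group by | 14 sec | 10 min | 15 sec | 18 min |
| Select distinct | 21 sec | 23 min | 22 sec | 35 min |

Elapsed time: almost the same. Slot time: ~50-80% higher when decrypting first.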
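The two callouts follow directly from the measured figures. As a quick check, this short script recomputes the relative overhead of decrypting first, using only the numbers reported on the slide:

```python
# Figures copied from the performance slide: (plain text, decrypt first).
elapsed_sec = {"Substring + group by": (14, 15), "Select distinct": (21, 22)}
slot_min = {"Substring + group by": (10, 18), "Select distinct": (23, 35)}


def overhead_pct(plain, decrypted):
    """Relative increase of the decrypt-first run over the plain-text run."""
    return round((decrypted - plain) / plain * 100)


elapsed_overhead = {q: overhead_pct(*v) for q, v in elapsed_sec.items()}
slot_overhead = {q: overhead_pct(*v) for q, v in slot_min.items()}

print(elapsed_overhead)  # {'Substring + group by': 7, 'Select distinct': 5}
print(slot_overhead)     # {'Substring + group by': 80, 'Select distinct': 52}
```

Elapsed time grows by single-digit percentages, while slot time, which reflects the compute consumed, grows by roughly 50-80%.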
95. Pricing

    -- KEK in Cloud KMS that protects the keyset
    DECLARE KMS_RESOURCE_NAME STRING DEFAULT
      'gcp-kms://projects/aead-poc/locations/us-central1/keyRings/poc-keyring/cryptoKeys/kek';
    -- Wrapper: the keyset (DEK) encrypted with the KEK
    DECLARE WRAPPER BYTES DEFAULT
      FROM_BASE64("CiQA14LE......................brY9fZ3U=");

    SELECT
      DETERMINISTIC_DECRYPT_STRING(
        KEYS.KEYSET_CHAIN(KMS_RESOURCE_NAME, WRAPPER),
        email, "") AS decrypted_email
    FROM `aead-poc.app.users_encrypted`;