Big Data security: Facing the challenge by Carlos Gómez at Big Data Spain 2017
4. © Stratio 2017. Confidential, All Rights Reserved.
About me
• Father of a 5-year-old child
• Technical leader in Architecture and Security team at Stratio
• Sailing skipper
5. In your opinion, how difficult is it to manage security in your projects?
● Very difficult
● Difficult
● Easy
● Very Easy
● What is security?
6. PROJECTS FOREVER ONGOING IN BIG COMPANIES
In a monolithic, application-centric IT with data silos, these initiatives never get accomplished.
HUNDREDS OF MILLIONS OF EUROS SPENT OVER THE YEARS ON GLOBAL CROSS-IT INITIATIVES
• Data governance
• Logs centralization
• Monitoring
• Security
• Data security
• Audit
[Diagram: application silos. Labels: SAS, CRM, Earnix (Pricing), Towers Watson, ERP, Data Warehouse, Lab H0 (Big Data platform shared by the group), WebFocus, Oracle, Mainframe]
7. PROJECTS FOREVER ONGOING IN BIG COMPANIES
[Diagram: five numbered initiatives]
• Data governance
• Logs centralization
• Monitoring
• Data security
• Audit
10. GREYHOUND CHASING AN ELECTRONIC RABBIT…
COMPANIES ALWAYS TRY TO GET THE RABBIT
In an application-centric company with data silos, you will never be able to complete those projects successfully.
• Data governance
• Logs centralization
• Monitoring
• Security
• Data security
• Audit
11. STRUCTURAL INITIATIVES ARE SOLVED COMPLETELY WITH A DATA-CENTRIC APPROACH
DaaS (data as a service)
Data
Data Intelligence
Functionalities implemented in the product:
• Data governance
• Logs centralization
• Monitoring
• Security
• Data security
• Audit
12. RABBIT IN A JAIL
MINIMUM EFFORT AND COST TO GET THE RABBIT
14. SECURITY IN A DATA-CENTRIC APPROACH
Protect the data
• Perimeter security to access the cluster.
• Support identity management and authentication to prove
that a user/service is who they claim to be.
• In a multi-data store platform, ACLs should be centralized
to simplify correct authorization across the different data
stores.
• Audit events must be centralized to control misuse of the
cluster in real time.
• Data integrity and confidentiality in network
communications to protect data in transit.
Protect the service
• Perimeter security to access the cluster.
• Support identity management and authentication to prove
that a user/service is who they claim to be.
• A user/service should be authorized so that it cannot use
more resources than expected.
• A user/service should not interfere with other
users/services unless it needs to.
• Resource usage should be audited so that it can be controlled.
20. SECURITY OVERVIEW
To guide the security priorities in the product roadmap, we focus on helping customers comply with the LOPD (the Spanish data protection law) within the platform.
With every release of the Stratio platform, the security status is reported through:
● Results of the OWASP tests for the main components of the platform.
● Results of additional general-purpose security tests defined to assure the expected quality.
● A Security Risk Report that includes the known issues found.
● When Critical and High issues are found:
○ We explain how they can be mitigated.
○ We plan to solve them during the next release.
21. PERIMETER SECURITY: NETWORKING
[Diagram: network zones. Labels: Public network, Public Agents, Admin network, Admin Router, Master Nodes, Private network, Private Agents]
• The default network configuration allows a zone-based network
security design:
○ Public.
○ Admin.
○ Private.
• Using Mesos roles to identify nodes ensures that only tasks
specifically configured with this role will be executed outside
the Private zone.
• Using Marathon labels, endpoints can be registered dynamically:
○ Admin Router for the Admin zone.
○ Marathon LB for the Public zone.
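As a rough illustration of the role and label mechanism, the sketch below builds two Marathon app definitions as Python dictionaries. `acceptedResourceRoles`, the `HAPROXY_*` labels (Marathon-LB) and the `DCOS_SERVICE_*` labels (DC/OS Admin Router) are standard conventions; the app ids, images, hostname and the two helper functions are hypothetical, and Stratio DCS may use different label names.

```python
import json

# Hypothetical app that must run on a public node and be published by Marathon-LB.
public_app = {
    "id": "/web/frontend",
    "cpus": 0.5,
    "mem": 256,
    "instances": 1,
    "container": {"type": "DOCKER", "docker": {"image": "example/frontend:1.0"}},
    # Only offers made under this role are accepted, so the task can leave the Private zone.
    "acceptedResourceRoles": ["slave_public"],
    "labels": {
        "HAPROXY_GROUP": "external",
        "HAPROXY_0_VHOST": "frontend.example.com",
    },
}

# Hypothetical admin console that stays on private agents and is reached via Admin Router.
admin_app = {
    "id": "/ops/console",
    "cpus": 0.5,
    "mem": 256,
    "instances": 1,
    "container": {"type": "DOCKER", "docker": {"image": "example/console:1.0"}},
    "labels": {
        "DCOS_SERVICE_NAME": "console",
        "DCOS_SERVICE_PORT_INDEX": "0",
        "DCOS_SERVICE_SCHEME": "http",
    },
}


def may_run_outside_private_zone(app):
    """True when the app may be placed on nodes that offer the public role."""
    return "slave_public" in app.get("acceptedResourceRoles", [])


def registered_endpoint(app):
    """Return which entry point would pick the app up, based on its labels."""
    labels = app.get("labels", {})
    if "HAPROXY_GROUP" in labels:
        return "Marathon LB (Public zone)"
    if "DCOS_SERVICE_NAME" in labels:
        return "Admin Router (Admin zone)"
    return None


# The JSON document is what would be submitted to Marathon.
public_app_json = json.dumps(public_app, indent=2)
```

An app without either kind of label is not registered anywhere and stays reachable only inside the Private zone.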
22. AUTHENTICATION, AUTHORIZATION AND AUDIT
The solution integrates with the LDAP and Kerberos services owned by the
company where Stratio DCS is installed.
• Authentication:
○ Web: OAuth2.
○ Services & Data Stores: Kerberos or mutual TLS.
• Authorization:
○ OAuth2.
○ goSec Management: REST API and website used to
manage roles, profiles and ACLs. It also shows users,
groups and audit data.
• Audit:
○ Authentication and authorization events are
structured and stored in a data bus (Kafka) to be
collected and processed.
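To make "structured" audit events concrete, here is a minimal sketch of one event serialized as JSON, ready to be used as a Kafka message value. The field names, the sample values and the `to_kafka_value` helper are assumptions made for illustration; the actual goSec event schema is not shown in the slides, and no Kafka client is called here.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class AuditEvent:
    """Hypothetical audit record; the real schema may differ."""
    timestamp: str
    user: str
    service: str
    resource: str
    action: str
    event_type: str  # "authentication" or "authorization"
    result: str      # "SUCCESS" or "FAILURE"


def to_kafka_value(event):
    """Serialize the event as UTF-8 JSON bytes, the usual shape of a Kafka value."""
    return json.dumps(asdict(event), sort_keys=True).encode("utf-8")


event = AuditEvent(
    timestamp=datetime(2017, 11, 17, 10, 0, tzinfo=timezone.utc).isoformat(),
    user="alice",
    service="hdfs",
    resource="/data/sales",
    action="write",
    event_type="authorization",
    result="FAILURE",
)
payload = to_kafka_value(event)
```

Keeping every event in one schema is what lets a downstream job filter, count and alert on failures across all components.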
23. AUTHENTICATION, AUTHORIZATION AND AUDIT
Plugins are lightweight programs running within the
processes of each cluster component.
They are responsible for:
• Authorization (using goSec ACLs).
• Audit of every request sent to the component.
Currently plugins have been developed for:
• Crossdata
• Sparta
• Zookeeper
• HDFS
• Kafka
• Elasticsearch
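The plugin idea can be sketched as a small in-process hook that checks a centralized ACL store and records every request, whether it is allowed or not. Everything below (`AclStore`, `ComponentPlugin`, the rule format) is hypothetical and only illustrates the pattern; it is not the goSec plugin API.

```python
class AclStore:
    """Hypothetical stand-in for centrally managed ACLs."""

    def __init__(self, rules):
        # rules maps (user, resource) to the set of permitted actions.
        self._rules = rules

    def is_allowed(self, user, resource, action):
        # Anything not explicitly granted is denied.
        return action in self._rules.get((user, resource), set())


class ComponentPlugin:
    """Runs inside a component and guards each incoming request."""

    def __init__(self, component, acls, audit_log):
        self.component = component
        self.acls = acls
        self.audit_log = audit_log

    def check(self, user, resource, action):
        allowed = self.acls.is_allowed(user, resource, action)
        # Every request is audited, not only the rejected ones.
        self.audit_log.append(
            {
                "component": self.component,
                "user": user,
                "resource": resource,
                "action": action,
                "allowed": allowed,
            }
        )
        return allowed


acls = AclStore({("alice", "hdfs:/data/sales"): {"read"}})
audit_log = []
plugin = ComponentPlugin("hdfs", acls, audit_log)
```

With this setup, `plugin.check("alice", "hdfs:/data/sales", "read")` is allowed, a `"write"` on the same resource is denied, and both requests end up in `audit_log`.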
24. KEY MANAGEMENT SYSTEM
• It is good practice to manage secrets with a key management system
instead of storing them locally.
• For this purpose, Stratio DCS uses HashiCorp Vault.
25. KEY MANAGEMENT SYSTEM: the secret of secrets
• Can applications obtain authentication tokens in a secure way?
• Where do applications store Vault's tokens?
• How are tokens protected?
• How will I know if someone steals tokens?
[Diagram: first secret management. Labels: Admin, Marathon, Mesos, Application]
26. KEY MANAGEMENT SYSTEM: the secret of secrets
[Diagram: first secret management. Labels: Admin, Marathon, Mesos, Application, one-time secret, Run Application (Env: one-time secret)]
27. KEY MANAGEMENT SYSTEM: the secret of secrets
[Diagram: first secret management. Labels: Admin, Marathon, Mesos, Application, one-time secret, Run Application (Env: one-time secret), login, token <-> ACL]
31. KEY MANAGEMENT SYSTEM: the secret of secrets
[Diagram: first secret management. Labels: Admin, Marathon, Mesos, Application, one-time secret, Run Application (Env: one-time secret), login, Logs, Alert]
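The flow in the diagrams (one-time secret passed through the environment, login, token tied to an ACL, alert on misuse) can be sketched with an in-memory simulation. The `SecretBroker` class and its methods are hypothetical and do not call Vault; Vault offers comparable building blocks such as response wrapping and limited-use tokens, but how Stratio DCS wires them together is not shown in the slides.

```python
import secrets


class SecretBroker:
    """In-memory stand-in for a KMS that hands out single-use login secrets."""

    def __init__(self):
        self._pending = {}  # one-time secret -> ACL policy name
        self._used = set()  # one-time secrets that were already redeemed
        self.tokens = {}    # issued token -> ACL policy name
        self.alerts = []    # messages an operator would be notified about

    def issue_one_time_secret(self, policy):
        """Called at deployment time; the result is injected into the app's environment."""
        one_time_secret = secrets.token_hex(16)
        self._pending[one_time_secret] = policy
        return one_time_secret

    def login(self, one_time_secret):
        """Exchange the one-time secret for a token bound to the ACL policy."""
        policy = self._pending.pop(one_time_secret, None)
        if policy is None:
            if one_time_secret in self._used:
                # A second login with the same secret suggests it was copied.
                self.alerts.append("one-time secret was presented twice")
            raise PermissionError("invalid or already used one-time secret")
        self._used.add(one_time_secret)
        token = secrets.token_hex(16)
        self.tokens[token] = policy
        return token
```

Because the secret works only once, whoever logs in second (the attacker or the legitimate application) fails loudly, which is what makes theft detectable through logs and alerts.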
32. DATA PROCESSING ENGINE: SPARK
• Spark jobs need access to multiple data stores, so Spark
needs to support the security of Stratio DCS.
• The Spark 2.x build has been modified by Stratio in order
to:
○ Access secrets that are stored in the KMS.
○ Allow access to Kerberized HDFS.
○ Allow access to PostgreSQL with TLS authentication.
○ Allow access to Elasticsearch with TLS authentication.
○ Allow access to Kafka with TLS authentication.
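For orientation, the sketch below assembles the kind of TLS settings a Spark job typically passes to PostgreSQL and Kafka. The JDBC parameters (`sslmode`, `sslcert`, `sslkey`, `sslrootcert`) belong to the stock PostgreSQL JDBC driver and the `kafka.`-prefixed options to Spark's standard Kafka source; the hosts, file paths and helper functions are hypothetical, and this is not Stratio's modified Spark, where certificates and passwords would come from the KMS.

```python
def postgres_jdbc_url(host, port, database, cert, key, root_cert):
    """Build a JDBC URL that asks the PostgreSQL driver for certificate-based TLS."""
    params = {
        "ssl": "true",
        "sslmode": "verify-full",  # verify the server certificate and host name
        "sslcert": cert,           # client certificate
        "sslkey": key,             # client private key
        "sslrootcert": root_cert,  # CA used to verify the server
    }
    query = "&".join(f"{name}={value}" for name, value in params.items())
    return f"jdbc:postgresql://{host}:{port}/{database}?{query}"


def kafka_tls_options(bootstrap_servers, truststore, keystore, store_password):
    """Options for Spark's Kafka source; the 'kafka.' prefix forwards them to the client."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "kafka.security.protocol": "SSL",
        "kafka.ssl.truststore.location": truststore,
        "kafka.ssl.truststore.password": store_password,
        "kafka.ssl.keystore.location": keystore,
        "kafka.ssl.keystore.password": store_password,
    }


# Hypothetical values; in practice the password would be fetched from the KMS at start-up.
url = postgres_jdbc_url(
    "pg.example.internal", 5432, "sales",
    "/tmp/certs/client.crt", "/tmp/certs/client.pk8", "/tmp/certs/ca.crt",
)
options = kafka_tls_options(
    "kafka.example.internal:9093",
    "/tmp/certs/truststore.jks", "/tmp/certs/keystore.jks", "changeit",
)
```

The resulting dictionary can be passed to a Kafka reader through its `options(**options)` call, and the URL to a JDBC reader.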
33. PROTECT THE DATA - use case -
[Diagram: Admin and the following controls]
• Perimeter security
• Authentication, Authorization, Audit
• Ciphered communications
39. MULTI-TENANCY CAPABILITIES: RESOURCES ISOLATION
• Stratio DCS cluster resources (memory, disk, CPUs and port ranges) are managed by Mesos.
• Mesos, Marathon and Metronome security can be activated post-installation in order to limit the use of the available resources for each framework.
• Once it is activated, admins will be able to:
○ Reserve resources for a Mesos role.
○ Grant permissions for each user/framework to perform actions such as registering frameworks, running tasks, reserving resources, creating volumes, etc.
• Grant a minimum set of resources to a specific Mesos role.
[Diagram: Mesos Cluster. Labels: MASTER, Marathon, and five agents]
AGENT 1: role=slave_public
AGENT 2: role=*
AGENT 3: role=postgresql
AGENT 4: role=*
AGENT 5: role=*
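The permissions listed above map onto Mesos ACLs, which are supplied to the master as a JSON document. The sketch below generates such a document; the action names (`register_frameworks`, `run_tasks`, `reserve_resources`, `create_volumes`) follow the Mesos authorization documentation, but the exact object fields vary between Mesos versions and should be checked against the one in use. The principal, role and user values and the `mesos_acls` helper are hypothetical.

```python
import json


def entity(*values):
    """Mesos ACL entities are written as {"values": [...]}."""
    return {"values": list(values)}


def mesos_acls(principal, role, user):
    """Allow one framework principal to work only within its own role."""
    return {
        # Non-permissive mode: anything not matched by a rule is denied.
        "permissive": False,
        "register_frameworks": [
            {"principals": entity(principal), "roles": entity(role)}
        ],
        "run_tasks": [
            {"principals": entity(principal), "users": entity(user)}
        ],
        "reserve_resources": [
            {"principals": entity(principal), "roles": entity(role)}
        ],
        "create_volumes": [
            {"principals": entity(principal), "roles": entity(role)}
        ],
    }


# Hypothetical tenant: a PostgreSQL framework restricted to the "postgresql" role.
acls = mesos_acls("postgresql-framework", "postgresql", "postgres")
acls_json = json.dumps(acls, indent=2)
```

Combined with agents that are started with `role=postgresql`, rules like these keep other frameworks from registering for, or reserving, that tenant's resources.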
40. MULTI-TENANCY CAPABILITIES: NETWORKS ISOLATION
• What about network isolation in a containerized world?
• For this purpose, Stratio DCS uses Project Calico.
41. MULTI-TENANCY CAPABILITIES: NETWORKS ISOLATION
• Virtual network topologies can be created dynamically.
• Virtual network topologies can be managed by network policies.
• Virtual networks can manage all Mesos-supported container technologies.
• Virtual networks barely impact big data performance.
• Frameworks/apps are authorized into a network.
• Frameworks/apps can be isolated into a virtual network.
• Frameworks/apps IP addresses and ports are managed per instance.
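How network policies authorize or isolate a framework can be illustrated with a tiny first-match, default-deny evaluator. This is a conceptual simulation only: the `Rule` type, the `is_allowed` function and the network and framework names are illustrative, and real Project Calico policies are declared in Calico's own resource format rather than in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    action: str  # "allow" or "deny"
    source: str  # framework name, or "*" for any framework


def is_allowed(policies, network, source):
    """Return True if `source` may send traffic into `network`.

    Rules are evaluated in order and the first match wins; a network
    without a matching rule denies the traffic.
    """
    for rule in policies.get(network, []):
        if rule.source in (source, "*"):
            return rule.action == "allow"
    return False


# Illustrative topology: framework1 is kept out of net_2.
policies = {
    "net_1": [Rule("allow", "framework1")],
    "net_2": [Rule("deny", "framework1"), Rule("allow", "*")],
}
```

Here `framework1` can reach `net_1` but not `net_2`, while any other framework is admitted to `net_2` only.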
45. PROTECT THE SERVICE - use case -
[Diagram: Admin and the following controls]
• Framework authentication
• Check resources for the role
• Authorization to launch tasks
• Authorization to use the network
• Audit (logs and Mesos API)
46. PROTECT THE SERVICE - use case -
[Diagram build, new element]
• At least 1 core, 1 GB for framework 1
47. PROTECT THE SERVICE - use case -
[Diagram build, new element]
• net_2: Deny from framework 1
• At least 1 core, 1 GB for framework 1
48. PROTECT THE SERVICE - use case -
[Diagram build, new elements]
• User: 2. Launches FRAMEWORK 1
• User: 2. Launches FRAMEWORK 2
• net_2: Deny from framework 1
• At least 1 core, 1 GB for framework 1
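A guarantee such as "at least 1 core, 1 GB" corresponds to a Mesos quota on a role. The sketch below builds the JSON body that the classic quota API accepts on the master's `/quota` endpoint; newer Mesos releases also offer a different quota interface, so the shape should be checked against the version in use. The role name and the `quota_request` helper are hypothetical.

```python
import json


def scalar(name, value):
    """Scalar resource in the format used by the Mesos HTTP API."""
    return {"name": name, "type": "SCALAR", "scalar": {"value": value}}


def quota_request(role, cpus, mem_mb):
    """Guarantee a minimum amount of CPU and memory (in MB) to a role."""
    return {
        "role": role,
        "guarantee": [scalar("cpus", cpus), scalar("mem", mem_mb)],
    }


# Hypothetical role used by framework 1: at least 1 core and 1 GB.
body = quota_request("framework1-role", 1, 1024)
body_json = json.dumps(body)
```

An operator with permission to set quota would send `body_json` in a POST request to the master; the framework must be registered under the same role to benefit from the guarantee.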
50. MULTI-DATA CENTER - a use case -