4. Object Store for Big Data
• Scale both Objects & IOPS
• Set of Micro-services - Divide, Conquer, Scale
• Seamless transition for YARN, MapReduce, Hive, Spark apps.
• Supports K8s, CSI and ability to run on K8s natively.
Ozone
5. Scale beyond HDFS
• Large Data Store / Dedicated Storage Clusters
• Cloud-like presence on-prem
• First class citizen on K8s
When
10. Ozone - Write Path
Similar to DFS Write, Blocks are written directly to Datanodes
11. Ozone - Read Path
Similar to DFS Read, Blocks are read directly from Datanodes
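The write and read paths on the two slides above can be pictured with a toy model. This is only a minimal sketch and not Ozone's API: the names `ToyMetadataService`, `ToyDatanode`, `allocate_block` and `lookup` are hypothetical, and the round-robin placement is an assumption made for illustration. The one point it tries to show is the "similar to DFS" idea: a metadata service hands out block locations, while the client moves the bytes to and from the datanodes itself.

```python
class ToyDatanode:
    """Stands in for a datanode: it only stores raw block bytes."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block id -> bytes

    def write_block(self, block_id, data):
        self.blocks[block_id] = data

    def read_block(self, block_id):
        return self.blocks[block_id]


class ToyMetadataService:
    """Maps key names to block locations; it never sees block data."""

    def __init__(self, datanodes):
        self.datanodes = datanodes
        self.locations = {}  # key name -> list of (datanode, block id)
        self.next_id = 0

    def allocate_block(self, key):
        # Round-robin placement, purely for illustration.
        node = self.datanodes[self.next_id % len(self.datanodes)]
        block_id = self.next_id
        self.next_id += 1
        self.locations.setdefault(key, []).append((node, block_id))
        return node, block_id

    def lookup(self, key):
        return list(self.locations[key])


def write_key(meta, key, data, block_size=4):
    """Ask for a location per block, then write directly to the datanode."""
    for start in range(0, len(data), block_size):
        node, block_id = meta.allocate_block(key)
        node.write_block(block_id, data[start:start + block_size])


def read_key(meta, key):
    """Look up the locations once, then read directly from the datanodes."""
    return b"".join(node.read_block(bid) for node, bid in meta.lookup(key))
```

In this model the metadata service stays small because it holds only locations, which is the property that lets the data path scale with the number of datanodes.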
12. Using Ozone: Is it as painful as HDFS?
We hear you: we also have to set up Ozone every time we test.
Setup
• Docker
  • docker-compose up -d
  • runs it on local machine
• K8s
  • helm install ozone
• Traditional tarball
  • Untar
  • Run genconfig
  • Update the configurations
Usage
• If you are familiar with HDFS commands
  • dfs -ls hdfs://user
• with Ozone, it becomes
  • dfs -ls o3fs://user
• If you are familiar with S3 commands like
  • aws s3 ls -endpoint=us-west1. /bucketName
• with Ozone S3 it becomes
  • aws s3 ls -endpoint=s3g.local. /bucketName
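The HDFS-to-Ozone change above amounts to swapping the URI scheme. A minimal sketch of that rewrite follows, assuming the rest of the URI can be carried over unchanged; the helper name `to_ozone_uri` is hypothetical, and in a real deployment the authority part of an `o3fs://` URI may need to differ from the HDFS one.

```python
from urllib.parse import urlsplit, urlunsplit


def to_ozone_uri(uri: str, scheme: str = "o3fs") -> str:
    """Rewrite an hdfs:// URI to the given Ozone scheme.

    URIs with any other scheme are returned untouched.
    """
    parts = urlsplit(uri)
    if parts.scheme != "hdfs":
        return uri
    return urlunsplit(
        (scheme, parts.netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(to_ozone_uri("hdfs://user"))
```

A helper like this could be used to migrate path lists in job configurations, while the commands themselves stay the same.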
14. Ozone for Enterprise
• 10 Billion Keys will be supported in first official release
• Scale OM/SCM independently, without any disruption
• Evenly distribute metadata across the cluster including Datanodes
• RAFT Consensus Protocol via Apache RATIS
• Tested with industry recognized off-the-shelf components
• Blockade Tests - Tests to inject errors/failures in the clusters
• Tested Apache Spark, YARN, Hive workloads
• K8s based clusters, long running clusters, ephemeral clusters
• Freon - custom load generator
15. Ozone for Enterprise
Simplified Security
• Similar to HDFS, relies on Kerberos / Delegation Token / Block Token
• SCM comes with its own Certificate Authority and users DO NOT need to know
about it.
• Kerberos is only needed for OM/SCM, not for datanodes
• Security is on by default, not an afterthought
• Transparent Data Encryption
• Selectively audit READ or WRITE events, switch configs without the need to
restart.
16. Ozone for Enterprise
High Availability
• Built-in HA
• Single HA Configuration mode
• Regular HA Configuration mode [3 instances of OM/SCM]
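Why three instances? Assuming the HA mode relies on the Raft consensus mentioned earlier in the deck, a group makes progress only while a majority of its members is reachable. The arithmetic can be sketched as follows; the function names are hypothetical.

```python
def raft_quorum(instances: int) -> int:
    """Smallest number of members that forms a majority."""
    if instances < 1:
        raise ValueError("a group needs at least one instance")
    return instances // 2 + 1


def tolerated_failures(instances: int) -> int:
    """How many members can fail while a majority still remains."""
    return instances - raft_quorum(instances)


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, raft_quorum(n), tolerated_failures(n))
```

Under this assumption a single instance tolerates no failure, three instances tolerate one, and a fourth instance would add no extra tolerance over three.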
18. GENERAL DATA PROTECTION REGULATION (GDPR)
• Law for handling personal data
• Imposes responsibility on Data Controllers
• Enforces Accountability for Compliance
• Grants rights to Data Entity
• European Law: Spills outside of EU in Digital Era
19. STORAGE SYSTEMS & GDPR
• Territorial Scope
• Personal Data
• Right to Erasure (Right to be Forgotten)
• Notification Obligation of the Controller
24. OZONE & GDPR
• GDPR Enabled Bucket
• During Ozone Key creation, generate a Simple Encryption Key (SEK)
• Client writes data to blocks, encoded by the SEK under the hood
• During read, the data is decoded using the same SEK.
• During delete, OM moves the KeyInfo to the Deleted Keys Section.
• The SEK is irrevocably lost; the data cannot be decoded even if the actual blocks are deleted much later
• Notification of Obligation is achieved
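The flow above is a form of what is often called crypto-shredding: once the per-key secret is gone, the stored bytes are useless. Below is a minimal sketch of the idea, not Ozone's implementation. The class `ToyGdprBucket` and its fields are hypothetical, and the cipher is a toy SHA-256 keystream chosen only to keep the example within the standard library; a real system would use a vetted cipher.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with SHA-256(key || counter) blocks.

    Applying it twice with the same key returns the original data.
    Illustration only, not suitable for production use.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + (offset // 32).to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class ToyGdprBucket:
    def __init__(self):
        self.key_table = {}     # key name -> per-key secret (metadata side)
        self.deleted_keys = {}  # key name -> metadata kept without the secret
        self.blocks = {}        # key name -> encoded bytes (datanode side)

    def put(self, name: str, data: bytes) -> None:
        sek = secrets.token_bytes(32)  # generated at key creation
        self.key_table[name] = sek
        self.blocks[name] = keystream_xor(sek, data)

    def get(self, name: str) -> bytes:
        sek = self.key_table[name]     # raises KeyError once deleted
        return keystream_xor(sek, self.blocks[name])

    def delete(self, name: str) -> None:
        # Drop the secret right away; the blocks may be removed much later.
        self.key_table.pop(name)
        self.deleted_keys[name] = {"sek": None}
```

After `delete`, the encoded blocks may still sit on disk, but nothing in the metadata can decode them, which is the property the erasure requirement relies on.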