Apache Ozone
Evolution of HDFS Scalability & built-in GDPR compliance
Hadoop, Ozone & Apache are trademarks of the Apache Software Foundation.
Dinesh Chitlangia, Cloudera
Ajay Kumar, Google
Agenda
• Why, When, What
• Notions, Architecture, Deployment
• Ozone for Enterprise
Ozone
• Ozone – Delete Path
• Ozone & GDPR
GDPR
Q & A
Why
HDFS scalability limits
400M+
Future
Make your HDFS healthy day
Ozone
Object Store for Big Data
• Scale both Objects & IOPS
Set of Micro-services - Divide, Conquer, Scale
Seamless transition for YARN, MapReduce, Hive, Spark apps.
Supports K8s and CSI, and can run on K8s natively.
When
Scale beyond HDFS
Large Data Store / Dedicated Storage Clusters
Cloud-like presence on-prem
First-class citizen on K8s
Notions
Volumes ~ user accounts
Buckets ~ directories (no sub-buckets)
Keys ~ files
HDDS Notions
Containers [Collection of Blocks]
Pipeline
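The namespace notions above can be pictured as a three-level hierarchy. The Python sketch below is purely illustrative: the names `Volume`, `Bucket`, `put_key` and `key_path` are hypothetical and are not Ozone's client API, and the "no slash in a bucket name" rule is only an assumption used to model "no sub-buckets".

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Bucket:
    """A flat collection of keys; a bucket never contains another bucket."""
    name: str
    keys: Dict[str, bytes] = field(default_factory=dict)

    def put_key(self, key: str, data: bytes) -> None:
        # Key names are opaque strings; a "/" inside a key is just a character.
        self.keys[key] = data


@dataclass
class Volume:
    """Top-level unit, roughly analogous to a user account."""
    name: str
    buckets: Dict[str, Bucket] = field(default_factory=dict)

    def create_bucket(self, name: str) -> Bucket:
        if "/" in name:
            # Assumption for this sketch: nesting is rejected outright.
            raise ValueError("buckets cannot be nested")
        return self.buckets.setdefault(name, Bucket(name))


def key_path(volume: Volume, bucket: Bucket, key: str) -> str:
    """Build a readable /volume/bucket/key address."""
    return f"/{volume.name}/{bucket.name}/{key}"


vol = Volume("alice")
logs = vol.create_bucket("logs")
logs.put_key("2019/app.log", b"hello")
print(key_path(vol, logs, "2019/app.log"))
```

The key `2019/app.log` looks like a path, but in this model it is a single flat name inside the `logs` bucket.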
Architecture
Ozone’s Microservices - Divide, Conquer, Scale
• Ozone Manager - namespace [~Namenodes]
• Storage Container Managers - blockspace [~BlockServer]
• Recon Server - Control Plane
• S3 Gateway
• Datanodes
Deployment Variants
Ozone - Write Path
Similar to DFS Write, Blocks are written directly to Datanodes
Ozone - Read Path
Similar to DFS Read, Blocks are read directly from Datanodes
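A toy model may help show what "directly to Datanodes" means on both paths. This is a sketch under assumptions, not Ozone code: `MetadataService` collapses the Ozone Manager and Storage Container Manager into one hypothetical object, placement is simple round-robin, and the block size is tiny so the split is visible.

```python
from typing import Dict, List, Tuple

BLOCK_SIZE = 4  # deliberately tiny; illustrative only


class Datanode:
    """Holds raw block bytes, keyed by block id."""
    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.blocks: Dict[int, bytes] = {}


class MetadataService:
    """Hypothetical stand-in for the metadata side: it hands out block
    locations and remembers them, but never sees the payload bytes."""
    def __init__(self, node_ids: List[str]) -> None:
        self.node_ids = sorted(node_ids)
        self.key_to_blocks: Dict[str, List[Tuple[str, int]]] = {}
        self._next_block_id = 0

    def allocate_block(self, key: str) -> Tuple[str, int]:
        # Round-robin placement; real placement policy is not modelled.
        node_id = self.node_ids[self._next_block_id % len(self.node_ids)]
        location = (node_id, self._next_block_id)
        self._next_block_id += 1
        self.key_to_blocks.setdefault(key, []).append(location)
        return location

    def lookup(self, key: str) -> List[Tuple[str, int]]:
        return self.key_to_blocks[key]


def write_key(meta: MetadataService, cluster: Dict[str, Datanode],
              key: str, data: bytes) -> None:
    for offset in range(0, len(data), BLOCK_SIZE):
        node_id, block_id = meta.allocate_block(key)
        # The client sends the bytes straight to the chosen datanode.
        cluster[node_id].blocks[block_id] = data[offset:offset + BLOCK_SIZE]


def read_key(meta: MetadataService, cluster: Dict[str, Datanode],
             key: str) -> bytes:
    # One metadata lookup, then direct reads from the datanodes.
    return b"".join(cluster[n].blocks[b] for n, b in meta.lookup(key))


cluster = {name: Datanode(name) for name in ("dn1", "dn2", "dn3")}
meta = MetadataService(list(cluster))
write_key(meta, cluster, "vol/bucket/key", b"hello ozone")
print(read_key(meta, cluster, "vol/bucket/key"))
```

The point of the split is that the metadata object only stores `(datanode, block id)` pairs, so its load grows with the number of blocks rather than the number of bytes.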
Using Ozone: Is it as painful as HDFS?
We hear you, and we have to set up Ozone every time we test.
Setup
• Docker
  • docker-compose up -d
  • runs it on local machine
• K8s
  • helm install ozone
• Traditional tarball
  • Untar
  • Run genconfig
  • Update the configurations
Usage
• If you are familiar with HDFS commands
  • dfs -ls hdfs://user
• with Ozone, it will become
  • dfs -ls o3fs://user
• If you are familiar with S3 commands like
  • aws s3 ls -endpoint=us-west1. /bucketName
• with Ozone S3 it becomes
  • aws s3 ls -endpoint=s3g.local. /bucketName
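The HDFS-to-Ozone change in the usage examples amounts to swapping the URI scheme. The helper below is a minimal sketch of that swap; `to_o3fs` is a hypothetical name, and in a real cluster the authority part of the URI would likely change as well, which this sketch does not attempt to model.

```python
from urllib.parse import urlsplit, urlunsplit


def to_o3fs(hdfs_uri: str) -> str:
    """Rewrite an hdfs:// URI to use the o3fs:// scheme, leaving the rest as is."""
    parts = urlsplit(hdfs_uri)
    if parts.scheme != "hdfs":
        raise ValueError(f"expected an hdfs:// URI, got {hdfs_uri!r}")
    return urlunsplit(parts._replace(scheme="o3fs"))


print(to_o3fs("hdfs://user"))
print(to_o3fs("hdfs://user/data/part-0"))
```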
Ozone for Enterprise
Scale
Consistency
Security
Ozone for Enterprise
• 10 Billion Keys will be supported in first official release
• Scale OM/SCM independently, without any disruption
• Evenly distribute metadata across the cluster including Datanodes
• Raft consensus protocol via Apache Ratis
• Tested with industry recognized off-the-shelf components
• Blockade Tests - Tests to inject errors/failures in the clusters
• Tested Apache Spark, YARN, Hive workloads
• K8s based clusters, long running clusters, ephemeral clusters
• Freon - custom load generator
Ozone for Enterprise
Simplified Security
• Similar to HDFS, relies on Kerberos / Delegation Token / Block Token
• SCM comes with its own Certificate Authority and users DO NOT need to know about it.
• Kerberos is only needed for OM/SCM, not for datanodes
• Security is on by default, not an afterthought
• Transparent Data Encryption
• Selectively audit READ or WRITE events, switch configs without the need to restart.
Ozone for Enterprise
High Availability
• Built-in HA
• Single HA Configuration mode
• Regular HA Configuration mode [3 instances of OM/SCM]
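The three-instance figure fits the usual arithmetic of Raft-style consensus, where progress requires a majority of the group. The sketch below assumes an OM/SCM HA group behaves like a standard Raft group; the function names are illustrative only.

```python
def majority(instances: int) -> int:
    """Smallest number of instances that forms a majority."""
    return instances // 2 + 1


def tolerated_failures(instances: int) -> int:
    """How many instances can be down while a majority is still reachable."""
    return instances - majority(instances)


for n in (1, 3, 5):
    print(n, majority(n), tolerated_failures(n))
```

Under this assumption, three instances keep working with one instance down, while a single instance tolerates no failure at all.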
ENFORCEMENTTRACKER.COM
British Airways £183.39M
Marriott International £100M
Swedish School for facial tracking
Dutch Hospital for unsecured patient data
GENERAL DATA PROTECTION REGULATION (GDPR)
• Law for handling personal data
• Imposes responsibility on Data Controllers
• Enforces Accountability for Compliance
• Grants rights to Data Entity
• European Law: Spills outside of EU in Digital Era
STORAGE SYSTEMS & GDPR
Territorial Scope
Personal Data
Right to Erasure (Right to be Forgotten)
Notification Obligation of the Controller
Delete Path - Overview
Delete Path – Under the hood
OZONE & GDPR
• GDPR Enabled Bucket
• During Ozone Key creation, generate Simple Encryption Key (SEK)
• Client writes data to blocks, encoded by SEK under the hood
• During read, the data is decoded using the same SEK.
• During delete, OM moves the KeyInfo to Deleted Keys Section.
• SEK is irrevocably lost; data cannot be decoded even if the actual blocks are deleted much later
• Notification of Obligation is achieved
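The steps above follow a pattern sometimes called crypto-shredding: once the per-key secret is gone, leftover blocks are unreadable. The sketch below illustrates the idea with the standard library only. It is not Ozone's implementation: `ToyGdprBucket` and its methods are hypothetical, the SHA-256 keystream XOR is a teaching stand-in and NOT a secure cipher, and it assumes the SEK is dropped at the moment the key entry moves to the deleted section.

```python
import hashlib
import secrets
from typing import Dict, List


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 based keystream. Toy cipher, illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class ToyGdprBucket:
    def __init__(self) -> None:
        self.key_info: Dict[str, bytes] = {}   # key name -> SEK (metadata)
        self.deleted_keys: List[str] = []      # names awaiting block cleanup
        self.blocks: Dict[str, bytes] = {}     # encoded bytes on "datanodes"

    def put(self, name: str, data: bytes) -> None:
        sek = secrets.token_bytes(32)          # fresh secret per key
        self.key_info[name] = sek
        self.blocks[name] = _keystream_xor(sek, data)

    def get(self, name: str) -> bytes:
        # XOR with the same keystream decodes the data.
        return _keystream_xor(self.key_info[name], self.blocks[name])

    def delete(self, name: str) -> None:
        # The SEK is discarded here; the blocks are left for later cleanup.
        del self.key_info[name]
        self.deleted_keys.append(name)


bucket = ToyGdprBucket()
record = b"name=Jane Doe; email=jane@example.com"
bucket.put("user-42", record)
assert bucket.get("user-42") == record
bucket.delete("user-42")
print("blocks still on disk:", "user-42" in bucket.blocks)
```

After `delete`, the encoded bytes are still present in `blocks`, but nothing in the model retains the secret needed to decode them.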
OZONE & GDPR - Limitations
• Backups & Restore
• Rapid Key Create/Delete cycles – false positives
• Existing Buckets need manual copy
Road ahead
• Network Topology
• HA Support
• Disk Scanner
• In-place upgrades for HDFS Clusters
• Erasure Coding
• Consistent Reads from Standby OM/SCM
• Stability & Scale testing
  • TPC-DS, Chaos Monkey, Scale testing with Partners
Interested in Ozone?
https://hadoop.apache.org/ozone/
https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Road+Map
Q & A
THANK YOU
