SlideShare a Scribd company logo
1 © Hortonworks Inc. 2011–2018. All rights reserved
HDFS Scalability and Evolution: HDDS and
Ozone
Sanjay Radia,
Founder, Chief Architect, Hortonworks
Anu Engineer
HDFS Engineering, Hortonworks
2 © Hortonworks Inc. 2011–2018. All rights reserved
About the Speakers
Sanjay Radia
• Chief Architect, Founder, Hortonworks
• Apache Hadoop PMC and Committer
Part of the original Hadoop team at
Yahoo! since 2007
• Chief Architect of Hadoop Core at Yahoo!
Prior
• Data center automation, virtualization, Java,
HA, OSs, File Systems
• Startup, Sun Microsystems, INRIA…
• Ph.D., University of Waterloo
Anu Engineer
• Senior Member of HDFS Engineering
at Hortonworks
• Lead developer of Ozone/HDDS
• Apache Hadoop PMC and Committer
Prior
• Part of the founding team that created
Windows Azure
Page 2Architecting the Future of Big Data
3 © Hortonworks Inc. 2011–2018. All rights reserved
HDFS – What it does well and not so well
HDFS does well
• Scaling – IO + PBs + clients
• Horizontal scaling – IO + PBs
• Fast IO – scans and writes
• Number of concurrent clients 60K++
• Low latency metadata operations
• Fault tolerant storage layer
• Locality
• Replicas/Reliability and parallelism
• Layering – Namespace layer and storage
layer
• Security
But scaling Namespace is limited
to 500M files (192G Heap)
• Scaling Namespace – 500M FILES
• Scaling Block space
• Scaling Block reports
• Scaling DN’s block management
• Need further scaling of client/RPC 150K++
Ironically, Namespace in mem
is strength and weakness
4 © Hortonworks Inc. 2011–2018. All rights reserved
Proof points OF Scaling Data, IO, Clients/RPC
• Proof points of large data and large clusters
• Single Organizations have over 600PB in HDFS
• Single clusters with over 200PB using federation
• Large clusters over 4K multi-core nodes bombarding a single NN
• Federation is the currents caling solution (both Namespace & Operations)
• In deployment at Twitter, Yahoo, FB, and elsewhere
Metadata in memory the strength of the original GFS and HDFS design
But also its weakness in scaling number of files and blocks
5 © Hortonworks Inc. 2011–2018. All rights reserved
Scaling HDFS
- with HDDS and Ozone
6 © Hortonworks Inc. 2011–2018. All rights reserved
HDFS Layering
DN 1 DN 2 DN m
.. .. ..
NS1
...
NS k
Block Management Layer
Block Pool kBlock Poo 1
NN-1 NN-k
Common Storage
BlockStorageNamespace
7 © Hortonworks Inc. 2011–2018. All rights reserved
Solutions to Scaling Files, Blocks, Clients/RPC
Scale Namespace
• Hierarchical file system
– Cache only workingSet of
namespace in memory
– Partition:
- Distributed namespace (transparent
automatic partitioning)
- Volumes (static partitioning)
• Flat Key-Value store
– Cache only workingSet of
namespace in memory
– Partition/Shard the space (easy to
hash)
Scale Metadata Clients/RPC
• Multi-thread namespace manager
• Partitioning/Sharding
Slow NN startup
• Cache only workingSet in mem
• Shard/partition namespace
Scale Block Management
• Containers of blocks (2GB-16GB+)
• Will significantly reduce BlockMap
• Reduce Number of Block/Container
reports
8 © Hortonworks Inc. 2011–2018. All rights reserved
Scaling HDFS
Must Scale both the Namespace and the Block Layer
• Scaling one is not sufficient
Scalable Block layer: Hadoop Distributed Data Storage (HDDS)
• Containers of blocks
• Replicated as a group
• Reduces Block Map
Scale Namespace: Several approaches (not exclusive)
• Partial namespace in memory
• Shard namespace
• Use flat namespace (KV namespace) – easier to implement and scale – Ozone
9 © Hortonworks Inc. 2011–2018. All rights reserved
Scale Storage Layer:
Container of Blocks
HDDS
Flat KV
Namespace:
Ozone
New
Hdfs
OzoneFS:
Hadoop
Compatible
FS
Hierarchical
Namespace:
New Scalable
NN
Evolution towards new HDFS
10 © Hortonworks Inc. 2011–2018. All rights reserved
How it all Fits Together
Old HDFS NN
All namespace in
memory
Storage&IONamespace
HDFS Block storage on DataNodes
(Bid -> Data)
Physical Storage - Shared DataNodes and physical
storage shared between
Old HDFS and HDDS
Block Reports
BlockMap
(Bid ->IPAddress of DN
File = Bid[]
Ozone Master
K-V Flat
Namespace
File (Object) = Bid[]
Bid = Cid+ LocalId
New HDFS NN
(scalable)
Hierarchical
Namespace
File = Bid[]
Bid = Cid+ LocalId
Container Management
& Cluster Membership
HDDS Container Storage on DataNodes
(Bid -> Data, but blocks grouped in containers)
HDDS
HDDS – Clean
Separation of
Block layer
DataNodes
ContainerMap
(CId ->IPAddress of DNContainer Reports
NewExisting HDFS
11 © Hortonworks Inc. 2011–2018. All rights reserved
Ozone FS
Ozone/HDDS can be used separately, or also with HDFS
• Initially HDFS is the default FS
• Has many features
• so cannot be replaced by OzoneFS on day one
• Ozone FS sits on side as additional namespace, sharing DNs
• For applications work with Hadoop Compatible FS
on K-V Store – Hive, Spark …
• How is Ozone FS accessed?
• Use direct URIs for either HDFS or OzoneFS
• Mount in HDFS or in ViewFS
HDFS
Default
FS
12 © Hortonworks Inc. 2011–2018. All rights reserved
Scalable Block Layer:
Hadoop Distributed Data Storage (HDDS)
Container: Containers of blocks (2GB-16GB+)
• Replicated as a group
• Each Container has a unique ContainerId
– Every block within a container has a block id
- BlockId = ContainerId, LocalId
CM – Container manager
• Cluster membership
• Receives container reports from DNs
• Manages container replication
• Maintained Container Map (Cid->IPAddr)
Data Nodes – HDFS and HDDS can share DNs
• DataNodes contain a set of containers (just like
they used to contain blocks)
• DataNodes send Container-reports (like block
reports) to CM (Container Manager)
Block Pools
• Just like blocks were in block pools, containers are
also in container pools
– This allow independent namespaces to carve out their
block space
HDDS: Separate layer from namespace layer (strictly separate, not almost)
13 © Hortonworks Inc. 2011–2018. All rights reserved
HDDS+Ozone – Addressing some of the limitations of HDFS
• Scale Block Management
• Containers of block (2 GB to 16GB)
• 2-4gb block containers initially => 40-80x
reduction in BR and CM block map
• Reduce BR on DNs, Masters, Network
• Scale Namespace
• Key Space Manager caches only working set in
memory
• Future scaling:
• Flat namespace is easy to shard (Bucket are
natural sharding points)
• Scale Num of Metadata Clients/Rpc
• No single global lock like NN
• Metadata operations are simpler
• Sharding will help further
 Fault Tolerance
– Blocks – inherits HDFS’s block-layer FT
– Namespace – uses Raft rather then Journal Nodes
•HA Easier
 Manageability
– GC/Overloaded Master is not longer an issue
• caches working set
– Journal nodes disappear – Raft is used
– Faster and more predictable failover
– Fast start up
• Faster upgrades
• Faster failover
14 © Hortonworks Inc. 2011–2018. All rights reserved
Will OzoneFS’s Key-Value Store work with Hadoop Apps
• Two years ago – NO!
• Today - Yes!
• Hive, Spark and others are making sure they work on Cloud K-V Object Stores via HCFS
• Even customers are ensuring that their apps work on Cloud K-V Object Stores via HCFS
• Lack of real directories and their ACLs: Fake directories + Buckets ACLs
• Lack of eventual consistency in S3 is being worked around – S3Gaurd (Note: OzoneFS is consistent)
• Lack of rename in S3 is being worked around
• Various direct output committers (early versions had issues)
• Netflix Direct Commiter; being replaced by Iceberg
• Via Metastore (Databricks has proprietary version, Hive’s approach)
15 © Hortonworks Inc. 2011–2018. All rights reserved
Details of HDDS
16 © Hortonworks Inc. 2011–2018. All rights reserved
Container Structure (Using RocksDB)
Container
Index
Chunk
data file
Chunk data
file
Chunk data
file
Chunk data
file
Key 1
LSM
LevelDB/RocksDB
Key N
Chunk Data
File Name
Offset Length
 An embedded LSM/KVStore (RocksDB)
 BlockId is the key,
– filename of local chunk file is value
 Optimizations
– Small blocks (< 1MB) can be stored directly in
rocksDB
– Compaction for block data to avoid lots of files
• But this can be evolved over time
17 © Hortonworks Inc. 2011–2018. All rights reserved
Replication of Container
• Use RAFT replication instead of data pipeline, for both data and metadata
• Proven to be correct
• Traditionally Raft used for small updates and transactions, fits well for metadata
• Performance considerations
• When writing the meta data into raft-journal, put the data directly in container
storage
• Raft-journal in separate disk – fast contagious writes without seeking
• Data spread across the other disks
• Client uses Raft protocol to write data to the DNs storing the container
Page 17
18 © Hortonworks Inc. 2011–2018. All rights reserved
Open and Closed Containers
Open – active writers
• Need at least( NumSpindles * Data nodes) open active containers
• Clients can get locality on writes
• Data is spread across all data nodes
• Improved IO and better chance of getting locality
• Keep DNs and ALL spindles busy
Closed – typically when full or had a failure in the past
• Why close a container on failures
• We originally considered keeping it open and bringing in a new DN
• Wait for the data to copy?
• Decided to close it, and have it replicated
• Can open later or can merge with other closed container – under design
Page 18
19 © Hortonworks Inc. 2011–2018. All rights reserved
Details of Ozone
20 © Hortonworks Inc. 2011–2018. All rights reserved
Ozone Master
DN1 DN2 DNn
Ozone Master
K-V
Namespace
File (Object) = Bid[]
Bid = Cid+ LocalId
CM
ContainerMap
(CId ->IPAddress of DN
Client
RocksDB
bId[]= Open(Key,..)
GetBlockLocations(Bid)
$$$
$$$ - Container Map Cache
$$$
Read, Write, …
21 © Hortonworks Inc. 2011–2018. All rights reserved
Ozone APIs
• Key: /VolumeName/BucketId/ObjectKey e.g /Home/John/foo/bar/zoo)
• ACLs at Volume and Bucket level (the other dirs are fake)
• Future sharding at bucket level
• => Ozone is Consistent (unlike S3)
Ozone Object API (RPC)
S3 Connector
Hadoop FileSystem and Hadoop
FileContext Connectors
22 © Hortonworks Inc. 2011–2018. All rights reserved
Where does the Ozone Master run?
Which Node?
• On a separate node with large enough memory for caching the working set
• Caching the working set is important for large number of concurrent clients
• This option would give predictable performance for large clusters
• On the Datanodes
• How much memory for caching,
• Note: tasks and other services run on DN since they are typically also compute nodes
Where is Storage for the Ozone KV Metadata?
• Local disk
• If on DN then is it dedicated disk or shared with DN?
• Use the container storage (Its using RocksDB anyway)
• Spread Ozone volumes across containers to gain performance,
• but this maylimit volume size & force more Ozone volumes than Admin wants
23 © Hortonworks Inc. 2011–2018. All rights reserved
Quadra – Lun-like Raw-Block Storage
Used for creating mountable disk FS volume
24 © Hortonworks Inc. 2011–2018. All rights reserved
Quadra: Raw-Block Storage Volume (Lun)
Lun-like storage service where the blocks are stored on HDDS
• Volume: A raw-block device that can be used to create a mountable disk on Linux.
• Raw-Blocks - those of the native FS that will use the Lun Volume
• Raw-block size is dictated by the native fs like ext4 (4K)
• Raw-Blocks are unit of IO operations by native file systems.
• Raw-Block is the unit of read/write/update to HDDS
• Ozone and Quadra share HDDS as a common storage backend
• Current prototype: 1 raw-block = 1 HDDS block (but this will change later)
Can be used in Kubernetes for container state
25 © Hortonworks Inc. 2011–2018. All rights reserved
Quadra
SCM
DN DNDN DN
Containers
Host
Kern
el
User
SCSI Initiator
Linux
Distro
Quadra
Volume
Manager
Volume
API
JSCSI
Ref: http://jscsi.org/
Quadra
Plugin
HDDS : Ozone’s storage layer
26 © Hortonworks Inc. 2011–2018. All rights reserved
Quadra Volume Manager
• Quadra volumes are virtual block devices that are mounted on hosts over
SCSI
• Volume Manager tracks the name of volumes and storage containers
assigned to them
• Volume Manager talks to CM to get containers allocated for a volume
• Volume Manager assigns leases for clients to mount a volume
• Volume Manager state is persisted and replicated via RAFT
27 © Hortonworks Inc. 2011–2018. All rights reserved
Status
HDDS: Block container
• 2-4gb block containers initially
– Reduction of 40-80 in BR and block map
– Reduce BR pressure in on NN/OzoneMaster
• Initial version to scale to 10s billions of blocks
Ozone Master
• Implemented using RocksDB (just like the HDDS in DNs)
• Initial version to scale to 10 billion objects
Current Status and Steps to GA
• Stabilize HDDS and Ozone
• Measure and improve performance
• Add HA for Ozone Master and Container Manager
• Add security – Security design completed and published
After GA
• Further stabilization and performance improvements
• Transparent encryption
• Erasure codes
• Snapshots (or their equivalent)
• ..
28 © Hortonworks Inc. 2011–2018. All rights reserved
Summary
• HDFS scale proven in real production systems
• 4K+ clusters
• Raw Storage >200PB in single federated NN cluster and >30PB in non-federated clusters
• Scales to 60K+ concurrent clients bombarding the NN
• But very large number of small files is a challenge (500M files)
• HDDS + Ozone: Scalable Hadoop Storage
• Retains
• HDFS block storage Fault-tolerance
• HDFS Horizonal scaling for Storage, IO
• HDFS’s move computation to Storage
• HDDS: Block containers:
• Initially scale to 10B blocks, later to 100B+ blocks (HDFS-7240)
• Ozone – Flat KV namespace + Hadoop Compatible FS (OzoneFS)
• initially scale to 10B files (HDFS-13074)
• Community working on a Hierarchal Namespace on HDDS (HDFS-10419)
29 © Hortonworks Inc. 2011–2018. All rights reserved
Thank You
Q&A

More Related Content

What's hot

HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance Improvement
Biju Nair
 
Apache HBase Performance Tuning
Apache HBase Performance TuningApache HBase Performance Tuning
Apache HBase Performance Tuning
Lars Hofhansl
 
Cloudera Impala Internals
Cloudera Impala InternalsCloudera Impala Internals
Cloudera Impala Internals
David Groozman
 
Hive: Loading Data
Hive: Loading DataHive: Loading Data
Hive: Loading Data
Benjamin Leonhardi
 
Using Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series WorkloadsUsing Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series Workloads
Jeff Jirsa
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive


Cloudera, Inc.
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
colorant
 
Ozone- Object store for Apache Hadoop
Ozone- Object store for Apache HadoopOzone- Object store for Apache Hadoop
Ozone- Object store for Apache Hadoop
Hortonworks
 
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
Odinot Stanislas
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
Flink Forward
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
Jiangjie Qin
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
Alexey Grishchenko
 
Performance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla ClusterPerformance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla Cluster
ScyllaDB
 
HDFS: Optimization, Stabilization and Supportability
HDFS: Optimization, Stabilization and SupportabilityHDFS: Optimization, Stabilization and Supportability
HDFS: Optimization, Stabilization and Supportability
DataWorks Summit/Hadoop Summit
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
DataWorks Summit
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
Yoshinori Matsunobu
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
DataWorks Summit
 
Optimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkOptimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache Spark
Databricks
 
Delta Lake: Optimizing Merge
Delta Lake: Optimizing MergeDelta Lake: Optimizing Merge
Delta Lake: Optimizing Merge
Databricks
 
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architecture
Sohil Jain
 

What's hot (20)

HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance Improvement
 
Apache HBase Performance Tuning
Apache HBase Performance TuningApache HBase Performance Tuning
Apache HBase Performance Tuning
 
Cloudera Impala Internals
Cloudera Impala InternalsCloudera Impala Internals
Cloudera Impala Internals
 
Hive: Loading Data
Hive: Loading DataHive: Loading Data
Hive: Loading Data
 
Using Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series WorkloadsUsing Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series Workloads
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive


 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
 
Ozone- Object store for Apache Hadoop
Ozone- Object store for Apache HadoopOzone- Object store for Apache Hadoop
Ozone- Object store for Apache Hadoop
 
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
Performance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla ClusterPerformance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla Cluster
 
HDFS: Optimization, Stabilization and Supportability
HDFS: Optimization, Stabilization and SupportabilityHDFS: Optimization, Stabilization and Supportability
HDFS: Optimization, Stabilization and Supportability
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
 
Optimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkOptimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache Spark
 
Delta Lake: Optimizing Merge
Delta Lake: Optimizing MergeDelta Lake: Optimizing Merge
Delta Lake: Optimizing Merge
 
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architecture
 

Similar to Ozone and HDFS’s evolution

Ozone and HDFS's Evolution
Ozone and HDFS's EvolutionOzone and HDFS's Evolution
Ozone and HDFS's Evolution
DataWorks Summit
 
Ozone and HDFS’s evolution
Ozone and HDFS’s evolutionOzone and HDFS’s evolution
Ozone and HDFS’s evolution
DataWorks Summit
 
Evolving HDFS to a Generalized Storage Subsystem
Evolving HDFS to a Generalized Storage SubsystemEvolving HDFS to a Generalized Storage Subsystem
Evolving HDFS to a Generalized Storage Subsystem
DataWorks Summit/Hadoop Summit
 
Evolving HDFS to a Generalized Distributed Storage Subsystem
Evolving HDFS to a Generalized Distributed Storage SubsystemEvolving HDFS to a Generalized Distributed Storage Subsystem
Evolving HDFS to a Generalized Distributed Storage Subsystem
DataWorks Summit/Hadoop Summit
 
Evolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage SubsystemEvolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage Subsystem
DataWorks Summit/Hadoop Summit
 
Ozone: scaling HDFS to trillions of objects
Ozone: scaling HDFS to trillions of objectsOzone: scaling HDFS to trillions of objects
Ozone: scaling HDFS to trillions of objects
DataWorks Summit
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
DataWorks Summit/Hadoop Summit
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
DataWorks Summit
 
Hadoop 3 in a Nutshell
Hadoop 3 in a NutshellHadoop 3 in a Nutshell
Hadoop 3 in a Nutshell
DataWorks Summit/Hadoop Summit
 
HDFS- What is New and Future
HDFS- What is New and FutureHDFS- What is New and Future
HDFS- What is New and Future
DataWorks Summit
 
Apache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateApache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community Update
DataWorks Summit
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...
DataWorks Summit
 
HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBase
HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBaseHBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBase
HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBase
Cloudera, Inc.
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Community
 
Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2
hdhappy001
 
Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5
Chris Nauroth
 
HDFS Erasure Code Storage - Same Reliability at Better Storage Efficiency
HDFS Erasure Code Storage - Same Reliability at Better Storage EfficiencyHDFS Erasure Code Storage - Same Reliability at Better Storage Efficiency
HDFS Erasure Code Storage - Same Reliability at Better Storage Efficiency
DataWorks Summit
 
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
DataWorks Summit
 
Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Apache hadoop: POSH Meetup Palo Alto, CA April 2014Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Kevin Crocker
 
Democratizing Memory Storage
Democratizing Memory StorageDemocratizing Memory Storage
Democratizing Memory Storage
DataWorks Summit
 

Similar to Ozone and HDFS’s evolution (20)

Ozone and HDFS's Evolution
Ozone and HDFS's EvolutionOzone and HDFS's Evolution
Ozone and HDFS's Evolution
 
Ozone and HDFS’s evolution
Ozone and HDFS’s evolutionOzone and HDFS’s evolution
Ozone and HDFS’s evolution
 
Evolving HDFS to a Generalized Storage Subsystem
Evolving HDFS to a Generalized Storage SubsystemEvolving HDFS to a Generalized Storage Subsystem
Evolving HDFS to a Generalized Storage Subsystem
 
Evolving HDFS to a Generalized Distributed Storage Subsystem
Evolving HDFS to a Generalized Distributed Storage SubsystemEvolving HDFS to a Generalized Distributed Storage Subsystem
Evolving HDFS to a Generalized Distributed Storage Subsystem
 
Evolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage SubsystemEvolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage Subsystem
 
Ozone: scaling HDFS to trillions of objects
Ozone: scaling HDFS to trillions of objectsOzone: scaling HDFS to trillions of objects
Ozone: scaling HDFS to trillions of objects
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
 
Hadoop 3 in a Nutshell
Hadoop 3 in a NutshellHadoop 3 in a Nutshell
Hadoop 3 in a Nutshell
 
HDFS- What is New and Future
HDFS- What is New and FutureHDFS- What is New and Future
HDFS- What is New and Future
 
Apache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateApache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community Update
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...
 
HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBase
HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBaseHBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBase
HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBase
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
 
Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2
 
Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5
 
HDFS Erasure Code Storage - Same Reliability at Better Storage Efficiency
HDFS Erasure Code Storage - Same Reliability at Better Storage EfficiencyHDFS Erasure Code Storage - Same Reliability at Better Storage Efficiency
HDFS Erasure Code Storage - Same Reliability at Better Storage Efficiency
 
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
 
Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Apache hadoop: POSH Meetup Palo Alto, CA April 2014Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Apache hadoop: POSH Meetup Palo Alto, CA April 2014
 
Democratizing Memory Storage
Democratizing Memory StorageDemocratizing Memory Storage
Democratizing Memory Storage
 

More from DataWorks Summit

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
DataWorks Summit
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
DataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
DataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
DataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
DataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
DataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
DataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 

Recently uploaded (20)

Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 

Ozone and HDFS’s evolution

  • 1. 1 © Hortonworks Inc. 2011–2018. All rights reserved HDFS Scalability and Evolution: HDDS and Ozone Sanjay Radia, Founder, Chief Architect, Hortonworks Anu Engineer HDFS Engineering, Hortonworks
  • 2. 2 © Hortonworks Inc. 2011–2018. All rights reserved About the Speakers Sanjay Radia • Chief Architect, Founder, Hortonworks • Apache Hadoop PMC and Committer Part of the original Hadoop team at Yahoo! since 2007 • Chief Architect of Hadoop Core at Yahoo! Prior • Data center automation, virtualization, Java, HA, OSs, File Systems • Startup, Sun Microsystems, INRIA… • Ph.D., University of Waterloo Anu Engineer • Senior Member of HDFS Engineering at Hortonworks • Lead developer of Ozone/HDDS • Apache Hadoop PMC and Committer Prior • Part of the founding team that created Windows Azure Page 2Architecting the Future of Big Data
  • 3. 3 © Hortonworks Inc. 2011–2018. All rights reserved HDFS – What it does well and not so well HDFS does well • Scaling – IO + PBs + clients • Horizontal scaling – IO + PBs • Fast IO – scans and writes • Number of concurrent clients 60K++ • Low latency metadata operations • Fault tolerant storage layer • Locality • Replicas/Reliability and parallelism • Layering – Namespace layer and storage layer • Security But scaling Namespace is limited to 500M files (192G Heap) • Scaling Namespace – 500M FILES • Scaling Block space • Scaling Block reports • Scaling DN’s block management • Need further scaling of client/RPC 150K++ Ironically, Namespace in mem is strength and weakness
  • 4. 4 © Hortonworks Inc. 2011–2018. All rights reserved Proof points OF Scaling Data, IO, Clients/RPC • Proof points of large data and large clusters • Single Organizations have over 600PB in HDFS • Single clusters with over 200PB using federation • Large clusters over 4K multi-core nodes bombarding a single NN • Federation is the currents caling solution (both Namespace & Operations) • In deployment at Twitter, Yahoo, FB, and elsewhere Metadata in memory the strength of the original GFS and HDFS design But also its weakness in scaling number of files and blocks
  • 5. 5 © Hortonworks Inc. 2011–2018. All rights reserved Scaling HDFS - with HDDS and Ozone
  • 6. 6 © Hortonworks Inc. 2011–2018. All rights reserved HDFS Layering DN 1 DN 2 DN m .. .. .. NS1 ... NS k Block Management Layer Block Pool kBlock Poo 1 NN-1 NN-k Common Storage BlockStorageNamespace
  • 7. 7 © Hortonworks Inc. 2011–2018. All rights reserved Solutions to Scaling Files, Blocks, Clients/RPC Scale Namespace • Hierarchical file system – Cache only workingSet of namespace in memory – Partition: - Distributed namespace (transparent automatic partitioning) - Volumes (static partitioning) • Flat Key-Value store – Cache only workingSet of namespace in memory – Partition/Shard the space (easy to hash) Scale Metadata Clients/RPC • Multi-thread namespace manager • Partitioning/Sharding Slow NN startup • Cache only workingSet in mem • Shard/partition namespace Scale Block Management • Containers of blocks (2GB-16GB+) • Will significantly reduce BlockMap • Reduce Number of Block/Container reports
  • 8. 8 © Hortonworks Inc. 2011–2018. All rights reserved Scaling HDFS Must Scale both the Namespace and the Block Layer • Scaling one is not sufficient Scalable Block layer: Hadoop Distributed Data Storage (HDDS) • Containers of blocks • Replicated as a group • Reduces Block Map Scale Namespace: Several approaches (not exclusive) • Partial namespace in memory • Shard namespace • Use flat namespace (KV namespace) – easier to implement and scale – Ozone
  • 9. 9 © Hortonworks Inc. 2011–2018. All rights reserved Scale Storage Layer: Container of Blocks HDDS Flat KV Namespace: Ozone New Hdfs OzoneFS: Hadoop Compatible FS Hierarchical Namespace: New Scalable NN Evolution towards new HDFS
  • 10. 10 © Hortonworks Inc. 2011–2018. All rights reserved How it all Fits Together Old HDFS NN All namespace in memory Storage&IONamespace HDFS Block storage on DataNodes (Bid -> Data) Physical Storage - Shared DataNodes and physical storage shared between Old HDFS and HDDS Block Reports BlockMap (Bid ->IPAddress of DN File = Bid[] Ozone Master K-V Flat Namespace File (Object) = Bid[] Bid = Cid+ LocalId New HDFS NN (scalable) Hierarchical Namespace File = Bid[] Bid = Cid+ LocalId Container Management & Cluster Membership HDDS Container Storage on DataNodes (Bid -> Data, but blocks grouped in containers) HDDS HDDS – Clean Separation of Block layer DataNodes ContainerMap (CId ->IPAddress of DNContainer Reports NewExisting HDFS
  • 11. 11 © Hortonworks Inc. 2011–2018. All rights reserved Ozone FS Ozone/HDDS can be used separately, or also with HDFS • Initially HDFS is the default FS • Has many features • so cannot be replaced by OzoneFS on day one • Ozone FS sits on side as additional namespace, sharing DNs • For applications work with Hadoop Compatible FS on K-V Store – Hive, Spark … • How is Ozone FS accessed? • Use direct URIs for either HDFS or OzoneFS • Mount in HDFS or in ViewFS HDFS Default FS
  • 12. 12 © Hortonworks Inc. 2011–2018. All rights reserved Scalable Block Layer: Hadoop Distributed Data Storage (HDDS) Container: Containers of blocks (2GB-16GB+) • Replicated as a group • Each Container has a unique ContainerId – Every block within a container has a block id - BlockId = ContainerId, LocalId CM – Container manager • Cluster membership • Receives container reports from DNs • Manages container replication • Maintained Container Map (Cid->IPAddr) Data Nodes – HDFS and HDDS can share DNs • DataNodes contain a set of containers (just like they used to contain blocks) • DataNodes send Container-reports (like block reports) to CM (Container Manager) Block Pools • Just like blocks were in block pools, containers are also in container pools – This allow independent namespaces to carve out their block space HDDS: Separate layer from namespace layer (strictly separate, not almost)
  • 13. 13 © Hortonworks Inc. 2011–2018. All rights reserved HDDS+Ozone – Addressing some of the limitations of HDFS • Scale Block Management • Containers of block (2 GB to 16GB) • 2-4gb block containers initially => 40-80x reduction in BR and CM block map • Reduce BR on DNs, Masters, Network • Scale Namespace • Key Space Manager caches only working set in memory • Future scaling: • Flat namespace is easy to shard (Bucket are natural sharding points) • Scale Num of Metadata Clients/Rpc • No single global lock like NN • Metadata operations are simpler • Sharding will help further  Fault Tolerance – Blocks – inherits HDFS’s block-layer FT – Namespace – uses Raft rather then Journal Nodes •HA Easier  Manageability – GC/Overloaded Master is not longer an issue • caches working set – Journal nodes disappear – Raft is used – Faster and more predictable failover – Fast start up • Faster upgrades • Faster failover
  • 14. 14 © Hortonworks Inc. 2011–2018. All rights reserved Will OzoneFS’s Key-Value Store work with Hadoop Apps • Two years ago – NO! • Today - Yes! • Hive, Spark and others are making sure they work on Cloud K-V Object Stores via HCFS • Even customers are ensuring that their apps work on Cloud K-V Object Stores via HCFS • Lack of real directories and their ACLs: Fake directories + Buckets ACLs • Lack of eventual consistency in S3 is being worked around – S3Gaurd (Note: OzoneFS is consistent) • Lack of rename in S3 is being worked around • Various direct output committers (early versions had issues) • Netflix Direct Commiter; being replaced by Iceberg • Via Metastore (Databricks has proprietary version, Hive’s approach)
  • 15. 15 © Hortonworks Inc. 2011–2018. All rights reserved Details of HDDS
  • 16. 16 © Hortonworks Inc. 2011–2018. All rights reserved Container Structure (Using RocksDB) Container Index Chunk data file Chunk data file Chunk data file Chunk data file Key 1 LSM LevelDB/RocksDB Key N Chunk Data File Name Offset Length  An embedded LSM/KVStore (RocksDB)  BlockId is the key, – filename of local chunk file is value  Optimizations – Small blocks (< 1MB) can be stored directly in rocksDB – Compaction for block data to avoid lots of files • But this can be evolved over time
  • 17. 17 © Hortonworks Inc. 2011–2018. All rights reserved Replication of Container • Use RAFT replication instead of data pipeline, for both data and metadata • Proven to be correct • Traditionally Raft used for small updates and transactions, fits well for metadata • Performance considerations • When writing the meta data into raft-journal, put the data directly in container storage • Raft-journal in separate disk – fast contagious writes without seeking • Data spread across the other disks • Client uses Raft protocol to write data to the DNs storing the container Page 17
  • 18. 18 © Hortonworks Inc. 2011–2018. All rights reserved Open and Closed Containers Open – active writers • Need at least( NumSpindles * Data nodes) open active containers • Clients can get locality on writes • Data is spread across all data nodes • Improved IO and better chance of getting locality • Keep DNs and ALL spindles busy Closed – typically when full or had a failure in the past • Why close a container on failures • We originally considered keeping it open and bringing in a new DN • Wait for the data to copy? • Decided to close it, and have it replicated • Can open later or can merge with other closed container – under design Page 18
  • 19. 19 © Hortonworks Inc. 2011–2018. All rights reserved Details of Ozone
  • 20. 20 © Hortonworks Inc. 2011–2018. All rights reserved Ozone Master DN1 DN2 DNn Ozone Master K-V Namespace File (Object) = Bid[] Bid = Cid+ LocalId CM ContainerMap (CId ->IPAddress of DN Client RocksDB bId[]= Open(Key,..) GetBlockLocations(Bid) $$$ $$$ - Container Map Cache $$$ Read, Write, …
  • 21. 21 © Hortonworks Inc. 2011–2018. All rights reserved Ozone APIs • Key: /VolumeName/BucketId/ObjectKey e.g /Home/John/foo/bar/zoo) • ACLs at Volume and Bucket level (the other dirs are fake) • Future sharding at bucket level • => Ozone is Consistent (unlike S3) Ozone Object API (RPC) S3 Connector Hadoop FileSystem and Hadoop FileContext Connectors
  • 22. 22 © Hortonworks Inc. 2011–2018. All rights reserved Where does the Ozone Master run? Which Node? • On a separate node with large enough memory for caching the working set • Caching the working set is important for large number of concurrent clients • This option would give predictable performance for large clusters • On the Datanodes • How much memory for caching, • Note: tasks and other services run on DN since they are typically also compute nodes Where is Storage for the Ozone KV Metadata? • Local disk • If on DN then is it dedicated disk or shared with DN? • Use the container storage (Its using RocksDB anyway) • Spread Ozone volumes across containers to gain performance, • but this maylimit volume size & force more Ozone volumes than Admin wants
  • 23. 23 © Hortonworks Inc. 2011–2018. All rights reserved Quadra – Lun-like Raw-Block Storage Used for creating mountable disk FS volume
  • 24. 24 © Hortonworks Inc. 2011–2018. All rights reserved Quadra: Raw-Block Storage Volume (Lun) Lun-like storage service where the blocks are stored on HDDS • Volume: A raw-block device that can be used to create a mountable disk on Linux. • Raw-Blocks - those of the native FS that will use the Lun Volume • Raw-block size is dictated by the native fs like ext4 (4K) • Raw-Blocks are unit of IO operations by native file systems. • Raw-Block is the unit of read/write/update to HDDS • Ozone and Quadra share HDDS as a common storage backend • Current prototype: 1 raw-block = 1 HDDS block (but this will change later) Can be used in Kubernetes for container state
  • 25. 25 © Hortonworks Inc. 2011–2018. All rights reserved Quadra SCM DN DNDN DN Containers Host Kern el User SCSI Initiator Linux Distro Quadra Volume Manager Volume API JSCSI Ref: http://jscsi.org/ Quadra Plugin HDDS : Ozone’s storage layer
  • 26. 26 © Hortonworks Inc. 2011–2018. All rights reserved Quadra Volume Manager • Quadra volumes are virtual block devices that are mounted on hosts over SCSI • Volume Manager tracks the name of volumes and storage containers assigned to them • Volume Manager talks to CM to get containers allocated for a volume • Volume Manager assigns leases for clients to mount a volume • Volume Manager state is persisted and replicated via RAFT
  • 27. 27 © Hortonworks Inc. 2011–2018. All rights reserved Status HDDS: Block container • 2-4gb block containers initially – Reduction of 40-80 in BR and block map – Reduce BR pressure in on NN/OzoneMaster • Initial version to scale to 10s billions of blocks Ozone Master • Implemented using RocksDB (just like the HDDS in DNs) • Initial version to scale to 10 billion objects Current Status and Steps to GA • Stabilize HDDS and Ozone • Measure and improve performance • Add HA for Ozone Master and Container Manager • Add security – Security design completed and published After GA • Further stabilization and performance improvements • Transparent encryption • Erasure codes • Snapshots (or their equivalent) • ..
  • 28. 28 © Hortonworks Inc. 2011–2018. All rights reserved Summary • HDFS scale proven in real production systems • 4K+ clusters • Raw Storage >200PB in single federated NN cluster and >30PB in non-federated clusters • Scales to 60K+ concurrent clients bombarding the NN • But very large number of small files is a challenge (500M files) • HDDS + Ozone: Scalable Hadoop Storage • Retains • HDFS block storage Fault-tolerance • HDFS Horizonal scaling for Storage, IO • HDFS’s move computation to Storage • HDDS: Block containers: • Initially scale to 10B blocks, later to 100B+ blocks (HDFS-7240) • Ozone – Flat KV namespace + Hadoop Compatible FS (OzoneFS) • initially scale to 10B files (HDFS-13074) • Community working on a Hierarchal Namespace on HDDS (HDFS-10419)
  • 29. 29 © Hortonworks Inc. 2011–2018. All rights reserved Thank You Q&A