Ceph Design Summit – Jewel
-- Hadoop over RGW with SSD Cache Status update
yuan.zhou@intel.com
7/2015
Content
• Hadoop over RGW with SSD cache design
• Status update since Infernalis
• Performance of Hadoop over Swift
Ceph RGW with SSD cache
[Architecture diagram: Rack 1 and Rack 2 hold compute servers, each running M/R tasks on top of RGWFS (a FileSystem-interface plugin) and a vanilla RGW instance; a Scheduler, also using RGWFS, talks to the new RGW-Proxy; the RGW service is backed by a Ceph RADOS cluster with an SSD Cache Tier (OSD(SSD)) in front of the HDD Base Tier, plus MON, on an isolated network.]
1. The Scheduler asks the RGW service where a particular block is located (control path)
   • RGW-Proxy returns the closest active RGW instance(s)
2. The Scheduler allocates the task on a server that is near the data
3. The task accesses the data from a nearby RGW instance (data path)
4. RGW gets/puts data from the Cache Tier, and the Cache Tier gets/puts data from the Base Tier if necessary (data path)
Status update
• RGW-Proxy (done)
  • RESTful service based on Python WSGI (see the sketch after this slide)
  • Returns the block location(s) for a RESTful request, sorted by the distance between the RGW instances and the data OSD
  • curl http://RGW-proxy/con/test1/1
    • http://192.168.6.115/con/test1, http://192.168.6.114/con/test1
• RGWFS (70% done)
  1. Forked from SwiftFS; RGWFS can talk to RGW
     • But only to a single RGW instance
     • Each GET/PUT has to go through that one RGW
  2. Now, with RGW-Proxy, it can talk to multiple RGW instances
  3. We also added a 'block-level location-aware read' feature
• Performance testing
  • Baseline performance with HDFS and Swift is done
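For reference, here is a minimal Python WSGI sketch of the RGW-Proxy idea, not the actual service: the RGW endpoint list, the port and the distance ranking are placeholder assumptions, while the real proxy ranks instances using the cluster topology and the data OSD location.

```python
# Hypothetical sketch of the RGW-Proxy REST contract: answer
# /<container>/<object>/<block> with RGW endpoints sorted by "distance".
from wsgiref.simple_server import make_server

RGW_ENDPOINTS = ["http://192.168.6.115", "http://192.168.6.114"]  # assumed instances

def distance(endpoint, container, obj, block):
    # Placeholder ranking: the real service maps the block to its data OSD
    # (via the CRUSH map) and ranks RGW instances by network topology.
    return RGW_ENDPOINTS.index(endpoint)

def app(environ, start_response):
    # Expected path: /<container>/<object>/<block-index>
    container, obj, block = environ["PATH_INFO"].lstrip("/").split("/", 2)
    ranked = sorted(RGW_ENDPOINTS, key=lambda e: distance(e, container, obj, block))
    body = ", ".join("%s/%s/%s" % (e, container, obj) for e in ranked).encode()
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

if __name__ == "__main__":
    make_server("", 8080, app).serve_forever()  # port is an assumption
```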
RGWFS – a new adaptor for HCFS
• New filesystem URL: rgw://
  1. Forked from HADOOP-8545 (SwiftFS)
  2. Hadoop is able to talk to an RGW cluster with this plugin
  3. A new 'block' concept was added, since Swift doesn't support blocks
     • Thus the scheduler can use multiple tasks to access the same file
• Based on the location, RGWFS is able to read from the closest RGW through the range GET API (see the sketch after this slide)
• But for PUT, all the traffic still goes through a single RGW
[Diagram: Scheduler and RGWFS (FileSystem interface) access file1 through the rgw:// URL, talking to two RGW instances on top of RADOS.]
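To illustrate the block-level location-aware read, the hedged Python sketch below fetches one block with an HTTP Range GET from the closest RGW returned by RGW-Proxy; the 64 MB block size, the proxy URL and the response format are assumptions, and the real plugin implements this in Java inside the Hadoop FileSystem interface.

```python
# Hypothetical sketch of a location-aware block read: ask RGW-Proxy for the
# closest RGW instances, then range-GET only the requested block from the
# first one. URLs, block size and response format are assumptions.
import requests

BLOCK_SIZE = 64 * 1024 * 1024        # matches the 64 MB chunking on the RGW side
PROXY = "http://rgw-proxy:8080"      # assumed RGW-Proxy endpoint

def read_block(container, obj, block_index):
    # 1. Control path: where does this block live?
    resp = requests.get("%s/%s/%s/%d" % (PROXY, container, obj, block_index), timeout=5)
    resp.raise_for_status()
    locations = resp.text.split(", ")
    # 2. Data path: range GET just this block from the closest RGW.
    start = block_index * BLOCK_SIZE
    data = requests.get(locations[0],
                        headers={"Range": "bytes=%d-%d" % (start, start + BLOCK_SIZE - 1)},
                        timeout=30)
    data.raise_for_status()
    return data.content

# e.g. data = read_block("con", "test1", 1)
```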
RGW-Proxy – Give out the closest RGW instance
1. Before a GET/PUT, RGWFS tries to get the location of each block from RGW-Proxy
   1. A topology file of the cluster is generated
   2. RGW-Proxy first gets the manifest from the head object (librados + getxattr)
   3. Then, based on the CRUSH map, RGW-Proxy gets the location of each object block (ceph osd map)
   4. Finally, from the data OSD info and the topology file, RGW-Proxy finds the closest RGW (a simple lookup in the topology file); a sketch of these lookups follows this slide
[Diagram: Scheduler and RGWFS (FileSystem interface) query the new RGW-Proxy over rgw:// before reading file1 from the RGW instances on top of RADOS.]
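The control-path lookups above can be reproduced with the stock Ceph tooling; the Python sketch below is only an illustration under assumptions (pool name, manifest xattr name, topology file layout), whereas the real proxy uses librados directly.

```python
# Hypothetical sketch of RGW-Proxy's lookups: read the head object's manifest
# xattr, map a RADOS object to its OSDs with 'ceph osd map', then pick the
# RGW co-located with one of those OSDs using a pre-generated topology file.
import json
import subprocess

POOL = ".rgw.buckets"  # assumed data pool name

def head_manifest(head_obj):
    # The manifest is stored as an xattr on the head object (attribute name assumed).
    return subprocess.check_output(
        ["rados", "-p", POOL, "getxattr", head_obj, "user.rgw.manifest"])

def block_osds(rados_obj):
    # 'ceph osd map' reports which PG/OSDs hold a given RADOS object.
    out = subprocess.check_output(
        ["ceph", "osd", "map", POOL, rados_obj, "--format", "json"])
    return json.loads(out)["up"]

def closest_rgw(rados_obj, topology):
    # topology: assumed dict with "osd_to_host" and "host_to_rgw" mappings,
    # loaded from the generated topology file.
    for osd in block_osds(rados_obj):
        host = topology["osd_to_host"].get(str(osd))
        if host in topology["host_to_rgw"]:
            return topology["host_to_rgw"][host]
    return None
```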
RGW (vanilla) – Serve the data requests
1. Set up RGW on a Cache Tier so that we can use SSDs as the cache (see the setup sketch after this slide)
   1. With a dedicated chunk size, e.g. 64 MB, considering the data is usually quite large
   2. rgw_max_chunk_size, rgw_obj_stripe_size
2. Based on the account/container/object name, RGW can get/put the content
   1. Using range reads to fetch each chunk
3. We'll use write-through mode here as a starting point, to bypass the data consistency issue
[Diagram: Scheduler and RGWFS (FileSystem interface) talk to the RGW service (two modified RGW gateways plus the new RGW-Proxy), which sits on a Ceph RADOS cluster with a Cache Tier of SSD OSDs in front of the Base Tier, plus MON.]
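For completeness, a hedged sketch of the cache-tier setup step follows, driving the stock ceph CLI from Python; the pool names and the cache-mode string are placeholders (the slide aims at write-through semantics), and the 64 MB chunking itself is configured on the RGW side via rgw_max_chunk_size / rgw_obj_stripe_size in ceph.conf.

```python
# Hypothetical sketch: attach an SSD-backed cache pool to the RGW data pool
# using the standard cache-tiering commands. Pool names and the cache mode
# are placeholders; the pools must already exist with suitable CRUSH rules,
# and the mode should be chosen from those supported by the installed release.
import subprocess

BASE_POOL = ".rgw.buckets"          # assumed HDD-backed base tier
CACHE_POOL = ".rgw.buckets-cache"   # assumed SSD-backed cache tier
CACHE_MODE = "writeback"            # placeholder; the slide intends write-through behaviour

def setup_cache_tier():
    subprocess.check_call(["ceph", "osd", "tier", "add", BASE_POOL, CACHE_POOL])
    subprocess.check_call(["ceph", "osd", "tier", "cache-mode", CACHE_POOL, CACHE_MODE])
    subprocess.check_call(["ceph", "osd", "tier", "set-overlay", BASE_POOL, CACHE_POOL])

if __name__ == "__main__":
    setup_cache_tier()
```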
HDFS vs Swift
[Diagram: three deployments compared: HDFS (a Name Node plus hosts running a Data Node and MapReduce), Swift with list-endpoint (a Proxy-Server plus hosts running an Object-Server and MapReduce), and Swift without list-endpoint (the same layout, without locality hints).]
IMPACT
• The list-endpoint impact is huge
• The Swift overhead comes from "Rename"
• Relative job time (less is better): HDFS 1X, Swift with list-endpoint 1.25X, Swift without list-endpoint 1.67X
Rename in Reduce Task
• The output of the reduce function is written to a temporary location in HDFS. After the task completes, the output is automatically renamed from its temporary location to its final location.
• Object storage does not support rename, so SwiftFS uses "copy and delete" to implement the rename function (see the sketch after this slide).
HDFS Rename -> change METADATA in the Name Node
Swift Rename -> copy a new object and delete the older one in Swift
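To illustrate why rename is expensive on an object store, here is a hedged Python sketch of rename as copy plus delete against RGW's S3-compatible API via boto3; the endpoint, credentials and key names are placeholders, and SwiftFS does the equivalent through the Swift API.

```python
# Hypothetical sketch: emulate rename on an object store by copying the object
# to the new key and deleting the old one. Unlike the HDFS metadata-only
# rename, the whole object is rewritten. Endpoint/credentials are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.example:7480",   # assumed RGW endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY")

def rename(bucket, src_key, dst_key):
    s3.copy_object(Bucket=bucket, Key=dst_key,
                   CopySource={"Bucket": bucket, "Key": src_key})
    s3.delete_object(Bucket=bucket, Key=src_key)

# e.g. promoting a reduce task's temporary output to its final name:
# rename("job-output", "_temporary/part-00000", "part-00000")
```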
Next Step
• Finish the development (70% done) and complete the performance testing work
• Based on the Swift performance results, we'll need to solve the heavy rename issue; this may require patching RGW
• Open-source code repo (WIP)
  • https://github.com/intel-bigdata/MOC
Q&A
Deployment Consideration Matrix
• Distro/Plugin: Vanilla, CDH, HDP, MapR, Spark, Storm
• Compute: VM, Container, Bare-metal
• Storage: Tenant- vs. Admin-provisioned; Disaggregated vs. Collocated; HDFS vs. other options
• Data Processing API: Traditional EDP (Sahara native), 3rd-party APIs
• Performance results in the next section
Storage Architecture
 Tenant provisioned (in VM)
  • HDFS in the same VMs as the computing tasks vs. in different VMs
  • Ephemeral disk vs. Cinder volume
 Admin provided
  • Logically disaggregated from the computing tasks
  • Physical collocation is a matter of deployment
  • For network-remote storage, Neutron DVR is a very useful feature
 A disaggregated (and centralized) storage system has significant value
  • No data silos, more business opportunities
  • Could leverage the Manila service
  • Allows building advanced solutions (e.g. an in-memory overlay layer)
  • More vendor-specific optimization opportunities
[Diagram: four deployment scenarios (#1 to #4) showing where HDFS or another data service runs relative to the computing-task VMs, with remote options including legacy NFS, GlusterFS, Ceph*, external HDFS and Swift.]
Scenario #1: computing and data service collocate in the VMs
Scenario #2: data service locates in the host world
Scenario #3: data service locates in a separate VM world
Scenario #4: data service locates in the remote network
Compute Engine
• VM
  • Pros: best support in OpenStack; strong security
  • Cons: slow to provision; relatively high runtime performance overhead
• Container
  • Pros: light-weight, fast provisioning; better runtime performance than VM
  • Cons: Nova-docker readiness; Cinder volume support is not ready yet; weaker security than VM; not the ideal way to use containers
• Bare-Metal
  • Pros: best performance and QoS; best security isolation
  • Cons: Ironic readiness; worst efficiency (e.g. consolidating workloads with different behaviors); worst flexibility (e.g. migration); worst elasticity due to slow provisioning
 Containers seem to be promising but still need better support
 Determining the appropriate cluster size is always a challenge for tenants
  • e.g. small flavor with more nodes or large flavor with fewer nodes
Editor's Notes
1. A new adaptor for RGW (RGWFS): it basically follows SwiftFS, using RESTful requests to read/write the data. [Java] The path handling needs to change.
2. An RGW-Proxy to do the master selection: much like the Swift proxy, it gives out the RGW instance and the object address (IP, port, account/container/object). [Python or Java] Based on the container/object name, this daemon gives out the closest RGW instance (so we know the closest rack). We'll use the SSDs as the cache layer here; it is actually a Cache Tier in front of the Base Tier below.
3. An RGW cache-coherence algorithm: it works among the RGW instances to keep the caches coherent. [C++] Based on the container/object name, an RGW instance reads the object out of the Ceph cluster.