SlideShare a Scribd company logo
1 of 41
Download to read offline
1
Orit Wasserman
Reversim 2018
Storing your data in the
cloud: doing it right
About me
● 20+ years of development
● 10+ in open source:
○ Nested virtualization for KVM
○ Maintainer of live migration in Qemu/kvm
● 4 years as Ceph core developer at Red Hat
● Architect at lightbits labs
2
Storing your data in the cloud: doing it right
● Introduction to cloud object storage
● Features:
○ Multipart upload
○ Tags
○ Versioning
○ Life cycle
○ Prefix
○ Static website
● Security
● Summary
3
Introduction to cloud object storage
4
5
● Simple
● No metadata
● Fast
● Protocols:
○ FC/SCSI/SAS/SATA
○ ISCSI/FCoE
○ NVME
● Used by large DB like Oracle and for VM images
Block storage
6
● Hierarchy: Directories and files
● Metadata:
○ Ownership
○ Permissions/ACL
○ Creation/Modification time
● Authentication
● Sharing semantics
● Inplace writes
● Protocols:
○ Local: ext4,xfs, btrfs, zfs, NTFS, …
○ Network: NFS, SMB, AFP
File system
7
8
Scale problems: hierarchy
9
Solution: flat namespace
● Flat namespace:
○ Bucket/container
○ Objects
● Virtual hierarchy
● Buckets and objects are
referenced by name
Scale problems: hierarchy
10
Scale problems: file allocation
● Complex
● Fragmentation
11
Solution: Objects are immutable
● Object are overwritten
● Read range of an object
Scale problems: file allocation
12
● Flat namespace
● Objects are immutable
● Range Read
● Rich Metadata:
○ Ownership (Users and tenants)
○ ACL
○ User metadata
Object storage
13
Cloud object storage
● Restful API
● Common clouds:
○ AWS S3
○ Swift (openstack)
○ Google cloud storage
○ Azure blob storage
○ Ceph
○ Digital Ocean
14
● Cloud or large scale environment
● Lots of large objects that are rarely updated.
● Small objects that are updated infrequently and are
not performance sensitive.
● Hard drives
When to use cloud object storage
15
● If the application does lots of inplace writes inside
big files.
● Legacy application
When not to use cloud object storage
16
Cloud object storage
features
17
Example: Media
18
● Upload a single object as a set of parts
● Transaction:
○ Initiate
○ Upload parts
○ Complete
Multipart upload
19
Multipart upload
● Improved throughput
● Quick recovery from any network issues
● Pause and resume object uploads
● Begin an upload before you know the final object size
● Instead of FS rename
20
Multipart upload pitfalls
● Due to the performance impact not recommend for small objects
● Regular upload is up to 5 GB
● Check your framework/SDK defaults!
● Orphans ...
21
Multipart upload
● Improved throughput
● Quick recovery from any network issues
● Pause and resume object uploads
● Begin an upload before you know the final object size
● Instead of FS rename
● Due to the performance impact not recommend for small objects
22
● Better object classification
● Tag is key/value pair : color=red,
name=orit
● Filter objects
● Can be used by:
○ Access control
○ Life cycle
● Metrics and analytics
Object tagging
23
Example: Documents
24
● Keeps the previous copy of the object in case of overwrite or
deletion
Versioning
● Problem: space usage
25
Example: Documents
26
● Configure automatic object transition:
○ Expiration: used to clean old objects, older versions and failed
multipart uploads
○ Tiering: move object to colder storage
Life cycle
27
● Add a prefix to an object
● Listing a sub folder by listing objs with a specific prefix
virtual hierarchy
28
Host a static website directly from the cloud object storage
Static website
29
Security
30
● More secure:
○ Key is not part of the request
○ All requests are signed
○ Streaming support
● Not all SDK use it by default or even support it
AWS4
31
● Encrypt the traffic
● High performance penalty
● Options:
○ Tunneling
○ Terminate at the load balancers like HAProxy and use http for
your internal network
Protocol and transport
32
● Server side encryption is not enough
● Use client side encryption:
○ SSE-C: Customer provided keys
○ SSE-KMS: Key management service
Encryption
33
● Owner
● System/Admin user
● Other users: Read/Write/Read ACP/Write ACP/Full control
● Canned ACL (predefined): private,public-read, public-read-write, ..
Bucket and Object ACL
34
Be careful of public buckets
35
● Access policies for users and buckets
● Examples:
○ Grant access from multiple accounts
○ Cross account permission
○ Read only for anonymous users
○ Restricting access to a IP specific
Bucket and Users policy
36
● Provides a temporary token to access the cloud
storage
● Assume rule
● Used by storage class and glacier
Secure Token Service
37
● Object storage was designed for large scale and
for the cloud
● It has features designed for cloud application, in
order to use all those features you will need to
adopt your application
● Make sure your data is safe!
● Use Ceph for private cloud object storage!
github.com/oritwas
@oritwas
Summary
38
Backup
39
Disaster Recovery
40
Solution: geo replication
● Global object storage clusters with a single
namespace
● Enables deployment of clusters across multiple
geographic locations
● Clusters synchronize, allowing users to read from
or write to the closest one
● Disaster recovery in case of a zone failure
41
One cloud is not enough
Disaster recovery to a different public cloud
Replicate your private cloud data to public cloud

More Related Content

What's hot

Distributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraDistributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraSATOSHI TAGOMORI
 
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...PROIDEA
 
Monitoring your shiny new docker environment
Monitoring your shiny new docker environmentMonitoring your shiny new docker environment
Monitoring your shiny new docker environmentSamuel Vandamme
 
Tiering barcelona
Tiering barcelonaTiering barcelona
Tiering barcelonaGluster.org
 
Fluentd Intro for OpenShift Commons Briefing
Fluentd Intro for OpenShift Commons BriefingFluentd Intro for OpenShift Commons Briefing
Fluentd Intro for OpenShift Commons BriefingEduardo Silva Pereira
 
Developing apps and_integrating_with_gluster_fs_-_libgfapi
Developing apps and_integrating_with_gluster_fs_-_libgfapiDeveloping apps and_integrating_with_gluster_fs_-_libgfapi
Developing apps and_integrating_with_gluster_fs_-_libgfapiGluster.org
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdoseGluster.org
 
Sdc 2012-challenges
Sdc 2012-challengesSdc 2012-challenges
Sdc 2012-challengesGluster.org
 
Pomerania Cloud case study - Openstack Day Warsaw 2017
Pomerania Cloud case study - Openstack Day Warsaw 2017Pomerania Cloud case study - Openstack Day Warsaw 2017
Pomerania Cloud case study - Openstack Day Warsaw 2017Łukasz Klimek
 
Distributed Timeseries Database In Go (gophercon India 17)
Distributed Timeseries Database In Go (gophercon India 17)Distributed Timeseries Database In Go (gophercon India 17)
Distributed Timeseries Database In Go (gophercon India 17)Matthew Campbell
 
Geospatial web services using little-known GDAL features and modern Perl midd...
Geospatial web services using little-known GDAL features and modern Perl midd...Geospatial web services using little-known GDAL features and modern Perl midd...
Geospatial web services using little-known GDAL features and modern Perl midd...Ari Jolma
 
Gluster d thread_synchronization_using_urcu_lca2016
Gluster d thread_synchronization_using_urcu_lca2016Gluster d thread_synchronization_using_urcu_lca2016
Gluster d thread_synchronization_using_urcu_lca2016Gluster.org
 
Gluster for sysadmins
Gluster for sysadminsGluster for sysadmins
Gluster for sysadminsGluster.org
 
NATS vs HTTP
NATS vs HTTPNATS vs HTTP
NATS vs HTTPApcera
 
Scale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_glusterScale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_glusterGluster.org
 
Lcna tutorial-2012
Lcna tutorial-2012Lcna tutorial-2012
Lcna tutorial-2012Gluster.org
 
Approaches for duplicating Kubernetes Storage with Gluster
Approaches for duplicating Kubernetes Storage with GlusterApproaches for duplicating Kubernetes Storage with Gluster
Approaches for duplicating Kubernetes Storage with Glustermountpoint.io
 
Rook: Storage for Containers in Containers – data://disrupted® 2020
Rook: Storage for Containers in Containers  – data://disrupted® 2020Rook: Storage for Containers in Containers  – data://disrupted® 2020
Rook: Storage for Containers in Containers – data://disrupted® 2020data://disrupted®
 
GlusterD 2.0 - Managing Distributed File System Using a Centralized Store
GlusterD 2.0 - Managing Distributed File System Using a Centralized StoreGlusterD 2.0 - Managing Distributed File System Using a Centralized Store
GlusterD 2.0 - Managing Distributed File System Using a Centralized StoreAtin Mukherjee
 

What's hot (20)

Distributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraDistributed Logging Architecture in Container Era
Distributed Logging Architecture in Container Era
 
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
 
Monitoring your shiny new docker environment
Monitoring your shiny new docker environmentMonitoring your shiny new docker environment
Monitoring your shiny new docker environment
 
Tiering barcelona
Tiering barcelonaTiering barcelona
Tiering barcelona
 
Cncf microservices security
Cncf microservices securityCncf microservices security
Cncf microservices security
 
Fluentd Intro for OpenShift Commons Briefing
Fluentd Intro for OpenShift Commons BriefingFluentd Intro for OpenShift Commons Briefing
Fluentd Intro for OpenShift Commons Briefing
 
Developing apps and_integrating_with_gluster_fs_-_libgfapi
Developing apps and_integrating_with_gluster_fs_-_libgfapiDeveloping apps and_integrating_with_gluster_fs_-_libgfapi
Developing apps and_integrating_with_gluster_fs_-_libgfapi
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdose
 
Sdc 2012-challenges
Sdc 2012-challengesSdc 2012-challenges
Sdc 2012-challenges
 
Pomerania Cloud case study - Openstack Day Warsaw 2017
Pomerania Cloud case study - Openstack Day Warsaw 2017Pomerania Cloud case study - Openstack Day Warsaw 2017
Pomerania Cloud case study - Openstack Day Warsaw 2017
 
Distributed Timeseries Database In Go (gophercon India 17)
Distributed Timeseries Database In Go (gophercon India 17)Distributed Timeseries Database In Go (gophercon India 17)
Distributed Timeseries Database In Go (gophercon India 17)
 
Geospatial web services using little-known GDAL features and modern Perl midd...
Geospatial web services using little-known GDAL features and modern Perl midd...Geospatial web services using little-known GDAL features and modern Perl midd...
Geospatial web services using little-known GDAL features and modern Perl midd...
 
Gluster d thread_synchronization_using_urcu_lca2016
Gluster d thread_synchronization_using_urcu_lca2016Gluster d thread_synchronization_using_urcu_lca2016
Gluster d thread_synchronization_using_urcu_lca2016
 
Gluster for sysadmins
Gluster for sysadminsGluster for sysadmins
Gluster for sysadmins
 
NATS vs HTTP
NATS vs HTTPNATS vs HTTP
NATS vs HTTP
 
Scale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_glusterScale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_gluster
 
Lcna tutorial-2012
Lcna tutorial-2012Lcna tutorial-2012
Lcna tutorial-2012
 
Approaches for duplicating Kubernetes Storage with Gluster
Approaches for duplicating Kubernetes Storage with GlusterApproaches for duplicating Kubernetes Storage with Gluster
Approaches for duplicating Kubernetes Storage with Gluster
 
Rook: Storage for Containers in Containers – data://disrupted® 2020
Rook: Storage for Containers in Containers  – data://disrupted® 2020Rook: Storage for Containers in Containers  – data://disrupted® 2020
Rook: Storage for Containers in Containers – data://disrupted® 2020
 
GlusterD 2.0 - Managing Distributed File System Using a Centralized Store
GlusterD 2.0 - Managing Distributed File System Using a Centralized StoreGlusterD 2.0 - Managing Distributed File System Using a Centralized Store
GlusterD 2.0 - Managing Distributed File System Using a Centralized Store
 

Similar to Storing your data in the cloud: doing right reversim 2018

Initial presentation of swift (for montreal user group)
Initial presentation of swift (for montreal user group)Initial presentation of swift (for montreal user group)
Initial presentation of swift (for montreal user group)Marcos García
 
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...WebCamp
 
WebCamp Ukraine 2016: Instant messenger with Python. Back-end development
WebCamp Ukraine 2016: Instant messenger with Python. Back-end developmentWebCamp Ukraine 2016: Instant messenger with Python. Back-end development
WebCamp Ukraine 2016: Instant messenger with Python. Back-end developmentViach Kakovskyi
 
Everything you wanted to know about RadosGW - Orit Wasserman, Matt Benjamin
Everything you wanted to know about RadosGW - Orit Wasserman, Matt BenjaminEverything you wanted to know about RadosGW - Orit Wasserman, Matt Benjamin
Everything you wanted to know about RadosGW - Orit Wasserman, Matt BenjaminCeph Community
 
Using Zettabyte Filesystem (ZFS)
Using Zettabyte Filesystem (ZFS)Using Zettabyte Filesystem (ZFS)
Using Zettabyte Filesystem (ZFS)GLC Networks
 
Ceph data services in a multi- and hybrid cloud world
Ceph data services in a multi- and hybrid cloud worldCeph data services in a multi- and hybrid cloud world
Ceph data services in a multi- and hybrid cloud worldSage Weil
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific DashboardCeph Community
 
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022JayjeetChakraborty
 
[scala.by] Launching new application fast
[scala.by] Launching new application fast[scala.by] Launching new application fast
[scala.by] Launching new application fastDenis Karpenko
 
Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016aspyker
 
Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Sharma Podila
 
Kubernetes from scratch at veepee sysadmins days 2019
Kubernetes from scratch at veepee   sysadmins days 2019Kubernetes from scratch at veepee   sysadmins days 2019
Kubernetes from scratch at veepee sysadmins days 2019🔧 Loïc BLOT
 
Docker Security - Secure Container Deployment on Linux
Docker Security - Secure Container Deployment on LinuxDocker Security - Secure Container Deployment on Linux
Docker Security - Secure Container Deployment on LinuxMichael Boelen
 
Webinar: Building a multi-cloud Kubernetes storage on GitLab
Webinar: Building a multi-cloud Kubernetes storage on GitLabWebinar: Building a multi-cloud Kubernetes storage on GitLab
Webinar: Building a multi-cloud Kubernetes storage on GitLabMayaData Inc
 
OpenSearch.pdf
OpenSearch.pdfOpenSearch.pdf
OpenSearch.pdfAbhi Jain
 
Challenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan LambrightChallenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan LambrightGluster.org
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1Ruslan Meshenberg
 
Lt2013 glusterfs.talk
Lt2013 glusterfs.talkLt2013 glusterfs.talk
Lt2013 glusterfs.talkUdo Seidel
 
Scalable Clusters On Demand
Scalable Clusters On DemandScalable Clusters On Demand
Scalable Clusters On DemandBogdan Kyryliuk
 

Similar to Storing your data in the cloud: doing right reversim 2018 (20)

Initial presentation of swift (for montreal user group)
Initial presentation of swift (for montreal user group)Initial presentation of swift (for montreal user group)
Initial presentation of swift (for montreal user group)
 
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
 
WebCamp Ukraine 2016: Instant messenger with Python. Back-end development
WebCamp Ukraine 2016: Instant messenger with Python. Back-end developmentWebCamp Ukraine 2016: Instant messenger with Python. Back-end development
WebCamp Ukraine 2016: Instant messenger with Python. Back-end development
 
Everything you wanted to know about RadosGW - Orit Wasserman, Matt Benjamin
Everything you wanted to know about RadosGW - Orit Wasserman, Matt BenjaminEverything you wanted to know about RadosGW - Orit Wasserman, Matt Benjamin
Everything you wanted to know about RadosGW - Orit Wasserman, Matt Benjamin
 
Using Zettabyte Filesystem (ZFS)
Using Zettabyte Filesystem (ZFS)Using Zettabyte Filesystem (ZFS)
Using Zettabyte Filesystem (ZFS)
 
Ceph data services in a multi- and hybrid cloud world
Ceph data services in a multi- and hybrid cloud worldCeph data services in a multi- and hybrid cloud world
Ceph data services in a multi- and hybrid cloud world
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard
 
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
 
[scala.by] Launching new application fast
[scala.by] Launching new application fast[scala.by] Launching new application fast
[scala.by] Launching new application fast
 
Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016
 
Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016
 
Kubernetes from scratch at veepee sysadmins days 2019
Kubernetes from scratch at veepee   sysadmins days 2019Kubernetes from scratch at veepee   sysadmins days 2019
Kubernetes from scratch at veepee sysadmins days 2019
 
Docker Security - Secure Container Deployment on Linux
Docker Security - Secure Container Deployment on LinuxDocker Security - Secure Container Deployment on Linux
Docker Security - Secure Container Deployment on Linux
 
Webinar: Building a multi-cloud Kubernetes storage on GitLab
Webinar: Building a multi-cloud Kubernetes storage on GitLabWebinar: Building a multi-cloud Kubernetes storage on GitLab
Webinar: Building a multi-cloud Kubernetes storage on GitLab
 
OpenSearch.pdf
OpenSearch.pdfOpenSearch.pdf
OpenSearch.pdf
 
Challenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan LambrightChallenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan Lambright
 
RubiX
RubiXRubiX
RubiX
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Lt2013 glusterfs.talk
Lt2013 glusterfs.talkLt2013 glusterfs.talk
Lt2013 glusterfs.talk
 
Scalable Clusters On Demand
Scalable Clusters On DemandScalable Clusters On Demand
Scalable Clusters On Demand
 

Recently uploaded

Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningVitsRangannavar
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 

Recently uploaded (20)

Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 

Storing your data in the cloud: doing right reversim 2018

  • 1. 1 Orit Wasserman Reversim 2018 Storing your data in the cloud: doing it right
  • 2. About me ● 20+ years of development ● 10+ in open source: ○ Nested virtualization for KVM ○ Maintainer of live migration in Qemu/kvm ● 4 years as Ceph core developer at Red Hat ● Architect at lightbits labs 2
  • 3. Storing your data in the cloud: doing it right ● Introduction to cloud object storage ● Features: ○ Multipart upload ○ Tags ○ Versioning ○ Life cycle ○ Prefix ○ Static website ● Security ● Summary 3
  • 4. Introduction to cloud object storage 4
  • 5. 5 ● Simple ● No metadata ● Fast ● Protocols: ○ FC/SCSI/SAS/SATA ○ ISCSI/FCoE ○ NVME ● Used by large DB like Oracle and for VM images Block storage
  • 6. 6 ● Hierarchy: Directories and files ● Metadata: ○ Ownership ○ Permissions/ACL ○ Creation/Modification time ● Authentication ● Sharing semantics ● Inplace writes ● Protocols: ○ Local: ext4,xfs, btrfs, zfs, NTFS, … ○ Network: NFS, SMB, AFP File system
  • 7. 7
  • 9. 9 Solution: flat namespace ● Flat namespace: ○ Bucket/container ○ Objects ● Virtual hierarchy ● Buckets and objects are referenced by name Scale problems: hierarchy
  • 10. 10 Scale problems: file allocation ● Complex ● Fragmentation
  • 11. 11 Solution: Objects are immutable ● Object are overwritten ● Read range of an object Scale problems: file allocation
  • 12. 12 ● Flat namespace ● Objects are immutable ● Range Read ● Rich Metadata: ○ Ownership (Users and tenants) ○ ACL ○ User metadata Object storage
  • 13. 13 Cloud object storage ● Restful API ● Common clouds: ○ AWS S3 ○ Swift (openstack) ○ Google cloud storage ○ Azure blob storage ○ Ceph ○ Digital Ocean
  • 14. 14 ● Cloud or large scale environment ● Lots of large objects that are rarely updated. ● Small objects that are updated infrequently and are not performance sensitive. ● Hard drives When to use cloud object storage
  • 15. 15 ● If the application does lots of inplace writes inside big files. ● Legacy application When not to use cloud object storage
  • 18. 18 ● Upload a single object as a set of parts ● Transaction: ○ Initiate ○ Upload parts ○ Complete Multipart upload
  • 19. 19 Multipart upload ● Improved throughput ● Quick recovery from any network issues ● Pause and resume object uploads ● Begin an upload before you know the final object size ● Instead of FS rename
  • 20. 20 Multipart upload pitfalls ● Due to the performance impact not recommend for small objects ● Regular upload is up to 5 GB ● Check your framework/SDK defaults! ● Orphans ...
  • 21. 21 Multipart upload ● Improved throughput ● Quick recovery from any network issues ● Pause and resume object uploads ● Begin an upload before you know the final object size ● Instead of FS rename ● Due to the performance impact not recommend for small objects
  • 22. 22 ● Better object classification ● Tag is key/value pair : color=red, name=orit ● Filter objects ● Can be used by: ○ Access control ○ Life cycle ● Metrics and analytics Object tagging
  • 24. 24 ● Keeps the previous copy of the object in case of overwrite or deletion Versioning ● Problem: space usage
  • 26. 26 ● Configure automatic object transition: ○ Expiration: used to clean old objects, older versions and failed multipart uploads ○ Tiering: move object to colder storage Life cycle
  • 27. 27 ● Add a prefix to an object ● Listing a sub folder by listing objs with a specific prefix virtual hierarchy
  • 28. 28 Host a static website directly from the cloud object storage Static website
  • 30. 30 ● More secure: ○ Key is not part of the request ○ All requests are signed ○ Streaming support ● Not all SDK use it by default or even support it AWS4
  • 31. 31 ● Encrypt the traffic ● High performance penalty ● Options: ○ Tunneling ○ Terminate at the load balancers like HAProxy and use http for your internal network Protocol and transport
  • 32. 32 ● Server side encryption is not enough ● Use client side encryption: ○ SSE-C: Customer provided keys ○ SSE-KMS: Key management service Encryption
  • 33. 33 ● Owner ● System/Admin user ● Other users: Read/Write/Read ACP/Write ACP/Full control ● Canned ACL (predefined): private,public-read, public-read-write, .. Bucket and Object ACL
  • 34. 34 Be careful of public buckets
  • 35. 35 ● Access policies for users and buckets ● Examples: ○ Grant access from multiple accounts ○ Cross account permission ○ Read only for anonymous users ○ Restricting access to a IP specific Bucket and Users policy
  • 36. 36 ● Provides a temporary token to access the cloud storage ● Assume rule ● Used by storage class and glacier Secure Token Service
  • 37. 37 ● Object storage was designed for large scale and for the cloud ● It has features designed for cloud application, in order to use all those features you will need to adopt your application ● Make sure your data is safe! ● Use Ceph for private cloud object storage! github.com/oritwas @oritwas Summary
  • 40. 40 Solution: geo replication ● Global object storage clusters with a single namespace ● Enables deployment of clusters across multiple geographic locations ● Clusters synchronize, allowing users to read from or write to the closest one ● Disaster recovery in case of a zone failure
  • 41. 41 One cloud is not enough Disaster recovery to a different public cloud Replicate your private cloud data to public cloud