SlideShare a Scribd company logo
1
Orit Wasserman
Reversim 2018
Storing your data in the
cloud: doing it right
About me
● 20+ years of development
● 10+ in open source:
○ Nested virtualization for KVM
○ Maintainer of live migration in Qemu/kvm
● 4 years as Ceph core developer at Red Hat
● Architect at lightbits labs
2
Storing your data in the cloud: doing it right
● Introduction to cloud object storage
● Features:
○ Multipart upload
○ Tags
○ Versioning
○ Life cycle
○ Prefix
○ Static website
● Security
● Summary
3
Introduction to cloud object storage
4
5
● Simple
● No metadata
● Fast
● Protocols:
○ FC/SCSI/SAS/SATA
○ ISCSI/FCoE
○ NVME
● Used by large DB like Oracle and for VM images
Block storage
6
● Hierarchy: Directories and files
● Metadata:
○ Ownership
○ Permissions/ACL
○ Creation/Modification time
● Authentication
● Sharing semantics
● Inplace writes
● Protocols:
○ Local: ext4,xfs, btrfs, zfs, NTFS, …
○ Network: NFS, SMB, AFP
File system
7
8
Scale problems: hierarchy
9
Solution: flat namespace
● Flat namespace:
○ Bucket/container
○ Objects
● Virtual hierarchy
● Buckets and objects are
referenced by name
Scale problems: hierarchy
10
Scale problems: file allocation
● Complex
● Fragmentation
11
Solution: Objects are immutable
● Object are overwritten
● Read range of an object
Scale problems: file allocation
12
● Flat namespace
● Objects are immutable
● Range Read
● Rich Metadata:
○ Ownership (Users and tenants)
○ ACL
○ User metadata
Object storage
13
Cloud object storage
● Restful API
● Common clouds:
○ AWS S3
○ Swift (openstack)
○ Google cloud storage
○ Azure blob storage
○ Ceph
○ Digital Ocean
14
● Cloud or large scale environment
● Lots of large objects that are rarely updated.
● Small objects that are updated infrequently and are
not performance sensitive.
● Hard drives
When to use cloud object storage
15
● If the application does lots of inplace writes inside
big files.
● Legacy application
When not to use cloud object storage
16
Cloud object storage
features
17
Example: Media
18
● Upload a single object as a set of parts
● Transaction:
○ Initiate
○ Upload parts
○ Complete
Multipart upload
19
Multipart upload
● Improved throughput
● Quick recovery from any network issues
● Pause and resume object uploads
● Begin an upload before you know the final object size
● Instead of FS rename
20
Multipart upload pitfalls
● Due to the performance impact not recommend for small objects
● Regular upload is up to 5 GB
● Check your framework/SDK defaults!
● Orphans ...
21
Multipart upload
● Improved throughput
● Quick recovery from any network issues
● Pause and resume object uploads
● Begin an upload before you know the final object size
● Instead of FS rename
● Due to the performance impact not recommend for small objects
22
● Better object classification
● Tag is key/value pair : color=red,
name=orit
● Filter objects
● Can be used by:
○ Access control
○ Life cycle
● Metrics and analytics
Object tagging
23
Example: Documents
24
● Keeps the previous copy of the object in case of overwrite or
deletion
Versioning
● Problem: space usage
25
Example: Documents
26
● Configure automatic object transition:
○ Expiration: used to clean old objects, older versions and failed
multipart uploads
○ Tiering: move object to colder storage
Life cycle
27
● Add a prefix to an object
● Listing a sub folder by listing objs with a specific prefix
virtual hierarchy
28
Host a static website directly from the cloud object storage
Static website
29
Security
30
● More secure:
○ Key is not part of the request
○ All requests are signed
○ Streaming support
● Not all SDK use it by default or even support it
AWS4
31
● Encrypt the traffic
● High performance penalty
● Options:
○ Tunneling
○ Terminate at the load balancers like HAProxy and use http for
your internal network
Protocol and transport
32
● Server side encryption is not enough
● Use client side encryption:
○ SSE-C: Customer provided keys
○ SSE-KMS: Key management service
Encryption
33
● Owner
● System/Admin user
● Other users: Read/Write/Read ACP/Write ACP/Full control
● Canned ACL (predefined): private,public-read, public-read-write, ..
Bucket and Object ACL
34
Be careful of public buckets
35
● Access policies for users and buckets
● Examples:
○ Grant access from multiple accounts
○ Cross account permission
○ Read only for anonymous users
○ Restricting access to a IP specific
Bucket and Users policy
36
● Provides a temporary token to access the cloud
storage
● Assume rule
● Used by storage class and glacier
Secure Token Service
37
● Object storage was designed for large scale and
for the cloud
● It has features designed for cloud application, in
order to use all those features you will need to
adopt your application
● Make sure your data is safe!
● Use Ceph for private cloud object storage!
github.com/oritwas
@oritwas
Summary
38
Backup
39
Disaster Recovery
40
Solution: geo replication
● Global object storage clusters with a single
namespace
● Enables deployment of clusters across multiple
geographic locations
● Clusters synchronize, allowing users to read from
or write to the closest one
● Disaster recovery in case of a zone failure
41
One cloud is not enough
Disaster recovery to a different public cloud
Replicate your private cloud data to public cloud

More Related Content

What's hot

Distributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraDistributed Logging Architecture in Container Era
Distributed Logging Architecture in Container Era
SATOSHI TAGOMORI
 
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
PROIDEA
 
Monitoring your shiny new docker environment
Monitoring your shiny new docker environmentMonitoring your shiny new docker environment
Monitoring your shiny new docker environment
Samuel Vandamme
 
Tiering barcelona
Tiering barcelonaTiering barcelona
Tiering barcelona
Gluster.org
 
Cncf microservices security
Cncf microservices securityCncf microservices security
Cncf microservices security
Leonardo Gonçalves
 
Fluentd Intro for OpenShift Commons Briefing
Fluentd Intro for OpenShift Commons BriefingFluentd Intro for OpenShift Commons Briefing
Fluentd Intro for OpenShift Commons Briefing
Eduardo Silva Pereira
 
Developing apps and_integrating_with_gluster_fs_-_libgfapi
Developing apps and_integrating_with_gluster_fs_-_libgfapiDeveloping apps and_integrating_with_gluster_fs_-_libgfapi
Developing apps and_integrating_with_gluster_fs_-_libgfapi
Gluster.org
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdose
Gluster.org
 
Sdc 2012-challenges
Sdc 2012-challengesSdc 2012-challenges
Sdc 2012-challenges
Gluster.org
 
Pomerania Cloud case study - Openstack Day Warsaw 2017
Pomerania Cloud case study - Openstack Day Warsaw 2017Pomerania Cloud case study - Openstack Day Warsaw 2017
Pomerania Cloud case study - Openstack Day Warsaw 2017
Łukasz Klimek
 
Distributed Timeseries Database In Go (gophercon India 17)
Distributed Timeseries Database In Go (gophercon India 17)Distributed Timeseries Database In Go (gophercon India 17)
Distributed Timeseries Database In Go (gophercon India 17)
Matthew Campbell
 
Geospatial web services using little-known GDAL features and modern Perl midd...
Geospatial web services using little-known GDAL features and modern Perl midd...Geospatial web services using little-known GDAL features and modern Perl midd...
Geospatial web services using little-known GDAL features and modern Perl midd...
Ari Jolma
 
Gluster d thread_synchronization_using_urcu_lca2016
Gluster d thread_synchronization_using_urcu_lca2016Gluster d thread_synchronization_using_urcu_lca2016
Gluster d thread_synchronization_using_urcu_lca2016
Gluster.org
 
Gluster for sysadmins
Gluster for sysadminsGluster for sysadmins
Gluster for sysadmins
Gluster.org
 
NATS vs HTTP
NATS vs HTTPNATS vs HTTP
NATS vs HTTP
Apcera
 
Scale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_glusterScale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_gluster
Gluster.org
 
Lcna tutorial-2012
Lcna tutorial-2012Lcna tutorial-2012
Lcna tutorial-2012
Gluster.org
 
Approaches for duplicating Kubernetes Storage with Gluster
Approaches for duplicating Kubernetes Storage with GlusterApproaches for duplicating Kubernetes Storage with Gluster
Approaches for duplicating Kubernetes Storage with Gluster
mountpoint.io
 
Rook: Storage for Containers in Containers – data://disrupted® 2020
Rook: Storage for Containers in Containers  – data://disrupted® 2020Rook: Storage for Containers in Containers  – data://disrupted® 2020
Rook: Storage for Containers in Containers – data://disrupted® 2020
data://disrupted®
 
GlusterD 2.0 - Managing Distributed File System Using a Centralized Store
GlusterD 2.0 - Managing Distributed File System Using a Centralized StoreGlusterD 2.0 - Managing Distributed File System Using a Centralized Store
GlusterD 2.0 - Managing Distributed File System Using a Centralized Store
Atin Mukherjee
 

What's hot (20)

Distributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraDistributed Logging Architecture in Container Era
Distributed Logging Architecture in Container Era
 
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
 
Monitoring your shiny new docker environment
Monitoring your shiny new docker environmentMonitoring your shiny new docker environment
Monitoring your shiny new docker environment
 
Tiering barcelona
Tiering barcelonaTiering barcelona
Tiering barcelona
 
Cncf microservices security
Cncf microservices securityCncf microservices security
Cncf microservices security
 
Fluentd Intro for OpenShift Commons Briefing
Fluentd Intro for OpenShift Commons BriefingFluentd Intro for OpenShift Commons Briefing
Fluentd Intro for OpenShift Commons Briefing
 
Developing apps and_integrating_with_gluster_fs_-_libgfapi
Developing apps and_integrating_with_gluster_fs_-_libgfapiDeveloping apps and_integrating_with_gluster_fs_-_libgfapi
Developing apps and_integrating_with_gluster_fs_-_libgfapi
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdose
 
Sdc 2012-challenges
Sdc 2012-challengesSdc 2012-challenges
Sdc 2012-challenges
 
Pomerania Cloud case study - Openstack Day Warsaw 2017
Pomerania Cloud case study - Openstack Day Warsaw 2017Pomerania Cloud case study - Openstack Day Warsaw 2017
Pomerania Cloud case study - Openstack Day Warsaw 2017
 
Distributed Timeseries Database In Go (gophercon India 17)
Distributed Timeseries Database In Go (gophercon India 17)Distributed Timeseries Database In Go (gophercon India 17)
Distributed Timeseries Database In Go (gophercon India 17)
 
Geospatial web services using little-known GDAL features and modern Perl midd...
Geospatial web services using little-known GDAL features and modern Perl midd...Geospatial web services using little-known GDAL features and modern Perl midd...
Geospatial web services using little-known GDAL features and modern Perl midd...
 
Gluster d thread_synchronization_using_urcu_lca2016
Gluster d thread_synchronization_using_urcu_lca2016Gluster d thread_synchronization_using_urcu_lca2016
Gluster d thread_synchronization_using_urcu_lca2016
 
Gluster for sysadmins
Gluster for sysadminsGluster for sysadmins
Gluster for sysadmins
 
NATS vs HTTP
NATS vs HTTPNATS vs HTTP
NATS vs HTTP
 
Scale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_glusterScale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_gluster
 
Lcna tutorial-2012
Lcna tutorial-2012Lcna tutorial-2012
Lcna tutorial-2012
 
Approaches for duplicating Kubernetes Storage with Gluster
Approaches for duplicating Kubernetes Storage with GlusterApproaches for duplicating Kubernetes Storage with Gluster
Approaches for duplicating Kubernetes Storage with Gluster
 
Rook: Storage for Containers in Containers – data://disrupted® 2020
Rook: Storage for Containers in Containers  – data://disrupted® 2020Rook: Storage for Containers in Containers  – data://disrupted® 2020
Rook: Storage for Containers in Containers – data://disrupted® 2020
 
GlusterD 2.0 - Managing Distributed File System Using a Centralized Store
GlusterD 2.0 - Managing Distributed File System Using a Centralized StoreGlusterD 2.0 - Managing Distributed File System Using a Centralized Store
GlusterD 2.0 - Managing Distributed File System Using a Centralized Store
 

Similar to Storing your data in the cloud: doing right reversim 2018

Initial presentation of swift (for montreal user group)
Initial presentation of swift (for montreal user group)Initial presentation of swift (for montreal user group)
Initial presentation of swift (for montreal user group)
Marcos García
 
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp
 
WebCamp Ukraine 2016: Instant messenger with Python. Back-end development
WebCamp Ukraine 2016: Instant messenger with Python. Back-end developmentWebCamp Ukraine 2016: Instant messenger with Python. Back-end development
WebCamp Ukraine 2016: Instant messenger with Python. Back-end development
Viach Kakovskyi
 
Everything you wanted to know about RadosGW - Orit Wasserman, Matt Benjamin
Everything you wanted to know about RadosGW - Orit Wasserman, Matt BenjaminEverything you wanted to know about RadosGW - Orit Wasserman, Matt Benjamin
Everything you wanted to know about RadosGW - Orit Wasserman, Matt Benjamin
Ceph Community
 
Using Zettabyte Filesystem (ZFS)
Using Zettabyte Filesystem (ZFS)Using Zettabyte Filesystem (ZFS)
Using Zettabyte Filesystem (ZFS)
GLC Networks
 
Ceph data services in a multi- and hybrid cloud world
Ceph data services in a multi- and hybrid cloud worldCeph data services in a multi- and hybrid cloud world
Ceph data services in a multi- and hybrid cloud world
Sage Weil
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard
Ceph Community
 
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
JayjeetChakraborty
 
[scala.by] Launching new application fast
[scala.by] Launching new application fast[scala.by] Launching new application fast
[scala.by] Launching new application fast
Denis Karpenko
 
Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016
aspyker
 
Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016
Sharma Podila
 
Kubernetes from scratch at veepee sysadmins days 2019
Kubernetes from scratch at veepee   sysadmins days 2019Kubernetes from scratch at veepee   sysadmins days 2019
Kubernetes from scratch at veepee sysadmins days 2019
🔧 Loïc BLOT
 
Docker Security - Secure Container Deployment on Linux
Docker Security - Secure Container Deployment on LinuxDocker Security - Secure Container Deployment on Linux
Docker Security - Secure Container Deployment on Linux
Michael Boelen
 
Webinar: Building a multi-cloud Kubernetes storage on GitLab
Webinar: Building a multi-cloud Kubernetes storage on GitLabWebinar: Building a multi-cloud Kubernetes storage on GitLab
Webinar: Building a multi-cloud Kubernetes storage on GitLab
MayaData Inc
 
OpenSearch.pdf
OpenSearch.pdfOpenSearch.pdf
OpenSearch.pdf
Abhi Jain
 
Challenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan LambrightChallenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan Lambright
Gluster.org
 
RubiX
RubiXRubiX
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
Ruslan Meshenberg
 
Lt2013 glusterfs.talk
Lt2013 glusterfs.talkLt2013 glusterfs.talk
Lt2013 glusterfs.talk
Udo Seidel
 
Scalable Clusters On Demand
Scalable Clusters On DemandScalable Clusters On Demand
Scalable Clusters On Demand
Bogdan Kyryliuk
 

Similar to Storing your data in the cloud: doing right reversim 2018 (20)

Initial presentation of swift (for montreal user group)
Initial presentation of swift (for montreal user group)Initial presentation of swift (for montreal user group)
Initial presentation of swift (for montreal user group)
 
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
 
WebCamp Ukraine 2016: Instant messenger with Python. Back-end development
WebCamp Ukraine 2016: Instant messenger with Python. Back-end developmentWebCamp Ukraine 2016: Instant messenger with Python. Back-end development
WebCamp Ukraine 2016: Instant messenger with Python. Back-end development
 
Everything you wanted to know about RadosGW - Orit Wasserman, Matt Benjamin
Everything you wanted to know about RadosGW - Orit Wasserman, Matt BenjaminEverything you wanted to know about RadosGW - Orit Wasserman, Matt Benjamin
Everything you wanted to know about RadosGW - Orit Wasserman, Matt Benjamin
 
Using Zettabyte Filesystem (ZFS)
Using Zettabyte Filesystem (ZFS)Using Zettabyte Filesystem (ZFS)
Using Zettabyte Filesystem (ZFS)
 
Ceph data services in a multi- and hybrid cloud world
Ceph data services in a multi- and hybrid cloud worldCeph data services in a multi- and hybrid cloud world
Ceph data services in a multi- and hybrid cloud world
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard
 
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
 
[scala.by] Launching new application fast
[scala.by] Launching new application fast[scala.by] Launching new application fast
[scala.by] Launching new application fast
 
Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016
 
Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016
 
Kubernetes from scratch at veepee sysadmins days 2019
Kubernetes from scratch at veepee   sysadmins days 2019Kubernetes from scratch at veepee   sysadmins days 2019
Kubernetes from scratch at veepee sysadmins days 2019
 
Docker Security - Secure Container Deployment on Linux
Docker Security - Secure Container Deployment on LinuxDocker Security - Secure Container Deployment on Linux
Docker Security - Secure Container Deployment on Linux
 
Webinar: Building a multi-cloud Kubernetes storage on GitLab
Webinar: Building a multi-cloud Kubernetes storage on GitLabWebinar: Building a multi-cloud Kubernetes storage on GitLab
Webinar: Building a multi-cloud Kubernetes storage on GitLab
 
OpenSearch.pdf
OpenSearch.pdfOpenSearch.pdf
OpenSearch.pdf
 
Challenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan LambrightChallenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan Lambright
 
RubiX
RubiXRubiX
RubiX
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Lt2013 glusterfs.talk
Lt2013 glusterfs.talkLt2013 glusterfs.talk
Lt2013 glusterfs.talk
 
Scalable Clusters On Demand
Scalable Clusters On DemandScalable Clusters On Demand
Scalable Clusters On Demand
 

Recently uploaded

AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
Google
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
ICS
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptxLORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
lorraineandreiamcidl
 
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdfRevolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Undress Baby
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
Green Software Development
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
Remote DBA Services
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
Green Software Development
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
timtebeek1
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j
 
Energy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina JonuziEnergy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina Jonuzi
Green Software Development
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
Shane Coughlan
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
Boni García
 
How to write a program in any programming language
How to write a program in any programming languageHow to write a program in any programming language
How to write a program in any programming language
Rakesh Kumar R
 
socradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdfsocradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdf
SOCRadar
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
Rakesh Kumar R
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
Hornet Dynamics
 
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise EditionWhy Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Envertis Software Solutions
 
SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
Hironori Washizaki
 

Recently uploaded (20)

AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptxLORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
 
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdfRevolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 
Energy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina JonuziEnergy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina Jonuzi
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
 
How to write a program in any programming language
How to write a program in any programming languageHow to write a program in any programming language
How to write a program in any programming language
 
socradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdfsocradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdf
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
 
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise EditionWhy Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
 
SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
 

Storing your data in the cloud: doing right reversim 2018

  • 1. 1 Orit Wasserman Reversim 2018 Storing your data in the cloud: doing it right
  • 2. About me ● 20+ years of development ● 10+ in open source: ○ Nested virtualization for KVM ○ Maintainer of live migration in Qemu/kvm ● 4 years as Ceph core developer at Red Hat ● Architect at lightbits labs 2
  • 3. Storing your data in the cloud: doing it right ● Introduction to cloud object storage ● Features: ○ Multipart upload ○ Tags ○ Versioning ○ Life cycle ○ Prefix ○ Static website ● Security ● Summary 3
  • 4. Introduction to cloud object storage 4
  • 5. 5 ● Simple ● No metadata ● Fast ● Protocols: ○ FC/SCSI/SAS/SATA ○ ISCSI/FCoE ○ NVME ● Used by large DB like Oracle and for VM images Block storage
  • 6. 6 ● Hierarchy: Directories and files ● Metadata: ○ Ownership ○ Permissions/ACL ○ Creation/Modification time ● Authentication ● Sharing semantics ● Inplace writes ● Protocols: ○ Local: ext4,xfs, btrfs, zfs, NTFS, … ○ Network: NFS, SMB, AFP File system
  • 7. 7
  • 9. 9 Solution: flat namespace ● Flat namespace: ○ Bucket/container ○ Objects ● Virtual hierarchy ● Buckets and objects are referenced by name Scale problems: hierarchy
  • 10. 10 Scale problems: file allocation ● Complex ● Fragmentation
  • 11. 11 Solution: Objects are immutable ● Object are overwritten ● Read range of an object Scale problems: file allocation
  • 12. 12 ● Flat namespace ● Objects are immutable ● Range Read ● Rich Metadata: ○ Ownership (Users and tenants) ○ ACL ○ User metadata Object storage
  • 13. 13 Cloud object storage ● Restful API ● Common clouds: ○ AWS S3 ○ Swift (openstack) ○ Google cloud storage ○ Azure blob storage ○ Ceph ○ Digital Ocean
  • 14. 14 ● Cloud or large scale environment ● Lots of large objects that are rarely updated. ● Small objects that are updated infrequently and are not performance sensitive. ● Hard drives When to use cloud object storage
  • 15. 15 ● If the application does lots of inplace writes inside big files. ● Legacy application When not to use cloud object storage
  • 18. 18 ● Upload a single object as a set of parts ● Transaction: ○ Initiate ○ Upload parts ○ Complete Multipart upload
  • 19. 19 Multipart upload ● Improved throughput ● Quick recovery from any network issues ● Pause and resume object uploads ● Begin an upload before you know the final object size ● Instead of FS rename
  • 20. 20 Multipart upload pitfalls ● Due to the performance impact not recommend for small objects ● Regular upload is up to 5 GB ● Check your framework/SDK defaults! ● Orphans ...
  • 21. 21 Multipart upload ● Improved throughput ● Quick recovery from any network issues ● Pause and resume object uploads ● Begin an upload before you know the final object size ● Instead of FS rename ● Due to the performance impact not recommend for small objects
  • 22. 22 ● Better object classification ● Tag is key/value pair : color=red, name=orit ● Filter objects ● Can be used by: ○ Access control ○ Life cycle ● Metrics and analytics Object tagging
  • 24. 24 ● Keeps the previous copy of the object in case of overwrite or deletion Versioning ● Problem: space usage
  • 26. 26 ● Configure automatic object transition: ○ Expiration: used to clean old objects, older versions and failed multipart uploads ○ Tiering: move object to colder storage Life cycle
  • 27. 27 ● Add a prefix to an object ● Listing a sub folder by listing objs with a specific prefix virtual hierarchy
  • 28. 28 Host a static website directly from the cloud object storage Static website
  • 30. 30 ● More secure: ○ Key is not part of the request ○ All requests are signed ○ Streaming support ● Not all SDK use it by default or even support it AWS4
  • 31. 31 ● Encrypt the traffic ● High performance penalty ● Options: ○ Tunneling ○ Terminate at the load balancers like HAProxy and use http for your internal network Protocol and transport
  • 32. 32 ● Server side encryption is not enough ● Use client side encryption: ○ SSE-C: Customer provided keys ○ SSE-KMS: Key management service Encryption
  • 33. 33 ● Owner ● System/Admin user ● Other users: Read/Write/Read ACP/Write ACP/Full control ● Canned ACL (predefined): private,public-read, public-read-write, .. Bucket and Object ACL
  • 34. 34 Be careful of public buckets
  • 35. 35 ● Access policies for users and buckets ● Examples: ○ Grant access from multiple accounts ○ Cross account permission ○ Read only for anonymous users ○ Restricting access to a IP specific Bucket and Users policy
  • 36. 36 ● Provides a temporary token to access the cloud storage ● Assume rule ● Used by storage class and glacier Secure Token Service
  • 37. 37 ● Object storage was designed for large scale and for the cloud ● It has features designed for cloud application, in order to use all those features you will need to adopt your application ● Make sure your data is safe! ● Use Ceph for private cloud object storage! github.com/oritwas @oritwas Summary
  • 40. 40 Solution: geo replication ● Global object storage clusters with a single namespace ● Enables deployment of clusters across multiple geographic locations ● Clusters synchronize, allowing users to read from or write to the closest one ● Disaster recovery in case of a zone failure
  • 41. 41 One cloud is not enough Disaster recovery to a different public cloud Replicate your private cloud data to public cloud