
Docker Meetup Tokyo #23 - Zenko Open Source Multi-Cloud Data Controller - Laure Vergeron


In a brisk presentation, we introduce Zenko, the multi-cloud data controller. Zenko is a stack of microservices (Docker containers) deployed via either Kubernetes or Docker Swarm.
The presentation included a live demo deployment, so be sure to check out the video.
There is a lot of extra material at the end of the deck that goes into more depth on topics only touched on briefly during the presentation.



  1. 1. DOCKER TOKYO - 2018-05-15 Zenko: the Multi-Cloud Data Controller. We enable people who create value with data by making AWS storage cheaper. Laure Vergeron, Technology Evangelist, R&D Engineer
  2. 2. Agenda
  3. 3. Agenda 1 - What is multi-cloud? 2 - Zenko: introduction to the multi-cloud data controller 3 - Zenko Orbit: introduction to the multi-cloud management UI 4 - CloudServer standalone: a simple local S3 endpoint 5 - Zenko S3 API: basically just AWS S3 API 6 - Zenko Enterprise Edition: coming later in 2018 7 - Zenkommunity 8 - Demos!
  4. 4. What is Multi-Cloud?
  5. 5. What/why is multi-cloud?
  6. 6. Examples of Use-Cases for Multi-Cloud ▪ Content Distribution ▪ Media companies have tens of thousands of movies, which they store on a private cloud for control. When it is time to publish a movie, it makes sense to copy it to a public cloud to use its transcoding and CDN services. ▪ Compute Bursting ▪ Banks have to run risk analysis leveraging thousands of CPUs every night. These intense computations only run for a few hours, so rather than keeping servers idle for the rest of the day, it makes sense to use public cloud services for the computation. ▪ Analytics ▪ E-commerce companies do more and more machine learning on their very large data lakes. Rather than setting up Hadoop infrastructure in-house, a company can copy just one data set to a Hadoop cloud, run the appropriate algorithm, retrieve the results, and destroy the cloud copy of the data to save on storage costs. ▪ Long-term Archival / Cold Storage ▪ While storing regularly accessed data is cheaper in a private cloud, long-term archival of never-accessed data is cheaper in a public cloud's cold-storage offering. Automatic archival of never-accessed data would save a lot of money.
  7. 7. Who is multi-cloud?
  8. 8. Who is multi-cloud?
  9. 9. What’s the big deal?
  10. 10. What’s the big deal?
  11. 11. What’s the bigger deal?
  12. 12. The Zenko Multi-Cloud Data Controller • Single interface to any cloud ▪ S3 API as a single API set to any cloud • Allow reuse in the cloud ▪ Maintain the native cloud format • Always know your data and where it is ▪ Metadata search • Trigger actions based on data ▪ Data workflow to manage replication, location
  13. 13. Zenko One namespace One endpoint One API Any Cloud
  14. 14. Zenko: one endpoint, one API, one namespace, any cloud in native format ● One API: the S3 API ● One endpoint: your Zenko endpoint ● Any cloud currently supported: Google Cloud Platform, Microsoft Azure, Amazon S3, Wasabi, Digital Ocean Spaces, Scality RING, local storage ● Data stored in native format: use services native to your cloud; Zenko does not lock you in!
  15. 15. Zenko: a stack of microservices ▪ Open Source S3 API: single API set and 360° access to any cloud ▪ Native format: data written through Zenko is stored in the native format of the target cloud storage and can be read directly, without going through Zenko ▪ Backbeat, your data workflow manager: policy-based data management engine ▪ Metadata search with MongoDB: aggregated metadata across all clouds in a replicated MongoDB gives a search tool for optimal data insight ▪ HA/Failover: deployed as multiple containers for resilience via metal-k8s ▪ Simple security: single-tenant credentials managed locally
  16. 16. Zenko Orbit Multi-cloud management made easy
  17. 17. Zenko: Orbit management UI ● Free to use up to 1 TB ● Can either ○ get a sandbox ○ register your own instance ● Available as a pay-as-you-go service for larger capacities ● No support with the free edition
  18. 18. Zenko: Orbit management UI ● Manage endpoints ● Manage locations ● Manage user keys ● Manage replication workflows ● Hardware stats ● Storage stats ● Probes to avoid mistakes at configuration steps
  19. 19. CloudServer Standalone Make your hard drive an S3 bucket
  20. 20. CloudServer: What? • S3 API served in a Docker container • Written in Node.js • 100% open source (Apache 2.0) • S3 stands for Simple Storage Service. The S3 API provides a simple interface used to store objects in buckets. • Single AWS S3 API interface to access multiple backend data stores, both on-premises and in the public cloud.
  21. 21. Open Source Scality CloudServer Adoption ▪ Launched June 2016 ▪ Open-source implementation of the AWS S3 API ▪ Code available on GitHub under the Apache 2.0 license ▪ Packaged in a Docker container for easy deployment ▪ Seamless upgrade to S3 Connector for the RING ▪ Now over 1,000,000
  22. 22. Zenko S3 API What’s there? What’s coming?
  23. 23. Current Release API Support • Core S3 APIs: bucket and object operations (PUT, GET, DELETE, HEAD); Multi-Part Upload (MPU) for efficient ingest of large objects • Advanced S3 APIs: Bucket Website; Bucket CORS; Bucket Versioning; Bucket Cross-Region Replication (CRR) • Extended APIs: Utilization API for metering of capacity, #objects, bandwidth & ops; Bucket CRR 1-to-many 2018 Roadmap • Bucket Lifecycle: expiration policies - Q3 2018 • Extended APIs: metadata search through extended GET Bucket API - Q3 2018
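As a side note on MPU, the multipart ETag that S3-compatible servers return is commonly the MD5 of the concatenated per-part MD5 digests, suffixed with the part count. A minimal sketch of that convention (illustrative only, not CloudServer's actual code):

```python
import hashlib

def multipart_etag(parts):
    """S3-style multipart ETag: MD5 of the concatenated binary MD5
    digests of each part, suffixed with the part count."""
    digests = [hashlib.md5(p).digest() for p in parts]
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return f"{combined}-{len(digests)}"

# Split a payload into two parts and compute the MPU-style ETag.
payload = b"x" * 10
etag = multipart_etag([payload[:5], payload[5:]])
print(etag)
```

This is why an object ingested via MPU shows an ETag with a `-N` suffix rather than a plain MD5 of the whole body.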
  24. 24. Zenko EE IAM for the enterprise, NFS over S3… … soon!
  25. 25. Zenko EE: COMING SOON! Just not yet... • Zenko EE is not yet available – We are doing our first beta deployments at large American customers – We are looking at Q1 2019 for GA • Zenko CE is readily available, with community support – Plenty of documentation on readthedocs – Send an email to your SE or to • Zenko Orbit is available pay-as-you-go – First TB of data managed is free
  26. 26. Zenkommunity Do you create value with data? Join us!
  27. 27. Building a Developer Community • Community Meetups – Initiated prior to our S3 Server launch – At Docker Tokyo on May 15th – At Scality Tokyo Open Source Night on May 16th – Participating at open source events for Docker, Node.js, etc. • Developer Hackathons – Paris and San Francisco in 2015-2018; maybe Tokyo next? – Co-sponsoring with partners, focused on a specific project goal (e.g., IP Drives, Backblaze integrations) – Great for building visibility & community participation
  28. 28. How can I get involved with Zenko? • Let us know what you do with the Zenko stack! ▪ Get your project/company featured on the website in a quote • Contribute tutorials ▪ Get a blog post introducing your tutorial ▪ Become part of our readTheDocs hosted documentation • Contribute code ▪ It’s an opportunity to drive the roadmap with us! ▪ Join the team and be part of the Zenko craze! ▪ We have Contributing Guidelines on the GitHub repos, and we’ll answer your questions via GitHub issues or our forum • Meet us at AWS re:Invent, DockerCon, Meetups... ▪ All info is on
  29. 29. Zenko CE: everything you need ▪ Zenko: Multi-Cloud Data Controller ▪ ▪ ▪ ▪ ▪ CloudServer (S3 API standalone) ▪ ▪ ▪ ▪ Backbeat (data-driven event manager) ▪ ▪ Clueso (cross-cloud metadata search) ▪ @zenko_io @scality @LaureVergeron @GiorgioRegni
  30. 30. Ready to join us? • Create an account on our Forum • • Clone Zenko and its microservices • • • • Install s3cmd and the AWS CLI • Read the docs ;) • Start with a minikube deployment • • Reach out on the Forum whenever you have questions • Try a full bare-metal-kubespray deployment •
  31. 31. DIY and Demo Get your own Zenko sandbox Deploy Zenko on Minikube and register it on Orbit
  32. 32. Q&A If you have more questions later:
  33. 33. Email: Thank You
  34. 34. Extra slides
  35. 35. Extra Slides Zenko One namespace One endpoint One API Any Cloud
  36. 36. Zenko: one namespace, any cloud
Mapping (Zenko local vs. Zenko to Public Cloud, e.g. AWS):
- Region / Location: a Region locally; a public cloud “bucket” when backed by a public cloud
- Bucket: a Bucket locally; a prefix in the public cloud bucket when backed by a public cloud
$ aws --profile zenko s3 mb s3://remote-bucket --region aws-zenkobucket
make_bucket: s3://remote-bucket
$ aws --profile zenko s3 cp /etc/hosts s3://remote-bucket/test
upload: /etc/hosts to s3://remote-bucket/test
$ aws --profile zenko s3 ls
2018-05-14 17:08:50 remote-bucket
$ aws --profile zenko s3 ls s3://remote-bucket
2018-05-14 17:09:18 235 test
$ aws --profile aws s3 ls
2018-05-14 16:00:53 zenkobucket
$ aws --profile aws s3 ls s3://zenkobucket
PRE remote-bucket/
2018-05-14 17:09:18 235 test
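The demo above shows a Zenko bucket surfacing as a prefix inside the backing public cloud bucket. That mapping can be sketched as a toy function (the function name and arguments are hypothetical, for illustration only):

```python
def to_cloud_key(zenko_bucket, zenko_key, cloud_bucket):
    """Map a Zenko bucket/key pair onto the target cloud bucket: the
    Zenko bucket name becomes a key prefix, as seen in the demo above."""
    return cloud_bucket, f"{zenko_bucket}/{zenko_key}"

# The object written to s3://remote-bucket/test on Zenko lands at
# s3://zenkobucket/remote-bucket/test on AWS.
bucket, key = to_cloud_key("remote-bucket", "test", "zenkobucket")
print(bucket, key)  # zenkobucket remote-bucket/test
```

Because the layout is this simple, the object stays directly readable from the public cloud side, which is what "native format" means in practice.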
  37. 37. Zenko: a stack of microservices
$ docker stack services zenko-prod
ID NAME MODE REPLICAS IMAGE PORTS
1j8jb41llhtm zenko-prod_s3-data replicated 1/1 zenko/cloudserver:pensieve-3 *:30010->9991/tcp
3y7vayna97bt zenko-prod_s3-front replicated 1/1 zenko/cloudserver:pensieve-3 *:30009->8000/tcp
957xksl0cbge zenko-prod_mongodb-init replicated 0/1 mongo:3.6.3-jessie
cn0v7cf2jxkb zenko-prod_queue replicated 1/1 wurstmeister/kafka:1.0.0 *:30008->9092/tcp
jjx9oabeugx1 zenko-prod_mongodb replicated 1/1 mongo:3.6.3-jessie *:30007->27017/tcp
o530bkuognu5 zenko-prod_lb global 1/1 zenko/loadbalancer:latest *:80->80/tcp
r69lgbue0o3o zenko-prod_backbeat-api replicated 1/1 zenko/backbeat:pensieve-4
ut0ssvmi10tx zenko-prod_backbeat-consumer replicated 1/1 zenko/backbeat:pensieve-4
vj2fr90qviho zenko-prod_cache replicated 1/1 redis:alpine *:30011->6379/tcp
vqmkxu7yo859 zenko-prod_quorum replicated 1/1 zookeeper:3.4.11 *:30006->2181/tcp
y7tt98x7jdl9 zenko-prod_backbeat-producer replicated 1/1 zenko/backbeat:pensieve-4
[...]
Components of the Zenko Multi-Cloud Data Controller:
- Cloudserver: S3 API, multi-cloud API translation (custom Node.js)
- Backbeat: event-driven data manager, replication engine (Kafka- & Zookeeper-based)
- Utapi: usage stats (Redis-based)
- Bare-Metal Kubespray: custom deployment of Kubespray
  38. 38. Extra slides S3 API & Extended API
  39. 39. Zenko S3 API: Bucket Versioning CloudServer implements the AWS S3 Bucket Versioning API • Create a versioned bucket (PUT Bucket Versioning) – enables the bucket to maintain object versions • If an object with the same key is PUT or DELETEd, it becomes the current version – A DELETE marker indicates the current version is deleted, as per AWS semantics – Version IDs are assigned to older versions • Enables data restores – Access to previous states/versions of an object (GET object with a specified version ID) • Required for both the CRR & Lifecycle Management APIs – As specified in the S3 API – When writing to AWS S3, the target bucket must have versioning enabled!
  40. 40. ● When versioning is enabled on a bucket: ● CREATE NEW VERSIONS: ○ Put Object, Complete Multipart Upload and Object Copy (to a versioning-enabled bucket) will return a version id in the ‘x-amz-version-id’ response header. ○ No special syntax necessary. ● When versioning is enabled or suspended: ● TARGETING SPECIFIC VERSIONS: ○ Include the version id in the request query for GET/HEAD Object or PUT/GET Object ACL ■ Example: `GET [bucket]/[object]?versionId=[versionId]` ○ For Object Copy or Upload Copy Part, to copy a specific version from a version-enabled bucket, add the version id to the ‘x-amz-copy-source’ header: ■ Example value: `[sourcebucket]/[sourceobject]?versionId=[versionId]` ○ Omitting a specific version will get the result for the latest / current version. Zenko S3 API: Bucket Versioning
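The version-targeting syntax above can be sketched as two small helpers that build the request path and the ‘x-amz-copy-source’ header value (helper names are hypothetical, for illustration):

```python
from urllib.parse import quote, urlencode

def versioned_object_url(bucket, key, version_id=None):
    """Request path for GET/HEAD Object or PUT/GET Object ACL,
    optionally targeting a specific version via versionId."""
    path = f"/{bucket}/{quote(key)}"
    if version_id is not None:
        path += "?" + urlencode({"versionId": version_id})
    return path

def copy_source_header(bucket, key, version_id=None):
    """Value for the x-amz-copy-source header when copying a
    specific version from a versioning-enabled bucket."""
    value = f"{bucket}/{quote(key)}"
    if version_id is not None:
        value += f"?versionId={version_id}"
    return value

print(versioned_object_url("mybucket", "myobject", "abc123"))
print(copy_source_header("srcbucket", "srcobject", "abc123"))
```

Omitting `version_id` reproduces the default behavior described above: the request applies to the latest / current version.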
  41. 41. ● When versioning is enabled or suspended (cont.): ● NULL VERSIONS: ○ Null versions are created when putting an object before versioning is configured or when versioning is suspended. ■ Only one null version is maintained in version history. New null versions will overwrite previous null versions. ○ Target the null version in version-specific actions by specifying the version ID ‘null’. ● DELETING OBJECTS: ○ Regular deletion of objects will create delete markers and return ‘x-amz-delete-marker’: ‘true’ and the version ID of the delete marker in ‘x-amz-version-id’ response headers. ○ Objects with delete markers as the latest version will behave as if they have been deleted when performing non-version specific actions. ○ Permanently remove delete markers or specific versions by specifying the version ID in the request query. Example: `DELETE [bucket]/[object]?versionId=[versionId]` Zenko S3 API: Bucket Versioning
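The null-version and delete-marker semantics above can be modeled with a toy in-memory bucket (illustrative only; this is not CloudServer's implementation, and real delete markers while versioning is suspended also take the ‘null’ id):

```python
import itertools

class VersionedBucket:
    """Toy model of S3-style versioning: null versions, delete
    markers, and version-specific deletes."""

    def __init__(self):
        self.versioning = False
        self.history = {}  # key -> list of (version_id, value), newest last
        self._ids = itertools.count(1)

    def put(self, key, value):
        vid = str(next(self._ids)) if self.versioning else "null"
        versions = self.history.setdefault(key, [])
        if vid == "null":
            # Only one null version is maintained: overwrite any previous one.
            versions[:] = [v for v in versions if v[0] != "null"]
        versions.append((vid, value))
        return vid

    def delete(self, key, version_id=None):
        versions = self.history.get(key, [])
        if version_id is None:
            versions.append((str(next(self._ids)), None))  # delete marker
        else:
            # Version-specific delete permanently removes that version.
            versions[:] = [v for v in versions if v[0] != version_id]

    def get(self, key):
        versions = self.history.get(key, [])
        if not versions or versions[-1][1] is None:
            return None  # missing, or the latest version is a delete marker
        return versions[-1][1]

b = VersionedBucket()
b.put("obj", "v0")       # null version: versioning not yet enabled
b.versioning = True
b.put("obj", "v1")       # gets version id "1"
b.delete("obj")          # adds a delete marker (version id "2")
print(b.get("obj"))      # None: latest version is a delete marker
```

Permanently removing the delete marker by its version id makes the object visible again, matching the behavior described above.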
  42. 42. ● When versioning is enabled or suspended (cont.): ● MULTI-OBJECT DELETE: ○ Specify the specific version of an object to delete in a multi-object delete request in the XML body. Example: ● At any time: ● LISTING OBJECTS: ○ A regular listing will list the most recent version of each object and ignore objects with delete markers as their latest version. ○ To list all object versions and delete markers in a bucket, specify ‘versions’ in the request query: ■ Example: `GET [bucket]?versions` ○ For more information about the output, consult the S3 Connector documentation ● GET BUCKET VERSIONING STATUS: use the Get Bucket Versioning API. Zenko S3 API: Bucket Versioning
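The multi-object delete request above sends an XML body listing keys and, optionally, version IDs. A sketch of building that body with Python's standard library, following the standard S3 ‘Delete’ document shape (illustrative only):

```python
import xml.etree.ElementTree as ET

def multi_object_delete_body(objects):
    """Build the XML body for an S3 multi-object delete.
    `objects` is a list of (key, version_id_or_None) pairs."""
    root = ET.Element("Delete")
    for key, version_id in objects:
        obj = ET.SubElement(root, "Object")
        ET.SubElement(obj, "Key").text = key
        if version_id is not None:
            # Targets a specific version (or a delete marker) by id.
            ET.SubElement(obj, "VersionId").text = version_id
    return ET.tostring(root, encoding="unicode")

body = multi_object_delete_body([("photo.jpg", "abc123"), ("notes.txt", None)])
print(body)
```

Entries without a `VersionId` behave like a regular delete and create a delete marker; entries with one permanently remove that version.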
  43. 43. Extended S3 API: Utapi (UTilization API) ● Utapi can be accessed through a REST API, with the service available on a dedicated port ● API routes use AWS Signature Version 4 for authentication ● Calls for listing metrics can use account credentials, or an IAM user with a policy allowing metrics listing ● Metrics are collected in 15-minute intervals (not configurable) ● Requests for listing metrics use POST routes and require at least a start time and a list of resources (accounts/buckets/users) ● Refer to the wiki for listing metrics
  44. 44. Zenko S3 API: Bucket Location Control S3 buckets with an associated location • Assigned as an optional request parameter “LocationConstraint” • In the PUT Bucket API command, the application can specify a location for each bucket • Enables S3 Connector to manage buckets across multiple RINGs, for scaling or access to multiple DCs • Enables Zenko to access multiple public clouds Location mapping • Configuration file to manage multiple location-to-backend mappings • Defines the default location for object PUTs • Object GET access is transparent
  45. 45. Zenko S3 API: Bucket Location Control • Specify bucket location at bucket creation:
aws --endpoint-url s3api create-bucket --bucket test-bucket --create-bucket-configuration LocationConstraint=azure-container
• Object creation remains as usual:
aws --endpoint-url s3 cp /etc/hosts s3://test-bucket/hosts
• Request syntax:
PUT / HTTP/1.1
Host: {{BucketName}}.{{StorageService}}.com
Content-Length: {{length}}
Date: {{date}}
Authorization: {{authenticationInformation}}
<CreateBucketConfiguration xmlns="">
<LocationConstraint>azure-container</LocationConstraint>
</CreateBucketConfiguration>
  46. 46. Zenko Multi-Cloud Async Replication - CRR • Remote disaster recovery (DR) for WAN environments – Follows the AWS S3 “cross-region replication” (CRR) API – Async bucket replication: source bucket -> target bucket – Versioning must be enabled on source & target • Target bucket in S3/RING – CRR to remote S3/RING in current release • CRR features: – Full site sync – Bucket-to-bucket sync – Monitoring statistics (throughput, backlog, RTO/RPO) – Failback – CRR from 1 region to many others (one public cloud to several others)
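The asynchronous nature of CRR above can be illustrated with a toy queue-based model (all names are hypothetical; real Zenko CRR is driven by Backbeat over Kafka):

```python
class ReplicatedBucket:
    """Toy model of async bucket-to-bucket replication (CRR): writes
    to the source are queued, then applied to the target later by a
    consumer. Illustrative only, not Backbeat code."""

    def __init__(self):
        self.source = {}
        self.target = {}
        self.queue = []  # pending replication events

    def put(self, key, value, version_id):
        # Versioning must be enabled on both sides, so we store
        # per (key, version) pair rather than per key.
        self.source[(key, version_id)] = value
        self.queue.append((key, version_id))

    def drain(self):
        # The replication consumer applies queued events to the target.
        while self.queue:
            key, version_id = self.queue.pop(0)
            self.target[(key, version_id)] = self.source[(key, version_id)]

r = ReplicatedBucket()
r.put("movie.mp4", b"data", "v1")
pending = len(r.queue)  # 1: the write is acknowledged before replication
r.drain()               # the consumer catches up
print(pending, len(r.target))
```

The gap between `pending` and an empty queue after `drain()` is exactly what the backlog and RTO/RPO monitoring statistics above measure in a real deployment.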
  47. 47. Extra slides File or Object
  48. 48. File or object?
  49. 49. File or object?
  50. 50. File or object? Why we do file: - We know it - Easy hierarchy - fopen() and fclose() - Lots of best practices - Perf of NAS over LAN Why we do object: - Billions of entries - Storage accessed over WAN - For modern apps (REST) - Listing large volumes
  51. 51. Extra slides Community
  52. 52. CloudServer tree structure cheat sheet
S3/locationConfig.json and S3/config.json - set up your own endpoints
S3/lib/server.js - your entry point into the service
Arsenal/lib/s3routes/routes/*.js - S3 route calls
S3/lib/api/{{yourAPIcommand}} - S3 API calls
S3/lib/data/wrapper.js & multipleBackendGateway.js - gateway to external clients
S3/lib/data/external/*Client.js - current clients
S3/conf/authData.json - set up your own credentials
Arsenal/lib/storage/metadata/* - check how metadata works
  53. 53. Start S3 Server & Put Object Commands
These commands assume you have S3 cloned locally, s3cmd configured for your S3 server, the AWS CLI configured for a real AWS bucket, and your locationConfig set up.
- START SERVER: S3BACKEND=mem S3DATA=multiple npm start
- MAKE BUCKET: s3cmd mb s3://[bucket-name]
- PUT OBJECT TO SPECIFIC LOCATION: s3cmd put [/path/to/file] s3://[bucket-name]/[object-name] --add-header x-amz-meta-scal-location-constraint:'[location-name]'
- LIST OBJECTS IN BUCKET: s3cmd ls s3://[bucket-name]
- GET S3 OBJECT METADATA: s3cmd info s3://[bucket-name]/[object-name]
- IF PUT TO AWS, LIST OBJECTS ON AWS: aws s3api list-objects --bucket [bucket-name]