Cloudian_Cassandra Summit 2012

Cloudian presentation at Cassandra Summit 2012 on August 8, 2012 in Santa Clara


    1. Cassandra Summit 2012 (#cassandra12). Using Cassandra in Cloudian, an S3 Cloud Storage System. August 8, 2012. Gary Ogasawara, Cloudian, Inc. Copyright © 2012 Cloudian Inc. & KK. All Rights Reserved.
    2. What is Cloudian? Cloudian = S3 Cloud Storage as Packaged Software.
    3. Cloudian Features
        1. Full Amazon S3 API compatibility, including error codes.
        2. Multi-datacenter, peer-to-peer architecture. No single point of failure.
        3. Multi-tenant: QoS controls, billing, and reporting for each User and each Group.
        4. Public and Private Clouds.
        5. Elastic capacity: start small and scale out as needed.
        6. System, Group, and User management by Management Console or REST API.
        7. Easy-to-use packaged software, backed by 24x7 carrier-grade support.
    4. Cloudian Objectives
        1. S3 API full compatibility
            • Use S3 ecosystem applications "as is".
            • API already designed.
        2. Fully packaged software
            • Hide NoSQL complexity
            • Easy install/upgrade
            • HyperStore: best-fit store
            • Easy to deploy on existing hardware/network.
            • Flexible for different customer types.
            • Scalable. Start small and grow.
        3. Complete service platform
            • User/Group provisioning
            • Cluster management
            • Reporting
            • Billing
            • Turnkey system.
            • Can choose integration points with existing systems.
    5. Object vs. File vs. Block Storage (by abstraction level)
        • Application level: HTTP, OBJECTS
        • OS user level: NAS (NFS, CIFS), FILES
        • OS kernel level: SAN (iSCSI), BLOCKS
    6. S3 Ecosystem. Libraries, applications, gateways, etc. using Amazon S3 can be simply re-pointed to Cloudian. Deployment types: Public, Hybrid, Private.
    7. S3 Functions
        • HTTP REST API: PUT, POST, GET, DELETE, HEAD.
        • Objects organized into buckets.
        • Security. Requests authenticated using keyed HMAC with symmetric keys. Also an HTTPS option, client-side encryption, and server-side encryption.
        • Access control lists (ACLs) define access rights to buckets and objects.
        • Accounting of bytes inbound, outbound, and stored, and of HTTP request counts. Billing by tiered rating plans per accounting type, per region.
        • Multi-part uploads. Allows uploading large objects in multiple parts.
        • Versioning. Multiple versions of the same object.
        • Location constraint. Buckets can be assigned to a specific region. Each region has its own domain.
        • …
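The keyed-HMAC request authentication listed on slide 7 can be sketched as below. This is a minimal illustration modeled on the publicly documented Amazon S3 Signature Version 2 scheme (HMAC-SHA1 over a canonical string, then base64). The slides do not say which signature version Cloudian implements, and the function names, keys, and resource path here are hypothetical.

```python
import base64
import hashlib
import hmac


def sign_request(secret_key: str, verb: str, content_md5: str,
                 content_type: str, date: str, resource: str) -> str:
    """Return a base64-encoded HMAC-SHA1 signature for one request.

    Assumption: the request carries no x-amz-* headers, so the
    canonicalized header section of the string-to-sign is empty.
    """
    string_to_sign = "\n".join([verb, content_md5, content_type, date, resource])
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(access_key: str, signature: str) -> str:
    """Format the value of the Authorization header."""
    return f"AWS {access_key}:{signature}"


def verify(secret_key: str, signature: str, verb: str, content_md5: str,
           content_type: str, date: str, resource: str) -> bool:
    """Server side: recompute the signature with the shared secret and compare."""
    expected = sign_request(secret_key, verb, content_md5,
                            content_type, date, resource)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)
```

Because the key is symmetric, client and server run the same computation; the server only needs to look up the secret that belongs to the access key in the header.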
    8. Works with leading Cloud Compute Platforms
        • Cloudian-Citrix CloudStack (May 9, 2012)
        • Cloudian-OpenStack (October 21, 2011)
    9. Cloudian Customers (Public, Hybrid, Private) and Channel Partners.
    10. Why Cassandra?
        • Scalable
            - Add capacity by adding nodes to a running system.
            - Distributed (P2P architecture), no single point of failure.
        • Reliable
            - Resilient to network or hardware failures.
            - Multi-datacenter replication.
            - Tunable data consistency level.
        • Features
            - TTL, secondary indexes, counters, compression, encryption, …
        • Fast
            - Write path especially fast.
    11. Cassandra in Cloudian
        • v1.0.7 in use (started at 0.7.x)
        • Forked to add customizations
        • Hector client
        • Data stored includes:
            - Object metadata
            - Reports/logs
            - Counters for rate control
            - …
    12. Cloudian: Logical Architecture. [Diagram. Labels: Web UI (Management Console, Data Explorer); Applications; Admin Server and S3 Server (servlets) reached over HTTP or HTTPS; Credentials DB (login, account profile, security keys); UserData DB (Cassandra); AccountInfo & QoS DB (Cassandra); Reports DB (Cassandra).]
    13. Minimum Redundant Configuration. [Diagram. Browser HTTPS requests for the UI (sticky sessions) and application HTTP/HTTPS requests for S3 pass through a load balancer (LB) to two nodes, each running Servlets, a Credentials DB, an HTTP/S Server, and Cassandra.]
    14. Multi-Datacenter Example
        • 2 datacenters (DC1, DC2) / 4 nodes per datacenter.
        • Each node runs CMC, Redis, S3/Admin, and Cassandra/HyperStore. One node runs the Redis master (M); all other nodes run Redis slaves (S).
        • Storage objects, reports, and profiles are replicated across DCs by Cassandra.
        • Credentials DB (Redis) has a local DC slave and a single global master.
    15. Network Scaling Example. [Diagram. Region 1: DC 1-1, DC 1-2. Region 2: DC 2-1. Region 3: DC 3-1, DC 3-2.]
    16. Cassandra for Object Store
        • Dynamically decide how to store each object (Cassandra or file system).
        • Cassandra is better for small objects.
        • Large objects are split into multiple parts and chunks.
        [Diagram: a Column Family in which the Random Partitioner places each Row key, and the row holds a Column Name and Value.]
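Splitting a large object into chunks stored as columns of one row (slide 16) could look roughly like the sketch below. The column naming scheme and chunk size are assumptions made for illustration; the slides do not describe Cloudian's actual layout.

```python
from typing import Dict, List


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Cut an object's bytes into fixed-size chunks (the last one may be shorter)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def to_columns(data: bytes, chunk_size: int) -> Dict[str, bytes]:
    """Map chunks to column names under a single row key.

    Hypothetical naming: zero-padded indices, so that sorting the
    column names yields the chunks in their original order.
    """
    chunks = split_into_chunks(data, chunk_size)
    return {f"chunk-{i:06d}": chunk for i, chunk in enumerate(chunks)}


def reassemble(columns: Dict[str, bytes]) -> bytes:
    """Rebuild the object by concatenating chunks in column-name order."""
    return b"".join(columns[name] for name in sorted(columns))
```

Zero-padding matters because columns within a row are sorted by name, and a plain decimal index would place "chunk-10" before "chunk-2".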
    17. Cassandra for Object Metadata
        • Size, ETag, MD5, timestamp, ACL, part info, version, etc.
        • Old versions of the metadata format are supported.
        [Diagram: a Column Family in which the Random Partitioner places each Row Key, and the row holds many Column Name / Value pairs, sorted by Column Name.]
    18. Cassandra for Account Info
        DATA MODEL
        • User: ID, name, contact info, etc.
        • Group: ID, name, contact info, etc.
        • Rating Plan
        • Security Credentials
        • QoS Counters
        NOTES
        • "Static" data. Fixed number of columns.
        • Could be put in a relational DB like MySQL, but there is no need to add another component.
    19. Quality of Service / SLA Management
        • Configurable maximum limits per region at the per-user, per-group, and system levels:
            - Requests/minute
            - Storage bytes
            - Storage objects
            - Data bytes inbound
            - Data bytes outbound
        • When a limit is reached, requests are rejected.
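The limit check described on slide 19 can be sketched as a comparison of current counters against configured maximums at each level. The data layout and metric names below are hypothetical, and in the real system the counters would live in Cassandra rather than in a Python dict.

```python
from typing import Dict, List, Tuple

# Levels at which limits can be configured, per the slide.
LEVELS = ("user", "group", "system")

Counters = Dict[str, Dict[str, int]]  # level -> metric -> value


def exceeded_limits(usage: Counters, limits: Counters) -> List[Tuple[str, str]]:
    """Return the (level, metric) pairs whose usage has reached its limit."""
    hits = []
    for level in LEVELS:
        for metric, maximum in limits.get(level, {}).items():
            # A metric with no recorded usage counts as zero.
            if usage.get(level, {}).get(metric, 0) >= maximum:
                hits.append((level, metric))
    return hits


def admit(usage: Counters, limits: Counters) -> bool:
    """Accept a request only if no configured limit has been reached."""
    return not exceeded_limits(usage, limits)
```

A metric with no configured maximum is treated as unlimited, which keeps the limits optional at every level.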
    20. Cassandra for Reports
        DATA MODEL
        • "Raw" column family
            - User, Group, System
            - Transaction type (HTTP GET, PUT, DELETE), …
            - Object path
            - Size
            - …
        • "Rollup" column families
            - RollupHour. Summarizes data for each hour using Raw data.
            - RollupDay. Summarizes data for each day using RollupHour data.
            - RollupMonth. Summarizes data for each month using RollupDay data.
        NOTES
        • High write rate. Low read rate.
        • Rollup tables are used for direct queries.
        • Automatic deletion using Cassandra TTL (time-to-live).
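The rollup chain on slide 20 (Raw to RollupHour to RollupDay to RollupMonth) amounts to repeated aggregation, each stage reading the output of the previous one. A minimal in-memory sketch follows; the record fields are assumed for illustration and this is not the actual Cloudian schema.

```python
from collections import defaultdict
from datetime import datetime
from typing import Callable, Dict, List


def to_hour(ts: datetime) -> datetime:
    return ts.replace(minute=0, second=0, microsecond=0)


def to_day(ts: datetime) -> datetime:
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


def to_month(ts: datetime) -> datetime:
    return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def rollup(records: List[Dict], bucket: Callable[[datetime], datetime]) -> List[Dict]:
    """Sum request counts and bytes per (user, operation, time bucket).

    Raw records carry no "requests" field and count as one request each;
    rolled-up records carry the count accumulated by the previous stage.
    """
    totals = defaultdict(lambda: {"requests": 0, "bytes": 0})
    for rec in records:
        key = (rec["user"], rec["op"], bucket(rec["ts"]))
        totals[key]["requests"] += rec.get("requests", 1)
        totals[key]["bytes"] += rec["bytes"]
    return [{"user": u, "op": op, "ts": ts, **sums}
            for (u, op, ts), sums in totals.items()]
```

Usage: `hourly = rollup(raw, to_hour)`, then `daily = rollup(hourly, to_day)`, then `monthly = rollup(daily, to_month)`. Each stage reads only the next-finer table, so the raw rows can expire by TTL once they have been summarized.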
    21. Cassandra: Wish List
        1. Repair
            • Slow, impact on performance, difficult to monitor progress, manual operator action required.
        2. Compaction
            • Heavy performance impact. Hard to tune. Capacity planning difficult.
        3. Schema changes
            • Fixed in 1.1.
        4. Large column slices.
        5. Caches (row and key) not useful. Slower performance, large memory use.
        6. JMX too slow. Need to directly use and expose Java interfaces.
    22. HyperStore™
        HyperStore: management policies tailored for different object types.
        • Object metadata is still stored in Cassandra.
        • Uses Cassandra's distributed systems methods for data partitioning, replication, and node health detection.
        • Cassandra source is forked for customizations.
        Benefits:
        • Better performance
        • More capacity per node
        • Higher disk utilization
        • Storage layer flexibility
        [Diagram: Cloudian S3 Storage Server. Labels: Admin, NFS, Credentials, S3 REST API, Reporting (Cassandra), Accounting (Cassandra), HyperStore Manager, Data Store (Cassandra), Data Store (File System).]
    23. HyperStore: Hybrid Storage Example
        [Plot: U against X for Storage 1 and Storage 2, with the optimal choice marked.]
        • The optimal solution is to choose the storage method that minimizes latency.
        • Generally, you want to maximize/minimize U, a performance metric, based on random variables X, using a mixture of N storage layers.
        • In a simple case:
            - U: average latency
            - X = {object size}
            - N = {cassandra, ext4 fs}
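For the simple case on slide 23 (U is average latency, X is object size, N is {cassandra, ext4 fs}), the selection rule reduces to picking the layer with the lowest predicted latency for a given size. The sketch below uses made-up linear latency models purely to show the mechanism; the coefficients are not measurements from the talk.

```python
from typing import Callable, Dict

# Hypothetical latency models in milliseconds as a function of object size in KB.
# Assumed shape: Cassandra has low fixed cost but grows faster with size;
# the file system has higher fixed cost but grows more slowly.
LATENCY_MODELS: Dict[str, Callable[[float], float]] = {
    "cassandra": lambda size_kb: 2.0 + 0.08 * size_kb,
    "ext4": lambda size_kb: 6.0 + 0.01 * size_kb,
}


def choose_store(size_kb: float,
                 models: Dict[str, Callable[[float], float]] = LATENCY_MODELS) -> str:
    """Return the name of the storage layer with the lowest predicted latency."""
    return min(models, key=lambda name: models[name](size_kb))
```

With these illustrative coefficients the two lines cross at roughly 57 KB, so small objects go to Cassandra and larger ones to the file system, matching the slide's claim that Cassandra is the better fit for small objects.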
    24. HyperStore: Faster Read & Writes
        [Chart: PUT latency (ms) against object size (0.5, 5, 50, 500 KB), PUT-Cass vs. PUT-HS. Annotation: >30% faster.]
        [Chart: GET latency (ms) against object size (0.5, 5, 50, 500 KB), GET-Cass vs. GET-HS. Annotation: >400% faster.]
    25. HyperStore: Less Compaction
        Test load: 20 tps, 10 threads, 2MB data.

                            PUT      GET      LIST    DELETE
        No HyperStore
          Operations        50478    1679     3642    422
          Latency (msec)    149.78   314.80   41.60   34.50
        With HyperStore
          Operations        50559    9195     3575    2224
          Latency (msec)    96.64    35.63    28.14   23.93

        [Charts: iostat % utilization and io read/write (MB) for each configuration.]
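The latency figures in the slide 25 table can be turned into relative improvements with a few lines of arithmetic. The numbers below are copied from the table; the helper name is illustrative.

```python
# Latency in msec, copied from the slide 25 table.
NO_HYPERSTORE = {"PUT": 149.78, "GET": 314.80, "LIST": 41.60, "DELETE": 34.50}
WITH_HYPERSTORE = {"PUT": 96.64, "GET": 35.63, "LIST": 28.14, "DELETE": 23.93}


def latency_reduction_percent(before: dict, after: dict) -> dict:
    """Percentage drop in latency per operation: 100 * (before - after) / before."""
    return {op: 100.0 * (before[op] - after[op]) / before[op] for op in before}


reductions = latency_reduction_percent(NO_HYPERSTORE, WITH_HYPERSTORE)
for op, pct in reductions.items():
    print(f"{op}: {pct:.1f}% lower latency")
```

GET shows by far the largest drop (close to 89%), with PUT, LIST, and DELETE each falling by roughly a third.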
    26. Finally
        Cassandra and other enabling technologies have "leveled the playing field" for cloud storage providers.
        Info: www.cloudian.com
        • Download trial version.
        • Coming soon:
        • #1 best seller in "Database" category on amazon.co.jp.
