Handling Increasing Load and Reducing Costs Using Aerospike NoSQL Database - Zohar Elkayam

  1. Handling Increasing Load and Reducing Costs Using Aerospike NoSQL Database Speed and Scale when you need it Zohar Elkayam, Solutions Architect, Aerospike April 2020
  2. Agenda • Some background on how Aerospike works • What happens when we need more? • Live Demo • All Flash – Reducing Costs • Aerospike Cloud
  3. Unbreakable Competitive Advantage: the patented Aerospike Hybrid Memory Architecture™ Flash-Optimized Storage Layer ✓ Significantly higher performance & IOPS Multi-threaded, Massively Parallel ✓ ‘Scale up’ and ‘scale out’ Self-Healing Clusters ✓ Superior uptime, availability and reliability Storage Indices in DRAM, Data on Optimized SSDs ✓ Predictable performance regardless of scale ✓ Single hop to data
  4. Data Distribution and Scalability (Diagram: cluster data distributed evenly across SSD 1–5, each device holding an equal share.) Linear Scaling ✓ Scale UP – take full advantage of hardware ✓ Scale OUT – linear scaling with the number of nodes Automatic Distribution of Data using the Smart Partitions™ Algorithm ✓ Even amount of data on every node and flash device ✓ All hardware used equally ✓ Load on all servers is balanced ✓ No “hot spots” ✓ No config changes as the workload or use case changes Smart Clients ✓ Single “hop” from client to server ✓ Cluster-spanning operations (scan, query, batch) sent to all nodes for parallel processing
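To make the smart-client behaviour above concrete, here is a minimal sketch using the Aerospike Python client; the seed host, namespace, set, and bin names are assumptions for illustration. The client pulls the partition map on connect, so each single-record put/get below is routed to the owning node in one hop.

```python
import aerospike

# Connect to any seed node; the client discovers the rest of the cluster and
# pulls the partition map, so single-record operations go straight to the
# node that owns the key's partition.
config = {'hosts': [('127.0.0.1', 3000)]}  # assumed seed host/port
client = aerospike.client(config).connect()

key = ('test', 'users', 'user-42')  # (namespace, set, user key) - assumed names
client.put(key, {'name': 'Ada', 'visits': 1})

# One network hop for the read as well; scans, queries and batches are instead
# fanned out to all nodes and processed in parallel.
_, meta, record = client.get(key)
print(record)

client.close()
```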
  5. Cluster Formation (Diagram: nodes A–D with clients; partition table listing Master, Replica 1, Replica 2, Replica 3 per partition: 1 → A B C D, 2 → B D A C, 3 → D A C B, … 4096 → C A B D.) Partition table created when the cluster (re-)forms ✓ Deterministic ✓ Optimized for even data distribution Client Pulls Partition Map ✓ Detects when the cluster has changed and refreshes the map ✓ Consistent hashing (RIPEMD-160 digest) used to map a key to a partition id ✓ Allows a single network hop to the owning node Node Addition / Removal ✓ Cluster detects a new / removed node through heartbeats ✓ Re-forms the table by promoting / removing nodes ✓ E.g., if node B is removed, C becomes the replica on partition 1; on partition 2, D becomes master and A becomes replica ✓ Minimizes migration of data ✓ Distributes the load of the lost node across all other cluster nodes
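The key-to-partition mapping can be sketched conceptually: the client hashes the record key with RIPEMD-160 and uses 12 bits of the digest to pick one of the 4096 partitions, then looks that partition up in the map it pulled from the cluster. The snippet below is a simplified illustration only; the real client's digest composition (set name plus key type plus key bytes) and byte handling differ in detail.

```python
import hashlib

N_PARTITIONS = 4096  # fixed partition count in an Aerospike cluster

def partition_id(set_name: str, user_key: str) -> int:
    """Conceptual sketch: reduce a RIPEMD-160 digest of the key to a partition id.

    Requires a hashlib/OpenSSL build with RIPEMD-160 enabled; the production
    client computes the digest slightly differently.
    """
    h = hashlib.new('ripemd160')
    h.update(set_name.encode() + user_key.encode())
    digest = h.digest()
    # 12 bits of the digest select one of the 4096 partitions.
    return int.from_bytes(digest[:2], 'little') % N_PARTITIONS

print(partition_id('users', 'user-42'))
```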
  6. The Partition Table • The cluster forms using a Paxos algorithm and a Partition Table is generated. • Each row in the Partition Table is the Succession List for that partition.
  7. Partition Map • Every second, the client library (ACL) tend thread queries each node for its Partition Version. • A cluster change triggers Paxos re-clustering and bumps the Partition Version. • When the client detects a change in the Partition Version, it rebuilds the Partition Map by querying each node for its Master and Replica(s) ownership.
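The tend loop above can be pictured roughly as follows. This is a conceptual sketch, not the actual client code: `get_partition_version` and `rebuild_partition_map` are hypothetical helpers standing in for the real node requests the client issues.

```python
import time

def tend(cluster_nodes, get_partition_version, rebuild_partition_map):
    """Conceptual sketch of the client tend loop: poll each node once a second
    and rebuild the partition map whenever any node reports a new partition
    version (i.e. the cluster has re-formed)."""
    last_seen = {}
    while True:
        changed = False
        for node in cluster_nodes:
            version = get_partition_version(node)  # assumed helper: ask the node for its partition version
            if last_seen.get(node) != version:
                last_seen[node] = version
                changed = True
        if changed:
            rebuild_partition_map(cluster_nodes)   # assumed helper: query each node for master/replica ownership
        time.sleep(1)
```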
  8. Scaling Out - Adding a Node • A new node F is added to the cluster. • F may land anywhere in a partition’s succession list. Example: • Partition 0: Node F joins as a Replica; B remains Master and fills data into F (Fill Migration). • Partition 1: Node F joins as Master, but E continues to act as Master until it finishes filling data into F. When this fill migration completes, F becomes the new Master (Master Handoff).
  9. Scaling Down: Removing/Losing a Node • When a node is lost (e.g., node C), the succession list shifts left. Example: • Partition 1: C was the Replica, so A becomes the new Replica. Partition data migrates from Master E to A; two copies of the data are restored once migration completes. • Partition 4094: C was the Master, so Replica B is promoted to the new Master (typically B will already have the full data) and A becomes the new Replica. Partition data is migrated from B to A.
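The succession-list behaviour on node loss can be illustrated with a toy partition table; the node layout below mirrors the example above and is otherwise made up. When a node disappears, the remaining nodes in each affected succession list shift one position to the left, which is what promotes a replica to master and appoints a new replica.

```python
# Toy partition table: partition id -> succession list (master first, then replicas).
partition_table = {
    1:    ['E', 'C', 'A', 'B', 'D'],
    4094: ['C', 'B', 'A', 'D', 'E'],
}

def remove_node(table, lost_node):
    """Drop a lost node from every succession list; everyone behind it shifts
    left (simplified - the real server also migrates partition data to restore
    the configured replication factor)."""
    return {pid: [n for n in nodes if n != lost_node] for pid, nodes in table.items()}

after = remove_node(partition_table, 'C')
print(after[1][:2])     # ['E', 'A'] -> E stays master, A becomes the new replica
print(after[4094][:2])  # ['B', 'A'] -> B is promoted to master, A becomes the new replica
```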
  10. Cluster Capacity When Scaling • With a 5-node cluster: 4096/5 ≈ 819 master partitions per node. • Adding a node gives a 6-node cluster: 4096/6 ≈ 683 master partitions per node. • For a given node capacity (RAM, disk), as the cluster grows each node is responsible for fewer partitions, less data, and less activity. • When a node is taken out (e.g., for a rolling upgrade), the remaining nodes should still be able to store 2 copies of the data after the cluster re-balances automatically. ➢ Adjust cluster size with automatic data re-distribution and rebalancing.
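The per-node partition counts above are straightforward division over the fixed 4096 partitions; a quick sketch of the arithmetic:

```python
N_PARTITIONS = 4096

for nodes in (5, 6, 7, 8):
    print(f"{nodes} nodes -> ~{N_PARTITIONS / nodes:.0f} master partitions per node")

# 5 nodes -> ~819, 6 nodes -> ~683: each added node takes an equal share of the
# partitions, data and traffic, which is why capacity scales linearly.
```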
  11. Scale Up Demo Time!
  12. ALL FLASH Configuration Aerospike Server version 4.3.0.2+ introduced the ALL FLASH storage option. • Allows users to store the PRIMARY INDEX (PI) on device (NVMe SSD) instead of in memory. Edge Systems • For large numbers of very small records with relaxed latency needs. • The RAM vs. SSD storage-space ratio approaches 1:1, causing server sprawl. • Significant cost savings by using ALL FLASH storage. • No need to modify the data model with a reverse-lookup implementation to improve the RAM:SSD ratio. System of Record • Cost savings with very small objects and very large data stores (>100 TB).
  13. About ALL FLASH Configuration
  14. All Flash Cost Savings Examples Scenario: 10 billion objects, 64 bytes per object • Using Hybrid Memory, resources needed cluster-wide: • Memory: 10B * (Replication Factor = 2) * (PI = 64 bytes) = ~1.2 TB • Disk: 10B * (RF) * 64 bytes = ~1.2 TB • We need as much memory as we do disk: not a lot of data, but it gets expensive! • Example hardware needed: 6 nodes of r5d.8xl (1.4 TB of RAM), at ~76K USD a year. • Using All Flash: • Memory needed: 13 GB • Disk provisioned for the index: 4 TB • Actual index utilization: 10B * (Replication Factor = 2) * 64 bytes = ~1.2 TB • Disk utilization: 10B * (RF) * 64 = ~1.2 TB • DRAM needs reduced; hardware needed: 3 nodes of i3en.3xl, at ~24K USD a year. • Costs saved!
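The sizing above is easy to reproduce; the sketch below reruns the primary-index and data arithmetic for the 10-billion-object scenario (64 bytes per primary-index entry, 64-byte objects, replication factor 2). The instance choices and prices are taken from the slide, not computed here.

```python
objects = 10_000_000_000   # 10 billion records
rf = 2                     # replication factor
pi_entry_bytes = 64        # primary index entry size
record_bytes = 64          # object size in this scenario

pi_bytes = objects * rf * pi_entry_bytes
data_bytes = objects * rf * record_bytes

TIB = 1024 ** 4
print(f"Primary index: ~{pi_bytes / TIB:.2f} TiB")   # ~1.16 TiB, i.e. the slide's ~1.2 TB
print(f"Record data:   ~{data_bytes / TIB:.2f} TiB")

# Hybrid Memory keeps the whole primary index in DRAM; All Flash moves it to
# NVMe (provisioned larger than its actual utilization), leaving only a small
# per-node DRAM footprint.
```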
  15. New: Aerospike Cloud • Aerospike Cloud: empowering customers to build, manage and automate their own Aerospike database-as-a-service (DBaaS). • Accelerated cloud deployments: use standard tools across multiple cloud environments to accelerate the development, management and automation of your own Aerospike DBaaS. • Standards-based approach: Aerospike Cloud Foundations is based on Cloud Native Computing Foundation (CNCF) standards.
  16. What Is CNCF? • The CNCF curates a set of technologies that make loosely coupled, cloud-based deployments resilient, manageable, and observable. • A basis for automating the management of cloud deployments. • A standard set of tools for alerting and monitoring systems. • Managed under the Linux Foundation. • Provides a governance model fit for enterprises and vendors of enterprise software. • For Aerospike, CNCF provides a complete model: Kubernetes, evolving support for Helm Charts, and Prometheus.
  17. What We Are Delivering • Kubernetes Operator: custom Aerospike-specific extensions to the Kubernetes API that encapsulate operations domain knowledge such as scale-up, scale-down, cluster configuration management, and upgrades. • Helm Charts: the ability to deploy Aerospike clusters in a Kubernetes environment using the Helm package manager, a CNCF incubating project. • Prometheus: integration with the CNCF-graduated monitoring and alerting solution by way of a custom exporter for Aerospike Enterprise Edition and Alertmanager configs. • Grafana: integration with CNCF member Grafana Labs' open-source visualization platform through custom dashboards for the Aerospike EE Prometheus exporter.
  18. Aerospike Cloud Availability • Announced in March 2020. • Google Cloud first: Aerospike Cloud supports the Google Kubernetes Engine (GKE) on Google Cloud Platform (GCP). Full integration with other cloud platforms will follow soon. • Individual parts are available for other cloud/on-prem platforms as well.
  19. Time for Q&A!
  20. Thank You! zelkayam@aerospike.com