Predictable Big Data Performance in Real-time


How do you ensure predictable, high performance and why does this matter? Find out in this presentation.

Published in: Technology, Business

  1. Predictable Big Data Performance in Real-time
     Srini V. Srinivasan, 17th Big Data London Meetup, April 22, 2013
  2. Database Landscape
     Structured data
     ➤ Transactions (OLTP): response time in seconds; gigabytes of data; balanced reads/writes
     ➤ Analytics (OLAP): response time in seconds; terabytes of data; read intensive
     Unstructured data
     ➤ Big data analytics: response time of hours to weeks; TB to PB; read intensive
     ➤ Real-time big data (real-time transactions): response time < 10 ms; 1-20 TB; balanced reads/writes; 24x7x365 availability
     © 2013 Aerospike. All rights reserved. Confidential
  3. Requirements for Internet Enterprises
     1. Know who the interaction is with
        • Monitor 200+ million US consumers, 5+ billion mobile devices and sensors
     2. Determine intent based on current context
        • Page views, search terms, game state, last purchase, friends list, ads served, location
     3. Respond now, use big data for more accurate decisions
        • Display the most relevant ad
        • Recommend the best product
        • Deliver the richest gaming experience
        • Eliminate fraud…
     4. Service can NEVER go down!
  4. Challenges
     1. Handle extremely high rates of persistent read/write transactions
     2. Avoid hot spots to maintain tight latency SLAs
     3. Provide immediate consistency with replication
     4. Allow long running tasks with transactions
     5. Scale linearly as data sizes increase
     6. Add capacity with no service interruption
  5. Native Flash: Performance
     ➤ Low latency at high throughput
  6. “Only Aerospike was able to function in synchronous mode with a replication factor of two.. it is a significant advantage that Aerospike is able to function reliably on a smaller amount of hardware while still maintaining true consistency.”
  7. Shared-Nothing Architecture
     ➤ Every node in a cluster is identical and handles both transactions and long running tasks
     ➤ Data is replicated synchronously with immediate consistency within the cluster
     ➤ Data is replicated asynchronously across data centers
     (Diagram label: Ohio Data Center)
  8. Distributed Hash Table
     How Data Is Distributed (Replication Factor 2)
     ➤ Every key is hashed into a 20 byte (fixed length) string using the RIPEMD160 hash function
     ➤ This hash + additional data (fixed 64 bytes) are stored in RAM in the index
     ➤ 4 bytes of this hash are used to compute the partition ID
     ➤ There are 4096 partitions
     ➤ Partition ID maps to node ID based on cluster membership
     Example from the diagram: key cookie-abcdefg-12345678 hashes to 182023kh15hh3kahdjsh

     Partition ID   Master node   Replica node
     …              1             4
     1820           2             3
     1821           3             2
     4096           4             1
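
The key-to-partition step on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Aerospike's implementation: the function names are hypothetical, the choice of which 4 digest bytes to use and their byte order is assumed, and partition IDs here run from 0 to 4095. Some OpenSSL builds disable RIPEMD-160, so the sketch falls back to SHA-1 (also a 20-byte digest) purely as a stand-in.

```python
import hashlib

N_PARTITIONS = 4096  # partition count stated on the slide


def key_digest(key):
    """Return a fixed-length 20-byte digest of the key.

    The slide names RIPEMD-160. If the local OpenSSL build does not
    provide it, SHA-1 is used as a stand-in with the same digest size.
    """
    data = key.encode("utf-8")
    try:
        return hashlib.new("ripemd160", data).digest()
    except ValueError:
        return hashlib.sha1(data).digest()


def partition_id(digest):
    """Derive a partition ID from 4 bytes of the digest.

    Assumption: the first 4 bytes, little-endian, reduced modulo 4096.
    """
    return int.from_bytes(digest[:4], "little") % N_PARTITIONS


digest = key_digest("cookie-abcdefg-12345678")
print(len(digest), partition_id(digest))
```

Because the digest is spread uniformly, keys land evenly across partitions, which is what lets the cluster avoid hot spots without any range-based sharding.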
  9. Organizing the Cluster
     ➤ Automatic multicast gossip protocol for node discovery
     ➤ Paxos consensus algorithm determines nodes in cluster
     ➤ Ordered list of nodes determines data location
     ➤ Data partitions balanced for minimal data motion
     ➤ Vote initiated and terminated in 100 milliseconds
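
To make "ordered list of nodes determines data location" and "minimal data motion" concrete, here is a small Python sketch. The slide does not say how the ordering is computed, so this example swaps in rendezvous (highest-random-weight) hashing as an assumed technique; the function names and the use of SHA-256 for scoring are also assumptions.

```python
import hashlib

N_PARTITIONS = 4096
REPLICATION_FACTOR = 2


def node_order(partition, nodes):
    """Rank the agreed node list for one partition by a per-pair hash score."""
    def score(node):
        return hashlib.sha256(f"{partition}:{node}".encode()).digest()
    return sorted(nodes, key=score)


def partition_map(nodes):
    """Map every partition to its (master, replica) pair."""
    return {
        p: tuple(node_order(p, nodes)[:REPLICATION_FACTOR])
        for p in range(N_PARTITIONS)
    }


before = partition_map(["A", "B", "C"])
after = partition_map(["A", "B", "C", "D"])
moved = [p for p in before if before[p][0] != after[p][0]]
print(f"{len(moved)} of {N_PARTITIONS} master partitions moved")
```

Adding node D leaves the relative order of A, B and C unchanged for every partition, so a master only changes when D itself ranks first. Every node can compute the same map locally once the cluster membership list is agreed.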
  10. How it Works
      Writing with Immediate Consistency (diagram: master, replica)
      1. Write sent to row master
      2. Latch against simultaneous writes
      3. Apply write to master memory and replica memory synchronously
      4. Queue operations to disk
      5. Signal completed transaction (optional storage commit wait)
      6. Master applies conflict resolution policy (rollback / rollforward)
      Adding a Node (transactions continue)
      1. Cluster discovers new node via gossip protocol
      2. Paxos vote determines new data organization
      3. Partition migrations scheduled
      4. When a partition migration starts, write journal starts on destination
      5. Partition moves atomically
      6. Journal is applied and source data deleted
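
Steps 1 to 5 of the write path above can be sketched as follows. This is a simplified, single-process Python model with hypothetical class names: the per-key latch is a `threading.Lock`, the replica update is a direct method call standing in for a network round trip, and the disk queue is never drained. Step 6 (conflict resolution) is left out.

```python
import threading
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.memory = {}           # in-memory copy of the records
        self.disk_queue = deque()  # writes waiting to be flushed to storage

    def apply(self, key, value):
        self.memory[key] = value               # apply write in memory
        self.disk_queue.append((key, value))   # step 4: queue operation to disk


class Partition:
    def __init__(self, master, replica):
        self.master = master
        self.replica = replica
        self._latches = {}
        self._guard = threading.Lock()

    def _latch_for(self, key):
        # One latch per row, created on first use.
        with self._guard:
            return self._latches.setdefault(key, threading.Lock())

    def write(self, key, value):
        # Step 1: the caller has already routed the write to the row master.
        with self._latch_for(key):             # step 2: block simultaneous writes
            self.master.apply(key, value)      # step 3: master memory
            self.replica.apply(key, value)     # step 3: replica memory, synchronously
        return "ok"                            # step 5: signal completion


part = Partition(Node("master"), Node("replica"))
print(part.write("user:42", {"last_ad": "shoes"}))
```

The acknowledgement is returned only after both in-memory copies are updated, which is what gives a reader on either node the same value immediately after the write completes.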
  11. Intelligent Client
      Shields Applications from the Complexity of the Cluster
      ➤ Implements Aerospike API
      ➤ Optimistic row locking
      ➤ Optimized binary protocol
      ➤ Cluster tracking
         • Learns about cluster changes, partition map
         • Gossip protocol
      ➤ Transaction semantics
         • Global transaction ID
         • Retransmit and timeout
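
The "cluster tracking" and "retransmit and timeout" bullets can be illustrated with a small Python sketch. Everything here is hypothetical and is not the Aerospike client API: `send` and `refresh_map` are caller-supplied stand-ins for the wire protocol and for partition map discovery, and the transaction ID is a simple local counter rather than a globally unique ID.

```python
import itertools


class ClusterClient:
    def __init__(self, partition_map, send, refresh_map, max_retries=2):
        self.partition_map = partition_map  # partition ID -> node name
        self.send = send                    # send(node, txn_id, payload) -> reply
        self.refresh_map = refresh_map      # returns a fresh partition map
        self.max_retries = max_retries
        self._txn_ids = itertools.count(1)

    def request(self, partition, payload):
        # The same transaction ID is reused on every retransmit so the
        # server side could recognise a duplicate.
        txn_id = next(self._txn_ids)
        for _ in range(self.max_retries + 1):
            node = self.partition_map[partition]
            try:
                return self.send(node, txn_id, payload)
            except TimeoutError:
                # The cluster may have changed: relearn the map, then retry.
                self.partition_map = self.refresh_map()
        raise TimeoutError(f"transaction {txn_id} failed after retries")


def demo_send(node, txn_id, payload):
    if node == "A":  # pretend node A has left the cluster
        raise TimeoutError("no reply")
    return f"{node} stored {payload}"


client = ClusterClient({7: "A"}, demo_send, lambda: {7: "B"})
print(client.request(7, "x"))
```

The application only calls `request`; routing, map refresh and retransmission stay inside the client, which is the shielding the slide describes.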
  12. Cross Data Center Replication (XDR)
      ➤ Asynchronous replication for long link delays and outages
      ➤ Namespace is configured to replicate to a destination cluster: master / slave, including star and ring
      ➤ Replication process
         • Transaction journal on partition master and replica
         • XDR process writes batches to destination
         • Transmission state shared with source replica
         • Retransmission in case of network fault
         • When data arrives back at originating cluster, transaction ID matching prevents subsequent application and forwarding
      ➤ In master / master replication, conflict resolution via multiple versions, or timestamp
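
The loop-prevention bullet (transaction ID matching in a ring) can be sketched in Python as below. This is a toy model with hypothetical names: a direct method call stands in for asynchronous batch shipping, and the journal, retransmission and conflict resolution are omitted.

```python
class Cluster:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.seen_txn_ids = set()  # transaction IDs already applied here
        self.applied = 0           # how many writes this cluster applied
        self.destination = None    # next cluster in the ring

    def local_write(self, txn_id, key, value):
        """A write that originates at this cluster."""
        return self.receive(txn_id, key, value)

    def receive(self, txn_id, key, value):
        if txn_id in self.seen_txn_ids:
            return False           # seen before: do not apply or forward
        self.seen_txn_ids.add(txn_id)
        self.data[key] = value
        self.applied += 1
        if self.destination is not None:
            self.destination.receive(txn_id, key, value)
        return True


# Three data centers replicating in a ring: a -> b -> c -> a
a, b, c = Cluster("a"), Cluster("b"), Cluster("c")
a.destination, b.destination, c.destination = b, c, a
a.local_write("a-1", "user:42", "v1")
print(a.applied, b.applied, c.applied)
```

When the write comes back around to cluster `a`, the ID is already in `seen_txn_ids`, so it is neither applied again nor forwarded, and the ring terminates.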
  13. Multi-core Optimization
      Russ’s 10 Ingredient Recipe for Making 1 Million TPS on $5K Hardware
      ➤ Right architecture
         • Shared nothing
         • In-memory (or multiple SSDs)
         • Tight code loop
         • Lock free isolation
      ➤ OS, programming language, libraries
         • Modern Linux kernel
         • C language
         • Use epoll
      ➤ Tweaks
         • Pin threads to processor cores
         • IRQ affinity settings for NIC
         • CPU socket isolation via pairing of CPU to NIC
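
Two of the ingredients above, pinning threads to cores and using epoll, map to Linux system calls that Python's standard library also exposes. The slide's recipe is for C; the sketch below is only an illustration of the same calls, is Linux-only, and uses hypothetical helper names.

```python
import os
import select
import socket


def pin_to_core(cpu):
    """Restrict the caller to a single CPU core. Returns False if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {cpu})  # pid 0 means the caller
    return True


def wait_readable(sock, timeout=1.0):
    """Wait with epoll until the socket has data to read."""
    ep = select.epoll()
    try:
        ep.register(sock.fileno(), select.EPOLLIN)
        events = ep.poll(timeout)
        return any(fd == sock.fileno() and (ev & select.EPOLLIN)
                   for fd, ev in events)
    finally:
        ep.close()


if hasattr(os, "sched_getaffinity") and hasattr(select, "epoll"):
    allowed = os.sched_getaffinity(0)
    pin_to_core(min(allowed))            # pin to one permitted core
    left, right = socket.socketpair()
    right.send(b"ping")
    print(wait_readable(left))
    left.close()
    right.close()
    os.sched_setaffinity(0, allowed)     # restore the original mask
```

Pinning keeps a worker's cache lines on one core, and epoll lets that worker watch many connections without a thread per socket. IRQ affinity and CPU-to-NIC pairing are set outside the process and are not shown.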
  14. Flash-optimized Storage Layer
      ➤ Direct device access
         • Direct attach performance
         • Data written in flash optimal large block patterns
         • All indexes in RAM for low wear
         • Constant background defragmentation
         • Log structured file system, “copy on write”
         • Clean restart through shared memory
      ➤ Random distribution using hash does not require RAID hardware
      SSD performance varies widely
         • Aerospike has a certified hardware list
         • Free SSD certification tool, CIO, is also available
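
The combination of large-block writes, a RAM index, copy on write and background defragmentation can be sketched with a toy log-structured store in Python. This is an assumption-laden model, not Aerospike's storage engine: blocks hold 4 records instead of a large byte range, "flash" is a dictionary, and defragmentation runs on demand rather than in the background.

```python
class LogStore:
    BLOCK_SLOTS = 4  # records per block; a real device would use large byte blocks

    def __init__(self):
        self.blocks = {}        # block_id -> tuple of (key, value), written once
        self.buffer = []        # current write buffer, held in RAM
        self.next_block_id = 0  # ID the buffer receives when it is flushed
        self.index = {}         # RAM index: key -> (block_id, slot)

    def put(self, key, value):
        # Copy on write: never update in place, append a new version.
        self.index[key] = (self.next_block_id, len(self.buffer))
        self.buffer.append((key, value))
        if len(self.buffer) == self.BLOCK_SLOTS:
            self.flush()

    def flush(self):
        # Write the whole buffer as one large block.
        if self.buffer:
            self.blocks[self.next_block_id] = tuple(self.buffer)
            self.buffer = []
            self.next_block_id += 1

    def get(self, key):
        block_id, slot = self.index[key]
        in_buffer = block_id == self.next_block_id
        block = self.buffer if in_buffer else self.blocks[block_id]
        return block[slot][1]

    def live_records(self, block_id):
        # A record is live only if the index still points at this exact slot.
        return [(k, v) for slot, (k, v) in enumerate(self.blocks[block_id])
                if self.index.get(k) == (block_id, slot)]

    def defragment(self, min_live=0.5):
        # Re-append the live records of mostly stale blocks, then free them.
        for block_id in list(self.blocks):
            live = self.live_records(block_id)
            if len(live) / self.BLOCK_SLOTS < min_live:
                for k, v in live:
                    self.put(k, v)
                del self.blocks[block_id]


store = LogStore()
for i in range(4):
    store.put(f"k{i}", i)          # fills and flushes block 0
for i in range(3):
    store.put(f"k{i}", i + 10)     # newer versions make block 0 mostly stale
store.defragment()
print(sorted(store.blocks), store.get("k3"), store.get("k0"))
```

Because the index lives in RAM, an overwrite costs one sequential append and no index write to flash, which is the "low wear" point on the slide.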
  15. Native Flash: 17x better TCO
      “…data-in-DRAM implementations like SAP HANA..should be bypassed…..current leading data-in-flash database for transactional analytic apps is Aerospike.” - David Floyer, CTO, Wikibon
  16. Case Studies
  17. Proven in Production
      ➤ AppNexus - #2 RTB after Google
         • 27 billion auctions per day
         • 600+ QPS
         • Aerospike servers in 6 clusters in 3 data centers
      ➤ Chango – #2 Search after Google
         • Sees more searches than Yahoo! + Bing
         • Data on 300 million users
      ➤ TradeDesk – first Ad Exchange
         • Facebook Exchange partner
         • FBX serves 25% of ads on the Internet
         • 1200% growth in 2012
      “Aerospike has operated without interruptions and easily scaled to meet our performance demands.” – Mike Nolet, CTO, AppNexus
  18. Proven in Production
      ➤ eXelate – Data on 500 million users
         • Online data plus Nielsen, Mastercard, Autobytel, Bizo data..
         • Data on 400 million users
         • 20 billion transactions per month
         • 4x2 TB data per cluster
         • 4 clusters across 4 data centers
         • “Scale. Real-time performance. Real-time replication at 4 data centers. Aerospike delivered.” - Elad Efraim, eXelate CTO
      ➤ BlueKai – Serves half the Fortune 30
         • #1 Data Exchange
         • 2 trillion transactions per month
  19. What makes Aerospike… Fast? Scale & Never Fail?
      ➤ Cluster-aware client layer
      ➤ Per node optimizations
         • Thread-core-pinning
         • Real-time prioritization
      ➤ Extremely efficient primary index scheme
         • Index in DRAM
         • 64 byte index entry size
         • Kernel quality C code; no degradation due to Java garbage collection
      ➤ Flash-optimized data layer
      ➤ Shared-nothing distribution layer
         • Intelligent data migration and re-balancing
         • Smart data expiration and eviction
         • Rolling upgrades and background backups
      ➤ Cross Datacenter Replication (XDR)
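
Since the index lives in DRAM at 64 bytes per entry, the memory needed for the index can be estimated with simple arithmetic. The Python sketch below assumes that every replica of a record carries its own index entry, which the slides do not state, and it ignores any per-node overhead; the function name is hypothetical.

```python
INDEX_ENTRY_BYTES = 64  # index entry size stated on the slide


def index_ram_bytes(records, replication_factor=2):
    """Cluster-wide DRAM for the primary index, under the assumptions above."""
    return records * INDEX_ENTRY_BYTES * replication_factor


total = index_ram_bytes(1_000_000_000)  # one billion records, two copies
print(total / 1e9, "GB across the cluster")
```

Under these assumptions, one billion records need 128 GB of index DRAM cluster-wide, spread evenly across the nodes, while the record data itself can stay on flash.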
  20. Mission
      ➤ Build the Modern Real-time Data Platform
         1. Scaling the Internet of Everything
         2. Pushing the limits of modern hardware
         3. No data loss and no downtime
      Aerospike Real-time Data Platform
         • Publish & Subscribe
         • ASQL & NoSQL
         • Powerful Aggregations (MapReduce++)
         • Secondary Index Queries
         • Transactions
         • User Defined Functions (UDF)
         • Security, Encryption, Compression
         • Distribution: Shared Nothing, ACID, Scale-out, Multiple datacenters
         • Data Types: Int, Str, Blob, List, Map, Large Stack, Large Set, Large List
         • Storage: DRAM, SSD, HDD
  21. How to Get Aerospike?
      Free Community Edition
      ➤ For developers looking for speed and stability, and to scale transparently as they grow
         • All features for 2 nodes, 100 GB
         • 1 cluster
         • 1 datacenter
         • Community support
      Enterprise Edition
      ➤ For mission critical apps needing to scale right from the start
         • Unlimited number of nodes, clusters, datacenters
         • Cross data center replication
         • Premium 24x7 support
         • Priced by TBs of unique data (not replicas)
  22. Questions