HyperDex
A Closer Look
–  DECK36 is a young spin-off from
ICANS
–  Small team of 7 engineers
–  Longstanding expertise in
designing, implementing and
operating complex web systems
–  Developing own data-intelligence-focused tools and web services
–  Offering our expert knowledge in: 
–  Automation & Operations
–  Architecture & Engineering
–  Analytics & Data Logistics
Dr. Stefan Schadwinkel
Co-Founder / Analytics Engineer
stefan.schadwinkel@deck36.de
BACKGROUND
*log: Storm-based Analytics RT
BACKGROUND
*log
Our *log provides stream-based real-time analytics. We need a serious DB.
It must focus on servicing each request, scale easily and fast, keep throughput
consistent, and provide secondary indices and a way to compute aggregations.

MongoDB, Cassandra, Riak, MariaDB

HyperDex: A Distributed, Searchable Key-Value Store.
Robert Escriva, Bernard Wong and Emin Gün Sirer.
In Proceedings of the SIGCOMM Conference, Helsinki, Finland, August 2012.
http://hyperdex.org/papers/hyperdex.pdf
WHY HYPERDEX?
Next Generation K/V
WHY HYPERDEX?
Features.
CAP - Common Buzz: Consistent, Available, Partition-tolerant – Pick any two.
From http://hyperdex.org/FAQ/: HyperDex is designed to withstand a threshold of
failures desired by the application. The level of fault-tolerance is tunable by the system
administrator. HyperDex guarantees consistency, availability in the presence of less
than f faults, and partition tolerance for partitions that affect less than f nodes, where f
is a user-tunable parameter.

-  Fully linearizable. Every ‘get’ always returns the latest ‘put’.
-  Tolerates up to f failures.
-  Query secondary attributes almost as fast as the primary key.
-  Rich data types: Strings, Floats, Ints, Lists, Maps, Sets
-  Atomic, multi-key transactions. (Commercial)
HOW?
Hyperspace Hashing
HYPERSPACE HASHING
Mapping Data into Euclidean Space
Each object is mapped into space. Space is mapped onto servers.
One hyperspace relates to one table. HyperDex can manage multiple independent
hyperspaces.
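The mapping can be sketched in a few lines of Python. This is a toy model, not HyperDex code: the two cells per axis, the SHA-1 based `axis_cell`, and the round-robin `server_for` are all assumptions made for illustration. It shows how an object lands in exactly one region and why a search that names only some attributes needs to visit only part of the space.

```python
import hashlib
from itertools import product

CELLS_PER_AXIS = 2  # assumption: each axis is cut in two, giving 2**d regions

def axis_cell(value):
    """Hash one attribute value onto a cell index along its axis."""
    digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
    return digest[0] % CELLS_PER_AXIS

def region_of(obj, attributes):
    """Coordinates of the region holding obj: one cell index per attribute."""
    return tuple(axis_cell(obj[a]) for a in attributes)

def regions_for_search(query, attributes):
    """Regions that may hold matches: a fixed cell on every axis the query
    names, every cell on the axes it leaves open."""
    axes = [[axis_cell(query[a])] if a in query else range(CELLS_PER_AXIS)
            for a in attributes]
    return list(product(*axes))

def server_for(region, servers):
    """Hypothetical placement: flatten the coordinates, assign round-robin."""
    index = 0
    for cell in region:
        index = index * CELLS_PER_AXIS + cell
    return servers[index % len(servers)]

attributes = ("first", "last", "phone")
obj = {"first": "John", "last": "Smith", "phone": 6075551024}
print(region_of(obj, attributes))
print(len(regions_for_search({"first": "John"}, attributes)))  # 4 of 8 regions
```

Fixing one of three attributes halves the regions to visit; fixing all three leaves exactly one.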
HYPERSPACE HASHING
So far, so good. Aww, wait!
The curse of dimensionality.
The volume of the resulting hyperspace grows
exponentially in the number of dimensions/
attributes. 

For instance, a table with 9 dimensions requires 2^9
regions. That’s a minimum of 512 servers.
HYPERSPACE HASHING
Logarithms to the rescue!
Subspaces.
HyperDex splits the hyperspace into multiple lower-dimensional subspaces. Thus, the
volume of the space only grows linearly. Not only does this reduce the number of
machines required to store the data, search also becomes more efficient, because fewer
machines need to be contacted. A key subspace is added to distinguish key lookups
from single-attribute searches. Each subspace stores a full copy of the object.
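The arithmetic behind the subspace trick, as a small worked example. The layout used here (a one-dimensional key subspace plus three subspaces of three attributes each, two cells per axis) is an assumption chosen to match a 9-attribute table, not a layout taken from HyperDex.

```python
def regions_full(num_attributes, cells_per_axis=2):
    """Regions in one hyperspace that has an axis per attribute."""
    return cells_per_axis ** num_attributes

def regions_split(subspace_sizes, cells_per_axis=2):
    """Total regions when the attributes are spread over several subspaces."""
    return sum(cells_per_axis ** size for size in subspace_sizes)

def regions_contacted(dimensions, attributes_given, cells_per_axis=2):
    """Regions a search must visit when it fixes some of the axes."""
    return cells_per_axis ** (dimensions - attributes_given)

print(regions_full(9))               # 2**9 = 512
print(regions_split([1, 3, 3, 3]))   # 2 + 8 + 8 + 8 = 26
print(regions_contacted(9, 1))       # 2**8 = 256 regions in the full space
print(regions_contacted(3, 1))       # 2**2 = 4 regions in a 3-attribute subspace
```

With a fixed subspace size, every additional group of attributes adds a constant number of regions, which is where the linear growth comes from.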
HOW?
Value-dependent Chaining
VALUE-DEPENDENT CHAINING
Consistency and Replication.
We have copies of each object in each subspace.
Value-dependent chaining keeps all copies consistent and provides strong consistency
(linearizability) and fault tolerance in the presence of concurrent updates.
VALUE-DEPENDENT CHAINING
Consistency.
HyperDex propagates each update deterministically to all relevant spaces. 
Update u1: PUT (insert key)
-  h1, h2, h3

Chains are executed from the
end. 

Head = Point leader. The same
for each key.

The point leader knows all
updates. Dependencies are
embedded in the chain.
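A minimal sketch of how such a chain could be derived from the values. It assumes that the chain starts at the point leader in the key subspace and that, within each further subspace, the node holding the old value comes before the node holding the new value. `toy_locate` is a hypothetical stand-in for hyperspace hashing.

```python
def toy_locate(attrs, obj):
    """Hypothetical placement: name a node after the subspace and a cheap hash."""
    cells = "".join(str(len(str(obj[a])) % 2) for a in attrs)
    return "/".join(attrs) + "@" + cells

def value_dependent_chain(old, new, subspaces, locate=toy_locate):
    """Ordered list of nodes an update has to pass through."""
    chain = [locate(("key",), new)]          # head: the point leader for this key
    for attrs in subspaces:
        new_node = locate(attrs, new)
        if old is not None:
            old_node = locate(attrs, old)
            if old_node != new_node:         # the object moves within this subspace
                chain.append(old_node)       # old location first ...
        chain.append(new_node)               # ... then the new location
    return chain

subspaces = [("first",), ("last",)]
v1 = {"key": "jsmith1", "first": "John", "last": "Smith"}
v2 = {"key": "jsmith1", "first": "Jon", "last": "Smith"}

print(value_dependent_chain(None, v1, subspaces))  # insert: three nodes
print(value_dependent_chain(v1, v2, subspaces))    # update: old and new node for "first"
```

Because the head depends only on the key, every update to the same object is ordered by the same node.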
VALUE-DEPENDENT CHAINING
Replication.
HyperDex inserts replicas for each region into the chain. 
Consider Update u1:
-  h1, h2, h3
-  h1, h1', h2, h2', h3, h3'
-  h1, h1', h2', h2'', h3, h3'
Replicas are always updated
first.

Failures do not compromise
strong consistency. 

Clients are only acknowledged
after full replication is achieved.
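The effect of inserting replicas can be sketched as follows. The sketch assumes f + 1 copies per region, which matches the f = 1 example on the slide, and marks replicas with primes. It does not model how HyperDex repairs a chain after a failure.

```python
def replicate(chain, f):
    """Give each region f + 1 copies: the node itself plus f primed replicas."""
    return [node + "'" * copy for node in chain for copy in range(f + 1)]

def every_region_survives(chain, failed):
    """True if each region still has at least one live copy after the failures."""
    region = lambda node: node.rstrip("'")
    alive = {region(node) for node in chain if node not in failed}
    return alive == {region(node) for node in chain}

chain = replicate(["h1", "h2", "h3"], f=1)
print(chain)                                        # ['h1', "h1'", 'h2', "h2'", 'h3', "h3'"]
print(every_region_survives(chain, {"h2"}))         # True: one failure is tolerated
print(every_region_survives(chain, {"h2", "h2'"}))  # False: two failures in one region
```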
THE PARTS OF THE MACHINE
HyperDex - Nuts and Bolts
THE PARTS OF THE MACHINE
The Slave Node.
Everything is C++.
The slave nodes are not particularly interesting.
THE PARTS OF THE MACHINE
The Coordinator & the Configuration.
A logically centralized coordinator maintains global state.
-  The coordinator runs on its own replicated state machine, called “Replicant”. It plays
the role that ZooKeeper plays for Hadoop et al. 
-  Global state is maintained as Configuration. 
-  The coordinator has no state of the stored objects, only mappings and servers. 
-  Instance: IP, Port, Instance ID.
-  The coordinator creates new configurations based on changes and failures and
distributes them to the clients.
THE PARTS OF THE MACHINE
The Client.
The client is part of the whole system, not just a customer.
-  Client receives new configurations from the coordinator.
-  Switching to a new configuration is atomic.
-  Client only contacts relevant nodes. This is significant for performance.
-  Clients must be “intelligent”. No REST.
-  A load-balancing proxy layer could help. But isn’t there.
-  Full support for C++, Python.
-  Partial support for Java (uses the C++ driver through JNI), Node.JS, Ruby
-  Using layers skips features. Java driver doesn’t support “count”.
THE REAL WORLD™
HyperDex Tutorial
THE REAL WORLD ™
Install.
Pre-built packages.
Supports CentOS, Debian, Fedora, Ubuntu. 
But not all versions. 
And not everything. Read: “No package for the Java driver”.

Build from source.
Good luck. 

Be super conscious of package versions.
More on that in a minute.
THE REAL WORLD ™
Start the Daemons.
Coordinator.
# hyperdex coordinator -f -l 127.0.0.1 -p 1982

Data Nodes.
# hyperdex daemon -f --listen=127.0.0.1 --listen-port=2022 \
    --coordinator=127.0.0.1 --coordinator-port=1982 \
    --data=./data0/

# hyperdex daemon -f --listen=127.0.0.1 --listen-port=2032 \
    --coordinator=127.0.0.1 --coordinator-port=1982 \
    --data=./data1/
THE REAL WORLD ™
Client Demo
The Python client is the HyperDex shell. Create Hyperspace.
# python
THE REAL WORLD ™
Client Demo
Create a client. Basic PUT/GET. Uses Key subspace.
# python
THE REAL WORLD ™
Client Demo
Search. Uses further subspaces.
THE REAL WORLD ™
Client Demo
Updates and Range Query/Search.
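The demo steps (PUT/GET by key, search on a secondary attribute, update, range search) can be walked through without a cluster. The sketch below runs against a small in-memory stand-in. The method names `put`, `get` and `search`, and the `(low, high)` tuple for ranges, are modelled on the 1.0-era Python tutorial and are assumptions here; check them against the installed version, since the client API changed between releases. The space name, keys and values are made up.

```python
class InMemoryClient:
    """Stand-in that mimics the assumed put/get/search shape of the Python client."""

    def __init__(self):
        self.spaces = {}

    def put(self, space, key, attrs):
        self.spaces.setdefault(space, {}).setdefault(key, {}).update(attrs)
        return True

    def get(self, space, key):
        return self.spaces.get(space, {}).get(key)

    def search(self, space, predicates):
        for key, obj in self.spaces.get(space, {}).items():
            if all(self._match(obj.get(a), p) for a, p in predicates.items()):
                yield dict(obj, username=key)

    @staticmethod
    def _match(value, predicate):
        if isinstance(predicate, tuple):     # (low, high) is an inclusive range
            low, high = predicate
            return value is not None and low <= value <= high
        return value == predicate

def demo(c):
    """Run the demo steps against any client object with put/get/search."""
    c.put("phonebook", "jsmith1", {"first": "John", "last": "Smith", "phone": 6075551024})
    c.put("phonebook", "jd", {"first": "John", "last": "Doe", "phone": 6075557878})
    by_key = c.get("phonebook", "jsmith1")                  # key subspace
    by_name = sorted(o["last"] for o in c.search("phonebook", {"first": "John"}))
    c.put("phonebook", "jd", {"phone": 6305551234})         # update one attribute
    in_range = [o["username"] for o in
                c.search("phonebook", {"phone": (6070000000, 6080000000)})]
    return by_key, by_name, in_range

print(demo(InMemoryClient()))
```

Against a running cluster, `demo` would presumably be handed a real client object connected to the coordinator at 127.0.0.1:1982 instead of the stand-in.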
THE REAL WORLD™
Performance
THE REAL WORLD ™
Bashing the Prophetess & the Giant.
Performance Benchmarks use the YCSB against Cassandra and MongoDB.
Dedicated cluster of 14 Nodes in the VICCI cloud.
Take it with a grain of salt. 
I’m missing Riak.
THE REAL WORLD ™
Throughput.
THE REAL WORLD ™
Latency.
THE REAL WORLD ™
Latency.
THE REAL WORLD ™
Scaling.
THE REAL WORLD™
Experiences & Findings
THE REAL WORLD ™
Experiences. Findings.
Minor versions are incompatible.
-  hyperdex-1.0.rc4 vs. hyperdex-1.0.rc5
-  import hyperclient vs. hyperdex.admin, hyperdex.client
-  (hyperdisk) vs. leveldb vs. hyperleveldb

-  There goes my dream of using the PHP driver on github.
-  Migration? No idea.
-  Compile? Use VM to go.
THE REAL WORLD ™
Experiences. Findings. 
It’s just a K/V store.
-  No methods to do distributed computations. Python map/reduce is on the agenda.
No Dynamo Ring. But a chain to rule them all.
-  Fault-tolerance with f dedicated nodes is fine, but what about multiple datacenters?
It’s a quite young project with few committers.
Important internals change between minor versions. 
Not much sleep for them. How about your DevOps?
REMEMBER?
*log: Storm-based Analytics RT
WHAT ABOUT?
*log
We chose Riak.
-  Excellent Java driver. 
-  We don’t need transactions.
-  During development, our schema will change often. 
-  Operational ease, easy to scale, excellent feedback.
-  Map/reduce in Erlang and JS. Can use the result of a secondary index query.
-  Solr Integration with Riak Search. Not at the moment, but we deal with content.
We like HyperDex.
-  Really interesting concepts and advancements, but atm not the perfect fit.
-  Implemented a storage backend abstraction layer. Easy to switch to HyperDex once
it’s more mature.
Thank You.

More Related Content

What's hot

Spark crash course workshop at Hadoop Summit
Spark crash course workshop at Hadoop SummitSpark crash course workshop at Hadoop Summit
Spark crash course workshop at Hadoop SummitDataWorks Summit
 
The Convergence of HPC and Deep Learning
The Convergence of HPC and Deep LearningThe Convergence of HPC and Deep Learning
The Convergence of HPC and Deep Learninginside-BigData.com
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed_Hat_Storage
 
Pedal to the Metal: Accelerating Spark with Silicon Innovation
Pedal to the Metal: Accelerating Spark with Silicon InnovationPedal to the Metal: Accelerating Spark with Silicon Innovation
Pedal to the Metal: Accelerating Spark with Silicon InnovationJen Aman
 
New Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesNew Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesKamesh Pemmaraju
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersDataWorks Summit/Hadoop Summit
 
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNEGenerating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNEDataWorks Summit/Hadoop Summit
 
Bringing complex event processing to Spark streaming
Bringing complex event processing to Spark streamingBringing complex event processing to Spark streaming
Bringing complex event processing to Spark streamingDataWorks Summit
 
Architecting Ceph Solutions
Architecting Ceph SolutionsArchitecting Ceph Solutions
Architecting Ceph SolutionsRed_Hat_Storage
 
Reactive streams
Reactive streamsReactive streams
Reactive streamscodepitbull
 
Flying Circus Ceph Case Study (CEPH Usergroup Berlin)
Flying Circus Ceph Case Study (CEPH Usergroup Berlin)Flying Circus Ceph Case Study (CEPH Usergroup Berlin)
Flying Circus Ceph Case Study (CEPH Usergroup Berlin)Christian Theune
 
20140202 fosdem-nosql-devroom-hadoop-yarn
20140202 fosdem-nosql-devroom-hadoop-yarn20140202 fosdem-nosql-devroom-hadoop-yarn
20140202 fosdem-nosql-devroom-hadoop-yarnDatalayer
 
Apache Spark Operations
Apache Spark OperationsApache Spark Operations
Apache Spark OperationsCloudera, Inc.
 
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...buildacloud
 
Serverless Data Platform
Serverless Data PlatformServerless Data Platform
Serverless Data PlatformShu-Jeng Hsieh
 
Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...
Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...
Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...Spark Summit
 
OpenStack Paris 2014 - Federation, are we there yet ?
OpenStack Paris 2014 - Federation, are we there yet ?OpenStack Paris 2014 - Federation, are we there yet ?
OpenStack Paris 2014 - Federation, are we there yet ?Tim Bell
 
Which Hypervisor is Best?
Which Hypervisor is Best?Which Hypervisor is Best?
Which Hypervisor is Best?Kyle Bader
 
FiloDB - Breakthrough OLAP Performance with Cassandra and Spark
FiloDB - Breakthrough OLAP Performance with Cassandra and SparkFiloDB - Breakthrough OLAP Performance with Cassandra and Spark
FiloDB - Breakthrough OLAP Performance with Cassandra and SparkEvan Chan
 

What's hot (20)

Spark crash course workshop at Hadoop Summit
Spark crash course workshop at Hadoop SummitSpark crash course workshop at Hadoop Summit
Spark crash course workshop at Hadoop Summit
 
The Convergence of HPC and Deep Learning
The Convergence of HPC and Deep LearningThe Convergence of HPC and Deep Learning
The Convergence of HPC and Deep Learning
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
 
Pedal to the Metal: Accelerating Spark with Silicon Innovation
Pedal to the Metal: Accelerating Spark with Silicon InnovationPedal to the Metal: Accelerating Spark with Silicon Innovation
Pedal to the Metal: Accelerating Spark with Silicon Innovation
 
New Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesNew Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference Architectures
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
 
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNEGenerating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
 
Bringing complex event processing to Spark streaming
Bringing complex event processing to Spark streamingBringing complex event processing to Spark streaming
Bringing complex event processing to Spark streaming
 
Architecting Ceph Solutions
Architecting Ceph SolutionsArchitecting Ceph Solutions
Architecting Ceph Solutions
 
Reactive streams
Reactive streamsReactive streams
Reactive streams
 
Flying Circus Ceph Case Study (CEPH Usergroup Berlin)
Flying Circus Ceph Case Study (CEPH Usergroup Berlin)Flying Circus Ceph Case Study (CEPH Usergroup Berlin)
Flying Circus Ceph Case Study (CEPH Usergroup Berlin)
 
20140202 fosdem-nosql-devroom-hadoop-yarn
20140202 fosdem-nosql-devroom-hadoop-yarn20140202 fosdem-nosql-devroom-hadoop-yarn
20140202 fosdem-nosql-devroom-hadoop-yarn
 
Apache Spark Operations
Apache Spark OperationsApache Spark Operations
Apache Spark Operations
 
Enterprise Grade Streaming under 2ms on Hadoop
Enterprise Grade Streaming under 2ms on HadoopEnterprise Grade Streaming under 2ms on Hadoop
Enterprise Grade Streaming under 2ms on Hadoop
 
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
 
Serverless Data Platform
Serverless Data PlatformServerless Data Platform
Serverless Data Platform
 
Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...
Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...
Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...
 
OpenStack Paris 2014 - Federation, are we there yet ?
OpenStack Paris 2014 - Federation, are we there yet ?OpenStack Paris 2014 - Federation, are we there yet ?
OpenStack Paris 2014 - Federation, are we there yet ?
 
Which Hypervisor is Best?
Which Hypervisor is Best?Which Hypervisor is Best?
Which Hypervisor is Best?
 
FiloDB - Breakthrough OLAP Performance with Cassandra and Spark
FiloDB - Breakthrough OLAP Performance with Cassandra and SparkFiloDB - Breakthrough OLAP Performance with Cassandra and Spark
FiloDB - Breakthrough OLAP Performance with Cassandra and Spark
 

Viewers also liked

Riak Search - Erlang Factory London 2010
Riak Search - Erlang Factory London 2010Riak Search - Erlang Factory London 2010
Riak Search - Erlang Factory London 2010Rusty Klophaus
 
LXC, Docker, and the future of software delivery | LinuxCon 2013
LXC, Docker, and the future of software delivery | LinuxCon 2013LXC, Docker, and the future of software delivery | LinuxCon 2013
LXC, Docker, and the future of software delivery | LinuxCon 2013dotCloud
 
Chloe and the Realtime Web
Chloe and the Realtime WebChloe and the Realtime Web
Chloe and the Realtime WebTrotter Cashion
 
Blazes: coordination analysis for distributed programs
Blazes: coordination analysis for distributed programsBlazes: coordination analysis for distributed programs
Blazes: coordination analysis for distributed programspalvaro
 
ElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseRobert Lujo
 
(Functional) reactive programming (@pavlobaron)
(Functional) reactive programming (@pavlobaron)(Functional) reactive programming (@pavlobaron)
(Functional) reactive programming (@pavlobaron)Pavlo Baron
 
Complex Legacy System Archiving/Data Retention with MongoDB and Xquery
Complex Legacy System Archiving/Data Retention with MongoDB and XqueryComplex Legacy System Archiving/Data Retention with MongoDB and Xquery
Complex Legacy System Archiving/Data Retention with MongoDB and XqueryDATAVERSITY
 
Spring Cleaning for Your Smartphone
Spring Cleaning for Your SmartphoneSpring Cleaning for Your Smartphone
Spring Cleaning for Your SmartphoneLookout
 
Web-Oriented Architecture (WOA)
Web-Oriented Architecture (WOA)Web-Oriented Architecture (WOA)
Web-Oriented Architecture (WOA)thetechnicalweb
 
Scalable XQuery Processing with Zorba on top of MongoDB
Scalable XQuery Processing with Zorba on top of MongoDBScalable XQuery Processing with Zorba on top of MongoDB
Scalable XQuery Processing with Zorba on top of MongoDBWilliam Candillon
 
Interoperability With RabbitMq
Interoperability With RabbitMqInteroperability With RabbitMq
Interoperability With RabbitMqAlvaro Videla
 
In Pursuit of the Holy Grail: Building Isomorphic JavaScript Apps
In Pursuit of the Holy Grail: Building Isomorphic JavaScript AppsIn Pursuit of the Holy Grail: Building Isomorphic JavaScript Apps
In Pursuit of the Holy Grail: Building Isomorphic JavaScript AppsSpike Brehm
 
Erlang plus BDB: Disrupting the Conventional Web Wisdom
Erlang plus BDB: Disrupting the Conventional Web WisdomErlang plus BDB: Disrupting the Conventional Web Wisdom
Erlang plus BDB: Disrupting the Conventional Web Wisdomguest3933de
 
Shrinking the Haystack" using Solr and OpenNLP
Shrinking the Haystack" using Solr and OpenNLPShrinking the Haystack" using Solr and OpenNLP
Shrinking the Haystack" using Solr and OpenNLPlucenerevolution
 
Scaling Gilt: from Monolithic Ruby Application to Distributed Scala Micro-Ser...
Scaling Gilt: from Monolithic Ruby Application to Distributed Scala Micro-Ser...Scaling Gilt: from Monolithic Ruby Application to Distributed Scala Micro-Ser...
Scaling Gilt: from Monolithic Ruby Application to Distributed Scala Micro-Ser...C4Media
 
AST - the only true tool for building JavaScript
AST - the only true tool for building JavaScriptAST - the only true tool for building JavaScript
AST - the only true tool for building JavaScriptIngvar Stepanyan
 
Erlang as a cloud citizen, a fractal approach to throughput
Erlang as a cloud citizen, a fractal approach to throughputErlang as a cloud citizen, a fractal approach to throughput
Erlang as a cloud citizen, a fractal approach to throughputPaolo Negri
 
Pregel: A System for Large-Scale Graph Processing
Pregel: A System for Large-Scale Graph ProcessingPregel: A System for Large-Scale Graph Processing
Pregel: A System for Large-Scale Graph ProcessingChris Bunch
 

Viewers also liked (20)

Riak Search - Erlang Factory London 2010
Riak Search - Erlang Factory London 2010Riak Search - Erlang Factory London 2010
Riak Search - Erlang Factory London 2010
 
LXC, Docker, and the future of software delivery | LinuxCon 2013
LXC, Docker, and the future of software delivery | LinuxCon 2013LXC, Docker, and the future of software delivery | LinuxCon 2013
LXC, Docker, and the future of software delivery | LinuxCon 2013
 
Chloe and the Realtime Web
Chloe and the Realtime WebChloe and the Realtime Web
Chloe and the Realtime Web
 
Blazes: coordination analysis for distributed programs
Blazes: coordination analysis for distributed programsBlazes: coordination analysis for distributed programs
Blazes: coordination analysis for distributed programs
 
Brunch With Coffee
Brunch With CoffeeBrunch With Coffee
Brunch With Coffee
 
ElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseElasticSearch - index server used as a document database
ElasticSearch - index server used as a document database
 
(Functional) reactive programming (@pavlobaron)
(Functional) reactive programming (@pavlobaron)(Functional) reactive programming (@pavlobaron)
(Functional) reactive programming (@pavlobaron)
 
Complex Legacy System Archiving/Data Retention with MongoDB and Xquery
Complex Legacy System Archiving/Data Retention with MongoDB and XqueryComplex Legacy System Archiving/Data Retention with MongoDB and Xquery
Complex Legacy System Archiving/Data Retention with MongoDB and Xquery
 
NkSIP: The Erlang SIP application server
NkSIP: The Erlang SIP application serverNkSIP: The Erlang SIP application server
NkSIP: The Erlang SIP application server
 
Spring Cleaning for Your Smartphone
Spring Cleaning for Your SmartphoneSpring Cleaning for Your Smartphone
Spring Cleaning for Your Smartphone
 
Web-Oriented Architecture (WOA)
Web-Oriented Architecture (WOA)Web-Oriented Architecture (WOA)
Web-Oriented Architecture (WOA)
 
Scalable XQuery Processing with Zorba on top of MongoDB
Scalable XQuery Processing with Zorba on top of MongoDBScalable XQuery Processing with Zorba on top of MongoDB
Scalable XQuery Processing with Zorba on top of MongoDB
 
Interoperability With RabbitMq
Interoperability With RabbitMqInteroperability With RabbitMq
Interoperability With RabbitMq
 
In Pursuit of the Holy Grail: Building Isomorphic JavaScript Apps
In Pursuit of the Holy Grail: Building Isomorphic JavaScript AppsIn Pursuit of the Holy Grail: Building Isomorphic JavaScript Apps
In Pursuit of the Holy Grail: Building Isomorphic JavaScript Apps
 
Erlang plus BDB: Disrupting the Conventional Web Wisdom
Erlang plus BDB: Disrupting the Conventional Web WisdomErlang plus BDB: Disrupting the Conventional Web Wisdom
Erlang plus BDB: Disrupting the Conventional Web Wisdom
 
Shrinking the Haystack" using Solr and OpenNLP
Shrinking the Haystack" using Solr and OpenNLPShrinking the Haystack" using Solr and OpenNLP
Shrinking the Haystack" using Solr and OpenNLP
 
Scaling Gilt: from Monolithic Ruby Application to Distributed Scala Micro-Ser...
Scaling Gilt: from Monolithic Ruby Application to Distributed Scala Micro-Ser...Scaling Gilt: from Monolithic Ruby Application to Distributed Scala Micro-Ser...
Scaling Gilt: from Monolithic Ruby Application to Distributed Scala Micro-Ser...
 
AST - the only true tool for building JavaScript
AST - the only true tool for building JavaScriptAST - the only true tool for building JavaScript
AST - the only true tool for building JavaScript
 
Erlang as a cloud citizen, a fractal approach to throughput
Erlang as a cloud citizen, a fractal approach to throughputErlang as a cloud citizen, a fractal approach to throughput
Erlang as a cloud citizen, a fractal approach to throughput
 
Pregel: A System for Large-Scale Graph Processing
Pregel: A System for Large-Scale Graph ProcessingPregel: A System for Large-Scale Graph Processing
Pregel: A System for Large-Scale Graph Processing
 

Similar to HyperDex: A Closer Look at a Distributed, Searchable Key-Value Store

Big data: current technology scope.
Big data: current technology scope.Big data: current technology scope.
Big data: current technology scope.Roman Nikitchenko
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonBenjamin Bengfort
 
High Performance Processing of Streaming Data
High Performance Processing of Streaming DataHigh Performance Processing of Streaming Data
High Performance Processing of Streaming DataGeoffrey Fox
 
Manuel Hurtado. Couchbase paradigma4oct
Manuel Hurtado. Couchbase paradigma4octManuel Hurtado. Couchbase paradigma4oct
Manuel Hurtado. Couchbase paradigma4octParadigma Digital
 
Gluecon Preso: Hybrid Container Infrastructure
Gluecon Preso: Hybrid Container InfrastructureGluecon Preso: Hybrid Container Infrastructure
Gluecon Preso: Hybrid Container Infrastructurerhirschfeld
 
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...confluent
 
AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09Chris Purrington
 
Delivering big content at NBC News with RavenDB
Delivering big content at NBC News with RavenDBDelivering big content at NBC News with RavenDB
Delivering big content at NBC News with RavenDBJohn Bennett
 
Demystifying the Distributed Database Landscape (DevOps) (1).pdf
Demystifying the Distributed Database Landscape (DevOps) (1).pdfDemystifying the Distributed Database Landscape (DevOps) (1).pdf
Demystifying the Distributed Database Landscape (DevOps) (1).pdfScyllaDB
 
How to Make Hadoop Easy, Dependable and Fast
How to Make Hadoop Easy, Dependable and FastHow to Make Hadoop Easy, Dependable and Fast
How to Make Hadoop Easy, Dependable and FastMapR Technologies
 
SnappyData Toronto Meetup Nov 2017
SnappyData Toronto Meetup Nov 2017SnappyData Toronto Meetup Nov 2017
SnappyData Toronto Meetup Nov 2017SnappyData
 
Tez Data Processing over Yarn
Tez Data Processing over YarnTez Data Processing over Yarn
Tez Data Processing over YarnInMobi Technology
 
Containerized Hadoop beyond Kubernetes
Containerized Hadoop beyond KubernetesContainerized Hadoop beyond Kubernetes
Containerized Hadoop beyond KubernetesDataWorks Summit
 
Pivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream AnalyticsPivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream Analyticskgshukla
 
An Introduction To Space Based Architecture
An Introduction To Space Based ArchitectureAn Introduction To Space Based Architecture
An Introduction To Space Based ArchitectureAmin Abbaspour
 
Pluggable Infrastructure with CI/CD and Docker
Pluggable Infrastructure with CI/CD and DockerPluggable Infrastructure with CI/CD and Docker
Pluggable Infrastructure with CI/CD and DockerBob Killen
 
prodops.io k8s presentation
prodops.io k8s presentationprodops.io k8s presentation
prodops.io k8s presentationProdops.io
 

Similar to HyperDex: A Closer Look at a Distributed, Searchable Key-Value Store (20)

Big data: current technology scope.
Big data: current technology scope.Big data: current technology scope.
Big data: current technology scope.
 
Customer Case : Citrix et Nutanix
Customer Case : Citrix et NutanixCustomer Case : Citrix et Nutanix
Customer Case : Citrix et Nutanix
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and Python
 
High Performance Processing of Streaming Data
High Performance Processing of Streaming DataHigh Performance Processing of Streaming Data
High Performance Processing of Streaming Data
 
Manuel Hurtado. Couchbase paradigma4oct
Manuel Hurtado. Couchbase paradigma4octManuel Hurtado. Couchbase paradigma4oct
Manuel Hurtado. Couchbase paradigma4oct
 
Gluecon Preso: Hybrid Container Infrastructure
Gluecon Preso: Hybrid Container InfrastructureGluecon Preso: Hybrid Container Infrastructure
Gluecon Preso: Hybrid Container Infrastructure
 
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
 
AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09
 
Delivering big content at NBC News with RavenDB
Delivering big content at NBC News with RavenDBDelivering big content at NBC News with RavenDB
Delivering big content at NBC News with RavenDB
 
Re thinkdb
Re thinkdbRe thinkdb
Re thinkdb
 
Demystifying the Distributed Database Landscape (DevOps) (1).pdf
Demystifying the Distributed Database Landscape (DevOps) (1).pdfDemystifying the Distributed Database Landscape (DevOps) (1).pdf
Demystifying the Distributed Database Landscape (DevOps) (1).pdf
 
How to Make Hadoop Easy, Dependable and Fast
How to Make Hadoop Easy, Dependable and FastHow to Make Hadoop Easy, Dependable and Fast
How to Make Hadoop Easy, Dependable and Fast
 
SnappyData Toronto Meetup Nov 2017
SnappyData Toronto Meetup Nov 2017SnappyData Toronto Meetup Nov 2017
SnappyData Toronto Meetup Nov 2017
 
WebWorkersCamp 2010
WebWorkersCamp 2010WebWorkersCamp 2010
WebWorkersCamp 2010
 
Tez Data Processing over Yarn
Tez Data Processing over YarnTez Data Processing over Yarn
Tez Data Processing over Yarn
 
Containerized Hadoop beyond Kubernetes
Containerized Hadoop beyond KubernetesContainerized Hadoop beyond Kubernetes
Containerized Hadoop beyond Kubernetes
 
Pivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream AnalyticsPivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream Analytics
 
An Introduction To Space Based Architecture
An Introduction To Space Based ArchitectureAn Introduction To Space Based Architecture
An Introduction To Space Based Architecture
 
Pluggable Infrastructure with CI/CD and Docker
Pluggable Infrastructure with CI/CD and DockerPluggable Infrastructure with CI/CD and Docker
Pluggable Infrastructure with CI/CD and Docker
 
prodops.io k8s presentation
prodops.io k8s presentationprodops.io k8s presentation
prodops.io k8s presentation
 

HyperDex: A Closer Look at a Distributed, Searchable Key-Value Store

  • 3. –  DECK36 is a young spin-off from ICANS
       –  Small team of 7 engineers
       –  Longstanding expertise in designing, implementing and operating complex web systems
       –  Developing our own data-intelligence-focused tools and web services
       –  Offering our expert knowledge in:
          –  Automation & Operations
          –  Architecture & Engineering
          –  Analytics & Data Logistics
       Dr. Stefan Schadwinkel, Co-Founder / Analytics Engineer, stefan.schadwinkel@deck36.de
  • 5. BACKGROUND *log
       Our *log provides stream-based real-time analytics. We need a serious DB.
       We need to focus on servicing each request, scale easily & fast, throughput must be consistent, we need secondary indices, and the possibility to compute aggregations.
       MongoDB, Cassandra, Riak, MariaDB
       HyperDex: A Distributed, Searchable Key-Value Store. Robert Escriva, Bernard Wong and Emin Gün Sirer. In Proceedings of the SIGCOMM Conference, Helsinki, Finland, August 2012. http://hyperdex.org/papers/hyperdex.pdf
  • 7. WHY HYPERDEX? Features.
       CAP - Common Buzz: Consistent, Available, Partition-tolerant – Pick any two.
       From http://hyperdex.org/FAQ/: HyperDex is designed to withstand a threshold of failures desired by the application. The level of fault-tolerance is tunable by the system administrator. HyperDex guarantees consistency, availability in the presence of less than f faults, and partition tolerance for partitions that affect less than f nodes, where f is a user-tunable parameter.
       -  Fully linearizable. Every ‘get’ always returns the latest ‘put’.
       -  Tolerates up to f failures.
       -  Query secondary attributes almost as fast as the primary key.
       -  Rich data types: Strings, Floats, Ints, Lists, Maps, Sets
       -  Atomic, multi-key transactions. (Commercial)
  • 9. HYPERSPACE HASHING Mapping Data into Euclidean Space Each object is mapped into space. Space is mapped onto servers. One hyperspace relates to one table. HyperDex can manage multiple independent hyperspaces.
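As a rough illustration of the mapping described on this slide, the sketch below hashes each attribute onto one axis and treats the resulting coordinate tuple as the object's region. The attribute names, the choice of two partitions per axis, the SHA-256 hash and the round-robin `server_for` mapping are all assumptions made for the example; this is not HyperDex's actual implementation.

```python
import hashlib


def coordinate(value, partitions):
    """Hash one attribute value onto one axis that is cut into `partitions` intervals."""
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def region_of(obj, attributes, partitions=2):
    """An object's region is the tuple of its per-attribute coordinates."""
    return tuple(coordinate(obj[a], partitions) for a in attributes)


def candidate_regions(query, attributes, partitions=2):
    """Regions a search must visit: fixed on the queried axes, free on the others."""
    regions = [()]
    for a in attributes:
        if a in query:
            choices = [coordinate(query[a], partitions)]
        else:
            choices = range(partitions)
        regions = [r + (c,) for r in regions for c in choices]
    return regions


def server_for(region, servers, partitions=2):
    """Hypothetical region-to-server mapping: number the regions, then go round-robin."""
    index = 0
    for c in region:
        index = index * partitions + c
    return servers[index % len(servers)]


ATTRIBUTES = ("first", "last", "phone")  # illustrative schema
obj = {"first": "John", "last": "Smith", "phone": 6075551024}

home = region_of(obj, ATTRIBUTES)
hits = candidate_regions({"first": "John"}, ATTRIBUTES)
print(home in hits)   # True: the search always covers the object's region
print(len(hits))      # 4 of the 8 regions need to be contacted
```

Fixing one attribute halves the number of regions to visit in this toy setup, which is the intuition behind searching secondary attributes without contacting every server.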
  • 10. HYPERSPACE HASHING So far, so good. Aww, wait! The curse of dimensionality. The volume of the resulting hyperspace grows exponentially in the number of dimensions/attributes. For instance, a table with 9 dimensions requires 2^9 regions. That’s a minimum of 512 servers.
  • 11. HYPERSPACE HASHING Logarithms to the rescue! Subspaces. HyperDex splits the hyperspace into multiple lower-dimensional subspaces. Thus, the volume of the space only grows linearly. Not only does this reduce the number of machines required to store the data, search also becomes more efficient, because fewer machines need to be contacted. A key subspace is added to distinguish key lookup from single attribute searches. Each subspace stores a full copy of the object.
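The arithmetic behind the last two slides can be checked with a few lines. The sketch assumes two partitions per dimension, a one-dimensional key subspace, and a split of the remaining attributes into subspaces of three attributes each; the split is chosen purely for illustration and is not taken from the slides.

```python
def full_hyperspace_regions(num_attributes, partitions=2):
    """One hyperspace over all attributes: regions grow exponentially."""
    return partitions ** num_attributes


def subspace_regions(subspace_sizes, partitions=2):
    """Several independent subspaces: region counts add up instead of multiplying."""
    return sum(partitions ** size for size in subspace_sizes)


for d in (3, 6, 9, 12):
    # Assumed layout: key subspace (1 dimension) plus d // 3 subspaces of 3 attributes.
    layout = [1] + [3] * (d // 3)
    print(d, full_hyperspace_regions(d), subspace_regions(layout))
# 3 8 10
# 6 64 18
# 9 512 26
# 12 4096 34
```

With a fixed subspace size, every three additional attributes add a constant 8 regions, while the single hyperspace grows by a factor of 8.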
  • 13. VALUE-DEPENDENT CHAINING Consistency and Replication. We have copies of each object in each subspace. Value-dependent chaining keeps all copies consistent and provides strong consistency (linearizability) and fault tolerance in the presence of concurrent updates.
  • 14. VALUE-DEPENDENT CHAINING Consistency. HyperDex propagates each update deterministically to all relevant spaces. Update u1: PUT (insert key) -  h1, h2, h3 Chains are executed from the end. Head = Point leader. The same for each key. The point leader knows all updates. Dependencies are embedded in the chain.
  • 15. VALUE-DEPENDENT CHAINING Replication. HyperDex inserts replicas for each region into the chain. Consider Update u1:
        -  h1, h2, h3
        -  h1, h1', h2, h2', h3, h3'
        -  h1, h1', h2', h2'', h3, h3'
        Replicas are always updated first. Failures do not compromise strong consistency. Clients are only acknowledged after full replication is achieved.
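The three chains listed for update u1 can be reproduced with a small simulation. The region ids `r1` to `r3`, the helper names and the in-memory `stores` dictionary are hypothetical. The order in which HyperDex actually applies and commits an update along the chain is not modeled; the sketch only checks that every chain node holds the update before the client is acknowledged.

```python
def build_chain(regions, replicas):
    """Flatten the per-region replica lists into one chain, key subspace first."""
    return [node for region in regions for node in replicas[region]]


def fail_over(replicas, region, failed, spare):
    """Drop a failed node from its region and append a fresh replica."""
    survivors = [n for n in replicas[region] if n != failed]
    return {**replicas, region: survivors + [spare]}


def put(chain, stores, key, value):
    """Write to every chain node; acknowledge only if all of them hold the value."""
    for node in chain:
        stores.setdefault(node, {})[key] = value
    return all(stores[node].get(key) == value for node in chain)


regions = ["r1", "r2", "r3"]  # r1 = region in the key subspace
replicas = {"r1": ["h1", "h1'"], "r2": ["h2", "h2'"], "r3": ["h3", "h3'"]}

chain = build_chain(regions, replicas)
print(chain)   # ['h1', "h1'", 'h2', "h2'", 'h3', "h3'"]

after_failure = fail_over(replicas, "r2", failed="h2", spare="h2''")
print(build_chain(regions, after_failure))   # ['h1', "h1'", "h2'", "h2''", 'h3', "h3'"]

stores = {}
print(put(chain, stores, "u1-key", "value"))   # True
```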
  • 16. THE PARTS OF THE MACHINE HyperDex - Nuts and Bolts
  • 17. THE PARTS OF THE MACHINE The Slave Node. Everything is C++. The slave nodes are not particularly interesting.
  • 18. THE PARTS OF THE MACHINE The Coordinator & the Configuration. A logically centralized coordinator maintains global state.
        -  Own replicated state machine for the coordinator called “replicant”. This is what Zookeeper does for Hadoop et al.
        -  Global state is maintained as Configuration.
        -  The coordinator has no state of the stored objects, only mappings and servers.
        -  Instance: IP, Port, Instance ID.
        -  The coordinator creates new configurations based on changes and failures and distributes them to the clients.
  • 19. THE PARTS OF THE MACHINE The Client. The client is part of the whole system, not just a customer.
        -  Client receives new configurations from the coordinator.
        -  Switching to a new configuration is atomic.
        -  Client only contacts relevant nodes. This is significant for performance.
        -  Clients must be “intelligent”. No REST.
        -  A load-balancing proxy layer could help. But isn’t there.
        -  Full support for C++, Python.
        -  Partial support for Java (uses the C++ driver through JNI), Node.js, Ruby
        -  Using layers skips features. Java driver doesn’t support “count”.
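One way to picture a client that switches configurations atomically is to treat each configuration as an immutable snapshot and replace the reference in a single step. The class and field names below are hypothetical and only illustrate the idea; they are not the HyperDex client API.

```python
import threading
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class Configuration:
    """Immutable snapshot handed out by the coordinator (hypothetical shape)."""
    version: int
    region_to_server: MappingProxyType  # read-only mapping: region -> (ip, port)


class Client:
    def __init__(self, config):
        self._config = config
        self._lock = threading.Lock()

    def install(self, new_config):
        """Swap in a newer configuration; stale or duplicate versions are ignored."""
        with self._lock:
            if new_config.version > self._config.version:
                self._config = new_config

    def route(self, region):
        """Resolve a region against exactly one configuration snapshot."""
        config = self._config  # take the reference once per request
        return config.version, config.region_to_server[region]


client = Client(Configuration(1, MappingProxyType({"r1": ("127.0.0.1", 2022)})))
client.install(Configuration(2, MappingProxyType({"r1": ("127.0.0.1", 2032)})))
print(client.route("r1"))   # (2, ('127.0.0.1', 2032))
```

Because a request reads the reference once, it never mixes mappings from two configurations, and it only needs the address of the one server responsible for the region.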
  • 21. THE REAL WORLD ™ Install. Pre-built packages. Supports CentOS, Debian, Fedora, Ubuntu. But not all versions. And not everything. Read: “No package for the Java driver”. Build from source. Good luck. Be super conscious of package versions. More on that in a minute.
  • 22. THE REAL WORLD ™ Start the Daemons.
        Coordinator.
          # hyperdex coordinator -f -l 127.0.0.1 -p 1982
        Data Nodes.
          # hyperdex daemon -f --listen=127.0.0.1 --listen-port=2022 --coordinator=127.0.0.1 --coordinator-port=1982 --data=./data0/
          # hyperdex daemon -f --listen=127.0.0.1 --listen-port=2032 --coordinator=127.0.0.1 --coordinator-port=1982 --data=./data1/
  • 23. THE REAL WORLD ™ Client Demo The Python client is the HyperDex shell. Create Hyperspace. # python
  • 24. THE REAL WORLD ™ Client Demo Create a client. Basic PUT/GET. Uses Key subspace. # python
  • 25. THE REAL WORLD ™ Client Demo Search. Uses further subspaces.
  • 26. THE REAL WORLD ™ Client Demo Updates and Range Query/Search.
  • 28. THE REAL WORLD ™ Bashing the Prophetess & the Giant. Performance Benchmarks use the YCSB against Cassandra and MongoDB. Dedicated cluster of 14 Nodes in the VICCI cloud. Take it with a grain of salt. I’m missing Riak.
  • 29. THE REAL WORLD ™ Throughput.
  • 30. THE REAL WORLD ™ Latency.
  • 31. THE REAL WORLD ™ Latency.
  • 32. THE REAL WORLD ™ Scaling.
  • 34. THE REAL WORLD ™ Experiences. Findings. Minor versions are incompatible.
        -  hyperdex-1.0.rc4 vs. hyperdex-1.0.rc5
        -  import hyperclient vs. hyperdex.admin, hyperdex.client
        -  (hyperdisk) vs. leveldb vs. hyperleveldb
        -  There goes my dream of using the PHP driver on GitHub.
        -  Migration? No idea.
        -  Compile? Use VM to go.
  • 35. THE REAL WORLD ™ Experiences. Findings.
        It’s just a K/V store.
        -  No methods to do distributed computations. Python map/reduce is on the agenda.
        No Dynamo Ring. But a chain to rule them all.
        -  Fault-tolerance with f dedicated nodes is fine, but what about multiple datacenters?
        It’s quite a young project with few committers. Important internals change between minor versions. Not much sleep for them. How about your DevOps?
  • 37. WHAT ABOUT? *log
        We chose Riak.
        -  Excellent Java driver.
        -  We don’t need transactions.
        -  During development, our schema will change often.
        -  Operational ease, easy to scale, excellent feedback.
        -  Map/reduce in Erlang and JS. Can use the result of a secondary index query.
        -  Solr Integration with Riak Search. Not at the moment, but we deal with content.
        We like HyperDex.
        -  Really interesting concepts and advancements, but at the moment not the perfect fit.
        -  Implemented a storage backend abstraction layer. Easy to switch to HyperDex once it’s more mature.