The talk is about the process of adding support for Cassandra in Kiji, our open-source platform for building big-data applications. I start off by describing the Kiji project and how it enables folks to build big-data applications, and (hopefully) get everyone excited about it. Then I talk about the Kiji data model, its origins in HBase (we initially built Kiji on top of HBase), how we updated it to also support Cassandra, and some of the issues we ran into along the way. I get into some detail about our use of the Java driver and its async API, how we translate operations in Kiji into CQL statements, and some enhancements we've made to the Hadoop InputFormat and OutputFormat. I think this talk will be interesting to folks in general, and in particular will be useful for anyone who has an HBase background and is now working with Cassandra.
The Kiji Project is a modular, open-source framework that enables developers to efficiently build real-time Big Data applications. Kiji is built upon popular open-source technologies such as Cassandra, HBase, Hadoop, and Scalding, and contains components that implement functionality critical for Big Data applications, including the following:
Support for evolvable schemas of complex data types
Batch training of machine learning models with Hadoop
Real-time scoring with trained models
Integration with Hive and R
A REST endpoint
Recently, we have updated Kiji to use Cassandra as a backing data store (previously, Kiji worked only with HBase). In this talk, we describe the process of integrating Cassandra and Kiji. Topics we cover include the following:
The Kiji architecture and data model
Implementing the Kiji data model in Cassandra using the Java driver and CQL3
Integrating Cassandra with Hadoop 2.x
Building a flexible middleware platform that supports Cassandra and HBase (including projects that use both simultaneously)
Exposing unique features of Cassandra (e.g., variable consistency) to Kiji users