Everything Cassandra does is designed for a real-time workload of high volume inserts and frequent small queries. Cassandra has Hadoop and Hive integration, but performing long running ad-hoc queries with these tools is difficult without impacting real-time performance or requires duplicate clusters. This talk will explain how I'm integrating Cassandra with Shark, a drop-in Hive replacement developed by Berkeley's AmpLab. It's designed to give fine grained control over all resource usage so you can safely run arbitrary ad-hoc queries on your existing cluster with controlled and predictable impact.
Recortar diapositivas es una manera útil de recopilar información importante para consultarla más tarde.