* The Big Data phenomenon is the best thing that
could have happened to the database community
* Despite other definitions related to the ‘3 Vs’ --- Big Data means BIG Data
* Which means we need scalable database systems
* Still two main components of Big Data
* Performing data analysis at scale
* Performing requests on data at scale
* Database community has won the battle
* Some thought that MapReduce might replace
traditional database technology as the primary
means to perform analysis at scale
* Just about every MapReduce vendor has abandoned
* Hadapt, Impala, Tez, and several others are in a
race to see who can add the most traditional
database execution technology to Hadoop fastest
* Everyone is going in the direction of cost-based
optimizers, traditional database operators, and
push-based query execution
* The database community is losing the battle
* NoSQL systems still have very little traditional database
technology inside (despite adding SQL interfaces)
* No race to add DB technology --- why?
* Don’t blame CAP --- CAP is only relevant when there’s a network partition
* We never figured out how to do ACID and active replication at scale
* Many new proposals make simplifying assumptions in order to
* It’s been 30 years --- why can’t we build a distributed
database that can handle distributed transactions over
actively replicated data at scale?