Beckman abadi-5min-pres

Slides for the Beckman Database Research Self-Assessment Meeting. Provides thoughts about Big Data and the database systems research community.

Transcript

  • 1. @daniel_abadi, Yale University
  • 2. The Big Data phenomenon is the best thing that could have happened to the database community
      • Despite other definitions related to the ‘3 Vs’, Big Data means BIG Data
      • Which means we need scalable database systems
      • Still two main components of Big Data:
      • Performing data analysis at scale
      • Performing requests on data at scale
  • 3. Database community has won the battle
      • Some thought that MapReduce might replace traditional database technology as the primary means to perform analysis at scale
      • Just about every MapReduce vendor has abandoned this goal
      • Hadapt, Impala, Tez, and several others are in a race to see who can add the most traditional database execution technology to Hadoop fastest
      • Everyone is going in the direction of cost-based optimizers, traditional database operators, and push-based query execution
  • 4. The database community is losing the battle
      • NoSQL systems still have very little traditional database technology inside (despite adding SQL interfaces)
      • No race to add DB technology. Why?
      • Don’t blame CAP: CAP is only relevant when there’s a network partition
      • We never figured out how to do ACID and active replication at scale
      • Many new proposals make simplifying assumptions in order to handle scale
      • It’s been 30 years. Why can’t we build a distributed database that can handle distributed transactions over actively replicated data at scale?