Next generation databases july2010


  • Apologies, I’m a database type... Quest is best known for Toad, but we also have enterprise monitoring across all levels of the stack. In Melbourne: SQL Navigator + the Spotlights. It’s not a complete coincidence about the Star Trek theme.
  • Insanely popular – literally millions of users
  • This is your database. This is your database on crack.
  • That’s a predictable linear growth curve. It gets much worse for unpredictable or cyclic demand. So I think it’s real, and it excites me because it represents the realization of a more industrialized model for providing computing resources. In the early days of electricity everybody had their own power sources and every company needed engineers as a result. Nowadays, few companies need that...
  • NoSQL tends to be strongly coupled with the application. Everybody else is out of luck
  • Data warehouses doubling every three years.

    1. This is Not Your Father’s Database: Everything You Need to Know Now About Cloud Computing and Emerging Database Technology
        • Guy Harrison
        • Director Research and Development, Melbourne
    2. Introductions
    3.
    4.
    5. After the gold rush
        • Mainframes
        • Minicomputers
        • Client Server
        • Internet/Y2K Boom
    6. Current Day Trends
        • Big Data
        • Cloud computing
        • Solid State Disk
    7. Big Data
        • The Industrial Revolution of data*
        • User generated data: Twitter, Facebook, Amazon
        • Machine generated data: RFID, POS, cell phones, GPS
        • Traditional RDBMS neither economic nor capable
    8. Big data 1: Google
    9. Map Reduce (diagram: Start, a fan-out of Map tasks, Reduce)
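
The Map and Reduce boxes on slide 9 can be illustrated with a word count. This is a minimal single-process sketch, not how Google or Hadoop implement it: the function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) are made up for illustration, and in a real cluster each map call would run on a different node.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(mapped_pairs):
    """Group intermediate values by key, between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)

def map_reduce(documents):
    mapped = []
    for doc in documents:  # each call is independent, so it could run in parallel
        mapped.extend(map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

counts = map_reduce(["big data big cloud", "cloud data"])
print(counts)
```

Because every map call only sees its own split, adding nodes adds throughput, which is the property the fan-out in the diagram is showing.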
    10. Hadoop: Open source Map-reduce
        • Yahoo! Hadoop cluster: 4000 nodes, 16 PB disk, 64 TB of RAM, 32,000 cores
        • Very low $/TB
    11. Hive (diagram: SQL, Java, Results)
    12. Big Data 2: Web 2.0
    13. Twitter Growth
    14. The fail whale
    15. (diagram: Web Servers, Memcached Servers, Database Servers, Read Only Slaves, Shard (A-F), Shard (G-O), Shard (P-Z))
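
The layering on slide 15 (a cache in front of range-partitioned database shards) can be sketched as follows. This is an assumption-laden toy: plain dicts stand in for memcached and for the database servers, the shard names and the `ShardedStore` class are hypothetical, and read-only slaves are left out.

```python
# Shard ranges taken from the slide labels: A-F, G-O, P-Z.
SHARDS = [("A", "F", "shard_a_f"), ("G", "O", "shard_g_o"), ("P", "Z", "shard_p_z")]

def shard_for(key):
    """Route a key to a shard by the first letter of the key."""
    first = key[0].upper()
    for low, high, name in SHARDS:
        if low <= first <= high:
            return name
    raise ValueError(f"no shard covers key {key!r}")

class ShardedStore:
    def __init__(self):
        self.cache = {}  # stands in for the memcached tier
        self.shards = {name: {} for _, _, name in SHARDS}
        self.db_reads = 0  # counts reads that reached a database shard

    def put(self, key, value):
        self.shards[shard_for(key)][key] = value
        self.cache.pop(key, None)  # invalidate any stale cached copy

    def get(self, key):
        if key in self.cache:  # cache hit: no database work
            return self.cache[key]
        self.db_reads += 1
        value = self.shards[shard_for(key)].get(key)
        if value is not None:
            self.cache[key] = value
        return value

store = ShardedStore()
store.put("guy", "Melbourne")
store.get("guy")  # first read goes to shard_g_o
store.get("guy")  # second read is served from the cache
print(shard_for("guy"), store.db_reads)
```

The routing logic lives in the application, which is one reason this style of scaling is hard to operate: rebalancing the letter ranges means moving data and changing code together.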
    16. Clouds and Elastic provisioning (chart of Capacity / Demand against Time; labels: Demand, Capacity, Hardware upgrade, Under provisioned, Over provisioned)
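
The over and under provisioned areas on slide 16 can be made concrete with a little arithmetic. The demand and capacity figures below are invented purely for illustration, and `provisioning_gap` is a hypothetical helper, not something from the talk.

```python
def provisioning_gap(demand, capacity):
    """Return (over, under): total idle capacity and total unmet demand."""
    over = sum(max(c - d, 0) for d, c in zip(demand, capacity))
    under = sum(max(d - c, 0) for d, c in zip(demand, capacity))
    return over, under

# Made-up numbers: demand grows linearly over six periods.
demand = [10, 20, 30, 40, 50, 60]
# Fixed hardware: capacity only changes at the upgrade in period 4.
stepped = [25, 25, 25, 55, 55, 55]
# Elastic provisioning: capacity tracks demand exactly (the ideal case).
elastic = list(demand)

print(provisioning_gap(demand, stepped))  # (40, 10)
print(provisioning_gap(demand, elastic))  # (0, 0)
```

With stepped upgrades you pay for idle capacity most of the time and still fall short just before each upgrade; the elastic case removes both gaps, at least in this idealized model.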
    17. CAP Theorem (diagram labels: Availability, RDBMS, Consistency, NO GO, NoSQL, Partition Tolerance)
    18. In search of the elastic database
        • Big Web sites AND Cloud applications need servers that scale up (and down) on demand
        • Elastic provisioning works fine for web servers, application servers, etc.
        • However RDBMS does not scale easily:
            • SQL Azure limited to one database <50GB on a single host
            • Oracle’s RAC not supported in cloud environments
            • MySQL sharding “obnoxious”
        • Many are willing to sacrifice relational database features for scalability and operational simplicity
    19. The NoSQL movement
    20. NoSQL (A.K.A. Cloud databases)
        • Generally DO NOT support: SQL, Transactions, Immediate consistency
        • Usually DO support: Elasticity (scale out AND in), Eventual consistency, Inherent redundancy and fault tolerance
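
The difference between immediate and eventual consistency on slide 20 can be shown with a toy replicated store. This is a sketch under simplifying assumptions: the `Replica` and `EventuallyConsistentStore` classes are hypothetical, conflicts are resolved by a simple version counter (last write wins), and replication is triggered by hand rather than by a background process.

```python
class Replica:
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def write(self, key, version, value):
        current = self.data.get(key)
        if current is None or version > current[0]:  # last write wins
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

class EventuallyConsistentStore:
    def __init__(self, replica_count=3):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # replication messages not yet delivered
        self.clock = 0

    def put(self, key, value):
        self.clock += 1
        self.replicas[0].write(key, self.clock, value)  # acknowledged at once
        for replica in self.replicas[1:]:
            self.pending.append((replica, key, self.clock, value))

    def propagate(self):
        """Deliver queued updates; afterwards all replicas agree."""
        for replica, key, version, value in self.pending:
            replica.write(key, version, value)
        self.pending.clear()

store = EventuallyConsistentStore()
store.put("status", "fail whale")
stale = store.replicas[2].read("status")  # None: update not delivered yet
store.propagate()
fresh = store.replicas[2].read("status")  # now "fail whale"
print(stale, fresh)
```

A read that lands on a lagging replica sees old data for a while. That window is the price paid for accepting writes without waiting on every node.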
    21. NoSQL Data Models
    22. (diagram labels: MemcacheDB, Azure Table Services, Key Value Stores, Redis, Tokyo Cabinet, SimpleDB, Riak, Amazon Dynamo, Voldemort, Google BigTable, Cassandra, HBase, Hypertable, CouchDB, Document DB, JSON/XML DB, MongoDB, Neo4J, Graph Databases, FlockDB)
    23. Not so easy to get the data out....
    24. (diagram labels: Amazon AWS Cloud, On-Premise (AKA private Cloud), MySQL, Data Hub, SQL, HBase, SimpleDB, SQL, Data Hub, Microsoft Azure Cloud, SQL Azure, Table Services, SQL Server, Oracle)
    25.
    26. Big Data 3: Data Warehousing
    27. Data Warehouse players
    28. DATAllegro architecture
    29. Column Databases (Vertica, Sybase)
        • Data is stored together in columns
        • Very fast answers to analytic aggregate queries
        • Better compression
        • Not write optimized
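
The points on slide 29 can be seen in a tiny in-memory example. This is only a sketch of the idea, not how Vertica or Sybase store data: the table contents are invented, and run-length encoding is used here as one simple example of why a column of similar values compresses well.

```python
from itertools import groupby

# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"region": "APAC", "product": "a", "sales": 10},
    {"region": "APAC", "product": "b", "sales": 20},
    {"region": "APAC", "product": "a", "sales": 5},
    {"region": "EMEA", "product": "b", "sales": 7},
]

def to_columns(records):
    """Pivot row records into one list per column."""
    return {name: [r[name] for r in records] for name in records[0]}

def run_length_encode(values):
    """Compress runs of repeated values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

columns = to_columns(rows)

# An aggregate touches a single column instead of every field of every row.
total_sales = sum(columns["sales"])

# Repeated values sit next to each other, so they compress well.
encoded_region = run_length_encode(columns["region"])

print(total_sales)      # 42
print(encoded_region)   # [('APAC', 3), ('EMEA', 1)]
```

The flip side is the last bullet: inserting one new row means appending to every column list, which is why this layout is not write optimized.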
    30. Disk drives and Moore’s law
        • Transistor density doubles every 18 months
        • Exponential growth is observed in most electronic components:
            • CPU clock speeds
            • RAM
            • Hard Disk Drive storage density
        • But not in mechanical components:
            • Service time (Seek latency) – limited by actuator arm speed and disk circumference
            • Throughput (rotational latency) – limited by speed of rotation, circumference and data density
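
The mechanical limits on slide 30 can be put into rough numbers. On average the platter has to turn half a revolution before the requested sector arrives, so average rotational latency is half of one rotation time. The 4 ms seek time below is an assumed round figure for illustration, not a number from the talk, and both function names are hypothetical.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the sector: half of one full rotation, in milliseconds."""
    ms_per_rotation = 60_000 / rpm
    return 0.5 * ms_per_rotation

def max_random_iops(seek_ms, rpm):
    """Rough ceiling on random IOs per second for a single spinning disk."""
    service_time_ms = seek_ms + avg_rotational_latency_ms(rpm)
    return 1000 / service_time_ms

print(avg_rotational_latency_ms(15_000))    # 2.0
print(round(max_random_iops(4.0, 15_000)))  # 167
```

Neither term shrinks with transistor density, which is why storage capacity can grow exponentially while random IO per disk stays within the same order of magnitude.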
    31. Big Data vs. Fast Data
        • Disk trends 2001-2009
    32. SSD to the rescue?
    33. Power consumption
    34. Economics of SSD
    35. Fast reads but slow writes
    36. Hierarchical storage management (chart labels: $/GB, $/IOP)
    37. In Memory Databases: VoltDB & H-Store
        • In Memory Distributed (“Sharded”) Database
        • No transactional IO
        • ACID transactions (k-safety)
        • Single Threaded (no latches or locks)
        • Java Stored Procedure transactions
        • Hierarchical data model
        • Double Shared Nothing (disk OR CPU)
        • Spool out to DW for ad-hoc analysis
        • Very high TPS for suitable applications
    Oracle EXADATA
        • RAC clusters provide MPP
        • Dedicated storage servers
        • High speed InfiniBand channels
        • Smart storage reduces data transfer requirements
        • Hybrid Flash & spinning disk storage system
        • Flash caching in the database systems
    40. The Next Generation?