Database Virtualization: The Next Wave of Big Data

Servers, storage, and networking have all been virtualized; the next big wave is the database. SQL databases are the one component in the cloud that still requires single, dedicated instances. Database virtualization changes this, enabling full elasticity without sacrificing functionality.

Published in: Technology
  • Average server utilization runs at about 10%, which lets your IT department or your cloud provider use or sell the unused capacity.
  • Companies no longer have to
  • A distributed lock manager is easy to build: you simply lock the other nodes while one is writing. But then your performance is terrible. How hard is it to build one that performs well? It took Oracle 10 years to get it right with RAC. Ten years. That’s 70 cloud years. Who has time for that?
  • Mitigating factors: “It depends” on the distribution of data and load, and on the use of slaves to handle read load.
  • ScaleDB virtualizes the database, turning it into a database tier and a storage tier. The storage tier provides a pool of cache that is shared among various clusters, enabling it to spread I/O peaks across multiple nodes. The database tier then achieves very high utilization because its clusters expand elastically to handle peaks. The only con of this architecture is that it takes the developer a long time to build, but we’ve done that!
  • Database Virtualization: The Next Wave of Big Data

    1. Database Virtualization: The Next Wave of Big Data. Mike Hogan, CEO
    2. Agenda
       • Big Data: A Moving Target
       • Common Understanding of Virtualization
       • Database Virtualization Challenge
       • Alternative 1: NoSQL
       • Alternative 2: Sharding
       • Introducing Database Virtualization
       • Narrowing the Gap Between Databases and Big Data
    3. Big Data: A Moving Target
       • Definition: Too much data to handle in a traditional database
       • Big Data tools leverage scale-out architectures, e.g. Hadoop
       • Technology advances make Big Data a moving target
       • Databases are adopting scale-out, virtual database architectures
       • Chart labels: Data Volume, Time, BIG Data
    4. © Copyright 2013 ScaleDB. The information contained herein is subject to change without notice. What is Database Virtualization?
    5. The Dedicated Server
       • Diagram labels: A Server; Server Utilization (Average 10%); Headroom (to avoid failure); Usage Spike
    6. The Virtualized App Server
       • Shared among many customers
       • Plenty of room for usage peaks
       • Virtualization enables cloud providers to sell 3-4 TIMES more servers than they actually own. This is how they make money.
    7. Database Virtualization Challenges
       • No coordination between databases (data & locking)
       • Diagram (Bank, You): Bank Balance = $10M; Withdraw $10M; Wire $8M; Wire $8M; Bank Balance = -$16M
       • Requires a distributed locking solution
       • Distributed locking is fairly easy to build…
       • …but building it to perform well is extremely hard
       • It took Oracle RAC 10 years… 70 “cloud years”
    8. Alternative 1: NoSQL
       • Elasticity enables you to burst across servers, so you can run them at high utilization
    9. Alternative 1: NoSQL
       • Moves functionality to the application tier… more work for you
       • Cons: (1) Non-relational (build this into your app); (2) Reduces consistency: different users/different answers; (3) Removes transactions (build this into your app); (4) Less functionality, e.g. joins (build these into your app)
       • Pros: (1) Scalability; (2) Elastic = high utilization
       • Diagram labels: Your Application; The DBMS; SQL; NoSQL; App; App; You buy this part; You build & maintain this part
    10. Alternative 2: SQL Sharding
       • EACH server must handle the peak for ITS data
       • Cons: (1) Not elastic = no bursting across servers; (2) Rigid partitioning model; (3) Requires slaves for fail-over (vs. high-availability); (4) You have to build & maintain routing code
       • Pros: (1) Relational; (2) Consistent data (ACID); (3) Transactional; (4) Full functionality
       • No elasticity means no bursting across servers, requiring low utilization.
       • Not highly-available, relies on fail-over
       • Diagram labels: Masters; Slaves
    11. Introducing Database Virtualization
       • Virtualizes & Shares Storage Tier across Elastic Database Clusters
       • Highly-available data tier shared across multiple database clusters
       • Shared among many customers
       • Plenty of room for usage peaks
       • Pros: (1) Relational; (2) Consistent data (ACID); (3) Transactional; (4) Full functionality; (5) Elastic; (6) No slaves
       • Diagram labels: Database Tier (CPU); Storage Tier (I/O)
    12. Introducing Database Virtualization
       • Distributed Parallel Process Across Storage Servers
       • Query: What were my sales last month?
       • Processed at the storage tier, only results are sent back to the database
       • Distributed Parallel Processing: Similar to Map-Reduce & Oracle Exadata
       • This Narrows the Gap between Databases and Big Data
       • Diagram labels: Database Tier (CPU); Storage Tier (I/O)
    13. Database Virtualization Enables DBaaS
       • Virtualizes & Shares Storage Tier across Elastic Database Clusters
       • Processing shared across database nodes
       • Highly-available data tier shared across multiple database clusters
       • Diagram labels: Database Tier (CPU); Storage Tier (I/O)
    14. Cloud Computing’s Enabling Technologies
       • Server: Server Virtualization (VMWare, Citrix)
       • Storage: Storage Virtualization (EMC, Netapp, IBM, Dell, HP)
       • Network: Network Virtualization (Cisco, VMWare, Oracle)
       • DBMS: Database Virtualization (ScaleDB)
    15. How About Performance?
    16. Performance: ScaleDB vs. InnoDB. Performance tests running on DL380 servers, large data set
       • Chart (operations per second, axis 0 to 2500): MariaDB+InnoDB 550; ScaleDB 1-Node 1238; ScaleDB 2-Nodes 1884; ScaleDB 3-Nodes 2236
       • Benchmark details: YCSB Workload A, 1:1 read/write ratio, database size 200M rows, MariaDB V5.3.5
    17. Performance: ScaleDB vs. InnoDB. Performance tests running on HP Cloud (Read:Write Ratio = 1:1)
       • Chart (operations per second, axis 0 to 5000): MySQL+InnoDB 544; ScaleDB 1-Node 3542; ScaleDB 2-Nodes 4668
       • Benchmark details: YCSB Workload A, 1:1 read/write ratio, database size 40M rows, MySQL V5.1.42
    18. Performance: ScaleDB vs. InnoDB. Performance tests running on HP Cloud (Read-Only)
       • Chart (operations per second, axis 0 to 12000): series MySQL+InnoDB, ScaleDB 1-Node, ScaleDB 2-Nodes; the bar labels ran together in extraction as “930611711920”
       • Benchmark details: YCSB Workload A, 1:0 read/write ratio, database size 40M rows, MySQL V5.1.42
    19. Performance: ScaleDB vs. InnoDB. Sysbench benchmark running on HP Cloud (Read-Only)
       • Chart (transactions per second, axis 0 to 250): series MySQL+InnoDB, ScaleDB 1-Node, ScaleDB 2-Nodes; the bar labels ran together in extraction as “7134250”
       • Benchmark details: Sysbench, read-only, database size 500M rows, MySQL V5.1.42
    20. Performance: ScaleDB vs. InnoDB. Sysbench benchmark running on HP Cloud (10% Write)
       • Chart (transactions per second, axis 0 to 80): series MySQL+InnoDB, ScaleDB 1-Node, ScaleDB 2-Nodes; the bar labels ran together in extraction as “35079”
       • Benchmark details: Sysbench, 10% write, database size 500M rows, MySQL V5.1.42
    21. Summary
       • Database Scale-out & Parallelization Address Big Data
       • Scaling-out SQL Database Problem: Distributed Locking
       • Alternative 1: NoSQL
       • Alternative 2: Sharding
       • Both Shift Functionality to the Application Tier
       • Introducing Database Virtualization… with Performance!
       • Closing the Gap Between Databases and Big Data
    22. Thank You
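
The bank example on slide 7 can be reproduced in a few lines. The following is a minimal sketch, not ScaleDB's implementation: the function and class names are hypothetical, and an in-process `threading.Lock` stands in for a real distributed lock manager, which would have to coordinate across separate machines. It shows why three uncoordinated nodes drive a $10M balance to -$16M, and how serializing the check-and-debit step prevents the overdraft.

```python
import threading

MILLION = 1_000_000


def uncoordinated_withdrawals(balance, requests):
    """Each node checks its own stale copy of the balance (no shared lock)."""
    stale_copy = balance
    approved = [amount for amount in requests if amount <= stale_copy]
    # Every approved request is debited, even though together they overdraw.
    return balance - sum(approved), approved


class LockedAccount:
    """Hypothetical account whose check-and-debit step is serialized by a lock."""

    def __init__(self, balance):
        self.balance = balance
        self.approved = []
        self._lock = threading.Lock()  # stand-in for a distributed lock manager

    def withdraw(self, amount):
        with self._lock:
            # Check and debit happen atomically with respect to other nodes.
            if amount <= self.balance:
                self.balance -= amount
                self.approved.append(amount)


def coordinated_withdrawals(balance, requests):
    account = LockedAccount(balance)
    threads = [threading.Thread(target=account.withdraw, args=(amount,))
               for amount in requests]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return account.balance, account.approved


# Slide 7 scenario: balance of $10M; withdraw $10M, wire $8M, wire $8M.
requests = [10 * MILLION, 8 * MILLION, 8 * MILLION]
bad_balance, bad_approved = uncoordinated_withdrawals(10 * MILLION, requests)
good_balance, good_approved = coordinated_withdrawals(10 * MILLION, requests)

print("uncoordinated balance:", bad_balance)
print("coordinated balance is non-negative:", good_balance >= 0)
```

Which request wins in the coordinated case depends on thread scheduling, but only one can be approved and the balance never goes negative. The lock also forces the requests to run one at a time, which is the performance cost the slide warns about.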

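
Slide 12 describes queries being processed at the storage tier, with only results sent back to the database, similar to Map-Reduce. The sketch below illustrates that general pattern under stated assumptions: the storage nodes are plain Python lists, the rows and month values are invented sample data, and `partial_sum` and `sales_for_month` are hypothetical names rather than any ScaleDB API. Each storage node filters and aggregates its own partition, and the database tier only combines the small partial results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: each inner list is one storage node's partition
# of (month, sale_amount) rows.
STORAGE_NODES = [
    [("2013-03", 120), ("2013-04", 200), ("2013-04", 50)],
    [("2013-04", 75), ("2013-02", 10)],
    [("2013-04", 25), ("2013-03", 300)],
]


def partial_sum(rows, month):
    """Runs on one storage node: filter and aggregate locally ("map" step)."""
    return sum(amount for row_month, amount in rows if row_month == month)


def sales_for_month(nodes, month):
    """Runs on the database tier: fan out, then combine partials ("reduce" step)."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda rows: partial_sum(rows, month), nodes))
    # Only one number per storage node crosses the network, not the raw rows.
    return sum(partials), partials


total, partials = sales_for_month(STORAGE_NODES, "2013-04")
print("partial sums per storage node:", partials)
print("sales for the month:", total)
```

The point of the design is data movement: seven rows live on the storage nodes here, but only three partial sums travel to the database tier, and that ratio improves as the partitions grow.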