TPC-H in MongoDB
Aung Thu Rha Hein (g5536871)

Run TPC-H queries in MongoDB and benchmark against MySQL RDBMS


  1. TPC-H in MongoDB
     Aung Thu Rha Hein (g5536871)
  2. Agenda
     • Introduction to MongoDB
     • TPC-H Data Setup
     • Schema
     • Advantages and Disadvantages of New Schema
     • Queries
       o Pricing Summary Record Query
       o National Market Share Query
       o Top Supplier Query
       o Potential Part Promotion Query
       o Suppliers Who Kept Orders Waiting Query
       o Global Sales Opportunity Query
     • Benchmark result
     • Discussion
     • Demonstration
  3. Introduction to MongoDB
     • Open source, document-oriented and schema-free
     • Stores data in BSON format
     • Easy to understand
     • Flexible, scalable & lightweight
     • Ease of use
     • No ‘join’ operation
     • SQL to MongoDB sample query
       o SQL: SELECT * FROM users WHERE status = "A" ORDER BY user_id DESC
       o MongoDB: db.users.find( { status: "A" } ).sort( { user_id: -1 } )
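To make the sample query concrete, here is a minimal sketch in plain JavaScript that mimics what both statements return, using an in-memory array instead of a database. The sample documents and the helper name `findActiveUsersDesc` are assumptions made for illustration; only the `status` and `user_id` fields come from the slide.

```javascript
// Hypothetical sample documents; only the field names come from the slide.
const users = [
  { user_id: 1, status: "A" },
  { user_id: 2, status: "B" },
  { user_id: 3, status: "A" },
];

// Mimics find({ status: "A" }) followed by sort({ user_id: -1 }):
// keep the matching documents, then order them by user_id, descending.
function findActiveUsersDesc(docs) {
  return docs
    .filter((doc) => doc.status === "A")
    .sort((a, b) => b.user_id - a.user_id);
}

const activeUsers = findActiveUsersDesc(users);
console.log(activeUsers.map((doc) => doc.user_id)); // [ 3, 1 ]
```

The filter step plays the role of the SQL WHERE clause and the comparator plays the role of ORDER BY ... DESC.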
  4. TPC-H Data Setup
     • Import data into MongoDB
       o Use MongoVue to import from MySQL
       o Time-consuming and difficult
     • To achieve flexibility:
       o Embed all tables into a single collection
       o Starting from the lineitem table, replace all foreign keys with the objects they reference
       o Choose the lineitem table because of:
         • No primary keys
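The embedding step on this slide can be sketched as follows. This is only an illustration of the idea, not the conversion procedure used in the deck: the sample rows, the helper names `embed` and `buildLineitemDocument`, and the embedded field names (`order`, `customer`, `nation`, `region`) are assumptions. The column names follow the TPC-H schema. Only the order branch is shown; the partsupp branch would be embedded the same way.

```javascript
// Hypothetical sample rows, keyed by primary key. Column names follow TPC-H.
const regionTable = { 2: { r_regionkey: 2, r_name: "ASIA" } };
const nationTable = { 21: { n_nationkey: 21, n_name: "VIETNAM", n_regionkey: 2 } };
const customerTable = { 7: { c_custkey: 7, c_name: "Customer#7", c_nationkey: 21 } };
const ordersTable = { 100: { o_orderkey: 100, o_custkey: 7 } };
const sampleLineitems = [{ l_orderkey: 100, l_linenumber: 1, l_quantity: 5 }];

// Drop the foreign-key field from a row and attach the referenced
// object under a new field name (assumed naming, for illustration).
function embed(row, fkField, referenced, asField) {
  const { [fkField]: _dropped, ...rest } = row;
  return { ...rest, [asField]: referenced };
}

// Build one self-contained document per lineitem row by resolving the
// chain lineitem -> orders -> customer -> nation -> region, innermost first.
function buildLineitemDocument(lineitem) {
  const order = ordersTable[lineitem.l_orderkey];
  const cust = customerTable[order.o_custkey];
  const nat = nationTable[cust.c_nationkey];

  const nationDoc = embed(nat, "n_regionkey", regionTable[nat.n_regionkey], "region");
  const customerDoc = embed(cust, "c_nationkey", nationDoc, "nation");
  const orderDoc = embed(order, "o_custkey", customerDoc, "customer");
  return embed(lineitem, "l_orderkey", orderDoc, "order");
}

console.log(JSON.stringify(buildLineitemDocument(sampleLineitems[0]), null, 2));
```

Every lineitem document carries a full copy of its order, customer, nation and region, which is why no join is needed at query time and also why memory usage and update cost grow.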
  5. Schema
     • Final schema of TPC-H in MongoDB
       [Diagram: lineitem, Order, Customer, Nation, Region, Partsupp, Part, Supplier, N, R]
  6. Advantages and Disadvantages of New Schema
     • Advantages
       o Easier to understand than the SQL schema
       o One document: one record
       o No need to join tables
     • Disadvantages
       o Higher memory usage
       o Update operations become more demanding
       o Converting to BSON takes time
       o Requires a lot of computational power
       o Only around 300,000 lineitem records (5%) could be converted
  7. Queries
     • Select 6 queries to run on MongoDB with Map-Reduce & Aggregation Framework
     • Compare the results with MySQL
     Problems
     • Outputs are not the same because of failures while converting the data
     • Aggregation framework is still in development
  8. Q1: Pricing Summary Record Query
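The query text for this slide did not survive in the transcript. As a rough illustration of the map-reduce style mentioned on the previous slide, here is a simplified sketch of the Q1 grouping logic in plain JavaScript over in-memory documents. It is not the author's query: the sample data and the names `mapLineitem`, `reduceValues` and `pricingSummary` are assumptions, and only a subset of the Q1 output columns is computed. The column names and the ship-date cutoff (1998-12-01 minus 90 days) follow the TPC-H definition of Q1.

```javascript
// Hypothetical lineitem documents; column names follow TPC-H.
const q1Lineitems = [
  { l_returnflag: "N", l_linestatus: "O", l_quantity: 10, l_extendedprice: 100, l_discount: 0.5, l_shipdate: "1998-08-01" },
  { l_returnflag: "N", l_linestatus: "O", l_quantity: 5, l_extendedprice: 200, l_discount: 0.25, l_shipdate: "1998-07-15" },
  { l_returnflag: "R", l_linestatus: "F", l_quantity: 2, l_extendedprice: 40, l_discount: 0, l_shipdate: "1994-01-10" },
  // Shipped after the cutoff, so it is excluded from the summary.
  { l_returnflag: "N", l_linestatus: "O", l_quantity: 1, l_extendedprice: 10, l_discount: 0, l_shipdate: "1998-11-30" },
];

// 1998-12-01 minus 90 days. ISO date strings compare correctly as text.
const SHIPDATE_CUTOFF = "1998-09-02";

// Map step: emit a group key and the partial aggregates for one document.
function mapLineitem(doc) {
  return {
    key: `${doc.l_returnflag}|${doc.l_linestatus}`,
    value: {
      sum_qty: doc.l_quantity,
      sum_base_price: doc.l_extendedprice,
      sum_disc_price: doc.l_extendedprice * (1 - doc.l_discount),
      count_order: 1,
    },
  };
}

// Reduce step: combine two partial aggregates of the same group.
function reduceValues(a, b) {
  return {
    sum_qty: a.sum_qty + b.sum_qty,
    sum_base_price: a.sum_base_price + b.sum_base_price,
    sum_disc_price: a.sum_disc_price + b.sum_disc_price,
    count_order: a.count_order + b.count_order,
  };
}

// Filter by ship date, then group by (returnflag, linestatus).
function pricingSummary(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (doc.l_shipdate > SHIPDATE_CUTOFF) continue;
    const { key, value } = mapLineitem(doc);
    groups.set(key, groups.has(key) ? reduceValues(groups.get(key), value) : value);
  }
  return groups;
}

console.log(pricingSummary(q1Lineitems));
```

Because the reduce step is associative, partial results can be combined in any order, which is the property a map-reduce engine relies on.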
  9. Q8: National Market Share Query
  10. Q15: Top Supplier Query
  11. Q20: Potential Part Promotion Query
  12. Q21: Suppliers Who Kept Orders Waiting Query
  13. Q22: Global Sales Opportunity Query
  14. Benchmark result
      • All benchmarks run on Intel Core i7-3610QM 2.30GHz, 6MB cache, 4GB DDR3, 750GB 7200 RPM, Win64 system

        Query       MongoDB    MySQL
        Query 1     6.1 sec    0.2 sec
        Query 8     1.6 sec    0.1 sec
        Query 15    0.7 sec    0.4 sec
  15. Benchmark result (cont.)

        Query       MongoDB    MySQL
        Query 20    1.1 sec    174.4 sec
        Query 21    6.2 sec    5.5 sec
        Query 22    7.6 sec    0.8 sec
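To read the two benchmark slides side by side, the following sketch computes the MongoDB-to-MySQL time ratio for each query. The timings are copied from the slides; the helper name `compareTimings` and the output layout are assumptions for illustration. Note that the MongoDB runs covered only the subset of lineitem records that could be converted, so the ratios are indicative at best.

```javascript
// Timings in seconds, as reported on the benchmark slides.
const timings = [
  { query: 1, mongodb: 6.1, mysql: 0.2 },
  { query: 8, mongodb: 1.6, mysql: 0.1 },
  { query: 15, mongodb: 0.7, mysql: 0.4 },
  { query: 20, mongodb: 1.1, mysql: 174.4 },
  { query: 21, mongodb: 6.2, mysql: 5.5 },
  { query: 22, mongodb: 7.6, mysql: 0.8 },
];

// For each query, compute how many times longer MongoDB took than MySQL
// (a ratio below 1 means MongoDB was faster) and name the faster system.
function compareTimings(rows) {
  return rows.map((row) => ({
    query: row.query,
    ratio: row.mongodb / row.mysql,
    faster: row.mongodb < row.mysql ? "MongoDB" : "MySQL",
  }));
}

for (const row of compareTimings(timings)) {
  console.log(`Query ${row.query}: MongoDB/MySQL = ${row.ratio.toFixed(2)}, faster: ${row.faster}`);
}
```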
  16. Discussion & Conclusion
      • MongoDB was slower than MySQL in five of the six queries (all except Query 20)
        o Design problem
        o Aggregation framework problem
        o No standard query language
        o Server-side query processing is not in the nature of NoSQL
        o Complex SQL cannot be converted easily
      • Only suitable for applications such as:
        o Business card database
        o Web blog
        o Applications without complex transactions
  17. Demonstration