Scalable XQuery Processing with Zorba on top of MongoDB


Over the past few years, the NoSQL movement has produced a variety of open-source document stores. Most of them focus on high availability and horizontal scalability, and are designed to run on commodity hardware. These products have gained significant traction in the industry for storing large amounts of flexible data (mostly JSON). In the meantime, XQuery has evolved into a standardized, full-fledged programming language for XML with native support for complex queries, indexes, updates, full-text search, and scripting. Moreover, JSON has recently been added to the language as a first-class data type. As of today, XQuery is without doubt the most robust and productive technology for processing flexible data.
The aim of this talk is to showcase the benefits of integrating the Zorba XQuery Processor with MongoDB. We will introduce the 28msec platform, which seamlessly stores, indexes, and manages flexible data entirely in XQuery. The data itself is stored in MongoDB. The platform leverages MongoDB’s indexes, sharding, and consistency guarantees to scale out horizontally. The talk will conclude with a benchmark of the platform and a discussion of the prospects of the outlined approach.

Published in: Technology

  1. Scalable XQuery Processing: Zorba Meets MongoDB. William Candillon {candillon@28msec.com}
  2. Two Drivers
  3. Flexible Data
  4. Scalability
  5. MongoDB and CouchBase compared with BaseX and eXist-db:
                                    MongoDB, CouchBase   BaseX, eXist-db
     Flexible Data
       Standardized Query Language  X                    ✔
       Modern Query Processing      X                    ✔
       Typing                       X                    ✔
     Scalability
       High Availability            ✔                    X
       Sharding                     ✔                    X
       Available as a Service       ✔                    X
  6. What can XML contribute to JSON Datastores?
  7. A Standardized, Rock Solid Query Language
  8. JSONiq - The SQL of NoSQL
  9. JSONiq
     • Open Specification: jsoniq.org
     • Extension of the mature XQuery for JSON
       - Joins, Group-by, Filters, Search...
     • Leverage the complete XQuery Family
       - Scripting, Updates, Full-Text
     • Standardized Query Language
       - Run the same code across multiple JSON stores
  10. JSONiq - MongoDB Connector: http://28.io/mongodb
  11. What can JSON datastores contribute to XML?
  12. A Distributed and Scalable Store
  13. The Goal (chart: Scalability & Performance plotted against Depth of functionality; systems shown: memcached, key/value, MongoDB, RDBMS, XML DB)
  14. The Goal: 28msec - XQuery on top of MongoDB (chart: Scalability & Performance plotted against Depth of functionality; systems shown: memcached, key/value, MongoDB, 28msec, RDBMS, XML DB)
  15. Meet Zorba
      • Open Source XQuery Processor
        - Apache 2 License
        - Contributors: Oracle, 28msec, FLWOR Foundation
      • The Complete Family
        - XQuery 3.0, Updates, Full-Text, Scripting, JSONiq
        - XQuery Data Definition Facility
      • Pluggable Store API
        - Run Zorba on your own persistence layer
  16. Zorba Architecture
  17. Meet MongoDB
      • Open Source JSON Document Store
        - License AGPL 3.0
      • Focus on scalability
        - Replication across multiple availability zones
        - Sharding
        - Atomic updates on documents
      • Available as a service
        - MongoHQ, MongoLab
  18. MongoDB Deployment Example (diagram labels: two App Servers, two MongoS, Config Servers C1, C2, C3, and Shard1, Shard2, Shard3 with MongoD replica sets)
  19. The Goal (diagram labels: Zorba with Runtime, XDM, Collections, Indexes; MongoDB with MongoS, BSON, Collections, Indexes)
  20. The Goal: Seamless XQuery Integration into MongoDB (diagram labels: Zorba with Runtime, XDM, Collections, Indexes; MongoDB with MongoS, BSON, Collections, Indexes)
  21. Application Example
  22. Application Example
      • Fetching sports news from XMLTeam.com
      • Stored and indexed on MongoDB
      • 1 million documents and counting
      • Entirely built in XQuery from backend to frontend
      • 1k LOC, 1 developer, 1 week of work
  23. Collection Declarations
        declare collection sports:docs as document-node();
  24. Collection Declarations (diagram: flow of "declare collection ..." from Zorba to MongoDB)
      1. Compile (Zorba Compiler and Runtime)
      2. createCollection(QName) (Zorba Store API)
      3. Create Collection (MongoDB)
  25. Index Declarations
        declare %an:value-range index sports:by-datetime
          on nodes db:collection(xs:QName("sports:docs"))
          by ./sports-content/sports-metadata/@date-time;
  26. Index Declarations (diagram: flow of "declare index ..." from Zorba to MongoDB)
      1. Compile (Zorba Compiler and Runtime)
      2. createIndex(qname, ordpath, keys) (Zorba Store API)
      3. Create Index (MongoDB)
  27. Insert Nodes
        let $uri := "http://xmlteam.com/..."
        let $doc := http:get($uri)
        return db:insert-nodes($sports:docs, $doc)
  28. Insert Nodes (diagram: flow of "db:insert-nodes(...)" from Zorba to MongoDB)
      1. Process (Zorba Compiler and Runtime)
      2. insertNode(qname, xdm) (Zorba Store API)
      3. Insert BSON (MongoDB)
  29. MongoDB Store Layer
      • Direct XQuery to MongoDB mapping
        - Collections
        - Indexes
      • Converts XDM to BSON
      • Inherits MongoDB consistency model
  30. Request Processing on 28msec (diagram with steps numbered 1 to 9; labels: HTTP Client, ELB, Availability Zone 1, Sausalito Request Handler, Zorba Processor, Store, MongoDB holding Compiled Code and Data)
  31. Scaling Out (chart: average response time in ms, 0 to 1000, against number of concurrent requests, 10 to 150; series: 2 App Servers, 4 App Servers)
  32. XQuery on Top of MongoDB
      • Seamless Integration of XQuery with MongoDB
        - XDM to BSON
        - Collections and indexes mapping
        - Atomicity per document
      • 28msec - XQuery Platform on top of MongoDB
        - Deploy your XQuery apps in 1-click
        - Scale up & down automatically
  33. Take Away
      • Two Drivers
        - Flexible Data
        - Scalability
      • Two Champions
        - XQuery for Flexible Data
        - JSON Stores for Scalability
      • Two Contributions
        - JSONiq: The SQL of NoSQL
        - XQuery Platform on top of MongoDB
  34. Thank You! Questions?
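
The store-layer flow on slides 23 to 29 (declare a collection, declare an index, insert nodes, with XDM converted to BSON) can be sketched roughly in Python. This is an illustration only, not the 28msec implementation: `SketchStore`, `element_to_document`, and `lookup` are hypothetical names, the element-to-dictionary encoding is an assumption (the slides do not show the actual XDM-to-BSON encoding), the sample date-time values are made up, and an in-memory dictionary stands in for MongoDB.

```python
import xml.etree.ElementTree as ET


def element_to_document(elem):
    """Map an XML element to a nested dict that a BSON encoder could store.

    Assumed encoding (not from the talk): element name under "name",
    attributes under "attrs", text under "text", children under "children".
    """
    return {
        "name": elem.tag,
        "attrs": dict(elem.attrib),
        "text": (elem.text or "").strip(),
        "children": [element_to_document(child) for child in elem],
    }


def lookup(document, key_path):
    """Follow child element names from the document node, then read an attribute.

    key_path is (["child", "grandchild"], "attribute-name"); None if absent.
    """
    steps, attribute = key_path
    node = document
    for step in steps:
        matches = [c for c in node["children"] if c["name"] == step]
        if not matches:
            return None
        node = matches[0]
    return node["attrs"].get(attribute)


class SketchStore:
    """In-memory stand-in for the store layer; plain dicts replace MongoDB."""

    def __init__(self):
        self.collections = {}  # collection name -> list of stored documents
        self.indexes = {}      # index name -> (collection name, key path)

    def create_collection(self, qname):
        # Corresponds to step 2/3 of slide 24: createCollection(QName).
        self.collections.setdefault(qname, [])

    def create_index(self, index_qname, collection_qname, key_path):
        # Corresponds to slide 26; here the index is only a recorded key path.
        self.indexes[index_qname] = (collection_qname, key_path)

    def insert_node(self, qname, xml_text):
        # Corresponds to slide 28: parse the XML and store the converted form.
        root = element_to_document(ET.fromstring(xml_text))
        document = {"name": "#document", "attrs": {}, "text": "", "children": [root]}
        self.collections[qname].append(document)
        return document

    def probe_range(self, index_qname, low, high):
        """Return documents whose key lies in [low, high], ordered by key.

        Keys are compared as strings, which is only correct for date-times
        written in one fixed ISO 8601 layout (an assumption of this sketch).
        """
        collection_qname, key_path = self.indexes[index_qname]
        keyed = [(lookup(d, key_path), d) for d in self.collections[collection_qname]]
        hits = [(k, d) for k, d in keyed if k is not None and low <= k <= high]
        return [d for _, d in sorted(hits, key=lambda pair: pair[0])]


# Usage mirroring the sports example; names follow slides 23, 25 and 27.
KEY_PATH = (["sports-content", "sports-metadata"], "date-time")
store = SketchStore()
store.create_collection("sports:docs")
store.create_index("sports:by-datetime", "sports:docs", KEY_PATH)
store.insert_node(
    "sports:docs",
    '<sports-content><sports-metadata date-time="2012-05-01T10:00:00"/></sports-content>',
)
```

A real store layer would delegate the range probe to a MongoDB index rather than scanning the collection; the scan here only keeps the sketch self-contained.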
