Big data and polyglot solutions
1. Polyglot solution in Big Data:
‘Polyglot’ describes a software application that uses multiple resources (here, data stores) to perform its operations and achieve results.
• It refers to creating a hybrid data store that combines the functionality of RDBMS and NoSQL data tools to achieve a flexible and efficient data model.
[Diagram: a web, mobile, or client application sends calls for relational and transactional data to SQL Server, and calls for unstructured and scalable data to a NoSQL server cluster (DynamoDB, Cassandra).]
2. Implementing Polyglot solution
• Analyzing the requirements for business continuity is key.
• Select the database management solution that fits each need.
• Approaches to implementing a polyglot solution in an organization include multiple lanes, polyglot mapper, nested database, and omnipotent database.
3. Example – eCommerce Application
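[Diagram: the functions of an eCommerce application are spread across several data stores.]
Data stores: MongoDB, Redis, DynamoDB, Cassandra, MySQL, HBase, Neo4j
Application functions: product catalog, shopping cart, social profile, social graph, email and feed messages, payment process, audit and activity log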
4. Polyglot solution – Multiple lanes:
Dispatches the workflow into multiple separate persistence lanes.
Advantages:
• Easy to implement.
• Works for all databases.
Disadvantages:
• Data needs to be pre-partitioned by domain.
• Cross-lane queries are hard to implement.
[Diagram: MySQL and MongoDB]
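The multiple-lanes idea can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for MySQL, a plain dict stands in for MongoDB, and the function names (`save_payment`, `save_cart`, `order_summary`) are hypothetical. It shows each domain writing to its own lane, and why a cross-lane query has to be joined by hand in the application.

```python
import sqlite3

# Relational lane: an in-memory SQLite database stands in for MySQL (assumption).
relational = sqlite3.connect(":memory:")
relational.execute(
    "CREATE TABLE payments (order_id INTEGER PRIMARY KEY, amount REAL)"
)

# Document lane: a plain dict stands in for MongoDB (assumption).
documents = {}


def save_payment(order_id, amount):
    """Payment domain: always written to the relational lane."""
    with relational:
        relational.execute(
            "INSERT INTO payments VALUES (?, ?)", (order_id, amount)
        )


def save_cart(order_id, cart):
    """Cart domain: always written to the document lane."""
    documents[order_id] = cart


def order_summary(order_id):
    """Cross-lane query: the application reads both lanes and joins manually."""
    row = relational.execute(
        "SELECT amount FROM payments WHERE order_id = ?", (order_id,)
    ).fetchone()
    return {
        "order_id": order_id,
        "amount": row[0] if row else None,
        "cart": documents.get(order_id),
    }


save_payment(1, 19.99)
save_cart(1, {"items": ["book", "pen"]})
summary = order_summary(1)
print(summary)
```

Because the data is partitioned by domain up front, each write path stays trivial; the cost shows up in `order_summary`, which must know about both lanes.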
5. Polyglot solution – Polyglot Mapper:
A particular component of the application handles multiple databases in parallel through mapping.
Advantages:
• Single object model.
• Cross-lane queries are possible.
Disadvantages:
• Only a few mapper products support both NoSQL and RDBMS.
[Diagram: MySQL and MongoDB]
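A rough sketch of a polyglot mapper follows. It is not modeled on any particular mapper product: the `Product` and `PolyglotMapper` classes are hypothetical, in-memory SQLite stands in for MySQL, and a dict stands in for MongoDB. For simplicity the two stores are accessed one after the other rather than in parallel. The point is that the application sees one object model while the mapper decides which field lives where.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Product:
    """The single object model the application works with."""
    sku: str
    price: float                                    # stored relationally
    attributes: dict = field(default_factory=dict)  # stored as a document


class PolyglotMapper:
    """Hypothetical mapper that splits one object across two stores."""

    def __init__(self):
        self.sql = sqlite3.connect(":memory:")  # stand-in for MySQL
        self.sql.execute(
            "CREATE TABLE product (sku TEXT PRIMARY KEY, price REAL)"
        )
        self.docs = {}                          # stand-in for MongoDB

    def save(self, product):
        with self.sql:
            self.sql.execute(
                "INSERT OR REPLACE INTO product VALUES (?, ?)",
                (product.sku, product.price),
            )
        self.docs[product.sku] = dict(product.attributes)

    def load(self, sku):
        row = self.sql.execute(
            "SELECT price FROM product WHERE sku = ?", (sku,)
        ).fetchone()
        if row is None:
            return None
        return Product(sku, row[0], dict(self.docs.get(sku, {})))

    def cheaper_than(self, limit, **attrs):
        """Cross-store query: price filter (relational) plus attribute filter (document)."""
        rows = self.sql.execute(
            "SELECT sku FROM product WHERE price < ? ORDER BY sku", (limit,)
        ).fetchall()
        return [
            sku for (sku,) in rows
            if all(self.docs.get(sku, {}).get(k) == v for k, v in attrs.items())
        ]


mapper = PolyglotMapper()
mapper.save(Product("A1", 5.0, {"color": "red"}))
mapper.save(Product("B2", 7.5, {"color": "blue"}))
mapper.save(Product("C3", 50.0, {"color": "red"}))
red_and_cheap = mapper.cheaper_than(10, color="red")
print(red_and_cheap)
```

Here the cross-lane query lives inside the mapper, so callers never touch either store directly.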
6. Polyglot solution – Nested Database:
The primary database backend maps to a secondary database.
Advantages:
• Polyglot persistence is invisible.
• Cross-database queries are possible.
Disadvantages:
• Not many databases support this model.
[Diagram: MySQL and MongoDB]
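One way to picture the nested model is a primary database that can reach into a secondary store while answering a query. The sketch below is an assumption-laden illustration, not how any specific product does it: in-memory SQLite plays the primary (in place of MySQL), a dict plays the secondary (in place of MongoDB), and the hypothetical SQL function `doc_field` is registered through Python's `sqlite3.Connection.create_function`.

```python
import json
import sqlite3

# Secondary store: a dict of documents stands in for MongoDB (assumption).
secondary = {
    "p1": {"name": "Notebook", "tags": ["paper", "office"]},
    "p2": {"name": "Pen", "tags": ["office"]},
}


def doc_field(doc_id, key):
    """Called by the primary database to read one field from the secondary store."""
    value = secondary.get(doc_id, {}).get(key)
    # SQL functions return scalars, so lists and dicts are JSON-encoded.
    if isinstance(value, (str, int, float, type(None))):
        return value
    return json.dumps(value)


# Primary store: in-memory SQLite stands in for MySQL (assumption).
primary = sqlite3.connect(":memory:")
primary.create_function("doc_field", 2, doc_field)
primary.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, product_id TEXT, qty INTEGER)"
)
primary.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(1, "p1", 3), (2, "p2", 10)]
)

# The application sends a single SQL query to the primary database only.
rows = primary.execute(
    "SELECT id, doc_field(product_id, 'name'), qty FROM orders ORDER BY id"
).fetchall()
print(rows)
```

The application issues ordinary SQL and never addresses the secondary store, which is what makes the polyglot persistence invisible and the cross-database query possible.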
7. Polyglot solution – Omnipotent database:
The database uses multiple relational and non-relational storage engines in parallel in its backend.
Advantages:
• Polyglot persistence is invisible.
• Backup and restore are possible.
Disadvantages:
• Only a few products support this model.
[Diagram labels: MySQL, MongoDB, FE, BE]
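The omnipotent-database model can be sketched as one front end over pluggable storage engines. Everything named here (`RelationalEngine`, `DocumentEngine`, `OmnipotentDatabase`, `create_collection`) is hypothetical; in-memory SQLite and a dict stand in for the relational and non-relational engines, and the engines are called sequentially rather than in parallel. The sketch also shows why backup and restore become possible: the front end can snapshot every engine in one call.

```python
import copy
import sqlite3


class RelationalEngine:
    """Back-end engine for tabular data; in-memory SQLite stands in for MySQL."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

    def put(self, key, value):
        with self.conn:
            self.conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))

    def get(self, key):
        row = self.conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        return row[0] if row else None

    def dump(self):
        return dict(self.conn.execute("SELECT k, v FROM kv").fetchall())

    def load(self, data):
        with self.conn:
            self.conn.execute("DELETE FROM kv")
            self.conn.executemany("INSERT INTO kv VALUES (?, ?)", list(data.items()))


class DocumentEngine:
    """Back-end engine for nested documents; a dict stands in for MongoDB."""

    def __init__(self):
        self.docs = {}

    def put(self, key, value):
        self.docs[key] = copy.deepcopy(value)

    def get(self, key):
        return copy.deepcopy(self.docs.get(key))

    def dump(self):
        return copy.deepcopy(self.docs)

    def load(self, data):
        self.docs = copy.deepcopy(data)


class OmnipotentDatabase:
    """Single front end (FE); each collection is bound to one back-end (BE) engine."""

    _ENGINES = {"relational": RelationalEngine, "document": DocumentEngine}

    def __init__(self):
        self._collections = {}

    def create_collection(self, name, engine):
        self._collections[name] = self._ENGINES[engine]()

    def put(self, collection, key, value):
        self._collections[collection].put(key, value)

    def get(self, collection, key):
        return self._collections[collection].get(key)

    def backup(self):
        """One snapshot covering every engine."""
        return {name: eng.dump() for name, eng in self._collections.items()}

    def restore(self, snapshot):
        for name, data in snapshot.items():
            self._collections[name].load(data)


db = OmnipotentDatabase()
db.create_collection("prices", engine="relational")
db.create_collection("profiles", engine="document")
db.put("prices", "A1", "5.00")
db.put("profiles", "alice", {"likes": ["books"]})
snapshot = db.backup()
db.put("prices", "A1", "9.99")   # change made after the backup
db.restore(snapshot)             # rolls every engine back together
restored_price = db.get("prices", "A1")
print(restored_price)
```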
8. Advantages of Polyglot solution
• Leverages the strengths of multiple data stores.
• A synchronization engine integrates multiple data stores for a single application.
• Uses highly optimized data storage formats.
• Provides strong query performance, since each query runs against a store suited to it.
• Data storage is more scalable.
• Most NoSQL solutions are open source.
9. Disadvantages of Polyglot solution
• The architecture and system setup become significantly more complex.
• Impedance mismatch between databases due to different data models and query languages.
• Different backup, restore, and maintenance approaches pose greater challenges for the implemented systems.
10. Recommendations
• Select the data store based on requirements and
environment.
• Minimize the number of stores; this significantly reduces maintenance and implementation costs.