Rules of thumb for translating your SQL schema into a suitable MongoDB schema. Learn about the practical aspects of transforming a SQL schema to MongoDB.
Learn about the difference between the Explicit and Implicit Schema and Data Islands.
6. Implicit Schema
❖ The SQL Schema as expressed by the following
operations and their associated metadata
❖ Insert operations
❖ Update operations
❖ Select operations
❖ Join relationships
13. Important Notes
❖ Foreign key relationships are, in most cases, not
representative of application-level queries
❖ Cannot discover the degree of mutability by looking at the
SQL schema in isolation
❖ Cannot know the average size of n in the 1:n
relationships
15. Implicit Schema
❖ The Implicit Schema represents the SQL operations
executed against the relational schema (Application
Schema)
❖ Can differ greatly from the foreign key relationships
❖ Expresses read vs. write ratios for tables
❖ Can be used to deduce entity mutability
❖ Can be used to estimate n in the 1:n relationships
16. Example - SELECT
❖ SELECT * FROM orders, orderdetails, products WHERE
…. [1000]
❖ SELECT * FROM offices, employees WHERE … [100]
❖ SELECT * FROM productlines, products WHERE … [2000]
❖ SELECT * FROM products WHERE … [4000]
❖ SELECT * FROM employees, customers WHERE … [200]
❖ SELECT * FROM customers, orders WHERE … [200]
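As a rough sketch of how a log like this could be turned into a weighted map of application-level relationships, the snippet below assumes the bracketed number is how often each statement was observed. The function name `join_frequencies` and the list-of-pairs log representation are illustrative assumptions, and the regular expression is deliberately naive: it only understands comma-style joins between FROM and WHERE.

```python
import re
from collections import Counter

# (statement, observed count) pairs copied from the slide. Reading the
# bracketed number as a frequency is an assumption.
LOG = [
    ("SELECT * FROM orders, orderdetails, products WHERE …", 1000),
    ("SELECT * FROM offices, employees WHERE …", 100),
    ("SELECT * FROM productlines, products WHERE …", 2000),
    ("SELECT * FROM products WHERE …", 4000),
    ("SELECT * FROM employees, customers WHERE …", 200),
    ("SELECT * FROM customers, orders WHERE …", 200),
]

# Captures the comma-separated table list between FROM and WHERE.
FROM_CLAUSE = re.compile(r"\bFROM\s+(.*?)\s+WHERE\b", re.IGNORECASE | re.DOTALL)


def join_frequencies(log):
    """Map each set of tables read together to its total observed count."""
    freq = Counter()
    for sql, count in log:
        match = FROM_CLAUSE.search(sql)
        if not match:
            continue  # naive: statements without FROM ... WHERE are skipped
        tables = frozenset(t.strip().lower() for t in match.group(1).split(","))
        freq[tables] += count
    return freq


for tables, count in join_frequencies(LOG).most_common():
    print(sorted(tables), count)
```

Table sets that are read together often are candidates for being modelled together, whether or not a foreign key links them.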
17. What We Can Learn
❖ The frequency of the SQL operations
❖ The Application Schema relationships, by studying the join
relationships
❖ If the logs include the number of rows returned, we can
estimate the size of n in the 1:n relationships
❖ We can also calculate the rate of growth of n over
time
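A minimal sketch of the last two points, assuming each logged join fetches the children of a single parent, so the rows returned approximate n. The row counts are made-up numbers grouped by month, and both function names are hypothetical.

```python
from statistics import mean


def average_n(rows_returned):
    """Average number of child rows returned per parent lookup."""
    return mean(rows_returned)


def growth_per_period(avg_n_by_period):
    """Average change in n between consecutive periods."""
    deltas = [b - a for a, b in zip(avg_n_by_period, avg_n_by_period[1:])]
    return mean(deltas) if deltas else 0.0


# Hypothetical row counts for one join, one inner list per month.
rows_by_month = [
    [8, 10, 12],
    [10, 12, 14],
    [13, 14, 15],
]

monthly_n = [average_n(month) for month in rows_by_month]
print(monthly_n, growth_per_period(monthly_n))
```

A growth figure near zero suggests n settles; a steadily positive one suggests the relationship keeps growing.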
21. Single Item Mutability Rate (SIMR)
❖ How much an entity mutates in a given time period
❖ A low mutation rate
❖ Entity reaches a stable state and is a good candidate for rolling up
into a single document
❖ Duplication of data is OK as the document is a snapshot in time
❖ A high mutation rate
❖ Entity does not reach a stable state and keeps mutating and might
not be a good candidate for rollup
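To make "rolling up into a single document" concrete, here is a small sketch that embeds child rows inside their parent. The column names (`order_id`, `status`, `product`, `quantity`) and the helper `roll_up_order` are illustrative assumptions, not taken from the talk.

```python
def roll_up_order(order, details):
    """Embed an order's detail rows inside the order as one document."""
    document = dict(order)
    document["details"] = [
        # The foreign key is redundant once the row is embedded.
        {key: value for key, value in row.items() if key != "order_id"}
        for row in details
        if row["order_id"] == order["order_id"]
    ]
    return document


# Hypothetical rows as they might come back from two SQL tables.
order = {"order_id": 1, "status": "Shipped"}
details = [
    {"order_id": 1, "product": "A", "quantity": 2},
    {"order_id": 1, "product": "B", "quantity": 1},
    {"order_id": 2, "product": "C", "quantity": 5},
]

print(roll_up_order(order, details))
```

The resulting dictionary is the shape that would be stored as one MongoDB document, which is why a stable, rarely mutating entity suits it.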
22. Order Life Span Example
❖ An order gets created at T=0
❖ 10 order details are created at T+1
❖ Order is filled and order record updated at T+10
❖ Order is shipped and order record updated at T+15
❖ Past T+15 there are no more mutations
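One possible way to put a number on the mutation rate for this order is to count mutations per time window and check whether the most recent windows are quiet. The window length, the horizon, the "two quiet windows" rule, and treating the ten order details as a single mutation event are all assumptions made for this sketch.

```python
def mutations_per_window(event_times, window, horizon):
    """Count mutations in each consecutive window of length `window`."""
    buckets = (horizon + window - 1) // window  # round up
    counts = [0] * buckets
    for t in event_times:
        if 0 <= t < horizon:
            counts[t // window] += 1
    return counts


def is_stable(counts, quiet_windows=2):
    """Treat the entity as stable if its last windows saw no mutations."""
    return all(c == 0 for c in counts[-quiet_windows:])


# Mutation times from the order example: created, details added,
# filled, shipped.
order_events = [0, 1, 10, 15]

counts = mutations_per_window(order_events, window=10, horizon=50)
print(counts, is_stable(counts))
```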
23. Order Life Span Example
T = 0: Order created
T = 1: 10 order details added
T = 10: Order fulfilled
T = 15: Order shipped
25. Customer Life Span Example
❖ Customer[1:n] -> Payment relationship
❖ A payment created at T=5, T=50
❖ Customer[1:n] -> Orders relationship
❖ An order created at T=0, T=15, T=20, T=45
❖ Unbounded relationships
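A short sketch of why these relationships count as growing: it computes how many child rows exist for one customer at a few checkpoint times, using the creation times listed above. The checkpoints are arbitrary and the function name `n_over_time` is hypothetical.

```python
from bisect import bisect_right


def n_over_time(created_at, checkpoints):
    """Number of child rows that exist at each checkpoint time."""
    times = sorted(created_at)
    return [bisect_right(times, t) for t in checkpoints]


# Creation times taken from the customer example.
order_times = [0, 15, 20, 45]
payment_times = [5, 50]

checkpoints = [10, 30, 50]  # arbitrary sampling points

print("orders:  ", n_over_time(order_times, checkpoints))
print("payments:", n_over_time(payment_times, checkpoints))
```

Unlike the order, whose mutations stop, n here keeps increasing for as long as the customer stays active.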
26. Customer Life Span Example
T = 0: Order created
T = 5: Payment created
T = 15: Order created
T = 20: Order created
28. And The Rest?
❖ The recursive relationship for Employees makes it
unsuitable for rolling up
❖ The same recursive relationship also affects the
offices->employees relationship
❖ The ProductLines -> Products relationship is big and
possibly unbounded
30. Levels Of Information
1. SQL Schema + Foreign Key Relationships
❖ Only has the explicit relationships and table
definitions
2. SQL Operations Logs (MySQL general log)
❖ Contains only SQL operations (no result set size)
3. Full SQL Operations Logs (MySQL slow log)
❖ Contains SQL operations (result set size, latencies)
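For the third level, the sketch below pulls the result set size and latency out of slow-log text. It assumes the plain-text file format in which each entry carries a `# Query_time: … Lock_time: … Rows_sent: … Rows_examined: …` header line, handles single-line statements only, and uses a made-up sample entry; `parse_slow_log` is a hypothetical helper, not part of any MySQL tooling.

```python
import re

# Header line that precedes each statement in the slow query log.
HEADER = re.compile(
    r"^# Query_time:\s*(?P<query_time>[\d.]+)\s+"
    r"Lock_time:\s*(?P<lock_time>[\d.]+)\s+"
    r"Rows_sent:\s*(?P<rows_sent>\d+)\s+"
    r"Rows_examined:\s*(?P<rows_examined>\d+)"
)


def parse_slow_log(lines):
    """Yield (sql, rows_sent, query_time) for each single-line entry."""
    stats = None
    for line in lines:
        header = HEADER.match(line)
        if header:
            stats = header
            continue
        text = line.strip()
        if stats is None or not text or text.startswith("#"):
            continue
        if text.upper().startswith(("SET ", "USE ")):
            continue  # bookkeeping statements written by the server
        yield text.rstrip(";"), int(stats["rows_sent"]), float(stats["query_time"])
        stats = None


# Made-up entry for illustration only.
SAMPLE = """
# Time: 2015-01-01T00:00:00.000000Z
# User@Host: app[app] @ localhost []
# Query_time: 0.012000  Lock_time: 0.000100 Rows_sent: 10  Rows_examined: 120
SET timestamp=1420070400;
SELECT * FROM orders, orderdetails WHERE orders.id = 1;
"""

for entry in parse_slow_log(SAMPLE.splitlines()):
    print(entry)
```

The `rows_sent` value is what feeds the estimate of n, and `query_time` is the latency figure.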
31. Analysis Steps
1. Use SELECTs with joins to draw the new relationships
2. Establish the average n for each join relationship
3. Establish the mutation rate over time
❖ Does the relationship go static?
❖ Are the relationships unbounded? (growing n)
32. Algorithms
1. Roll up relationships
1. If the entity relationship reaches a static state
2. If the rate of growth of n is slow enough for the relationship to be
considered static (analyst discretion)
2. Don’t roll up relationships
1. If the rate of mutability is high
2. If the average size of n is huge
3. If the mutation rate of the entity is large
4. If an entity has a recursive relationship
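The rules above could be written down as a single decision function. This is only a sketch: the `Relationship` fields, the threshold defaults, and the example figures (other than n = 10 for order details) are placeholders standing in for analyst discretion, not values from the talk.

```python
from dataclasses import dataclass


@dataclass
class Relationship:
    name: str
    reaches_static_state: bool
    average_n: float
    n_growth_per_period: float
    mutation_rate: float  # mutations per period
    recursive: bool = False


def should_roll_up(rel, max_n=100, max_growth=1.0, max_mutation_rate=1.0):
    """Apply the roll-up rules; all thresholds are placeholder values."""
    # "Don't roll up" conditions are checked first.
    if rel.recursive:
        return False
    if rel.average_n > max_n or rel.mutation_rate > max_mutation_rate:
        return False
    # "Roll up" conditions: static, or growing slowly enough to count as static.
    return rel.reaches_static_state or rel.n_growth_per_period <= max_growth


# Hypothetical measurements for two relationships from the examples.
order_details = Relationship("orders->orderdetails", True, 10, 0.0, 0.0)
employees = Relationship("employees->employees", False, 5, 0.1, 0.1, recursive=True)

for rel in (order_details, employees):
    print(rel.name, should_roll_up(rel))
```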
38. Tooling
1. We are working on building tooling to help
1. Analyze your relational schema
2. Propose schema recommendations
3. Load and transform your data
2. Push the whole subject of schema transformation
forward by doing something never done before
39. Tons of Open Questions
1. Are operation latencies important for recommending
a schema?
2. Can one quantify a schema recommendation (is
recommendation A better than B, and if so, why?)
3. Can machine learning produce better
recommendations?
4. … etc.
40. Come Work With Us
Are you ready to build a new team, to build a brand new product, and to create
a whole new category of products for the most popular NoSQL database?
MongoDB, the leader in NoSQL databases, is building a new team in Dublin.
This team will develop products that help our customers adopt our technology
by analyzing their legacy relational systems. We need someone who is going to
participate in the research, partner with our staff engineers who are
prototyping solutions, write production-ready code, and build a team.
This person will report to the Director of Integrations at MongoDB.
http://grnh.se/ge1rfp1