Tremendous innovation has taken place in the database space in the recent past. In the beginning, we had traditional file systems and hierarchical databases. We then graduated to relational database management systems, followed by database appliances and in-memory databases. Then came the data explosion. Hadoop was born to alleviate the challenges that followed, but it too failed to meet the new-age demands of business: batch-oriented processing was not good enough for the speed businesses were looking for. The need of the hour is a near-real-time solution that can process not only transactional but also analytical workloads at the same time. The new-age NoSQL databases bring all of this together. However, we need to unlearn a lot while designing applications on NoSQL, and set aside our relational thinking while building such solutions. In this session, we will talk about best practices for designing performant data models, along with tools and accelerators that lend predictability, especially while migrating from traditional RDBMS to modern NoSQL models. This will be substantiated with our experience developing such data models for clients across verticals.
Michael Poremba, Director, Data Architecture at Practice Fusion – MongoDB
Practice Fusion, the largest cloud-based electronic health records (EHR) system in the US, used by more than 100,000 health care providers managing over 100 million patient medical records, faced the need to move their four terabyte HIPAA audit reporting system off of a relational database. Practice Fusion selected MongoDB for their new HIPAA audit reporting system. Learn how the team designed and implemented a highly scalable system for storing protected health information in the cloud. This case study covers the move from a relational database to a document database; data modeling in JSON; sharding strategies; indexing; sharded cluster design supporting high availability and disaster recovery; performance testing; and data migration of billions of historical audit records.
The document discusses MongoDB and how it allows storing data in flexible, document-based collections rather than rigid tables. Some key points:
- MongoDB uses a flexible document model that allows embedding related data rather than requiring separate tables joined by foreign keys.
- It supports dynamic schemas that allow fields within documents to vary unlike traditional SQL databases that require all rows to have the same structure.
- Aggregation capabilities allow complex analytics to be performed directly on the data without requiring data warehousing or manual export/import like with SQL databases. Pipelines of aggregation operations can be chained together.
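To make the "chained pipeline" idea concrete, here is a minimal pure-Python simulation of the semantics of a `$match` stage followed by a `$group`/`$sum` stage. The collection, field names, and data are illustrative assumptions, not taken from the document; in MongoDB itself the pipeline would be passed to `aggregate()` as a list of stage documents.

```python
# Pure-Python simulation of a $match -> $group aggregation pipeline.
# All collection/field names and data are illustrative assumptions.

orders = [
    {"status": "shipped", "region": "EU", "total": 120},
    {"status": "shipped", "region": "US", "total": 80},
    {"status": "pending", "region": "EU", "total": 45},
    {"status": "shipped", "region": "EU", "total": 60},
]

def match(docs, predicate):
    """Mimics the $match stage: keep only documents satisfying the predicate."""
    return [d for d in docs if predicate(d)]

def group_sum(docs, key, value_field):
    """Mimics $group with a $sum accumulator, keyed on `key`."""
    totals = {}
    for d in docs:
        totals[d[key]] = totals.get(d[key], 0) + d[value_field]
    return totals

# Chain the stages, as an aggregation pipeline would.
shipped = match(orders, lambda d: d["status"] == "shipped")
revenue_by_region = group_sum(shipped, "region", "total")
print(revenue_by_region)  # {'EU': 180, 'US': 80}
```

The point of the sketch is that each stage consumes the previous stage's output, so analytics can run directly where the data lives instead of being exported to a warehouse first.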
Simplifying & accelerating application development with MongoDB's intelligent... – Maxime Beugnet
The document discusses MongoDB's Intelligent Operational Data Platform and how it allows developers to simplify application development. It highlights how MongoDB uses a document model which is more flexible than a relational database and allows for embedding of related data. MongoDB also provides features like multi-document transactions, full indexing capabilities, advanced aggregations, and change streams for building reactive applications in real-time.
Slidedeck presented at http://devternity.com/ on MongoDB internals. We review the usage patterns of MongoDB, the different storage engines and persistence models, as well as the definition of documents and general data structures.
MongoDB .local Houston 2019: Best Practices for Working with IoT and Time-ser... – MongoDB
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
Common components of an IoT solution
The challenges involved with managing time-series data in IoT applications
Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
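One way to see why schema design affects memory and disk utilization is to compare two common time-series layouts: one document per reading versus one bucket document per device-hour. The sketch below is a rough illustration under invented names and data; JSON string length is used as a crude stand-in for storage size (real BSON sizes and index overhead differ).

```python
# Rough comparison of two illustrative time-series schemas: one document
# per reading vs. one bucket document per device-hour. Names and data are
# assumptions; JSON length is only a crude proxy for on-disk size.
import json

readings = [{"device": "sensor-1",
             "ts": f"2019-06-01T10:00:{s:02d}Z",
             "temp": 20.0 + s % 3} for s in range(60)]

# Schema A: one document per reading (repeats field names and device id
# 60 times, and implies 60 index entries on device/ts).
size_per_reading = sum(len(json.dumps(d)) for d in readings)

# Schema B: one bucket per device-hour; device id and hour stored once.
bucket = {"device": "sensor-1",
          "hour": "2019-06-01T10:00Z",
          "measurements": [{"ts": d["ts"], "temp": d["temp"]} for d in readings]}
size_bucketed = len(json.dumps(bucket))

print(size_per_reading, size_bucketed)
```

Even in this toy form, the bucketed layout comes out smaller because per-document overhead is amortized across the hour's measurements.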
MongoDB .local London 2019: Best Practices for Working with IoT and Time-seri... – MongoDB
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
• Common components of an IoT solution
• The challenges involved with managing time-series data in IoT applications
• Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
• How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
Evolving your Data Access with MongoDB Stitch – MongoDB
MongoDB Stitch is a platform that allows developers to build and deploy applications with MongoDB. It consists of four main services - QueryAnywhere for data access, Functions for server-side logic, Triggers for real-time notifications, and Mobile Sync for offline data synchronization. Stitch handles infrastructure concerns so developers can focus on writing code. It provides global data access, integrated authorization rules, and serverless hosting of business logic. This allows applications to be built more easily and deployed seamlessly across different platforms and locations.
Real-time big data analytics based on product recommendations case study – deep.bi
We started as an ad network. The challenge was to recommend the best product (out of millions) to the right person at a given moment (thousands of users within a second). We have delivered 5 billion ad views in the past 24 months. To put that in context: if we served 1 ad per second, it would take about 160 years to serve 5 billion ads.
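The scale claim checks out with a quick back-of-the-envelope calculation:

```python
# Sanity check: serving 5 billion ads at one per second really does take
# on the order of 160 years.
ads = 5_000_000_000
seconds_per_year = 60 * 60 * 24 * 365
years = ads / seconds_per_year
print(round(years, 1))  # prints 158.5
```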
So we needed a solution. SQL databases did not work. Popular NoSQL databases did not work. Standard data warehouse approaches (pre-aggregations, creating schemas) did not work either.
Rethinking all the problems posed by the huge data streams flowing to us every second, we built a complete solution based on open-source technologies and fresh, smart ideas from our engineering team. It is called deep.bi, and we now make it available to other companies.
deep.bi lets high-growth companies solve fast data problems by providing scalable, flexible and real-time data collection, enrichment and analytics.
It was built using:
- Node.js - API
- Kafka - collecting and distributing data
- Spark Streaming - ETL, data enrichments
- Druid - real-time analytics
- Cassandra - user events store
- Hadoop + Parquet + Spark - raw data store + ad-hoc queries
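The stack above implements a collect → enrich → aggregate flow at scale (Kafka for transport, Spark Streaming for enrichment, Druid for real-time rollups). As an in-process toy, under invented event names and reference data, the same flow looks like this:

```python
# Toy, in-process sketch of the collect -> enrich -> aggregate flow that
# Kafka / Spark Streaming / Druid implement at scale. Names and data are
# illustrative assumptions.

raw_events = [
    {"user": "u1", "product": "p9", "action": "view"},
    {"user": "u2", "product": "p9", "action": "click"},
    {"user": "u1", "product": "p3", "action": "click"},
]

def enrich(event, catalog):
    """Enrichment step: join each event with reference data (Spark's role)."""
    return {**event, "category": catalog.get(event["product"], "unknown")}

def rollup(events):
    """Real-time rollup: count events per (category, action), Druid-style."""
    counts = {}
    for e in events:
        key = (e["category"], e["action"])
        counts[key] = counts.get(key, 0) + 1
    return counts

catalog = {"p9": "shoes", "p3": "books"}
enriched = [enrich(e, catalog) for e in raw_events]
print(rollup(enriched))
```

The real system differs mainly in that each arrow is a distributed, fault-tolerant service rather than a function call.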
AWS re:Invent 2016: Building IoT Applications with AWS and Amazon Alexa (HLC304) – Amazon Web Services
Alexa, what is the Internet of Things? Now that technology is small enough to be embedded in everyday devices, Healthcare has an opportunity to exploit the extraordinary potential of connecting ordinary devices. In this presentation, we explain how to rapidly build an IoT system and how to drive the Cloud with your voice on an Amazon Echo. In addition to describing how to use Alexa, we explore using AWS IoT, Lambda, Amazon SNS, and DynamoDB.
Today’s highly connected world is flooding businesses with big and fast-moving data. The ability to trawl this data ocean and identify actionable insights can deliver a competitive advantage to any organization. The WSO2 Analytics Platform enables businesses to do just that by providing batch, real-time, interactive and predictive analysis capabilities all in one place.
In this tutorial we will
* Plug in the WSO2 Analytics Platform to some common business use cases
* Showcase the numerous capabilities of the platform
* Demonstrate how to collect data, analyze, predict and communicate effectively
* Demonstrate how it can analyze integration, security and IoT scenarios
Stick around till the end and you will walk away with the necessary skills to create a winning data strategy for your organization to stay ahead of its competition.
Cloud Adoption in Regulated Financial Services - SID328 - re:Invent 2017 – Amazon Web Services
Macquarie, a global provider of financial services, identified early on that it would require strong partnership between its business, technology and risk teams to enable the rapid adoption of AWS cloud technologies. As a result, Macquarie built a Cloud Governance Platform to enable its risk functions to move as quickly as its development teams. This platform has been the backbone of Macquarie’s adoption of AWS over the past two years and has enabled Macquarie to accelerate its use of cloud technologies for the benefit of clients across multiple global markets. This talk will outline the strategy that Macquarie embarked on, describe the platform they built, and provide examples for other organizations who are on a similar journey.
Dublin Ireland Spark Meetup October 15, 2015 – eddiebaggott
This document discusses Apache Spark DataFrames and their use for fraud detection. It provides an overview of DataFrames, how they can be created from various data sources and used for data aggregation, filtering, and querying. Examples are given for creating DataFrames from text files, performing SQL queries, and using DataFrames for profiling, charting, and data quality checks. A variety of fraud detection applications for DataFrames are also outlined.
This document discusses using Azure HDInsight for big data applications. It provides an overview of HDInsight and describes how it can be used for various big data scenarios like modern data warehousing, advanced analytics, and IoT. It also discusses the architecture and components of HDInsight, how to create and manage HDInsight clusters, and how HDInsight integrates with other Azure services for big data and analytics workloads.
- MongoDB is well-suited for IoT applications due to its ability to handle large volumes of variable data from sensors, perform analytics on both real-time and historical data, and scale horizontally to support growing workloads.
- Its flexible document model accommodates changing sensor schemas and nested/complex data structures from devices, while secondary indexes enable expressive queries.
- Time series data from sensors can be optimized in MongoDB using bucketing which improves write performance, storage usage, and analytics capabilities.
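The bucketing pattern mentioned above can be sketched in a few lines: instead of writing one document per sensor reading, readings for the same device-hour are appended to a single bucket document (in MongoDB this append would typically be an update using `$push`). The names and data below are assumptions for the example, and a plain dict stands in for the collection:

```python
# Illustrative sketch of the time-series bucketing pattern. A dict stands
# in for a MongoDB collection keyed by (device, hour); in MongoDB the
# append would be an update with $push and $inc. Names are assumptions.

buckets = {}

def record(device, hour, ts, value):
    key = (device, hour)
    bucket = buckets.setdefault(key, {"device": device, "hour": hour,
                                      "count": 0, "measurements": []})
    bucket["measurements"].append({"ts": ts, "value": value})
    bucket["count"] += 1

# An hour of once-a-minute readings lands in a single document.
for minute in range(60):
    record("sensor-1", "2019-06-01T10", f"10:{minute:02d}", 20 + minute % 5)

print(len(buckets), buckets[("sensor-1", "2019-06-01T10")]["count"])  # prints: 1 60
```

Writes touch one document per device-hour rather than creating sixty, which is where the write-performance and storage gains come from.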
Big Data Expo 2015 - Gigaspaces Making Sense of it all – BigDataExpo
NoSQL engines are often limited in the types of queries they can support due to the distributed nature of the data. In this session we will learn patterns for overcoming this limitation and combining multiple query semantics with NoSQL-based engines. We will demonstrate a combination of key/value, SQL-like, document-model and graph-based queries, as well as more advanced topics such as handling partial updates and querying through projection. We will also demonstrate how to create a mash-up between those APIs, i.e., write fast through a key/value API and execute complex queries on the same data through SQL.
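A toy illustration of that mash-up idea, under invented names and purely in memory: records are written through a fast key/value path and read back through a richer document-style query path with projection over the same data.

```python
# Toy multi-model "mash-up": O(1) key/value writes plus document-style
# filtered queries with projection over the same in-memory store.
# All names and data are illustrative assumptions.

store = {}

def put(key, doc):
    """Key/value write path: constant-time insert by primary key."""
    store[key] = doc

def query(predicate, projection=None):
    """Document-style read path: filter, then optionally project fields."""
    results = [d for d in store.values() if predicate(d)]
    if projection:
        results = [{f: d[f] for f in projection if f in d} for d in results]
    return results

put("o1", {"id": "o1", "status": "open", "total": 40, "customer": "acme"})
put("o2", {"id": "o2", "status": "closed", "total": 90, "customer": "acme"})
put("o3", {"id": "o3", "status": "open", "total": 15, "customer": "bob"})

open_orders = query(lambda d: d["status"] == "open", projection=["id", "total"])
print(open_orders)  # [{'id': 'o1', 'total': 40}, {'id': 'o3', 'total': 15}]
```

The design choice mirrors the session's theme: the write path never pays for query flexibility, while the read path can still express richer semantics over the same records.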
Building Analytics Applications with Streaming Expressions in Apache Solr - A... – Lucidworks
This document discusses building analytics applications with streaming expressions in Apache Solr. It introduces parallel computing frameworks, the streaming API, and streaming expressions. It provides examples of use cases like performing searches, facets, joins, and aggregations on real-time data from different sources. It also demonstrates how to execute expressions in parallel using worker collections and shuffling to improve performance.
The document discusses software design patterns for distributed applications. It begins with introductions and definitions of patterns, then discusses specific patterns like Table Module, Table Data Gateway, and Active Record that address problems like representing business entities, data access, and application distribution. The document also provides examples of applying these patterns to a revenue recognition problem domain.
The document discusses software design patterns for distributed applications. It introduces common patterns like Model-View-Controller (MVC), layers (presentation, business, data), and data access patterns (table data gateway, active record). It also provides examples of applying these patterns to problems like representing business entities and data, handling distributed transactions, and implementing specific business logic like revenue recognition. The goal of patterns is to provide reusable solutions to common problems in software architecture and design.
Docker Summit MongoDB - Data Democratization – Chris Grabosky
The document summarizes a presentation given by Chris Grabosky, a Solutions Architect at MongoDB. It discusses how containerization and data democratization can help organizations build cloud native applications. It highlights how MongoDB and Docker can be used together to build scalable and portable apps. It also covers data modeling approaches, workload isolation, analytics, and data access controls that MongoDB provides to help democratize data.
This document summarizes Giant Eagle's use of MongoDB for an expense reimbursement application. It discusses why Giant Eagle chose MongoDB over its traditional .NET/Oracle stack, the technology stack used, and MongoDB's architecture. Some limitations encountered included inability to directly select subdocuments and query across collections. Overall, while MongoDB has some immature areas, it was a good fit for the project's performance needs and flexibility.
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When – David Peyruc
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
Massimo Brignoli, MongoDB Inc
The presentation will illustrate what MongoDB is, the advantages of the document based approach and some of the use cases where MongoDB is a perfect fit.
This document discusses MongoDB and the needs of Rivera Group, an IT services company. It notes that Rivera Group has been using MongoDB since 2012 to store large, multi-dimensional datasets with heavy read/write and audit requirements. The document outlines some of the challenges Rivera Group faces around indexing, aggregation, and flexibility in querying datasets.
Eagle6 is a product that uses system artifacts to create a replica model representing a near-real-time view of system architecture. Eagle6 was built to collect system data (log files, application source code, etc.) and to link system behaviors in such a way that the user can quickly identify risks associated with unknown or unwanted behavioral events that may have unforeseen impacts on seemingly unrelated downstream systems. This session presents the capabilities of the Eagle6 modeling product and how we are using MongoDB to support near-real-time analysis of large disparate datasets.
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas – MongoDB
This presentation discusses migrating data from other data stores to MongoDB Atlas. It begins by explaining why MongoDB and Atlas are good choices for data management. Several preparation steps are covered, including sizing the target Atlas cluster, increasing the source oplog, and testing connectivity. Live migration, mongomirror, and dump/restore options are presented for migrating between replica sets or sharded clusters. Post-migration steps like monitoring and backups are also discussed. Finally, migrating from other data stores like AWS DocumentDB, Azure CosmosDB, DynamoDB, and relational databases is briefly covered.
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts! – MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
Similar to MongoDB World 2019: Building an Efficient and Performant Data Model: Real World Challenges Faced and How We Solved Them
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
MongoDB Kubernetes operator and MongoDB Open Service Broker are ready for production operations. Learn about how MongoDB can be used with the most popular container orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications. A demo will show you how easy it is to enable MongoDB clusters as an External Service using the Open Service Broker API for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
Humana, like many companies, is tackling the challenge of creating real-time insights from data that is diverse and rapidly changing. This is our journey of how we used MongoDB to combined traditional batch approaches with streaming technologies to provide continues alerting capabilities from real-time data streams.
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
Common components of an IoT solution
The challenges involved with managing time-series data in IoT applications
Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
Our clients have unique use cases and data patterns that mandate the choice of a particular strategy. To implement these strategies, it is mandatory that we unlearn a lot of relational concepts while designing and rapidly developing efficient applications on NoSQL. In this session, we will talk about some of our client use cases, the strategies we have adopted, and the features of MongoDB that assisted in implementing these strategies.
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
Encryption is not a new concept to MongoDB. Encryption may occur in-transit (with TLS) and at-rest (with the encrypted storage engine). But MongoDB 4.2 introduces support for Client Side Encryption, ensuring the most sensitive data is encrypted before ever leaving the client application. Even full access to your MongoDB servers is not enough to decrypt this data. And better yet, Client Side Encryption can be enabled at the "flick of a switch".
This session covers using Client Side Encryption in your applications. This includes the necessary setup, how to encrypt data without sacrificing queryability, and what trade-offs to expect.
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
MongoDB Kubernetes operator is ready for prime-time. Learn about how MongoDB can be used with most popular orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications.
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
When you need to model data, is your first instinct to start breaking it down into rows and columns? Mine used to be too. When you want to develop apps in a modern, agile way, NoSQL databases can be the best option. Come to this talk to learn how to take advantage of all that NoSQL databases have to offer and discover the benefits of changing your mindset from the legacy, tabular way of modeling data. We’ll compare and contrast the terms and concepts in SQL databases and MongoDB, explain the benefits of using MongoDB compared to SQL databases, and walk through data modeling basics so you feel confident as you begin using MongoDB.
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
The document discusses guidelines for ordering fields in compound indexes to optimize query performance. It recommends the E-S-R approach: placing equality fields first, followed by sort fields, and range fields last. This allows indexes to leverage equality matches, provide non-blocking sorts, and minimize scanning. Examples show how indexes ordered by these guidelines can support queries more efficiently by narrowing the search bounds.
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
Aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
The document describes a methodology for data modeling with MongoDB. It begins by recognizing the differences between document and tabular databases, then outlines a three step methodology: 1) describe the workload by listing queries, 2) identify and model relationships between entities, and 3) apply relevant patterns when modeling for MongoDB. The document uses examples around modeling a coffee shop franchise to illustrate modeling approaches and techniques.
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
MongoDB Atlas Data Lake is a new service offered by MongoDB Atlas. Many organizations store long term, archival data in cost-effective storage like S3, GCP, and Azure Blobs. However, many of them do not have robust systems or tools to effectively utilize large amounts of data to inform decision making. MongoDB Atlas Data Lake is a service allowing organizations to analyze their long-term data to discover a wealth of information about their business.
This session will take a deep dive into the features that are currently available in MongoDB Atlas Data Lake and how they are implemented. In addition, we'll discuss future plans and opportunities and offer ample Q&A time with the engineers on the project.
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
Virtual assistants are becoming the new norm when it comes to daily life, with Amazon’s Alexa being the leader in the space. As a developer, not only do you need to make web and mobile compliant applications, but you need to be able to support virtual assistants like Alexa. However, the process isn’t quite the same between the platforms.
How do you handle requests? Where do you store your data and work with it to create meaningful responses with little delay? How much of your code needs to change between platforms?
In this session we’ll see how to design and develop applications known as Skills for Amazon Alexa powered devices using the Go programming language and MongoDB.
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
aux Core Data, appréciée par des centaines de milliers de développeurs. Apprenez ce qui rend Realm spécial et comment il peut être utilisé pour créer de meilleures applications plus rapidement.
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
Il n’a jamais été aussi facile de commander en ligne et de se faire livrer en moins de 48h très souvent gratuitement. Cette simplicité d’usage cache un marché complexe de plus de 8000 milliards de $.
La data est bien connu du monde de la Supply Chain (itinéraires, informations sur les marchandises, douanes,…), mais la valeur de ces données opérationnelles reste peu exploitée. En alliant expertise métier et Data Science, Upply redéfinit les fondamentaux de la Supply Chain en proposant à chacun des acteurs de surmonter la volatilité et l’inefficacité du marché.
MongoDB World 2019: Building an Efficient and Performant Data Model: Real World Challenges Faced and How We Solved Them

1. Building an efficient and performant data model
Real-world challenges faced and how we solved them
2. Navigating your next with Infosys
200,000+ employees globally
$10.9 billion in revenues
1,204 clients in over 45 countries
168,000+ employees trained in Design Thinking
World's largest corporate university
3. Modernization Practice to drive Transformation
Open Source COE: innovation and cost efficiency; ROI on current technology investment; alignment to modern architectures
Legacy / Mainframe Modernization: reduce dependency on the existing legacy estate
Public Cloud (Applications): scale and savings on the infrastructure with Cloud Native architecture; Infra as Code
DevOps: build the technology foundation of the Digital Platform; building a Cloud Native digital platform; digital tools adopting DevOps & Agile principles
4. Infosys Open Source: At a Glance
Solution Themes: Mainframe offload; Monolith to Microservices; RDBMS to ODBMS; Application Modernization on NoSQL; Big Data Analytics
Technology Platforms: Events & Streaming; API Management; RDBMS; Search & Insights; In-Memory; PaaS; Containers; UX; Integration & BPM; NoSQL; IaaS
Service Offerings:
Advisory: plan, accelerate open source adoption and manage associated risk
Architecture Consulting: make the right technological choices and establish the springboard for success
Implementation: deliver measurable benefits faster through agile & lean methods
Operations & Support: embed technology into the mainstream with continuous improvement
5. Considerations for Relational Data Models
Integrity
Structure of Data & Entities
Concurrency Control
Consistency
Schema Validations
6. The need for databases for non-transactional use cases and applications
Databases that support the Variety, Velocity and Volume of data in the digital world

Use Case → Preferred Type of Database
§ Caching data, user sessions and preferences, shopping cart data → Key-Value
§ IoT sensor data, logs, huge data sets → Columnar
§ Social and other networks, real-time routing, fraud detection → Graph
§ Web apps, product catalogs, operational data stores, performant reference data stores → Document
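To make the "Document" row concrete: a minimal, illustrative sketch of why a product catalog fits the document model. The field names and products below are invented for this example, not taken from the presentation.

```python
# Illustrative only: two catalog entries with different attribute sets can
# live side by side in one document collection; a relational model would
# need sparse columns or an entity-attribute-value side table instead.

def flatten_attrs(product):
    """Return (sku, sorted attribute names) so differing shapes are easy to compare."""
    return product["sku"], sorted(product.get("attrs", {}))

laptop = {"sku": "LP-100", "name": "Laptop", "attrs": {"cpu": "i7", "ram_gb": 16}}
shirt = {"sku": "SH-200", "name": "Shirt", "attrs": {"size": "M", "color": "blue"}}

catalog = [laptop, shirt]
shapes = dict(flatten_attrs(p) for p in catalog)
print(shapes)  # {'LP-100': ['cpu', 'ram_gb'], 'SH-200': ['color', 'size']}
```

Each document carries exactly the attributes its product needs, which is the flexibility the slide attributes to the document model.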
7. Features of NoSQL database alternatives
Features of modern databases:
Denormalized data: higher speed of retrieval, higher performance
Schema-free and unstructured data formats: flexibility to accommodate changes and various data types
Horizontal scaling on commodity servers: low cost, low complexity
Built-in replication, high availability and automated failover: no add-ons, low complexity
Consistent multi-platform experience: avoids platform lock-in, aligns to next-gen architecture
Open source: low license and storage cost, lower cost overall
9. Client Context: A multinational e-commerce company that provides order management, payment processing, order routing, fulfillment, and analytics services to its clients. The company set out to re-architect its order management system to a microservices-based architecture.

Identified 3 business areas for modernization: Order Capture, Supply Chain, Billing
Deep dive on understanding business processes and data flows
Identified dependencies that impact the data entities and attributes
Impacted ERDs mapped to 200 tables

Pattern #1: Reference by key or embedding the document
Guidance: Typically never embed more than a few hundred documents.

Inventory
{
  "_id" : ObjectId("…"),
  "item_zone" : " ",
  "item_id" : " ",
  "item_sku" : " ",
  "item_qty" : 500,
  "item_desc" : "XXXX",
  …
}

Inventory Audit
{
  "_id" : ObjectId("…"),
  "item_zone" : " ",
  "item_id" : " ",
  "item_sku" : " ",
  "item_qty" : 500,
  "inventory_chng" : [
    {"dts" : DTS#1},
    {"dts" : DTS#2},
    …]
}

Inventory Audit Detail
{
  "_id" : ObjectId("…"),
  "item_zone" : " ",
  "item_id" : " ",
  "item_sku" : " ",
  "item_qty" : 500,
  "dts" : DTS#1,
  "from_qty" : X,
  "to_qty" : Y,
  "ord_id" : "XXX"
}

Reference by key was recommended.
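A standalone sketch of the "reference by key" choice: audit entries grow without bound, so each change becomes its own detail document and the audit document keeps only a list of keys. The helper name and in-memory "collections" are illustrative; with pymongo these would be `insert_one`/`update_one` calls against real collections.

```python
# Parent document holds only bounded-size references; full change records
# go into a separate "collection", joined by the shared "dts" key.

audit = {"item_id": "A1", "inventory_chng": []}   # parent: references only
audit_details = []                                 # child collection

def record_change(dts, from_qty, to_qty, ord_id):
    """Append a reference to the parent and a full detail document to the child."""
    audit["inventory_chng"].append({"dts": dts})   # stays small per change
    audit_details.append({
        "item_id": audit["item_id"],
        "dts": dts,                                # the key joining both sides
        "from_qty": from_qty,
        "to_qty": to_qty,
        "ord_id": ord_id,
    })

record_change("2019-06-01T10:00", 500, 480, "ORD-1")
record_change("2019-06-01T11:00", 480, 460, "ORD-2")

# The parent document stays compact no matter how many changes accumulate.
print(len(audit["inventory_chng"]), len(audit_details))  # 2 2
```

Had the details been embedded instead, a busy SKU could push the parent document toward MongoDB's 16 MB document limit, which is why the slide's guidance caps embedding at a few hundred subdocuments.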
12. Client Context: A French multinational corporation specializing in energy management and automation. The company was implementing an IoT use case for recording sensor readings from multiple devices.

Pattern #2: Bucket pattern, for optimized index size and optimized read operations
Guidance: Optimize as per the application's access and aggregation needs.

Old Model
{ "_id" : ObjectId("…"),
  "s" : BinData(xx),
  "t" : ISODate(xxx),
  "v" : 15,
  "a" : {
    "Name" : "Quality",
    "SemanticRef" : "com.ref",
    "Value" : "Good"
  }
}

New Model
{ "_id" : ObjectId("…"),
  "s" : BinData(xx),
  "t" : ISODate(xxx),
  "v" : { "0" : {              # increment from bucket start
            "a" : {            # only exists if there are attributes
              "Quality" : { "v" : "Good",
                            "s" : "XXX" }  # SemanticRef
            }
          },
          "1" : { … }          # increment from beginning of bucket
  }
}

Bucket the observations.
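An illustrative in-memory version of the bucket pattern above: readings for the same sensor and hour share one document, keyed by the offset from the bucket start. The short field names follow the slide ("s" sensor, "t" bucket start, "v" values); the hourly granularity and minute offsets are assumptions for this sketch.

```python
# One document per (sensor, hour) instead of one per reading: fewer documents
# to index, and a whole hour of readings comes back in a single fetch.
from datetime import datetime

buckets = {}  # (sensor_id, bucket_start) -> bucket document

def add_reading(sensor_id, ts, value):
    """Upsert a reading into its hourly bucket, keyed by minute offset."""
    start = ts.replace(minute=0, second=0, microsecond=0)
    key = (sensor_id, start)
    bucket = buckets.setdefault(key, {"s": sensor_id, "t": start, "v": {}})
    offset = (ts - start).seconds // 60          # increment from bucket start
    bucket["v"][str(offset)] = value
    return bucket

add_reading("temp-1", datetime(2019, 6, 1, 9, 5), 15)
add_reading("temp-1", datetime(2019, 6, 1, 9, 20), 16)
add_reading("temp-1", datetime(2019, 6, 1, 10, 0), 17)   # new hour, new bucket

# Two readings collapsed into one 09:00 bucket; a second bucket for 10:00.
print(len(buckets))  # 2
```

In MongoDB itself the same shape is maintained with an upsert, e.g. `update_one({"s": …, "t": …}, {"$set": {"v.5": value}}, upsert=True)`, so the index only grows per bucket rather than per reading.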
17. How to provide predictability and get a head start on the NoSQL model when migrating from an RDBMS?
18. IDMC: Infosys Data Model Converter

EXTRACTION → ANALYSIS → PERSISTENCE → PROCESSING → DEPLOYMENT

Extraction (source RDBMS): table design, query pattern, data pattern
Analysis: entity and relationship; read and write query; data volatility and cardinality
Persistence: rules (Drools, SQLite)
Processing: target data-model generation, process rules
Deployment: deployment scripts to the target NoSQL database
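To give a feel for the rules stage, here is a hypothetical sketch of the kind of decision such a converter might encode (the slide names Drools; this is plain Python). The `Relationship` shape, the threshold, and the rule itself are invented for illustration and are not IDMC internals.

```python
# A toy rule mapping relational cardinality and volatility to an
# embed-vs-reference recommendation, mimicking one rule a model
# converter's rules engine could apply.
from dataclasses import dataclass

@dataclass
class Relationship:
    parent: str
    child: str
    avg_cardinality: int   # average number of child rows per parent row
    child_volatile: bool   # does the child change independently of the parent?

def model_decision(rel, embed_limit=100):
    """Recommend embedding small, stable child sets; otherwise reference by key."""
    if rel.child_volatile or rel.avg_cardinality > embed_limit:
        return "reference"
    return "embed"

print(model_decision(Relationship("order", "line_item", 5, False)))      # embed
print(model_decision(Relationship("item", "audit_detail", 5000, True)))  # reference
```

Running many such rules over the extracted table designs and query patterns is what lets a tool like this propose a target NoSQL model with some predictability, rather than leaving every embed-or-reference call to manual judgment.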