The database space has seen tremendous innovation in the recent past. In the beginning, we had traditional file systems and hierarchical databases. We then graduated to relational database management systems, followed by database appliances and in-memory databases. Then came the data explosion. Hadoop was born to alleviate the challenges that followed, but it too failed to meet the new-age demands of the business: batch-oriented processing was not fast enough for the speed the business is looking for. The need of the hour is a near-real-time solution that can process transactional and analytical workloads at the same time. The new-age NoSQL databases bring these together. However, we need to unlearn a lot when designing applications on NoSQL, and we need to take off our relational cap while building such solutions. In this session, we will talk about best practices for designing performant data models, and about tools and accelerators that lend predictability, especially when migrating from a traditional RDBMS to modern NoSQL models. This will be substantiated with our experience in developing such data models for our clients across verticals.
MongoDB World 2019: Building an Efficient and Performant Data Model: Real World Challenges Faced and How We Solved Them
1. Building an efficient and performant data model
Real world challenges faced and how we solved them
2. Navigating your next with Infosys
200,000+ employees globally
$10.9 billion in revenues
1,204 clients in over 45 countries
168,000+ employees trained in Design Thinking
World’s largest corporate university
3. Open Source COE
Legacy / Mainframe Modernization
Public Cloud (Applications)
DevOps
Innovation, Cost Efficiency
ROI on current technology investment alignment to modern architectures
Scale and Savings on the infrastructure with Cloud Native architecture
Infra as Code
Build the technology foundation of the Digital Platform
Reduce dependency on the existing legacy estate
Building a Cloud Native digital platform
Digital tools adopting DevOps & Agile principles
Modernization Practice to drive Transformation
4. Infosys Open Source – At a Glance
Technology Platforms
Solutions Themes
Mainframe offload
Monolith to Microservices
RDBMS to ODBMS
Application Modernization on NoSQL
EVENTS & STREAMING
API MANAGEMENT RDBMS
SEARCH & INSIGHTS
IN-MEMORY
PaaS CONTAINERS UX INTEGRATION & BPM
NOSQL
IaaS
Advisory: Plan, accelerate open source adoption and manage associated risk
Architecture Consulting: Make the right technological choices and establish the springboard for success
Implementation: Deliver measurable benefits faster through agile & lean methods
Operations & Support: Embed technology into mainstream with continuous improvement
Big Data
Analytics
Service Offerings
5. Considerations for Relational Data Models
Integrity
Structure of Data &
Entities
Concurrency Control
Consistency
Schema Validations
6. Need of Databases for non-transactional use cases and applications
Preferred Type of Database: Use Cases
Key Value: Caching Data; User Session and Preferences; Shopping Cart Data
Columnar: IoT Sensor Data; Logs; Huge Data Sets
Graph: Social and other networks; Real-time Routing; Fraud Detection
Document: Web App; Product Catalog; Operational Data Stores; Performant Reference Data Store
That support Variety, Velocity and Volume of Data in the Digital world
7. Features of NoSQL Database Alternatives
Features of Modern Databases
Schema-free and unstructured data formats: Flexibility to accommodate changes and various data types
Denormalized data: Higher speed of retrieval; Higher performance
Horizontal scaling on commodity servers: Low cost; Low complexity
Built-in Replication, High Availability and Automated Failover: No add-ons; Low complexity
Consistent multi-platform experience: Avoids platform lock-in; Aligns to Next Gen Architecture
Open Source: Low License & Storage Cost; Lower Cost
9. Client Context: A multinational e-commerce company that provides order management, payment processing, order routing, fulfillment, and analytics services to its clients. The company was out to re-architect its order management system into a microservices-based architecture.
Identified 3 business areas for Modernization - Order Capture, Supply Chain, Billing
Deep Dive on understanding Business Processes and data flows
Identified Dependencies that impact the data entities and attributes
Impacted ERDs mapped to 200 Tables
Pattern #1 – Reference by key or embedding the document
Guidance - Typically never embed more than a few hundred documents
Inventory
{
  "_id" : ObjectId("…"),
  "item_zone" : " ",
  "item_id" : " ",
  "item_sku" : " ",
  "item_qty" : 500,
  "item_desc" : "XXXX",
  …
}
Inventory Audit
{
  "_id" : ObjectId("…"),
  "item_zone" : " ",
  "item_id" : " ",
  "item_sku" : " ",
  "item_qty" : 500,
  "inventory_chng" : [
    {"dts" : DTS#1},
    {"dts" : DTS#2},
    …]
}
Inventory Audit Detail
{
  "_id" : ObjectId("…"),
  "item_zone" : " ",
  "item_id" : " ",
  "item_sku" : " ",
  "item_qty" : 500,
  "dts" : DTS#1,
  "from_qty" : X,
  "to_qty" : Y,
  "ord_id" : "XXX"
}
Reference by key was recommended
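The split between the Inventory Audit and Inventory Audit Detail shapes can be sketched in plain Python, without a database. This is only an illustration under stated assumptions: the field names come from the slide, while KEY_FIELDS, EMBED_LIMIT (one reading of "a few hundred"), the helper names and all sample values are hypothetical.

```python
# Fields shared by the audit header and each detail document (taken from the slide).
KEY_FIELDS = ("item_zone", "item_id", "item_sku")

# Hypothetical cut-off for "a few hundred"; tune it per workload.
EMBED_LIMIT = 300


def should_embed(expected_children):
    """Apply the guidance: embed only while the child count stays small."""
    return expected_children <= EMBED_LIMIT


def split_audit(item, changes):
    """Build one audit header plus one detail document per change."""
    key = {f: item[f] for f in KEY_FIELDS}
    header = {
        **key,
        "item_qty": item["item_qty"],
        # The header keeps only the timestamps; details are referenced by key.
        "inventory_chng": [{"dts": c["dts"]} for c in changes],
    }
    details = [{**key, "item_qty": item["item_qty"], **c} for c in changes]
    return header, details


def find_detail(details, header, dts):
    """Resolve a reference: match on the shared item key plus the timestamp."""
    for d in details:
        if d["dts"] == dts and all(d[f] == header[f] for f in KEY_FIELDS):
            return d
    return None


# Illustrative data only.
item = {"item_zone": "Z1", "item_id": "I-100", "item_sku": "SKU-1", "item_qty": 500}
changes = [
    {"dts": "2019-06-17T10:00:00Z", "from_qty": 520, "to_qty": 510, "ord_id": "ORD-1"},
    {"dts": "2019-06-17T11:00:00Z", "from_qty": 510, "to_qty": 500, "ord_id": "ORD-2"},
]
header, details = split_audit(item, changes)
print(find_detail(details, header, "2019-06-17T11:00:00Z")["ord_id"])  # ORD-2
```

The header stays small however long the change history grows, at the cost of a second lookup whenever the full detail of a change is needed.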
12. Client Context: A French multinational corporation specializing in energy management and automation. The company was in the process of implementing an IoT use case for recording sensor readings from multiple devices.
Pattern #1 – Bucket Pattern – Optimized index size and optimized read operations
Guidance - Optimized as per the application access and aggregation needs
Old Model
{ "_id" : ObjectId("…"),
  "s" : BinData(xx),
  "t" : ISODate(xxx),
  "v" : 15,
  "a" : {
    "Name" : "Quality",
    "SemanticRef" : "com.ref",
    "Value" : "Good"
  }
}
New Model
{ "_id" : ObjectId("…"),
  "s" : BinData(xx),
  "t" : ISODate(xxx),
  "v" : {
    "0" : {                # Increment from bucket start
      "a" : {              # Only exists if there are attributes
        "Quality" : { "v" : "Good",
                      "s" : "XXX" }   # SemanticRef
      }
    },
    "1" : { … },           # Increment from beginning of bucket
    …
  }
}
Bucket the observations
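A minimal sketch of the bucketing step, in plain Python with no database. The keys "s", "t", "v" and "a" follow the slide; the one-hour bucket width, the use of whole seconds as the increment, storing each reading's value under "v" inside its entry, the string sensor id and the sample readings are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

BUCKET = timedelta(hours=1)  # assumed bucket width
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts):
    """Floor a timezone-aware timestamp to the start of its bucket."""
    return ts - ((ts - EPOCH) % BUCKET)


def to_buckets(readings):
    """Group one-document-per-reading data into one document per sensor and bucket."""
    buckets = {}
    for r in readings:
        start = bucket_start(r["t"])
        doc = buckets.setdefault((r["s"], start), {"s": r["s"], "t": start, "v": {}})
        # Key each observation by its increment (in seconds) from the bucket start.
        offset = str(int((r["t"] - start).total_seconds()))
        entry = {"v": r["v"]}  # assumed location of the reading's value
        if "a" in r:  # only exists if there are attributes
            a = r["a"]
            entry["a"] = {a["Name"]: {"v": a["Value"], "s": a["SemanticRef"]}}
        doc["v"][offset] = entry
    return list(buckets.values())


# Illustrative readings only.
t0 = datetime(2019, 6, 17, 10, 0, 0, tzinfo=timezone.utc)
readings = [
    {"s": "sensor-1", "t": t0, "v": 15,
     "a": {"Name": "Quality", "SemanticRef": "com.ref", "Value": "Good"}},
    {"s": "sensor-1", "t": t0 + timedelta(seconds=30), "v": 16},
    {"s": "sensor-1", "t": t0 + timedelta(hours=1, seconds=5), "v": 17},
]
buckets = to_buckets(readings)
print(len(buckets))  # 2
```

Three reading documents collapse into two bucket documents here, so an index on sensor and bucket start carries one entry per bucket instead of one per reading.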
17. How to provide predictability and get a head start on a NoSQL model when migrating from an RDBMS?
18. IDMC - Infosys Data Model Converter
EXTRACTION → ANALYSIS → PERSISTENCE → PROCESSING → DEPLOYMENT
Source RDBMS
Table Design
Query Pattern
Data Pattern
Entity and Relationship
Read and Write Query
Data Volatility and Cardinality
Rules
Drools
SQLite
Target Data-Model Generation
Deployment Scripts
Target NoSQL
Process Rules
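The extraction, analysis and rule-processing stages can be illustrated with a toy pipeline. This is not IDMC and does not reflect its actual rules or interfaces: it is a sketch that uses an in-memory SQLite database as a stand-in source RDBMS, reads foreign keys from the catalog, measures cardinality, and applies a single hypothetical rule (EMBED_LIMIT). Table names, function names and data are invented for the example.

```python
import sqlite3

EMBED_LIMIT = 300  # hypothetical cardinality cut-off, not an IDMC setting


def extract_relationships(conn):
    """Extraction: read foreign keys as (child_table, fk_column, parent_table)."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    relationships = []
    for child in tables:
        for fk in conn.execute(f"PRAGMA foreign_key_list({child})"):
            # Row layout: id, seq, parent table, from column, to column, ...
            relationships.append((child, fk[3], fk[2]))
    return relationships


def max_children(conn, child, fk_column):
    """Analysis: the largest number of child rows sharing one parent key."""
    row = conn.execute(
        f"SELECT MAX(n) FROM "
        f"(SELECT COUNT(*) AS n FROM {child} GROUP BY {fk_column})").fetchone()
    return row[0] or 0


def propose_model(conn):
    """Processing: apply one illustrative rule to every relationship."""
    proposal = {}
    for child, fk_column, parent in extract_relationships(conn):
        n = max_children(conn, child, fk_column)
        proposal[(parent, child)] = "embed" if n <= EMBED_LIMIT else "reference"
    return proposal


# Stand-in source schema and data, invented for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (item_id INTEGER PRIMARY KEY);
CREATE TABLE item_alias (alias_id INTEGER PRIMARY KEY,
                         item_id INTEGER REFERENCES item(item_id));
CREATE TABLE item_audit (audit_id INTEGER PRIMARY KEY,
                         item_id INTEGER REFERENCES item(item_id));
""")
conn.execute("INSERT INTO item VALUES (1)")
conn.executemany("INSERT INTO item_alias (item_id) VALUES (?)", [(1,)] * 2)
conn.executemany("INSERT INTO item_audit (item_id) VALUES (?)", [(1,)] * 500)
proposal = propose_model(conn)
print(proposal[("item", "item_audit")])  # reference
```

A real converter would also weigh query patterns and data volatility, as the slide lists; cardinality alone is used here to keep the sketch short.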