Are you tired of a long, tedious data-to-insights journey, siloed data and unleveraged data? Would you like your existing demographic data to help you drive business outcomes? Would you like to skip building a data lake and get direct insights on your data from a prefabricated data structure, without any effort?
MongoDB World 2019: re:Innovate from Siloed to Deep Insights on Your Data
1. Jignesh Desai - Partner Solutions Architect @ aws
jigs_1979 | jigned@amazon.com
2. What to expect
1. Hear about the AWS approach to services, products and solutions
2. Understand our partners and analytics strategy; our portfolio of various services & how they work together
3. Plan how you would use the solution by appreciating how others use them
3. Our strategy & our beliefs
1. There is going to be an explosion in data.
2. Cloud will enable a different architecture.
3. One size does not fit all: innovate with purpose-built.
5. Our portfolio
Broad and deep portfolio, purpose-built for builders
Analytics
• Redshift: Data warehousing
• EMR: Hadoop + Spark
• Athena: Interactive analytics
• Kinesis Data Analytics: Real time
• Elasticsearch Service: Operational Analytics
Business Intelligence & Machine Learning
• QuickSight
• SageMaker
Data Lake
• S3/Glacier
• Glue: ETL & Data Catalog
• Lake Formation: Data Lakes
Data Movement
• Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams
6. Our portfolio
Broad and deep portfolio, purpose-built for builders
Analytics
• Redshift: Data warehousing
• EMR: Hadoop + Spark
• Athena: Interactive analytics
• Kinesis Data Analytics: Real time
• Elasticsearch Service: Operational Analytics
Databases
• RDS: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server
• Aurora: MySQL, PostgreSQL
• DynamoDB: Key value, Document
• ElastiCache: Redis, Memcached
• Neptune: Graph
• Timestream: Time Series
• QLDB: Ledger Database
• RDS on VMware
Business Intelligence & Machine Learning
• QuickSight
• SageMaker
Data Lake
• S3/Glacier
• Glue: ETL & Data Catalog
• Lake Formation: Data Lakes
Data Movement
• Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams
7. Our portfolio
Broad and deep portfolio, purpose-built for builders
Analytics
• Redshift: Data warehousing
• EMR: Hadoop + Spark
• Athena: Interactive analytics
• Kinesis Data Analytics: Real time
• Elasticsearch Service: Operational Analytics
Databases
• RDS: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server
• Aurora: MySQL, PostgreSQL
• DynamoDB: Key value, Document
• ElastiCache: Redis, Memcached
• Neptune: Graph
• Timestream: Time Series
• QLDB: Ledger Database
• RDS on VMware
Blockchain
• Managed Blockchain
• Blockchain Templates
Business Intelligence & Machine Learning
• QuickSight
• SageMaker
Data Lake
• S3/Glacier
• Glue: ETL & Data Catalog
• Lake Formation: Data Lakes
Data Movement
• Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams
8. Three types of projects
• Quickly build new apps in the cloud
• Gain new insights
• “Lift and shift” existing apps to the cloud
9. Traditionally, analytics looked like this
Relational data
GBs-TBs scale [not designed for PB/EBs]
Expensive: Large initial capex + $10K-$50K/TB/year
90% of data was thrown away because of cost
OLTP, ERP, CRM, LOB → Data Warehouse → Business Intelligence
10. Our beliefs
1. All data has value. No data should be thrown away.
2. All employees should have access to all data (subject to company access rules).
11. Data lakes on AWS
Services shown: Snowball, Snowmobile, Kinesis Data Firehose, Kinesis Data Streams, Kinesis Video Streams, S3, Redshift, EMR, Athena, Kinesis, Elasticsearch Service, AI Services, QuickSight
Exabyte scale
Store and analyze relational and non-relational data
Purpose-built analytics tools
Cost effective
• Store at 2.3 cents per GB-month in Amazon S3
• Query with Amazon Athena at ½ cent per GB scanned
• DW with Amazon Redshift for $1,000/TB/year
Give access to everyone
• Amazon QuickSight: $0.30 for 30 minutes of use
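The list prices quoted on this slide lend themselves to a quick back-of-the-envelope comparison with the traditional warehouse figure from slide 9. The sketch below is only an illustration: the function names are hypothetical, 1 TB is assumed to be 1,000 GB, and the rates are the ones printed on the slides, which may not match current pricing.

```python
# Rates as quoted on the slides (assumed here to be plain USD list prices).
S3_USD_PER_GB_MONTH = 0.023              # "2.3 cents per GB-month"
ATHENA_USD_PER_GB_SCANNED = 0.005        # "1/2 cent per GB scanned"
REDSHIFT_USD_PER_TB_YEAR = 1_000.0       # "$1,000/TB/year"
TRADITIONAL_USD_PER_TB_YEAR = (10_000.0, 50_000.0)  # slide 9 range

GB_PER_TB = 1_000  # assumption: decimal terabytes


def s3_storage_cost(gb: float, months: int = 12) -> float:
    """Cost of keeping `gb` gigabytes in S3 for `months` months."""
    return gb * S3_USD_PER_GB_MONTH * months


def athena_scan_cost(gb_scanned: float) -> float:
    """Cost of Athena queries that scan `gb_scanned` gigabytes in total."""
    return gb_scanned * ATHENA_USD_PER_GB_SCANNED


def redshift_cost(tb: float, years: float = 1.0) -> float:
    """Cost of warehousing `tb` terabytes in Redshift for `years` years."""
    return tb * REDSHIFT_USD_PER_TB_YEAR * years


def traditional_cost_range(tb: float, years: float = 1.0) -> tuple:
    """Low and high yearly running cost of a traditional warehouse (capex excluded)."""
    low, high = TRADITIONAL_USD_PER_TB_YEAR
    return (tb * low * years, tb * high * years)


if __name__ == "__main__":
    tb = 10
    print("S3, 1 year:       ", round(s3_storage_cost(tb * GB_PER_TB), 2))
    print("Athena, full scan:", round(athena_scan_cost(tb * GB_PER_TB), 2))
    print("Redshift, 1 year: ", round(redshift_cost(tb), 2))
    print("Traditional range:", traditional_cost_range(tb))
```

For 10 TB held for a year, the quoted rates work out to roughly $2,760 in S3 and $10,000 in Redshift, against $100,000 to $500,000 for the traditional range, before any initial capex.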
12. CHALLENGE
Need to create constant feedback loop
for designers.
Gain up-to-the-minute understanding
of gamer satisfaction to guarantee
gamers are engaged, resulting in the
most popular game played in the
world.
Fortnite | 125+ million players
14. MongoDB Atlas takes the innovation to the cloud
Database as a service for MongoDB
• Automated: Create and deploy a production-ready cluster in minutes. Modify your
cluster with zero downtime
• Scalable and high performance: Full elastic scalability with the performance you need
for your most demanding workloads
• Highly available: Each deployment is geographically distributed, fault-tolerant, and
self-healing by default
• Visibility: Optimized dashboards highlight key historical metrics. View metrics in real
time, customize alerts, or dig into the details with ease
• Continuous Backup: Continuous backup with point-in-time restores and queryable
snapshots to ensure no data loss
• Secured: Authentication, network isolation, encryption, and role-based access
controls keep your data protected
15. Enterprises Data Strategy
Analytics will play a central role across all business processes to drive outcomes
Industries: Financial Services; Retail, CPG & Logistics; Energy, Communication Services; Healthcare & Lifesciences; Manufacturing
Marketing & Sales Analytics
• Improve Campaign ROI
• Enhance sales volume
Servicing Analytics
• Reduce AHT
• Reduce Employee Attrition
Supply Chain Analytics
• Reduce Safety stock
• Improve Working Capital
Customer Analytics
• Improve NPS/CLTV
• Reduce Churn
Risk and Compliance
• Reduce Fraud
• Real time transaction monitoring
Customer Intimacy
• Personalized experience with targeted recommendations
• Develop the relationship by understanding social and digital behavior & life events
New Revenue Models
• Synthesize external data with enterprise data to generate revenue opportunities, e.g. geo-spatial data (weather, news, commodity, political)
Operational Excellence
• Optimize servicing cost (channel/self-service)
• Improve supply chain efficiencies
• Smart factories (IoT-based automation)
Risk & Compliance
• Fraud monitoring
• Improving speed of compliance
Monetize data to drive business outcomes
CHALLENGES: Tedious and cumbersome process across IT, Data Science and Business Users; long time to market; huge TCO
Data Science Lifecycle Management
• Use case based approach
• REPETITION OF DATA & ANALYTICS LIFE CYCLE for every use case
• Limited standardization & reuse
Product and Process based data fragmentation
• PRODUCT & PROCESS DATA MARTS across sales, marketing & operations limit the ability to create a unified view of data in an accelerated manner
• PRE-AGGREGATED DATA limits the ability to dynamically create new behavioral attributes
New Age Data sources
• CONNECT THE UNCONNECTED DATA: existing enterprise data structures are unable to wrangle with new age data sources; digital, social & external (geo-spatial data)
16. Cloud ready Data
Data: NEW DEV or ON-PREMISES to CLOUD
• Security
• Backup and Recovery
• Monitoring
• Networking
• Migration
• High availability, load balancing, failover
• Set-up storage
• Access Controls
• Provision Compute
17. 360 View: Genome
Indicative. Not exhaustive.
Customer 360
• Prebuilt for Customer, Household
• To be customized as per client’s needs
Can be seamlessly augmented based on client’s business needs for
• Product 360
• Account 360
• …
Raw Layer: Customer (1…n), Account (1…n), Transaction, Collection, Customer Type, Account Type, Transaction Type, Channel, Dimension (n)
Curated Layer: Customer (1…n), Account (1…n), Transaction, Collection, Customer Type, Account Type, Transaction Type, Channel, Dimension (n)
Gene Blocks: Customer (1…n), Account (1…n), Transaction, Collection
ETL
• Standardization & Processing (Rule Based): Validation, Deduplication, Transformation, + Exception Handling, + Audit Processing
• CDC/SCD2 (Rule Based): captures change records, performs SCD2
Automated Derivation on Behavioral Dimensions: aggregation based on “RFMQ” (Recency, Frequency, Monetary, Quantitative)
Genome Transformation Engine (GTE)
Genome Management Console (GMC)
Genome Query Engine (GQE)
Legend: Master, Transaction, Dimension
Changes: Subject Area, Table, Attribute
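The “RFMQ” aggregation named on this slide can be illustrated with a small, self-contained sketch. This is not the Genome Transformation Engine's logic, which the slides do not show. The transaction fields (`customer_id`, `date`, `amount`, `units`) are hypothetical, and reading “Quantitative” as the total number of units is an assumption made only for this example.

```python
from datetime import date


def rfmq(transactions, as_of):
    """Aggregate raw transactions into per-customer R, F, M and Q attributes.

    Each transaction is a dict with hypothetical keys:
    customer_id, date (datetime.date), amount (float), units (int).
    """
    acc = {}
    for t in transactions:
        row = acc.setdefault(
            t["customer_id"],
            {"last": t["date"], "frequency": 0, "monetary": 0.0, "quantity": 0},
        )
        row["last"] = max(row["last"], t["date"])  # most recent activity
        row["frequency"] += 1                      # number of transactions
        row["monetary"] += t["amount"]             # total spend
        row["quantity"] += t["units"]              # assumed meaning of "Q"

    return {
        cid: {
            "recency_days": (as_of - r["last"]).days,
            "frequency": r["frequency"],
            "monetary": r["monetary"],
            "quantity": r["quantity"],
        }
        for cid, r in acc.items()
    }


if __name__ == "__main__":
    sample = [
        {"customer_id": "c1", "date": date(2019, 6, 1), "amount": 20.0, "units": 2},
        {"customer_id": "c1", "date": date(2019, 6, 10), "amount": 5.5, "units": 1},
        {"customer_id": "c2", "date": date(2019, 5, 20), "amount": 100.0, "units": 4},
    ]
    for customer, features in rfmq(sample, as_of=date(2019, 6, 15)).items():
        print(customer, features)
```

Each output row is a flat set of derived attributes per customer, which is the shape the slide describes for behavioral dimensions built on top of the Gene Blocks.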
18. AWS + Infosys – a Strategic Partnership
• Relationship driven by CEOs of both companies
• 4500+ Infosys employees trained in AWS; 550 Certified resources
• AWS Dedicated architect and GTM sales support
• Joint AWS and Infosys investments in solution development,
Sales incentives and customer delivery
• Exclusive access to joint collaboration and funding to accelerate
cloud adoption program for clients
• Infosys is part of a small set of delivery partners
certified to help customers with every stage of their cloud
migration.
• Infosys is one of only 8 Migration Acceleration Program partners
• Big Data Competency
• Infosys is AWS’s Premier Consulting Partner
19. How Infosys Genome Solution can help?
Data Monetization Challenges
• Process based data fragmentation across products and processes
• New age data sources and external data
• Varied data formats
• Repetition of data life cycle for insight and analytical use cases
GENOME: Identity & Personality; Actions Taken For Entity; Actions Performed By Entity
CONFIGURABLE DATA ASSETS: Curated Data; Derived Attributes (Features); Model Outcomes
AUTOMATED PROCESS: Acquisition & Ingestion; Transformation; Distribution
INTERACTIVE UI: Create UI; Manage UI; Access UI
User Interface: Genome Management Console; Genome Marketplace; GTE (UI)
Data Pipeline / Code: Genome Transformation Engine; Genome Query Engine
Genome Metadata Repository: Data Models (Logical & Physical); Gene Blocks; Genome
Infosys Information Grid
Analytics & Visualization Library: Analytical Models; Dashboards
Enterprise Community: Operations, Compliance, Servicing, Sales, Marketing, Leadership; Data Scientists & Analysts; Business Users; IT Data Engineers
20. Get Away from Silos
Customer & Marketing Analytics
• Customer segmentation
• Market basket analytics
• Sentiment analytics
Operations Analytics
• Call intent prediction
• Supply shortage prediction
Risk & Compliance
• Provide fraud monitoring
• Probability to default
21. Genome solution’s architecture and data process flow
Sources: Internal Data, External Data
Zones: Source Systems, Ingestion Framework, Landing Zone, RAW Layer, Curated Layer, Conformed Zone, Speed Zone, Data Intelligence Grid, Consumption/Visualization
RAW Layer
• AS-IS data from source systems
• Master, Reference data
Curated Layer
• Data enrichment
• Standardized and curated data
Conformed Zone: Gene Blocks, Aggregated Data (GENOME)
• Flatten data in Gene Blocks
• Aggregated view of each advisor, product, customer etc. in Genome
Speed Zone: Real Time Gene Blocks
Consumption/Visualization: Extracts, Data Exploration, Dashboards, Canned Reports, Self Service Analytics, Analytical Tools, Reporting, Semantic Layer
Cross-cutting services: Provisioning; Data Governance; Provisioning, Mgmt., Monitoring; Monitoring, Workflow, Retention; Metadata, Lineage, Security, Encryption
22. Genome Architecture on AWS & MongoDB Atlas
Sources: Internal Sources, External Sources, Real-Time Sources
Flows: Batch Transactional Flow; Real-Time/Near Real Time Flow
Zones: Ingestion Zone, Speed Zone, Curated Zone, Distribution Zone, Visualize
Data assets: Gene Blocks, Real-Time Gene Blocks, Genome, Real-Time Serving
Tools: Sqoop, Kafka, Spark Streaming
AWS services: AWS Direct Connect, AWS EMR, AWS EC2, Amazon Kinesis, AWS Glue
Abbreviations
• IIG – Infosys Information Grid
• GQE – Genome Query Engine
• GMP – Genome Market Place
Genome Management Console
• Provides UI to define structures for Gene Blocks and Genome as well as Dimension data
• Generates the metadata that is leveraged by Genome Transformation Engine to define the data pipeline & build Gene Blocks
• Genome Query Engine uses the same metadata for automated generation of Genome Features (derived attributes in Genome)
23. Why MongoDB Atlas for Genome?
Any type of data, structured, unstructured
Scale to any demands with no downtime
• Performance
• Data
• Cluster (geographies)
Zero lock-in
• Run on choice of cloud: AWS, GCP and Azure
• Frictionless migration from or to on-premises deployments
General purpose for a wide variety of use cases, including blockchain, real time analytics, single view and more!
One-click upgrade to the latest and greatest version of MongoDB as soon as it is available
24. Highly Available by Default
● A minimum of three data nodes per replica
set/shard are automatically deployed across
zones for high availability
● If your primary node does go down for any
reason, the self healing recovery process in
MongoDB Atlas will typically occur in under 2
seconds
● MongoDB Atlas automatically applies patches
and enables 1 touch upgrades with no
downtime
25. Live Migration
Migrate existing deployments running anywhere into
MongoDB Atlas with minimal impact to your
application. Live migration works by:
● Performing a sync between your source
database and a target database hosted in
MongoDB Atlas
● Syncing live data between your source database
and the target database by tailing the oplog
● Notifying you when it’s time to cut over to the
MongoDB Atlas cluster
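The three steps above (initial sync, tailing the oplog, cutover notification) can be pictured with a toy, in-memory simulation. It does not use any MongoDB or Atlas API and is not how Atlas implements live migration. The operation format (`ts`, `op`, `_id`, `doc`) is a simplified, hypothetical stand-in for real oplog entries.

```python
import copy


def initial_sync(source_docs):
    """Step 1: copy a snapshot of the source collection into a new target."""
    return copy.deepcopy(source_docs)


def apply_op(target, op):
    """Apply one simplified oplog entry to the target collection."""
    if op["op"] in ("insert", "update"):
        target[op["_id"]] = copy.deepcopy(op["doc"])
    elif op["op"] == "delete":
        target.pop(op["_id"], None)
    else:
        raise ValueError(f"unknown operation: {op['op']}")


def tail_oplog(target, oplog, last_applied_ts):
    """Step 2: replay every entry newer than the last applied timestamp."""
    for op in oplog:
        if op["ts"] > last_applied_ts:
            apply_op(target, op)
            last_applied_ts = op["ts"]
    return last_applied_ts


def ready_for_cutover(oplog, last_applied_ts):
    """Step 3: the target has caught up when no newer entries remain."""
    return not oplog or oplog[-1]["ts"] <= last_applied_ts


if __name__ == "__main__":
    source = {1: {"name": "a"}}
    target = initial_sync(source)          # snapshot taken at ts = 0
    oplog = [                              # writes that arrived afterwards
        {"ts": 1, "op": "insert", "_id": 2, "doc": {"name": "x"}},
        {"ts": 2, "op": "update", "_id": 1, "doc": {"name": "b"}},
        {"ts": 3, "op": "delete", "_id": 2},
    ]
    applied = tail_oplog(target, oplog, last_applied_ts=0)
    print(target, "cut over now" if ready_for_cutover(oplog, applied) else "still syncing")
```

Tracking the last applied timestamp is what lets the tailing step resume without replaying entries already covered by the snapshot.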
26. MongoDB Atlas on AWS
Cost of Migration + Post Migration OpEx < Current CapEx and OpEx
• Production-ready cluster in minutes
• Built on modern best practices and aligns with DevOps practices and tools (CI/CD)
• Automates and simplifies operational tasks
• Flexible schema and aggregation framework to enable rapid development
• Tools to explore data and schema, and optimize performance
27. Queryable Backups
MongoDB Atlas gives you the ability to query your backup snapshots
and restore data at the document level in minutes.
Queryable backups significantly reduce the operational overhead
associated with:
• Identifying whether data of interest has been altered
• Pinpointing the best point in time to restore a database by
comparing data across multiple snapshots
28. Continuous Backup / Point-in-time Restore
● MongoDB Atlas continuously backs up your data, ensuring your
backups are typically just a few seconds behind the operational system
● Point-in-time backup of replica sets and consistent, cluster-wide
snapshots of sharded clusters. With MongoDB Atlas, you can easily
and safely restore to precisely the moment you need
● Compression and block-level deduplication technology keeps your
backup processes as efficient as possible
● Backups are securely stored in North America, Ireland, Germany,
United Kingdom, or Australia*. For more location flexibility of your
backup data, you can utilize MongoDB’s mongodump / mongorestore
tools
*Additional regions for backup coming soon
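To make the “compression and block-level deduplication” bullet concrete, here is a minimal sketch of the general technique. It is not Atlas's backup implementation. The block size, the SHA-256 content hash and the in-memory `store` dict are all assumptions chosen for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical block size


def backup_snapshot(data: bytes, store: dict) -> list:
    """Split data into blocks, store each unique block once (compressed),
    and return the ordered list of block hashes that describes the snapshot."""
    manifest = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:            # deduplication: skip known blocks
            store[digest] = zlib.compress(block)
        manifest.append(digest)
    return manifest


def restore_snapshot(manifest: list, store: dict) -> bytes:
    """Rebuild the original bytes from a snapshot manifest."""
    return b"".join(zlib.decompress(store[digest]) for digest in manifest)


if __name__ == "__main__":
    store = {}
    first = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
    second = first + b"C" * 10             # a later snapshot with one new block
    m1 = backup_snapshot(first, store)
    m2 = backup_snapshot(second, store)
    print(len(m1), len(m2), len(store))    # blocks per snapshot, unique blocks stored
    assert restore_snapshot(m2, store) == second
```

Because the second snapshot reuses every block it shares with the first, only the changed tail is added to the store.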
29. Track Everything
● Monitoring and alerts provide full metrics on the state of your
cluster’s database and server usage
● Automatic notifications when your database operations or
server usage reach defined thresholds that affect your cluster's
performance
● Combining our automated alerting with the flexible scale-up-
and-out options in MongoDB Atlas, we can keep your
database-supported applications always performing as well as
they should
30. Security is job zero
All MongoDB Atlas nodes are single-tenant and deployed into their own VPC for
security isolation.
VPC Peering is available between AWS VPCs in the same AWS region.
In-flight security:
● TLS/SSL for in-flight data encryption
● Authentication and authorization access controls with SCRAM-SHA-1
● IP whitelists
At-rest security:
● Encrypted storage volumes
● AES-256 (CBC mode) hardware encryption with Seagate Self-Encrypting
Drives
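The idea behind an IP whitelist can be shown with Python's standard `ipaddress` module. This is only a conceptual sketch of checking a client address against allowed entries. The function name and the sample addresses (taken from documentation-reserved ranges) are hypothetical, and the slides do not describe how Atlas enforces its whitelist.

```python
import ipaddress


def is_allowed(client_ip: str, whitelist: list) -> bool:
    """Return True if client_ip matches any whitelist entry.

    Entries may be single addresses ("203.0.113.7") or CIDR blocks
    ("203.0.113.0/24"); a single address is treated as a one-host network.
    """
    address = ipaddress.ip_address(client_ip)
    return any(
        address in ipaddress.ip_network(entry, strict=False)
        for entry in whitelist
    )


if __name__ == "__main__":
    allowed = ["203.0.113.0/24", "198.51.100.10"]
    for ip in ("203.0.113.7", "198.51.100.10", "198.51.100.11"):
        print(ip, "allowed" if is_allowed(ip, allowed) else "rejected")
```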
31. Solutions across Industries
• Retail: Retail Analytics Solution (customer)
• Telco: Telco Analytics Solution (customer)
• Financial Services:
o Corporate Banking Analytics Solution (corporate)
o Credit Card Analytics Solution (Customer, Merchant)
o Wealth Management Analytics Solution (Customer, Security, Advisor)
o Consumer banking Analytics Solution
• Insurance, Healthcare & Life Science
o Healthcare Analytics Solution (Member, Payer, Provider, Claim, Corporate)
o Insurance Analytics Solution (Customer, Household)
o Life Science Analytics Solution (HCP)
• Automobile Analytics Solution (Customer)
• Horizontal Solutions
o Supply chain Analytics Solution (Supplier, Product)
o HR Analytics Solution (Employee)
o Asset Analytics Solution (Oil Well)
o Asset Analytics Solution (Chiller plant – Equipment component)
32. Business benefits our customers are seeing
• 1: One process, platform and reference model across enterprises. Customer, Account, Revenue, Usage, Payment, Interaction and Clickstream data combined to create a 360 view of the customer for a New Zealand based Telco client. Created Customer and Product genome tables to enable micro segmentation, call reduction and real time visualization
• 80%: Time saving in generating insights for the campaign management team, such as classification of ecomm customers and creation of propensity models for a brand, for a Europe-based retailer
• 10%: Predicting shipment delays 7 days in advance led to a revenue benefit by way of released working capital previously tied up in safety stocks
• Unified Data and Analytical platform (Genome) for a US based insurance firm across personal and commercial lines for all the insurance products