Optimizing Data Management for MongoDB
October 11, 2017
My Background
Why Bother With Backup and Test Data Mgmt?
• The average cost of a data loss incident is $900,000
• 90% of enterprises delay applications because of a lack of test data
Source: EMC, Imanis Data
Data Replicas Don’t Prevent ALL Data Loss
• Human errors: dropping a collection
• Application corruption: incorrect updates to a collection
• Replicas go out of sync
[Diagram: replica set with one Primary and two Secondaries]
Current MongoDB Backup Options
• mongodump
• Filesystem/storage snapshots
• Ops Manager
MongoDB Backup/Recovery Options: mongodump
• PROBLEM: Very resource intensive
• PROBLEM: Not feasible for granular recovery
• PROBLEM: Not incremental-forever
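For context on the points above, here is a minimal sketch of what a mongodump run looks like when scripted. It only builds the argument list and does not execute anything. The flags (`--host`, `--out`, `--db`, `--collection`, `--gzip`) are standard mongodump options; the helper name `build_mongodump_args`, the host, the path, and the database and collection names are hypothetical. Each run of such a command writes a full logical dump of the selected scope, which is the behavior the slide contrasts with an incremental-forever approach.

```python
from typing import List, Optional


def build_mongodump_args(host: str, out_dir: str,
                         db: Optional[str] = None,
                         collection: Optional[str] = None,
                         gzip: bool = True) -> List[str]:
    """Build (but do not run) a mongodump command line.

    Hypothetical helper for illustration only.
    """
    if collection and not db:
        # mongodump needs a database to resolve a collection name.
        raise ValueError("a collection can only be selected together with a database")
    args = ["mongodump", f"--host={host}", f"--out={out_dir}"]
    if db:
        args.append(f"--db={db}")
    if collection:
        args.append(f"--collection={collection}")
    if gzip:
        args.append("--gzip")  # compress the dumped BSON files
    return args


if __name__ == "__main__":
    # Example values are assumptions, not taken from the slides.
    print(" ".join(build_mongodump_args(
        "localhost:27017", "/backups/2017-10-11", db="shop", collection="orders")))
```

The list form can be handed to `subprocess.run` on a host where mongodump is installed.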
MongoDB Backup/Recovery Options: filesystem/storage snapshots
• PROBLEM: Requires quiescing of MongoDB instance for consistent snapshots
• PROBLEM: Requires periodic full backups to ensure faster recovery
• PROBLEM: Cannot do point in time recovery or back up specific collections
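The quiescing step mentioned above can be sketched as follows. This is an illustration under assumptions, not a recommended procedure: `client` is assumed to be a PyMongo `MongoClient` (or anything exposing `client.admin.command`), and `take_snapshot` is a hypothetical callable that triggers the actual filesystem or storage snapshot. The sketch uses MongoDB's `fsync` command with `lock: true` to flush and block writes, and `fsyncUnlock` to release the lock afterwards.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(client):
    """Flush pending writes and block new ones for the duration of the block."""
    client.admin.command("fsync", lock=True)
    try:
        yield
    finally:
        # Always release the lock, even if the snapshot step fails.
        client.admin.command("fsyncUnlock")


def snapshot_with_lock(client, take_snapshot):
    """Run the (hypothetical) snapshot callable while writes are blocked."""
    with quiesced(client):
        return take_snapshot()
```

While the lock is held the instance does not accept writes, which is the production impact the slide refers to.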
MongoDB Backup/Recovery Options: Ops Manager
• PROBLEM: Requires agents on production; scaling increases overhead
• PROBLEM: Storage optimization is dependent on external media. Not data-aware
• PROBLEM: Extremely difficult to restore to a different topology
Test Data Mgmt Eliminates Costly Application Delays
• PRIMARY CHALLENGE: Application teams wait for production data
• CUSTOMER EXAMPLE: A company delayed 3-4 releases a year. Cost the company $450K per year
• BUSINESS IMPACT: 90% of companies delay application releases
Challenges With Typical Test Data Management
• Change Request - 1 week
• Provision Production Data - 1 week
• Create Test DB and Mask Data - 1 week
• Create Samples of Production Data - 2 days
• Push Production Data To Test - Hours
• Repeat Process - 3-4 weeks
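As a quick check of the arithmetic, the step durations above can be summed to see how they reach the 3-4 week cycle. This is a sketch under assumptions: weeks are treated as calendar weeks, and the "Hours" step is given a placeholder value of 4 hours, which is not a figure from the slide.

```python
from datetime import timedelta

# Durations as listed on the slide; the 4-hour value is an assumed placeholder.
STEPS = [
    ("Change request", timedelta(weeks=1)),
    ("Provision production data", timedelta(weeks=1)),
    ("Create test DB and mask data", timedelta(weeks=1)),
    ("Create samples of production data", timedelta(days=2)),
    ("Push production data to test", timedelta(hours=4)),
]

total = sum((duration for _, duration in STEPS), timedelta())
total_weeks = total / timedelta(weeks=1)

if __name__ == "__main__":
    print(f"One cycle takes about {total.days} days ({total_weeks:.1f} weeks)")
```

Under these assumptions a single pass lands between three and four weeks, and every refresh of test data repeats it.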
The Evolution of Data Management
[Diagram: Data Platforms and Data Management, compared across THE TRADITIONAL WORLD and THE NEXT 25 YEARS]
Imanis Data in Production
[Deployment diagram labels: Test Cluster, Research Cluster, Imanis Data GUI, Hadoop/Spark Cluster, Cassandra Cluster, Vertica Cluster, Couchbase Cluster, Imanis Data Smart Storage Cluster, MongoDB Cluster]
The Imanis Data Architecture
• Deep de-duplication and compression with app-aware architecture
• Incremental-forever backup architecture
• High availability via erasure coding in distributed cluster architecture
Smart Storage Optimizer
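To make de-duplication and incremental-forever backup concrete, here is a toy content-addressed chunk store. It is a generic illustration, not a description of how Imanis Data implements these features: the class name `ChunkStore`, the fixed-size chunking, and the tiny chunk size are all assumptions chosen for readability.

```python
import hashlib
import zlib
from typing import Dict, List


class ChunkStore:
    """Toy store: each unique chunk is compressed and kept exactly once."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size
        self.chunks: Dict[str, bytes] = {}  # SHA-256 digest -> compressed chunk

    def put(self, data: bytes) -> List[str]:
        """Store a backup and return its recipe (ordered chunk digests)."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # only new content costs storage
                self.chunks[digest] = zlib.compress(chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe: List[str]) -> bytes:
        """Rebuild a full backup from its recipe."""
        return b"".join(zlib.decompress(self.chunks[d]) for d in recipe)


if __name__ == "__main__":
    store = ChunkStore()
    monday = store.put(b"AAAABBBBCCCC")
    tuesday = store.put(b"AAAABBBBDDDD")  # only the last chunk changed
    print(len(store.chunks))              # chunks actually stored
    print(store.get(monday), store.get(tuesday))
```

Every backup after the first stores only the chunks that changed, yet each recipe restores a complete copy, which is the general idea behind "incremental-forever".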
The Imanis Data Architecture
• Native querying and analytics via active compute layer
• Unbounded scale with a Hadoop-native architecture
Smart Storage Optimizer
Active Compute Services
Distributed File System
The Imanis Data Architecture
• Google-like catalog shortens data recovery time
• Automatic schema generation for mirroring and backups
• Granular recovery at an object level
• Recovery to multiple topologies
• Native integration with LDAP and Kerberos for authentication
• Role-based access control defines specific privileges
• Stateless, consistent, irreversible, and one-way masks for PII data
Smart Storage Optimizer
Active Compute Services
Distributed File System
Metadata Catalog
Data Orchestration Services
Security Services
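One common way to get a mask that is stateless, consistent, and one-way is a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library. It illustrates the properties named above and is not the masking algorithm Imanis Data uses; the function name `mask_pii`, the key, and the 16-character truncation are assumptions.

```python
import hashlib
import hmac


def mask_pii(value: str, key: bytes) -> str:
    """Return a keyed one-way mask of a PII value.

    Stateless: no lookup table is kept.
    Consistent: the same value and key always give the same mask.
    One-way: the original value cannot be read back from the mask.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # truncated for readability; an assumed choice


if __name__ == "__main__":
    key = b"example-secret-key"  # hypothetical; keep real keys out of source code
    print(mask_pii("alice@example.com", key))
    print(mask_pii("alice@example.com", key))  # same mask again
    print(mask_pii("bob@example.com", key))    # different mask
```

Because equal inputs map to equal masks, joins across masked collections in a test cluster still line up.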
The Imanis Data Architecture
• ‘Single pane of glass’ for multiple use cases and data platforms
• Agentless architecture minimizes management overhead
• GUI, CLI, REST-based Talena API options
GUI
CLI
API
Smart Storage Optimizer
Active Compute Services
Distributed File System
Metadata Catalog
Data Orchestration Services
Security Services
Machine Intelligence: ThreatSense
• Proactively identify anomalous data loss and ransomware to reduce downtime
• Collects nearly 50 attributes to set baseline
• Enables user input to optimize machine learning
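The idea of flagging deviations from a baseline can be illustrated with a simple z-score check on a single attribute. This is a generic sketch and not ThreatSense's method, which the slides do not describe; the function name `is_anomalous`, the threshold of 3, and the sample metric are assumptions.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(history: Sequence[float], observed: float,
                 threshold: float = 3.0) -> bool:
    """Flag a value that lies more than `threshold` standard deviations
    from the baseline formed by earlier observations."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs >= 2 points
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold


if __name__ == "__main__":
    # Hypothetical attribute: documents (in thousands) seen per nightly backup.
    baseline = [100, 102, 98, 101, 99]
    print(is_anomalous(baseline, 100))  # an ordinary night
    print(is_anomalous(baseline, 10))   # sudden drop, possible data loss
```

A real system would track many attributes at once and let operators mark false positives, which matches the "user input" point above.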
Q&A


Editor's Notes

  1. Ask Hari the question about scalability in Ops Manager and how it compares with Imanis Data
  2. ----- Meeting Notes (10/10/17 10:35) ----- Deployment diagram before the detailed architecture diagram