SlideShare a Scribd company logo
1 of 28
Webinar:
automating the data testing
for
Bill Hayduk
CEO, President
RTTS
Jeff Bocarsly, PhD
VP & Chief Architect
RTTS
Presenters
Ensuring MongoDB Data Quality
built by
QuerySurge™
built by
QuerySurge™
About
FACTS
Founded:
1996
Locations:
New York (HQ), Atlanta,
Philadelphia, Phoenix
Strategic Partners:
IBM, Microsoft, HP,
Oracle, Teradata,
HortonWorks, Cloudera,
Amazon
Software:
QuerySurge
RTTS is the leading provider of software & data quality
for critical business systems
about
built by
QuerySurge™
What is MongoDB?
Name: MongoDB (from "humongous")
1NoSQL means now “not only SQL”
210gen changed its name to MongoDB, Inc.
Source: Wikipedia
built by
QuerySurge™
• classified as a NoSQL1 database
• does not implement the table-based relational db structure
• cross-platform document-oriented database
• makes the integration of data in certain types of apps easier & faster
• free and open source
• originally built by 10gen2 and released in 2009
“MongoDB is in 5th place as the most popular type of database management system,
and 1st place for NoSQL database management systems.”
April 2014
built by
QuerySurge™
• Online real-time processing
• Data set is smaller
• Measured in milliseconds
• Offline big data processing
• Offline analytics
• Measured in minutes & hours
MongoDB versus Hadoop
Source: classpattern.com
When use MongoDB? / When use Hadoop?
MongoDB Use Cases
built by
QuerySurge™
Source: MongoDB, Inc.
Data Warehouse Batch Aggregation
ETL from MongoDB
ETL to MongoDB
Use Cases: Data Warehouse
Relational DB & Data
Warehousing
Source Data
@
BI, Analytics &
Reporting
built by
QuerySurge™
Ingestion
Data Quality Issues
built by
QuerySurge™
Data Quality Best Practices boost revenue by 66%.
The average organization loses $8.2 million annually through poor Data Quality.
46% of companies cite Data Quality as a barrier for adopting Business
Intelligence products.
80% of organizations… will underestimate the costs related to the data acquisition
tasks by an average of 50 percent.
News Headlines
built by
QuerySurge™
Validating Data: 3 Big Issues
- need to verify more data and to do it faster
- need to automate the testing effort
- need to be able to test across different platforms
Need a testing tool!
built by
QuerySurge™
What is QuerySurge™?
a collaborative
data testing tool that
finds bad data & provides
a holistic view of your
data’s health
built by
QuerySurge™
• Reduce your costs & risks
• Improve your data quality
• Accelerate your testing cycles
• Share information with your team
with QuerySurge™ you can:
built by
QuerySurge™
• Provides huge ROI (i.e. 1,300%)*
*based on client’s calculation of Return on Investment
Finding Bad Data
SQL
SQL
SQL
SQL
SQL
SQL
 QS pulls data from data source(s)
 QS pulls data from data target(s)
 QS compares data in seconds
 QS generates reports, audit trails
How?
reports
built by
QuerySurge™
the QuerySurge advantage
built by
QuerySurge™
Automate the entire testing cycle
 Automate kickoff, tests, comparison, auto-emailed results
Create Tests easily with no SQL programming
 ensures minimal time & effort to create tests / obtain results
Test across different platforms
 data warehouse, Hadoop, NoSQL, database, flat file, XML
Collaborate with team
 Data Health dashboard, shared tests & auto-emailed reports
Verify more data & do it quickly
 verifies up to 100% of all data up to 1,000 x faster
Integrate for Continuous Delivery
 Integrates with most Build, ETL & QA management software
Collaboration
Testers
- functional testing
- regression testing
- result analysis
Developers / DBAs
- unit testing
- result analysis
Data Analysts
- review, analyze data
- verify mapping failures
Operations teams
- monitoring
- result analysis
Managers
- oversight
- result analysis
Share information on the
built by
QuerySurge™
QuerySurge™ Architecture
Web-based…
Installs on...
Linux
Connects to…
…or any other JDBC compliant data source
built by
QuerySurge™
QuerySurge
Controller
QuerySurge
Server
QuerySurge
Agents
Flat Files
built by
QuerySurge™
QuerySurge™ Modules
Design Library
SchedulingDeep-Dive Reporting
Run Dashboard
Query Wizards
Data Health Dashboard
built by
From a recent poll1 of:
• Big Data Experts
• Data Warehouse Architects
• Solution Architects
• ETL Architects
Recent Survey: Data Experts
Consensus Answer:
80% of data columns have no transformation at all
Our Question: What % of columns in your Data Warehouse
have no transformations at all?
1Poll conducted by RTTS on targeted LinkedIn groups
Why is this important?
Fast and Easy.
No programming needed.
built by
QuerySurge™
QuerySurge™ Modules
Compare by Table, Column & Row
• Perform 80% of all data tests - no SQL
coding needed
• Opens up testing to novices &
non-technical team members
• Speeds up testing for skilled SQL coders
• provides a huge Return-On-Investment
built by
QuerySurge™
QuerySurge™ Modules:
(we picked Column-Level Comparison)
Design Library
• Create Query Pairs (source & target SQLs)
• Great for team members skilled with SQL
QuerySurge™ Modules
Scheduling
 Build groups of Query Pairs
 Schedule Test Runs
built by
QuerySurge™
Deep-Dive Reporting
 Examine and automatically
email test results
Run Dashboard
 View real-time execution
 Analyze real-time results
QuerySurge™ Modules
built by
QuerySurge™
built by
QuerySurge™
• view data reliability & pass rate
• add, move, filter, zoom-in on any data
widget & underlying data
• verify build success or failure
QuerySurge Test Management Connectors
built by
QuerySurge™
 Drive QuerySurge execution from your Test Management Solution
 Outcome results (Pass/Fail/etc.) are returned from QuerySurge to your Test Management Solution
 Results are linked in your Test Management Solution so that you can click directly into detailed QuerySurge
results
• HP ALM (Quality Center)
• Microsoft Team Foundation Server
• IBM Rational Quality Manager
Integration with leading
Test Management Solutions
Use Case: Big Data and
Relational DB & Data
Warehousing
Source Data
@
BI, Analytics &
Reporting
Ingestion
built by
QuerySurge™
™
Value-Add
QuerySurge provides value by either:
in testing data coverage from < 1% to
upwards of 100%
in testing time by as much as 1,000 x
combination of in test coverage while in
testing time
26
built by
QuerySurge™
Return on Investment
QuerySurge provides an increase in better data due to shorter / more
thorough testing cycle - saving $$$.
27
built by
QuerySurge™
Pharmaceutical Organization
Saves $288,000 in Clinical Trials
Data Migration Testing Project
1Since 2010, the pharmaceutical industry has been assessed
over $13 billion in fines.
Source: wikipedia
http://en.wikipedia.org/wiki/List_of_largest_pharmaceutical_settlements
This savings does not include savings
from avoiding fines from regulatory
bodies or lawsuits.1
Total Savings
Contact us if your team would like:
(1) a Trial in the Cloud of QuerySurge, including self-learning
tutorial that works with sample data for 3 days or
(2) a downloaded Trial of QuerySurge, including self-learning
tutorial with sample data or your data for 15 days or
(3) a Proof of Concept of QuerySurge, including a kickoff &
setup meeting and weekly meetings with our team of experts
for 30 days
http://www.querysurge.com/compare-trial-optionsfor more information, Go here
QuerySurge
built by
QuerySurge™
TRIAL
IN THE CLOUD

More Related Content

What's hot

Data preprocessing
Data preprocessingData preprocessing
Data preprocessingankur bhalla
 
Exploratory data analysis with Python
Exploratory data analysis with PythonExploratory data analysis with Python
Exploratory data analysis with PythonDavis David
 
Python For Data Analysis | Python Pandas Tutorial | Learn Python | Python Tra...
Python For Data Analysis | Python Pandas Tutorial | Learn Python | Python Tra...Python For Data Analysis | Python Pandas Tutorial | Learn Python | Python Tra...
Python For Data Analysis | Python Pandas Tutorial | Learn Python | Python Tra...Edureka!
 
Constraints In Sql
Constraints In SqlConstraints In Sql
Constraints In SqlAnurag
 
Types Of Keys in DBMS
Types Of Keys in DBMSTypes Of Keys in DBMS
Types Of Keys in DBMSPadamNepal1
 
Exception Handling
Exception HandlingException Handling
Exception HandlingReddhi Basu
 
Data Preprocessing || Data Mining
Data Preprocessing || Data MiningData Preprocessing || Data Mining
Data Preprocessing || Data MiningIffat Firozy
 
Feature enginnering and selection
Feature enginnering and selectionFeature enginnering and selection
Feature enginnering and selectionDavis David
 
Key and its different types
Key and its different typesKey and its different types
Key and its different typesUmair Shakir
 
How To Write A Test Case In Software Testing | Edureka
How To Write A Test Case In Software Testing | EdurekaHow To Write A Test Case In Software Testing | Edureka
How To Write A Test Case In Software Testing | EdurekaEdureka!
 
Machine Learning - Splitting Datasets
Machine Learning - Splitting DatasetsMachine Learning - Splitting Datasets
Machine Learning - Splitting DatasetsAndrew Ferlitsch
 
PL/SQL Introduction and Concepts
PL/SQL Introduction and Concepts PL/SQL Introduction and Concepts
PL/SQL Introduction and Concepts Bharat Kalia
 

What's hot (20)

Resampling methods
Resampling methodsResampling methods
Resampling methods
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Exploratory data analysis with Python
Exploratory data analysis with PythonExploratory data analysis with Python
Exploratory data analysis with Python
 
Decision tree
Decision treeDecision tree
Decision tree
 
Python For Data Analysis | Python Pandas Tutorial | Learn Python | Python Tra...
Python For Data Analysis | Python Pandas Tutorial | Learn Python | Python Tra...Python For Data Analysis | Python Pandas Tutorial | Learn Python | Python Tra...
Python For Data Analysis | Python Pandas Tutorial | Learn Python | Python Tra...
 
DATABASE CONSTRAINTS
DATABASE CONSTRAINTSDATABASE CONSTRAINTS
DATABASE CONSTRAINTS
 
Constraints In Sql
Constraints In SqlConstraints In Sql
Constraints In Sql
 
Types Of Keys in DBMS
Types Of Keys in DBMSTypes Of Keys in DBMS
Types Of Keys in DBMS
 
Collections in-csharp
Collections in-csharpCollections in-csharp
Collections in-csharp
 
Static Testing
Static TestingStatic Testing
Static Testing
 
Black Box Testing
Black Box TestingBlack Box Testing
Black Box Testing
 
Exception Handling
Exception HandlingException Handling
Exception Handling
 
Data clustring
Data clustring Data clustring
Data clustring
 
Data Preprocessing || Data Mining
Data Preprocessing || Data MiningData Preprocessing || Data Mining
Data Preprocessing || Data Mining
 
Feature enginnering and selection
Feature enginnering and selectionFeature enginnering and selection
Feature enginnering and selection
 
Jdbc ppt
Jdbc pptJdbc ppt
Jdbc ppt
 
Key and its different types
Key and its different typesKey and its different types
Key and its different types
 
How To Write A Test Case In Software Testing | Edureka
How To Write A Test Case In Software Testing | EdurekaHow To Write A Test Case In Software Testing | Edureka
How To Write A Test Case In Software Testing | Edureka
 
Machine Learning - Splitting Datasets
Machine Learning - Splitting DatasetsMachine Learning - Splitting Datasets
Machine Learning - Splitting Datasets
 
PL/SQL Introduction and Concepts
PL/SQL Introduction and Concepts PL/SQL Introduction and Concepts
PL/SQL Introduction and Concepts
 

Similar to Big Data Testing: Ensuring MongoDB Data Quality

Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...RTTS
 
Query Wizards - data testing made easy - no programming
Query Wizards - data testing made easy - no programmingQuery Wizards - data testing made easy - no programming
Query Wizards - data testing made easy - no programmingRTTS
 
How to Automate your Enterprise Application / ERP Testing
How to Automate your  Enterprise Application / ERP TestingHow to Automate your  Enterprise Application / ERP Testing
How to Automate your Enterprise Application / ERP TestingRTTS
 
QuerySurge - the automated Data Testing solution
QuerySurge - the automated Data Testing solutionQuerySurge - the automated Data Testing solution
QuerySurge - the automated Data Testing solutionRTTS
 
Data Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical IndustryData Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical IndustryRTTS
 
Leveraging HPE ALM & QuerySurge to test HPE Vertica
Leveraging HPE ALM & QuerySurge to test HPE VerticaLeveraging HPE ALM & QuerySurge to test HPE Vertica
Leveraging HPE ALM & QuerySurge to test HPE VerticaRTTS
 
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...RTTS
 
Improve the Health of Your Data
Improve the Health of Your DataImprove the Health of Your Data
Improve the Health of Your DataRTTS
 
Automated Testing of Microsoft Power BI Reports
Automated Testing of Microsoft Power BI ReportsAutomated Testing of Microsoft Power BI Reports
Automated Testing of Microsoft Power BI ReportsRTTS
 
DataOps , cbuswaw April '23
DataOps , cbuswaw April '23DataOps , cbuswaw April '23
DataOps , cbuswaw April '23Jason Packer
 
State of the Market - Data Quality in 2023
State of the Market - Data Quality in 2023State of the Market - Data Quality in 2023
State of the Market - Data Quality in 2023RTTS
 
QuerySurge AI webinar
QuerySurge AI webinarQuerySurge AI webinar
QuerySurge AI webinarRTTS
 
An introduction to QuerySurge webinar
An introduction to QuerySurge webinarAn introduction to QuerySurge webinar
An introduction to QuerySurge webinarRTTS
 
Deliver Trusted Data by Leveraging ETL Testing
Deliver Trusted Data by Leveraging ETL TestingDeliver Trusted Data by Leveraging ETL Testing
Deliver Trusted Data by Leveraging ETL TestingCognizant
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarRTTS
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurgeRTTS
 
Creating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing AssignmentCreating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing AssignmentRTTS
 
#DOAW16 - DevOps@work Roma 2016 - Testing your databases
#DOAW16 - DevOps@work Roma 2016 - Testing your databases#DOAW16 - DevOps@work Roma 2016 - Testing your databases
#DOAW16 - DevOps@work Roma 2016 - Testing your databasesAlessandro Alpi
 
MICROSOFT SQL Server
MICROSOFT SQL ServerMICROSOFT SQL Server
MICROSOFT SQL Serverwebhostingguy
 

Similar to Big Data Testing: Ensuring MongoDB Data Quality (20)

Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
 
Query Wizards - data testing made easy - no programming
Query Wizards - data testing made easy - no programmingQuery Wizards - data testing made easy - no programming
Query Wizards - data testing made easy - no programming
 
How to Automate your Enterprise Application / ERP Testing
How to Automate your  Enterprise Application / ERP TestingHow to Automate your  Enterprise Application / ERP Testing
How to Automate your Enterprise Application / ERP Testing
 
QuerySurge - the automated Data Testing solution
QuerySurge - the automated Data Testing solutionQuerySurge - the automated Data Testing solution
QuerySurge - the automated Data Testing solution
 
Data Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical IndustryData Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical Industry
 
Leveraging HPE ALM & QuerySurge to test HPE Vertica
Leveraging HPE ALM & QuerySurge to test HPE VerticaLeveraging HPE ALM & QuerySurge to test HPE Vertica
Leveraging HPE ALM & QuerySurge to test HPE Vertica
 
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
 
Improve the Health of Your Data
Improve the Health of Your DataImprove the Health of Your Data
Improve the Health of Your Data
 
Automated Testing of Microsoft Power BI Reports
Automated Testing of Microsoft Power BI ReportsAutomated Testing of Microsoft Power BI Reports
Automated Testing of Microsoft Power BI Reports
 
DataOps , cbuswaw April '23
DataOps , cbuswaw April '23DataOps , cbuswaw April '23
DataOps , cbuswaw April '23
 
State of the Market - Data Quality in 2023
State of the Market - Data Quality in 2023State of the Market - Data Quality in 2023
State of the Market - Data Quality in 2023
 
QuerySurge AI webinar
QuerySurge AI webinarQuerySurge AI webinar
QuerySurge AI webinar
 
An introduction to QuerySurge webinar
An introduction to QuerySurge webinarAn introduction to QuerySurge webinar
An introduction to QuerySurge webinar
 
Deliver Trusted Data by Leveraging ETL Testing
Deliver Trusted Data by Leveraging ETL TestingDeliver Trusted Data by Leveraging ETL Testing
Deliver Trusted Data by Leveraging ETL Testing
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing Webinar
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurge
 
Test Automation for Data Warehouses
Test Automation for Data Warehouses Test Automation for Data Warehouses
Test Automation for Data Warehouses
 
Creating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing AssignmentCreating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing Assignment
 
#DOAW16 - DevOps@work Roma 2016 - Testing your databases
#DOAW16 - DevOps@work Roma 2016 - Testing your databases#DOAW16 - DevOps@work Roma 2016 - Testing your databases
#DOAW16 - DevOps@work Roma 2016 - Testing your databases
 
MICROSOFT SQL Server
MICROSOFT SQL ServerMICROSOFT SQL Server
MICROSOFT SQL Server
 

More from RTTS

TestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data TestingTestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data TestingRTTS
 
RTTS Postman and API Testing Webinar Slides.pdf
RTTS Postman and API Testing Webinar  Slides.pdfRTTS Postman and API Testing Webinar  Slides.pdf
RTTS Postman and API Testing Webinar Slides.pdfRTTS
 
Webinar - QuerySurge and Azure DevOps in the Azure Cloud
 Webinar - QuerySurge and Azure DevOps in the Azure Cloud Webinar - QuerySurge and Azure DevOps in the Azure Cloud
Webinar - QuerySurge and Azure DevOps in the Azure CloudRTTS
 
Creating a Data validation and Testing Strategy
Creating a Data validation and Testing StrategyCreating a Data validation and Testing Strategy
Creating a Data validation and Testing StrategyRTTS
 
Implementing Azure DevOps with your Testing Project
Implementing Azure DevOps with your Testing ProjectImplementing Azure DevOps with your Testing Project
Implementing Azure DevOps with your Testing ProjectRTTS
 
Completing the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = SuccessCompleting the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = SuccessRTTS
 
the Data World Distilled
the Data World Distilledthe Data World Distilled
the Data World DistilledRTTS
 
QuerySurge for DevOps
QuerySurge for DevOpsQuerySurge for DevOps
QuerySurge for DevOpsRTTS
 
Whitepaper: Volume Testing Thick Clients and Databases
Whitepaper:  Volume Testing Thick Clients and DatabasesWhitepaper:  Volume Testing Thick Clients and Databases
Whitepaper: Volume Testing Thick Clients and DatabasesRTTS
 
Case study: Open Source Automation Framework using Selenium WebDriver
Case study: Open Source Automation Framework using Selenium WebDriverCase study: Open Source Automation Framework using Selenium WebDriver
Case study: Open Source Automation Framework using Selenium WebDriverRTTS
 
Enterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
Enterprise Business Intelligence & Data Warehousing: The Data Quality ConundrumEnterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
Enterprise Business Intelligence & Data Warehousing: The Data Quality ConundrumRTTS
 
RTTS - the Software Quality Experts
RTTS - the Software Quality ExpertsRTTS - the Software Quality Experts
RTTS - the Software Quality ExpertsRTTS
 
What is a Data Warehouse and How Do I Test It?
What is a Data Warehouse and How Do I Test It?What is a Data Warehouse and How Do I Test It?
What is a Data Warehouse and How Do I Test It?RTTS
 

More from RTTS (13)

TestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data TestingTestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data Testing
 
RTTS Postman and API Testing Webinar Slides.pdf
RTTS Postman and API Testing Webinar  Slides.pdfRTTS Postman and API Testing Webinar  Slides.pdf
RTTS Postman and API Testing Webinar Slides.pdf
 
Webinar - QuerySurge and Azure DevOps in the Azure Cloud
 Webinar - QuerySurge and Azure DevOps in the Azure Cloud Webinar - QuerySurge and Azure DevOps in the Azure Cloud
Webinar - QuerySurge and Azure DevOps in the Azure Cloud
 
Creating a Data validation and Testing Strategy
Creating a Data validation and Testing StrategyCreating a Data validation and Testing Strategy
Creating a Data validation and Testing Strategy
 
Implementing Azure DevOps with your Testing Project
Implementing Azure DevOps with your Testing ProjectImplementing Azure DevOps with your Testing Project
Implementing Azure DevOps with your Testing Project
 
Completing the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = SuccessCompleting the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = Success
 
the Data World Distilled
the Data World Distilledthe Data World Distilled
the Data World Distilled
 
QuerySurge for DevOps
QuerySurge for DevOpsQuerySurge for DevOps
QuerySurge for DevOps
 
Whitepaper: Volume Testing Thick Clients and Databases
Whitepaper:  Volume Testing Thick Clients and DatabasesWhitepaper:  Volume Testing Thick Clients and Databases
Whitepaper: Volume Testing Thick Clients and Databases
 
Case study: Open Source Automation Framework using Selenium WebDriver
Case study: Open Source Automation Framework using Selenium WebDriverCase study: Open Source Automation Framework using Selenium WebDriver
Case study: Open Source Automation Framework using Selenium WebDriver
 
Enterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
Enterprise Business Intelligence & Data Warehousing: The Data Quality ConundrumEnterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
Enterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
 
RTTS - the Software Quality Experts
RTTS - the Software Quality ExpertsRTTS - the Software Quality Experts
RTTS - the Software Quality Experts
 
What is a Data Warehouse and How Do I Test It?
What is a Data Warehouse and How Do I Test It?What is a Data Warehouse and How Do I Test It?
What is a Data Warehouse and How Do I Test It?
 

Recently uploaded

Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
How to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfHow to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfLivetecs LLC
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 

Recently uploaded (20)

Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
How to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfHow to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdf
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 

Big Data Testing: Ensuring MongoDB Data Quality

  • 1. Webinar: automating the data testing for Bill Hayduk CEO, President RTTS Jeff Bocarsly, PhD VP & Chief Architect RTTS Presenters Ensuring MongoDB Data Quality built by QuerySurge™
  • 2. built by QuerySurge™ About FACTS Founded: 1996 Locations: New York (HQ), Atlanta, Philadelphia, Phoenix Strategic Partners: IBM, Microsoft, HP, Oracle, Teradata, HortonWorks, Cloudera, Amazon Software: QuerySurge RTTS is the leading provider of software & data quality for critical business systems
  • 4. What is MongoDB? Name: MongoDB (from "humongous") 1NoSQL means now “not only SQL” 210gen changed its name to MongoDB, Inc. Source: Wikipedia built by QuerySurge™ • classified as a NoSQL1 database • does not implement the table-based relational db structure • cross-platform document-oriented database • makes the integration of data in certain types of apps easier & faster • free and open source • originally built by 10gen2 and released in 2009 “MongoDB is in 5th place as the most popular type of database management system, and 1st place for NoSQL database management systems.” April 2014
  • 5. built by QuerySurge™ • Online real-time processing • Data set is smaller • Measured in milliseconds • Offline big data processing • Offline analytics • Measured in minutes & hours MongoDB versus Hadoop Source: classpattern.com When use MongoDB? / When use Hadoop?
  • 6. MongoDB Use Cases built by QuerySurge™ Source: MongoDB, Inc. Data Warehouse Batch Aggregation ETL from MongoDB ETL to MongoDB
  • 7. Use Cases: Data Warehouse Relational DB & Data Warehousing Source Data @ BI, Analytics & Reporting built by QuerySurge™ Ingestion
  • 8. Data Quality Issues built by QuerySurge™ Data Quality Best Practices boost revenue by 66%. The average organization loses $8.2 million annually through poor Data Quality. 46% of companies cite Data Quality as a barrier for adopting Business Intelligence products. 80% of organizations… will underestimate the costs related to the data acquisition tasks by an average of 50 percent.
  • 10. Validating Data: 3 Big Issues - need to verify more data and to do it faster - need to automate the testing effort - need to be able to test across different platforms Need a testing tool! built by QuerySurge™
  • 11. What is QuerySurge™? a collaborative data testing tool that finds bad data & provides a holistic view of your data’s health built by QuerySurge™
  • 12. • Reduce your costs & risks • Improve your data quality • Accelerate your testing cycles • Share information with your team with QuerySurge™ you can: built by QuerySurge™ • Provides huge ROI (i.e. 1,300%)* *based on client’s calculation of Return on Investment
  • 13. Finding Bad Data SQL SQL SQL SQL SQL SQL  QS pulls data from data source(s)  QS pulls data from data target(s)  QS compares data in seconds  QS generates reports, audit trails How? reports built by QuerySurge™
  • 14. the QuerySurge advantage built by QuerySurge™ Automate the entire testing cycle  Automate kickoff, tests, comparison, auto-emailed results Create Tests easily with no SQL programming  ensures minimal time & effort to create tests / obtain results Test across different platforms  data warehouse, Hadoop, NoSQL, database, flat file, XML Collaborate with team  Data Health dashboard, shared tests & auto-emailed reports Verify more data & do it quickly  verifies up to 100% of all data up to 1,000 x faster Integrate for Continuous Delivery  Integrates with most Build, ETL & QA management software
  • 15. Collaboration Testers - functional testing - regression testing - result analysis Developers / DBAs - unit testing - result analysis Data Analysts - review, analyze data - verify mapping failures Operations teams - monitoring - result analysis Managers - oversight - result analysis Share information on the built by QuerySurge™
  • 16. QuerySurge™ Architecture Web-based… Installs on... Linux Connects to… …or any other JDBC compliant data source built by QuerySurge™ QuerySurge Controller QuerySurge Server QuerySurge Agents Flat Files
  • 17. built by QuerySurge™ QuerySurge™ Modules Design Library SchedulingDeep-Dive Reporting Run Dashboard Query Wizards Data Health Dashboard
  • 18. built by From a recent poll1 of: • Big Data Experts • Data Warehouse Architects • Solution Architects • ETL Architects Recent Survey: Data Experts Consensus Answer: 80% of data columns have no transformation at all Our Question: What % of columns in your Data Warehouse have no transformations at all? 1Poll conducted by RTTS on targeted LinkedIn groups Why is this important?
  • 19. Fast and Easy. No programming needed. built by QuerySurge™ QuerySurge™ Modules Compare by Table, Column & Row • Perform 80% of all data tests - no SQL coding needed • Opens up testing to novices & non-technical team members • Speeds up testing for skilled SQL coders • provides a huge Return-On-Investment
  • 20. built by QuerySurge™ QuerySurge™ Modules: (we picked Column-Level Comparison)
  • 21. Design Library • Create Query Pairs (source & target SQLs) • Great for team members skilled with SQL QuerySurge™ Modules Scheduling  Build groups of Query Pairs  Schedule Test Runs built by QuerySurge™
  • 22. Deep-Dive Reporting  Examine and automatically email test results Run Dashboard  View real-time execution  Analyze real-time results QuerySurge™ Modules built by QuerySurge™
  • 23. built by QuerySurge™ • view data reliability & pass rate • add, move, filter, zoom-in on any data widget & underlying data • verify build success or failure
  • 24. QuerySurge Test Management Connectors built by QuerySurge™  Drive QuerySurge execution from your Test Management Solution  Outcome results (Pass/Fail/etc.) are returned from QuerySurge to your Test Management Solution  Results are linked in your Test Management Solution so that you can click directly into detailed QuerySurge results • HP ALM (Quality Center) • Microsoft Team Foundation Server • IBM Rational Quality Manager Integration with leading Test Management Solutions
  • 25. Use Case: Big Data and Relational DB & Data Warehousing Source Data @ BI, Analytics & Reporting Ingestion built by QuerySurge™ ™
  • 26. Value-Add QuerySurge provides value by either: in testing data coverage from < 1% to upwards of 100% in testing time by as much as 1,000 x combination of in test coverage while in testing time 26 built by QuerySurge™
  • 27. Return on Investment QuerySurge provides an increase in better data due to shorter / more thorough testing cycle - saving $$$. 27 built by QuerySurge™ Pharmaceutical Organization Saves $288,000 in Clinical Trials Data Migration Testing Project 1Since 2010, the pharmaceutical industry has been assessed over $13 billion in fines. Source: wikipedia http://en.wikipedia.org/wiki/List_of_largest_pharmaceutical_settlements This savings does not include savings from avoiding fines from regulatory bodies or lawsuits.1 Total Savings
  • 28. Contact us if your team would like: (1) a Trial in the Cloud of QuerySurge, including self-learning tutorial that works with sample data for 3 days or (2) a downloaded Trial of QuerySurge, including self-learning tutorial with sample data or your data for 15 days or (3) a Proof of Concept of QuerySurge, including a kickoff & setup meeting and weekly meetings with our team of experts for 30 days http://www.querysurge.com/compare-trial-optionsfor more information, Go here QuerySurge built by QuerySurge™ TRIAL IN THE CLOUD

Editor's Notes

  1. There are a multitude of bad news that is hitting the press regarding bad data. And these mistakes are extremely costly, as the news from the Australian retail grocery industry losing $1 billion Australian ($940 million US) highlights.
  2. QuerySurge provides insight onto the health of your data throughout your organization through BI dashboards and reporting at your fingertips. It is a collaborative tools that allows for distributed use of the tool throughout your organization and provides for a sharable, holistic view of your data’s health and your organization’s level of maturity of your data management.
  3. QuerySurge helps your team coordinate your data quality initiatives while speeding up your development and testing cycles and finding your bad data. Why risk having your team identify trends and develop strategic initiatives when the underlying data is incorrect? QuerySurge reduces this risk.
  4. QuerySurge finds bad data by natively connecting to: any data source, whether it is any type of database, flat file or xml and can connect to any data target, whether it is a db, file, xml, data warehouse or hadoop implementation. QuerySurge pulls data from the source and the target and compares them very quickly (typically in a few minutes) and then produces reports that show every data difference, even if there are millions of rows and hundreds of columns in the test. These reports can be automatically emailed to your team. You can pick from a multitude of reports or export the results so that you can build your own reports.
  5. QuerySurge can utilized by active practitioners such as testers & developers to create and launch tests, or by managers, analysts and operations to view data test results and the overall health of the data. QuerySurge facilitates this by providing 2 types of licenses: (1) full user & (2) participant user. (1) Full User – This type of user has unlimited access to create QueryPairs, Suites, and Scenarios. This user can also schedule and run tests, see results, run and export reports, and export data. Perfect for anyone creating and/or running data tests while performing analysis of results. (2) Participant User – This user cannot create or run tests, but has access to all other information - including viewing all query pairs, results, and reports, receiving email notifications, and exporting test results and reports. Perfect for managers, analysts, architects, DBAs, developers, and operations users who need to know the health of their data.
  6. Your distributed team from around the world can use any of these web browsers: Internet Explorer, Chrome, Firefox and Safari. Installs on operating systems: Windows & Linux. QS connects to any JDBC-compliant data source. Even if it is not listed here.