Fundamentals of BI Report Testing - Module 1
- 1. © 2019 Real-Time Technology Solutions, Inc.
22 West 38th Street FL 11, New York, NY 10018
www.rttsweb.com | (212) 240-9050
1 a software division of
QuerySurge™
Module 1
Introduction to
BI Testing
- 2.
Course Objectives
In this course we will aim to:
• Understand the role and importance of BI Testing in the context of
Business Intelligence.
• Gain insights into the fundamentals of Business Intelligence and
the BI Report testing processes.
• Identify the challenges and considerations involved in BI Testing.
• Learn about the different testing approaches and tools utilized to
validate BI Reports and ensure their accuracy.
- 3.
Module Outline
− What is Business Intelligence (BI)?
− BI testing and the decision-making processes
− What is BI Testing
− Challenges in BI Testing
− Fundamentals of BI Testing
− Data Sources used in BI
− ETL Process & ETL Pipelines
− Transactional vs Analytical Databases
− Primary & Foreign Keys
- 4.
Business Intelligence – What is it?
• Software applications used in collecting, displaying and
analyzing business data.
• BI provides simple access to data which can be used in day-to-
day operations and integrates data into logical business areas.
• BI provides historical, current and predictive views of business
operations.
• BI is made up of several related activities, including data mining,
online analytical processing, querying and reporting.
- 5.
Business Intelligence – Decision Making
• Business users, managers and executives make strategic
decisions based on information from their BI and analytics
initiatives to try to give their firms a competitive
advantage.
• But what if the data is incorrect? Then they are making big bets,
impacting the company’s direction and future, on analyses
whose underlying data is wrong.
“Poor data quality now costs organizations
an average of $14 million annually”
- Gartner
“46% of companies cite data quality as a
barrier for adopting Business Intelligence
products”
- InformationWeek
- 6.
What is Business Intelligence Testing?
• Business Intelligence (BI) Testing refers to the process of ensuring
the accuracy, reliability, and functionality of BI reports.
• It involves validating data, calculations, visualizations, and overall
report functionality.
• BI Testing plays a critical role in supporting informed decision-
making processes within organizations.
• A comprehensive BI report testing effort will span multiple testing
disciplines and aim to validate all aspects of the reporting platform.
- 7.
Challenges in BI Testing
• Varying data sources: BI reports often pull data from different sources, requiring careful
validation and integration.
• Complex calculations: Reports may involve complex calculations, aggregations, and derived
metrics, which need to be verified.
• Evolving requirements: As business needs change, BI reports may require frequent updates
and testing to accommodate new requirements.
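As an illustration of verifying a derived metric, the sketch below recomputes a revenue total from detail rows and compares it with the figure a report displays. All values here (`source_rows`, `report_totals`, the region names) are invented for the example; a real test would pull both sides from the actual source system and report.

```python
from decimal import Decimal

# Hypothetical detail rows, as they might be pulled from the underlying sources.
source_rows = [
    {"region": "East", "units": 4, "unit_price": Decimal("20.89")},
    {"region": "East", "units": 2, "unit_price": Decimal("35.50")},
    {"region": "West", "units": 1, "unit_price": Decimal("10.95")},
]

# Hypothetical figures as displayed on the BI report under test.
report_totals = {"East": Decimal("154.56"), "West": Decimal("10.95")}


def recompute_totals(rows):
    """Recompute revenue per region independently of the BI tool."""
    totals = {}
    for row in rows:
        amount = row["units"] * row["unit_price"]
        totals[row["region"]] = totals.get(row["region"], Decimal("0")) + amount
    return totals


def find_mismatches(expected, displayed):
    """Return {region: (expected, displayed)} for every region that differs."""
    regions = set(expected) | set(displayed)
    return {
        region: (expected.get(region), displayed.get(region))
        for region in regions
        if expected.get(region) != displayed.get(region)
    }


mismatches = find_mismatches(recompute_totals(source_rows), report_totals)
print("Mismatches:", mismatches or "none")
```

`Decimal` is used instead of floats so that currency comparisons are exact; that is a design choice of this sketch, not something the slides prescribe.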
- 8.
Business Intelligence Platforms
• A Business Intelligence (BI) platform is a software solution that facilitates the collection, integration,
analysis, and visualization of business data to support the decision-making processes within an
organization.
• It provides a comprehensive suite of tools and functionalities that allow businesses to turn raw data
into actionable insights.
• BI platforms are used to create reports, dashboards, and interactive visualizations that help users
understand trends, patterns, and performance metrics.
- 9.
Business Intelligence Platforms
• Some of the top BI platform vendors include:
o Tableau
o Microsoft Power BI
o MicroStrategy
o SAP Business Objects
o Oracle Business Intelligence
o Qlik Sense & Qlik View
o Looker by Google
o Domo
- 10.
Business Intelligence – Underlying Data
The data that populates a BI report can originate from multiple sources. Typically, an ETL process is
involved, which orchestrates the movement and transformation of this data into a purpose-built data
mart, finely tuned to serve its reporting objectives.
[Diagram: ETL processes move data across several different legs. Labeled components: Source Data, Big Data lake, Data Warehouse, Data Mart, and BI & Analytics, with ETL steps between them. The BI platform extracts data for reports.]
- 11.
Business Intelligence – Underlying Data
• Understanding the various components that contribute to a BI report will help guide the testing
strategy needed to comprehensively validate and ensure the accuracy of these reports.
• The following slides will discuss various components in the ETL architecture that can populate data
within BI reports:
o Data Source
o Data Model
o Data Warehouse
o Data Lake
o Data Mart
o Transactional vs. Analytical Databases
o Dimensional Modeling
o Primary & Foreign Keys
- 12.
What is a Data Source?
• A Data Source is a pool of data available for extraction into a Data Warehouse.
• The concept of the Data Source is technologically neutral – it is not associated with any specific
technology.
• The most common Data Sources are databases, data warehouses, data marts and files.
- 13.
What is a Database?
• A database is a collection of information structured to provide efficient retrieval.
• Databases are built to handle large quantities of data by recording, storing, retrieving,
and managing that information.
- 14.
What is a Data Model?
• A Data Model is a type of model that defines the logical structure of a database and
determines the way data is stored, organized, and manipulated.
• The most widely used data model is the relational model, which is the one used in this
course.
- 15.
What is a Data Warehouse?
• A collection of data or information intended to support business decision
making.
• Data Warehouses contain a wide variety of data that present a coherent
picture of business conditions.
• A Data Warehouse is a huge repository of electronically organized data mainly
meant for the purpose of reporting and analysis.
• Most Data Warehouses are sent data from multiple sources (Databases and
Files).
• A place where historical data is stored for archival, analysis and security
purposes.
[Diagram: Legacy DB, CRM/ERP DB and Finance DB feeding the Data Warehouse]
- 16.
What is a Data Lake?
• A vast collection of data stored in its raw/unfiltered format.
• Typically used as a single store of data which can include
structured and unstructured data.
• Can be established “on premises” but is usually managed in the
“cloud” due to the massive volume of data being stored.
- 17.
Transactional database vs. Analytical database?
• A Transactional or operational database is used for transaction processing such as in order entry
systems, customer service applications, and inventory control programs.
• An analytical database stores historical data on business metrics like inventory levels and sales
performance numbers.
o The information is updated on a regular basis (an incremental update) to incorporate recent
transaction data from an organization’s operational systems.
o Incremental updates from source transactional databases generally occur daily, weekly or
monthly.
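One common way to implement an incremental update is to copy only the rows that are newer than what the analytical side already holds. The sketch below shows that idea with an in-memory SQLite database. The table names (`orders`, `orders_history`) and the use of the latest loaded date as a cut-off are assumptions made for illustration, not a description of any particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical transactional table and its analytical (historical) copy.
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, units INTEGER, order_date TEXT);
    CREATE TABLE orders_history (order_id INTEGER PRIMARY KEY, units INTEGER, order_date TEXT);
    INSERT INTO orders VALUES (1, 4, '2020-01-04'), (2, 11, '2020-01-07'), (3, 2, '2020-01-08');
    INSERT INTO orders_history VALUES (1, 4, '2020-01-04');
""")


def count_history(conn):
    return conn.execute("SELECT COUNT(*) FROM orders_history").fetchone()[0]


def incremental_load(conn):
    """Copy only rows newer than the latest date already loaded; return rows added."""
    before = count_history(conn)
    (cutoff,) = conn.execute(
        "SELECT COALESCE(MAX(order_date), '') FROM orders_history"
    ).fetchone()
    conn.execute(
        "INSERT INTO orders_history SELECT * FROM orders WHERE order_date > ?",
        (cutoff,),
    )
    conn.commit()
    return count_history(conn) - before


first_run = incremental_load(conn)   # picks up the two newer orders
second_run = incremental_load(conn)  # nothing new to load
print(first_run, second_run)
```

From a testing point of view, the useful checks are that a rerun adds nothing and that the row counts on both sides agree after the load.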
- 18.
[Diagram: Transactional vs. Analytical. Transactional side: Inventory (‘We have 212 Widgets in the east warehouse’), Customer Service (‘The paint came off my widget’), Advertising (‘Running a new radio ad today’). Analytical side: Data Warehouse, Data Marts, BI Tools.]
- 19.
What is Dimensional Modeling?
• ‘Dimensional modeling’ refers to a commonly used set of concepts to define how Data
Warehouse databases should be structured.
• Key dimensional concepts include the notions of Fact tables and Dimension tables.
• Fact Table - A fact table consists of the measurements, metrics, events or facts of a business
process.
Example: Customer X ordered a widget Y on Monday at 7:35PM.
• Dimension Table - A dimension table is a complete library of reference information about a
measurable event.
Example: Common dimensions are the entire set of products sold by a company, or the
entire set of customers of the company, and all their relevant attributes.
- 20.
• The Item_ID, Date_ID and Customer_ID fact keys from the Order Fact Table are joined
by keys to the Item, Date and Customer Dimension tables.
Order Fact Table
Item_ID | Date_ID | Customer_ID | Units
76      | 4       | 2           | 4
411     | 8       | 3           | 2
5       | 8       | 1           | 1
5       | 7       | 1           | 11

Customer Dimension Table
Customer_ID | First_Name | Last_Name | City
1           | Chris      | Thompson  | Wayne
2           | Fred       | Olson     | Atlanta
3           | Billy      | Blau      | New York

Item Dimension Table
Item_ID | Product     | SKU   | Price
5       | 1 L Bottle  | 1834G | 10.95
76      | 10 L Bottle | 5843J | 20.89
411     | 50 L Bottle | 7323S | 35.50

Date Dimension Table
Date_ID | Date       | Month   | Fiscal Yr
4       | 01-04-2020 | January | F2012
7       | 01-07-2020 | January | F2012
8       | 01-08-2020 | January | F2012
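To make the fact-to-dimension joins concrete, here is a minimal sketch that loads the sample rows from this slide into an in-memory SQLite database and produces a per-customer summary, much as a BI report query might. The lowercase table and column names are my own, prices are stored as integer cents (an assumption made to avoid floating-point rounding), and the Date dimension is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_dim (customer_id INTEGER PRIMARY KEY, first_name TEXT,
                               last_name TEXT, city TEXT);
    CREATE TABLE item_dim (item_id INTEGER PRIMARY KEY, product TEXT, sku TEXT,
                           price_cents INTEGER);
    CREATE TABLE order_fact (item_id INTEGER, date_id INTEGER, customer_id INTEGER,
                             units INTEGER);

    INSERT INTO customer_dim VALUES
        (1, 'Chris', 'Thompson', 'Wayne'),
        (2, 'Fred', 'Olson', 'Atlanta'),
        (3, 'Billy', 'Blau', 'New York');
    INSERT INTO item_dim VALUES
        (5, '1 L Bottle', '1834G', 1095),
        (76, '10 L Bottle', '5843J', 2089),
        (411, '50 L Bottle', '7323S', 3550);
    INSERT INTO order_fact VALUES
        (76, 4, 2, 4), (411, 8, 3, 2), (5, 8, 1, 1), (5, 7, 1, 11);
""")

# Join the fact table to two of its dimensions and aggregate per customer.
REPORT_SQL = """
    SELECT c.first_name || ' ' || c.last_name AS customer,
           SUM(f.units)                 AS total_units,
           SUM(f.units * i.price_cents) AS revenue_cents
    FROM order_fact AS f
    JOIN customer_dim AS c ON c.customer_id = f.customer_id
    JOIN item_dim     AS i ON i.item_id = f.item_id
    GROUP BY c.customer_id, c.first_name, c.last_name
    ORDER BY c.customer_id
"""

report_rows = conn.execute(REPORT_SQL).fetchall()
for customer, total_units, revenue_cents in report_rows:
    print(f"{customer}: {total_units} units, {revenue_cents / 100:.2f}")
```

A report tester can run a query like this directly against the data mart and compare the result with what the BI tool displays for the same customers.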
- 21.
What are Primary and Foreign Keys?
• In a database, the primary key is a column (or set of columns) that uniquely identifies each record or row.
• The primary key values of one row cannot be duplicated by any other row.
• A foreign key is a field in one table that refers to the primary key of a row in another table.
• A foreign key is used to establish and enforce a link between the data in two
tables.
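The sketch below shows both constraints being enforced by an in-memory SQLite database: a duplicate primary key and a foreign key pointing at a non-existent row are rejected, while a valid row is accepted. The `customer` and `orders` tables and the `try_insert` helper are hypothetical names for this example. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; other database engines behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        last_name   TEXT
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        units       INTEGER
    );
    INSERT INTO customer VALUES (1, 'Thompson');
    INSERT INTO orders VALUES (100, 1, 4);
""")


def try_insert(sql, params):
    """Attempt an insert and report whether the database accepted it."""
    try:
        conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Same primary key as an existing customer row.
duplicate_pk = try_insert("INSERT INTO customer VALUES (?, ?)", (1, "Olson"))
# Foreign key 99 does not match any customer row.
orphan_fk = try_insert("INSERT INTO orders VALUES (?, ?, ?)", (101, 99, 2))
# Valid primary key and valid foreign key.
valid_row = try_insert("INSERT INTO orders VALUES (?, ?, ?)", (102, 1, 2))

print(duplicate_pk, orphan_fk, valid_row, sep="\n")
```

Analytical databases often load data without enforced constraints, so testers frequently run the equivalent checks as queries (for example, looking for fact rows whose key has no matching dimension row).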
- 22.
What is ETL?
• In computing, the term Extract, Transform and Load (ETL) refers to a data
handling process with three steps:
o Extract data from outside sources.
o Transform the data to fit operational or reporting needs.
o Load the data into the endpoint target (usually a database, more specifically a
Data Warehouse or Data Mart).
• Why ETL? Businesses need to load the Data Warehouse regularly
(for example, in daily or weekly increments) so that it can serve its purpose of supporting
business intelligence reports.
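The three steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the source is an inline CSV string (`SOURCE_CSV`), the target is an in-memory SQLite table named `dw_orders`, and the transformations (trimming and title-casing names, converting MM/DD/YYYY dates to ISO format) are invented examples of reporting-oriented cleanup.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would be a file or a source database.
SOURCE_CSV = """order_id,customer,units,order_date
1, chris thompson ,4,01/04/2020
2,FRED OLSON,2,01/08/2020
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean names, cast numbers, convert dates to YYYY-MM-DD."""
    cleaned = []
    for row in rows:
        month, day, year = row["order_date"].split("/")
        cleaned.append((
            int(row["order_id"]),
            row["customer"].strip().title(),
            int(row["units"]),
            f"{year}-{month}-{day}",
        ))
    return cleaned


def load(conn, rows):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dw_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, units INTEGER, order_date TEXT)"
    )
    conn.executemany("INSERT INTO dw_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded_rows = conn.execute("SELECT * FROM dw_orders ORDER BY order_id").fetchall()
print(loaded_rows)
```

Each transformation rule is also a test target: a tester would confirm that the values in the target match the source values after the documented rule is applied.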
- 23.
[Diagram: Source Data (Legacy DB, CRM/ERP DB, Finance DB) -> ETL Process (Extract, Transform, Load) -> Target DWH]
- 24.
ETL Pipelines
• An ETL pipeline is a series of processes, or a data workflow, used to collect, process, and transfer data
from various sources to a destination, typically a data warehouse, database, data lake or data mart.
• ETL pipelines are a fundamental component of data integration and are widely used in business
intelligence, data analytics, and data warehousing.
• An ETL pipeline orchestrator is a tool or system that manages and coordinates the ETL processes in
data integration workflows.
• Popular ETL pipeline orchestrators include tools like Apache NiFi, Apache Airflow, Talend, Informatica,
and Microsoft Azure Data Factory. These tools simplify the management of complex ETL workflows and
enhance the efficiency of data integration processes.
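At its core, an orchestrator runs tasks in an order that respects their dependencies. The sketch below illustrates only that idea using Python's standard-library `graphlib` (Python 3.9+); it does not use or imitate the API of any of the tools named above, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

run_log = []


def make_task(name):
    """Build a placeholder task that only records that it ran."""
    def task():
        run_log.append(name)
    return task


# Hypothetical pipeline: each task maps to the set of tasks that must finish first.
dependencies = {
    "extract_sources": set(),
    "load_warehouse": {"extract_sources"},
    "build_data_mart": {"load_warehouse"},
    "refresh_reports": {"build_data_mart"},
}
tasks = {name: make_task(name) for name in dependencies}


def run_pipeline(dependencies, tasks):
    """Run every task once, never before the tasks it depends on."""
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()


run_pipeline(dependencies, tasks)
print(" -> ".join(run_log))
```

Real orchestrators add scheduling, retries, logging and alerting on top of this ordering logic.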
- 25.
What is a Data Mart?
• A Data Mart is the access layer of the Data Warehouse architecture that serves for efficient
reporting.
• The Data Mart is a subset of the Data Warehouse that is usually oriented to a specific business
line or team.
• A Data Mart usually aggregates standard information from the Data Warehouse for reporting
efficiency.
• A Data Mart can be a slice of the Data Warehouse; for example, a Data Mart might pertain to a
single department.
[Diagram: Data Warehouse -> Data Mart]
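The aggregation a Data Mart performs, and a basic reconciliation test against the warehouse, can be sketched as follows. The tables `dw_sales` and `mart_sales_by_department`, and their contents, are hypothetical examples built in an in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical warehouse table holding granular sales rows.
    CREATE TABLE dw_sales (sale_id INTEGER PRIMARY KEY, department TEXT, units INTEGER);
    INSERT INTO dw_sales VALUES
        (1, 'Retail', 4), (2, 'Retail', 2), (3, 'Wholesale', 11), (4, 'Wholesale', 1);

    -- Hypothetical data mart: one aggregated row per department.
    CREATE TABLE mart_sales_by_department AS
        SELECT department, SUM(units) AS total_units, COUNT(*) AS sale_count
        FROM dw_sales
        GROUP BY department;
""")

mart_rows = conn.execute(
    "SELECT department, total_units, sale_count "
    "FROM mart_sales_by_department ORDER BY department"
).fetchall()

# Reconciliation check: grand totals in the mart must equal those in the warehouse.
dw_total = conn.execute("SELECT SUM(units) FROM dw_sales").fetchone()[0]
mart_total = conn.execute(
    "SELECT SUM(total_units) FROM mart_sales_by_department"
).fetchone()[0]
totals_match = dw_total == mart_total

print(mart_rows, "totals match:", totals_match)
```

Because BI reports read from the mart, a mismatch at this level would surface in every report built on it.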
- 26.
Business Intelligence (BI) & Data Marts
A Data Mart is a database that has the same characteristics as a data warehouse but is usually smaller and
focused on the data for one division or one workgroup within an enterprise.
It will typically hold aggregated data and some granular data. It is a subset of the DW, which
makes it more efficient for Business Intelligence reporting. BI tools sit on top of the data marts.
[Diagram: Source Data (Legacy DB, CRM/ERP DB, Finance DB) -> ETL Process -> Target DW -> ETL Process -> Data Mart]
- 27.
Sample BI Reports generated from Data Mart
[Diagram: Source Data (Legacy DB, CRM/ERP DB, Finance DB) -> ETL Process -> Target DW -> ETL Process -> Data Mart -> sample BI reports]
- 28.
Summary
• Business Intelligence (BI) involves software applications used in collecting, displaying, and analyzing
business data.
• BI reports are used to make strategic decisions, which can have negative consequences if the underlying data is
incorrect.
• BI Testing refers to the process of ensuring the accuracy, reliability, and functionality of BI reports and
involves validating data, calculations, visualizations, and overall report functionality.
• The data that populates a BI report can originate from multiple sources and typically passes through
an ETL process.
• The components and concepts covered in this module include:
o Data Source
o Data Model
o Data Warehouse
o Data Lake
o Data Mart
o Transactional Databases
o Analytical Databases
o Dimensional Modeling
o Primary & Foreign Keys
o ETL Pipeline Orchestrators