UNIT 1
BUSINESS INTELLIGENCE
Introduction
Business Intelligence (BI) is a set of processes, architectures, and technologies that
convert raw data into meaningful information that drives profitable business
actions. It is a suite of software and services to transform data into actionable
intelligence and knowledge.
BI has a direct impact on an organization’s strategic, tactical and operational business
decisions. BI supports fact-based decision making using historical data rather than
assumptions and gut feeling.
BI tools perform data analysis and create reports, summaries, dashboards, maps,
graphs, and charts to provide users with detailed intelligence about the nature of
the business.
Importance of BI
• Measurement: creating KPIs (Key Performance Indicators) based on historical
data.
• Benchmarking: identifying and setting benchmarks for varied processes.
• With BI systems, organizations can identify market trends and spot business
problems that need to be addressed.
• BI helps with data visualization, which makes data easier to interpret and thereby
improves the quality of decision making.
• BI systems can be used not just by large enterprises but also by SMEs (Small and
Medium Enterprises).
Implementation of BI
Here are the steps:
Step 1) Raw data is extracted from corporate databases. The data could be spread
across multiple heterogeneous systems.
Step 2) The data is cleaned, transformed and loaded into the data warehouse. Tables can
be linked, and data cubes are formed.
Step 3) Using the BI system, the user can ask queries, request ad hoc reports or conduct
any other analysis.
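The three steps above can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the source rows, the table name sales and the
helper functions are hypothetical, and an in-memory SQLite database stands in for the
data warehouse.

```python
import sqlite3

# Hypothetical raw rows standing in for extracts from two source systems.
crm_rows = [("north", " 120.50 "), ("south", "80")]
erp_rows = [("North", "99.50"), ("SOUTH", None)]  # None marks a missing amount


def extract():
    """Step 1: pull raw rows from every source system."""
    return crm_rows + erp_rows


def transform(rows):
    """Step 2a: clean the rows (drop missing amounts, normalize text and types)."""
    cleaned = []
    for region, amount in rows:
        if amount is None:
            continue
        cleaned.append((region.strip().lower(), float(amount)))
    return cleaned


def load(rows):
    """Step 2b: load the cleaned rows into an in-memory 'warehouse' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def revenue_by_region(conn):
    """Step 3: an ad hoc query a BI user might ask."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


warehouse = load(transform(extract()))
print(revenue_by_region(warehouse))  # [('north', 220.0), ('south', 80.0)]
```

In a real deployment each step would usually be handled by dedicated tools; the
point here is only the order of the steps.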
Examples of Business Intelligence Systems Used in Practice
Example 1:
In an Online Transaction Processing (OLTP) system, the information that could be fed
into the product database could be:
• add a product line
• change a product price
Correspondingly, in a Business Intelligence system, a query that would be executed for
the product subject area could be: did the addition of the new product line or the
change in product price increase revenues?
In the advertising database of an OLTP system, the transactions that could be executed are:
• change in advertisement options
• increase in radio budget
Correspondingly, in a BI system, a query that could be executed would be: how many new
clients were added due to the change in radio budget?
In an OLTP system dealing with customer demographic databases, the data that could be
fed would be:
• increase in customer credit limit
• change in customer salary level
Correspondingly, in the OLAP system, a query that could be executed would be: can
customer profile changes support a higher product price?
Example 2:
A hotel owner uses BI analytical applications to gather statistical information
regarding average occupancy and room rate. This helps to find the aggregate revenue
generated per room.
The owner also collects statistics on market share and data from customer surveys from
each hotel to decide its competitive position in various markets.
Analyzing these trends year by year, month by month and day by day helps
management to offer discounts on room rentals.
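As a rough illustration of the hotel statistics, the sketch below computes occupancy,
average room rate and revenue per available room. The nightly figures and the
function name hotel_stats are hypothetical, and the formulas are the usual
definitions rather than anything prescribed by a particular BI tool.

```python
# Hypothetical nightly figures: (rooms_available, rooms_sold, room_revenue)
nights = [
    (100, 80, 8000.0),
    (100, 60, 5400.0),
    (100, 100, 12000.0),
]


def hotel_stats(nights):
    """Aggregate occupancy, average rate and revenue per available room."""
    available = sum(n[0] for n in nights)
    sold = sum(n[1] for n in nights)
    revenue = sum(n[2] for n in nights)
    return {
        "occupancy": sold / available,            # share of rooms sold
        "average_rate": revenue / sold,           # revenue per room sold
        "revenue_per_room": revenue / available,  # revenue per available room
    }


stats = hotel_stats(nights)
print(stats)
```

Running the same calculation per year, month or day gives the trend view that the
example describes.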
Example 3:
A bank gives branch managers access to BI applications. This helps branch managers
to determine who the most profitable customers are and which customers they
should work on.
The use of BI tools frees information technology staff from the task of generating
analytical reports for the departments. It also gives department personnel access to
a richer data source.
Types of BI users
The following are the four key types of users of a Business Intelligence system:
1. The Professional Data Analyst:
The data analyst is a statistician who always needs to drill deep down into data. A BI
system helps them to get fresh insights to develop unique business strategies.
2. The IT users:
The IT user also plays a dominant role in maintaining the BI infrastructure.
3. The head of the company:
A CEO or CXO can increase the profit of the business by using BI to improve
operational efficiency.
4. The Business Users:
Business intelligence users can be found across the organization. There are
mainly two types of business users:
1. The casual business intelligence user
2. The power user
The difference between them is that a power user has the capability of
working with complex data sets, while the casual user's needs are met by
dashboards that evaluate predefined sets of data.
Advantages of Business Intelligence
Here are some of the advantages of using Business Intelligence System:
1. Boost productivity
With a BI program, it is possible for businesses to create reports with a single click,
which saves a lot of time and resources. It also allows employees to be more
productive in their tasks.
2. Improve visibility
BI also helps to improve the visibility of business processes and makes it possible to
identify any areas which need attention.
3. Fix accountability
A BI system assigns accountability in the organization, as there must be someone who
owns accountability for the organization’s performance
against its set goals.
4. It gives a bird’s eye view:
A BI system also helps organizations, as decision makers get an overall bird’s eye view
through typical BI features like dashboards and scorecards.
5. It streamlines business processes:
BI takes out much of the complexity associated with business processes. It also automates
analytics by offering predictive analysis, computer modeling, benchmarking and
other methodologies.
6. It allows for easy analytics:
BI software has democratized its usage, allowing even nontechnical or non-analyst
users to collect and process data quickly. This also puts the power of
analytics in the hands of many people.
BI System Disadvantages
1. Cost:
Business intelligence can prove costly for small as well as for medium-sized
enterprises. The use of such a system may be expensive for routine business
transactions.
2. Complexity:
Another drawback of BI is the complexity of implementing the data warehouse. It
can be so complex that it makes business processes rigid to deal with.
3. Limited use:
Like many advanced technologies, BI was first established with the buying power of
rich firms in mind. Therefore, BI systems are still not affordable for many
small and medium-sized companies.
4. Time-consuming implementation:
It can take almost one and a half years for a data warehousing system to be completely
implemented. Therefore, it is a time-consuming process.
Trends in Business Intelligence
The following are some business intelligence and analytics trends that you should
be aware of.
Artificial Intelligence: Gartner’s report indicates that AI and machine learning now
take on complex tasks previously done by human intelligence. This capability is being
leveraged to come up with real-time data analysis and dashboard reporting.
Collaborative BI: BI software combined with collaboration tools, including social
media and other recent technologies, enhances the working and sharing by teams for
collaborative decision making.
Embedded BI: Embedded BI allows the integration of BI software or some of its
features into another business application to enhance and extend its
reporting functionality.
Cloud Analytics: BI applications will soon be offered in the cloud, and more
businesses will be shifting to this technology. According to some predictions, within a
couple of years the spending on cloud-based analytics will grow 4.5 times faster.
Gap between business and IT professionals
The key to bridging the gap between business and IT professionals is
communication. Both groups will need to learn to communicate with each other in
order to understand each other’s needs and goals. By doing so, they can work
together more effectively to create solutions that meet the needs of both businesses
and users.
The Benefits of Bridging the Gap Between Business and IT
As businesses rely increasingly on technology, it’s more important than ever for the
business and IT departments to work closely together. By bridging the gap between
these two groups, businesses can reap a number of benefits, including:
1. Increased Efficiency
When business and IT professionals work together closely, they can share
information and ideas more easily, which can lead to increased efficiency and
productivity.
2. Improved Communication
Close partnerships between business and IT can help improve communication
between the two groups, which can help prevent misunderstandings and
miscommunications.
3. Better Decision-making
When business and IT professionals share information and ideas openly, they can
make better decisions about how to use technology to meet business goals.
4. Greater Innovation
Partnerships between business and IT can lead to greater innovation as the two
groups explore new ways to use technology to benefit the business.
5. Enhanced customer service
Close collaboration between business and IT can help improve customer service by
ensuring that technology is being used in the most effective way possible to meet
customer needs.
How to Approach a Discussion About the Gap Between Business and IT
The gap between business and IT professionals can be difficult to overcome, but it is
important to try. One way to approach this discussion is by first understanding the
different perspectives of each group. Business professionals focus on the bottom
line and outcomes, while IT professionals focus on the details and implementation.
It is important to find common ground between these two perspectives in order to
have a productive discussion.
Some ways to overcome the gap between business and IT professionals include:
-Encourage open communication
Make sure that both groups feel comfortable communicating with each other. This
can be done by setting up regular meetings or check-ins, as well as creating
channels for communication outside of formal meetings (e.g., Slack, KalamTime
groups, email, etc.).
-Educate each other
Take some time to educate each other on the different priorities and goals of each
group. This will help everyone understand where the other is coming from and what
they are trying to accomplish.
-Work together on projects
Collaborating on projects is a great way to bridge the gap between business and IT
professionals. By working together, both groups can learn about the strengths and
weaknesses of the other and how they can best complement each other.
UNIT 2
Business intelligence architecture components and diagram
A BI architecture can be deployed in an on-premises data center or in the cloud. In
either case, it contains a set of core components that collectively support the
different stages of the BI process from data collection, integration, data storage and
analysis to data visualization, information delivery and the use of BI data in
business decision-making.
This shows the various technologies used to run BI and analytics applications.
The core components of a BI architecture include the following:
Source systems. These are all of the systems that capture and hold the
transactional and operational data identified as essential for the enterprise BI
program. For example, this can include enterprise resource planning, customer
relationship management, flat files, application programming interfaces, finance,
manufacturing and supply chain management systems as well as secondary
sources, such as market data and customer databases from outside information
providers. As a result, both internal and external data sources are often
incorporated into a BI architecture.
Important criteria in the data source selection process include data relevancy, data
currency, data quality and the level of detail in the available data sets. In addition, a
combination of structured, semi-structured and unstructured data types might be
required to meet the data analysis and decision-making needs of executives and
other end users.
Data integration and cleansing tools. To effectively analyze the collected data for a
BI program, an organization must integrate and consolidate different data sets to
create unified views of them. The most widely used data integration technology for
BI applications is extract, transform and load (ETL) software, which pulls data from
source systems in batch processes. A variant of ETL is extract, load and transform (ELT), a
technology in which data is extracted and loaded as-is and transformed later for
specific BI uses. Other methods include real-time data integration, such as change
data capture and streaming integration to support real-time analytics applications,
and data virtualization, which combines data from different source systems
virtually.
A BI architecture typically also includes data profiling and data cleansing tools that
are used to identify and fix data quality issues. They help BI and data management
teams provide clean, consistent data that's suitable for BI uses.
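A minimal sketch of what data profiling and cleansing can look like is given below.
The records, column names and the rule that a repeated id marks a duplicate are
assumptions made for illustration; real profiling tools apply far richer checks.

```python
from collections import Counter

# Hypothetical customer records with typical quality problems.
records = [
    {"id": 1, "email": "ann@example.com", "country": "US"},
    {"id": 2, "email": None, "country": "us "},
    {"id": 2, "email": None, "country": "us "},  # repeated id
    {"id": 3, "email": "bob@example.com", "country": None},
]


def profile(rows):
    """Profiling: count missing values per column and list repeated ids."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None:
                missing[column] += 1
    id_counts = Counter(row["id"] for row in rows)
    duplicates = sorted(i for i, n in id_counts.items() if n > 1)
    return {"missing": dict(missing), "duplicate_ids": duplicates}


def cleanse(rows):
    """Cleansing: keep the first row per id and standardize the country code."""
    seen, cleaned = set(), []
    for row in rows:
        if row["id"] in seen:
            continue
        seen.add(row["id"])
        country = row["country"]
        fixed = country.strip().upper() if country else "UNKNOWN"
        cleaned.append(dict(row, country=fixed))
    return cleaned


print(profile(records))
print(cleanse(records))
```

Profiling only reports the problems; cleansing applies the agreed fixes before the
data is loaded for BI use.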
Analytics data stores. This encompasses the various repositories where BI data is
stored and managed. The primary repository is a data warehouse, which usually
stores structured data in a relational, columnar or multidimensional database and
makes it available for querying and analysis. An enterprise data warehouse can also
be tied to smaller data marts set up for individual departments and business units
with data that's specific to their BI needs.
In addition, BI architectures often include an operational data store (ODS) that's an
interim repository for data before it goes into a data warehouse. An ODS can also be
used to run analytical queries against recent transaction data. Depending on the
size of a BI environment, a data warehouse, data mart and an ODS can be deployed
on a single database server or separate business intelligence systems.
A data lake running on a Hadoop cluster or other big data platform can also be
incorporated into a BI architecture as a repository for raw data of various types. The
data can be analyzed in the data lake itself or filtered and loaded into a data
warehouse for analysis. A well-planned architecture should specify which of the
different data stores is best suited for particular BI uses.
BI and data visualization tools. The tools used to analyze data and present
information to business users include a suite of technologies that can be built into a
BI architecture -- for example, ad hoc query, data mining and online analytical
processing software. In addition, the growing adoption of self-service BI tools
enables business analysts and managers to run queries themselves instead of
relying on the members of the BI team to do that for them.
BI software also includes data visualization tools that can be used to create
graphical representations of data in the form of charts, graphs and other types of
visualizations designed to illustrate trends, patterns and outlier elements in data
sets.
Dashboards, portals and reports. These information delivery tools give users
visibility into the results of BI and analytics applications with built-in data
visualizations and, often, self-service capabilities to do additional data analysis. For
example, BI dashboards and online portals can be designed to provide real-time
data access with configurable views and give users the ability to drill down into
data. Reports tend to present data in a more static format.
Other components that increasingly are part of a BI architecture include data
preparation software used to structure and organize data for analysis and a
metadata repository, a business glossary and a data catalog, which can help users
find relevant data and understand its lineage and meaning.
BI architecture tools
BI architecture tools facilitate the centralization of data collection as well as data
analysis and visualization. These tools play an integral role in empowering
businesses to make informed decisions and extract insights from extensive data
sets.
Some examples of BI tools on the market include the following:
1. Datapine. Datapine lets users access, view, analyze and share their company
data on a single analytics platform. Users can perform data analysis, create
interactive business dashboards and obtain new business insights through a
simple drag-and-drop interface.
2. Domo. The Domo cloud-based platform unifies data, systems and people for
seamless business operations. It provides enterprise tools for data aggregation,
analytics, dashboards and reporting for organizations looking to maximize data
value.
3. Dundas BI. This enterprise-level BI tool lets users create and customize
interactive dashboards and reports. The software can either act as a central data
hub or integrate into existing websites for customized BI capabilities.
4. GoodData. As part of the GoodData platform, this tool offers an enterprise-level
option for data analytics and business intelligence. It helps users analyze data
coming from multiple sources and create reports.
5. Infor Birst. Infor Birst is a cloud-based platform that uses a networked
approach and modern enterprise-class architecture with a focus on multi-
tenancy. Birst ensures that a company's data remains connected by centralizing
both decentralized and centralized data.
6. Microsoft Power BI. Users can run analytics either in the cloud or in a reporting
server. The tool comes with built-in artificial intelligence features and offers end-
to-end encryption features.
7. Oracle Business Intelligence. This integrated set of tools lets users gather,
store, analyze and report data for smart decision-making. In addition, it includes
a scalable BI server, dashboards, a content library, web-based reporting and
analytics tools.
8. SAS Business Intelligence. This collection of tools lets corporate users conduct
self-service analytics. Its two components -- Enterprise Business Intelligence and
Business Visualization -- provide interactive visualizations and analytics to aid
with data analysis and decision-making.
9. Tableau. In addition to data visualization features, this tool offers live visual
analytics and supports most databases and numerous data sources.
10. Zoho Analytics. This self-service BI and data analytics software lets users
analyze data, generate data visualizations and uncover insights quickly and
easily. This tool is accessible to both small and large-sized organizations.
Online Analytical Processing (OLAP) – Definition, Architecture and Functionality
• The OLAP Council (1997) defines Online Analytical Processing (OLAP) as a group
of decision support systems that facilitate fast, consistent and interactive
access to information that has been reformulated, transformed and
summarized from relational datasets, mainly from a data warehouse, into
Multi-Dimensional Databases.
• OLAP has the ability to analyze large amounts of data for the extraction of
valuable information. The analysis can serve the business, education or
medical sectors. OLAP enables the discovery of patterns and relationships contained in
business activity by querying large volumes of data from multiple database source
systems at one time.
• Processing database information using OLAP requires an OLAP server to
organize, transform and build a Multi-Dimensional Database (MDDB).
The MDDB is then separated into cubes for client OLAP tools to perform data
analysis, which aims to discover new patterns and relationships between the cubes.
• A data warehouse stores and manages data, while OLAP transforms data
warehouse datasets into strategic information. OLAP functions range from
basic navigation and browsing (often known as “slice and dice”), to
calculations and also serious analysis such as time series and complex
modelling.
• As decision-makers implement more advanced OLAP capabilities, they move
from basic data access to creation of information and to discovering of new
knowledge.
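To make the idea of summarizing warehouse data into a cube more concrete, here is a
small sketch in Python. The rows, the three dimensions and the "ALL" marker for
"summed over this dimension" are assumptions chosen for illustration; an actual OLAP
server uses its own storage structures.

```python
from collections import defaultdict
from itertools import product

# Hypothetical warehouse rows: (year, region, product, sales)
rows = [
    (2023, "north", "laptop", 100),
    (2023, "south", "laptop", 150),
    (2024, "north", "phone", 200),
    (2024, "north", "laptop", 50),
]


def build_cube(rows):
    """Pre-aggregate sales for every combination of dimension values,
    using "ALL" to mean 'summed over this dimension'."""
    cube = defaultdict(int)
    for year, region, item, sales in rows:
        for key in product((year, "ALL"), (region, "ALL"), (item, "ALL")):
            cube[key] += sales
    return dict(cube)


cube = build_cube(rows)
print(cube[("ALL", "ALL", "ALL")])   # grand total: 500
print(cube[(2024, "north", "ALL")])  # 2024, north, all products: 250
```

Because every total is computed once up front, later lookups are plain dictionary
reads, which is the trade-off behind fast cube queries.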
CHALLENGES IN OLAP
• Data quality and consistency: Data sources may have different formats,
standards, definitions, and levels of granularity, which can cause
discrepancies and errors in the cube. For example, different databases may
use different currencies, date formats, or units of measurement.
• Cube size and performance: As the amount and complexity of data
increases, so does the size of the cube, which can affect the storage space,
processing time, and query speed.
• Cube design and maintenance: The cube design involves choosing the
dimensions, measures, hierarchies, and calculations that best suit the
analytical needs and goals of the users. The cube design also affects the
usability, flexibility, and scalability of the cube. However, designing a cube
that meets all the requirements and expectations can be difficult and costly,
especially if the data sources or business rules change frequently.
• Cube security and access: The cube may contain sensitive or confidential
information that needs to be protected from unauthorized or inappropriate
use. The cube may also have different types of users with different roles and
permissions, such as administrators, analysts, or managers. To ensure the
security and access of the cube, some measures such as encryption,
authentication, authorization, and auditing can be used. However, these
measures also have implications for the complexity, performance, and
usability of the cube.
• Cube compatibility and integration: The cube may need to interact with
various data sources, such as relational databases, data warehouses, or web
services. The cube may also need to support various analytical tools, such as
reporting, dashboarding, or visualization software. To ensure the
compatibility and integration of the cube, some standards, protocols, and
interfaces can be used, such as XMLA, MDX, or OLE DB for OLAP. However,
these standards and protocols also have limitations, such as complexity,
performance, or functionality.
• Cube adoption and usage: The cube may have a steep learning curve, as it
requires some technical skills and knowledge to understand and manipulate
the cube data and functionality. The cube may also have a low user
satisfaction, as it may not meet the user expectations or preferences in terms
of usability, flexibility, or relevance.
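The data quality and consistency challenge can be illustrated with a short sketch
that brings two sources to one date format and one currency before they reach the
cube. The exchange rates, the rows and the function name standardize are
hypothetical.

```python
from datetime import datetime

# Assumed exchange rates to USD; real rates would come from a reference table.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.1}

# Hypothetical rows from two sources with different date formats and currencies.
source_a = [("2024-03-01", 100.0, "USD")]  # year-month-day
source_b = [("01/03/2024", 200.0, "EUR")]  # day/month/year


def standardize(row, date_format):
    """Return (ISO date, amount in USD) for one source row."""
    raw_date, amount, currency = row
    date = datetime.strptime(raw_date, date_format).date()
    return (date.isoformat(), round(amount * RATES_TO_USD[currency], 2))


unified = [standardize(r, "%Y-%m-%d") for r in source_a]
unified += [standardize(r, "%d/%m/%Y") for r in source_b]
print(unified)  # [('2024-03-01', 100.0), ('2024-03-01', 220.0)]
```

Without this step the two rows would appear to fall on different dates and their
amounts could not be added meaningfully.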
Online Analytical Processing (OLAP) Architecture
In contrast to a data warehouse, which is usually based on relational technology,
OLAP uses a multidimensional view to aggregate data to provide rapid access to
strategic information for analysis. There are three types of Online Analytical
Processing (OLAP) architecture, based on the method in which they store
multidimensional data and perform analysis operations on that dataset. The categories
are multidimensional OLAP (MOLAP), relational OLAP (ROLAP) and hybrid OLAP
(HOLAP).
1. In MOLAP, datasets are stored and summarized in a multidimensional cube. The
MOLAP architecture can perform faster than ROLAP and HOLAP. MOLAP
cubes are designed and built for rapid data retrieval to support efficient slicing and
dicing operations. MOLAP can perform complex calculations, which are
pre-generated when the cube is created. MOLAP processing is restricted to the initial
cube that was created and is not bound to any additional replication of the cube.
2. In ROLAP, data and aggregations are stored in relational database tables to
provide the OLAP slicing and dicing functionalities. ROLAP is the slowest among
the OLAP flavors. ROLAP relies on manipulating data directly in the relational
database to give the appearance of conventional OLAP slicing and dicing
functionality. Basically, each slicing and dicing action is equivalent to adding a
“WHERE” clause to the SQL statement. ROLAP can manage large amounts of
data and has no data size limitation beyond that of the underlying relational
database. ROLAP can leverage the intrinsic functionality of a relational database.
ROLAP is slow in performance because each ROLAP activity is essentially a SQL query,
or multiple SQL queries, in the relational database. The query time depends on the
number and complexity of the SQL statements executed and can
be a bottleneck if the underlying dataset is large. ROLAP essentially depends
on generating SQL statements to query the relational database, and SQL does not cater
for all needs, which makes ROLAP technology conventionally limited by what SQL
functionality can offer.
3. HOLAP combines the technologies of MOLAP and ROLAP. Data is stored in
ROLAP relational database tables and the aggregations are stored in a MOLAP
cube. To acquire summary information, HOLAP
leverages cube technology for faster performance, whereas to retrieve detailed
information, HOLAP can drill down from the cube into the underlying relational
data.
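The remark that each ROLAP slicing and dicing action amounts to adding a “WHERE”
clause can be sketched as follows. The table sales, its columns and the helper
rolap_total are hypothetical, and an in-memory SQLite database stands in for the
relational store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (year INTEGER, region TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (2023, "north", "laptop", 100.0),
        (2023, "south", "laptop", 150.0),
        (2024, "north", "phone", 200.0),
        (2024, "north", "laptop", 50.0),
    ],
)


def rolap_total(conn, **filters):
    """Translate each fixed dimension value into one WHERE condition."""
    allowed = {"year", "region", "product"}
    unknown = set(filters) - allowed
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    sql = "SELECT COALESCE(SUM(amount), 0) FROM sales"
    if filters:
        # Column names come from the allowed set; values are bound as parameters.
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in filters)
    return conn.execute(sql, list(filters.values())).fetchone()[0]


print(rolap_total(conn))                               # whole table: 500.0
print(rolap_total(conn, year=2024))                    # slice on year: 250.0
print(rolap_total(conn, year=2024, product="laptop"))  # two dimensions: 50.0
```

Every call runs a fresh SQL query against the base table, which is why ROLAP scales
with the database but pays for each request at query time.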
In Online Analytical Processing (OLAP) architectures (MOLAP, ROLAP and HOLAP),
the datasets are presented in a multidimensional format, which involves the creation of
multidimensional blocks called data cubes. The cube in an OLAP architecture may
have three axes (dimensions), or more. Each axis (dimension) represents a logical
category of data. One axis may, for example, represent the geographic location of
the data, while others may indicate a period of time or a specific school. Each of
the categories can be broken down into successive levels, and it is possible to drill
up or down between the levels.
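Drilling up and down between the levels of a dimension can be sketched with a time
hierarchy of year, month and day. The facts and the function name roll_up are
hypothetical; the sketch only shows how the same measure is aggregated at different
levels.

```python
from collections import defaultdict

# Hypothetical daily facts: (year, month, day, enrolments)
facts = [
    (2024, 1, 10, 5),
    (2024, 1, 11, 7),
    (2024, 2, 3, 4),
    (2025, 1, 5, 9),
]
LEVELS = ("year", "month", "day")


def roll_up(facts, level):
    """Aggregate the measure at the given level of the time hierarchy."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(int)
    for *time_key, value in facts:
        totals[tuple(time_key[:depth])] += value
    return dict(totals)


print(roll_up(facts, "year"))   # drilled up to the coarsest level
print(roll_up(facts, "month"))  # drilled down one level
```

Moving from "year" to "month" to "day" is a drill-down; moving in the opposite
direction is a drill-up.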
Online Analytical Processing (OLAP) partitions are normally stored on an OLAP
server, with the relational database frequently stored on a separate server from
the OLAP server. The OLAP server must query across the network whenever it needs to
access the relational tables to resolve a query. The impact of querying across the
network depends on the performance characteristics of the network itself. Even
when the relational database is placed on the same server as the OLAP server,
inter-process calls and the associated context switching are required to retrieve
relational data. With a MOLAP partition, calls to the relational database, whether
local or over the network, do not occur during querying.
Online Analytical Processing (OLAP) Functionality
Online Analytical Processing (OLAP) functionality offers dynamic multidimensional
analysis, supporting end users with analytical activities that include calculations and
modelling applied across dimensions, trend analysis over time periods, slicing
subsets for on-screen viewing and drilling to deeper levels of records. OLAP is
implemented in a multi-user client/server environment and provides reliably fast
responses to queries, regardless of database size and complexity. OLAP helps the
end user integrate enterprise information through comparative, customized viewing and
analysis of historical and present data in various “what-if” data model scenarios.
This is achieved through use of an OLAP server.
OLAP functionality is provided by an OLAP server. The design and data
structures of an OLAP server are optimized for fast information retrieval in any
orientation and for flexible calculation and transformation of unprocessed data. The
OLAP server may either physically stage the processed multidimensional information to
deliver consistent and fast response times to end users, or it may fill its data
structures in real time from relational databases, or offer a choice of both.
Essentially, OLAP creates information in cube form, which allows more complex
analysis compared to a relational database. OLAP analysis techniques employ ‘slice
and dice’ and ‘drilling’ methods to segregate data into sets of information
depending on given parameters. Slicing selects a single value for one
dimension, producing a subset of the multidimensional array with one dimension fewer.
Dicing applies selections on two or more dimensions of the multidimensional
cube, producing a smaller sub-cube. Drilling allows the end user to traverse between
summarized data and the most detailed data unit.
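Slice and dice can be sketched on a tiny cube held in a Python dictionary. The
cells, the dimension names and the two helper functions are hypothetical and follow
the common definitions of the operations: a slice fixes one dimension to one value,
a dice keeps a sub-cube by restricting several dimensions.

```python
# Hypothetical cells of a small cube: (year, region, product) -> sales
cells = {
    (2023, "north", "laptop"): 100,
    (2023, "south", "laptop"): 150,
    (2024, "north", "phone"): 200,
    (2024, "north", "laptop"): 50,
}
DIMENSIONS = ("year", "region", "product")


def slice_cube(cells, dimension, value):
    """Fix one dimension to a single value and drop it from the keys."""
    i = DIMENSIONS.index(dimension)
    return {k[:i] + k[i + 1:]: v for k, v in cells.items() if k[i] == value}


def dice_cube(cells, **allowed):
    """Keep cells whose value on each named dimension is in the allowed set."""
    def keep(key):
        return all(key[DIMENSIONS.index(d)] in vals for d, vals in allowed.items())
    return {k: v for k, v in cells.items() if keep(k)}


print(slice_cube(cells, "year", 2024))
print(dice_cube(cells, year={2023, 2024}, region={"north"}))
```

The slice returns a two-dimensional result (region, product), while the dice keeps
all three dimensions but fewer cells.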
Multidimensional Database Schema
The base of every data warehouse system is a relational database built using a
dimensional model. A dimensional model consists of fact and dimension tables, which
are arranged as a star schema or a snowflake schema. A schema is a collection of
database objects such as tables, views and indexes.
In designing data models for a data warehouse, the most commonly used schema
types are the star schema and the snowflake schema. In the star schema design, the fact
table sits in the middle and is connected to the surrounding dimension tables like a
star. A star schema can be simple or complex. A simple star consists of one fact
table; a complex star can have more than one fact table.
Most data warehouses use a star schema to represent the multidimensional data
model. In its simple form, the database consists of a single fact table and a single
table for each dimension. Each tuple in the fact table consists of a pointer or foreign
key to each of the dimensions that provide its multidimensional coordinates, and stores
the numeric measures for those coordinates. A tuple consists of a unit of data extracted
from the cube for a range of members from one or more dimension tables. Each
dimension table consists of columns that correspond to attributes of the dimension.
Star schemas do not explicitly provide support for attribute hierarchies, which makes
them less suitable for architectures such as MOLAP that require many hierarchies of
dimension tables for efficient drilling of datasets.
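A minimal star schema can be sketched with SQLite as below. The table and column
names (fact_sales, dim_date, dim_product) and the data are hypothetical; the point
is that each fact row carries one foreign key per dimension plus a numeric measure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE fact_sales (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_date VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO dim_product VALUES (10, 'laptop', 'computers'), (11, 'phone', 'mobile');
INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 11, 200.0), (2, 10, 50.0);
""")


def sales_by_category(conn):
    """Join the fact table to one dimension and aggregate the measure."""
    return conn.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


print(sales_by_category(conn))  # [('computers', 150.0), ('mobile', 200.0)]
```

Note that category is stored directly in dim_product, so the hierarchy from product
to category is implicit rather than modeled as its own table.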
Snowflake schemas provide a refinement of star schemas where the dimensional
hierarchy is explicitly represented by normalizing the dimension tables. The main
advantage of the snowflake schema is the improvement in query performance due to
minimized disk storage requirements and joining smaller lookup tables. The main
disadvantage of the snowflake schema is the additional maintenance effort needed
due to the increased number of lookup tables.
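The normalization step that turns a star dimension into a snowflake can be sketched
as follows. The tables and data are hypothetical: the category attribute is moved
out of the product dimension into its own lookup table, so the query needs one more
join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_category VALUES (1, 'computers'), (2, 'mobile');
INSERT INTO dim_product VALUES (10, 'laptop', 1), (11, 'tablet', 1), (12, 'phone', 2);
INSERT INTO fact_sales VALUES (10, 100.0), (11, 30.0), (12, 200.0);
""")


def sales_by_category(conn):
    """Walk the explicit hierarchy: fact -> product -> category."""
    return conn.execute("""
        SELECT c.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product  AS p ON p.product_id = f.product_id
        JOIN dim_category AS c ON c.category_id = p.category_id
        GROUP BY c.category
        ORDER BY c.category
    """).fetchall()


print(sales_by_category(conn))  # [('computers', 130.0), ('mobile', 200.0)]
```

Each category name is now stored once, at the cost of an additional lookup table to
maintain and join.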
In addition to the fact and dimension tables, data warehouses store selected
summary tables containing pre-aggregated data. In the simplest cases, the pre-
aggregated data corresponds to aggregating the fact table on one or more selected
dimensions. Such pre-aggregated summary data can be represented in the database
in at least two ways. Whether to use a star or a snowflake schema mainly depends on
business needs.
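A summary table holding pre-aggregated data can be sketched like this. The table
names and data are hypothetical, and the example shows only the simplest case
mentioned above, the fact table aggregated on one selected dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_sales (year INTEGER, region TEXT, amount REAL);
INSERT INTO fact_sales VALUES
    (2023, 'north', 100.0), (2023, 'south', 150.0),
    (2024, 'north', 200.0), (2024, 'north', 50.0);
-- Summary table: the fact table aggregated on the year dimension.
CREATE TABLE summary_sales_by_year AS
    SELECT year, SUM(amount) AS total_amount FROM fact_sales GROUP BY year;
""")


def yearly_totals(conn):
    """Read the pre-aggregated totals instead of scanning the fact table."""
    return conn.execute(
        "SELECT year, total_amount FROM summary_sales_by_year ORDER BY year"
    ).fetchall()


print(yearly_totals(conn))  # [(2023, 250.0), (2024, 250.0)]
```

The summary table must be refreshed whenever new facts are loaded, which is the
price paid for faster reads.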
OLAP Evaluation
As OLAP technology takes a prominent place in the data warehouse industry, there
should be a suitable assessment tool to evaluate it. E.F. Codd not only coined the term
OLAP but also provided a set of rules, known as the ‘Twelve Rules’,
for assessing OLAP products:
1. Multidimensional conceptual view. OLAP operates with CUBEs of data that
represent multidimensional construct of data. Event though the name implies three
dimensional data, the number of possible dimensions is practically unlimited.
2. Transparency. OLAP systems should be part of an open system that supports
heterogeneous data sources.
3. Accessibility. The OLAP should present the user with a single logical schema of the
data.
4. Consistent reporting performance. Performance should not degrade as the
number of dimensions in the model increases.
5. Client/server architecture. Should be based on open, modular systems.
6. Generic dimensionality. Not limited to 3-D and not biased toward any particular
dimension. A function applied to one dimension should also be able to be applied to
another.
7. Dynamic sparse-matrix handling. Related both to the idea of nulls in relational
databases and to the notion of compressing large files, a sparse matrix is one in
which not every cell contains data. OLAP systems should accommodate varying
storage and data-handling options.
8. Multiuser support. OLAP systems should support more than one user at a time.
9. Unrestricted cross-dimensional operations. Similar to rule of generic
dimensionality; all dimensions are created equal, and operations across data
dimensions should not restrict relationships between cells.
10. Intuitive data manipulation. Ideally, users shouldn’t have to use menus or
perform complex multiple-step operations when an intuitive drag-and-drop action
will do.
11. Flexible reporting. Save a tree. Users should be able to print just what they
need, and any changes to the underlying financial model should be automatically
reflected in reports.
12. Unlimited dimensional and aggregation levels. The OLAP cube can be built
with unlimited dimensions, and aggregation of the contained data also does not
have practical limits.
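Rules 6 and 7 in the list above can be illustrated with a small sketch. It assumes a cube stored as a Python dictionary that keeps only the non-empty cells (one possible sparse representation, not the method of any particular OLAP product); the dimension order and the figures are made up. The same aggregation function works on any dimension, in the spirit of generic dimensionality.

```python
from collections import defaultdict

# Sparse cube: only non-empty cells are stored, keyed by their coordinates.
# Dimension order (location, quarter, item) is an assumption for this sketch.
cube = {
    ("Toronto", "Q1", "Mobile"): 100,
    ("Toronto", "Q2", "Modem"): 40,
    ("Vancouver", "Q1", "Mobile"): 150,
}

def cell(location, quarter, item):
    """Return the stored measure, or 0 for an empty cell."""
    return cube.get((location, quarter, item), 0)

def total_along(axis):
    """Sum the measure per member of the dimension at position `axis`."""
    totals = defaultdict(int)
    for coords, value in cube.items():
        totals[coords[axis]] += value
    return dict(totals)

print(cell("Vancouver", "Q2", "Modem"))  # an empty cell
print(total_along(0))                    # totals per location
print(total_along(1))                    # totals per quarter
```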
Codd's twelve rules provide an essential tool for verifying that the OLAP functions
and OLAP models in use are able to produce the desired results. A good OLAP system
should also provide a complete set of database management tools as an
integrated, centralized utility, so that administrators can manage the distribution
of databases within the enterprise. The drilling mechanism of OLAP
within the MDDB (multidimensional database) should allow drill-down right to the
source, that is, to the detail record level. This implies that the OLAP tool permits a
smooth changeover from the MDDB to the detail record level of the source relational
database. OLAP systems must also support incremental database refreshes. This is an
important feature for preventing stability and usability problems as the size of the
database increases.
OLTP and OLAP
The design of OLAP for a multidimensional cube is entirely different from that of OLTP
(Online Transactional Processing) for a database. OLTP is implemented on a relational
database to support daily processing in an organization. The main function of an OLTP
system is to capture data into computers. OLTP allows effective manipulation
and storage of data for daily operations, resulting in a huge quantity of transactional
data. Organizations build multiple OLTP systems to handle huge quantities of daily
operational transaction data in a short period of time.
OLAP is designed for data access and analysis to support the strategic
decision-making process of managerial users. OLAP technology focuses on aggregating
datasets into a multidimensional view without hindering system performance. OLTP
systems are described as “customer oriented” and OLAP systems as “market oriented”.
Major differences between OLTP and OLAP systems are shown below.
Differences | OLTP | OLAP
Characteristics | Can handle large numbers of small online transactions. | Handles large volumes of data.
Query | Simple queries, such as Insert, Delete, and Update information. | Complex queries which require aggregations.
Database Design | Normalized, with many tables. | Usually with fewer tables and can include star or snowflake schemas.
Method | Uses traditional DBMS. | Uses data warehouses.
Sources | The OLTP itself and respective transactions correspond to the sources of data. | The various OLTP databases become the data sources for OLAP.
Data Quality | Huge effort to ensure the data is ACID-compliant. | The data may not be as organized, but what really matters is the capacity to navigate through the dimensions of the data.
Functionality | Online database which modifies a system by controlling and running essential business tasks in real time. | Online database query management system that allows users to discover hidden insights, plan, support decisions, and solve problems.
Speed | Typically very fast processing. | Depends on the amount of data. Creating indexes can enhance query speed.
Backup and Recovery | Regular backups are vital to ensure the business keeps running since data loss can lead to monetary loss and legal issues. | Requires backup from time to time, and lost data can be reloaded from OLTP database when needed.
It is complicated to merge OLAP and OLTP into one centralized database system.
The dimensional data design model used in OLAP is much more effective for
analytical querying than the normalized relational design used in OLTP systems. OLAP
may use one central database as its data source, whereas OLTP uses different data
sources at different database sites. The dimensional design of OLAP is not suitable for
OLTP systems, mainly due to redundancy and the loss of referential integrity of the data.
Organizations therefore choose to have two separate information systems, one OLTP and
one OLAP system.
We can conclude that the purpose of OLTP systems is to get data into computers,
whereas the purpose of OLAP is to get data or information out of computers.
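The contrast between the two workloads can be sketched on a single toy table. This is a simplification, since real OLTP and OLAP systems are normally separate, as noted above. The table name orders and its contents are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, city TEXT, amount REAL)")

# OLTP-style work: short transactions that insert or update single rows.
with conn:  # commits on success, rolls back on error
    conn.execute("INSERT INTO orders VALUES (1, 'Toronto', 100.0)")
    conn.execute("INSERT INTO orders VALUES (2, 'Toronto', 50.0)")
    conn.execute("INSERT INTO orders VALUES (3, 'Vancouver', 75.0)")
    conn.execute("UPDATE orders SET amount = 60.0 WHERE order_id = 2")

# OLAP-style work: a read-only query that aggregates over many rows.
totals = conn.execute(
    "SELECT city, COUNT(*), SUM(amount) FROM orders GROUP BY city ORDER BY city"
).fetchall()
print(totals)
```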
OLAP Operations in the Multidimensional Data Model
In the multidimensional model, the records are organized into various dimensions,
and each dimension includes multiple levels of abstraction described by concept
hierarchies. This organization gives users the flexibility to view data from
various perspectives. A number of OLAP data cube operations exist to provide
these different views, allowing interactive querying and analysis of the data at hand.
Hence, OLAP supports a user-friendly environment for interactive data analysis.
Consider the OLAP operations which are to be performed on multidimensional data.
The figure shows a data cube for the sales of a shop. The cube contains the dimensions
location, time, and item, where location is aggregated with regard to city
values, time is aggregated with respect to quarters, and item is aggregated with
respect to item types.
Roll-Up
Roll-up is like zooming out on the data cube. The figure shows the result of a roll-up
operation performed on the dimension location. The hierarchy for location is
defined as the order street < city < province or state < country. The roll-up operation
aggregates the data by ascending the location hierarchy from the level of the city to
the level of the country.
When a roll-up is performed by dimension reduction, one or more dimensions are
removed from the cube.
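Both forms of roll-up can be sketched on a tiny cube held in a dictionary. The sales figures, the city_to_country mapping and the function names are assumptions made for this illustration.

```python
from collections import defaultdict

# Cube cells keyed by (city, quarter); the numbers are made up for illustration.
sales = {
    ("Toronto", "Q1"): 100, ("Toronto", "Q2"): 120,
    ("Vancouver", "Q1"): 80, ("Chicago", "Q1"): 70,
}
# One step of the location concept hierarchy: city -> country.
city_to_country = {"Toronto": "Canada", "Vancouver": "Canada", "Chicago": "USA"}

def roll_up_location(cube):
    """Climb the hierarchy: replace each city by its country and sum."""
    result = defaultdict(int)
    for (city, quarter), value in cube.items():
        result[(city_to_country[city], quarter)] += value
    return dict(result)

def roll_up_drop_time(cube):
    """Dimension reduction: remove the time dimension entirely."""
    result = defaultdict(int)
    for (city, _quarter), value in cube.items():
        result[city] += value
    return dict(result)

print(roll_up_location(sales))
print(roll_up_drop_time(sales))
```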
Example
Consider the following cube showing how many days in each week recorded each
temperature:
Temperature 64 65 68 69 70 71 72 75 80 81 83 85
Week1 1 0 1 0 1 0 0 0 0 0 1 0
Week2 0 0 0 1 0 0 1 2 0 1 0 0
Suppose that we want to set up the levels hot (80-85), mild (70-75), and cool (64-69)
for temperature in the above cube.
To do this, we have to group the columns and add up the values according to the concept
hierarchy. This operation is known as a roll-up.
By doing this, we obtain the following cube:
Temperature cool mild hot
Week1 2 1 1
Week2 1 3 1
The roll-up operation groups the information by levels of temperature.
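The same grouping can be sketched in Python. The code is only an illustration; the function names level and roll_up are made up for this sketch. It sums the per-temperature counts of each week into the cool, mild and hot levels.

```python
temperatures = [64, 65, 68, 69, 70, 71, 72, 75, 80, 81, 83, 85]
counts = {
    "Week1": [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0],
    "Week2": [0, 0, 0, 1, 0, 0, 1, 2, 0, 1, 0, 0],
}

def level(temp):
    """Map a temperature to its level in the concept hierarchy."""
    if 64 <= temp <= 69:
        return "cool"
    if 70 <= temp <= 75:
        return "mild"
    if 80 <= temp <= 85:
        return "hot"
    raise ValueError(f"temperature {temp} is outside the defined levels")

def roll_up(row):
    """Sum the per-temperature counts of one week into the three levels."""
    totals = {"cool": 0, "mild": 0, "hot": 0}
    for temp, count in zip(temperatures, row):
        totals[level(temp)] += count
    return totals

rolled = {week: roll_up(row) for week, row in counts.items()}
for week, totals in rolled.items():
    print(week, totals)
```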
The following diagram illustrates how roll-up works.
Drill-Down
Drill-down is like zooming in on the data cube. It navigates from less detailed
records to more detailed data. Drill-down can be performed by either stepping down a
concept hierarchy for a dimension or adding additional dimensions.
The figure shows a drill-down operation performed on the dimension time by stepping
down a concept hierarchy which is defined as day, month, quarter, and year.
Drill-down occurs by descending the time hierarchy from the level of the quarter to the
more detailed level of the month.
Because a drill-down adds more details to the given data, it can also be performed
by adding a new dimension to a cube. For example, a drill-down on the central
cube of the figure can occur by introducing an additional dimension, such as a
customer group.
Example
Drill-down adds more details to the given data. Descending from the week level to the
day level of the temperature cube gives:
Temperature cool mild hot
Day 1 0 0 0
Day 2 0 0 0
Day 3 0 0 1
Day 4 0 1 0
Day 5 1 0 0
Day 6 0 0 0
Day 7 1 0 0
Day 8 0 0 0
Day 9 1 0 0
Day 10 0 1 0
Day 11 0 1 0
Day 12 0 1 0
Day 13 0 0 1
Day 14 0 0 0
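As a sketch of the relationship between the two levels, the day-level table above can be held in a dictionary and queried per week. The assumption that days 1 to 7 form week 1 and days 8 to 14 form week 2, and the function names, are introduced here for illustration.

```python
# Day-level cube from the table above: day -> (cool, mild, hot).
daily = {
    1: (0, 0, 0), 2: (0, 0, 0), 3: (0, 0, 1), 4: (0, 1, 0),
    5: (1, 0, 0), 6: (0, 0, 0), 7: (1, 0, 0), 8: (0, 0, 0),
    9: (1, 0, 0), 10: (0, 1, 0), 11: (0, 1, 0), 12: (0, 1, 0),
    13: (0, 0, 1), 14: (0, 0, 0),
}

def week_of(day):
    """Concept hierarchy day -> week, assuming days 1-7 form week 1."""
    return (day - 1) // 7 + 1

def drill_down(week):
    """Return the day-level rows that make up one week-level row."""
    return {day: row for day, row in daily.items() if week_of(day) == week}

def week_total(week):
    """Summing the drilled-down rows gives back the week-level row."""
    rows = drill_down(week).values()
    return tuple(sum(column) for column in zip(*rows))

print(drill_down(1))
print(week_total(1), week_total(2))
```

Summing the detailed rows again is exactly the roll-up in the opposite direction.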
The following diagram illustrates how Drill-down works.
Slice
A slice is a subset of the cube corresponding to a single value for one or more
members of a dimension. For example, a slice operation is executed when the
user wants a selection on one dimension of a three-dimensional cube, resulting
in a two-dimensional view. So, the slice operation performs a selection on one
dimension of the given cube, thus resulting in a subcube.
For example, if we make the selection temperature = cool on the day-level cube, we
obtain the following cube:
Temperature cool
Day 1 0
Day 2 0
Day 3 0
Day 4 0
Day 5 1
Day 6 0
Day 7 1
Day 8 0
Day 9 1
Day 10 0
Day 11 0
Day 12 0
Day 13 0
Day 14 0
The following diagram illustrates how Slice works.
Here the slice is performed on the dimension "time" using the criterion time = "Q1".
It forms a new sub-cube by selecting a single value on one dimension.
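A slice such as time = "Q1" can be sketched as a filter that fixes one coordinate and drops that dimension from the result. The cube contents and the function name slice_cube are assumptions for illustration.

```python
# Cube cells keyed by (location, quarter, item); values are illustrative only.
cube = {
    ("Toronto", "Q1", "Mobile"): 100, ("Toronto", "Q2", "Mobile"): 120,
    ("Vancouver", "Q1", "Modem"): 80, ("Vancouver", "Q3", "Mobile"): 60,
}

def slice_cube(cube, axis, value):
    """Fix one dimension to a single value and drop it from the keys."""
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }

# The three-dimensional cube becomes a two-dimensional (location, item) view.
q1 = slice_cube(cube, axis=1, value="Q1")
print(q1)
```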
Dice
The dice operation defines a subcube by performing a selection on two or more
dimensions.
For example, applying the selection (time = day 3 OR time = day 4) AND
(temperature = cool OR temperature = hot) to the original cube, we get the following
subcube (still two-dimensional):
Temperature cool hot
Day 3 0 1
Day 4 0 0
Consider the following diagram, which shows the dice operation.
The dice operation on the cube based on the following selection criteria involves
three dimensions.
o (location = "Toronto" or "Vancouver")
o (time = "Q1" or "Q2")
o (item =" Mobile" or "Modem")
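The three selection criteria above can be sketched as a filter over all three coordinates at once. The cube contents, including the extra city and item used to show what gets excluded, are made up for this illustration.

```python
# Cube cells keyed by (location, quarter, item); values are illustrative only.
cube = {
    ("Toronto", "Q1", "Mobile"): 100, ("Toronto", "Q3", "Mobile"): 120,
    ("Vancouver", "Q2", "Modem"): 80, ("Chicago", "Q1", "Modem"): 70,
    ("Vancouver", "Q1", "Phone"): 30,
}

def dice(cube, locations, quarters, items):
    """Keep the cells whose coordinates fall inside every allowed set."""
    return {
        key: measure
        for key, measure in cube.items()
        if key[0] in locations and key[1] in quarters and key[2] in items
    }

# Unlike a slice, the result keeps all three dimensions.
sub = dice(cube, {"Toronto", "Vancouver"}, {"Q1", "Q2"}, {"Mobile", "Modem"})
print(sub)
```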
Pivot
The pivot operation is also called a rotation. Pivot is a visualization operation which
rotates the data axes in view to provide an alternative presentation of the data. It
may involve swapping the rows and columns or moving one of the row dimensions
into the column dimensions.
Consider the following diagram, which shows the pivot operation.
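For a two-dimensional view, swapping rows and columns amounts to a transpose. The sketch below assumes a small table of made-up sales figures; the function name pivot is chosen for illustration.

```python
# A two-dimensional view: rows are cities, columns are item types.
rows = ["Toronto", "Vancouver"]
cols = ["Mobile", "Modem"]
table = [
    [100, 40],   # Toronto
    [150, 80],   # Vancouver
]

def pivot(row_labels, col_labels, values):
    """Swap the row and column axes; the measures themselves do not change."""
    rotated = [list(column) for column in zip(*values)]
    return col_labels, row_labels, rotated

new_rows, new_cols, rotated = pivot(rows, cols, table)
print(new_rows, new_cols, rotated)
```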
Other OLAP Operations
Drill-across executes queries involving more than one fact table. The drill-through
operation makes use of relational SQL facilities to drill through the bottom level of a
data cube down to its back-end relational tables.
Other OLAP operations may include ranking the top-N or bottom-N elements in
lists, as well as calculating moving averages, growth rates, interest, internal rates
of return, depreciation, currency conversions, and statistical tasks.
OLAP offers analytical modeling capabilities, including a calculation engine for
determining ratios, variance, etc. and for computing measures across various
dimensions. It can generate summarizations, aggregations, and hierarchies at each
granularity level and at every dimension intersection. OLAP also provides functional
models for forecasting, trend analysis, and statistical analysis. In this context, the
OLAP engine is a powerful data analysis tool.
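A few of the calculations named above (top-N ranking, moving average, growth rate) can be sketched in plain Python. The function names and the figures are assumptions for illustration; an OLAP engine would provide such functions itself.

```python
def top_n(values, n):
    """Return the n largest (label, value) pairs, largest first."""
    return sorted(values.items(), key=lambda kv: kv[1], reverse=True)[:n]

def moving_average(series, window):
    """Simple moving average over a fixed window of consecutive values."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

def growth_rates(series):
    """Period-over-period growth as a fraction of the previous value."""
    return [(cur - prev) / prev for prev, cur in zip(series, series[1:])]

sales_by_item = {"Mobile": 250, "Modem": 120, "Phone": 30}   # made-up figures
quarterly = [100, 120, 90, 150]                              # made-up figures

print(top_n(sales_by_item, 2))
print(moving_average(quarterly, 2))
print(growth_rates(quarterly))
```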
UNIT III
Data Warehousing
A Data Warehouse (DW) is a repository of a huge amount of organized data. This data
is consolidated from one or more different data sources. A DW is a relational database
that is mainly designed for analytical reporting and timely decision making in
organizations.
The data for this purpose is isolated from the source transaction systems and
optimized for analysis, so querying it does not have any impact on the main business.
If an organization introduces any business change, then the DW is used to examine the
effects of that change, and hence the DW is also used to monitor the non-decision
making process.
The data warehouse is mostly a read-only system, as operational data is kept
separate from the DW. This provides an environment in which large amounts of
data can be retrieved with well-written queries.
Thus the DW acts as the backend engine for Business Intelligence tools, which show
reports and dashboards to the business users. DW is extensively used in the banking,
financial, and retail sectors, among others.
Listed below are some of the reasons why a Data Warehouse is crucial.
1. A data warehouse gathers all the operational data from several heterogeneous
sources of “different formats”, and through the process of extract, transform
and load (ETL) it loads the data into the DW in a “standardized dimensional
format” across an organization.
2. A data warehouse maintains both “current data and historical data” for
analytical reporting and fact-based decision making.
3. It helps organizations to take “smarter and quicker decisions” on reducing costs
and increasing revenue, by comparing quarterly and annual reports to
improve their performance.
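The ETL step in point 1 can be sketched as follows. The two source record layouts, the field names and the target table sales are hypothetical; the point is only that differently formatted inputs end up in one standardized form.

```python
import sqlite3
from datetime import datetime

# Extract: two hypothetical sources that encode the same facts differently.
source_a = [{"date": "2024-01-15", "city": "toronto", "amount": "100.50"}]
source_b = [{"sold_on": "15/02/2024", "location": "Vancouver ", "total": 80}]

def transform_a(rec):
    """Normalize the city name and convert the amount to a number."""
    return (rec["date"], rec["city"].strip().title(), float(rec["amount"]))

def transform_b(rec):
    """Convert day/month/year into the ISO format used by the warehouse."""
    iso = datetime.strptime(rec["sold_on"], "%d/%m/%Y").date().isoformat()
    return (iso, rec["location"].strip().title(), float(rec["total"]))

# Load: every record lands in one standardized table.
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE sales (sale_date TEXT, city TEXT, amount REAL)")
clean = [transform_a(r) for r in source_a] + [transform_b(r) for r in source_b]
dw.executemany("INSERT INTO sales VALUES (?, ?, ?)", clean)

loaded = dw.execute("SELECT * FROM sales ORDER BY sale_date").fetchall()
print(loaded)
```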
Types Of Data Warehouse Applications
Business Intelligence (BI) is a branch of data warehousing designed for decision
making. Once the data in the DW is loaded, BI plays a major role by analyzing the
data and presenting it to the business users.
Practically, the term “data warehouse applications” refers to the different
ways in which the data can be processed and utilized.
We have three types of DW applications, as mentioned below.
1. Information processing
2. Analytical processing
3. Data mining which serves the purpose of BI
#1) Information Processing
This is a kind of application where the data warehouse allows direct one-to-one
contact with the data stored in it.
The data can be processed by writing direct queries on the data or with a basic
statistical analysis of the data, and the end results are reported to the business
users in the form of reports, tables, charts or graphs.
DW supports the following tools for Information Processing:
(i) Query Tools: The business (or) the analyst runs the queries using query tools to
explore the data and generate the output in the form of reports or graphics as per
the business requirement.
(ii) Reporting Tools: If the business wants to see the results in any defined format
and on a scheduled basis i.e. daily, weekly or monthly then reporting tools will be
used. These kinds of reports can be saved and reviewed at any time.
(iii) Statistics Tools: If the business wants to do an analysis on a broad view of
data then statistics tools will be used to generate such results. Businesses can
make conclusions and predictions by understanding these strategic results.
#2) Analytical Processing
This is a kind of application where a data warehouse allows the analytical
processing of data stored in it. The data can be analyzed with the following operations:
Slice-and-Dice, Drill Down, Roll Up and Pivoting.
(i) Slice-and-Dice: Data warehouse allows slice-and-dice operations to analyze the
data accessed from many levels with a combination of different perspectives. The
slice-and-dice operation internally uses the drill-down mechanism. Slicing works on
dimensional data.
As a part of the business requirement, if we focus on a single area then slicing
analyzes the dimensions of that particular area as per the requirements and gives
the results. Dicing works on analytic operations. Dicing zooms for a specific set of
attributes over all the dimensions to provide diverse perspectives. The dimensions
are considered from one or more consecutive slices.
(ii) Drill Down: If the business wants to go to a more detailed level of any summary
number, then drill down is an operation for navigating down from that summary to finer
levels of detail. This gives a good idea of what is happening and where the business
has to focus more closely.
Drill down tracks from the summary level of the hierarchy down to the finest detail
level for root cause analysis. For example, a sales drill down
can proceed from Country level -> Region level -> State level -> District level ->
Store level.
(iii) Roll up: Roll up works opposite to the drill-down operation. If the business
wants any summarized data, then roll up comes into the picture. It aggregates the
detail level data by moving up in the dimensional hierarchy.
Roll-ups are used to analyze the development and performance of a system.
For example, in a sales roll up the totals can
be rolled up from City level -> State level -> Region level -> Country level.
(iv) Pivot: Pivoting analyzes dimension data by rotating the data on the cubes. For
example, the row dimension can be swapped into the column dimension and vice
versa.
#3) Data Mining
This is a kind of application where the data warehouse allows knowledge discovery
from the data, and the results are represented with visualization tools. In the above
two types of applications, the analysis is driven by the users.
As the data grows vast in various businesses, it is difficult to query and drill down
the data warehouse to get all possible insights into the data. Then data mining comes
into the picture to accomplish the discovery of knowledge.
It digs into the data with all the past associations, results, etc. and predicts the
future. Hence this is data-driven and not user-driven. Knowledge is discovered
by finding hidden patterns, associations, classifications, and predictions.
Data mining goes in depth with the data to predict the future. Based on the
predictions, it also suggests the actions to take.
Given below are the various activities of Data Mining:
Patterns: Data mining discovers patterns that occur in the database. Users can
provide the business inputs on which some knowledge of the patterns is expected
for decision making.
Associations/Relationships: Data mining discovers relationships between the objects
with the frequency of their association rules. This relationship may be between two
or more objects (or) it may discover the rules within the properties of the same
object.
Classification: Data mining organizes data in a set of pre-defined classes. So if any
object is picked up from the data, classification associates the respective class label
to that object.
Prediction: Data mining compares a set of existing values to find the best possible
future values/trends in business.
Hence, based on all the above results, Data mining also proposes a set of actions to
be taken.
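Of the activities above, association discovery is the easiest to illustrate in a few lines. The sketch below computes the two usual measures of an association rule, support and confidence, over a handful of made-up transactions. It is a toy calculation under these assumptions, not a full mining algorithm.

```python
# Each transaction is the set of items bought together (made-up data).
transactions = [
    {"mobile", "case"},
    {"mobile", "case", "charger"},
    {"mobile", "charger"},
    {"modem"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent):
    """Share of transactions with the antecedent that also hold the consequent."""
    return support(antecedent | consequent) / support(antecedent)

print(support({"mobile", "case"}))
print(confidence({"mobile"}, {"case"}))
```

A rule such as "mobile implies case" is usually kept only if both measures exceed thresholds chosen by the analyst.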
Kimball’s approach versus Inmon’s approach to Data
warehousing
The Kimball method
Ralph Kimball (born in 1944) is one of the original pioneers of data warehousing
design.
At the heart of his method stands the concept that a database must be designed to
be understood and to operate fast. His approach is a business-oriented approach, which
first takes into consideration the specific business requirements of an organization
and builds the data warehouse on top of them. For example, an organization can
have multiple data sources on its operational system (OLTP), but only the necessary
ones will be transferred into the data warehouse, after being cleaned by an ETL
process. There, the data is stored in a dimensional data model made up of fact and
dimension tables.
The following illustration shows the data flow process of the Kimball method:
Kimball approach
The Inmon Method
Bill Inmon (born in 1945) is recognized by many as the “Father of data
warehousing”. His approach is considered a technical approach and is the
opposite of Kimball’s.
From Inmon’s point of view, it’s essential to first transfer all the data from the
operational system (OLTP) into the data warehouse to serve as a single source of truth
for one’s organization, storing it in a highly normalized data model. Only then is the
relevant data transferred into one or more data marts, according to the
organization’s requirements.
The following illustration shows the data flow process of the Inmon method:
The Inmon approach
The next table summarizes the main differences between the two methods:
Method | Kimball | Inmon
Set-up | Relatively fast because only parts of the data are being transferred to the data warehouse | Relatively slow because all the company data has to be transferred to the data warehouse
Performance | Relatively fast due to the data being split into fact and dimension tables in a de-normalized way | Relatively slow due to the highly normalized structure of the data model
Data modeling complexity | Star or snowflake schemas which are considered user-friendly models to understand | The data model can become over-complex over time as more tables are joined together.
Costs | Relatively low because only parts of the organization’s data are being transferred | Relatively high due to the transfer and storage of the organization’s entire data
Reporting ability | Because not all the data is transferred, it can sometimes lead to difficulties for the organization’s different reporting needs | Any reporting need of the organization is being covered.
The conclusion
Both the Inmon and Kimball methods can be applied in different scenarios, and
each method has its own advantages and disadvantages. To this day, the most
common DWH structure uses the Kimball method because it is a
business-minded approach rather than a technical approach. It is generally
cheaper and delivers faster results with better ROI.
More Related Content

Similar to Business Intelligence Unit 1.pdf

7 Benefits Of Business Intelligence In Finance.docx
7 Benefits Of Business Intelligence In Finance.docx7 Benefits Of Business Intelligence In Finance.docx
7 Benefits Of Business Intelligence In Finance.docxSameerShaik43
 
Self-Service Data Exploration_ No Coding, Just Reporting.pdf
Self-Service Data Exploration_ No Coding, Just Reporting.pdfSelf-Service Data Exploration_ No Coding, Just Reporting.pdf
Self-Service Data Exploration_ No Coding, Just Reporting.pdfGrow
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business IntelligenceAmulya Lohani
 
Group 5
Group 5Group 5
Group 5Xylar
 
Finding The Best Business Intelligence Tool For Large Datasets.pdf
Finding The Best Business Intelligence Tool For Large Datasets.pdfFinding The Best Business Intelligence Tool For Large Datasets.pdf
Finding The Best Business Intelligence Tool For Large Datasets.pdfGrow
 
Core Components of BI.pdf
Core Components of BI.pdfCore Components of BI.pdf
Core Components of BI.pdfXlogia Tech
 
Power bi implementation for finance services firms
Power bi implementation for finance services firmsPower bi implementation for finance services firms
Power bi implementation for finance services firmsaddendanalytics
 
Smarter BI for SMBs
Smarter BI for SMBsSmarter BI for SMBs
Smarter BI for SMBsHari Menon
 
Business Intelligence and Business Analytics
Business Intelligence and Business AnalyticsBusiness Intelligence and Business Analytics
Business Intelligence and Business Analyticssnehal_152
 
PREDICTIVE BUSINESS INTELLIGENCE: CONSUMER GOODS SALES FORECASTING USING ARTI...
PREDICTIVE BUSINESS INTELLIGENCE: CONSUMER GOODS SALES FORECASTING USING ARTI...PREDICTIVE BUSINESS INTELLIGENCE: CONSUMER GOODS SALES FORECASTING USING ARTI...
PREDICTIVE BUSINESS INTELLIGENCE: CONSUMER GOODS SALES FORECASTING USING ARTI...IAEME Publication
 
Application business intelligence in railways
Application business intelligence in railwaysApplication business intelligence in railways
Application business intelligence in railwaysVoice Malaysia
 
Business intelligence an introduction
Business intelligence an introductionBusiness intelligence an introduction
Business intelligence an introductionIsaac Victor
 
Business intelligence
Business intelligenceBusiness intelligence
Business intelligenceSuparnaR1
 

Similar to Business Intelligence Unit 1.pdf (20)

PART 1.docx
PART 1.docxPART 1.docx
PART 1.docx
 
7 Benefits Of Business Intelligence In Finance.docx
7 Benefits Of Business Intelligence In Finance.docx7 Benefits Of Business Intelligence In Finance.docx
7 Benefits Of Business Intelligence In Finance.docx
 
BA MODULE1.pdf
BA MODULE1.pdfBA MODULE1.pdf
BA MODULE1.pdf
 
Business Analytics
Business AnalyticsBusiness Analytics
Business Analytics
 
Self-Service Data Exploration_ No Coding, Just Reporting.pdf
Self-Service Data Exploration_ No Coding, Just Reporting.pdfSelf-Service Data Exploration_ No Coding, Just Reporting.pdf
Self-Service Data Exploration_ No Coding, Just Reporting.pdf
 
Power BI TEI by Forrester
Power BI TEI by ForresterPower BI TEI by Forrester
Power BI TEI by Forrester
 
Business Analytics
Business AnalyticsBusiness Analytics
Business Analytics
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business Intelligence
 
Group 5
Group 5Group 5
Group 5
 
Bi in financial industry
Bi in financial industryBi in financial industry
Bi in financial industry
 
Bi in financial industry
Bi in financial industryBi in financial industry
Bi in financial industry
 
Finding The Best Business Intelligence Tool For Large Datasets.pdf
Finding The Best Business Intelligence Tool For Large Datasets.pdfFinding The Best Business Intelligence Tool For Large Datasets.pdf
Finding The Best Business Intelligence Tool For Large Datasets.pdf
 
Core Components of BI.pdf
Core Components of BI.pdfCore Components of BI.pdf
Core Components of BI.pdf
 
Power bi implementation for finance services firms
Power bi implementation for finance services firmsPower bi implementation for finance services firms
Power bi implementation for finance services firms
 
Smarter BI for SMBs
Smarter BI for SMBsSmarter BI for SMBs
Smarter BI for SMBs
 
Business Intelligence and Business Analytics
Business Intelligence and Business AnalyticsBusiness Intelligence and Business Analytics
Business Intelligence and Business Analytics
 
PREDICTIVE BUSINESS INTELLIGENCE: CONSUMER GOODS SALES FORECASTING USING ARTI...
PREDICTIVE BUSINESS INTELLIGENCE: CONSUMER GOODS SALES FORECASTING USING ARTI...PREDICTIVE BUSINESS INTELLIGENCE: CONSUMER GOODS SALES FORECASTING USING ARTI...
PREDICTIVE BUSINESS INTELLIGENCE: CONSUMER GOODS SALES FORECASTING USING ARTI...
 
Application business intelligence in railways
Application business intelligence in railwaysApplication business intelligence in railways
Application business intelligence in railways
 
Business intelligence an introduction
Business intelligence an introductionBusiness intelligence an introduction
Business intelligence an introduction
 
Business intelligence
Business intelligenceBusiness intelligence
Business intelligence
 

Business Intelligence Unit 1.pdf

  • 1. UNIT 1 BUSINESS INTELLIGENCE
Introduction
BI (Business Intelligence) is a set of processes, architectures, and technologies that convert raw data into meaningful information that drives profitable business actions. It is a suite of software and services that transforms data into actionable intelligence and knowledge.
BI has a direct impact on an organization’s strategic, tactical and operational business decisions. BI supports fact-based decision making using historical data rather than assumptions and gut feeling.
BI tools perform data analysis and create reports, summaries, dashboards, maps, graphs, and charts to provide users with detailed intelligence about the nature of the business.
Importance of BI
- Measurement: creating KPIs (Key Performance Indicators) based on historical data.
- Identifying and setting benchmarks for varied processes.
- With BI systems, organizations can identify market trends and spot business problems that need to be addressed.
- BI helps with data visualization, which enhances data quality and thereby the quality of decision making.
- BI systems can be used not just by large enterprises but also by SMEs (Small and Medium Enterprises).
Implementation of BI
Here are the steps:
Step 1) Raw data is extracted from corporate databases. The data could be spread across multiple heterogeneous systems.
Step 2) The data is cleaned, transformed and loaded into the data warehouse. The tables can be linked, and data cubes are formed.
  • 2. Step 3) Using the BI system, the user can ask queries, request ad hoc reports or conduct any other analysis.
Examples of Business Intelligence Systems Used in Practice
Example 1:
In an Online Transaction Processing (OLTP) system, information that could be fed into the product database could be:
- add a product line
- change a product price
Correspondingly, in a Business Intelligence system, a query executed for the product subject area could be: did the addition of a new product line or the change in product price increase revenues?
In the advertising database of an OLTP system, operations that could be executed are:
- change advertisement options
- increase the radio budget
  • 3. Correspondingly, in a BI system, a query that could be executed would be: how many new clients were added due to the change in radio budget?
In an OLTP system dealing with customer demographic databases, data that could be fed would be:
- increase customer credit limit
- change in customer salary level
Correspondingly, in the OLAP system, a query that could be executed would be: can customer profile changes support a higher product price?
Example 2:
A hotel owner uses BI analytical applications to gather statistical information regarding average occupancy and room rate. This helps to find the aggregate revenue generated per room. The owner also collects statistics on market share and data from customer surveys from each hotel to decide its competitive position in various markets. Analyzing these trends year by year, month by month and day by day helps management to offer discounts on room rentals.
Example 3:
A bank gives branch managers access to BI applications. This helps a branch manager determine who the most profitable customers are and which customers they should work on. The use of BI tools frees information technology staff from the task of generating analytical reports for the departments. It also gives department personnel access to a richer data source.
Types of BI users
The following are the four key players who use a Business Intelligence system:
1. The Professional Data Analyst: The data analyst is a statistician who always needs to drill deep down into data. A BI system helps them get fresh insights to develop unique business strategies.
2. The IT users:
  • 4. The IT user also plays a dominant role in maintaining the BI infrastructure.
3. The head of the company: The CEO or CXO can increase the profit of the business by improving its operational efficiency.
4. The Business Users: Business intelligence users can be found across the organization. There are mainly two types of business users:
1. The casual business intelligence user
2. The power user
The difference between them is that a power user is capable of working with complex data sets, while the casual user relies on dashboards to evaluate predefined sets of data.
Advantages of Business Intelligence
Here are some of the advantages of using a Business Intelligence system:
1. Boost productivity: With a BI program, businesses can create reports with a single click, which saves a lot of time and resources. It also allows employees to be more productive in their tasks.
2. Improve visibility: BI also helps to improve the visibility of business processes and makes it possible to identify any areas that need attention.
3. Fix accountability: A BI system assigns accountability in the organization, as someone must take ownership of the organization’s performance against its set goals.
4. It gives a bird’s eye view: A BI system also helps organizations, as decision makers get an overall bird’s eye view through typical BI features like dashboards and scorecards.
  • 5. 5. It streamlines business processes: BI takes out the complexity associated with business processes. It also automates analytics by offering predictive analysis, computer modeling, benchmarking and other methodologies.
6. It allows for easy analytics: BI software has democratized its usage, allowing even nontechnical or non-analyst users to collect and process data quickly. This puts the power of analytics in the hands of many people.
BI System Disadvantages
1. Cost: Business intelligence can prove costly for small as well as medium-sized enterprises. Using such a system may be expensive for routine business transactions.
2. Complexity: Another drawback of BI is the complexity of implementing a data warehouse. It can be so complex that it makes business techniques rigid to deal with.
3. Limited use: Like all improved technologies, BI was first established with the buying capacity of rich firms in mind. Therefore, BI systems are not yet affordable for many small and medium-sized companies.
4. Time-consuming implementation: It takes almost one and a half years for a data warehousing system to be completely implemented. Therefore, it is a time-consuming process.
Trends in Business Intelligence
The following are some business intelligence and analytics trends that you should be aware of.
Artificial Intelligence: Gartner’s report indicates that AI and machine learning now take on complex tasks previously done by human intelligence. This capability is being leveraged for real-time data analysis and dashboard reporting.
  • 6. Collaborative BI: BI software combined with collaboration tools, including social media and other recent technologies, enhances working and sharing by teams for collaborative decision making.
Embedded BI: Embedded BI allows the integration of BI software or some of its features into another business application to enhance and extend its reporting functionality.
Cloud Analytics: BI applications will soon be offered in the cloud, and more businesses will be shifting to this technology. As per their predictions, within a couple of years the spending on cloud-based analytics will grow 4.5 times faster.
Gap between business and IT professionals
The key to bridging the gap between business and IT professionals is communication. Both groups need to learn to communicate with each other in order to understand each other’s needs and goals. By doing so, they can work together more effectively to create solutions that meet the needs of both businesses and users.
The Benefits of Bridging the Gap Between Business and IT
As businesses rely increasingly on technology, it’s more important than ever for the business and IT departments to work closely together. By bridging the gap between these two groups, businesses can reap a number of benefits, including:
1. Increased Efficiency: When business and IT professionals work together closely, they can share information and ideas more easily, which can lead to increased efficiency and productivity.
2. Improved Communication: Close partnerships between business and IT can help improve communication between the two groups, which can help prevent misunderstandings and miscommunications.
3. Better Decision-making: When business and IT professionals share information and ideas openly, they can make better decisions about how to use technology to meet business goals.
4. Greater Innovation:
  • 7. Partnerships between business and IT can lead to greater innovation as the two groups explore new ways to use technology to benefit the business.
5. Enhanced Customer Service: Close collaboration between business and IT can help improve customer service by ensuring that technology is being used in the most effective way possible to meet customer needs.
How to Approach a Discussion About the Gap Between Business and IT
The gap between business and IT professionals can be difficult to overcome, but it is important to try. One way to approach this discussion is by first understanding the different perspectives of each group. Business professionals focus on the bottom line and outcomes, while IT professionals focus on the details and implementation. It is important to find common ground between these two perspectives in order to have a productive discussion.
Some ways to overcome the gap between business and IT professionals include:
- Encourage open communication: Make sure that both groups feel comfortable communicating with each other. This can be done by setting up regular meetings or check-ins, as well as creating channels for communication outside of formal meetings (e.g., Slack, KalamTime groups, email, etc.).
- Educate each other: Take some time to educate each other on the different priorities and goals of each group. This will help everyone understand where the other is coming from and what they are trying to accomplish.
- Work together on projects: Collaborating on projects is a great way to bridge the gap between business and IT professionals. By working together, both groups can learn about the strengths and weaknesses of the other and how they can best complement each other.
  • 8. UNIT 2
Business intelligence architecture components and diagram
A BI architecture can be deployed in an on-premises data center or in the cloud. In either case, it contains a set of core components that collectively support the different stages of the BI process from data collection, integration, data storage and analysis to data visualization, information delivery and the use of BI data in business decision-making.
This shows the various technologies used to run BI and analytics applications.
The core components of a BI architecture include the following:
Source systems. These are all of the systems that capture and hold the transactional and operational data identified as essential for the enterprise BI program. For example, this can include enterprise resource planning, customer relationship management, flat files, application programming interfaces, finance, manufacturing and supply chain management systems as well as secondary sources, such as market data and customer databases from outside information providers. As a result, both internal and external data sources are often incorporated into a BI architecture.
  • 9. Important criteria in the data source selection process include data relevancy, data currency, data quality and the level of detail in the available data sets. In addition, a combination of structured, semi-structured and unstructured data types might be required to meet the data analysis and decision-making needs of executives and other end users.
Data integration and cleansing tools. To effectively analyze the collected data for a BI program, an organization must integrate and consolidate different data sets to create unified views of them. The most widely used data integration technology for BI applications is extract, transform and load (ETL) software, which pulls data from source systems in batch processes. A variant of ETL is extract, load and transform (ELT), a technology in which data is extracted and loaded as-is and transformed later for specific BI uses. Other methods include real-time data integration, such as change data capture and streaming integration to support real-time analytics applications, and data virtualization, which combines data from different source systems virtually.
A BI architecture typically also includes data profiling and data cleansing tools that are used to identify and fix data quality issues. They help BI and data management teams provide clean, consistent data that's suitable for BI uses.
Analytics data stores. This encompasses the various repositories where BI data is stored and managed. The primary repository is a data warehouse, which usually stores structured data in a relational, columnar or multidimensional database and makes it available for querying and analysis. An enterprise data warehouse can also be tied to smaller data marts set up for individual departments and business units with data that's specific to their BI needs. In addition, BI architectures often include an operational data store (ODS) that's an interim repository for data before it goes into a data warehouse. 
An ODS can also be used to run analytical queries against recent transaction data. Depending on the size of a BI environment, a data warehouse, data mart and an ODS can be deployed on a single database server or separate business intelligence systems.
A data lake running on a Hadoop cluster or other big data platform can also be incorporated into a BI architecture as a repository for raw data of various types. The data can be analyzed in the data lake itself or filtered and loaded into a data warehouse for analysis. A well-planned architecture should specify which of the different data stores is best suited for particular BI uses.
BI and data visualization tools. The tools used to analyze data and present information to business users include a suite of technologies that can be built into a BI architecture -- for example, ad hoc query, data mining and online analytical processing software. In addition, the growing adoption of self-service BI tools
  • 10. enables business analysts and managers to run queries themselves instead of relying on the members of the BI team to do that for them. BI software also includes data visualization tools that can be used to create graphical representations of data in the form of charts, graphs and other types of visualizations designed to illustrate trends, patterns and outlier elements in data sets.
Dashboards, portals and reports. These information delivery tools give users visibility into the results of BI and analytics applications with built-in data visualizations and, often, self-service capabilities to do additional data analysis. For example, BI dashboards and online portals can be designed to provide real-time data access with configurable views and give users the ability to drill down into data. Reports tend to present data in a more static format.
Other components that increasingly are part of a BI architecture include data preparation software used to structure and organize data for analysis and a metadata repository, a business glossary and a data catalog, which can help users find relevant data and understand its lineage and meaning.
BI architecture tools
BI architecture tools facilitate the centralization of data collection as well as data analysis and visualization. These tools play an integral role in empowering businesses to make informed decisions and extract insights from extensive data sets.
Some examples of BI tools on the market include the following:
1. Datapine. Datapine lets users access, view, analyze and share their company data on a single analytics platform. Users can perform data analysis, create interactive business dashboards and obtain new business insights through a simple drag-and-drop interface.
2. Domo. The Domo cloud-based platform unifies data, systems and people for seamless business operations. 
It provides enterprise tools for data aggregation, analytics, dashboards and reporting for organizations looking to maximize data value.
3. Dundas BI. This enterprise-level BI tool lets users create and customize interactive dashboards and reports. The software can either act as a central data hub or integrate into existing websites for customized BI capabilities.
4. GoodData. As part of the GoodData platform, this tool offers an enterprise-level option for data analytics and business intelligence. It helps users analyze data coming from multiple sources and create reports.
  • 11. 5. Infor Birst. Infor Birst is a cloud-based platform that uses a networked approach and modern enterprise-class architecture with a focus on multi-tenancy. Birst ensures that a company's data remains connected by centralizing both decentralized and centralized data.
6. Microsoft Power BI. Users can run analytics either in the cloud or in a reporting server. The tool comes with built-in artificial intelligence features and offers end-to-end encryption features.
7. Oracle Business Intelligence. This integrated set of tools lets users gather, store, analyze and report data for smart decision-making. In addition, it includes a scalable BI server, dashboards, a content library, web-based reporting and analytics tools.
8. SAS Business Intelligence. This collection of tools lets corporate users conduct self-service analytics. Its two components -- Enterprise Business Intelligence and Business Visualization -- provide interactive visualizations and analytics to aid with data analysis and decision-making.
9. Tableau. In addition to data visualization features, this tool offers live visual analytics and supports most databases and numerous data sources.
10. Zoho Analytics. This self-service BI and data analytics software lets users analyze data, generate data visualizations and uncover insights quickly and easily. This tool is accessible to both small and large-sized organizations.
Online Analytical Processing (OLAP) – Definition, Architecture and Functionality
- The OLAP Council (1997) defines Online Analytical Processing (OLAP) as a group of decision support systems that facilitate fast, consistent and interactive access to information that has been reformulated, transformed and summarized from relational datasets, mainly from a data warehouse, into multidimensional databases.
- OLAP can analyze large amounts of data to extract valuable information. Analytical development can be in the business, education or medical sectors. 
OLAP enables the discovery of patterns and relationships contained in business activity by querying large volumes of data from multiple database source systems at one time.
- Processing database information using OLAP requires an OLAP server to organize, transform and build a Multidimensional Database (MDDB). The MDDB is then separated into cubes for client OLAP tools to perform data analysis, which aims to discover new patterns and relationships between the cubes.
- The data warehouse stores and manages data, while OLAP transforms data warehouse datasets into strategic information. OLAP functions range from
  • 12. basic navigation and browsing (often known as “slice and dice”) to calculations and more serious analysis such as time series and complex modelling.
- As decision-makers implement more advanced OLAP capabilities, they move from basic data access to the creation of information and to the discovery of new knowledge.
CHALLENGES IN OLAP
- Data quality and consistency: Data sources may have different formats, standards, definitions, and levels of granularity, which can cause discrepancies and errors in the cube. For example, different databases may use different currencies, date formats, or units of measurement.
- Cube size and performance: As the amount and complexity of data increases, so does the size of the cube, which can affect the storage space, processing time, and query speed.
- Cube design and maintenance: The cube design involves choosing the dimensions, measures, hierarchies, and calculations that best suit the analytical needs and goals of the users. The cube design also affects the usability, flexibility, and scalability of the cube. However, designing a cube that meets all the requirements and expectations can be difficult and costly, especially if the data sources or business rules change frequently.
- Cube security and access: The cube may contain sensitive or confidential information that needs to be protected from unauthorized or inappropriate use. The cube may also have different types of users with different roles and permissions, such as administrators, analysts, or managers. To ensure the security and access of the cube, some measures such as encryption, authentication, authorization, and auditing can be used. However, these measures also have implications for the complexity, performance, and usability of the cube.
- Cube compatibility and integration: The cube may need to interact with various data sources, such as relational databases, data warehouses, or web services. 
The cube may also need to support various analytical tools, such as reporting, dashboarding, or visualization software. Standards, protocols, and interfaces such as XMLA, MDX, or OLE DB for OLAP can be used to ensure compatibility and integration. However, these standards and protocols also have limitations in terms of complexity, performance, or functionality.
 Cube adoption and usage: The cube may have a steep learning curve, as it requires technical skills and knowledge to understand and manipulate the cube data and functionality. The cube may also have low user
satisfaction, as it may not meet user expectations or preferences in terms of usability, flexibility, or relevance.
Online Analytical Processing (OLAP) Architecture
In comparison to a data warehouse, which is usually based on relational technology, OLAP uses a multidimensional view to aggregate data and provide rapid access to strategic information for analysis. There are three types of OLAP architecture, distinguished by the way they store multidimensional data and perform analysis operations on that dataset: multidimensional OLAP (MOLAP), relational OLAP (ROLAP) and hybrid OLAP (HOLAP).
1. In MOLAP, datasets are stored and summarized in a multidimensional cube. The MOLAP architecture can perform faster than ROLAP and HOLAP. MOLAP cubes are designed and built for rapid data retrieval, which makes slicing and dicing operations efficient. MOLAP can serve complex calculations quickly because they are pre-generated when the cube is created. MOLAP processing is restricted to the cube as initially created and does not extend to additional replications of the cube.
2. In ROLAP, data and aggregations are stored in relational database tables to provide the OLAP slicing and dicing functionality. ROLAP is the slowest of the OLAP flavors. ROLAP relies on manipulating data directly in the relational database to give the appearance of conventional OLAP slicing and dicing. Basically, each slicing and dicing action is equivalent to adding a “WHERE” clause to the SQL statement. ROLAP can manage large amounts of data, and ROLAP itself imposes no limit on data size. ROLAP can leverage the intrinsic functionality of the relational database. ROLAP is slow because each ROLAP activity is essentially one or more SQL queries against the relational database.
The query time and the number of SQL statements executed depend on the complexity of the request and can become a bottleneck if the underlying dataset is large. ROLAP depends on generating SQL statements to query the relational database, and SQL does not cater to every need, so ROLAP technology is conventionally limited by what SQL functionality can offer.
3. HOLAP combines the technologies of MOLAP and ROLAP. Data is stored in ROLAP relational database tables and the aggregations are stored in a MOLAP cube. To obtain summary information, HOLAP leverages cube technology for faster performance. To retrieve detail information, HOLAP can drill down from the multidimensional cube into the underlying relational data.
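The point that each ROLAP slicing or dicing action amounts to adding a “WHERE” clause can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not a description of any particular ROLAP product; the `sales` table, its columns, its rows and the helper `rolap_query` are all hypothetical.

```python
import sqlite3

# Hypothetical relational table standing in for the ROLAP data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, quarter TEXT, item TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Toronto", "Q1", "Mobile", 100),
        ("Toronto", "Q1", "Modem", 20),
        ("Toronto", "Q2", "Modem", 50),
        ("Vancouver", "Q1", "Mobile", 70),
        ("Vancouver", "Q2", "Mobile", 30),
    ],
)


def rolap_query(filters):
    """Total sales per city; every slice/dice condition becomes a WHERE predicate.

    The filter keys are trusted column names in this sketch; only the values
    are passed as bound parameters.
    """
    sql = "SELECT city, SUM(amount) FROM sales"
    if filters:
        sql += " WHERE " + " AND ".join(f"{column} = ?" for column in filters)
    sql += " GROUP BY city ORDER BY city"
    return conn.execute(sql, list(filters.values())).fetchall()


all_sales = rolap_query({})                                     # no selection
slice_q1 = rolap_query({"quarter": "Q1"})                       # one predicate
dice_q1_mobile = rolap_query({"quarter": "Q1", "item": "Mobile"})  # two predicates
```

Each additional selection only lengthens the WHERE clause; the relational engine still scans and aggregates the base rows on every request, which is why ROLAP response time grows with the size of the underlying tables.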
In Online Analytical Processing (OLAP) architectures (MOLAP, ROLAP and HOLAP), the datasets are organized in a multidimensional format, as analysis involves the creation of multidimensional blocks called data cubes. A cube in an OLAP architecture may have three axes (dimensions) or more. Each axis (dimension) represents a logical category of data. One axis may, for example, represent the geographic location of the data, while others may indicate a point in time or a specific school. Each of the categories can be broken down into successive levels, and it is possible to drill up or down between the levels.
OLAP partitions are normally stored on an OLAP server, with the relational database frequently stored on a separate server. The OLAP server must then query across the network whenever it needs to access the relational tables to resolve a query. The impact of querying across the network depends on the performance characteristics of the network itself. Even when the relational database is placed on the same server as the OLAP server, inter-process calls and the associated context switching are required to retrieve relational data. With an OLAP partition, calls to the relational database, whether local or over the network, do not occur during querying.
Online Analytical Processing (OLAP) Functionality
OLAP functionality offers dynamic multidimensional analysis, supporting end users with analytical activities that include calculations and modelling applied across dimensions, trend analysis over time periods, slicing subsets for on-screen viewing, and drilling to deeper levels of records. OLAP is implemented in a multi-user client/server environment and provides reliably fast responses to queries, regardless of database size and complexity.
OLAP helps end users integrate enterprise information through comparative, customized viewing and through analysis of historical and present data in various “what-if” data model scenarios. This functionality is provided by an OLAP server. The design and data structures of the OLAP server are optimized for fast information retrieval in any orientation and for flexible calculation and transformation of raw data. The OLAP server may either physically stage the processed multidimensional information to deliver consistent and fast response times to end users, or it may fill its data structures in real time from relational databases, or offer a choice of both.
Essentially, OLAP creates information in cube form, which allows more complex analysis than a relational database. OLAP analysis techniques employ ‘slice and dice’ and ‘drilling’ methods to segregate data into sets of information
depending on given parameters. A slice fixes a single value for one or more dimensions, yielding a subset of the multidimensional array. A dice applies such selections on two or more dimensions of the multidimensional cube. Drilling allows the end user to traverse between summarized data and the most detailed data unit.
Multidimensional Database Schema
The base of every data warehouse system is a relational database built using a dimensional model. A dimensional model consists of fact and dimension tables, arranged as a star schema or a snowflake schema. A schema is a collection of database objects, including tables, views and indexes.
In designing data models for a data warehouse, the most commonly used schema types are the star schema and the snowflake schema. In the star schema design, the fact table sits in the middle and is connected to the surrounding dimension tables like a star. A star schema can be simple or complex. A simple star consists of one fact table; a complex star can have more than one fact table.
Most data warehouses use a star schema to represent the multidimensional data model. In its simple form, the database consists of a single fact table and a single table for each dimension. Each tuple in the fact table consists of a pointer, or foreign key, to each of the dimensions that provide its multidimensional coordinates, and stores the numeric measures for those coordinates. A tuple is thus a unit of data identified in the cube by members from one or more dimension tables. Each dimension table consists of columns that correspond to attributes of the dimension.
Star schemas do not explicitly support attribute hierarchies, which makes them less suitable for architectures such as MOLAP that require many dimension hierarchies for efficient drilling of datasets. Snowflake schemas provide a refinement of star schemas in which the dimensional hierarchy is explicitly represented by normalizing the dimension tables.
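As a small sketch of the star schema described above, the following uses Python's built-in sqlite3 module to create one fact table whose rows hold a foreign key to each dimension plus a numeric measure. The table names (`dim_time`, `dim_item`, `fact_sales`), columns and rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One table per dimension; columns are attributes of the dimension.
    CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
    CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, item_type TEXT);

    -- Fact table: foreign keys give the multidimensional coordinates,
    -- units_sold is the numeric measure stored at those coordinates.
    CREATE TABLE fact_sales (
        time_id INTEGER REFERENCES dim_time(time_id),
        item_id INTEGER REFERENCES dim_item(item_id),
        units_sold INTEGER
    );

    INSERT INTO dim_time VALUES (1, 'Q1', 2023), (2, 'Q2', 2023);
    INSERT INTO dim_item VALUES (1, 'Mobile'), (2, 'Modem');
    INSERT INTO fact_sales VALUES (1, 1, 10), (1, 2, 5), (2, 1, 7), (2, 1, 3);
    """
)

# A typical star-join: resolve each key against its dimension, then aggregate.
units_by_quarter_and_item = conn.execute(
    """
    SELECT t.quarter, i.item_type, SUM(f.units_sold)
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_id = f.time_id
    JOIN dim_item AS i ON i.item_id = f.item_id
    GROUP BY t.quarter, i.item_type
    ORDER BY t.quarter, i.item_type
    """
).fetchall()
```

In a snowflake variant of this sketch, an attribute such as `item_type` would move into its own lookup table referenced from `dim_item`, adding one more join to the query.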
The main advantage of the snowflake schema is the improvement in query performance due to minimized disk storage requirements and the joining of smaller lookup tables. The main disadvantage of the snowflake schema is the additional maintenance effort needed due to the increased number of lookup tables.
In addition to the fact and dimension tables, data warehouses store selected summary tables containing pre-aggregated data. In the simplest cases, the pre-aggregated data corresponds to aggregating the fact table on one or more selected dimensions. Such pre-aggregated summary data can be represented in the database
in at least two ways. Whether to use a star or a snowflake schema mainly depends on business needs.
OLAP Evaluation
As OLAP technology takes a prominent place in the data warehouse industry, there should be a suitable assessment tool to evaluate it. E.F. Codd, who coined the term OLAP, also provided a set of criteria known as the ‘Twelve Rules’ for assessing the capability of OLAP products:
1. Multidimensional conceptual view. OLAP operates with cubes of data that represent a multidimensional construct of data. Even though the name implies three-dimensional data, the number of possible dimensions is practically unlimited.
2. Transparency. OLAP systems should be part of an open system that supports heterogeneous data sources.
3. Accessibility. The OLAP system should present the user with a single logical schema of the data.
4. Consistent reporting performance. Performance should not degrade as the number of dimensions in the model increases.
5. Client/server architecture. The system should be based on open, modular systems.
6. Generic dimensionality. The system is not limited to 3-D and not biased toward any particular dimension. A function applied to one dimension should also be applicable to another.
7. Dynamic sparse-matrix handling. Related both to the idea of nulls in relational databases and to the notion of compressing large files, a sparse matrix is one in which not every cell contains data. OLAP systems should accommodate varying storage and data-handling options.
8. Multiuser support. OLAP systems should support more than one user at a time.
9. Unrestricted cross-dimensional operations. Similar to the rule of generic dimensionality: all dimensions are created equal, and operations across data dimensions should not restrict relationships between cells.
10. Intuitive data manipulation. Ideally, users shouldn’t have to use menus or perform complex multiple-step operations when an intuitive drag-and-drop action will do.
11. Flexible reporting. Save a tree. Users should be able to print just what they need, and any changes to the underlying financial model should be automatically reflected in reports.
12. Unlimited dimensional and aggregation levels. The OLAP cube can be built with unlimited dimensions, and aggregation of the contained data also has no practical limits.
Codd’s twelve rules give us an essential tool for verifying that the OLAP functions and OLAP models in use are able to produce the desired results. A good OLAP system
should also provide complete database management tools as an integrated, centralized utility that lets database administrators distribute databases within the enterprise. The OLAP drilling mechanism within the MDDB allows drill-down right to the source, or root, detail record level. This implies that the OLAP tool permits a smooth transition from the MDDB to the detail record level of the source relational database. OLAP systems must also support incremental database refreshes. This is an important feature for preventing stability issues in operations and usability problems as the size of the database increases.
OLTP and OLAP
The design of OLAP for multidimensional cubes is entirely different from that of OLTP (Online Transactional Processing) for databases. OLTP is implemented on relational databases to support daily processing in an organization. The main function of an OLTP system is to capture data into computers. OLTP allows effective data manipulation and storage for daily operations, resulting in huge quantities of transactional data. Organizations build multiple OLTP systems to handle huge quantities of daily operational transaction data in a short period of time.
OLAP is designed for data access and analysis to support the strategic decision-making process of managerial users. OLAP technology focuses on aggregating datasets into a multidimensional view without hindering system performance. An OLTP system is described as “customer-oriented”, while an OLAP system is “market-oriented”. The major differences between OLTP and OLAP systems are shown below.
Differences | OLTP | OLAP
Characteristics | Can handle large numbers of small online transactions. | Handles large volumes of data.
Query | Simple queries, such as Insert, Delete, and Update of information. | Complex queries which require aggregations.
Database Design | Normalized, with many tables. | Usually with fewer tables; can include star or snowflake schemas.
Method | Uses a traditional DBMS. | Uses data warehouses.
Sources | The OLTP system itself and its transactions are the sources of data. | The various OLTP databases become the data sources for OLAP.
Data Quality | Huge effort to ensure the data is ACID-compliant. | The data may not be as organized, but what really matters is the capacity to navigate through the dimensions of the data.
Functionality | Online database which modifies a system by controlling and running essential business tasks in real time. | Online database query management system that allows users to discover hidden insights, plan, support decisions, and solve problems.
Speed | Typically very fast processing. | Depends on the amount of data. Creating indexes can enhance query speed.
Backup and Recovery | Regular backups are vital to ensure the business keeps running, since data loss can lead to monetary loss and legal issues. | Requires backup from time to time; lost data can be reloaded from the OLTP database when needed.
It is complicated to merge OLAP and OLTP into one centralized database system. The dimensional data design model used in OLAP is much more effective for querying than the relational design used in OLTP systems. OLAP may use one central database as its data source, whereas OLTP uses different data sources at different database sites. The dimensional design of OLAP is not suitable for an OLTP system, mainly because of redundancy and the loss of referential integrity of the data. Organizations therefore choose to have two separate information systems, one OLTP and one OLAP.
We can conclude that the purpose of OLTP systems is to get data into computers, whereas the purpose of OLAP is to get data or information out of computers.
OLAP Operations in the Multidimensional Data Model
In the multidimensional model, records are organized into various dimensions, and each dimension includes multiple levels of abstraction described by concept hierarchies. This organization gives users the flexibility to view data from various perspectives. A number of OLAP data cube operations exist to materialize these different views, allowing interactive querying and analysis of the data at hand. Hence, OLAP supports a user-friendly environment for interactive data analysis.
Consider the OLAP operations which are to be performed on multidimensional data.
The figure shows data cubes for the sales of a shop. The cube contains the dimensions location, time, and item, where location is aggregated with respect to city values, time is aggregated with respect to quarters, and item is aggregated with respect to item types.
Roll-Up
Roll-up is like zooming out on the data cube. The figure shows the result of a roll-up operation performed on the dimension location. The hierarchy for location is defined as the order street < city < province or state < country. The roll-up operation aggregates the data by ascending the location hierarchy from the level of the city to the level of the country.
When a roll-up is performed by dimension reduction, one or more dimensions are removed from the cube.
Example
Consider the following cube, which records the temperatures observed on certain days, grouped by week:
Temperature | 64 | 65 | 68 | 69 | 70 | 71 | 72 | 75 | 80 | 81 | 83 | 85
Week1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
Week2 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 2 | 0 | 1 | 0 | 0
Suppose we want to set up the levels hot (80-85), mild (70-75) and cool (64-69) for temperature in the above cube. To do this, we group the columns and add up the values according to the concept hierarchy. This operation is known as a roll-up. By doing this, we obtain the following cube:
Temperature | cool | mild | hot
Week1 | 2 | 1 | 1
Week2 | 1 | 3 | 1
The roll-up operation groups the information by levels of temperature.
The following diagram illustrates how roll-up works.
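The grouping-and-summing step of the temperature roll-up can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable and function names are illustrative, and the level ranges are the ones stated in the example (cool 64-69, mild 70-75, hot 80-85).

```python
# Column headers and per-week counts from the temperature cube.
temperatures = [64, 65, 68, 69, 70, 71, 72, 75, 80, 81, 83, 85]
counts = {
    "Week1": [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0],
    "Week2": [0, 0, 0, 1, 0, 0, 1, 2, 0, 1, 0, 0],
}

# Concept hierarchy: each temperature value belongs to one level.
levels = {"cool": range(64, 70), "mild": range(70, 76), "hot": range(80, 86)}


def roll_up(row):
    """Sum the counts of all temperature columns that fall into each level."""
    totals = {level: 0 for level in levels}
    for temperature, count in zip(temperatures, row):
        for level, bounds in levels.items():
            if temperature in bounds:
                totals[level] += count
    return totals


rolled = {week: roll_up(row) for week, row in counts.items()}
```

Summing the grouped columns gives cool 2, mild 1, hot 1 for Week1 and cool 1, mild 3, hot 1 for Week2.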
Drill-Down
Drill-down is like zooming in on the data cube to reach more detailed data. It can be performed either by stepping down a concept hierarchy for a dimension or by adding additional dimensions.
The figure shows a drill-down operation performed on the dimension time by stepping down a concept hierarchy which is defined as day < month < quarter < year. Drill-down occurs by descending the time hierarchy from the level of the quarter to the more detailed level of the month.
Because a drill-down adds more detail to the given data, it can also be performed by adding a new dimension to a cube. For example, a drill-down on the central cube of the figure can occur by introducing an additional dimension, such as a customer group.
Example
Drill-down adds more detail to the given data on the data cube. It navigates from less detailed records to more detailed data. Drilling down on the weekly temperature cube to the level of individual days gives the following cube:
Temperature | cool | mild | hot
Day 1 | 0 | 0 | 0
Day 2 | 0 | 0 | 0
Day 3 | 0 | 0 | 1
Day 4 | 0 | 1 | 0
Day 5 | 1 | 0 | 0
Day 6 | 0 | 0 | 0
Day 7 | 1 | 0 | 0
Day 8 | 0 | 0 | 0
Day 9 | 1 | 0 | 0
Day 10 | 0 | 1 | 0
Day 11 | 0 | 1 | 0
Day 12 | 0 | 1 | 0
Day 13 | 0 | 0 | 1
Day 14 | 0 | 0 | 0
The following diagram illustrates how Drill-down works.
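The relationship between the weekly and the daily view can be sketched in Python as follows. The daily readings are taken from the day-level table; the assumption that Week 1 covers Days 1-7 and Week 2 covers Days 8-14, and the names `observed`, `weekly_summary` and `drill_down`, are illustrative.

```python
LEVELS = ("cool", "mild", "hot")

# Day -> observed temperature level; days without a reading are omitted.
observed = {3: "hot", 4: "mild", 5: "cool", 7: "cool", 9: "cool",
            10: "mild", 11: "mild", 12: "mild", 13: "hot"}


def week_of(day):
    """Assumed calendar mapping: days 1-7 are week 1, days 8-14 are week 2."""
    return (day - 1) // 7 + 1


def weekly_summary():
    """Summarized (rolled-up) view: readings per level for each week."""
    summary = {}
    for day, level in observed.items():
        week_counts = summary.setdefault(week_of(day), dict.fromkeys(LEVELS, 0))
        week_counts[level] += 1
    return summary


def drill_down(week):
    """Detailed view: one (day, cool, mild, hot) row per day of the given week."""
    first_day = (week - 1) * 7 + 1
    return [
        (day, *(int(observed.get(day) == level) for level in LEVELS))
        for day in range(first_day, first_day + 7)
    ]


week1_days = drill_down(1)
```

Drilling down does not change any totals: adding up the rows returned by `drill_down(week)` reproduces that week's entry in `weekly_summary()`.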
Slice
A slice is a subset of the cube corresponding to a single value for one or more members of a dimension. For example, a slice operation is executed when the user wants a selection on one dimension of a three-dimensional cube, resulting in a two-dimensional slice. The slice operation thus performs a selection on one dimension of the given cube, resulting in a subcube.
For example, if we make the selection temperature = cool on the daily cube, we obtain the following cube:
Temperature | cool
Day 1 | 0
Day 2 | 0
Day 3 | 0
Day 4 | 0
Day 5 | 1
Day 6 | 0
Day 7 | 1
Day 8 | 0
Day 9 | 1
Day 10 | 0
Day 11 | 0
Day 12 | 0
Day 13 | 0
Day 14 | 0
The following diagram illustrates how Slice works.
Here Slice is applied to the dimension "time" using the criterion time = "Q1". It forms a new sub-cube by selecting one or more dimensions.
Dice
The dice operation defines a subcube by performing a selection on two or more dimensions.
For example, applying the selection (time = day 3 OR time = day 4) AND (temperature = cool OR temperature = hot) to the original cube, we get the following subcube (still two-dimensional):
Temperature | cool | hot
Day 3 | 0 | 1
Day 4 | 0 | 0
Consider the following diagram, which shows the dice operation.
The dice operation on the cube based on the following selection criteria involves three dimensions.
o (location = "Toronto" or "Vancouver")
o (time = "Q1" or "Q2")
o (item = "Mobile" or "Modem")
Pivot
The pivot operation is also called a rotation. Pivot is a visualization operation which rotates the data axes in view to provide an alternative presentation of the data. It may involve swapping the rows and columns or moving one of the row dimensions into the column dimensions.
Consider the following diagram, which shows the pivot operation.
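Slice, dice and pivot can be sketched on the small day-by-temperature cube used in the examples. The cube is represented here as a plain nested dictionary, which is an illustrative simplification rather than how an OLAP server stores data; the function names are hypothetical.

```python
LEVELS = ("cool", "mild", "hot")

# Day -> observed temperature level, as in the day-level temperature table.
observed = {3: "hot", 4: "mild", 5: "cool", 7: "cool", 9: "cool",
            10: "mild", 11: "mild", 12: "mild", 13: "hot"}

# Two-dimensional cube: cube[day][level] is 1 if that level was observed, else 0.
cube = {
    day: {level: int(observed.get(day) == level) for level in LEVELS}
    for day in range(1, 15)
}


def slice_cube(cube, level):
    """Slice: fix a single value on one dimension (temperature = level)."""
    return {day: cells[level] for day, cells in cube.items()}


def dice_cube(cube, days, levels):
    """Dice: select a set of values on two dimensions at once."""
    return {day: {level: cube[day][level] for level in levels} for day in days}


def pivot(cube):
    """Pivot: swap the row and column dimensions without changing any cell."""
    rotated = {}
    for row_key, cells in cube.items():
        for column_key, value in cells.items():
            rotated.setdefault(column_key, {})[row_key] = value
    return rotated


cool_slice = slice_cube(cube, "cool")
diced = dice_cube(cube, days=(3, 4), levels=("cool", "hot"))
rotated = pivot(diced)
```

Here `diced` holds cool = 0, hot = 1 for Day 3 and cool = 0, hot = 0 for Day 4, and `rotated` presents the same four cells with temperature levels as rows and days as columns.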
Other OLAP Operations
Drill-across: Executes queries containing more than one fact table.
Drill-through: The drill-through operation makes use of relational SQL facilities to drill through the bottom level of a data cube down to its back-end relational tables.
Other OLAP operations may include ranking the top-N or bottom-N elements in lists, as well as calculating moving averages, growth rates, interest, internal rates of return, depreciation, currency conversions, and statistical tasks.
OLAP offers analytical modeling capabilities, including a calculation engine for determining ratios, variance, etc., and for computing measures across various dimensions. It can generate summarizations, aggregations, and hierarchies at each granularity level and at every dimension intersection. OLAP also provides functional models for forecasting, trend analysis, and statistical analysis. In this context, the OLAP engine is a powerful data analysis tool.
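Three of the calculations mentioned above (top-N ranking, moving averages and growth rates) can be sketched in plain Python. The sales figures and names are hypothetical, and a real OLAP engine would compute such measures inside its calculation engine rather than in client code.

```python
def top_n(totals, n):
    """Rank members by their measure and keep the n largest."""
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)[:n]


def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[end - window + 1 : end + 1]) / window
        for end in range(window - 1, len(values))
    ]


def growth_rates(values):
    """Relative change from each period to the next."""
    return [(current - previous) / previous
            for previous, current in zip(values, values[1:])]


# Hypothetical measures: sales per item type and sales per quarter.
sales_by_item = {"Mobile": 300, "Modem": 120, "Phone": 200}
quarterly_sales = [100, 120, 90, 150]

best_two = top_n(sales_by_item, 2)
smoothed = moving_average(quarterly_sales, window=2)
growth = growth_rates(quarterly_sales)
```

A bottom-N list follows from the same ranking by sorting in ascending order instead.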
UNIT III
Data Warehousing
A Data Warehouse (DW) is a repository of a huge amount of organized data. This data is consolidated from one or more different data sources. A DW is a relational database that is mainly designed for analytical reporting and timely decision making in organizations.
The data for this purpose is isolated from the source transaction data and optimized, so that analysis has no impact on the main business. If an organization introduces any business change, the DW is used to examine the effects of that change, and hence the DW is also used to monitor the non-decision-making process.
The data warehouse is mostly a read-only system, as operational data is kept separate from the DW. This provides an environment in which large amounts of data can be retrieved with well-written queries.
Thus the DW acts as the backend engine for Business Intelligence tools, which show reports and dashboards to business users. DWs are used extensively in the banking, financial, and retail sectors, among others.
Listed below are some of the reasons why a Data Warehouse is crucial.
1. The data warehouse gathers all the operational data from several heterogeneous sources of “different formats” and, through the process of extract, transform and load (ETL), loads the data into the DW in a “standardized dimensional format” across the organization.
2. The data warehouse maintains both “current data and historical data” for analytical reporting and fact-based decision making.
3. It helps organizations take “smarter and quicker decisions” on reducing costs and increasing revenue, by comparing quarterly and annual reports to improve their performance.
Types Of Data Warehouse Applications
Business Intelligence (BI) is a branch of data warehousing designed for decision making. Once the data is loaded into the DW, BI plays a major role by analyzing the data and presenting it to the business users.
In practice, the term “data warehouse applications” refers to the different ways in which the data can be processed and utilized.
We have three types of DW applications, as mentioned below.
1. Information processing
2. Analytical processing
3. Data mining, which serves the purpose of BI
#1) Information Processing
This is a kind of application where the data warehouse allows direct one-to-one contact with the data stored in it.
The data can be processed by writing direct queries on it or by performing a basic statistical analysis, and the end results are reported to the business users in the form of reports, tables, charts or graphs.
The DW supports the following tools for information processing:
(i) Query Tools: The business user or analyst runs queries using query tools to explore the data and generate output in the form of reports or graphics as per the business requirement.
(ii) Reporting Tools: If the business wants to see the results in a defined format and on a scheduled basis, i.e. daily, weekly or monthly, then reporting tools are used. These kinds of reports can be saved and reviewed at any time.
(iii) Statistics Tools: If the business wants to analyze a broad view of the data, statistics tools are used to generate such results. Businesses can draw conclusions and make predictions by understanding these strategic results.
#2) Analytical Processing
This is a kind of application where the data warehouse allows the analytical processing of the data stored in it. The data can be analyzed with the following operations: Slice-and-Dice, Drill Down, Roll Up and Pivoting.
(i) Slice-and-Dice: The data warehouse allows slice-and-dice operations to analyze data accessed at many levels and from a combination of different perspectives. The slice-and-dice operation internally uses the drill-down mechanism. Slicing works on dimensional data.
As part of the business requirement, if we focus on a single area, slicing analyzes the dimensions of that particular area as per the requirements and gives the results. Dicing works on analytic operations. Dicing zooms in on a specific set of attributes over all the dimensions to provide diverse perspectives. The dimensions are considered from one or more consecutive slices.
(ii) Drill Down: If the business wants to go to a more detailed level of any summary number, drill down is the operation for navigating from that summary to finer levels of detail. This gives a good idea of what is happening and where the business has to focus more closely.
Drill down tracks from the hierarchy level down to the finest level of detail for root cause analysis. This can be easily understood with an example: a sales drill down can proceed from Country level -> Region level -> State level -> District level -> Store level.
(iii) Roll up: Roll up works opposite to the drill-down operation. If the business wants summarized data, roll up comes into the picture. It aggregates the detail-level data by moving up in the dimensional hierarchy.
Roll-ups are used to analyze the development and performance of a system.
This can be understood with an example: in a sales roll up, the totals can be rolled up from City level -> State level -> Region level -> Country level.
(iv) Pivot: Pivoting analyzes dimensional data by rotating the data on the cubes. For example, the row dimension can be swapped into the column dimension and vice versa.
#3) Data Mining
This is a kind of application where the data warehouse allows knowledge discovery from the data, with the results represented using visualization tools. In the above two types of applications, the information is driven by the users.
As the data in various businesses grows vast, it becomes difficult to query and drill down the data warehouse to get all possible insights into the data. Data mining then comes into the picture to accomplish the discovery of knowledge. It delves into the data with all its past associations, results, etc., and predicts the future. Hence this is data-driven and not user-driven.
Knowledge can be discovered by finding hidden patterns, associations, classifications, and predictions. Data mining goes in depth into the data to predict the future. Based on the predictions, it also suggests the actions to take.
Given below are the various activities of data mining:
Patterns: Data mining discovers patterns that occur in the database. Users can provide the business inputs on which some knowledge of the patterns is expected for decision making.
Associations/Relationships: Data mining discovers relationships between objects, together with the frequency of their association rules. The relationship may be between two or more objects, or it may uncover rules within the properties of the same object.
Classification: Data mining organizes data into a set of pre-defined classes. If any object is picked from the data, classification assigns the respective class label to that object.
Prediction: Data mining compares a set of existing values to find the best possible future values or trends in the business.
Hence, based on all the above results, data mining also proposes a set of actions to be taken.
Kimball’s approach versus Inmon’s approach to Data warehousing
The Kimball method
Ralph Kimball (born in 1944) is one of the original pioneers of data warehousing design. At the heart of his method stands the concept that a database must be designed to be understood and to operate fast. His approach is a business-oriented approach, which first takes into consideration the specific business requirements of an organization and builds the data warehouse on top of them. For example, an organization can have multiple data sources in its operational system (OLTP), but only the necessary ones will be transferred into the data warehouse, after being cleaned by an ETL process. There, the data is stored in a relational data model made up of fact and
  • 32. 32 | P a g e dimension tables. The following illustration shows the data flow process of the Kimball method: Kimball approach The Inman Method Bill Inman (born in 1945) is recognized by many as the “Father of data warehousing”. his approach is considered a Technical Approach and is the opposite approach of Kimball. From Inman’s point of view, it’s essential to first transfer all the data from the operating system (OLTP) into the data warehouse to serve as a single source of truth for one’s organization, storing it in highly normalized data model. Only then, the relevant data will be transferred into one or more data marts, according to the organization’s requirements. The following illustration shows the data flow process of the Inman method: The Inman approach
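The warehouse-first data flow described above (a normalized warehouse, then data marts derived from it) can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not a reference design: the table names (`region`, `city`, `sale`, `sales_mart`), the columns, and the sample rows are all hypothetical, and a real warehouse would be loaded by an ETL process rather than by inline `INSERT` statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Step 1: normalized warehouse tables (hypothetical names).
# Each fact is stored once and linked to the others by keys.
cur.executescript("""
CREATE TABLE region (region_id INTEGER PRIMARY KEY, region_name TEXT);
CREATE TABLE city   (city_id INTEGER PRIMARY KEY, city_name TEXT,
                     region_id INTEGER REFERENCES region(region_id));
CREATE TABLE sale   (sale_id INTEGER PRIMARY KEY,
                     city_id INTEGER REFERENCES city(city_id),
                     amount REAL);
INSERT INTO region VALUES (1, 'West'), (2, 'East');
INSERT INTO city   VALUES (1, 'Portland', 1), (2, 'San Diego', 1), (3, 'Albany', 2);
INSERT INTO sale   VALUES (1, 1, 50.0), (2, 2, 80.0), (3, 2, 20.0), (4, 3, 70.0);
""")

# Step 2: derive a denormalized data mart for one reporting need
# by joining the normalized tables and aggregating the sales.
cur.execute("""
CREATE TABLE sales_mart AS
SELECT r.region_name, c.city_name, SUM(s.amount) AS total_amount
FROM sale s
JOIN city c   ON c.city_id = s.city_id
JOIN region r ON r.region_id = c.region_id
GROUP BY r.region_name, c.city_name
""")

# Reports can now read the mart without any joins.
mart_rows = cur.execute(
    "SELECT region_name, city_name, total_amount FROM sales_mart "
    "ORDER BY region_name, city_name"
).fetchall()
conn.close()
```

The joins needed to answer a question from the normalized tables are paid once, when the mart is built. Reports that read the denormalized mart avoid them, which is the trade-off between the two modeling styles.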
  • 33. The next table summarizes the main differences between the two methods:
    Method | Kimball | Inmon
    Set-up | Relatively fast, because only part of the data is transferred to the data warehouse | Relatively slow, because all the company data has to be transferred to the data warehouse
    Performance | Relatively fast, because the data is split into fact and dimension tables in a denormalized way | Relatively slow, due to the highly normalized structure of the data model
    Data modeling complexity | Star or snowflake schemas, which are considered user-friendly and easy to understand | The data model can become overly complex over time as more tables are joined together
    Costs | Relatively low, because only part of the organization’s data is transferred | Relatively high, due to the transfer and storage of the organization’s entire data
    Reporting ability | Because not all the data is transferred, some of the organization’s reporting needs can be difficult to meet | Any reporting need of the organization is covered
    The conclusion
    Both the Inmon and Kimball methods can be applied in different scenarios, and each method has its own advantages and disadvantages. To this day, the most common DWH structure uses the Kimball method because it is a business-minded approach rather than a technical one. It is generally cheaper to set up and delivers faster results with a better ROI.