Homework Help
https://www.homeworkping.com/
Research Paper help
https://www.homeworkping.com/
Online Tutoring
https://www.homeworkping.com/
Case Study on
OLAP
Name: VIJAY S. GACHANDE
Class: TEIT
Roll No.: 06
Subject: DBT
Batch No.: B1
OLAP (On-Line Analytical Processing):
Why OLAP?
Need for OLAP:
1. Multidimensional data makes it easy to
-Select data
-Navigate data
-Integrate data
-Explore data
2. An analytical query language makes it possible to
-Filter data relationships
-Aggregate data relationships
-Merge data relationships
-Explore complex data relationships
OLAP CUBE
An OLAP cube, as shown in Fig. 1.1.1, is a data structure that allows fast analysis of
data. The arrangement of data into cubes overcomes a limitation of relational databases.
Relational databases are not well suited for near-instantaneous analysis and display of large
amounts of data. Instead, they are better suited for creating records from a series of
transactions, a workload known as OLTP or On-Line Transaction Processing. Although many
report-writing tools exist for relational databases, these are slow when the whole database
must be summarized.
Fig. 1.1.1: OLAP cube
The OLAP cube consists of numeric facts called measures which are categorized by
dimensions. The cube metadata is typically created from a star schema or snowflake
schema of tables in a relational database. Measures are derived from the records in the
fact table and dimensions are derived from the dimension tables.
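The relationship between a star schema and a cube can be sketched in a few lines of Python. This is only an illustration under assumed data: the tables `product_dim`, `store_dim` and `sales_fact`, and the helper `build_cube`, are hypothetical names and do not come from any OLAP product.

```python
from collections import defaultdict

# Hypothetical star schema: one fact table plus two dimension tables.
product_dim = {1: "Laptop", 2: "Phone"}   # product_id -> product name
store_dim = {10: "Pune", 20: "Mumbai"}    # store_id -> city
sales_fact = [                            # (product_id, store_id, units_sold)
    (1, 10, 5),
    (1, 20, 3),
    (2, 10, 7),
    (1, 10, 2),
]

def build_cube(facts):
    """Aggregate the measure (units sold) into cells keyed by dimension members."""
    cube = defaultdict(int)
    for product_id, store_id, units in facts:
        cell = (product_dim[product_id], store_dim[store_id])
        cube[cell] += units
    return dict(cube)

cube = build_cube(sales_fact)
print(cube[("Laptop", "Pune")])  # 7
```

The measure comes from the fact rows, while the coordinates of each cell come from the dimension tables, which mirrors the description above.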
CLASSIFICATIONS OF OLAP
In the OLAP world, there are three main types: Multidimensional OLAP (MOLAP),
Relational OLAP (ROLAP) and Hybrid OLAP (HOLAP).
1. MOLAP:
This is the more traditional way of OLAP analysis. In MOLAP, data is stored in a
multidimensional cube. The storage is not in the relational database, but in proprietary
formats. Multidimensional OLAP is one of the oldest segments of the OLAP market. The
business problem MOLAP addresses is the need to compare, track, analyze and forecast
high level budgets based on allocation scenarios derived from actual numbers. The first
forays into data warehousing were led by the MOLAP vendors who created special
purpose databases that provided a cube-like structure for performing data analysis.
MOLAP tools restructure the source data so that it can be accessed, summarized,
filtered and retrieved almost instantaneously. As a general rule, MOLAP tools provide a
robust solution to data warehousing problems. Administration, distribution, metadata
creation and deployment are all controlled from a central point. Deployment and
distribution can be achieved over the Web and with client/server models.
Functions of MOLAP
-Produces a hypercube
-Pre-aggregated and pre-calculated
-Rapid response times
-Limited in the amount of data that can be managed
When Do MOLAP Tools Bring Value?
• Need to process information with a consistent response time regardless of the
level of summarization or calculations selected.
• Need to avoid many of the complexities of creating a relational database to
store data for analysis.
• Need the fastest possible performance.
Who Should Have Access to MOLAP Tools?
• Users who are connected to a network and need to analyze larger, less defined
data.
• Users who want to access predefined reports, but need to have the ability to
perform additional analysis on information that may not be contained in the
report.
Advantages:
• Excellent performance: MOLAP cubes are built for fast data retrieval, and are
optimal for slicing and dicing operations.
• Can perform complex calculations: All calculations have been pre-generated
when the cube is created. Hence, complex calculations are not only doable,
but they return quickly.
Disadvantages:
• Limited in the amount of data it can handle: Because all calculations are
performed when the cube is built, it is not possible to include a large amount
of data in the cube itself. This is not to say that the data in the cube cannot be
derived from a large amount of data. Indeed, this is possible. But in this case,
only summary-level information will be included in the cube itself.
• Requires additional investment: Cube technology is often proprietary and does
not already exist in the organization. Therefore, to adopt MOLAP technology,
chances are that additional investments in human and capital resources are
needed.
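The trade-off between pre-calculation and cube size can be illustrated with a small Python sketch. It assumes a toy fact list and a hypothetical `precompute` helper; real MOLAP engines use proprietary storage formats that this does not attempt to reproduce.

```python
from itertools import combinations
from collections import defaultdict

DIMENSIONS = ("product", "city", "year")
facts = [  # hypothetical detail rows
    {"product": "Laptop", "city": "Pune", "year": 2006, "units": 5},
    {"product": "Laptop", "city": "Mumbai", "year": 2006, "units": 3},
    {"product": "Phone", "city": "Pune", "year": 2005, "units": 7},
]

def precompute(facts):
    """Build every aggregate (one per subset of dimensions) when the cube is built."""
    aggregates = {}
    for size in range(len(DIMENSIONS) + 1):
        for group in combinations(DIMENSIONS, size):
            totals = defaultdict(int)
            for row in facts:
                key = tuple(row[d] for d in group)
                totals[key] += row["units"]
            aggregates[group] = dict(totals)
    return aggregates

aggregates = precompute(facts)
# At query time an answer is a dictionary lookup, not a scan of the facts.
print(aggregates[("city",)][("Pune",)])  # 12
print(aggregates[()][()])                # 15
print(len(aggregates))                   # 8 aggregates for 3 dimensions
```

With n dimensions this sketch stores 2^n aggregates, which hints at why response times are fast and why the amount of data that fits in the cube is limited.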
2. ROLAP
This methodology relies on manipulating the data stored in the relational database
to give the appearance of traditional OLAP's slicing and dicing functionality. In essence,
each action of slicing and dicing is equivalent to adding a "WHERE" clause to the SQL
statement. As data warehouse sizes have grown, users have come to realize that they cannot
store all of the information that they need in MOLAP databases. The business problem that
ROLAP addresses is the need to analyze massive volumes of data without having to be a
systems or software expert. Relational OLAP databases seek to resolve this dilemma by
providing a multidimensional front end that creates queries to process information in a
relational format. These tools provide the ability to transform two-dimensional relational
data into multidimensional information.
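The idea that each slice or dice adds a WHERE condition can be sketched with Python's built-in sqlite3 module. The `sales` table, its columns and the `units_by_product` helper are assumptions made for this example; an actual ROLAP tool generates far more elaborate SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, city TEXT, year INTEGER, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [("Laptop", "Pune", 2006, 5), ("Laptop", "Mumbai", 2006, 3),
     ("Phone", "Pune", 2005, 7), ("Phone", "Pune", 2006, 4)],
)

def units_by_product(conn, filters):
    """Generate a GROUP BY query; each slice/dice adds one WHERE condition."""
    allowed = {"product", "city", "year"}  # only known dimension columns
    clauses, params = [], []
    for column, value in filters.items():
        if column not in allowed:
            raise ValueError(f"unknown dimension: {column}")
        clauses.append(f"{column} = ?")
        params.append(value)
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    sql = f"SELECT product, SUM(units) FROM sales{where} GROUP BY product ORDER BY product"
    return conn.execute(sql, params).fetchall()

print(units_by_product(conn, {}))                              # [('Laptop', 8), ('Phone', 11)]
print(units_by_product(conn, {"year": 2006}))                  # [('Laptop', 8), ('Phone', 4)]
print(units_by_product(conn, {"year": 2006, "city": "Pune"}))  # [('Laptop', 5), ('Phone', 4)]
```

The data never leaves the relational table; only the generated query changes as the user narrows the view.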
Due to the complexity and size of ROLAP implementations, the tools provide a
robust set of functions for metadata creation, administration and deployment. The focus
of these tools is to provide administrators with the ability to optimize system performance
and generate maximum analytical throughput and performance for users. All of the
ROLAP vendors provide the ability to deploy their solutions via the Web or within a
multitier client/server environment.
Functions of ROLAP
-Data remains in a relational format
-Some degree of aggregation
-Slower response times
-Scales to large amounts of data
Advantages:
• Can handle large amounts of data: The data size limitation of ROLAP
technology is the limitation on data size of the underlying relational database.
In other words, ROLAP itself places no limitation on data amount.
• Can leverage functionalities inherent in the relational database: Often, the
relational database already comes with a host of functionalities. ROLAP
technologies, since they sit on top of the relational database, can therefore
leverage these functionalities.
Disadvantages:
• Performance can be slow: Because each ROLAP report is essentially a SQL
query (or multiple SQL queries) in the relational database, the query time can
be long if the underlying data size is large.
• Limited by SQL functionalities: Because ROLAP technology mainly relies on
generating SQL statements to query the relational database, and SQL statements
do not fit all needs (for example, it is difficult to perform complex calculations
using SQL), ROLAP technologies are traditionally limited by what SQL can do.
ROLAP vendors have mitigated this risk by building out-of-the-box complex
functions into the tool, as well as the ability for users to define their own
functions.
3. HOLAP:
Functions of HOLAP
-Can manage data both as ROLAP and MOLAP
-Currently evolving
-MOLAP vendors are finding it easier to move into the HOLAP market space
The Basic Structure
[Figure: basic structure of a data warehouse. Source data is extracted into the data
staging area (storage: flat files (fastest), RDBMS, other; processing: clean, prune,
combine, remove duplication, standardize, conform dimensions, store awaiting replication,
export to data marts; no user query services). The staging area populates, replicates and
recovers the data marts (Data Mart #1, #2, #3) over the DW bus. Each data mart offers OLAP
(ROLAP, MOLAP, HOLAP) dimensional access, is subject oriented and user-group driven, has
its own refresh frequency, and conforms to the bus. Data feeds from the data marts supply
user access: ad hoc query tools, reporting tools and writers, customized applications, and
models (forecasting, scoring, allocating, data mining, scenario analysis, etc.).]
Operations:
Unfortunately, there is no consensus on the set of multidimensional operations or on
how to name them. However, the academic literature offers a comparison of algebraic
proposals, together with a set of operations subsuming all of them. A sequence of these
operations is known as an OLAP session. An OLAP session allows transforming a starting
query into a new query. Figure 3 draws the transitions generated by each one of these
operations (circles and triangles represent different attributes for Fact instances):
[Figure 3 panels: Selection; Roll-up/Drill-down; ChangeBase; Drill-across; Projection;
Set operations (Union)]
Selection or Dice: By means of a logic predicate over the dimension attributes, this
operation allows users to choose the subset of points of interest out of the whole
n-dimensional space.
-Slice and Dice: Look at a specific interest of the business
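Selection can be sketched as a filter over cube cells. The cube below, keyed by (product, city, month), and the `select` helper are hypothetical and only serve to show the predicate at work.

```python
# Hypothetical cube: (product, city, month) -> units sold.
cube = {
    ("Laptop", "Pune", "2006-01"): 5,
    ("Laptop", "Mumbai", "2006-01"): 3,
    ("Phone", "Pune", "2006-02"): 7,
    ("Phone", "Mumbai", "2006-02"): 4,
}

def select(cube, predicate):
    """Keep only the cells whose coordinates satisfy the predicate."""
    return {coords: value for coords, value in cube.items() if predicate(*coords)}

# Dice: restrict the city dimension to Pune; the space keeps its three dimensions.
diced = select(cube, lambda product, city, month: city == "Pune")
print(sorted(diced))  # [('Laptop', 'Pune', '2006-01'), ('Phone', 'Pune', '2006-02')]
```

Note that the result still has three coordinates per cell: Selection chooses points, it does not remove dimensions.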
Roll-up: Also called "Drill-up", it groups cells in a Cube based on an aggregation hierarchy.
This operation modifies the granularity of data by means of a many-to-one relationship
which relates instances of two aggregation levels in the same Dimension, corresponding to
a part-whole relationship (figure 3.b from left to right). For example, you could roll up
monthly sales into yearly sales, moving from the "Month" to the "Year" aggregation level
along the temporal dimension.
-Roll Up: Move from detail to summary
Drill-down: This is the counterpart of Roll-up. Thus, it removes the effect of that operation
by going down through an aggregation hierarchy and showing more detailed data.
-Drill Down: Move from summary to detail
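The Month-to-Year example can be sketched as follows. The data and the helper names (`month_to_year`, `roll_up`, `drill_down`) are assumptions for illustration; the many-to-one relationship is represented here by a plain function from month to year.

```python
from collections import defaultdict

# Hypothetical monthly sales along the temporal dimension.
monthly_sales = {"2005-11": 10, "2005-12": 20, "2006-01": 5, "2006-02": 7}

def month_to_year(month):
    """Many-to-one mapping from the Month level to the Year level."""
    return month[:4]

def roll_up(cells, parent_of):
    """Aggregate each member into its parent in the hierarchy."""
    totals = defaultdict(int)
    for member, value in cells.items():
        totals[parent_of(member)] += value
    return dict(totals)

def drill_down(cells, parent_of, parent):
    """Return the detailed cells that were aggregated into one parent."""
    return {m: v for m, v in cells.items() if parent_of(m) == parent}

yearly_sales = roll_up(monthly_sales, month_to_year)
print(yearly_sales)                                      # {'2005': 30, '2006': 12}
print(drill_down(monthly_sales, month_to_year, "2006"))  # {'2006-01': 5, '2006-02': 7}
```

Drill-down needs access to the detailed cells; it cannot be computed from the yearly totals alone.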
ChangeBase: This operation reallocates exactly the same instances of a Cube into a new
n-dimensional space with exactly the same number of points (figure 3.c). It allows
two different kinds of changes in the space: you can rearrange the multidimensional
space by reordering the Dimensions, interchanging rows and columns in the cross-tab (this
is also known as Pivoting), or you can add dimensions to or remove dimensions from the space.
-Pivot and Rotate: Look at data from varying perspectives
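Pivoting, the first kind of ChangeBase, can be sketched as swapping the two axes of a cross-tab. The nested-dictionary layout and the `pivot` helper are assumptions chosen for brevity.

```python
def pivot(crosstab):
    """Swap rows and columns of a cross-tab stored as {row: {column: value}}."""
    rotated = {}
    for row, columns in crosstab.items():
        for column, value in columns.items():
            rotated.setdefault(column, {})[row] = value
    return rotated

# Hypothetical cross-tab: cities on rows, products on columns.
by_city = {"Pune": {"Laptop": 5, "Phone": 7}, "Mumbai": {"Laptop": 3}}
by_product = pivot(by_city)
print(by_product)  # {'Laptop': {'Pune': 5, 'Mumbai': 3}, 'Phone': {'Pune': 7}}
```

The same three values appear before and after the pivot; only their placement in the space changes, and pivoting twice returns the original cross-tab.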
-Drill Through: Move to a near transaction level of detail
Drill-across: This operation changes the subject of analysis of the Cube, by showing
measures regarding a new Fact. The n-dimensional space remains exactly the same, only
the data placed in it change so that new measures can be analyzed (figure). For example, if
your Cube contains data about sales, you could use this operation to analyze data regarding
production using the same Dimensions.
Projection: It selects a subset of measures from those available in the Cube (figure).
Set operations: These operations allow users to operate on two Cubes defined over the same
n-dimensional space. Usually, Union (figure 3.f), Difference and Intersection are considered.
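Drill-across, Projection and Union can be sketched together on small cubes whose cells hold named measures. All data and helper names below are hypothetical, and the overlap rule used in `union` is a simplification chosen for this example only.

```python
# Hypothetical cubes over the same (product, city) space; each cell holds named measures.
sales = {
    ("Laptop", "Pune"): {"units": 5, "revenue": 500},
    ("Phone", "Pune"): {"units": 7, "revenue": 350},
}
production = {
    ("Laptop", "Pune"): {"produced": 9},
    ("Phone", "Pune"): {"produced": 8},
}
sales_delhi = {("Laptop", "Delhi"): {"units": 2, "revenue": 200}}

def project(cube, measures):
    """Projection: keep only the named measures in every cell."""
    return {coords: {m: cell[m] for m in measures} for coords, cell in cube.items()}

def drill_across(cube, other_fact):
    """Drill-across: same coordinates, measures read from another fact."""
    return {coords: other_fact[coords] for coords in cube if coords in other_fact}

def union(cube_a, cube_b):
    """Union: cells of both cubes; on overlap this sketch keeps cube_b's cell."""
    return {**cube_a, **cube_b}

print(project(sales, ["units"]))
print(drill_across(sales, production))
print(sorted(union(sales, sales_delhi)))
```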
This set of algebraic operations is minimal in the sense that none of the operations can be
expressed in terms of the others, nor can any operation be dropped without affecting the
functionality of the set (some tools consider that the set of measures of a Fact forms an
artificial analysis dimension as well; if so, Projection should be removed from the set of
operations for it to be considered minimal, since Projection would be done by Selection over
this artificial Dimension). Other operations can therefore be derived from sequences of
these. This is the case of Slice (which reduces the dimensionality of the original Cube by
fixing a point in a Dimension), which is obtained by means of the Selection and ChangeBase
operations. It is also common for OLAP implementations to use the term Slice & Dice to refer
to the selection of fact instances, and some also introduce Drill-through to refer to
directly accessing the data sources in order to lower the aggregation level below that in
the OLAP repository or data mart.
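The derivation of Slice from Selection and ChangeBase can be sketched as two small steps. The cube, the `DIMENSIONS` tuple and the helper names are assumptions made for this illustration.

```python
DIMENSIONS = ("product", "city", "year")
cube = {  # hypothetical cells: (product, city, year) -> units sold
    ("Laptop", "Pune", 2005): 4,
    ("Laptop", "Pune", 2006): 5,
    ("Phone", "Pune", 2006): 7,
}

def select(cube, predicate):
    """Selection: keep the cells whose coordinates satisfy the predicate."""
    return {coords: value for coords, value in cube.items() if predicate(coords)}

def remove_dimension(cube, index):
    """ChangeBase: drop a dimension; valid only when it has a single member left."""
    members = {coords[index] for coords in cube}
    if len(members) > 1:
        raise ValueError("dimension still has several members; select one first")
    return {coords[:index] + coords[index + 1:]: value for coords, value in cube.items()}

def slice_cube(cube, dimension, member):
    """Slice = Selection of one member followed by removal of that dimension."""
    index = DIMENSIONS.index(dimension)
    fixed = select(cube, lambda coords: coords[index] == member)
    return remove_dimension(fixed, index)

sales_2006 = slice_cube(cube, "year", 2006)
print(sales_2006)  # {('Laptop', 'Pune'): 5, ('Phone', 'Pune'): 7}
```

The result has two coordinates per cell instead of three, which is the reduction in dimensionality that distinguishes Slice from plain Selection.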
Data integration
1. Data resides in many distributed, heterogeneous OLTP (On-Line Transaction
Processing) sources (sales, inventory, customer, ...; NC branch, NY branch, CA branch).
2. There is a need to support OLAP (On-Line Analytical Processing) over an integrated
view of the data.
3. Possible approaches to integration:
I) Eager: integrate in advance and store the integrated data at a central
repository called the data warehouse
II) Lazy: integrate on demand; process queries over distributed sources
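The difference between the two approaches can be sketched with in-memory lists standing in for the branch databases. The source contents and the function names are hypothetical; a real warehouse would also clean and conform the data while loading it.

```python
# Hypothetical OLTP sources, one list of (product, units) rows per branch.
sources = {
    "NC": [("Laptop", 5)],
    "NY": [("Laptop", 3), ("Phone", 7)],
    "CA": [("Phone", 4)],
}

def total_units(rows, product):
    return sum(units for name, units in rows if name == product)

# Eager: integrate in advance into a central copy (the data warehouse).
warehouse = [row for rows in sources.values() for row in rows]

def eager_query(product):
    """Answer from the warehouse; fast, but only as fresh as the last load."""
    return total_units(warehouse, product)

def lazy_query(product):
    """Answer by visiting every source at query time; always current, but slower."""
    return sum(total_units(rows, product) for rows in sources.values())

sources["CA"].append(("Phone", 1))  # a new transaction arrives after the load
print(eager_query("Phone"))  # 11, the warehouse has not been refreshed
print(lazy_query("Phone"))   # 12
```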
OLTP versus OLAP:
                       OLTP                          OLAP
Usage                  Application specific          Decision support
Workload               Predefined                    Unforeseeable
Access                 Read/write                    Read-only
Query structure        Simple                        Complex
Records per operation  Tens/hundreds                 Thousands/millions
Number of users        Thousands/millions            Tens/hundreds
Function               Mostly updates                Mostly reads
Transactions/queries   Short, simple transactions    Long, complex queries
User types             Clerical users                Analysts, decision makers
Goal                   ACID, transaction throughput  Fast queries
The main differences between OLTP and OLAP are:
1. User and System Orientation
OLTP: customer-oriented, used for transaction and query processing by clerks, clients and
IT professionals.
OLAP: market-oriented, used for data analysis by knowledge workers (managers,
executives, analysts).
2. Data Contents
OLTP: manages current data, very detail-oriented.
OLAP: manages large amounts of historical data, provides facilities for summarization
and aggregation, and stores information at different levels of granularity to support the
decision-making process.
3. Database Design
OLTP: adopts an entity-relationship (ER) model and an application-oriented database
design.
OLAP: adopts a star, snowflake or fact constellation model and a subject-oriented database
design.
4. View
OLTP: focuses on the current data within an enterprise or department.
OLAP: spans multiple versions of a database schema due to the evolutionary process of
an organization; integrates information from many organizational locations and data stores.
APPLICATIONS OF OLAP
KEY APPLICATIONS
Managers are usually not trained to query databases by means of SQL. Moreover, if
the query is relatively complex (several joins and subqueries, grouping, and functions) and
the database schema is not small (with maybe hundreds of tables), using interactive SQL
could be a nightmare even for SQL experts. Thus, OLAP is used to ease the task of these
managers in extracting knowledge from the data warehouse by means of drag and drop,
instead of typing SQL queries by hand.
The OLAP market was estimated at around US$6 billion in 2006, mainly devoted to
decision making. However, this paradigm can also be used in any other field with
non-expert users, where schemas and queries are relatively complex. For example, its usage
is under investigation in bioinformatics [8] and the semantic web [9].
Declarative languages
There are some research proposals for declarative query languages for OLAP. [1]
proposes a graphical query language, while [3] proposes a calculus. From the industry point
of view, MDX (standing for Multidimensional Expressions [5]) is the de facto standard. It
was introduced in 1997 and, in spite of the specification being owned by Microsoft, it has
been widely adopted. Its syntax resembles that of SQL.
[ WITH <MeasureDefinition>+ ]
SELECT <DimensionSpecification>+
FROM <CubeName>
[WHERE <SlicerClause> ]
However, its semantics are completely different. Roughly speaking, an MDX query
gets the instances of a given Cube stated in the FROM clause and places them in the space
defined by the SELECT clause. Moreover, complex calculations can be defined in the WITH
clause, and the dimensions not used in the SELECT clause can be sliced in the WHERE clause
(if not explicitly sliced, it is assumed that dimensions that do not appear in the SELECT are
sliced at the highest aggregation level: All).
WITH MEMBER [Measures].[pending] AS '[Measures].[Units Ordered]-[Measures].[Units Shipped]'
SELECT
{[Time].[2006].children} ON COLUMNS,
{[Warehouse].[Warehouse Name].members} ON ROWS
FROM Inventory
WHERE ([Measures].[pending],[Trademark].[Acme]);
In the previous MDX query, an ad-hoc measure "pending" is first defined as the
difference between units ordered and units shipped. Then, the children of the instance
representing year 2006 (i.e. the twelve months of that year) are placed on columns, and the
different members of the aggregation level "Warehouse Name" on rows. Finally, this matrix
is filled with the data in the "Inventory" cube, showing the previously defined measure
"pending" and slicing on the "Acme" trademark.
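What the query computes can be approximated in Python over a handful of rows. This is a rough sketch, not an MDX engine: the `inventory` rows, the month encoding and the helper name are all assumptions, and unlike MDX the sketch only lists warehouses and months that have matching data.

```python
from collections import defaultdict

# Hypothetical rows of the "Inventory" cube:
# (month, warehouse_name, trademark, units_ordered, units_shipped)
inventory = [
    ("2006-01", "North", "Acme", 10, 6),
    ("2006-01", "South", "Acme", 4, 4),
    ("2006-02", "North", "Acme", 8, 3),
    ("2006-02", "North", "Other", 9, 1),
    ("2005-12", "North", "Acme", 7, 2),
]

def pending_by_warehouse_and_month(rows, year, trademark):
    """Rows: warehouse names; columns: months of the year; cells: pending units."""
    table = defaultdict(lambda: defaultdict(int))
    for month, warehouse, brand, ordered, shipped in rows:
        # WHERE slicer on the trademark, plus the children of the given year.
        if brand == trademark and month.startswith(year):
            # WITH MEMBER pending = Units Ordered - Units Shipped
            table[warehouse][month] += ordered - shipped
    return {warehouse: dict(months) for warehouse, months in table.items()}

result = pending_by_warehouse_and_month(inventory, "2006", "Acme")
print(result)  # {'North': {'2006-01': 4, '2006-02': 5}, 'South': {'2006-01': 0}}
```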
FUTURE DIRECTIONS
OLAP is used to extract knowledge from the data warehouse. Data mining tools are another
kind of tool used for this purpose (see Data Mining definitional entry). Until now,
the two research communities have evolved separately. OLAP must be interactive,
while data mining presents computational complexity problems. However, it seems promising
to integrate both kinds of tools so that each can benefit from the other. In fact, this was
already suggested in [4], and some tools like Microsoft Analysis Services already integrate
them in some way. Nevertheless, there is still much work to do in this field.
On the other hand, security is usually a weak point in data warehousing projects. [7]
contains a survey of OLAP security problems. In the past, OLAP tools had just a
few users, all of whom had high responsibilities in the company, so confidentiality was
not really a concern. Nowadays, with the increase in potential
users of OLAP systems inside as well as outside the company, security has become a
priority in these projects (see Security in DWs definitional entry). Moreover, personal data
(like those of customers) are analyzed in almost all companies. Thus, inference
control mechanisms need to be studied in data mining as well as OLAP tools.
Other research directions in OLAP can be the improvement of user interaction and
flexibility in the calculation of statistics (see Visual OLAP definitional entry), and the
integration of what-if analysis (see What-if Analysis definitional entry).
URL TO CODE
Some OLAP vendors:
•Microsoft Analysis Services:
http://www.microsoft.com/sql/technologies/analysis/default.mspx
•Hyperion Solutions:
http://www.hyperion.com
•Cognos PowerPlay:
http://www.cognos.com/products/business_intelligence/analysis/index.html
•Business Objects:
http://www.businessobjects.com/products/queryanalysis/olapaccess/businessobjects.asp
•MicroStrategy:
http://www.microstrategy.com/Solutions/5Styles/olap_analysis.asp
Some open source OLAP tools:
•Mondrian:
http://mondrian.pentaho.org
•Palo:
http://www.palo.net
Homework Help
https://www.homeworkping.com/
Math homework help
https://www.homeworkping.com/
Research Paper help
https://www.homeworkping.com/
Algebra Help
https://www.homeworkping.com/
Calculus Help
https://www.homeworkping.com/
Accounting help
https://www.homeworkping.com/
Paper Help
https://www.homeworkping.com/
Writing Help
https://www.homeworkping.com/
Online Tutor
https://www.homeworkping.com/
Online Tutoring
https://www.homeworkping.com/
Homework Help
https://www.homeworkping.com/
Math homework help
https://www.homeworkping.com/
Research Paper help
https://www.homeworkping.com/
Algebra Help
https://www.homeworkping.com/
Calculus Help
https://www.homeworkping.com/
Accounting help
https://www.homeworkping.com/
Paper Help
https://www.homeworkping.com/
Writing Help
https://www.homeworkping.com/
Online Tutor
https://www.homeworkping.com/
Online Tutoring
https://www.homeworkping.com/

86921864 olap-case-study-vj

  • 1.
    Homework Help https://www.homeworkping.com/ Research Paperhelp https://www.homeworkping.com/ Online Tutoring https://www.homeworkping.com/ Case Study on OLAP Name: VIJAY S. GACHANDE Class: TEIT Roll No.: 06
  • 2.
    Subject: DBT Batch No.:B1 OLAP (On-Line Analytical Processing): Why OLAP? Need of OLAP: 1. Multidimensional Data making easy to
  • 3.
    -Select data -Navigate data -Integratedata -Explore data 2. Analytical Query Language providing -Filter the data relationship -Aggregate the data relationship -merge the data relationship -explore complex data relationship OLAP CUBE An OLAP cube as shown in fig(1.1.1) is a data structure that allows fast analysis of data. The arrangement of data into cubes overcomes a limitation of relational databases. Relational databases are not well suited for near instantaneous analysis and display of large amounts of data. Instead, they are better suited for creating records from a Series of transactions known as OLTP or On-Line Transaction Processing. Although many report- writing tools exist for relational databases, these are slow when the whole database must be summarized. Fig.. Olap Cube The OLAP cube consists of numeric facts called measures which are categorized by dimensions. The cube metadata is typically created from a star schema or snowflake schema of tables in a relational database. Measures are derived from the records in the fact table and dimensions are derived from the dimension tables. CLASSIFICATIONS OF OLAP In the OLAP world, there are mainly two different types: Multidimensional OLAP (MOLAP) ,Relational OLAP (ROLAP) and HOLAP(HYBRID). 1. MOLAP: This is the more traditional way of OLAP analysis. In MOLAP, data is stored in a multidimensional cube. The storage is not in the relational database, but in proprietary formats. Multidimensional OLAP is one of the oldest segments of the OLAP market. The business problem MOLAP addresses is the need to compare, track, analyze and forecast
  • 4.
    high level budgetsbased on allocation scenarios derived from actual numbers. The first forays into data warehousing were led by the MOLAP vendors who created special purpose databases that provided a cube-like structure for performing data analysis. MOLAP tools restructure the source data so that it can be accessed, summarized, filtered and retrieved almost instantaneously. As a general rule, MOLAP tools provide a robust solution to data warehousing problems. Administration, distribution, meta data Creation and deployment are all controlled from a central point. Deployment and Distribution can be achieved over the Web and with client/server models. Functions of MOLAP -Produces a hypercube -Pre-aggregated and pre-calculated -Rapid response times -Limited in the amount of data that can be managed When MOLAP Tools Bring Value? • Need to process information with consistent response time regardless of level of summarization or calculations selected. • Need to avoid many of the complexities of creating a relational database to store data for analysis. • Need fastest possible performance. Who Should Have Access to MOLAP Tools? • Users who are connected to a network and need to analyze larger, less defined data. • Users who want to access predefined reports, but need to have the ability to perform additional analysis on information that may not be contained in the report. Advantages: • Excellent performance: MOLAP cubes are built for fast data retrieval, and is optimal for slicing and dicing operations. • Can perform complex calculations: All calculations have been pre-generated when the cube is created. Hence, complex calculations are not only doable, but they return quickly.
  • 5.
    Disadvantages: • Limited inthe amount of data it can handle: Because all calculations are performed when the cube is built, it is not possible to include a large amount of data in the cube itself. This is not to say that the data in the cube cannot be derived from a large amount of data. Indeed, this is possible. But in this case, only summary-level information will be included in the cube itself. • Requires additional investment: Cube technology are often proprietary and do not already exist in the organization. Therefore, to adopt MOLAP technology, chances are additional investments in human and capital resources are needed. 2. ROLAP This methodology relies on manipulating the data stored in the relational database to give the appearance of traditional OLAP's slicing and dicing functionality. In essence, each action of slicing and dicing is equivalent to adding a "WHERE" clause in the SQL statement.. data warehouse sizes, users have come to realize that they cannot store all of the information that they need in MOLAP databases. The business problem that ROLAP addresses is the need to analyze massive volumes of data without having to be a systems or software expert. Relational OLAP databases seek to resolve this dilemma by providing a multidimensional front end that creates queries to process information in a relational format. These tools provide the ability to transform two-dimensional relational data into multidimensional information. Due to the complexity and size of ROLAP implementations, the tools provide a robust set of functions for meta data creation, administration and deployment. The focus of these tools is to provide administrators with the ability to optimize system performance and generate maximum analytical throughput and performance for users. All of the ROLAP vendors provide the ability to deploy their solutions via the Web or within a multitier client/server environment. 
Functions of ROLAP -Data remains in a relational format -Some degree of aggregation -Slower response times -Scales to large amounts of data
  • 6.
    Advantages: • Can handlelarge amounts of data: The data size limitation of ROLAP technology is the limitation on data size of the underlying relational database. In other words, ROLAP itself places no limitation on data amount. • Can leverage functionalities inherent in the relational database: Often, relational database already comes with a host of functionalities. ROLAP technologies, since they siton top of the relational database, can therefore leverage these functionalities. Disadvantages: • Performance can be slow: Because each ROLAP report is essentially a SQL query (or multiple SQL queries) in the relational database, the query time can be long if the underlying data size is large. • Because ROLAP technology mainly relies on generating SQL statements to Query the relational database, and SQL statements do not fit all needs (for example, it is difficult to perform complex calculations using SQL), ROLAP technologies are therefore traditionally limited by what SQL can do. ROLAP vendors have mitigated this risk by building into the tool out-of-the-box complex functions as well as the ability to allow users to define their own functions 3. HOLAP:
  • 7.
    Functions of HOLAP -Canmanage data both as ROLAP and MOLAP -Currently evolving -MOLAP vendors are finding it easier to move into the HOLAP market space The Basic Structure Extract Source Data Extract Storage: flat files (fastest); RDBMS; other Processing: clean; prune; combine; remove duplication standardize conform dimensions store awaiting replication export to data marts No user query services Data Staging Area Data Mart #1 OLAP (ROLAP, MOLAP,HOLAP) dimensional access subject oriented user group driven refresh frequency conforms to the Bus Data Mart #2 Data Mart #3 Populate, replicate, recover DW Bus DW Bus Corporate View The Basic Structure Data Mart #1 OLAP (ROLAP, MOLAP,HOLAP) dimensional access subject oriented user group driven refresh frequency conforms to the Bus Data Mart #2 Data Mart #3 DW Bus DW Bus Corporate Staging Area User Access Ad Hoc Query Tools Reporting Tools and Writers Customized Applications Models: forecasting; scoring; allocating; data mining; scenario analysis; etc. Data Feed Data Feed Data Feed
  • 8.
    Operations: Unfortunately, there isno consensus on the set of multidimensional operations and how to name them. However, in you find a comparison of algebraic proposals in the academic literature, besides a set of operations subsuming all of them. A sequence of these operations is known as an OLAP session. An OLAP session allows transforming a starting query into a new query. Figure 3 draws the transitions generated by each one of these operations (circles and triangles represent different attributes for Fact instances): Selection Roll-up/Drill-down ChangeBase Drill-across Projection Set operations (Union) Selection or Dice: By means of a logic predicate over the dimension attributes, this operation allows users to choose the subset of points of interest out of the whole n- dimensional space -Slice and Dice:Look at a specific interest of the business
  • 9.
    Roll-up: Also called”Drill-up”, it groups cells in a Cube based on an aggregation hierarchy. This operation modifies the granularity of data by means of a many-to-one relationship which relates instances of two aggregation levels in the same Dimension, corresponding to a part-whole relationship (figure 3.b from left to right). For example, you could roll-up monthly sales into yearly sales moving from “Month” to “Year” aggregation level along the temporal dimension. -Roll Up:Move from detail to summary Drill-down: This is the counterpart of Roll-up. Thus, it removes the effect of that operation by going down through an aggregation hierarchy, and showing more detailed data -Drill Down: Move from summary to detail ChangeBase: This operation reallocates exactly the same instances of a Cube into a new n- dimensional space with exactly the same number of points (figure 3.c). Actually, it allows two different kinds of changes in the space: you can just rearrange the multidimensional space by reordering the Dimensions interchanging rows and columns in the Cross-tab (this is also known as Pivoting), or it could add/remove dimensions to/from the space. -Pivot and Rotate Looking at data from varying perspectives -Drill Through Move to a near transaction level of detail Drill-across: This operation changes the subject of analysis of the Cube, by showing measures regarding a new Fact. The n-dimensional space remains exactly the same, only the data placed in it change so that new measures can be analyzed (figure). For example, if your Cube contains data about sales, you could use this operation to analyze data regarding production using the same Dimensions. Projection: It selects a subset of measures from those available in the Cube (figure). Set operations: These operations allow users to operate two Cubes defined over the same n-dimensional space. Usually, Union (figure f), Difference and Intersection are considered. 
This set of algebraic operations is minimal in the sense that none of the operations can be expressed in terms of others, nor can any operation be dropped without affecting its functionality (some tools consider that the set of measures of a Fact conform an artificial analysis dimension, as well; if so, Projection should be removed from the set of operations in order to be considered minimal, since it would be done by Selection over this artificial Dimension). Thus, other operations can be derived by sequences of these. It is the case of Slice (which reduces the dimensionality of the original Cube by fixing a point in a
  • 10.
    Dimension) by meansof Selection and ChangeBase operations. It is also common that OLAP implementations use the term Slice & Dice to refer to the selection of fact instances, and some also introduce Drill-through to refer to directly accessing the data sources in order to Lower the aggregation level below that in the OLAP repository or data mart. Data integration 1 Data resides in many distributed, heterogeneous OLTP (On-Line Transaction Processing) sources Sales, inventory, customer, . NC branch, NY branch, CA branch 2 Need to support OLAP (On-Line Analytical Processing) over an integrated view of the data 3 Possible approaches to integration I) Eager: integrate in advance and store the integrated data at a central repository called the data warehouse II) Lazy: integrate on demand; process queries over distributed sources OLTP versus OLAP: OLTP OLAP Usage Application Specific Decision support Workload Predefined Unforeseeable Access Read/Write Read-only Query structure Simple Complex Records per operation Tens/ Hundreds Thousands/Millions Number of users Thousands/Millions Tens/Hundreds Function Mostly updates Mostly reads Transactions/queries Short, simple transactions Long, complex queries Users types Clerical users Analysts, decision makers Goal ACID, transaction throughput fast queries Main Differences between OLTP and OLAP are:- 1. User and System Orientation OLTP: customer-oriented, used for data analysis and querying by clerks, clients and IT professionals. OLAP: market-oriented, used for data analysis by knowledge workers( managers, executives, analysis).
2. Data Contents
OLTP: manages current data, very detail-oriented.
OLAP: manages large amounts of historical data, provides facilities for summarization and aggregation, and stores information at different levels of granularity to support the decision-making process.
3. Database Design
OLTP: adopts an entity-relationship (ER) model and an application-oriented database design.
OLAP: adopts a star, snowflake or fact constellation model and a subject-oriented database design.
4. View
OLTP: focuses on the current data within an enterprise or department.
OLAP: spans multiple versions of a database schema due to the evolutionary process of an organization; integrates information and data from many organizational locations.
APPLICATIONS OF OLAP
KEY APPLICATIONS
Managers are usually not trained to query databases by means of SQL. Moreover, if the query is relatively complex (several joins and subqueries, grouping, and functions) and the database schema is not small (with maybe hundreds of tables), using interactive SQL could be a nightmare even for SQL experts. Thus, OLAP is used to ease the tasks of these
managers in extracting knowledge from the data warehouse by means of Drag&Drop, instead of typing SQL queries by hand. The OLAP market was estimated at around 6 billion US$ in 2006, which is mainly devoted to decision making. However, this paradigm can also be used in any other field with non-expert users, where schemas and queries are relatively complex. For example, its usage is under investigation in bioinformatics [8] and the semantic web [9].
Declarative languages
There are some research proposals of declarative query languages for OLAP. [1] proposes a graphical query language, while [3] proposes a calculus. From the industry point of view, MDX (standing for Multidimensional Expressions [5]) is the de facto standard. It was introduced in 1997 and, in spite of the specification being owned by Microsoft, it has been widely adopted. Its syntax resembles that of SQL:
    [ WITH <MeasureDefinition>+ ]
    SELECT <DimensionSpecification>+
    FROM <CubeName>
    [ WHERE <SlicerClause> ]
However, its semantics are completely different. Roughly speaking, an MDX query gets the instances of a given Cube stated in the FROM clause and places them in the space defined by the SELECT clause. Moreover, complex calculations can be defined in the WITH clause, and the dimensions not used in the SELECT clause can be sliced in the WHERE clause (if not explicitly sliced, it is assumed that dimensions that do not appear in the SELECT are sliced at the highest aggregation level: All).
    WITH MEMBER [Measures].[pending] AS
        '[Measures].[Units Ordered]-[Measures].[Units Shipped]'
    SELECT {[Time].[2006].children} ON COLUMNS,
           {[Warehouse].[Warehouse Name].members} ON ROWS
    FROM Inventory
    WHERE ([Measures].[pending],[Trademark].[Acme]);
In the previous MDX query, an ad-hoc measure "pending" is first defined as the difference between units ordered and units shipped. Then, the children of the instance representing year 2006 (i.e. the twelve months of that year) are placed on columns, and the different members of the aggregation level "Warehouse Name" on rows. Finally, this matrix is filled with the data in the "Inventory" cube, showing the previously defined measure "pending" and slicing on the "Acme" trademark.
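The evaluation described above can be imitated over a plain list of fact rows. The following is only a minimal Python sketch, not how an MDX engine works internally: the fact rows, the warehouse names and the function name pending_by_warehouse_and_month are hypothetical, and the month is assumed to be stored as a "YYYY-MM" string. It shows the three steps of the query: slicing on the trademark (WHERE), restricting to the months of one year (SELECT ... ON COLUMNS), and computing the derived measure "pending" (WITH MEMBER).

```python
from collections import defaultdict

# Hypothetical fact rows of an "Inventory" cube (assumed data, for illustration):
# (month, warehouse, trademark, units_ordered, units_shipped)
FACTS = [
    ("2006-01", "North", "Acme", 120, 100),
    ("2006-01", "South", "Acme", 80, 80),
    ("2006-02", "North", "Acme", 50, 20),
    ("2006-02", "North", "Other", 500, 0),  # removed by the trademark slice
    ("2005-12", "South", "Acme", 70, 10),   # not a child of year 2006
]


def pending_by_warehouse_and_month(facts, year, trademark):
    """Return {warehouse: {month: pending}} for one year and one trademark."""
    grid = defaultdict(lambda: defaultdict(int))
    for month, warehouse, mark, ordered, shipped in facts:
        if mark != trademark:
            continue  # WHERE clause: slice on the trademark dimension
        if not month.startswith(f"{year}-"):
            continue  # SELECT clause: keep only the months of the given year
        # WITH MEMBER: pending = units ordered - units shipped.
        # Summing the per-row differences gives the same total as
        # subtracting the two aggregated measures.
        grid[warehouse][month] += ordered - shipped
    return {w: dict(months) for w, months in grid.items()}


result = pending_by_warehouse_and_month(FACTS, 2006, "Acme")
for warehouse in sorted(result):  # warehouses on rows, months on columns
    print(warehouse, result[warehouse])
```

With the sample rows above, the sketch reports 20 and 30 pending units for warehouse "North" in January and February 2006, and 0 for "South" in January; the "Other" trademark and the 2005 row do not appear in the result.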
FUTURE DIRECTIONS
OLAP is used to extract knowledge from the data warehouse. Data mining tools are another kind of tool used for this purpose (see Data Mining definitional entry). Until now, the two research communities have evolved separately. The former must be interactive, while the latter presents computational complexity problems. However, it seems promising to integrate both kinds of tools so that each can benefit from the other. In fact, this was already suggested in [4], and some tools like Microsoft Analysis Services already integrate them in some way. Nevertheless, there is still much work to do in this field.
On the other hand, security is usually a weak point in data warehousing projects. [7] contains a survey of OLAP security problems. In the past, OLAP tools used to have just a few users, all of them with high responsibilities in the company, so confidentiality was not really a concern. Nowadays, with the increase in potential users of OLAP systems inside as well as outside the company, security has become a priority in these projects (see Security in DWs definitional entry). Moreover, personal data (like those of customers) are analyzed in almost all companies. Thus, inference control mechanisms need to be studied in data mining as well as in OLAP tools.
Other research directions in OLAP include the improvement of user interaction and flexibility in the calculation of statistics (see Visual OLAP definitional entry), and the integration of what-if analysis (see What-if Analysis definitional entry).
URL TO CODE
Some OLAP vendors:
•Microsoft Analysis Services: http://www.microsoft.com/sql/technologies/analysis/default.mspx
•Hyperion Solutions:
http://www.hyperion.com
•Cognos PowerPlay: http://www.cognos.com/products/business_intelligence/analysis/index.html
•Business Objects: http://www.businessobjects.com/products/queryanalysis/olapaccess/businessobjects.asp
•MicroStrategy: http://www.microstrategy.com/Solutions/5Styles/olap_analysis.asp
Some open source OLAP tools:
•Mondrian: http://mondrian.pentaho.org
•Palo: http://www.palo.net