OLaaS: OLAP as a Service
Arun D. Patil and N. D. Gangadhar
Computer Science and Engineering, Faculty of Engineering and Technology
M. S. Ramaiah University of Applied Sciences, Bengaluru
Email: arundpatil007@gmail.com, gangadhar.cs.et@msruas.ac.in
Abstract—Online Analytical Processing (OLAP) is used to perform
multidimensional operations that enable queries and visualisation for
Business Intelligence (BI). Most OLAP systems come with a tightly
integrated user interface for querying and visualising data, without
exposing the core OLAP operations as an API. If the OLAP operations are
available as an API, advanced BI applications can be developed and
composed to create complex workflows. In addition, a Web Service based
API would enable applications to use a Service Oriented Architecture for
Big Data Analytics and to be deployed easily on a Cloud. This paper
documents the design and prototyping of an OLAP based Platform as a
Service, termed OLAP as a Service (OLaaS). OLaaS exposes the core OLAP
operations of OLAP Cube design, Slicing, Dicing, Rollup, Drilldown and
data retrieval as RESTful Web Services for application programming and
composition. A syntax and parsing logic for the Web Service call
parameters and an engine for constructing Multidimensional Expressions
(MDX) queries for the OLAP operations are developed. The Web Services
are designed to use an existing OLAP engine for running the generated
MDX queries. For flexibility and for processing large data sizes, data
movement can be programmed by setting the source and destination
databases as parameters of the OLAP operation services. The designed
OLaaS services are implemented in Java and integrated with the
open-source Mondrian OLAP engine through Olap4j. Standard OLAP Cube
data is used to test and validate OLaaS and its OLAP services, and the
prototyped services passed the testing and validation. The performance
of the services is evaluated on the test data, and it is found that the
overhead of REST request parsing and MDX query generation is comparable
to the standalone MDX query processing time.
Index Terms—OLAP, Web Service, REST, MDX, Cloud, Workflow, Big Data, BI
I. INTRODUCTION
In order to stay competitive in a fast changing environment,
organizations must react quickly to new changes and opportunities.
Decision makers must be able to make the right decision quickly, based
on the right information. Such decision-making is usually supported by
computer based Decision Support Systems (DSS), also known as Business
Intelligence (BI) applications. BI aims to transform a company’s raw
data into meaningful information so that decisions can be made quickly.
Different information systems are used to collect large amounts of
historical and multidimensional data from the operations of the
enterprise. The collected data forms the Data Warehouse of the
enterprise and is analyzed by the BI applications. The analysis from
the BI applications is used to understand business behavior and make
strategic decisions.
The key component of BI systems for analyzing Data Warehouse data is
On-Line Analytical Processing (OLAP). It enables the analysis of
multidimensional business data, providing insights that support
business decision-making. Data configured for OLAP usually uses a
multidimensional data model. Data attributes such as product, sales
region and time are treated as separate dimensions, allowing users to
perform complex analyses and generate reports. In contrast to
traditional databases, which store and process business transactions
using On-Line Transaction Processing (OLTP), Data Warehouses hold
historical and summary information, which is analyzed using OLAP. OLAP
provides the ability to model business problems, enabling organizations
to gain better insight into their data.
A. Motivation
Two major shifts in Enterprise Computing are the move of computation to
the Cloud and the use of Web Services to develop Service Oriented
Architecture (SOA) based enterprise applications. The Cloud Software as
a Service (SaaS) computational model allows users to hire software from
a Cloud based service provider and access it over the Internet. This
allows for on-demand scalability of the software usage. In addition,
the capital and maintenance burden is shifted away from the user. Thus,
with SaaS there is no extra hardware cost and no initial setup cost,
and the user pays only for what is used. On the other hand, with the
Platform as a Service (PaaS) Cloud computational model, applications
can be programmed using an API exposed as a set of Web Services. Web
Services provide a uniform API for application/service interaction. In
addition, services can be composed to develop complex workflows.
There are several standalone BI engines such as MS SQL Server, Pentaho
BI and Oracle DB. All of these have OLAP built in and are tightly
integrated with a Data Warehouse back end. However, this tight coupling
does not allow the BI and Data Warehouse components to evolve
independently. An independent middle tier between BI and the Data
Warehouse removes this coupling. If this middle tier is developed using
Web Services, the complex workflows and compositions needed for BI can
also be built easily.
With this motivation, this paper proposes and prototypes a framework
that exposes OLAP operations as Web Services, enabling complex BI
applications to be developed on the computational Cloud. Such an API
would replace OLAP services offered under the Software as a Service
(SaaS) Cloud computational model with services offered under a Platform
as a Service (PaaS) model.
B. Related Work
With the rise of the Cloud as a dominant model for computation, many
applications, including BI applications, are being migrated to Cloud
infrastructures. With the advantages of scalability and no maintenance
costs for hardware infrastructure, the Cloud is an alternative platform
for BI systems [1], [2]. Developing strategies and methods for moving BI
applications to Cloud architectures and hosting them there has been the
focus of research [1]–[3]. Strîmbei [4] discusses the development of a
web based OLAP application on the Cloud, without an API for the BI
application developer to use. Similarly, [1], while bringing out the
advantages of moving BI to Cloud platforms, does not concentrate on an
API for BI application development. Cao et al. [2] are concerned with a
data storage system supporting both OLAP and OLTP but do not consider
exposing the OLAP operations as an API. A significant amount of research
and development effort is currently spent, especially by industry teams,
on creating BI applications using newer models of data storage,
processing and retrieval such as Map-Reduce; e.g., [5]. The Apache Kylin
Project [6] is an effort in the direction of creating a Web Services
based API for OLAP. However, the system is still under active
development and none of the core OLAP Cube operations are available yet.
As discussed above, while most OLAP systems have been available on the
Cloud under SaaS for end-user use, to the best of our knowledge no
significant effort has been made to create services for core OLAP
operations under the PaaS model.
C. Organisation of the Paper
The remainder of this paper is organized as follows. Section II
discusses the background. Section III presents the design of OLAP as a
Service (OLaaS), and Section IV describes its prototype implementation.
Section V presents the results and their discussion. Section VI
concludes the paper.
II. BACKGROUND
One of the powerful technologies to analyze multidimen-
sional data is OLAP. With advances in technology and algo-
rithms, it has become easier to manage, maintain and store data
for use in BI applications. For decision-making, BI converts
raw data into meaningful data.
A. Data Warehouse
A Data Warehouse (DW) is defined as “a subject oriented, integrated,
time variant, non-volatile collection of data in support of
decision-making process” [7]. A Data Warehouse should help in managing
data effectively and deliver information to decision makers effectively
and efficiently. It is created by integrating varied information
sources in the organisation.
B. BI and Data Warehouse
BI covers performance management, reporting, planning, querying, online
analytical processing, predictive analysis and related areas. BI
applications help to maintain, manage, store and clarify data from Data
Warehouses so that organizations can use this data to make business
decisions. The components of BI include an ETL engine, a Data
Warehouse, OLAP and reporting.
C. OLAP
OLAP is one of the key components of a Business Intelligence system.
For business users, its key property is that it is “multidimensional”.
It gives non-IT experts access to information, so that they can create
intelligent queries without the mediation of IT experts. OLAP
complements On-Line Transaction Processing (OLTP), which uses
operational databases of transactions. OLTP relies largely on
Relational Database Management Systems (RDBMS), whereas OLAP operates
on multidimensional OLAP Cubes.
Multidimensionality is the main characteristic of OLAP. The essentially
two-dimensional limitation of relational databases can be overcome by
the OLAP cube data structure. The structure of the data cube is closer
to the decision maker’s way of thinking and makes user interaction
easier. The main aspects of OLAP cubes are modelling of the data, cube
formation, and dimension based analysis and visualisation.
D. OLAP Operations
Below are some of the well-known OLAP operations.
1) Slicing: In a given cube, slice performs a selection on one
dimension, which results in a sub-cube. Slice picks a rectangular subset
of a cube by choosing a single value for one of its dimensions, creating
another cube with one less dimension.
2) Dicing: Dice produces a new sub-cube by selecting on two or more
dimensions of a given cube. The dice operation delivers a sub-cube by
permitting the user to pick specific values of multiple dimensions.
3) Roll up: Roll up performs aggregation on the cube, either by moving
up a hierarchy in a dimension or by reducing a dimension. When roll up
is done by dimension reduction, one or more dimensions are removed from
the cube.
4) Drill down: Drill down is the reverse of roll up and is also called
roll down. It navigates from less detailed (summarized) data to more
detailed data. Drill down permits the client to explore among levels of
information going from the most condensed (up) to the most detailed
(down).
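The four operations above can be illustrated on a toy in-memory cube. The following is only a minimal sketch: the class name, the three dimensions (product, region, quarter), the sales measure and the sample data are all assumptions made for illustration, and a real OLAP engine works on cube structures rather than on a flat list of facts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy in-memory cube; all names and data are illustrative assumptions. */
final class CubeOpsSketch {

    /** One fact: three dimension values and one measure (sales). */
    static final class Fact {
        final String product, region, quarter;
        final int sales;

        Fact(String product, String region, String quarter, int sales) {
            this.product = product;
            this.region = region;
            this.quarter = quarter;
            this.sales = sales;
        }

        /** Returns the value of the named dimension. */
        String get(String dimension) {
            switch (dimension) {
                case "product": return product;
                case "region":  return region;
                case "quarter": return quarter;
                default: throw new IllegalArgumentException(dimension);
            }
        }
    }

    /** Slice: fix one dimension to a single value. */
    static List<Fact> slice(List<Fact> cube, String dimension, String value) {
        List<Fact> out = new ArrayList<>();
        for (Fact f : cube) {
            if (f.get(dimension).equals(value)) out.add(f);
        }
        return out;
    }

    /** Dice: restrict two or more dimensions to sets of allowed values. */
    static List<Fact> dice(List<Fact> cube, Map<String, Set<String>> allowed) {
        List<Fact> out = new ArrayList<>();
        for (Fact f : cube) {
            boolean keep = true;
            for (Map.Entry<String, Set<String>> e : allowed.entrySet()) {
                keep &= e.getValue().contains(f.get(e.getKey()));
            }
            if (keep) out.add(f);
        }
        return out;
    }

    /**
     * Sums sales grouped by the listed dimensions. Listing fewer dimensions
     * rolls up; adding a dimension to the list drills down.
     */
    static Map<String, Integer> aggregate(List<Fact> cube, String... dimensions) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (Fact f : cube) {
            List<String> key = new ArrayList<>();
            for (String d : dimensions) key.add(f.get(d));
            totals.merge(String.join("/", key), f.sales, Integer::sum);
        }
        return totals;
    }

    /** Small sample cube used for illustration only. */
    static List<Fact> sampleCube() {
        return Arrays.asList(
            new Fact("Laptop", "North", "Q1", 10),
            new Fact("Laptop", "South", "Q1", 20),
            new Fact("Phone",  "North", "Q1", 5),
            new Fact("Phone",  "North", "Q2", 7),
            new Fact("Laptop", "North", "Q2", 3));
    }
}
```

With this sample data, aggregate(cube, "region") gives the rolled-up totals per region, while aggregate(cube, "region", "quarter") drills down to one total per region and quarter.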
5) OLAP Schema Definition: The schema definition syntax is as follows.
<Schema>
  <Cube name=" ">
    <Table name=" "/>
    <Dimension name=" ">
      <Hierarchy hasAll=" " allMemberName=" " primaryKey=" ">
        <Table name=" "/>
        <Level name=" " column=" " uniqueMembers=" "/>
      </Hierarchy>
    </Dimension>
  </Cube>
</Schema>
The schema can be constructed using tools such as Schema Workbench.
Fig. 1. Block Diagram of OLaaS System
III. OLAAS SYSTEM DESIGN AND PROTOTYPING
REST Web Services are designed to upload a cube and to perform Slice,
Dice and Drill down operations on the OLAP cube. The API takes the
required information from the application, constructs an MDX query and
passes it to an MDX engine. For prototyping we have used
Olap4j/Mondrian. Based on the result, an SQL statement for the data
movement operation is constructed and executed. The result is stored in
the location specified by the application. The application can then call
another REST API to retrieve this computed data for further processing.
For prototyping we have chosen MySQL to store both the Data Warehouse
(using ROLAP) and the result database.
Web Services for the following operations have been iden-
tified for development:
1) To Upload Cube/Schema file
2) To Slice the data
3) To Dice the data
4) To Drill Down the data, and
5) To Retrieve the data
A. System Design
The flow of the query processing is as follows: REST
call parsing, MDX Query construction, Execution & Result
Parsing, SQL Query Construction and Data Movement. Fig-
ure 1 shows the basic block diagram of the OLaaS. Figure 2
shows Business Process Modelling Notation (BPMN) diagram
depicting the system design in more depth.
1) Request Processing: The request type is identified from the received
REST call and processed accordingly. The request processing is depicted
in Figure 3. The application calls the REST API. The system validates
the parameters passed with the API call and throws an exception if any
of them is invalid. Once the parameters are validated, the system checks
the request type (Upload, Slice, Dice, Drill or Retrieve data). Based on
the request type, the corresponding method is called internally to
perform the operation. Once the operation is complete, the application
developer can see the response with the details of where the computed
data is stored. The developer also has provision to view the logs to
check for any issues or errors.
Fig. 2. BPMN Diagram of the OLaaS System
Fig. 3. Request Processing
Based on the request type, the MDX query is constructed. The constructed
query is executed using an MDX engine. The multidimensional result of
the MDX query is parsed. Based on this, SQL CREATE and INSERT queries
are constructed to load the data into another table, as requested by the
application. The name of the target database, in which the data is
stored and from which it is retrieved later, can be set via a parameter
of the REST call.
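The validate, identify and dispatch flow described above can be sketched as follows. This is a minimal illustration and not the prototype's code: the class and method names are hypothetical, the "upload_cube" and "retrieve" path segments are assumptions, and the MDX and SQL work that a real handler would perform is reduced to a comment.

```java
import java.util.Locale;

/** Hypothetical sketch of request validation and dispatch by request type. */
final class RequestDispatchSketch {

    enum RequestType { UPLOAD, SLICE, DICE, DRILL, RETRIEVE }

    /** Maps the last path segment of the REST call to a request type. */
    static RequestType typeOf(String path) {
        String last = path.substring(path.lastIndexOf('/') + 1)
                          .toLowerCase(Locale.ROOT);
        switch (last) {
            case "upload_cube": return RequestType.UPLOAD;   // assumed path
            case "slice":       return RequestType.SLICE;
            case "dice":        return RequestType.DICE;
            case "drilldown":   return RequestType.DRILL;
            case "retrieve":    return RequestType.RETRIEVE; // assumed path
            default:
                throw new IllegalArgumentException("Unknown request: " + path);
        }
    }

    /** Validates the target parameters, dispatches, and builds the response. */
    static String process(String path, String targetDb, String targetTable) {
        if (targetDb == null || targetDb.isEmpty()
                || targetTable == null || targetTable.isEmpty()) {
            throw new IllegalArgumentException("Missing target database or table");
        }
        RequestType type = typeOf(path);
        // A real handler would construct and run the MDX query for `type` here
        // and load the parsed result into the target table.
        return type + " result stored in DATABASE=" + targetDb
                + ", TABLE=" + targetTable;
    }
}
```

Throwing on an unknown path or a missing parameter corresponds to the exception step of the flow; the returned string stands in for the response that tells the developer where the computed data is stored.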
2) MDX Query Construction and Execution: The query parameters from the
REST URL are parsed and the Measures, Cube name, Dimension, and source
and target database names are identified. Based on these details an MDX
query is constructed. The system then connects to the source database,
executes the MDX query and gets the result. The implementation of this
logic is explained in Section IV.
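To make the construction step concrete, the sketch below assembles an MDX string for a slice from already-parsed parameters. It is an assumption-laden illustration: the class and method names are hypothetical, the choice of putting measures on COLUMNS, all members of one dimension on ROWS and the slicer member in the WHERE clause is ours, and the prototype itself builds the query through the Olap4j query model rather than by string concatenation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: builds an MDX slice query from parsed parameters. */
final class SliceMdxSketch {

    /** Wraps a name in MDX brackets, doubling any closing bracket. */
    static String bracket(String name) {
        return "[" + name.replace("]", "]]") + "]";
    }

    /**
     * Measures go on COLUMNS, all members of rowDimension on ROWS, and the
     * slicer member (one value of one dimension) in the WHERE clause.
     */
    static String buildSliceMdx(List<String> measures, String rowDimension,
                                String slicerDimension, String slicerMember,
                                String cube) {
        List<String> columns = new ArrayList<>();
        for (String m : measures) {
            columns.add("[Measures]." + bracket(m));
        }
        return "SELECT {" + String.join(", ", columns) + "} ON COLUMNS, "
             + "{" + bracket(rowDimension) + ".Members} ON ROWS "
             + "FROM " + bracket(cube) + " "
             + "WHERE (" + bracket(slicerDimension) + "."
             + bracket(slicerMember) + ")";
    }
}
```

For example, with the measure "Unit Sales", row dimension "Product", slicer member "1997" of dimension "Time" and cube "Sales" (illustrative names), the method returns SELECT {[Measures].[Unit Sales]} ON COLUMNS, {[Product].Members} ON ROWS FROM [Sales] WHERE ([Time].[1997]).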
3) MDX Result Parsing and Data Loading: From the result of the MDX
query, the Axis details are identified. Based on the Axis details, the
row and column information is identified. Finally, based on this
information, an SQL query is constructed and executed on the target
database. This flow is illustrated in Figure 4.
Fig. 4. MDX Query Processing
4) REST API: The parameters of the REST API calls are
defined as the following:
M=<measures>
D=<dimensions>
C=<cube name>
T=<destination table>
Dd=<destination database>
Sd=<source database>
H=<hostname>
U=<user name>
P=<password>
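As an illustration of how such parameters might be read from a REST call, the sketch below parses a URL query string into a map and checks for required parameters. The class name, the lower-casing of parameter names and the choice of which parameters are required (m, d, c, t) are assumptions made for this example; the prototype's actual parsing logic is not reproduced here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: parses and validates the REST call parameters. */
final class RestParamsSketch {

    /** Parameters assumed to be mandatory for an OLAP operation call. */
    static final String[] REQUIRED = {"m", "d", "c", "t"};

    /** Splits "M=..&D=.." into a map keyed by lower-cased parameter name. */
    static Map<String, String> parse(String queryString) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) continue; // skip empty or malformed pairs
            String key = decode(pair.substring(0, eq)).toLowerCase(Locale.ROOT);
            params.put(key, decode(pair.substring(eq + 1)));
        }
        return params;
    }

    /** Throws if a required parameter is missing or empty. */
    static void validate(Map<String, String> params) {
        for (String name : REQUIRED) {
            String value = params.get(name);
            if (value == null || value.isEmpty()) {
                throw new IllegalArgumentException("Missing parameter: " + name);
            }
        }
    }

    /** URL-decodes one component using UTF-8. */
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }
}
```

A call such as parse("M=Unit%20Sales&D=Time&C=Sales&T=slice_out&Dd=results") yields a map in which "m" is "Unit Sales" and "dd" is "results"; the values shown are made up for the example.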
IV. SYSTEM IMPLEMENTATION
The designed OLaaS is prototyped by developing the REST API in Java. The
API is connected to the open-source Olap4j/Mondrian engine for MDX query
execution. MySQL is used to store and process both the OLAP Cube data
and the results of the requests.
The following subsections detail the prototype implementa-
tion.
A. Upload Cube
The Web Service for Schema File upload is created and its REST API is
exposed under the URL http://host:3000/api/upload_cube.
B. MDX Query Construction
Once the Cube schema is designed and populated, an MDX query on this
schema can be executed; with ROLAP, the engine evaluates it using SQL
on the underlying relational database. Olap4j is configured and the MDX
query is executed through it. The MDX query needs to be generated
dynamically from the user request, which determines what is to be
computed and on which part of the data.
The first step is to create an object that represents the Cube in the
database. Olap4j provides a connection object similar to a JDBC
connection; viewed only as a JDBC connection, it does not give access to
OLAP-specific features. To get full access to the OLAP functionality,
this connection needs to be unwrapped:
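// Requires java.sql.Connection, java.sql.DriverManager,
// org.olap4j.OlapConnection and org.olap4j.OlapWrapper.
Connection rConn = DriverManager.getConnection(
    "jdbc:xmla:" + "Server=server_name",
    ...
    ...
);
OlapWrapper wrapper = (OlapWrapper) rConn;
OlapConnection oConn = wrapper.unwrap(OlapConnection.class);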
In the above snippet the connection is cast to OlapWrapper and then
unwrapped. This allows access to the OLAP features through the oConn
object. The next statement creates the actual query:
Query myQuery = new Query(<QueryName>,
<CubeName>);
The next step is to build an MDX query. For that, the dimensions need to
be placed on their respective axes, as follows.
QueryDimension dimension1 =
    myQuery.getDimension(<DimensionName1>);
QueryDimension dimension2 =
    myQuery.getDimension(<DimensionName2>);
QueryDimension dimension3 =
    myQuery.getDimension(<DimensionName3>);
myQuery.getAxis(Axis.COLUMNS)
    .addDimension(dimension1);
myQuery.getAxis(Axis.ROWS)
    .addDimension(dimension2);
myQuery.getAxis(Axis.FILTER)
    .addDimension(dimension3);
In the above, three dimensions of the cube are placed on three different
axes. More than one dimension can be placed on a single axis; the Query
Model then automatically creates a cross join between these dimensions.
Before obtaining the created MDX query it needs to be
validated:
myQuery.validate();
Finally, the actual MDX query is obtained as an MDX SELECT statement
string, ready for execution:
myQuery.getSelect().toString()
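When several dimensions share one axis, the resulting axis expression contains a cross join. The sketch below shows the general shape of such an expression using plain string building. It is an illustration only: the class and method names are hypothetical, and the exact MDX text generated by the Olap4j query model may differ from what this sketch produces.

```java
import java.util.List;

/** Hypothetical sketch: axis expression for dimensions sharing one axis. */
final class CrossJoinSketch {

    /** Nests CrossJoin calls so that all listed dimensions share the axis. */
    static String axisExpression(List<String> dimensions) {
        if (dimensions.isEmpty()) {
            throw new IllegalArgumentException("At least one dimension is required");
        }
        String expr = memberSet(dimensions.get(0));
        for (int i = 1; i < dimensions.size(); i++) {
            expr = "CrossJoin(" + expr + ", " + memberSet(dimensions.get(i)) + ")";
        }
        return expr;
    }

    /** The set of all members of one dimension. */
    private static String memberSet(String dimension) {
        return "{[" + dimension + "].Members}";
    }
}
```

For two illustrative dimensions "Product" and "Store" the method returns CrossJoin({[Product].Members}, {[Store].Members}), which pairs every product member with every store member on the same axis.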
C. MDX Result Parsing and SQL Construction
The next step is to load the sliced/diced/drilled data into another
table as requested by the user application. The result data of the MDX
query cannot be dumped into another table as it is, because SQL
CREATE/INSERT queries cannot be constructed directly from its
multidimensional form. The result is therefore parsed and the
corresponding CREATE and INSERT SQL queries are constructed. Once the
MDX query is executed, the rows and columns in the MDX result need to be
identified. This is achieved using the axes of the result, which
correspond to the Axis function of the MDX language. An MDX query can
return data on more than two axes. The Columns axis is Axis(0), the Rows
axis is Axis(1), and so on for higher axes.
Result result = connection.execute(query);
// query is the MDX query
List<Position> slicers =
    result.getSlicerAxis().getPositions();
List<Position> columns =
    result.getAxes()[0].getPositions();
List<Position> rows = null;
// Rows exist whenever the result has a second axis.
if (result.getAxes().length >= 2) {
    rows = result.getAxes()[1].getPositions();
}
Fig. 5. REST API Call
From the rows and columns information, SQL queries
are constructed to create table and insert data based on the
destination database information in the REST call.
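The construction of the CREATE and INSERT statements can be sketched as below, starting from the column captions found on the Columns axis. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, the column types (VARCHAR(255) for the row label, DOUBLE for measure values) are our choice, and backtick quoting follows MySQL conventions. The INSERT statement is produced as a parameterised template intended for use with a JDBC PreparedStatement.

```java
import java.util.List;

/** Hypothetical sketch: SQL construction from parsed MDX axis captions. */
final class ResultToSqlSketch {

    /** Quotes an identifier MySQL-style, doubling embedded backticks. */
    static String quoteId(String id) {
        return "`" + id.replace("`", "``") + "`";
    }

    /** One text column for the row label, one numeric column per caption. */
    static String createTable(String table, String rowHeader,
                              List<String> columnCaptions) {
        StringBuilder sql = new StringBuilder("CREATE TABLE ")
            .append(quoteId(table)).append(" (")
            .append(quoteId(rowHeader)).append(" VARCHAR(255)");
        for (String caption : columnCaptions) {
            sql.append(", ").append(quoteId(caption)).append(" DOUBLE");
        }
        return sql.append(")").toString();
    }

    /** Parameterised INSERT: one placeholder for the row label, one per column. */
    static String insertTemplate(String table, int columnCount) {
        StringBuilder sql = new StringBuilder("INSERT INTO ")
            .append(quoteId(table)).append(" VALUES (?");
        for (int i = 0; i < columnCount; i++) {
            sql.append(", ?");
        }
        return sql.append(")").toString();
    }
}
```

Using placeholders instead of concatenating cell values keeps the generated SQL independent of the data in the MDX result; each row position then becomes one execution of the prepared INSERT.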
D. REST API Implementation
Each of the OLAP operations identified is exposed as a REST API with
the following base URLs:
Upload Cube: http://host:3000/api/upload_cube
Slice: http://host:3000/api/slice
Dice: http://host:3000/api/dice
Rollup: http://host:3000/api/rollup
Drilldown: http://host:3000/api/drilldown
For prototyping the REST API, NodeJS is used. For instance, the Slice
operation API is implemented as follows:
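...
var execFile = require('child_process').execFile;

router.get("/slice", function (req, res) {
  // Pass the request parameters as separate arguments so that no
  // shell is involved in starting the Java process.
  var args = ['-jar', 'olap_slice.jar',
              req.query.m, req.query.d, req.query.t, req.query.c];
  execFile('java', args, {maxBuffer: 1024 * 500},
    function (error, stdout, stderr) {
      console.log('Output -> ' + stdout);
      if (error !== null) {
        // Report the failure and stop, so only one response is sent.
        console.log("Error -> " + error);
        return res.json({oops_error: error});
      }
      if (req.query.log == 1) {
        ...
      } else {
        // req.query.dd is assumed to carry the Dd (destination
        // database) parameter.
        res.json({result:
          "Cube is stored in DATABASE=" + req.query.dd +
          ", TABLE=" + req.query.t});
      }
    });
});
module.exports = router;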
V. RESULTS AND DISCUSSION
In this section, the results and performance analysis of the designed
and prototyped OLaaS are documented and discussed.
A. Results
Figure 5 illustrates the API call with the example of a Slice REST API
call. Figure 6 illustrates an example of the MDX query constructed from
the API call parameters and the Cube structure. The output of the MDX
query is in multidimensional form. Figure 7 illustrates a snippet of the
MDX query result, captured from the backend logs. The MDX result is used
by OLaaS to construct and execute an SQL query based on the REST API
call type and its parameters. The result of executing the generated SQL
query is verified by examining the target database, as illustrated in
Figure 8. The response from the API call with logging enabled is shown
in Figure 9.
Fig. 6. MDX Query Construction Result
Fig. 7. MDX Query Result
B. Performance Analysis
To obtain an estimate of the performance of the prototyped OLaaS API,
the API is executed repeatedly and the response time is measured. Since
the OLaaS API includes several other operations in addition to the MDX
query execution on the OLAP Cube, this “overhead” is estimated
indirectly as follows. The generated MDX query is run using the Saiku
Analytics engine on the same Cube data and the MDX query execution time
is measured. Table I tabulates the performance measurement results for
the core OLAP operations along with the Saiku Analytics measurement
results.
From Table I, it can be seen that the response times decrease with
repeated calls to the OLAP operations, indicating that the underlying
platforms (MDX and database engines) make use of cached results. In
addition, considering all the operations it performs beyond executing
the MDX query, the prototyped OLaaS REST API performs well: from the
third call onwards, the Dice and DrillDown response times (110 ms) are
close to the corresponding MDX query processing times (100 ms and
111 ms).
Fig. 8. SQL Query Verification
Fig. 9. API Response with Logging Enabled
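The indirect overhead estimate amounts to subtracting the MDX-only time from the API response time for each call. The sketch below does this for the Slice column of Table I; the class and method names are hypothetical, and only the numbers are taken from the table.

```java
/** Hypothetical sketch: per-call overhead from the Slice column of Table I. */
final class OverheadSketch {

    /** OLaaS API response times for Slice, calls 1 to 5 (ms). */
    static final int[] SLICE_API_MS = {500, 500, 492, 178, 178};

    /** MDX query times for Slice measured with Saiku Analytics (ms). */
    static final int[] SLICE_MDX_MS = {300, 280, 105, 105, 105};

    /** Overhead per call = API response time minus MDX-only time. */
    static int[] overhead(int[] apiMs, int[] mdxMs) {
        if (apiMs.length != mdxMs.length) {
            throw new IllegalArgumentException("Arrays must have equal length");
        }
        int[] out = new int[apiMs.length];
        for (int i = 0; i < apiMs.length; i++) {
            out[i] = apiMs[i] - mdxMs[i];
        }
        return out;
    }
}
```

For the Slice column this gives overheads of 200, 220, 387, 73 and 73 ms for calls 1 to 5.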
TABLE I
API RESPONSE TIME PERFORMANCE (MS). VALUES IN BRACKETS INDICATE THE MDX
QUERY PROCESSING TIMES USING THE SAIKU ANALYTICS ENGINE
Call No.   Slice       Dice        DrillDown
1          500 (300)   410 (275)   400 (320)
2          500 (280)   410 (275)   440 (320)
3          492 (105)   110 (100)   110 (111)
4          178 (105)   110 (100)   110 (111)
5          178 (105)   110 (100)   110 (111)
VI. CONCLUSION
This paper presents the design and prototyping of a framework for
exposing OLAP operations as a set of RESTful Web Services. Web Services
were developed for the Upload Cube, Slice, Dice, Drilldown and Retrieve
Data OLAP operations. The designed Web Services are implemented using
Mondrian and Olap4j. The developed REST API extends the OLAP services
being offered as SaaS on the Cloud to a Platform as a Service (PaaS)
model, termed OLAP as a Service (OLaaS) here. More importantly, it
enables the BI application developer to easily incorporate OLAP
operations to develop complex applications. The developed Web Services
were tested, validated and analyzed for their performance. The
performance of the developed Web Services, including query construction,
was found to be comparable to the performance measured using native
Mondrian OLAP operations, which does not include query construction. The
developed OLAP Web Services demonstrate openness, adaptability,
interoperability and modularization on Cloud computing platforms.
Based on this work we can conclude that the fundamental OLAP operations,
namely Cube Design, Upload Cube, Slice, Dice and Drilldown, can be
exposed as RESTful Web Services and integrated as an OLAP as a Service
(OLaaS) platform. Complex BI applications can now be implemented using
the REST API of the developed OLaaS.
The work carried out can be a base for research and development in
several directions. A few recommendations for future work are as
follows. The developed API needs to be tested on Data Warehouses of
real-life sizes (terabytes and petabytes) and with complex real-life BI
operations. The developed API can be optimized further for better
response times. The REST API is meant for users to develop their own
workflows using the developed services and Web Service composition. It
would be very useful to integrate a Web Service composition based
workflow creation facility with the developed API; this is currently
being worked upon. The API can also be integrated with a Map-Reduce
framework for creating and executing new calculations. Finally, the
developed API can be extended to support Cube models other than the
ROLAP model considered here.
REFERENCES
[1] H. Al-Aqrabi, L. Liu, R. Hill, and N. Antonopoulos, “Taking the Business
Intelligence to the Clouds,” in High Performance Computing and Com-
munication & 2012 IEEE 9th International Conference on Embedded
Software and Systems (HPCC-ICESS), 2012 IEEE 14th International
Conference on. IEEE, 2012, pp. 953–958.
[2] Y. Cao, C. Chen, F. Guo, D. Jiang, Y. Lin, B. C. Ooi, H. T. Vo, S. Wu,
and Q. Xu, “Es2: A Cloud Data Storage System for Supporting both
OLTP and OLAP,” in 2011 IEEE 27th International Conference on Data
Engineering. IEEE, 2011, pp. 291–302.
[3] X. Zhou, “Parallel Real-Time OLAP on Cloud Platforms,” Ph.D. disser-
tation, Carleton University, Ottawa, 2013.
[4] C. Strîmbei, “OLAP Services on Cloud Architecture,” Journal of Software
and Systems Development, vol. 2012, p. 1, 2012.
[5] A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, N. Zhang, S. Antony,
H. Liu, and R. Murthy, “Hive-A Petabyte Scale Data Warehouse using
Hadoop,” in 2010 IEEE 26th International Conference on Data Engi-
neering (ICDE 2010). IEEE, 2010, pp. 996–1005.
[6] “Apache Kylin Project,” http://kylin.apache.org/, Accessed: 2016-08-08.
[7] W. H. Inmon, Building the Data Warehouse. John Wiley & Sons, 2005.