Data Lakes are early in the Gartner hype cycle, but companies are getting value from their cloud-based data lake deployments. Break through the confusion between data lakes and data warehouses and seek out the most appropriate use cases for your big data lakes.
Big data architectures and the data lake - James Serra
With so many new technologies, it can be confusing to determine the best approach to building a big data architecture. The data lake is a great new concept, usually built in Hadoop, but what exactly is it and how does it fit in? In this presentation I'll discuss the four most common patterns in big data production implementations, the top-down vs. bottom-up approach to analytics, and how you can use a data lake and an RDBMS data warehouse together. We will go into detail on the characteristics of a data lake and its benefits, and how you still need to perform the same data governance tasks in a data lake as you do in a data warehouse. Come to this presentation to make sure your data lake does not turn into a data swamp!
Data Warehousing Trends, Best Practices, and Future Outlook - James Serra
Over the last decade, the 3Vs of data (Volume, Velocity, and Variety) have grown massively. The Big Data revolution has completely changed the way companies collect, analyze, and store data. Advancements in cloud-based data warehousing technologies have empowered companies to fully leverage big data without heavy investments in terms of both time and resources. But that doesn’t mean building and managing a cloud data warehouse comes without challenges. From deciding on a service provider to designing the architecture, deploying a data warehouse tailored to your business needs is a strenuous undertaking. Looking to deploy a data warehouse to scale your company’s data infrastructure, or still on the fence? In this presentation you will gain insights into current data warehousing trends, best practices, and the future outlook. Learn how to build your data warehouse with the help of real-life use cases and a discussion of commonly faced challenges. In this session you will learn:
- Choosing the best solution - Data Lake vs. Data Warehouse vs. Data Mart
- Choosing the best Data Warehouse design methodologies: Data Vault vs. Kimball vs. Inmon
- Step-by-step approach to building an effective data warehouse architecture
- Common reasons for the failure of data warehouse implementations and how to avoid them
Embarking on building a modern data warehouse in the cloud can be an overwhelming experience due to the sheer number of products that can be used, especially when the use cases for many products overlap with others. In this talk I will cover the use cases of many of the Microsoft products that you can use when building a modern data warehouse, broken down into four areas: ingest, store, prep, and model & serve. It’s a complicated story that I will try to simplify, giving blunt opinions on when to use which products and the pros and cons of each.
Data Architecture Best Practices for Advanced Analytics - DATAVERSITY
Many organizations are immature when it comes to data and analytics use. The answer lies in delivering a greater level of insight from data, straight to the point of need.
There are many Data Architecture best practices today, accumulated from years of practice. In this webinar, William will look at some Data Architecture best practices that he believes have emerged in the past two years and are not yet worked into many enterprise data programs. These are keepers that organizations will need to move towards by one means or another, so it’s best to mindfully work them into the environment.
SSAS, MDX, Cube understanding, Browsing and Tools information - Vishal Pawar
Why we need SSAS Cube
What is SSAS Cube
Way to access Cube
What is Dimension and Attributes
QHP Dimension and Attributes
Process Flow and QHP Cube Browsing
MDX Basics
MDX Tools
Comparison of Queries Written in T-SQL and MDX with Construct
MDX - How to add a WHERE condition
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric - Cambridge Semantics
Watch this webinar to learn about the benefits of using semantic and graph database technology to create a Data Catalog of all of an enterprise's data, regardless of source or format, as part of a modern IT or data management stack and an important step toward building an Enterprise Data Fabric.
This presentation explains the basics of the ETL (Extract-Transform-Load) concept in relation to such data solutions as data warehousing, data migration, and data integration. CloverETL is presented in detail as an example of an enterprise ETL tool. The presentation also covers the typical phases of data integration projects.
Data Lakehouse, Data Mesh, and Data Fabric (r1) - James Serra
So many buzzwords of late: Data Lakehouse, Data Mesh, and Data Fabric. What do all these terms mean and how do they compare to a data warehouse? In this session I’ll cover all of them in detail and compare the pros and cons of each. I’ll include use cases so you can see what approach will work best for your big data needs.
Data Architecture, Solution Architecture, Platform Architecture — What’s the ... - DATAVERSITY
A solid data architecture is critical to the success of any data initiative. But what is meant by “data architecture”? Throughout the industry, there are many different “flavors” of data architecture, each with its own unique value and use cases for describing key aspects of the data landscape. Join this webinar to demystify the various architecture styles and understand how they can add value to your organization.
Data Modeling Best Practices - Business & Technical Approaches - DATAVERSITY
Data Modeling is hotter than ever, according to a number of recent surveys. Part of the appeal of data models lies in their ability to translate complex data concepts in an intuitive, visual way to both business and technical stakeholders. This webinar provides real-world best practices in using Data Modeling for both business and technical teams.
Modernizing to a Cloud Data Architecture - Databricks
Organizations with on-premises Hadoop infrastructure are bogged down by system complexity, unscalable infrastructure, and the increasing burden on DevOps to manage legacy architectures. Costs and resource utilization continue to go up while innovation has flatlined. In this session, you will learn why, now more than ever, enterprises are looking for cloud alternatives to Hadoop and are migrating off of the architecture in large numbers. You will also learn how the benefits of elastic compute models helped one customer scale their analytics and AI workloads, along with best practices from their experience of successfully migrating their data and workloads to the cloud.
Building Lakehouses on Delta Lake with SQL Analytics Primer - Databricks
You’ve heard the marketing buzz, and maybe you have been to a workshop and worked with some Spark, Delta, SQL, Python, or R, but you still need some help putting all the pieces together. Join us as we review some common techniques to build a lakehouse using Delta Lake, use SQL Analytics to perform exploratory analysis, and build connectivity for BI applications.
Is the traditional data warehouse dead? - James Serra
With new technologies such as Hive LLAP or Spark SQL, do I still need a data warehouse, or can I just put everything in a data lake and report off of that? No! In this presentation I’ll discuss why you still need a relational data warehouse and how to use a data lake and an RDBMS data warehouse together to get the best of both worlds. I will go into detail on the characteristics of a data lake and its benefits, and why you still need data governance tasks in a data lake. I’ll also discuss using Hadoop as the data lake, data virtualization, and the need for OLAP in a big data solution. And I’ll put it all together by showing common big data architectures.
Building an Effective Data Warehouse Architecture - James Serra
Why use a data warehouse? What is the best methodology to use when creating a data warehouse? Should I use a normalized or dimensional approach? What is the difference between the Kimball and Inmon methodologies? Does the new Tabular model in SQL Server 2012 change things? What is the difference between a data warehouse and a data mart? Is there hardware that is optimized for a data warehouse? What if I have a ton of data? During this session James will help you to answer these questions.
Gartner: Master Data Management Functionality - Gartner
Gartner will further examine key trends shaping the future MDM market during the Gartner MDM Summit 2011, 2-3 February in London. More information at www.europe.gartner.com/mdm
Differentiate Big Data vs Data Warehouse use cases for a cloud solution - James Serra
It can be quite challenging keeping up with the frequent updates to the Microsoft products and understanding all their use cases and how all the products fit together. In this session we will differentiate the use cases for each of the Microsoft services, explaining and demonstrating what is good and what isn't, in order for you to position, design and deliver the proper adoption use cases for each with your customers. We will cover a wide range of products such as Databricks, SQL Data Warehouse, HDInsight, Azure Data Lake Analytics, Azure Data Lake Store, Blob storage, and AAS as well as high-level concepts such as when to use a data lake. We will also review the most common reference architectures (“patterns”) witnessed in customer adoption.
Microsoft SQL Server Analysis Services (SSAS) - A Practical Introduction - Mark Ginnebaugh
Patrick Sheehan of Microsoft covers platform architecture, data warehousing methodology, and multi-dimensional cube development.
You will learn:
* How to develop and deploy data cubes using SQL Server Analysis Services (SSAS)
* Optimal data warehouse methodology for use with SSAS
* Tips/tricks for designing & building cubes over no warehouse/suboptimal source system (it happens)
* Cube processing types - How/why to use each
* Cube design practices + How to build and deploy cubes!
Microsoft SSAS: Should I Use Tabular or Multidimensional? - Senturus
Learn which version of Microsoft SQL Server Analysis Services to use so that you can easily migrate the work to the other version if needed. View the webinar video recording and download this deck: http://www.senturus.com/resources/microsoft-ssas/.
During this webinar, Senturus discussed how to choose between the tabular and multi-dimensional versions of SSAS for your analytic needs and the various features and benefits that each version provides.
Senturus, a business analytics consulting firm, has a resource library with hundreds of free recorded webinars, trainings, demos and unbiased product reviews. Take a look and share them with your colleagues and friends: http://www.senturus.com/resources/.
Take your reports to the next dimension! In this session we will discuss how to combine the power of SSRS and SSAS to create cube-driven reports. We will talk about using SSAS as a data source, writing MDX queries, using report parameters, passing parameters for drill-down reports, performance tuning, and the pros and cons of using a cube as your data source.
Jeff Prom is a Senior Consultant with Magenic Technologies. He holds a bachelor’s degree and three SQL Server certifications, and is an active PASS member. Jeff has been working in the IT industry for over 14 years and currently specializes in data and business intelligence.
Creating a Tabular Model Using SQL Server 2012 Analysis Services - Code Mastery
At Code Mastery Boston Steve Hughes, Principal Consultant at Magenic, highlights: Basics of SQL Server 2012 Analysis Services, Multidimensional Model, VS PowerPivot, Creating a Tabular Model
Many people know FastTrack as a reference architecture for relational databases. The goal of this guideline is to provide a reference architecture for scalable and fast Analysis Services solutions.
Developing with SQL Server Analysis Services - Mark Tabladillo
SQL Server Analysis Services (SSAS) allows for integrating cubes, tabular model databases, and data mining with your developed applications and services. This talk provides a developer’s framework for understanding SSAS and its core ADOMD .NET and AMO classes. These development options will be demonstrated through application demos. This session is great for developers already working with SQL Server, who want to take their development skills to the next level.
“A broad category of applications and technologies for gathering, storing, analyzing, sharing and providing access to data to help enterprise users make better business decisions” -Gartner
Big Data Challenges and How to Overcome Them with Qubole - a Self-Service Platform for Big Data Analytics built on the Amazon Web Services, Microsoft, and Google clouds. Storing, accessing, and analyzing large amounts of data from diverse sources, and making it easily accessible to deliver actionable insights for users, can be challenging for data-driven organizations. The solution for customers is to optimize scaling and create a unified interface to simplify analysis. Qubole helps customers simplify their big data analytics with speed and scalability, while providing data analysts and scientists self-service access in the cloud. The platform is fully elastic and automatically scales or contracts clusters based on workload. We will give an overview of the main features, advantages, and drawbacks of this platform.
Hear Ryan Millay, IBM Cloudant software development manager, discuss what you need to consider when moving from the world of relational databases to a NoSQL document store.
You'll learn about the key differences between relational databases and JSON document stores like Cloudant, as well as how to dodge the pitfalls of migrating from a relational database to NoSQL.
2. Analysis Services Overview
• Part of the Business Intelligence Development Studio
• Included with the SQL Server license
• A special version of Visual Studio
• Microsoft’s application for creating multidimensional OLAP databases, which are queried with the MDX language
• Microsoft’s powerful data mining platform, which includes sophisticated algorithms that can operate on relational or OLAP data
• This presentation demonstrates the creation of an OLAP database, also called a cube
3. Why Store Data in a Cube?
Analysis Services, like the SQL Server relational database, is a platform for storing data. But unlike the relational database, it does not store data in tables; the data is stored in other types of structures that make up a cube.
Why store data in cubes instead of tables? There are a number of reasons:
• better query performance
• fast, optimized aggregation calculations
• more efficient storage of data through its simplified, read-oriented design
• richer calculation possibilities, supporting stored measures, calculated measures, and key performance indicators (KPIs)
• a data model that is easier for the end user to understand
OLAP data is engineered to provide dimensional data views, hierarchical browsing, and attribute-based breakouts and filtering. And OLAP data does not require, or even use, table joins!
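To make the contrast with SQL concrete, here is a minimal MDX query sketch against a hypothetical cube resembling the one built later in this deck. The cube name [Sales], the measure [Units], and the dimension hierarchies are illustrative assumptions, not names taken from the slides:

    -- Total Units by product category for one year; note there are no joins.
    -- [Sales], [Measures].[Units], [Product].[Category], and [Time].[Year] are assumed names.
    SELECT
        [Measures].[Units] ON COLUMNS,
        [Product].[Category].Members ON ROWS
    FROM [Sales]
    WHERE ( [Time].[Year].&[2012] )   -- slicer axis: restrict to a single year

The WHERE clause here is a slicer, not a row filter as in T-SQL; the dimensional structure of the cube replaces the joins a relational query would need.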
4. Creating an Analysis Services (OLAP) Database
Creating an Analysis Services database consists of creating the database structure, then populating those structures with data from external data sources. The structures comprising an OLAP cube include:
• Dimensions and their associated elements
– Hierarchies
– Attributes
• Measures and their associated elements
– Stored measures
– Calculated measures
– Measure groups
– Key performance indicators (KPIs)
– Measure profiles
5. Development Steps
1. Create an Analysis Services project
2. Create a data source
3. Create a data source view
4. Create the cube object definitions
5. Deploy the definitions to the OLAP server and load in the data
6. Specify partitions and aggregations
6. Open BIDS and create a new Analysis Services project…
7. Create a Data Source and a Data Source View
In the data source, you identify the relational data warehouse (DW) that hosts the source data and provide the connection information.
In the data source view, you specify which tables in the DW you want to use to supply data to the cube.
If the DW has been designed using the classic star schema or snowflake approach, the SSAS cube wizard can examine the DW’s structure and derive a cube definition from it, making your job easier.
When the initial design from the wizard is complete, you can go back and make modifications; for example, you can add or modify hierarchies and attribute relationships.
Once you have the cube design you want, you load the data in.
8. In this illustration, all tables from the star schema DW are used; only the sysdiagrams table (which holds the database diagram data) remains unselected.
9. Once you have defined the data source view, you can inspect the schema. By right-clicking on a table, you can browse the data and define derived columns. But at this point, we still have no OLAP database. The creation of the OLAP database begins with creating a cube.
10. Creating the Cube Structure
1. With the data source view in place, you are now ready to create the cube. Right-click on the Cubes folder and select “New Cube.” This launches the cube wizard.
2. Make sure the wizard has correctly identified which are dimension tables and which are fact tables, and tell it which table contains the data for the time dimension.
11. Setting the Time Dimension Parameters
Time is a unique dimension, with inherent assumptions about how it should work.
You identify the time dimension as such so that the MDX functions specific to it (such as PrevMember and ParallelPeriod) will work.
You also tell the wizard which of the source data columns map to well-known time concepts such as years, quarters, and months.
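As a hedged illustration of why this matters, here is a small MDX sketch using ParallelPeriod to compare each month with the same month a year earlier. It only works because the dimension has been flagged as a time dimension; the cube and hierarchy names are assumptions carried over from the earlier sketch:

    -- Units for each month next to Units for the same month one year earlier.
    -- [Sales], [Measures].[Units], and [Time].[Calendar] are assumed names.
    WITH MEMBER [Measures].[Units Prior Year] AS
        ( [Measures].[Units],
          ParallelPeriod( [Time].[Calendar].[Year], 1,
                          [Time].[Calendar].CurrentMember ) )
    SELECT
        { [Measures].[Units], [Measures].[Units Prior Year] } ON COLUMNS,
        [Time].[Calendar].[Month].Members ON ROWS
    FROM [Sales]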
12. After defining the time properties, the wizard displays the measures that it defined. Here you can select which ones you want to keep. I am keeping all of them in this illustration.
The “Fact Units Count” measure counts the number of records in the source table. This information is used in optimizing the aggregation process.
In the Review New Dimensions panel, the dimensions that were created are displayed along with their hierarchies and attributes.
13. A cube has now been defined. This panel lets you review the data model (UDM) that has been created.
15. The solution browser on the right side of the screen now shows the cube and dimensions that were defined. These objects can be modified by clicking on them.
16. Click on the Product dimension to observe that no hierarchy was defined by the wizard. I will create a hierarchy for it.
17. A new hierarchy is created by dragging an attribute into the center panel. I use Category Code and Dim Product to form the hierarchy; Category and Item provide the labels the user will see (they are mapped to the Name property).
Next I define attribute relationships between these hierarchy levels. This is so the aggregation process adds the data from the level directly beneath it instead of always going to the leaf level, which would be a less efficient process entailing significantly more calculations.
18. In the case of the time dimension, there is an additional step. We want the months to display in chronological order, not alphabetical order, so we assign a value to the dimension’s OrderByAttribute property. The data source has a column called CHRON_ORDER that contains this ordering information.
19. The information entered to define the hierarchies and attributes is stored in XML files. The database doesn’t actually know what you have done yet. You must “process” the dimension. This brings that information into the OLAP database, which then uses it to create its internal structures.
20. Once a dimension has been processed, you can inspect it in the browser pane to verify that the hierarchy has the expected structure. Note that the months show in chronological order.
21. After all the elements have been defined and the cube has been completely processed, you can inspect the data in the SSAS data browser.
23. The cube has been defined and created. The leaf-level data from the data warehouse was loaded. We inspected the data in the multidimensional data browser.
We can tweak the physical design of the cube to improve scalability and query performance. Three primary mechanisms for doing this are:
• Selecting ROLAP/HOLAP/MOLAP data storage options
• Partitioning
• Pre-aggregating
MOLAP is probably the most commonly used data storage option and is the default. It means all the data will be stored in the multidimensional data cube. The illustrations on the succeeding slides will use MOLAP.
At the opposite end of the spectrum is ROLAP, where all data comes from relational tables. With ROLAP, the SSAS database only provides metadata structures for presenting information in the dimensional style. You can expect performance to be much slower. This mode is used in situations where the source data is not static but changes frequently, and you want reports to reflect those changes immediately.
HOLAP is a hybrid approach where all the aggregate data is in the SSAS database, while the leaf-level data resides in relational tables.
When the volume of data is very large, it can be helpful to chop the data store into pieces, or partitions. The storage mode is selectable per partition. This example uses a small amount of data, and only one partition will be employed.
24. Aggregates are summary-level data computed from the leaf-level data that was loaded from the source. Often the aggregates are totals and subtotals, but other summary statistics such as averages or maximum values can also be used.
Pre-calculating and storing the aggregate values normally improves query performance (at the cost of the storage space and the time required to compute them). The default is to do no pre-aggregation. You can see this from the partitions panel shown below.
The data display shown earlier from the SSAS data browser included many aggregate data values. Those aggregates were all calculated on the fly.
You can pre-calculate all aggregates or only some of them. If you are going to pre-calculate only some, there are different strategies that can be employed to determine which are chosen for calculation. You’ll see this ahead.
25. Let’s go through the aggregation process. Click the “Design Aggregations” hyperlink to bring up the wizard. In the first panel of the wizard, push the Count button to compute the statistics that are used to drive the aggregation optimization process.
26. After a few seconds, the source data has been analyzed. The wizard counts the number of records per partition.
Once the statistics have been computed, you can ask the system to identify a set of aggregations to perform. You can direct the system to aggregate until (A) a certain amount of storage has been used, (B) a certain level of performance gain has been achieved, or (C) you click Stop; or (D) you can choose not to aggregate at all.
In this illustration, I am asking it to aggregate until it reaches a performance gain of 75%. The system will run an optimization algorithm to determine the best aggregations to use.
27. The system generates a chart telling what percent (of the total possible number) of aggregations it has identified and what level of performance gain would be achieved by computing them.
At the completion of that step, the wizard has identified which aggregations to compute. You may elect to have it compute them now, or you can defer the calculations until later. (They could take a while.)
28. Selecting “Deploy and Process now” and pushing Finish, you arrive at this screen. Push the Run button to perform the calculations. When it finishes, you get a message heralding the successful completion of the deployment. The information under the Aggregations tab will be updated.
30. Different Kinds of Reporting Data
Thus far, all the measures that have been constructed have been displays of stored data, or of aggregates either stored or calculated on the fly. There are other kinds of information that can be made available to an end user:
• Calculated measures
– Percents
– Shares
– Differences
• Key Performance Indicators (KPIs)
Calculated measures are calculated on the fly using MDX expressions. KPIs are measures with associated goals and graphics. I will show an example of both.
31. In this example, I create a calculated measure that gives the difference between the data value at a given time and its value in the previous time period. The calculation is defined from the Calculations tab. It is given a name and an MDX expression. In this example I make use of the PrevMember function.
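The deck does not reproduce the exact expression, but a minimal sketch of such a calculated measure, with assumed measure and hierarchy names, might look like this in the Calculations tab:

    -- Calculated measure "Units Increase": current Units minus the previous period's Units.
    -- [Measures].[Units] and [Time].[Calendar] are assumed names.
    CREATE MEMBER CURRENTCUBE.[Measures].[Units Increase] AS
        [Measures].[Units]
        - ( [Measures].[Units],
            [Time].[Calendar].CurrentMember.PrevMember ),
        FORMAT_STRING = "#,##0",
        VISIBLE = 1 ;

PrevMember walks back one member at the current level, so with months on the axis this yields a month-over-month difference.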
32. Displaying the Units measure and the Units Increase measure side by side demonstrates that the calculated measure correctly computes the difference between the current value and the one a month ago.
In the next series of slides I will use this calculated measure to construct a KPI.
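A query sketch for that side-by-side display, reusing the assumed names from the previous sketches:

    -- Show stored Units and calculated Units Increase together, by month.
    SELECT
        { [Measures].[Units], [Measures].[Units Increase] } ON COLUMNS,
        [Time].[Calendar].[Month].Members ON ROWS
    FROM [Sales]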
33. What Is a Key Performance Indicator (KPI)?
Every KPI starts life as a measure, presumably a measure that is an indicator of company performance. With each KPI, we assume that the company has established a target value, or goal, for what that indicator should be. For instance, sales revenue might be a performance indicator. The goal might be to sell at least $100,000 in a given quarter.
The KPI will calculate the difference between the goal and the actual result. We assume the company can assess those differences, declaring them as either good, so-so, or bad. For instance, the company may say revenue above 100,000 is good, 90,000 to 100,000 is so-so, and revenue below 90,000 is bad.
This brings us to an essential distinguishing feature of the KPI: a graphical icon, known as an indicator, that is displayed to communicate the status of the KPI to the end user. That graphic might be a happy face to show good, a neutral face to show so-so, and a frowning face to show bad. Traffic lights with green, amber, and red are often used. The choice of graphics is up to the client.
Setting up a KPI in Analysis Services entails computing a status value. The difference between the indicator and the goal is calculated, and the differences that are “good” are mapped to the number 1, so-so to 0, and bad to -1. That number is the KPI’s status.
Optionally, you can define a trend for the KPI. The trend shows whether, over time, the performance measure has been moving upwards or downwards.
34. KPI Summary
• Begin with a measure indicating company performance
• Have goals associated with that performance measure
• Translate the difference between performance and goal into a status with values of 1, 0, and -1 (corresponding to good, so-so, and bad)
• Display the status of the performance measure to the user as a graphic
35. You define KPIs from the KPIs tab of the cube designer. In this simple illustration, our calculated measure, “Units Increase,” is the performance indicator, and the goal is a constant value of 180. MDX expressions can be used to provide more complex goal statements.
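As a hedged sketch, the KPI’s value, goal, and status expressions on that tab might look like the following; the constant goal and the 90% threshold are illustrative assumptions, not values from the deck:

    -- Value expression:
    [Measures].[Units Increase]

    -- Goal expression (a constant, as in this illustration):
    180

    -- Status expression: 1 = good, 0 = so-so, -1 = bad.
    -- KpiValue() and KpiGoal() are built-in MDX KPI functions; the thresholds are assumed.
    Case
        When KpiValue( "Units Increase" ) >= KpiGoal( "Units Increase" ) Then 1
        When KpiValue( "Units Increase" ) >= KpiGoal( "Units Increase" ) * 0.9 Then 0
        Else -1
    End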
36. Once you have defined the KPI, it may be inspected in the browser view of the KPIs tab. Here you see the performance metric has a value of 179, just under the target value. This is “so-so,” and you see the neutral face showing.