What are some business facts that you need, or would like, to be able to report on?
Here is an example of how to identify Facts and Dimensions on an existing report. The Facts are “math” words: Count of Cases, Sum of Aid Payments, Average of Pay per Case. The Dimensions are “grouping” words: (By) Program, (By) Aid Type, (By) Calendar Month, (For) Fiscal Year.
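The split between “math” words and “grouping” words can be sketched in a few lines of Python. This is only an illustration: the `payments` rows, the field names, and the `summarize_by` helper are hypothetical and do not come from the actual report.

```python
from collections import defaultdict

# Hypothetical aid payment rows (illustrative data only).
payments = [
    {"program": "Food Stamps", "case_id": "A", "amount": 100.0},
    {"program": "Food Stamps", "case_id": "B", "amount": 300.0},
    {"program": "Medi-Cal", "case_id": "C", "amount": 250.0},
]

def summarize_by(rows, dimension):
    """Compute the facts (count, sum, average) grouped by one dimension."""
    totals = defaultdict(float)
    cases = defaultdict(set)
    for row in rows:
        totals[row[dimension]] += row["amount"]       # Sum of Aid Payments
        cases[row[dimension]].add(row["case_id"])     # distinct cases seen
    return {
        key: {
            "case_count": len(cases[key]),                 # Count of Cases
            "sum_paid": totals[key],                       # Sum of Aid Payments
            "avg_per_case": totals[key] / len(cases[key]), # Average of Pay per Case
        }
        for key in totals
    }

report = summarize_by(payments, "program")  # "By Program"
```

Here the dictionary keys play the role of the dimension (By Program) and the numbers inside each entry are the facts.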
We start with data from operational sources. We move this data into a staging area where business rules are applied. Code values are translated to a common set, e.g., M vs. male. Formats are changed to fit a standard, e.g., 5.1 vs. 5.1000. These rules make the data from different sources comparable (apples to apples). Once the data is made “standard,” it is loaded into the warehouse’s fact and dimension tables. We create the reporting cubes. Users access the cubes to analyze and report on the data.
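A minimal sketch of the staging rules described above, assuming two hypothetical source systems that encode gender and amounts differently. The `GENDER_CODES` mapping, the field names, and the two-decimal standard are assumptions made for illustration, not the warehouse's actual rules.

```python
from decimal import Decimal

# Hypothetical mapping of source-specific gender codes to one common set.
GENDER_CODES = {"m": "Male", "male": "Male", "f": "Female", "female": "Female"}

def standardize(record):
    """Apply staging rules so rows from different sources are comparable."""
    # Translate code values to the common set ("M" and "male" both become "Male").
    gender = GENDER_CODES.get(record["gender"].strip().lower(), "Unknown")
    # Normalize numeric formats: "5.1" and "5.1000" become the same value.
    amount = Decimal(record["amount"]).quantize(Decimal("0.01"))
    return {"gender": gender, "amount": amount}

source_a = {"gender": "M", "amount": "5.1"}
source_b = {"gender": "male", "amount": "5.1000"}

# After standardization the two rows are identical (apples to apples).
assert standardize(source_a) == standardize(source_b)
```

Only after a step like this would the rows be loaded into the fact and dimension tables.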
The data warehouse exists to help you answer business questions. Reporting Cubes are the tool you use to answer them.
Now I’m going to show some examples of how this comes together to help you in reporting and analyzing data. While going through these, think of what YOU would like to see. In this report, the facts are Client counts; the dimensions are By Department, By Gender, and By Active Year.
Here we see another report. Again the fact is a count of Clients; the dimensions are By Race Group, By Department, and For Active Year.
These cubes can provide for drilling down into greater levels of detail. From the previous report we have “drilled” into the Social Services Division, “down” to the program level. Can you tell what the dimensions are here? By Race, By Department, By Program, and By Active Year.
We say that these cubes are multi-dimensional. This report shows that we can combine dimensions to find even more interesting information. Notice that the Fact is Unique Client, the Dimensions are By Race Group, By Gender, By Marital Status, and By Department, and the Filter, or Selection, is For Active Year.
Depending on the reporting tool, these reports can easily be converted into visual graphs. Here we see the prior grid report in a graph format. This allows the user to quickly notice interesting information.
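To make “multi-dimensional” concrete, here is a small Python sketch that counts unique clients for any combination of dimensions. The sample `rows`, the field names, and the `unique_clients` helper are hypothetical; a real cube tool does this work for you.

```python
from collections import defaultdict

# Hypothetical fact rows: one row per client per service received.
rows = [
    {"client_id": 1, "race_group": "Asian", "gender": "Female", "department": "Health"},
    {"client_id": 1, "race_group": "Asian", "gender": "Female", "department": "Health"},
    {"client_id": 2, "race_group": "Asian", "gender": "Male", "department": "Health"},
    {"client_id": 3, "race_group": "White", "gender": "Female", "department": "Social Services"},
]

def unique_clients(rows, dimensions):
    """Count distinct clients for each combination of the chosen dimensions."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        groups[key].add(row["client_id"])  # a set ignores repeat visits
    return {key: len(ids) for key, ids in groups.items()}

# One dimension, then the same fact with a second dimension combined in.
by_department = unique_clients(rows, ["department"])
by_department_gender = unique_clients(rows, ["department", "gender"])
```

Adding a dimension to the list splits each group further, which is the same idea as combining dimensions in a report.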
Now that the value of the data warehouse can be seen, how do we begin?
2. What is a Data Warehouse?
The conglomeration of an organization’s data
warehouse staging and presentation areas,
where operational data is specifically
structured for query and analysis
performance and ease-of-use.
Ralph Kimball (2002), The Data Warehouse Toolkit.
3. Now in English
A data warehouse is a database organized in a
way to allow for fast queries of information.
It contains the data from the different database
systems that is brought together for a single
4. So what’s the difference?
• Centers around
• 2 dimension reports
– Age by System
• Individual data
• “Cut-n-paste” into other
• Centers around business
• Multi-dimensional reports
– Age by Race by Program
• Aggregated data
party reporting tools
can be used.
5. Measures Facts not Activities
Facts are business performance measurements
– Meals provided
– Dollars expended
– Hours worked
Facts are numerical and additive
– Sum of dollars spent
– Count of clients served
Facts are stored to represent a measurement at a
6. What is a Grain?
A grain is the level of detail at which a business
measurement is stored
Different businesses have different fact needs
– A Social Services grain
• The number of food stamp dollars given to a case each month
– In-Home Support Services grain
• The number of hours of service a client received in a
provider’s pay period
• The number of dollars paid to a provider for a client during a
7. What is a Dimension?
A dimension is a textual description that
relates to a fact, for example:
– Ethnicity (White, Black, Japanese)
– Language (English, Spanish, Tagalog)
– Gender (Male, Female)
– Date (05/31/2003, 04/15/2003)
– Location (California, Arizona, New Mexico)
8. Used in Queries
Dimensions are used to restrict and frame queries on
facts, for example:
“Give me a count of all Spanish speaking white males in California.”
• The fact is the count (a number)
• The dimensions are:
– Spanish (language),
– white (race),
– male (gender),
– and California (location)
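The query above can be sketched as a filter on dimensions followed by a count. This is an illustration only: the `clients` rows and the `count_clients` helper are hypothetical, and a real warehouse would answer this through its query or cube tool.

```python
# Hypothetical client rows; each row carries its dimension values.
clients = [
    {"language": "Spanish", "race": "White", "gender": "Male", "location": "California"},
    {"language": "Spanish", "race": "White", "gender": "Female", "location": "California"},
    {"language": "English", "race": "White", "gender": "Male", "location": "California"},
    {"language": "Spanish", "race": "White", "gender": "Male", "location": "Arizona"},
    {"language": "Spanish", "race": "White", "gender": "Male", "location": "California"},
]

def count_clients(rows, **dimensions):
    """Count rows (the fact) that match every given dimension value."""
    return sum(
        all(row[name] == value for name, value in dimensions.items())
        for row in rows
    )

# The dimensions restrict the query; the count is the fact.
result = count_clients(
    clients,
    language="Spanish", race="White", gender="Male", location="California",
)
print(result)  # 2
```

Dropping one of the keyword arguments widens the question, for example counting all Spanish speaking white males regardless of location.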
9. Identifying Facts and Dimensions
[Figure: existing report annotated with its dimensions (By Aid Type, By Month, For (By) Year) and its fact (Sum of Aid)]
10. What makes a Data Warehouse?
11. Cubes Answer Business Questions
How many Spanish speaking clients did H&HS
serve in each department for each of the past 3
Which cities currently have the highest concentration
of Asian clients? What has the trend been?
How many people who receive Medi-Cal received a
service in 2003 from health services, by service?
12. Reporting Cubes
13. Reporting Cubes
14. Drill Down Capable
16. Visual Graphs
17. Where do we start?
• Choose the systems to include
• Identify the exact grain of the business
• Identify the dimensions available for use
with each fact table row
• Choose the numeric facts of what is being
18. Key to Success
To ensure success end user involvement is
Data warehouse success is tied directly to
user acceptance. If the users haven’t
accepted the data warehouse …then your
efforts have been exercises in futility. (Kimball,