Big Data Analytics
Why data warehousing?
Data → Information → Decision
A data warehouse is a database used for reporting and data
analysis. It is a central repository of data, created by
integrating data from one or more disparate sources.
A data warehouse is a pool of historical data that does not
participate in the daily operations of the organization.
Instead, this data is used specifically for business analytics.
Warehouse schemas - star
● Data in a DW is arranged into facts and into hierarchical
groups called dimensions
● The simplest style of DW schema.
● Consists of one or more fact tables
referencing any number of dimension tables.
● A special case of the snowflake schema; more effective for
handling simpler queries.
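The star layout described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the table names (fact_sales, dim_product, dim_date), the columns, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical star schema: one fact table referencing two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'Laptop', 'Electronics'), (2, 'Desk', 'Furniture');
INSERT INTO dim_date VALUES (1, 2023, 'Q1'), (2, 2023, 'Q2');
INSERT INTO fact_sales VALUES (1, 1, 1000.0), (1, 2, 1500.0), (2, 1, 300.0);
""")

def sales_by_category(connection):
    """Total the fact measure per product category (a single join to one dimension)."""
    return connection.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()

star_result = sales_by_category(conn)
print(star_result)
```

Because the category is stored directly in the dimension table, the query needs only one join from the fact table, which is why this layout suits simple queries.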
Warehouse schemas - snowflake
● Multiple dimensions
● Star and snowflake schemas are most commonly found in
dimensional data warehouses and data marts where speed of data
retrieval is more important than the efficiency of data manipulations
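● Star schemas do not follow normal forms: dimensions are denormalized,
trading redundancy for query speed. Snowflake schemas normalize the
dimensions at the cost of extra joins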
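For comparison, a snowflake layout might look like the sketch below, again using sqlite3. The product dimension is split so that the category lives in its own table. All table and column names (dim_category, dim_product, fact_sales) and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical snowflake schema: the product dimension is normalized,
# so the category is stored once in its own sub-dimension table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE dim_product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_category VALUES (1, 'Electronics'), (2, 'Furniture');
INSERT INTO dim_product VALUES (1, 'Laptop', 1), (2, 'Phone', 1), (3, 'Desk', 2);
INSERT INTO fact_sales VALUES (1, 1000.0), (2, 500.0), (3, 300.0);
""")

def sales_by_category(connection):
    """Total sales per category; reaching the category now takes two joins."""
    return connection.execute("""
        SELECT c.category_name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product  AS p ON p.product_id  = f.product_id
        JOIN dim_category AS c ON c.category_id = p.category_id
        GROUP BY c.category_name
        ORDER BY c.category_name
    """).fetchall()

snowflake_result = sales_by_category(conn)
print(snowflake_result)
```

The category name is stored only once, but the same report needs one more join than it would in a star schema. That is the speed tradeoff in miniature.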
Online Analytical Processing (OLAP)
● An approach to answering multi-dimensional analytical queries
● Represents star schema or snowflake schema in a relational data
● Each cell of the cube holds a number that represents some measure
of the business, such as sales, profits, expenses, budget and
● Measures are derived from the records in the fact table and
dimensions are derived from the dimension tables
● Operations: slice and dice, drill-down, and roll-up (also called drill-up)
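The cube operations listed above can be sketched in plain Python. This is a toy model under stated assumptions, not how an OLAP engine is implemented: the cube is a dictionary from (product, region, quarter) coordinates to a sales measure, and the function names and sample records are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact records: (product, region, quarter, sales).
facts = [
    ("Laptop", "East", "Q1", 100),
    ("Laptop", "West", "Q1", 150),
    ("Laptop", "East", "Q2", 120),
    ("Desk", "East", "Q1", 40),
    ("Desk", "West", "Q2", 60),
]
DIMENSIONS = ("product", "region", "quarter")

def build_cube(records):
    """Each cell maps dimension coordinates to the summed measure."""
    cube = defaultdict(int)
    for *coords, measure in records:
        cube[tuple(coords)] += measure
    return dict(cube)

def slice_cube(cube, dimension, value):
    """Slice: fix one dimension to a single value and drop that axis."""
    axis = DIMENSIONS.index(dimension)
    return {k[:axis] + k[axis + 1:]: v for k, v in cube.items() if k[axis] == value}

def dice_cube(cube, allowed):
    """Dice: keep cells whose coordinates fall within the allowed values per dimension."""
    return {
        k: v for k, v in cube.items()
        if all(k[DIMENSIONS.index(d)] in values for d, values in allowed.items())
    }

def roll_up(cube, keep):
    """Roll-up: aggregate away every dimension not listed in keep."""
    axes = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for k, v in cube.items():
        totals[tuple(k[a] for a in axes)] += v
    return dict(totals)

cube = build_cube(facts)
q1_slice = slice_cube(cube, "quarter", "Q1")
laptop_east = dice_cube(cube, {"product": {"Laptop"}, "region": {"East"}})
by_product = roll_up(cube, ["product"])
# Drill-down is the reverse of roll-up: return to a finer level of detail.
by_product_quarter = roll_up(cube, ["product", "quarter"])
print(by_product, by_product_quarter)
```

In this model, drill-down is simply a roll-up that keeps more dimensions, so moving from by_product to by_product_quarter reveals the quarterly breakdown behind each product total.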