Assignment # 3
Information System & Data Processing II

Submitted By
Abdul-rehman Aslam
Roll # (9998)

Submitted To
Madam: Nargis Fatima

NATIONAL UNIVERSITY OF MODERN LANGUAGES
H-9 ISLAMABAD

Question: Differentiate between the following:
1. Star schema and snowflake schema
2. Snowflake schema and fact constellation schema
3. Star schema and fact constellation schema

1. Star schema and snowflake schema

Star schema:
· The star schema is the simplest data warehouse schema.
· Each dimension is represented in a single table. There should be no hierarchies between dimensions.
· It contains a fact table surrounded by dimension tables. If the dimensions are de-normalized, we say it is a star schema design.
· Only one join establishes the relationship between the fact table and any one of the dimension tables.
· A star schema optimizes performance by keeping queries simple and providing fast response times. All the information about each level is stored in one row.
· It is called a star schema because the diagram resembles a star.

Snowflake schema:
· The snowflake schema is a more complex data warehouse model than the star schema.
· At least one hierarchy should exist between dimension tables.
· It contains a fact table surrounded by dimension tables. If a dimension is normalized, we say it is a snowflaked design.
· Because there are relationships between the dimension tables, many joins are needed to fetch the data.
· Snowflake schemas normalize dimensions to eliminate redundancy. The result is more complex queries and reduced query performance.
· It is called a snowflake schema because the diagram resembles a snowflake.
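
The difference in the number of joins can be illustrated with a small sketch. It uses Python's built-in sqlite3 module; the table names (item_star, item_snow, supplier, sales), the columns, and the sample rows are all made up for illustration and are not taken from the assignment. The same question, total units sold per supplier type, needs one join in the star layout and two joins in the snowflake layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star layout: one de-normalized item dimension (supplier_type repeated per item)
CREATE TABLE item_star (item_key INTEGER PRIMARY KEY, item_name TEXT, supplier_type TEXT);

-- Snowflake layout: the item dimension is normalized into item + supplier
CREATE TABLE supplier (supplier_key INTEGER PRIMARY KEY, supplier_type TEXT);
CREATE TABLE item_snow (item_key INTEGER PRIMARY KEY, item_name TEXT,
                        supplier_key INTEGER REFERENCES supplier(supplier_key));

-- The fact table is the same in both layouts
CREATE TABLE sales (item_key INTEGER, units_sold INTEGER);

INSERT INTO supplier VALUES (1, 'wholesale'), (2, 'retail');
INSERT INTO item_snow VALUES (10, 'pen', 1), (11, 'ink', 1), (12, 'book', 2);
INSERT INTO item_star VALUES (10, 'pen', 'wholesale'), (11, 'ink', 'wholesale'),
                             (12, 'book', 'retail');
INSERT INTO sales VALUES (10, 5), (11, 3), (12, 7), (10, 2);
""")

# Star schema: a single join from the fact table to the dimension table
star_rows = conn.execute("""
    SELECT d.supplier_type, SUM(f.units_sold)
    FROM sales f
    JOIN item_star d ON f.item_key = d.item_key
    GROUP BY d.supplier_type
    ORDER BY d.supplier_type
""").fetchall()

# Snowflake schema: fact -> item -> supplier, so two joins for the same answer
snow_rows = conn.execute("""
    SELECT s.supplier_type, SUM(f.units_sold)
    FROM sales f
    JOIN item_snow i ON f.item_key = i.item_key
    JOIN supplier s ON i.supplier_key = s.supplier_key
    GROUP BY s.supplier_type
    ORDER BY s.supplier_type
""").fetchall()

print(star_rows)
print(snow_rows)
```

Both queries return the same totals. The snowflake layout stores each supplier type only once, at the cost of the extra join.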

2. Snowflake schema and fact constellation schema

Snowflake schema:
The snowflake schema is a variant of the star schema model, where some dimension tables are normalized, thereby further splitting the data into additional tables. The resulting schema graph forms a shape similar to a snowflake.
· The major difference between the snowflake and star schema models is that the dimension tables of the snowflake model may be kept in normalized form to reduce redundancies.
· Such a table is easy to maintain and saves storage space. However, this saving of space is negligible in comparison to the typical magnitude of the fact table.
· The snowflake structure can reduce the effectiveness of browsing, since more joins will be needed to execute a query.
· The system performance may be adversely impacted. Hence, although the snowflake schema reduces redundancy, it is not as popular as the star schema in data warehouse design.

Example:
Here, the sales fact table is identical to that of the star schema. The main difference between the two schemas is in the definition of dimension tables. The single dimension table for item in the star schema is normalized in the snowflake schema, resulting in new item and supplier tables.

Fact constellation:
Sophisticated applications may require multiple fact tables to share dimension tables. This kind of schema can be viewed as a collection of stars, and hence is called a galaxy schema or a fact constellation.

Example:
This schema specifies two fact tables, sales and shipping. The sales table definition is identical to that of the star schema (Figure 3.4). The shipping table has five dimensions, or keys (item key, time key, shipper key, from location, and to location), and two measures (dollars cost and units shipped). A fact constellation schema allows dimension tables to be shared between fact tables. For example, the dimension tables for time, item, and location are shared between the sales and shipping fact tables.

The fact constellation schema is commonly used, since it can model multiple, interrelated subjects. A data mart, on the other hand, is a departmental subset of the data warehouse that focuses on selected subjects, and thus its scope is department-wide. For data marts, the star or snowflake schema is commonly used, since both are geared toward modeling single subjects, although the star schema is more popular and efficient.

In a star schema a dimension table has no parent table, whereas in a snowflake schema a dimension table may have one or more parent tables.

Performance-wise, the star schema is better. But if storage use is a major concern, the snowflake schema is better than the star schema.

3. Star schema and fact constellation

Star schema:
The most common modeling paradigm is the star schema, in which the data warehouse contains (1) a large central table (fact table) containing the bulk of the data with no redundancy, and (2) a set of smaller attendant tables (dimension tables), one for each dimension.

It is the basic structure for a dimensional model. It has one fact table and a set of smaller dimension tables arranged around the fact table. The fact data will not change over time. The most useful facts are numeric and additive, because data warehouse applications almost never access a single record. They access hundreds, thousands, or millions of records at a time and aggregate them.
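
The point about numeric, additive facts can be shown with a short sketch. The fact rows below are hypothetical sample data (the keys and numbers are not from the assignment). Because units sold is additive, it can be summed along any dimension, and the grand total stays the same whichever dimension is used for grouping.

```python
from collections import defaultdict

# Hypothetical sales fact rows: (time_key, item_key, location_key, units_sold)
sales_facts = [
    ("2024-Q1", "pen", "Islamabad", 5),
    ("2024-Q1", "book", "Lahore", 7),
    ("2024-Q2", "pen", "Lahore", 2),
    ("2024-Q2", "ink", "Islamabad", 3),
]

def total_by(facts, column):
    """Sum the units_sold measure, grouped by the dimension key at `column`."""
    totals = defaultdict(int)
    for row in facts:
        totals[row[column]] += row[3]
    return dict(totals)

by_time = total_by(sales_facts, 0)
by_item = total_by(sales_facts, 1)
by_location = total_by(sales_facts, 2)

print(by_time)
print(by_item)
print(by_location)
```

A non-additive value, such as a unit price or a percentage, could not be rolled up this way, which is why additive measures are preferred in fact tables.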

Fact constellation:
Sophisticated applications may require multiple fact tables to share dimension tables. This kind of schema can be viewed as a collection of stars, and hence is called a galaxy schema or a fact constellation.

Example for defining Star, Snowflake and Fact Constellation Schema
Just as relational query languages such as SQL are used to specify relational queries, a data mining query language (DMQL) can be used to specify data mining tasks. DMQL contains language primitives for defining data warehouses and data marts. Data warehouses and data marts can be defined using two language primitives, one for cube definition and another for dimension definition.

The cube definition has the following syntax:
    define cube <cube_name> [<dimension_list>]: <measure_list>

The dimension definition has the following syntax:
    define dimension <dimension_name> as (<attribute_or_subdimension_list>)
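
To make the two primitives more concrete, the sketch below fills in the syntax templates for the sales and shipping fact tables of a fact constellation. It is only an illustration written in Python: the helper functions define_cube and define_dimension are hypothetical, and the sales dimensions, the attribute names, and the measure expressions such as sum(sales_in_dollars) are assumptions, not definitions given in this assignment. Only the shipping keys and measures follow the description above.

```python
def define_cube(name, dimensions, measures):
    """Fill in: define cube <cube_name> [<dimension_list>]: <measure_list>"""
    dim_list = ", ".join(dimensions)
    measure_list = ", ".join(f"{m} = {expr}" for m, expr in measures.items())
    return f"define cube {name} [{dim_list}]: {measure_list}"

def define_dimension(name, attributes):
    """Fill in: define dimension <dimension_name> as (<attribute list>)"""
    return f"define dimension {name} as ({', '.join(attributes)})"

# Each cube maps the key it uses to the dimension table behind that key.
# (The sales dimensions are assumed for illustration.)
sales_dims = {"time": "time", "item": "item", "branch": "branch", "location": "location"}
shipping_dims = {"time": "time", "item": "item", "shipper": "shipper",
                 "from_location": "location", "to_location": "location"}

sales_cube = define_cube(
    "sales", sales_dims,
    {"dollars_sold": "sum(sales_in_dollars)", "units_sold": "count(*)"},
)
shipping_cube = define_cube(
    "shipping", shipping_dims,
    {"dollars_cost": "sum(cost_in_dollars)", "units_shipped": "count(*)"},
)
time_dimension = define_dimension("time", ["time_key", "day", "month", "quarter", "year"])

# Dimension tables used by both fact tables: this sharing is what
# turns two separate stars into a fact constellation.
shared = sorted(set(sales_dims.values()) & set(shipping_dims.values()))

print(sales_cube)
print(shipping_cube)
print(time_dimension)
print(shared)
```

With a single cube definition the result is a star schema. With two cubes that reuse the same dimension tables, as here, it is a fact constellation.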