Dimensional modeling OWB 11gR2 paper


Published in: Technology

Dimensional Modelling with OWB 11gR2
Maren Eschermann, Trivadis AG, Zürich
This document describes the features of Oracle Warehouse Builder for dimensional modeling, their functionality, and their limitations.

1. Introduction

Oracle Warehouse Builder 11.2, like earlier releases, offers many features that support the implementation of data marts and their loading processes. Slowly changing dimensions, orphan management, time dimensions with several hierarchies, and fact tables can be realized quickly and in a standardized way while applying best practices. The dimension and cube operators allow loading mappings to be implemented very efficiently. Even materialized views (relational as well as cube-based) can be generated by Oracle Warehouse Builder with only a few mouse clicks. But what are the limitations of these features? Are there reasons for not using them and instead implementing the star schema structures and the loading processes manually? This paper answers these and similar questions and offers decision support on when, where, and how to use the dimensional features most efficiently.

2. Building a Dimensional Model with Warehouse Builder

2.1 Dimensional Modeling Basics

The dimensional model of a data mart consists of dimensions, hierarchies, facts, and aggregation rules and can be described, for example, by an ADAPT model. It represents the interface between the business and the IT department; a common understanding of that model is crucial for the success of the project.

When moving from the dimensional to the relational model, the “dimensional semantics” are very often lost. Deriving the hierarchy of a dimension from the corresponding relational table is hard or even impossible. To avoid this, the metadata should be enhanced to include the dimensional model as well. That is exactly what happens when using Oracle Warehouse Builder and its dimensional features.

Oracle Warehouse Builder supports both the relational and the multidimensional implementation of a data mart. In the following we focus on relational data marts.

2.2 Modeling a Dimension

A dimension consists of levels, hierarchies, and attributes. Each level has a number of attributes; the relation between levels is defined by hierarchies. A dimension can have more than one hierarchy. The time dimension, for example, might have a week or fiscal year hierarchy besides the standard hierarchy.

info@trivadis.com . www.trivadis.com . Info-Tel. 0800 87 482 347 . Datum 11.01.2012 . Seite 2 / 8

With the support of the OWB dimension wizard, a dimension can be specified in a few steps. Not only the dimension itself is created; a dimension table and a sequence are created as OWB meta-objects as well. A dimension table has the following properties:

  • Table name = <DIMENSION NAME>_TAB
  • A primary key on the DIMENSION_KEY column
  • A number of columns for each level
  • An index on the business key of each level

If dimension properties are modified, you have to perform an “automatic binding” to propagate these modifications to the underlying dimension table. If you have manually modified this table beforehand, all your modifications (comments, additional constraints, …) will be overwritten. Moreover, the primary key of the dimension table gets a new name, which results in an error if you already have fact tables referencing the dimension.

If project-specific guidelines exist that are not compatible with automatic binding and the resulting dimension table properties, you can disable the auto binding feature and do the binding between the dimension and its table manually. This gives you full freedom concerning naming, the definition of constraints, and other properties of the dimension table.

A dimension object has properties which are only applicable if the dimension operator is used as the target operator in a mapping. This is true for configurations concerning historization (Slowly Changing Dimension) and orphan management. Chapters 3.2 and 3.3 cover these topics in more detail.

OWB 11.2 offers some advanced possibilities for modeling dimensions. In 11.2 it is possible to create a dimension without a surrogate key, which allows degenerate dimensions, for example.

2.3 Time Dimension

Virtually every fact table references the time dimension at least once, and very often the fact table is partitioned by time. In earlier versions the time dimension table had a surrogate key like all other dimension tables, and the fact tables referenced that key. This made partitioning by time difficult. OWB 11.2 realizes the time dimension table with a date column as primary key. This way the fact table also has a date column and partitioning can be done easily.

2.4 Modeling a Cube

A cube consists of measures (facts), dimensions, and aggregation rules, which define how the measures can be aggregated along the hierarchies of the dimensions.

With the cube wizard the user can define a cube and all its properties.
The fact table, which is automatically generated, has the following properties:

  • Table name = <CUBE NAME>_TAB
  • One column per dimension with a foreign key constraint and optionally a bitmap index
  • One column per measure
  • Optionally a composite unique key constraint consisting of the combination of dimension columns

The fact table can also be created manually and bound to a cube. Again, this is required if project-specific requirements cannot be reconciled with the default properties of the automatically created fact table. For example, the fact table will very often be partitioned, or foreign keys are disabled to load more efficiently.

3. Implementing the Loading Processes

3.1 Loading a Dimension

Implementing the loading process for a dimension is very easy and efficient when using the dimension operator. This operator realizes the following functionality:

  • Populating the surrogate key
  • Lookup of the business key
  • Deduplication of level elements
  • Realizing Slowly Changing Dimensions
  • Orphan management

The dimension operator supports two different loading types: LOAD and REMOVE. Generally the LOAD type is used, while the REMOVE type is only applied if SCD2 is implemented (for more details see chapter 3.2).

If deduplication of level elements is necessary, you can apply the Enable Source Dedup property, which is the default configuration. When applying this property, the elements of all levels are deduplicated. If the source data is already unique and deduplication is not necessary, you can disable this feature, especially if large sets of data are loaded and performance is an issue. Please note that in this case the dimension operator no longer guarantees the uniqueness of the business key.

When mappings using the dimension operator are deployed, an OWB$TEMP table is created in the target schema for each dimension level. The creation of these tables cannot be switched off; they are necessary to provide some of the hierarchy management and loading functionality. They are not truncated after mapping execution; if this is necessary, you have to implement it manually (e.g. by using a post-mapping operator).
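The interplay of business key lookup, surrogate key population, and source deduplication can be illustrated with a small Python sketch. It is a conceptual model only, not the code OWB generates: the function `load_level`, the column `PRODUCT_ID`, and the "last row wins" deduplication rule are assumptions made for illustration.

```python
from itertools import count


def load_level(source_rows, key_name, dimension, dedup=True):
    """Load source rows into one dimension level (a list of dicts).

    Hypothetical helper for illustration; not OWB-generated code.
    """
    if dedup:
        # Assumption: keep one row per business key, the last occurrence wins.
        source_rows = list({row[key_name]: row for row in source_rows}.values())

    result = [dict(row) for row in dimension]
    # Business key lookup against the state *before* this load (set-based).
    existing = {row[key_name]: row for row in result}
    next_key = count(max((row["DIMENSION_KEY"] for row in result), default=0) + 1)

    for row in source_rows:
        match = existing.get(row[key_name])
        if match is not None:
            match.update(row)  # known element: refresh its attributes
        else:
            # New element: populate the surrogate key.
            result.append({"DIMENSION_KEY": next(next_key), **row})
    return result


source = [
    {"PRODUCT_ID": "P1", "NAME": "Pen"},
    {"PRODUCT_ID": "P1", "NAME": "Pen v2"},  # duplicate business key
    {"PRODUCT_ID": "P2", "NAME": "Ink"},
]
with_dedup = load_level(source, "PRODUCT_ID", [])
without_dedup = load_level(source, "PRODUCT_ID", [], dedup=False)
```

With deduplication enabled, the sketch loads one row per business key. With it disabled, both `P1` rows are inserted, which mirrors the warning above that uniqueness of the business key is no longer guaranteed.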
With the exception of the time dimension, Warehouse Builder always realizes dimensions as “solved dimensions”, i.e. the dimension tables contain records not only for the lowest-level elements but also for the higher-level elements (“control rows”). This allows fact tables to reference elements of higher levels; for example, you can implement a fact table that stores facts not on the product level but on the product category level. When deploying a dimension, OWB creates a view for each dimension which filters out all control rows. Applications that might have problems handling the control rows can access the view instead of the dimension table.

3.2 Slowly Changing Dimensions

Very often it is required to keep historical data of a dimension. This is realized by implementing the dimension as a Slowly Changing Dimension (SCD). Mostly SCD type 2 is used, which means that the complete history is preserved in the database. For relational dimension tables two additional attributes of type date are needed, which define the validity of a record:

  • EFFECTIVE_DATE defines the “valid from” date
  • EXPIRATION_DATE defines the “valid to” date

The properties Type2 Gap and Type2 Gap Units of the dimension operator allow specifying how the effective and expiration dates are set.

The user can specify which attributes trigger the creation of a new record. For all other attributes only the current record is overwritten with the new value. This is in contrast to Kimball’s Hybrid SCD, see Kimball Design Tip #15: “Combining SCD Techniques”.

Oracle Warehouse Builder allows the implementation of Hierarchy Versioning. Whereas Kimball only describes the historization of elements of the lowest level, OWB also provides the functionality of versioning elements of higher levels.

If you want to logically delete a dimension record, the expiration date is set to the current date. The dimension operator implements this behavior if you use the load type REMOVE and set Type2 Extract/Remove Current Only=Yes. Please note that the higher levels of this dimension operator must not be connected, since they would be deleted physically.

If the same dimension record is modified multiple times between two loads, you can use Support Multiple History Loading. In this case more than one record for the same business key is created within a single load.
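The SCD type 2 behavior described in this section can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not OWB's implementation: the current record is marked by `EXPIRATION_DATE = None`, the gap between two versions is modeled as a `timedelta` of one day, and the function names and the columns `BUSINESS_KEY`, `CITY`, and `PHONE` are hypothetical.

```python
from datetime import date, timedelta


def current_version(history, business_key):
    """Return the open-ended (current) version of a business key, if any."""
    for row in history:
        if row["BUSINESS_KEY"] == business_key and row["EXPIRATION_DATE"] is None:
            return row
    return None


def apply_change(history, business_key, attrs, trigger_attrs, load_date,
                 gap=timedelta(days=1)):
    """Apply one source row; only trigger attributes create a new version."""
    current = current_version(history, business_key)
    changed = current is not None and any(
        current[name] != attrs[name] for name in trigger_attrs)
    if current is not None and not changed:
        current.update(attrs)  # non-trigger change: overwrite current record
        return
    if changed:
        # Close the old version; the gap separates it from the new one.
        current["EXPIRATION_DATE"] = load_date - gap
    history.append({"BUSINESS_KEY": business_key, **attrs,
                    "EFFECTIVE_DATE": load_date, "EXPIRATION_DATE": None})


def logical_delete(history, business_key, today):
    """Logical delete: expire the current version instead of removing it."""
    current = current_version(history, business_key)
    if current is not None:
        current["EXPIRATION_DATE"] = today


def active_record(history, business_key, as_of):
    """Return the version whose validity range contains the date as_of."""
    for row in history:
        if (row["BUSINESS_KEY"] == business_key
                and row["EFFECTIVE_DATE"] <= as_of
                and (row["EXPIRATION_DATE"] is None
                     or as_of <= row["EXPIRATION_DATE"])):
            return row
    return None


history = []
apply_change(history, "C1", {"CITY": "Bern", "PHONE": "111"}, ["CITY"], date(2011, 12, 1))
apply_change(history, "C1", {"CITY": "Bern", "PHONE": "222"}, ["CITY"], date(2011, 12, 15))
apply_change(history, "C1", {"CITY": "Zürich", "PHONE": "222"}, ["CITY"], date(2012, 1, 1))
```

In this example only `CITY` is a trigger attribute, so the phone number change overwrites the current record, while the move to Zürich closes the first version on December 31st 2011 and opens a second one.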
Loading historical data out of order can be achieved with the Out of Order History Loading property. This might become necessary if you want to load a record that becomes valid BEFORE the version of the current record became valid. Assume the current record of a customer has been valid since January 1st 2012 and another record of the same customer, valid since December 1st 2011, has to be loaded afterwards. In this case the Out of Order History Loading property would be required to load that record. Please be aware that the facts already loaded for December 2011 will reference the “wrong” record, which previously was valid until December 31st 2011 and which is now only valid until November 30th 2011.

Both properties, Support Multiple History Loading as well as Out of Order History Loading, are switched off by default due to the possible performance overhead.

3.3 Orphan Management

When loading dimensions, a level element might have an invalid parent element (Invalid parent key value) or no parent specified at all (Null parent key value). Such records are called “orphans”. If you try to load such an orphan, you have the choice between three different loading options:

  • No Maintenance: The orphan record is neither rejected nor stored in any error table nor corrected. No Maintenance is the default behavior and results in level elements in the dimension table without any parent specified (the parent level attributes are all set to NULL). When aggregating values these orphans are not considered, which might lead to inconsistent reports. If you are using this option (for example due to licensing restrictions), it is strongly recommended to implement the orphan management outside the dimension operator.
  • Default Parent: The orphan record is loaded and the parent level attributes are set to default values, which can be specified by the Default Level Row settings.
  • Reject Orphan: The orphan record is not loaded into the dimension table but logged into an error table. This way you have the possibility to reload the record later, when the missing parent element exists.

All three options are available for invalid as well as missing parent elements, and you can use a different orphan strategy for each of the two scenarios. In pre-11.2 OWB releases, orphans were simply rejected without being logged into an error table.

The dimension operator offers a broad range of options for loading the dimension table. This functionality, which otherwise has to be implemented by the development team with considerable effort, is available in a standardized manner and with high quality. The user can inspect how the operator is implemented: it is a pluggable mapping which can be expanded (but not modified).

The complexity of the operator becomes obvious if you count the number of basic mapping operators which are used to realize a dimension operator for a dimension with three levels. It consists of

  • 28 basic operators when neither SCD nor orphan management is specified
  • 34 basic operators when SCD2 is implemented
  • 54 basic operators when SCD2 as well as orphan management is implemented
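The three orphan options of section 3.3 can be made concrete with a short Python sketch. It is an illustrative model rather than the operator's actual logic: the function `load_with_orphan_policy`, the policy constants, the column names, and the default parent value `"UNKNOWN"` are all assumptions.

```python
NO_MAINTENANCE = "NO_MAINTENANCE"
DEFAULT_PARENT = "DEFAULT_PARENT"
REJECT_ORPHAN = "REJECT_ORPHAN"


def load_with_orphan_policy(rows, parent_keys, null_policy, invalid_policy,
                            default_parent="UNKNOWN"):
    """Split rows into (loaded, errors) according to the orphan policies.

    null_policy applies to rows without a parent key, invalid_policy to
    rows whose parent key does not exist in parent_keys.
    """
    loaded, errors = [], []
    for row in rows:
        parent = row.get("PARENT_KEY")
        if parent is not None and parent in parent_keys:
            loaded.append(dict(row))  # valid parent: load unchanged
            continue
        policy = null_policy if parent is None else invalid_policy
        if policy == NO_MAINTENANCE:
            # Loaded as is; the parent level attributes end up empty.
            loaded.append({**row, "PARENT_KEY": None})
        elif policy == DEFAULT_PARENT:
            loaded.append({**row, "PARENT_KEY": default_parent})
        elif policy == REJECT_ORPHAN:
            errors.append(dict(row))  # kept for a later reload
        else:
            raise ValueError(f"unknown orphan policy: {policy}")
    return loaded, errors


rows = [
    {"ID": "A", "PARENT_KEY": "CAT1"},  # valid parent
    {"ID": "B", "PARENT_KEY": None},    # null parent key value
    {"ID": "C", "PARENT_KEY": "CATX"},  # invalid parent key value
]
loaded, errors = load_with_orphan_policy(
    rows, {"CAT1"}, null_policy=DEFAULT_PARENT, invalid_policy=REJECT_ORPHAN)
```

Here the record with a missing parent is attached to the default parent, while the record with an invalid parent is diverted to the error list and can be reloaded once `CATX` exists.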
3.4 Loading a Cube

Loading processes for fact tables can be realized by using the cube operator. Its functionality encompasses the lookup of the dimension business keys (depending on the SCD type) and orphan management. For the lookup of SCD2 dimensions the ACTIVE_DATE attribute of the cube operator is crucial. It represents the point in time that is used to determine which record in a Type 2 SCD is the active record. The default value of that attribute is SYSDATE.

The cube operator has the following loading types:

  • INSERT LOAD: allows the modification of already loaded facts, which is realized with a MERGE statement.
  • LOAD: only new facts are loaded (inserted). For big data sets this is the fastest loading option.
  • REMOVE: allows removing already loaded facts.

The cube operator offers orphan management functionality for loading facts with missing or invalid dimension business keys. The options correspond to those of the dimension operator. If you choose the “Default Parent” option, remember to enable source aggregation. Otherwise you might get an ORA-30926 (“unable to get a stable set of rows in the source tables”) execution error because multiple fact rows are produced with the same default dimension references.

A loading policy for handling “early arriving facts” can be implemented with this orphan management functionality. Early arriving facts are those which are loaded BEFORE some of the referenced dimension values are loaded.

4. Materialized View Creation

The query performance of a data mart is crucial. Very often, when aggregation at query time is too slow, pre-aggregation by implementing materialized views will help. Oracle Warehouse Builder can automatically create and deploy materialized views; the user can create relational materialized views as well as cube-based materialized views, which are stored in an analytic workspace. The user can specify which dimensions and which levels are pre-computed.
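What pre-computing a combination of levels means can be shown with a minimal Python sketch. It only models the idea of storing aggregates at a chosen level per dimension; it is not the SQL that OWB deploys. The function `preaggregate`, the column names, and SUM as the aggregation rule are assumptions.

```python
from collections import defaultdict


def preaggregate(facts, rollups, measure="AMOUNT"):
    """Sum a measure at the chosen level of each dimension.

    rollups maps a dimension column to a dict that translates a
    lowest-level member into its member at the chosen level.
    """
    totals = defaultdict(int)
    for fact in facts:
        # Group key: one higher-level member per dimension, in sorted column order.
        group = tuple(rollups[dim][fact[dim]] for dim in sorted(rollups))
        totals[group] += fact[measure]
    return dict(totals)


facts = [
    {"DAY": "2012-01-10", "PRODUCT": "P1", "AMOUNT": 10},
    {"DAY": "2012-01-11", "PRODUCT": "P2", "AMOUNT": 5},
    {"DAY": "2012-02-01", "PRODUCT": "P1", "AMOUNT": 7},
]
rollups = {
    "DAY": {"2012-01-10": "2012-01", "2012-01-11": "2012-01", "2012-02-01": "2012-02"},
    "PRODUCT": {"P1": "Pens", "P2": "Pens"},
}
monthly_by_category = preaggregate(facts, rollups)
```

A query on month and product category can then read the small pre-aggregated result instead of scanning the detailed fact rows, which is the effect a materialized view at those levels is meant to achieve.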
The relational materialized views do not exist as meta-objects in the OWB design repository, but only in the target schema of the database. Whenever you deploy a cube’s aggregation (this can be specified in the configuration properties of the cube object), a new set of materialized views is created. The user has to delete the previously deployed set of materialized views manually. Partitioning of materialized views, the creation of materialized view logs to allow fast refresh, and the implementation of partition change tracking have to be done manually.

If the user selects the option ROLAP with MViews, not only the cube-based materialized view but also the analytic workspace and all necessary multidimensional elements in that workspace are created. When using this feature, significant functionality is implemented automatically in the background. Nevertheless, good knowledge and understanding of the underlying technology is necessary for debugging and maintaining such solutions.

5. Summary

The dimensional functionality of Oracle Warehouse Builder is many-faceted and offers broad support for implementing data marts and their loading processes. Modeling dimensions and cubes inside OWB provides the advantage of having the dimensional model in your metadata (you avoid the “loss of dimensional semantics”) and of following best practices for the design of dimension and fact tables. Furthermore, the implementation of loading processes with the dimension and cube operators is very efficient, of high quality, bug-free, and standardized. The realization of slowly changing dimensions or orphan management is consistent throughout the whole project.

The dimensional features of Oracle Warehouse Builder are like a tool box; every project team can take those “tools” which are suitable. You can use the dimension operator and at the same time dispense with the cube operator, if your cubes are loaded by a partition exchange strategy. Some prototyping at the beginning of the project will help to decide which features should be applied and which not. Whatever the decision is, the focus should be on having a standardized and flexible solution at the end, which can be easily maintained and extended.

Contact Details:
Maren Eschermann
Trivadis AG
Europa Strasse 5
CH-8153 Glattbrugg
Phone: +41 (0) 44-808 7020
Fax: +41 (0) 44-808 7021
E-Mail: maren.eschermann@trivadis.com
Internet: www.trivadis.com