Multi dimensional model vs (1)



In class student presentation for Business Intelligence.

1 Comment
  • nice, thank you, just the slides 24 25 26 29 are frustrating because the sentences are not finished, maybe good for a live/interactive presentation but not for single readers like us here on SlideShare


  1. 1. MULTIDIMENSIONAL DATA MODEL: Niall Cosgrave (112223751), Timothy Halpin (112222171), Kevin McCarthy (107476477), Cian O’Brien (1085802351), Amanda O’Donovan (108385581)
  2. 2. WHAT IS MULTIDIMENSIONAL DATA: The multidimensional data model is a model for data management in which databases are developed according to users' preferences so that they can be used for specific types of retrieval. The model views data in the form of a data cube. A data cube allows data to be modelled and viewed in multiple dimensions. It is defined by dimensions and facts.
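As a minimal sketch of the idea that a cube is defined by dimensions and facts, the cube can be pictured as a mapping from one coordinate per dimension to a measure value. The dimension names, the figures and the `cell` helper below are all hypothetical and chosen only for illustration.

```python
# Hypothetical dimensions; every cell is addressed by one value per dimension.
DIMENSIONS = ("product", "region", "quarter")

# Hypothetical facts: coordinates -> measure (units sold). All numbers are invented.
cube = {
    ("laptop", "north", "Q1"): 120,
    ("laptop", "south", "Q1"): 95,
    ("phone", "north", "Q1"): 200,
    ("phone", "north", "Q2"): 180,
}


def cell(cube, **coords):
    """Return the measure stored at the given coordinates, or None if the cell is empty."""
    key = tuple(coords[d] for d in DIMENSIONS)
    return cube.get(key)


units = cell(cube, product="phone", region="north", quarter="Q2")
```

A dictionary keyed by tuples is only one possible representation; it is used here because it keeps the link between dimensions (the key) and facts (the value) visible.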
  3. 3. WHAT IS MULTIDIMENSIONAL DATA: A multidimensional database (MDB) is a type of database that is optimized for data warehouse and online analytical processing (OLAP) applications. Multidimensional database technology is a key factor in the interactive analysis of large amounts of data for decision-making purposes.
  4. 4. WHAT IS MULTIDIMENSIONAL DATA: Multidimensional databases are especially useful in sales and marketing applications that involve time series. Large volumes of sales and inventory data can be stored and ultimately used for logistics and executive planning.
  5. 5. WHY MULTIDIMENSIONAL DATABASE: Multidimensional databases enable interactive analysis of large amounts of data for decision-making purposes. They differ from previous technologies by viewing data as multidimensional cubes, which have proven to be particularly well suited to data analysis. They rapidly process the data in the database so that answers can be generated quickly. A successful OLAP application provides “just-in-time” information for effective decision-making.
  6. 6. WHY MULTIDIMENSIONAL DATABASE: The multidimensional data model is important because it enforces simplicity. As Ralph Kimball states in his landmark book, The Data Warehouse Toolkit: "The central attraction of the dimensional model of a business is its simplicity.... that simplicity is the fundamental key that allows users to understand databases, and allows software to navigate databases efficiently."
  7. 7. WHY MULTIDIMENSIONAL DATABASE: The multidimensional data model is composed of logical cubes, measures, dimensions, hierarchies, levels, and attributes.
  9. 9. DIAGRAM OF THE MULTIDIMENSIONAL MODEL: Logical dimensions: dimensions contain a set of unique values that identify and categorise data. Hierarchies and levels: a hierarchy is a way to organise data at different levels of aggregation. Attributes: an attribute provides additional information about the data. Some attributes are used for display.
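A hierarchy organising data at different levels of aggregation can be sketched as a mapping from each lower-level member to its parent, with a roll-up that sums values into the higher level. The city-to-province mapping and the figures below are hypothetical, and `roll_up` is an illustrative helper rather than part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical hierarchy: each city (lower level) belongs to one province (higher level).
CITY_TO_PROVINCE = {"Toronto": "Ontario", "Ottawa": "Ontario", "Montreal": "Quebec"}

# Hypothetical measure values recorded at the city level. All numbers are invented.
city_totals = {"Toronto": 50, "Ottawa": 20, "Montreal": 40}


def roll_up(values, parent_of):
    """Aggregate values from a lower level to its parent level by summing."""
    totals = defaultdict(int)
    for member, value in values.items():
        totals[parent_of[member]] += value
    return dict(totals)


province_totals = roll_up(city_totals, CITY_TO_PROVINCE)
```

Drilling down is the reverse navigation: starting from a province total and listing the city values that contributed to it.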
  11. 11. 1991 CANADIAN CENSUS
  12. 12. SLICING, DICING AND ROTATING: The cube above shows the results of the 1991 Canadian Census, with ethnic origin, age group and geography as the dimensions of the cube, while 174 represents the measure. A dimension is a category of data, and each dimension includes different levels of categories. The measures are the actual data values that occupy the cells defined by the selected dimensions. Three important concepts are associated with data cubes: slicing, dicing and rotating.
  13. 13. SLICING THE DATA CUBE: • Figure 2 illustrates slicing on the ethnic origin Chinese. When the cube is sliced as in this example, we can generate data for Chinese origin across the geography and age-group dimensions. • The data contained within the cube has effectively been filtered to display only the measures associated with the Chinese ethnic origin. • From an end-user perspective, the term slice most often refers to a two-dimensional page selected from the cube.
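Slicing can be sketched as fixing one dimension to a single value and dropping that dimension from the result, which leaves a two-dimensional page. The cell values below are invented and are not the census figures; `slice_cube` is a hypothetical helper.

```python
# Hypothetical cells keyed by (ethnic_origin, age_group, geography). Values are invented.
cube = {
    ("Chinese", "0-14", "Ontario"): 10,
    ("Chinese", "15-24", "Ontario"): 12,
    ("Chinese", "0-14", "Quebec"): 4,
    ("Italian", "0-14", "Ontario"): 9,
}


def slice_cube(cube, axis, value):
    """Fix one dimension (by position) to a single value and drop it from the keys."""
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }


# Slice on ethnic origin (axis 0): the result is a page of age group by geography.
chinese_page = slice_cube(cube, axis=0, value="Chinese")
```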
  14. 14. DICING AND ROTATING: • Dicing is an operation related to slicing in which a sub-cube of the original space is defined. • Dicing provides the user with the smallest available slice of data, making it possible to examine each sub-cube in greater detail. • Rotating, which is sometimes called pivoting, changes the dimensional orientation of the report or page display from the cube data. Rotating may consist of swapping the rows and columns, or moving one of the row dimensions into the column dimension.
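Dicing and rotating can be sketched in the same style: a dice keeps only the cells whose coordinates fall inside allowed sets on several dimensions, and a rotation swaps the two dimensions of a page. The data and the `dice` and `rotate` helpers are hypothetical.

```python
# Hypothetical cells keyed by (ethnic_origin, age_group, geography). Values are invented.
cube = {
    ("Chinese", "0-14", "Ontario"): 10,
    ("Chinese", "15-24", "Ontario"): 12,
    ("Chinese", "0-14", "Quebec"): 4,
    ("Italian", "0-14", "Ontario"): 9,
}


def dice(cube, allowed):
    """Keep cells whose coordinate on each listed axis is in that axis's allowed set."""
    return {
        key: measure
        for key, measure in cube.items()
        if all(key[axis] in values for axis, values in allowed.items())
    }


def rotate(page):
    """Swap the two dimensions of a two-dimensional page (rows become columns)."""
    return {(col, row): measure for (row, col), measure in page.items()}


# Dice: a sub-cube restricted to Chinese origin in Ontario.
sub_cube = dice(cube, {0: {"Chinese"}, 2: {"Ontario"}})

# Rotate: take the age-group-by-geography page for Chinese origin and pivot it.
page = {(age, geo): m for (origin, age, geo), m in cube.items() if origin == "Chinese"}
rotated = rotate(page)
```

Rotation changes only how the page is laid out; the measures themselves are untouched.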
  15. 15. EXAMPLE OF A DATA CUBE IN USE: ‘Design and development of data mart for animal resources’ is a 2008 paper by Rai et al. that critically examines the development of a Central Data Warehouse for a multitude of agricultural areas. The paper provides a visual representation of the livestock population census multidimensional cube, which is accessed through an Internet browser for OLAP. In this cube, the hierarchies are All States, All Species and All Years. All States has state names as its top level and district as the bottom level of the data flow hierarchy. All Species has species name as its top level, sex as its second level, age group as its third level and working categories of animals as its bottom level. All Years has only one level, i.e. years.
  17. 17. EXAMPLE OF A DATA CUBE IN USE: This online system has a drag-and-drop option for creating nested tables, and drill-up and drill-down functionality based on the hierarchies of the various dimensions. The system also has simple calculation options on tabular data, and hide and show options to keep unwanted rows or columns off the screen. Find and search options are available for finding a particular piece of information in the tabular data of a cube.
  18. 18. CREATING YOUR OWN DATA CUBE: There are a variety of tools available that allow you to build your own data cube, such as Microsoft Excel and Microsoft SQL Server. The steps required are: 1. Choose a data source. 2. Create the query that extracts data from the database. 3. Create the cube from the extracted data. The Contoso database that we used for the Dashboard project is a good example of a data source from which we can generate data cubes. Use the Query Wizard to generate the query that you wish to build your cube on.
  19. 19. CREATING YOUR OWN DATA CUBE: In the Query Wizard Finish screen, select Create an OLAP Cube from this query and click Finish. The third step is then to use the OLAP Cube Wizard. This application allows you to turn your table columns into dimensions. For example, drag product_category, product_subcategory, and brand_name so that they appear in that order in the available dimension box, and rename the dimension “Product.” The next step is to select the option that best fits the type of cube you want to create. For example, select Save a cube file containing all data for the cube. Enter a path and filename for the cube, and then click Finish. Save the query definition that you have created. The cube wizard then creates the cube file. Once the cube is created, the PivotChart Wizard allows you to create a PivotTable report from the data in the cube. us/library/office/aa140038(v=office.10).aspx#odc_da_whatrcubes_topic5
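The three steps (choose a data source, create the query, create the cube) can also be sketched outside the Excel wizards. The example below swaps in Python's standard `sqlite3` module and an in-memory table in place of the Contoso database and the OLAP Cube Wizard, so the table name `sales`, its rows and the `amount` column are assumptions made for illustration; only `product_category` and `brand_name` come from the slide.

```python
import sqlite3
from collections import defaultdict

# Step 1: choose a data source (here a throwaway in-memory database with invented rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (product_category TEXT, brand_name TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Audio", "BrandA", 10.0),
        ("Audio", "BrandB", 5.0),
        ("Audio", "BrandA", 2.5),
        ("TV", "BrandA", 30.0),
    ],
)

# Step 2: create the query that extracts the data.
rows = conn.execute(
    "SELECT product_category, brand_name, amount FROM sales"
).fetchall()
conn.close()

# Step 3: create the cube by summing the measure into each (category, brand) cell.
totals = defaultdict(float)
for category, brand, amount in rows:
    totals[(category, brand)] += amount
cube = dict(totals)
```

This is not what the wizard does internally; it only mirrors the same source, query, cube sequence in a form that can be run and inspected.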
  20. 20. DATA WAREHOUSING & DATA MARTS: How do Data Cubes relate to Data Warehousing & Data Marts? Are they the same? • Data Warehousing (DW) Definition • Pros/Cons of DW • Relation if any to Data Cubes • Data Marts (DM) Definition • Pros/Cons of DM • Relation if any to Data Cubes
  21. 21. DATA WAREHOUSING: What is a Data Warehouse? A DW contains historical data derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables an organisation to consolidate data from several sources for business users. (“Data Mining: Concepts & Techniques”, J. Han & M. Kamber)
  22. 22. DATA WAREHOUSING: “...The data warehouse is nothing more than the union of all the data marts...” (Ralph Kimball). “You can catch all the minnows in the ocean and stack them together and they still do not make a whale” (Bill Inmon).
  24. 24. DATA WAREHOUSING: Benefits: 1. Gives the data … 2. Removes … 3. Potential for … 4. Increased productivity … 5. Example: US Insurance Company, B. Shin 2001. Problems: 1. Increased … 2. Maintenance … 3. Complexity … 4. Required … 5. Ownership … 6. Duration …
  25. 25. DATA WAREHOUSING: Comparisons to Data Cubing: 1. Data cubes provide a … 2. Data cubes are used to … 3. From a design standpoint, it’s important to … 4. To put data in and get data out … 5. Some or all of these …
  26. 26. DATA MART: The single most important issue … A subset of a data warehouse that … Characteristics include: 1. Focuses on … 2. Do not normally … 3. More easily … How is a data mart different from a data warehouse? • A data warehouse, unlike a data mart … • They are essentially different architectural structures, even though when viewed from afar and superficially they look very similar. • Tumbleweed, oak tree example
  27. 27. DATA MART: Differences between Data Warehouse & Mart:
  28. 28. DATA MART
  29. 29. DATA MART: Benefits of creating a data mart: 1. To give users … 2. To improve … 3. Building a data mart … 4. The cost of implementing … Problems: 1. Functionality 2. Size 3. Load performance 4. Administration 5. Setup and configuration
  30. 30. DATA MART: Comparisons to Data Cubing: 1. The data mart is typically housed in multidimensional technology which is great for … 2. Data Cubing provides a solid base for … 3. Data Cubing gives end users … 4. “To me, a Data Mart is just place where data gets dumped in a relatively flat, unusable format. Data Cubes is taking that data and making it dance.” (B. Quinn, 2008)
  32. 32. RELATIONAL DATABASES: Data is stored in relations: tables with rows and columns. Records and fields in each table. Relationships between tables. “A shared repository of data” (Sarma, 2011)
  33. 33. OLTP: Online Transaction Processing. Data is processed immediately and is always kept current. Examples: banking, inventory, scheduling and reservation systems. Simple queries: insert, update, select. For complex queries, relational databases are unsuitable.
  34. 34. DATA WAREHOUSE: A large store of data accumulated from various databases. ETL process: • Extract data • Transform data • Data cleaning • Load data. A data cube is used for representing this data.
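The ETL process can be sketched as three small functions applied in sequence. The two source lists, their field names and the cleaning rules (trim and lower-case the region, drop rows with no amount) are assumptions made for illustration, and cleaning is folded into the transform step here for brevity.

```python
# Hypothetical raw rows from two source systems, with inconsistent formatting.
SOURCE_A = [{"region": " north ", "amount": "100"}, {"region": "South", "amount": "40"}]
SOURCE_B = [{"region": "NORTH", "amount": "25"}, {"region": "south", "amount": None}]


def extract():
    """Extract: gather rows from every source."""
    return SOURCE_A + SOURCE_B


def transform(rows):
    """Transform and clean: normalise region names and drop rows with no amount."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:
            continue  # data cleaning: skip incomplete records
        cleaned.append(
            {"region": row["region"].strip().lower(), "amount": int(row["amount"])}
        )
    return cleaned


def load(rows, warehouse):
    """Load: append the cleaned rows to the warehouse table."""
    warehouse.extend(rows)
    return warehouse


warehouse = load(transform(extract()), [])
```

Once the rows share one format, they can be aggregated into cube cells by region without special cases for each source.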
  35. 35. DIMENSIONS AND MEASURES: The multidimensional model is defined by a fact table and dimension tables. Measure attribute: saved from the relational database into the fact table; defines data in the MDM model. Metadata: describes all the pertinent aspects of the data in the database fully and precisely; required for sources from a relational database; determines the data inserted into the warehouse.
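A fact table with dimension tables can be sketched as follows: each fact row holds keys into the dimension tables plus a measure, and a question is answered with a single lookup from the fact row into one dimension. The table contents, key names and the `units_by` helper are hypothetical.

```python
# Hypothetical dimension tables: surrogate key -> descriptive attributes.
product_dim = {
    1: {"name": "laptop", "category": "computers"},
    2: {"name": "phone", "category": "mobile"},
}
time_dim = {
    10: {"year": 2012, "quarter": "Q1"},
    11: {"year": 2012, "quarter": "Q2"},
}

# Hypothetical fact table: one key per dimension plus the measure (units).
fact_sales = [
    {"product_id": 1, "time_id": 10, "units": 5},
    {"product_id": 2, "time_id": 10, "units": 8},
    {"product_id": 2, "time_id": 11, "units": 3},
]


def units_by(attribute, dim, key):
    """Sum the measure grouped by one attribute of one dimension (a single join)."""
    totals = {}
    for fact in fact_sales:
        label = dim[fact[key]][attribute]
        totals[label] = totals.get(label, 0) + fact["units"]
    return totals


by_category = units_by("category", product_dim, "product_id")
by_quarter = units_by("quarter", time_dim, "time_id")
```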
  36. 36. RELATIONAL VS. MULTI-DIMENSIONAL (relational database compared with multidimensional cube):
      1. Relational: complex, with different tables and relationships. Multidimensional: simple; a dimension table has a direct relationship with the fact table.
      2. Relational: flexible. Multidimensional: rigid.
      3. Relational: normalization common. Multidimensional: repetition allowed.
      4. Relational: OLTP; data updated frequently. Multidimensional: OLAP; minimum number of joins, which is provided in the multidimensional model by a single join to a fact table.
      5. Relational: data is stored in tables. Multidimensional: data is stored in cubes.
      6. Relational: table fields store actual data. Multidimensional: dimensions and measures store actual data.
      7. Relational: table size is measured in records. Multidimensional: cube size is measured in cell-sets.
      8. Relational: keywords. Multidimensional: questions or “verbiage”.
      9. Relational: fundamental business tasks. Multidimensional: planning, problem solving, decision making.
  37. 37. ONLINE ANALYTICAL PROCESSING (OLAP): “Multi-dimensional models lie at the core of OLAP” (Jensen, 2007). OLAP provides quick answers to queries that aggregate large amounts of data to find trends and patterns. It is well suited to multidimensional data organization. Specific questions: answers needed quickly.
  38. 38. SIMPLICITY AND CONSTRUCTION: "The central attraction of the dimensional model of a business is its simplicity.... that simplicity is the fundamental key that allows users to understand databases, and allows software to navigate databases efficiently." Measures have the same relationships, so they are easily analysed and displayed together. Those with little experience find that multidimensional model queries take only a short time to master.
  40. 40. MULTIDIMENSIONAL CUBE OVERVIEW - ADVANTAGES: • Tables: their nature and structure are no longer forced on the user. • Captures the health of the organisation and allows drill-down options. • Incorporates business rules automatically, without exposing them to users. • Automatic pre-populated data, saving time and resources.
  41. 41. MULTIDIMENSIONAL CUBE OVERVIEW - DRAWBACKS: • User misuse and misunderstanding. • Rigid and inflexible nature. • Too specific: manipulation of data. • Not suitable for ad hoc queries, unless within the dimensions of the “cube space”. [Figure: MOLAP and ROLAP placed on a chart of query performance (Good to OK) against analysis (Simple to Complex)]
  42. 42. MOLAP SERVER: [Diagram labels: user, query, MDDB server, periodic load, data warehouse] Advantages: • Performance-constrained environments. • Used in mission-critical operations. Disadvantages: • Inflexible and limited data allowance. • Unavailable data. • Specifics of summarised data.
  43. 43. ROLAP SERVER: [Diagram labels: user, query, server, cache, data cache, live fetch, data warehouse] Advantages: • Not limited by cube data (“live fetch”). • Maintains the functionality of a relational database. Disadvantages: • Inhibited performance on large databases. • Limitations imposed by SQL functionality.
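The trade-off between the periodic load of a MOLAP server and the live fetch of a ROLAP server can be sketched with a toy example. This is a simplification under stated assumptions: the warehouse is a plain list, the MOLAP store is a dictionary built once, and the function names are hypothetical; real servers are far more involved.

```python
# Hypothetical warehouse rows: (region, amount). All numbers are invented.
warehouse = [("north", 100), ("south", 40), ("north", 25)]


def aggregate(rows):
    """Sum amounts per region."""
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals


# MOLAP-style: totals are computed once at load time and stored in the cube.
molap_store = aggregate(warehouse)


def molap_query(region):
    """Answer from the precomputed store: fast, but stale until the next periodic load."""
    return molap_store.get(region, 0)


def rolap_query(region):
    """Answer by scanning the warehouse at query time (a live fetch): current, but slower."""
    return aggregate(warehouse).get(region, 0)


# A new row arrives after the periodic load has already run.
warehouse.append(("south", 10))
stale = molap_query("south")
live = rolap_query("south")
```

The precomputed answer misses the late row while the live fetch sees it, which illustrates the "unavailable data" drawback listed for MOLAP and the performance cost listed for ROLAP.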
  44. 44. We guarantee this presentation was made with 100% natural sources, 0% Wikipedia. THANK YOU FOR LISTENING. Any questions?