Analysis Services in SQL Server 2008


Presentation on Analysis Services in SQL Server 2008

Ing. Eduardo Castro Martinez, PhD
Microsoft SQL Server MVP

Published in: Technology

  1. Analysis Services in SQL Server® 2008
     Eduardo Castro, SQL MVP, MCDBA, MCSE, MCAD, MCSD
     Comunidad Windows Costa Rica
  2. Data Warehouse System Components
     [Diagram labels: Data Sources, Data Input, Staging Area, Data Warehouse, Data Marts, Data Access, User Data Access]
  3. Understanding Data Warehouse Design
     The Star Schema
     Fact Table Components
     Dimension Table Characteristics
     The Snowflake Schema
  4. The Star Schema
     Fact Table: Sales_Fact (TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, Sales Amount, Unit Sales ...)
     Dimension Tables:
       Employee_Dim (EmployeeKey, EmployeeID, ...)
       Product_Dim (ProductKey, ProductID, ...)
       Time_Dim (TimeKey, TheDate, ...)
       Shipper_Dim (ShipperKey, ShipperID, ...)
       Customer_Dim (CustomerKey, CustomerID, ...)
  5. Fact Table Components
     sales_fact Table (sample row):
       customer_key   product_key   time_key   quantity_sales   amount_sales
       201            25            134        400              10,789
     Foreign Keys: customer_key, product_key, time_key
     Measures: quantity_sales, amount_sales
     Dimension Tables (sample rows):
       customer_dim: 201, ALFI, Alfreds
       product_dim: 25, 123, Chai
       time_dim: 134, 1/1/2000
     The grain of the sales_fact table is defined by the lowest level of detail stored in each dimension
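To make the fact-table slide concrete, here is a minimal sketch that loads the slide's sample row into an in-memory SQLite database through Python's standard `sqlite3` module and joins the fact table to its three dimensions. SQLite stands in for SQL Server only so the example is self-contained. The non-key dimension column names (`customer_id`, `customer_name`, `product_id`, `product_name`, `the_date`) are assumptions, since the slide shows values but not column headings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one row per business entity (non-key column names are assumed).
CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY, customer_id TEXT, customer_name TEXT);
CREATE TABLE product_dim  (product_key  INTEGER PRIMARY KEY, product_id INTEGER, product_name TEXT);
CREATE TABLE time_dim     (time_key     INTEGER PRIMARY KEY, the_date TEXT);

-- Fact table: foreign keys to each dimension plus the numeric measures.
CREATE TABLE sales_fact (
    customer_key   INTEGER REFERENCES customer_dim(customer_key),
    product_key    INTEGER REFERENCES product_dim(product_key),
    time_key       INTEGER REFERENCES time_dim(time_key),
    quantity_sales INTEGER,
    amount_sales   INTEGER
);

-- Sample rows taken from the slide.
INSERT INTO customer_dim VALUES (201, 'ALFI', 'Alfreds');
INSERT INTO product_dim  VALUES (25, 123, 'Chai');
INSERT INTO time_dim     VALUES (134, '2000-01-01');
INSERT INTO sales_fact   VALUES (201, 25, 134, 400, 10789);
""")

# Resolve the keys to descriptive attributes and aggregate the measures.
row = conn.execute("""
    SELECT c.customer_name, p.product_name, t.the_date,
           SUM(f.quantity_sales), SUM(f.amount_sales)
    FROM sales_fact AS f
    JOIN customer_dim AS c ON c.customer_key = f.customer_key
    JOIN product_dim  AS p ON p.product_key  = f.product_key
    JOIN time_dim     AS t ON t.time_key     = f.time_key
    GROUP BY c.customer_name, p.product_name, t.the_date
""").fetchone()
print(row)
```

The fact row itself carries only keys and measures; everything a user would read in a report comes from the joined dimension rows.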
  6. Dimension Table Characteristics
     Describes Business Entities
     Contains Attributes That Provide Context to Numeric Data
     Presents Data Organized into Hierarchies
  7. The Snowflake Schema
     Defines Hierarchies by Using Multiple Dimension Tables
     Is More Normalized than a Single Table Dimension
     Is Supported within Analysis Services
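As a rough illustration of the snowflake idea, the sketch below splits a product dimension into two tables so that the category level of the hierarchy lives in its own table. It uses Python's `sqlite3` module as a stand-in for SQL Server. All table names, column names, and rows here are hypothetical and exist only for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical snowflaked product dimension: category is normalized out.
CREATE TABLE category_dim (category_key INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE product_dim (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category_key INTEGER REFERENCES category_dim(category_key)
);
CREATE TABLE sales_fact (
    product_key  INTEGER REFERENCES product_dim(product_key),
    amount_sales INTEGER
);

-- Made-up illustrative rows.
INSERT INTO category_dim VALUES (1, 'Beverages'), (2, 'Produce');
INSERT INTO product_dim  VALUES (10, 'Chai', 1), (11, 'Coffee', 1), (20, 'Apples', 2);
INSERT INTO sales_fact   VALUES (10, 100), (11, 50), (20, 30), (10, 20);
""")

# Rolling up to the category level needs one extra join compared with a star schema.
totals = conn.execute("""
    SELECT c.category_name, SUM(f.amount_sales)
    FROM sales_fact AS f
    JOIN product_dim  AS p ON p.product_key  = f.product_key
    JOIN category_dim AS c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
print(totals)  # [('Beverages', 170), ('Produce', 30)]
```

The trade-off is the usual one for normalization: less repeated text in the dimension, at the cost of an additional join per hierarchy level.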
  8. Defining OLAP Solutions
     OLAP Databases
     Common OLAP Applications
     Relational Data Marts and OLAP Cubes
     OLAP in SQL Server
  9. OLAP Databases
     Optimized Schema for Fast User Queries
     Robust Calculation Engine for Numeric Analysis
     Conceptual, Intuitive Data Model
     Multidimensional View of Data
       Drill down and drill up
       Pivot views of data
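The drill-down, drill-up, and pivot operations named on this slide all amount to re-aggregating the same facts by a different set of dimension levels. The following is a minimal pure-Python sketch of that idea, not a model of how Analysis Services works internally. The facts, the `COLUMNS` tuple, and the `aggregate` helper are made up for illustration.

```python
from collections import defaultdict

# Made-up illustrative facts: (year, quarter, product, sales amount).
FACTS = [
    (2008, "Q1", "Apples", 10),
    (2008, "Q1", "Grapes", 5),
    (2008, "Q2", "Apples", 7),
    (2008, "Q2", "Grapes", 3),
]
COLUMNS = ("year", "quarter", "product")  # positions of the dimension levels


def aggregate(levels):
    """Sum the sales measure grouped by the requested dimension levels."""
    idx = [COLUMNS.index(name) for name in levels]
    totals = defaultdict(int)
    for fact in FACTS:
        key = tuple(fact[i] for i in idx)
        totals[key] += fact[3]
    return dict(totals)


by_year = aggregate(["year"])                # drill up to the coarsest level
by_quarter = aggregate(["year", "quarter"])  # drill down one level
pivot = aggregate(["product", "quarter"])    # pivot: view by product and quarter

print(by_year)     # {(2008,): 25}
print(by_quarter)  # {(2008, 'Q1'): 15, (2008, 'Q2'): 10}
```

An OLAP engine precomputes or caches many of these groupings so that switching between them is fast, which is the point of the "optimized schema" bullet.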
  10. OLAP in SQL Server
      Microsoft Is One of Several OLAP Vendors
      Analysis Services Is Bundled with Microsoft SQL Server 2005 and SQL Server 2008
      Analysis Services Includes
        OLAP engine
        Data Mining technology
  11. Dimension Fundamentals
  12. Cube Measures
      Are the Numeric Values of Principal Interest
      Correspond to Fact Table Facts
      Intersect All Dimensions at All Levels
      Are Aggregated at All Levels of Detail
  13. Relational Data Sources
      Star and Snowflake Schemas
        Are required to build a cube with Analysis Services
      Fact Table
        Contains measures
        Contains keys that join to dimension tables
      Dimension Tables
        Must exist in the same database as the fact table
        Contain primary keys that identify each member
  14. Applying OLAP Cubes
      Defining a Cube
      Querying a Cube
      Defining a Cube Slice
      Working with Dimensions and Hierarchies
      Visualizing Cube Dimensions
      Connecting to an OLAP Cube
  15. Defining a Cube
      [Cube diagram with three dimensions]
      Market Dimension: Atlanta, Chicago, Denver, Detroit
      Products Dimension: Grapes, Cherries, Melons, Apples
      Time Dimension: Q1, Q2, Q3, Q4
  16. Querying a Cube
      [Cube diagram: Fact Sales]
      Markets Dimension: Atlanta, Chicago, Denver, Dallas
      Products Dimension: Grapes, Cherries, Melons, Apples
      Time Dimension: Q1, Q2, Q3, Q4
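One way to picture querying the cube on this slide is as a lookup keyed by one member from each dimension, where leaving a dimension unspecified means summing across it. The sketch below builds such a cube in plain Python using the member names from the slide. The cell values are placeholders generated from the member positions, not real sales figures, and the `query` helper is hypothetical.

```python
from itertools import product

MARKETS = ["Atlanta", "Chicago", "Denver", "Dallas"]
PRODUCTS = ["Grapes", "Cherries", "Melons", "Apples"]
QUARTERS = ["Q1", "Q2", "Q3", "Q4"]

# Placeholder Sales measure so each cell is distinguishable (not real data):
# hundreds digit = market position, tens = product position, units = quarter.
cube = {
    (m, p, q): 100 * (mi + 1) + 10 * (pi + 1) + (qi + 1)
    for (mi, m), (pi, p), (qi, q) in product(
        enumerate(MARKETS), enumerate(PRODUCTS), enumerate(QUARTERS)
    )
}


def query(market=None, prod=None, quarter=None):
    """Sum the Sales measure over every cell that matches the fixed members.

    A dimension left as None is aggregated across all of its members.
    """
    return sum(
        value
        for (m, p, q), value in cube.items()
        if market in (None, m) and prod in (None, p) and quarter in (None, q)
    )


cell = query("Dallas", "Apples", "Q4")                # a single cell
slice_q1_apples = query(prod="Apples", quarter="Q1")  # all markets for one product and quarter
grand_total = query()                                 # every cell in the cube

print(cell, slice_q1_apples, grand_total)  # 444 1164 17760
```

Fixing one member of a dimension and leaving the others free is what the agenda calls a cube slice.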
  17. SQL Server 2008 Analysis Services Enhancements
      Multidimensional Analysis with SQL Server Analysis Services
      Data Mining with SQL Server Analysis Services
  18. Multidimensional Analysis with SQL Server Analysis Services
      Cube Wizard
      Dimension Wizard
      The Attribute Relationship Designer
      The Aggregation Designer
      AMO Warnings
  19. Cube Wizard
      More efficient interface
      Create a cube based on a single de-normalized table
      Create a cube based on a data source that has only linked dimensions
  20. Dimension Wizard
      Create dimensions more efficiently
      Automatically detect parent-child hierarchies
      Provide safer default error configuration
      Set member properties while creating the dimension
  21. The Attribute Relationship Designer
      Flexible relationship
      Rigid relationship
      Manage relationships through right-click options in the Attribute Relationship pane
  22. The Aggregation Designer
      New Aggregation Designer:
        Aggregation designs are shown grouped by measure group
        New view available for manual aggregation design
      Improved Usage-Based Optimization Wizard:
        Ability to append new aggregations to an existing aggregation design
        Ability to modify storage settings for one or more partitions simultaneously
      Improved Aggregation Design Wizard
  23. AMO Warnings
      Uses familiar symbols such as the wavy underline and a yellow triangle with an exclamation point
      Non-intrusive warning messages appear when you pause the mouse over a warning
      Available for logical errors in database design, when users depart from design best practices, and for non-optimal aggregation designs
  24. Data Mining with SQL Server Analysis Services
      Improved Data Mining Wizard
      Separating Test and Training Data
      Filtering Model Cases
      Cross Validation of Mining Models
      Data Mining Add-Ins for Microsoft Office Systems
  25. Data Mining Enhancements
      Improve both short-term and long-term predictions with the Microsoft Time Series algorithm, which has been enhanced to support both ARTXP and ARIMA algorithms
      Enable drillthrough on a mining structure to allow queries about the cases used for both training and testing
      Create column aliases to make it easier to understand column content. In the Data Mining Designer, the alias appears in parentheses next to the column usage label
      Separate test and training data
      Improve performance and analyze different scenarios by using Model Case Filtering
      Create cross-validation reports
      Utilize Data Mining Add-ins for Office
  26. Separating Test and Training Data
      Three ways to partition data into training and test sets:
        The Data Mining Wizard
        Modifying the properties of the mining structure:
          If you did not create a test partition when you created the structure, you can modify the HoldoutMaxCases, HoldoutMaxPercent, and HoldoutSeed properties
        DMX statement, AMO, or XML DDL
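The three holdout properties named on this slide can be read as the inputs to a seeded random split. The sketch below is a plain-Python approximation of that idea and does not reproduce the server's actual sampling. The `split_holdout` function is hypothetical, a value of 0 is treated as "limit not set", and the rule that the smaller limit wins when both are set is an assumption you should check against the product documentation.

```python
import random


def split_holdout(cases, holdout_max_percent=0, holdout_max_cases=0, holdout_seed=0):
    """Return (training, testing) lists drawn from cases.

    Assumptions in this sketch: a limit of 0 means "not set", and when both
    limits are set the smaller resulting test-set size is used.
    """
    n = len(cases)
    limits = []
    if holdout_max_percent > 0:
        limits.append(n * holdout_max_percent // 100)
    if holdout_max_cases > 0:
        limits.append(holdout_max_cases)
    test_size = min(limits) if limits else 0

    # A fixed seed makes the partition repeatable across runs.
    rng = random.Random(holdout_seed)
    test_idx = set(rng.sample(range(n), test_size))

    training = [c for i, c in enumerate(cases) if i not in test_idx]
    testing = [c for i, c in enumerate(cases) if i in test_idx]
    return training, testing


cases = list(range(100))
training, testing = split_holdout(cases, holdout_max_percent=30,
                                  holdout_max_cases=20, holdout_seed=42)
print(len(training), len(testing))  # 80 20
```

Keeping the seed fixed matters when several models are compared against one another, because every model then sees the same training and test cases.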
  27. Filtering Model Cases
      Limit the cases used in a model based on any attribute included in the model
      Use to compare subsets of your data, such as different regions
  28. Cross-Validation of Mining Models
      Additional method to test and view mining model accuracy
      Data is partitioned into cross-sections, and models are trained and tested against each of the cross-sections in turn
      One portion of the data is used to test; the remaining data is used to train the model
      To define a cross-validation report, you must configure:
        The Fold Count, which specifies the number of folds or partitions that the data is broken into for testing
        The Max Cases, which specifies the maximum number of cases to use (a value of 0 specifies that all cases will be used)
        The Target Attribute, which defines the column or attribute that you want to predict
        The Target State, which defines a particular value that you want to analyze within the target attribute
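The fold mechanics described for cross-validation can be sketched in a few lines of Python. This is only an illustration of the partitioning idea under simple assumptions: cases are assigned to folds round-robin, the "model" is a majority-value baseline standing in for a real mining model, and the Target State setting is not modeled. The function names and the sample cases are hypothetical.

```python
from collections import Counter


def cross_validation_folds(cases, fold_count, max_cases=0):
    """Yield (training, testing) pairs, one per fold; max_cases=0 uses every case."""
    used = cases if max_cases == 0 else cases[:max_cases]
    for k in range(fold_count):
        testing = used[k::fold_count]  # every fold_count-th case, starting at offset k
        training = [c for i, c in enumerate(used) if i % fold_count != k]
        yield training, testing


def fold_accuracy(cases, fold_count, target_attribute, max_cases=0):
    """Score a majority-value baseline on each fold (a stand-in for a real model)."""
    scores = []
    for training, testing in cross_validation_folds(cases, fold_count, max_cases):
        # "Train": pick the most frequent value of the target attribute.
        majority = Counter(c[target_attribute] for c in training).most_common(1)[0][0]
        # "Test": count how often that value is right on the held-out fold.
        hits = sum(1 for c in testing if c[target_attribute] == majority)
        scores.append(hits / len(testing))
    return scores


# Made-up cases with a single hypothetical target attribute.
values = ["yes", "yes", "yes", "no", "yes", "yes", "yes", "yes", "no"]
sample_cases = [{"buyer": v} for v in values]
scores = fold_accuracy(sample_cases, fold_count=3, target_attribute="buyer")
print(scores)
```

Comparing the per-fold scores shows how stable a model is: widely varying scores suggest the result depends heavily on which cases happened to land in the training set.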
  29. Data Mining Add-Ins for Microsoft Office System
      Data Mining Client for Excel: create, test, and manage data mining projects within Excel 2007
      Table Analysis Tools for Excel: use powerful analysis tools such as analyzing key influencers, highlighting exceptions, and forecasting for data stored in spreadsheets
      Data Mining Templates for Visio: render decision trees, regression trees, cluster diagrams, and dependency nets in diagrams created in Visio 2007
      Important: Data Mining Add-ins for Microsoft Office System will be available for SQL Server 2008 when it releases to manufacturing.
  30. Demonstration
      Implementing Multidimensional Analysis
  31. Optimizing Storage
      What Are Sparse Columns?
      How to Compress Data and Backups
  32. What Are Sparse Columns?
      Eliminate the limitation of 1024 columns.
      Efficient way to manage object models that frequently contain numerous NULL values.
      CREATE TABLE products (product_num int, item_num int, price decimal(7,2), ..., color char(5) SPARSE, width float SPARSE...)
  33. How to Compress Data and Backups
      Data compression can be enabled on tables, indexes, or indexed views
      Different compression types can be configured on a per-partition basis
      The following data compression types can be defined:
        Row compression
        Page compression
      Backup compression:
        Normally decreases time required to perform a backup
        Managed with Transact-SQL, Backup Task, Maintenance Plan Wizard, and Integration Services Backup Database task
  34. Contact Information
      Eduardo Castro