Business Intelligence Portfolio
Pamela Staerker
firstname.lastname@example.org
http://www.linkedin.com/in/pstaerker
Summary
This portfolio contains samples from BI solutions developed using Microsoft SQL Server 2008 R2 and the Microsoft Business Intelligence toolset.
T-SQL Programming
MDX Programming
Integration Services ETL System (SSIS)
Analysis Services OLAP Database (SSAS)
Reporting Services (SSRS)
SharePoint BI Delivery: PerformancePoint Services | Reporting Services | Excel Services
Freight Allocation
Specification: Using the Northwind database Orders and [Order Details] tables, create a result set that allocates freight downward to all product line items, based on each product's dollars as a percentage of the dollars for the order as a whole. Validate this by summing the allocated freight. The grand total of the summed freight is $64,942.69.
Partial Result Set
To achieve the precise results needed for this allocation, it was necessary to CAST OrderTot as FLOAT, where OrderTot = UnitPrice * Quantity. While Quantity has an INTEGER data type, UnitPrice has a MONEY data type. The MONEY data type restricts any decimal to four non-expanding decimal places, which creates a problem for the allocation ratio. When floating point is used, the result is off by only 1/100,000,000 of a penny.
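The allocation logic can be sketched outside the database. The Python below is only an illustration, not the portfolio's T-SQL: the function name, the `ratio_places` parameter and the sample order are assumptions, and rounding the ratio to four places is a stand-in for fixed-precision arithmetic rather than an exact model of the MONEY type.

```python
def allocate_freight(freight, lines, ratio_places=None):
    """Spread one order's freight across its line items by dollar share.

    lines is a list of (unit_price, quantity) pairs. When ratio_places is
    set, each allocation ratio is rounded to that many decimals to imitate
    a fixed-precision type (an illustrative assumption, not MONEY itself).
    """
    line_totals = [float(price) * qty for price, qty in lines]
    order_total = sum(line_totals)
    allocated = []
    for total in line_totals:
        ratio = total / order_total
        if ratio_places is not None:
            ratio = round(ratio, ratio_places)
        allocated.append(freight * ratio)
    return allocated


# Hypothetical order: three equal line items sharing 30.00 of freight.
lines = [(10.0, 1), (10.0, 1), (10.0, 1)]
exact = allocate_freight(30.0, lines)
four_places = allocate_freight(30.0, lines, ratio_places=4)
print(sum(exact), sum(four_places))
```

With the ratio held to four decimal places, the three shares no longer add back up to the order's freight, which is the kind of drift the FLOAT cast avoids.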
Top Five Vendors
Specification: Use the AdventureWorks2008 database Purchasing.Vendor and Purchasing.PurchaseOrderHeader tables to show the top five vendors for 2003. Generate a ranking number for each vendor and show the data by quarter.
Complete Result Set
The native T-SQL PIVOT operator is used to display the data with the VendorRank on rows. A PIVOT request involves three logical processing phases, each with an associated element:
1) grouping phase
2) spreading phase
3) aggregation phase
Here the pivot table is grouped by VendorRank, with the OrderDate quarter spread on columns and the TotalDue to the vendor aggregated.
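The three phases can be mimicked in a few lines of Python. This is a rough sketch of the idea only; the function name and the sample rows are hypothetical and do not come from AdventureWorks2008.

```python
from collections import defaultdict


def pivot_by_quarter(rows):
    """rows is an iterable of (vendor_rank, quarter, total_due) tuples."""
    # Spreading phase: every group gets one cell per quarter, 1 through 4.
    table = defaultdict(lambda: {q: 0.0 for q in (1, 2, 3, 4)})
    for rank, quarter, total_due in rows:
        # Grouping phase: rows with the same rank share one output row.
        # Aggregation phase: TotalDue is summed into the matching cell.
        table[rank][quarter] += total_due
    return {rank: table[rank] for rank in sorted(table)}


# Hypothetical input rows.
sample = [(1, 1, 100.0), (1, 1, 50.0), (1, 3, 25.0), (2, 2, 10.0)]
for rank, quarters in pivot_by_quarter(sample).items():
    print(rank, quarters)
```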
DYNAMIC SQL
This is an example of using dynamic SQL when a pivoted result set is needed, but the number of columns to be created is not known in advance.
Partial Result Set
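The core of the technique is building the column list from the data before the statement is assembled. The sketch below shows that step in Python; the table and column names (`VendorTotals`, `OrderPeriod`) are hypothetical, and `quote_name` is a hand-written approximation of bracket quoting, not a call into SQL Server.

```python
def quote_name(name):
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + str(name).replace("]", "]]") + "]"


def build_pivot_sql(column_values):
    """Assemble a PIVOT statement whose columns come from the data."""
    cols = ", ".join(quote_name(v) for v in sorted(set(column_values)))
    return (
        "SELECT VendorRank, " + cols + " "
        "FROM VendorTotals "  # hypothetical source table
        "PIVOT (SUM(TotalDue) FOR OrderPeriod IN (" + cols + ")) AS p;"
    )


# Hypothetical period labels discovered at run time.
print(build_pivot_sql(["2003-Q2", "2003-Q1", "2003-Q1"]))
```

Quoting each generated name matters because the values come from data; inside T-SQL the assembled string would typically be run with `sp_executesql`.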
GET VENDOR PRODUCTS
This procedure gets the top @n vendors within a specified date range and ranks them by PurchaseOrderHeader.TotalDue DESC. The results are stored in the @VendorTable variable. For each of the vendors, the top @y products within the same date range are subsequently selected and ranked by (PurchaseOrderDetail.UnitPrice * PurchaseOrderDetail.OrderQty) DESC. The final result shows the top @n vendors and their top @y product sales.
Partial Result Set
Ranking is done using DENSE_RANK with the OVER clause. DENSE_RANK returns one more than the number of distinct ordering values that sort ahead of the current row, so tied rows share a rank and no rank numbers are skipped. The CROSS APPLY operator is used to return the products for the top vendors by evaluating the product query once for each row in the @VendorTable variable.
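Both ideas can be illustrated in Python. This is a sketch under assumptions: the function names and sample values are hypothetical, and the per-vendor loop only imitates what CROSS APPLY does with a correlated TOP query.

```python
def dense_rank_desc(values):
    """Rank values high to low; ties share a rank and no rank is skipped."""
    distinct = sorted(set(values), reverse=True)
    position = {value: i + 1 for i, value in enumerate(distinct)}
    return [position[value] for value in values]


def top_products_per_vendor(vendors, purchases, y):
    """purchases is a list of (vendor, product, amount) tuples.

    The inner query runs once per vendor row, as CROSS APPLY would do.
    """
    result = []
    for vendor in vendors:
        lines = [(p, amt) for v, p, amt in purchases if v == vendor]
        lines.sort(key=lambda line: line[1], reverse=True)
        for product, amount in lines[:y]:
            result.append((vendor, product, amount))
    return result


print(dense_rank_desc([500, 300, 500, 100]))
```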
USING T-SQL MERGE
Here the T-SQL MERGE statement, which is new in SQL Server 2008, is used to insert new records or update existing records in a production table from a staging table in a data warehouse scenario. The MERGE statement allows data to be inserted, updated or deleted based on conditional logic.
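The insert-or-update decision that MERGE makes for each staging row can be sketched with two dictionaries. The names and the keyed-dictionary layout are assumptions made for illustration; the real statement works on tables and can also delete.

```python
def merge_into(target, staging):
    """Apply staging rows to target; both map a business key to a row."""
    inserted = updated = 0
    for key, row in staging.items():
        if key not in target:
            target[key] = row   # like WHEN NOT MATCHED: insert the new row
            inserted += 1
        elif target[key] != row:
            target[key] = row   # like WHEN MATCHED with changes: update
            updated += 1
    return inserted, updated


# Hypothetical production and staging contents.
production = {1: "a", 2: "b"}
print(merge_into(production, {1: "a", 2: "B", 3: "c"}), production)
```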
TOP FIVE CITIES WITHIN TOP FIVE MONTHS
Using the Adventure Works cube, the top 5 cities were ranked within the top 5 months, based on the percent sales change over the prior month.
Complete Result Set
To get the top 5 cities within the top 5 months based on the % increase of internet sales over the previous month, the top 5 months ([RankedMonths]) were determined first using the TOPCOUNT function and then ranked using the RANK function. The GENERATE function, which builds a set by iterating over another set, was then applied to the current [RankedMonths] member, and another TOPCOUNT was used to find the top 5 customer cities within each of the top 5 months.
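The nested top-N pattern can be approximated in Python as an outer loop over the top months with an inner top-N per month. This is a loose sketch rather than MDX: all function names and the sample figures are hypothetical, and months or cities with no prior-period sales are simply left out of the ranking.

```python
def growth(current, previous):
    """Percent change over the prior period; None when there is no base."""
    if not previous:
        return None
    return (current - previous) / previous


def top_count(items, n, score):
    """Return the n highest-scoring items, ignoring items with no score."""
    scored = [(score(item), item) for item in items]
    ranked = sorted((p for p in scored if p[0] is not None),
                    key=lambda p: p[0], reverse=True)
    return [item for _, item in ranked[:n]]


def top_cities_within_top_months(sales, months, n_months, n_cities):
    """sales maps (month, city) to an amount; months is in calendar order."""
    cities = sorted({city for _, city in sales})
    previous = {m: months[i - 1] for i, m in enumerate(months) if i > 0}

    def month_total(month):
        return sum(sales.get((month, city), 0) for city in cities)

    def month_growth(month):
        if month not in previous:
            return None
        return growth(month_total(month), month_total(previous[month]))

    result = []
    # The outer iteration plays the role GENERATE plays in the MDX query.
    for month in top_count(months, n_months, month_growth):
        def city_growth(city, month=month):
            return growth(sales.get((month, city), 0),
                          sales.get((previous[month], city), 0))
        result.append((month, top_count(cities, n_cities, city_growth)))
    return result
```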
LAST 2 QUARTERS OF DATA FOR HAMBURG
This query uses the Adventure Works cube to show Hamburg, and the states that have the same parent as Hamburg, on rows. Columns show the last 2 available quarters of data for internet sales and the percent of geographical parent.
Complete Result Set
The significance of this query is that by using a FILTER for null data combined with the TAIL function, the last two quarters of available data for the current customer geography will always be returned.
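The filter-then-tail idea is small enough to show directly. In this Python sketch the function name and the quarter labels are hypothetical; a missing or `None` value stands in for an empty cube cell.

```python
def last_available(periods, values, n=2):
    """Return the last n periods, in order, that actually have data."""
    non_empty = [p for p in periods if values.get(p) is not None]
    return non_empty[-n:]


# Hypothetical quarters: Q3 is empty and Q4 has not been loaded yet.
quarters = ["Q1", "Q2", "Q3", "Q4"]
sales = {"Q1": 10, "Q2": 20, "Q3": None}
print(last_available(quarters, sales))
```

Filtering before taking the tail is what keeps trailing empty quarters from crowding out real data.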
CALCULATING COMMON TIME BASED METRICS
The Adventure Works cube is used here to show all customer country members across the columns. Rows show June and July of 2004, crossed with the following measures: Internet Sales Amount, Sales Amount Last Period, YTD Sales, YTD Sales for last year, Geographic % of Parent for Internet sales for the month and Geographic % of Parent for Internet sales for last year.
Time-based expressions shown here were combined with other time-based expressions to assemble more complex metrics. As an example, the [YTDLY] member used the MDX function PARALLELPERIOD with the [YTD Sales] member to determine the sales one year ago.
Complete Result Set
INTEGRATION SERVICES ETL SYSTEM (SSIS)
ALLWORKS CONSTRUCTION COMPANY PROJECT
AllWorks is a fictitious construction company that uses data stored in various formats as part of their enterprise system. Employee and Client Geography data, along with Overhead and Job Order master data, are stored in spreadsheets. The Material Purchases data is exported from an Oracle database into XML format, and Timesheet data is provided in .csv files.
For this project all data from the files were transformed and loaded to a normalized database using SSIS. The Package Flow Design shows the input data flow into the SSIS packages. Each SSIS package was named for the respective AllWorksOLTP database table that it loads.
AllWorksOLTP Database
PROJECT SOLUTION
The project solution contains all the data load packages as well as a Master package and a Database Maintenance package. The Master package uses the Execute Package Task to run each of the packages in the proper sequence based on database foreign key constraints, followed by the Database Maintenance package.
The Control Flow of each package generates an email notification for either the success or failure of the package. Success emails include counts of files processed: rows inserted, rows changed and invalid rows. Package configuration is used to dynamically update the database server, SMTP server and mailbox variables based on the current runtime environment.
TIMESHEET PACKAGE CONTROL FLOW
The Timesheet Package Control Flow uses a Foreach Loop Container to loop through a variable number of .csv timesheet files in a folder. A Script Task inside the loop container accumulates insert/update/error totals in variables, for each file and for the entire folder. Script Tasks are used to write either a success or a failure email, as applicable. The Send Mail Task is used to send the email to an SMTP server.
TIMESHEET DATA FLOW
The data for each record that moves through the Timesheet Data Flow pipeline is first converted from a .csv file data type to a SQL Server data type. The data is then checked using a Lookup Transformation to verify that ProjectID and EmployeeID are valid. A Conditional Split Transformation is used to make sure the project has not been closed. If the project is not closed and the ProjectID and EmployeeID are valid, then a Lookup Transformation is used to determine whether the employee timesheet already exists in the database. If it does not, the timesheet record is inserted; otherwise the timesheet record is updated. A Conditional Split is used so that only modified timesheet data is sent to the database. An OLE DB Destination is used to insert data. An OLE DB Command is used to update data.
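The routing decisions in this data flow can be summarized as one function per record. The Python below is a sketch under assumptions: the `WorkDate` and `Hours` fields, the composite key and the rule that any non-empty `ClosedDate` means the project is closed are all hypothetical simplifications.

```python
def route_timesheet(record, projects, employee_ids, existing):
    """Decide where one timesheet record goes.

    projects maps ProjectID to a dict holding ClosedDate, employee_ids is a
    set of valid IDs, and existing maps a timesheet key to its stored hours.
    """
    project = projects.get(record["ProjectID"])
    if project is None or record["EmployeeID"] not in employee_ids:
        return "invalid"        # a lookup found no match
    if project["ClosedDate"] is not None:
        return "invalid"        # conditional split: project is closed
    key = (record["EmployeeID"], record["ProjectID"], record["WorkDate"])
    if key not in existing:
        return "insert"         # would go to the OLE DB Destination
    if existing[key] != record["Hours"]:
        return "update"         # would go to the OLE DB Command
    return "unchanged"          # unmodified rows are not sent on


# Hypothetical reference data and one incoming record.
projects = {"P1": {"ClosedDate": None}, "P2": {"ClosedDate": "2006-03-31"}}
record = {"ProjectID": "P1", "EmployeeID": 7, "WorkDate": "2006-04-01", "Hours": 8}
print(route_timesheet(record, projects, {7}, {}))
```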
LOOKUP TRANSFORMATION
A Lookup Transformation, configured in the Lookup Transformation Editor, is used to validate the ProjectID and also to bring the Project Closed Date data into the data flow pipeline.
C# SCRIPTING FOR ACCUMULATING TOTALS
Microsoft Visual Studio Tools for Applications (VSTA) was accessed using the Script Task. C# scripting is used to accumulate the Timesheet record count totals. Variables used by the Script Task were first created in the SSIS package. Read-write variables were selected for use in the Script Task Editor from the SSIS package variables. These variables were then available within the code of the Script Task through the Dts.Variables collection, which Integration Services automatically creates and makes available to the script code.
MASTER PACKAGE
Execute Package Tasks with precedence constraints that mirror PK/FK database constraints were used to execute the individual packages in the ETL solution. Upon successful completion of the ETL, database maintenance is performed.
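Ordering loads so that parent tables come before the tables that reference them is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the table names are hypothetical and are not the actual AllWorksOLTP schema.

```python
from graphlib import TopologicalSorter


def package_order(foreign_keys):
    """foreign_keys maps each table to the set of tables it references.

    Returns a load order in which every referenced (parent) table appears
    before the tables that depend on it.
    """
    return list(TopologicalSorter(foreign_keys).static_order())


# Hypothetical dependencies: Timesheet references Employee and Project,
# and Project references Client.
dependencies = {
    "Timesheet": {"Employee", "Project"},
    "Project": {"Client"},
}
print(package_order(dependencies))
```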
DATABASE MAINTENANCE PACKAGE
The Database Maintenance Package shrinks the database, rebuilds the database indexes, updates the database statistics and backs up the database nightly following successful completion of the Master Package.
PRODUCTION SCHEDULE
Following deployment of the ETL packages to SQL Server, the Master Package was scheduled to run nightly using SQL Server Agent.
ANALYSIS SERVICES OLAP DATABASE (SSAS)
DATA SOURCE VIEW
SQL views were scripted in the AllWorks OLTP database to generate the dimensions and measures needed to create the AllWorks dimensional database in Analysis Services. In SSAS a data connection to the SQL Server relational database was established and a data source view (DSV) was created using the dimensional views. In the DSV, entity relationships and logical primary keys were identified.
The JobClosedDate in the vwDimProject view is used in a role-playing dimension, and therefore an entity relationship between vwDimProject.ClosedDateKey and vwDimDate.WeekendKey is needed.
A named calculation was created to provide a description for the Boolean value returned by the Employee Flag in the vwDimEmployee view.
ALLWORKS DW CUBE
A cube consisting of four measure groups and five dimensions was then generated using the Cube Wizard. The Date dimension is used in the cube twice, the second time as a role-playing dimension for the Project Closed date.
DIMENSION USAGE
Relationships in the cube between the dimensions and measures were verified by inspecting the Dimension Usage tab. A referenced relationship was manually created for the intersection between the Project Closed Date (role-playing dimension) and the Summary Measure Group.
This Dimension Usage tab from another cube shows a many-to-many relationship between BookSales and Authors. Each Author can have multiple Books for sale, and each Book may have more than one Author. This necessitates the intermediate cross-reference table, BookXAuthors.
DIMENSIONS
The Dimension Structure tab was used to specify each dimension’s attributes and user-defined hierarchies. Here a date hierarchy was built using the Year, Quarter and Weekend date.
Rigid attribute relationships were created to associate the attributes used in the Date Tree hierarchy. Setting the RelationshipType property determines whether Analysis Services creates rigid or flexible aggregations. After an incremental update, Analysis Services drops flexible aggregations and those aggregations must be manually reprocessed, but Analysis Services persists rigid aggregations, which improves query performance.
BROWSING THE CUBE
After the cube was deployed and processed successfully, dimensions and measures were selected from the metadata pane of the Cube Browser and the results were checked against the original OLTP database for accuracy. The cube can also be easily browsed using an Excel Pivot Table.
CALCULATIONS
MDX expressions were used to create calculated members and named sets. This calculated member computes the open receivables as a percent of invoice amount. It is also rendered here in an Excel pivot table, which uses it along with the Invoice Amount measure and the Open Receivables calculated member, for the All Clients named set in 2006, broken out by quarter.
KPIs
Key performance indicators (KPIs) were created using the newly deployed calculated members for the KPI value expression. In this KPI a traffic light is used to visually represent the Profit Percent metric, with a goal of greater than 15% profit. The KPI is displayed for all clients using MS Excel. Green indicates that the goal of greater than 15% profit for the client’s projects has been met. A yellow traffic light indicates that the profit percentage is between 5% and 15% inclusive. A red light indicates that the profit is less than 5%. No traffic light is displayed for clients who do not have any closed projects. While the KPI goal here is static, it can also be data driven.
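The status bands described above reduce to a short function. This Python sketch is only an illustration of the thresholds; the function name is hypothetical and `None` stands for a client with no closed projects.

```python
def profit_kpi_status(profit_pct):
    """Map a profit percentage (e.g. 12.5 for 12.5%) to a traffic light."""
    if profit_pct is None:
        return None          # no closed projects: no light is shown
    if profit_pct > 15:
        return "green"       # goal of more than 15% profit is met
    if profit_pct >= 5:
        return "yellow"      # between 5% and 15%, inclusive
    return "red"             # below 5%


for value in (22, 15, 5, 4.9, None):
    print(value, profit_kpi_status(value))
```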
ACTIONS
A report URL action was created so that the client application can execute a live Google Maps search based on the project’s county.
PARTITIONS AND AGGREGATIONS
Partitions were created for each fact table to separate current from historical data. MOLAP storage was specified, with aggregations designed for a 50% performance increase. Aggregations are precalculated summaries of data from leaf cells.
REPORTING SERVICES (SSRS)
MOVING AVERAGE OLAP REPORT
This report shows sales dollar revenue as a column bar and the 12 month moving average as a horizontal line. The user can select one or more years, and a product category, subcategory or product. The Fiscal Year dropdown excludes the first year of sales.
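A trailing moving average of this kind can be sketched in Python as follows. The function name and the sample values are hypothetical, and returning `None` until a full window is available is an assumption about how incomplete windows should be treated, not a statement about how the report handles them.

```python
def moving_average(values, window=12):
    """Trailing average over the last `window` values at each position.

    Positions without a full window of history yield None.
    """
    averages = []
    for i in range(len(values)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(sum(values[i + 1 - window:i + 1]) / window)
    return averages


# Hypothetical monthly sales, shown with a 2-month window for brevity.
print(moving_average([1, 2, 3, 4], window=2))
```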
SALES BY CATEGORY EXPLODED PIE CHART
This Exploded Pie Chart report shows sales proportions by category. Here the legend is hidden and an expression is used to label each pie slice with the Category, Sales Dollar Amount and the % of Total. The labels are configured to display outside of the pie for readability.
EMPLOYEE SALES QUOTAS W/ SPARKLINES & GAUGES
This is an employee sales matrix OLAP report. The gauges give an ‘at a glance’ indication of each employee’s sales in comparison to their quota, while the sparklines provide a visual representation of each employee’s sales vs quota trend.
TOP (N) PRODUCTS & TOP (Y) CITIES OLAP REPORT
This report allows the user to select the top N products by revenue, and within each product the top Y cities. The user can select one or more years. This report uses the MDX GENERATE function to achieve the TopY within the TopN.
SHAREPOINT BI DELIVERY (DASHBOARD)
CONTOSO RETAIL DASHBOARD
The Contoso Retail Dashboard project consists of six PerformancePoint pages deployed to SharePoint 2010. An additional SSRS report with a delivery subscription is also included.
PerformancePoint Content Site Collection Dashboard Pages
KPI SCORECARD W/PROFIT MARGIN HOTLINK REPORT
The first dashboard page contains two objective KPIs (Financial and I.T. Systems) and four KPIs (Product Gross Profit Margin, Channel Revenue, Returns Pct and Machine Downtime Trend). Each KPI has a “hot-link” report. The Financial KPIs show all Product Categories underneath. The KPI scorecard and associated charts were created using the PerformancePoint Dashboard Designer.
The Product Gross Profit Margin KPI is linked to the chart by the filter date, filter geography and product category selection. The hot-link chart shows monthly sales, gross profit margin and gross profit margin for the past year.
KPI SCORECARD W/SALES DRILL DOWN TO BRAND
Charts can be drilled down into for more granular information. This is a drill down for the TV and Video Product Category to see which Brands are being sold.
KPI SCORECARD W/CHANNEL SALES HOTLINK REPORT
The Channel Revenue KPI is linked to the chart by the filter date, filter geography and current product category selection. The hot-link chart shows the monthly sales quota amount, the sales amount and the sales amount last year.
KPI SCORECARD W/RETURNS % HOTLINK REPORT
The Returns Percent KPI is linked to the chart by the filter date, filter geography and current product category selection. The hot-link chart shows the sales return amount and the sales return percent for the ten stores with the most returns.
KPI SCORECARD W/HOTLINK SUPPORTING GRID
The Machine Downtime KPI contains a hot-link to an analytic grid that breaks down Machine Down Time, Machine Down Time Last Year and the Trend Outage Type by Fiscal Year and Geography.
PRODUCT SALES PROFILE WITH SIBLINGS REPORT
The second page of the Dashboard contains a PerformancePoint Analytic Chart of Product Monthly Sales for the selected Fiscal Year, along with the Product Sales as a percent of the product’s hierarchical parent. A Supporting Grid also displays the Monthly Product % of Parent Sales for all siblings of the Product selection.
EMPLOYEE PROFILE W/DRILL DOWN TO DECOMPOSITION TREE
The third page of the PerformancePoint Dashboard contains an Analytic Chart of Sales as a % of Quota, by employee and month for a year. A drill down to a decomposition tree for employee Kim Abercrombie is used to display her subordinates and their % of Sales Quota, as well as how they rank in relation to their peers. Decomposition trees can be used for root cause analysis.
SSRS SALES MAP AND RETURNS % REPORT
The interactive Sales Map was created in SSRS and deployed to SharePoint. Here PerformancePoint acts as a wrapper for the SQL Server Reporting Services report, so that the report can be displayed in the dashboard. The PerformancePoint Fiscal Year and Sort Option filters are linked to the map and chart, which show Sales Amount by State. Additionally, the chart plots the Sales Return Percent on a second vertical axis.
SSRS SALES MATRIX
The fifth page of the Dashboard is an SSRS Matrix report with two adjacent column groups for Sales by Retail Channel and Promotion, and row groups for Product Category/Subcategory with drilldown and % of total functionality.
EXCEL SALES REPORT WITH SLICERS
A Pivot Table with Slicers for Product Category/Subcategory, Sales Channel and Fiscal Year was created in Excel 2010. The Pivot Table shows Sales Amount, Sales Return Amount, Gross Profit Margin and Sales Return Percent by the Geography Hierarchy. The report was saved to SharePoint from Excel and then utilized by PerformancePoint to create the last page of the Contoso Retail Dashboard.
SSRS TOP SELLING STORES SUBSCRIPTION REPORT
This is a simple Reporting Services report deployed to SharePoint. The report was set up to run nightly for the states of Maryland and Virginia and be delivered to a SharePoint folder.
Pamela Staerker
SetFocus Master’s BI Program 2011-2012
Instructor: Kevin S. Goff, Microsoft SQL Server MVP
http://www.linkedin.com/in/pstaerker