The document provides examples of the author's business intelligence skills, including data modeling, SQL programming, SQL Server Integration Services, SQL Server Analysis Services, MDX programming, SQL Server Reporting Services, Excel Power Pivot, PerformancePoint Services, and SharePoint Services. Examples include data warehouse modeling, stored procedures, SSIS packages, SSAS cubes and MDX queries, SSRS reports, Power Pivot reports, PerformancePoint dashboards, and SharePoint dashboards.
This document provides an introduction to MapReduce and describes how to process stock market data using MapReduce. It explains how the data is split into input splits that are assigned to mappers. The custom MarketCapitalizationMapper and MarketCapitalizationReducer classes are used to calculate the market capitalization for each stock symbol by multiplying the stock price and volume. The mapper emits key-value pairs that are sorted and sent to reducers, and the reducer outputs the highest market cap for each symbol. Sample output is shown listing the market caps for different stock symbols.
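The map, shuffle/sort, and reduce phases the summary describes can be sketched as a toy simulation in plain Python. This is not Hadoop itself; the record values and field layout are invented for illustration, and the two functions only stand in for the MarketCapitalizationMapper and MarketCapitalizationReducer classes named above.

```python
from collections import defaultdict

# Toy stand-ins for the stock market input records (symbol, price, volume).
records = [
    ("ABC", 10.0, 1000),
    ("ABC", 12.0, 2000),
    ("XYZ", 50.0, 300),
]

def market_cap_mapper(record):
    """Like the mapper: emit a (symbol, price * volume) key-value pair."""
    symbol, price, volume = record
    return symbol, price * volume

def market_cap_reducer(symbol, caps):
    """Like the reducer: keep the highest market cap seen for a symbol."""
    return symbol, max(caps)

# Shuffle/sort phase: group mapper output by key, sorted by symbol.
grouped = defaultdict(list)
for key, value in map(market_cap_mapper, records):
    grouped[key].append(value)

result = dict(market_cap_reducer(k, v) for k, v in sorted(grouped.items()))
print(result)  # {'ABC': 24000.0, 'XYZ': 15000.0}
```

In real MapReduce the grouping is done by the framework between the map and reduce phases; the dictionary here only imitates that step.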
Drizzle's Approach To Improving Performance Of The Server, by Percona
The document discusses Drizzle's approach to database performance. It values open discussion, a focus on interfaces over implementations, avoiding "magic" code, and using standard libraries. It also discusses cleaning up Drizzle's codebase to make it easier for developers to contribute. The document then provides an example of complex code that "makes baby kittens cry" and could benefit from refactoring.
This progress report summarizes work on co-clustering methods for documents and words, including a new proposed method called CCAM (Co-clustering with Augmented Matrix). It introduces baseline K-means clustering, existing ITCC method, and the new CCAM method. Results are evaluated using both a results-based approach comparing cluster assignments, and a feature-based approach using cluster assignments as additional features. CCAM outperforms ITCC and baseline methods on average F-measure for ad and user clustering across different values of K clusters. Future work includes discretizing features and exploring different CCAM parameters.
In the second part of his OData session, Rainer Stropek shows how to develop custom OData providers. In an end-to-end example he demonstrates how to first build a LINQ provider, then an OData-compliant REST service on top of it, and how to access it from various programming languages and tools. The session assumes basic knowledge of OData and LINQ.
Spring Data is a framework that unifies access to data stores and repositories. It provides templates for data access, object mapping to data stores, and repository support with CRUD and query methods. Spring Data supports both relational and NoSQL databases. Repositories provide a common interface for data access while templates handle store-specific operations like queries. Object mapping annotations map domain objects to different data models. Spring configuration enables transactions and scans for repositories. Unit tests validate data access and repositories.
The document discusses Dynamic Data Exchange (DDE), an early Windows API that allows data sharing between applications; it outlines Business Information Server (BIS) support for DDE including initiating conversations, reading/writing data, and executing commands; examples are provided for using DDE between BIS and applications like Excel, Word, and Visual Basic.
This document describes how to create a custom DataMapper adapter for MongoDB. It discusses initializing the adapter, connecting to MongoDB, and implementing CRUD methods like create, read, update and delete. Methods are provided to parse DataMapper query conditions to MongoDB query formats, handle associations, and apply field and collection naming conventions. The adapter subclasses DataMapper::Adapters::AbstractAdapter and implements adapter-specific behavior while retaining compatibility with DataMapper APIs.
The document discusses Relational Database Management Systems (RDBMS). It defines key concepts such as data, database, DBMS, RDBMS and provides examples of how data is structured in tables with rows and columns. It also summarizes common RDBMS features like SQL queries, data types, integrity constraints, functions and joins. Overall, the document provides a high-level overview of RDBMS components and functionality.
The document provides details of an SSIS/ETL project for a company called AllWorks, Inc. It includes objectives to design a normalized database, create the database and tables in SQL, develop SSIS packages to extract data from source files and load it into the database tables, and deploy/schedule the packages. Sections include analysis of source data, design of a relational schema in third normal form, and scripting the creation of the database and tables.
This document provides an introduction to MDX (Multidimensional Expressions), which is the query language used for multidimensional databases. It explains that MDX allows for very concise queries with relatively complex results, such as year-to-date, rolling averages, and net performance. The document also discusses key MDX concepts like cubes, measures, tuples, and sets. It provides an example MDX query and demonstrates how to use functions like WITH, MEMBER, TOPCOUNT, and WHERE to retrieve specific data from an Analysis Services cube.
This document contains 10 Transact-SQL queries that analyze order data from the AdventureWorks2008 sample database. The queries include wildcard searches, grouping with HAVING clauses, correlated subqueries, outer joins, unions, stored procedures, common table expressions, pivoting, and ranking functions. The queries provide summaries of order details by product name, counts of orders by product subcategory, lists of vendors with no orders in 2003, summaries of freight charges by shipper, and more.
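Two of the techniques listed above, grouping with a HAVING clause and a ranking function, can be illustrated against an in-memory SQLite database from Python. The AdventureWorks2008 tables are not reproduced here; the `orders` table and its values are invented stand-ins, and SQLite's dialect differs from Transact-SQL in places.

```python
import sqlite3

# A toy stand-in for AdventureWorks order data.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (product TEXT, qty INT, freight REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    ("Chain", 2, 10.0), ("Chain", 5, 12.5),
    ("Fork", 1, 3.0), ("Pedal", 9, 20.0),
])

# Grouping with a HAVING clause: products ordered more than twice in total.
having_rows = con.execute(
    "SELECT product, SUM(qty) FROM orders "
    "GROUP BY product HAVING SUM(qty) > 2 ORDER BY product"
).fetchall()
print(having_rows)  # [('Chain', 7), ('Pedal', 9)]

# A ranking function: DENSE_RANK over total freight charges per product,
# in the spirit of the freight-by-shipper summary described above.
rank_rows = con.execute(
    "SELECT product, DENSE_RANK() OVER (ORDER BY SUM(freight) DESC) "
    "FROM orders GROUP BY product"
).fetchall()
```

Window functions such as DENSE_RANK require SQLite 3.25 or later, which ships with all recent Python builds.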
The portfolio describes projects completed as part of a Business Intelligence Masters Program focusing on Microsoft SQL Server and Visual Studio. It includes:
1) An advanced Transact SQL project involving 10 complex queries on a SQL Server database.
2) A SQL Server Integration Services project to extract, transform, and load data from Excel, CSV and XML files into a SQL Server database for a construction company.
3) SQL Server Analysis Services and Reporting Services projects to analyze and report on the construction company data in the SQL Server database.
The portfolio demonstrates skills in Transact SQL, SQL Server Integration Services, Analysis Services and Reporting Services gained through hands-on projects in the intensive Masters Program.
Intershop Commerce Management with Microsoft SQL Server, by Mauro Boffardi
This document discusses Intershop Commerce Management's support for Microsoft SQL Server and Azure SQL Database as operational databases. Key points include:
- Intershop Commerce Management version 7.10 now supports Microsoft SQL Server and Azure SQL Database in addition to Oracle Database.
- Microsoft SQL Server and Azure SQL Database provide features for business intelligence, advanced analytics, data management, and machine learning.
- Organizations have options to use SQL Server on-premises, Azure SQL Database on Azure, or let Intershop manage the database through their commerce-as-a-service offering.
- The document outlines the steps taken to migrate an existing Intershop implementation from Oracle to Microsoft SQL Server, including
The document discusses using R to analyze and visualize Oracle database metrics and statistics in real time. It provides examples of R code to connect to an Oracle database and retrieve system statistics and wait event data. The code then computes changes from the previous snapshot and graphs metrics over time, including system statistics by interval, wait times and events, and wait class distributions. It also describes splitting the screen into multiple graphs to show various views of the real-time data. The goal is to build interactive dashboards to monitor database performance using R.
The document provides details about Kevin Bengtson's SQL portfolio, including several database projects and T-SQL query projects with examples. It also outlines SQL Server administrative tasks performed and an SSIS/SSRS project involving creating a MiniAdventureWorks database. The final section describes a BlockFlix database designed for a video rental store.
The document discusses MongoDB transactions and concurrency. It provides code examples of how to perform transactions in MongoDB using logical sessions, including inserting a document into a collection and updating related documents in another collection atomically. It also discusses some of the features and timeline for implementing distributed transactions in sharded MongoDB clusters.
This document discusses replacing the use of $GLOBALS['TYPO3_DB'] with Doctrine DBAL for database queries in TYPO3 extensions. Doctrine DBAL provides a database abstraction layer that supports multiple database vendors, whereas $GLOBALS['TYPO3_DB'] only supports MySQL. Migrating to Doctrine DBAL offers benefits like a more reliable industry standard and an easier API. The document provides examples of common queries like select, insert, and update using the Doctrine query builder and highlights best practices for security and restrictions. $GLOBALS['TYPO3_DB'] will be removed in TYPO3 8 LTS, so extensions need to migrate to Doctrine DBAL.
This document discusses cost-based query optimization in Apache Hive. It provides background on the author and describes how cost-based optimization is being incrementally introduced in Hive using the Apache Optiq query planning framework. The initial focus is on join reordering using join cardinality costing. Examples are provided showing performance improvements on TPC-DS queries when the cost-based optimizer is used to reorder joins compared to the rule-based optimizer in Hive. Future work is needed to support more query constructs and scale to larger queries.
Cost-based query optimization in Apache Hive, by Julian Hyde
Tez is making Hive faster, and now cost-based optimization (CBO) is making it smarter. A new initiative in Hive 0.13 introduces cost-based optimization for the first time, based on the Optiq framework.
Optiq’s lead developer Julian Hyde shows the improvements that CBO is bringing to Hive 0.13. For those interested in Hive internals, he gives an overview of the Optiq framework and shows some of the improvements that are coming to future versions of Hive.
The document discusses new developer features introduced in SQL Server 2012-2016, including SSDT tools, T-SQL improvements like THROW and sequences, in-memory OLTP, common table expressions, and features in SQL Server 2016 such as dynamic data masking, row-level security, always encrypted, temporal tables, and JSON support. SQL Server 2016 also introduced the DROP IF EXISTS statement to drop objects and the ability to insert rows using merge statements with common table expressions.
Database Development Replication Security Maintenance Report, by nyin27
The document discusses various database administration tasks including:
1. Creating stored procedures, functions, views and indexes
2. Configuring security using roles, permissions and encryption
3. Implementing database maintenance including backups, jobs, partitioning and monitoring
4. Setting up reports and notifications
Understand when to use user-defined functions in SQL Server (TechRepublic), by Kaing Menglieng
User-defined functions (UDFs) in SQL Server allow users to define custom functions that can accept parameters and return values. There are two main types of UDFs - table-valued functions that return results in a table that can be queried, and scalar-valued functions that return a single value. The document provides examples of creating both types of UDFs and using them to return sales data from a sample SalesHistory table based on input parameters.
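The scalar flavor of a UDF can be illustrated outside SQL Server: the sketch below registers a custom scalar function in SQLite through Python's `sqlite3` module. The `SalesHistory` column layout and the `line_total` function are illustrative assumptions, not the T-SQL examples from the document, and SQLite has no direct equivalent of a table-valued function.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE SalesHistory (product TEXT, price REAL, qty INT)")
con.executemany("INSERT INTO SalesHistory VALUES (?, ?, ?)",
                [("Bike", 500.0, 2), ("Helmet", 50.0, 3)])

# A scalar UDF: accepts parameters and returns a single value per row,
# analogous to a scalar-valued function in SQL Server.
def line_total(price, qty):
    return round(price * qty, 2)

con.create_function("line_total", 2, line_total)

rows = con.execute(
    "SELECT product, line_total(price, qty) FROM SalesHistory ORDER BY product"
).fetchall()
print(rows)  # [('Bike', 1000.0), ('Helmet', 150.0)]
```

In SQL Server the same idea is expressed in T-SQL with CREATE FUNCTION and a RETURNS clause; table-valued functions additionally declare RETURNS TABLE so the result can be queried like a table.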
The document is a business intelligence portfolio for Chris Bull containing examples of his skills and experience in areas such as data modeling, SQL programming, SQL Server Integration Services, SQL Server Analysis Services, MDX programming, SQL Server Reporting Services, PerformancePoint Server, and SharePoint Server. It includes samples of his work developing ETL processes, cubes, reports, and other BI solutions. It also provides a summary of his 14 years of IT experience and 2 recommendations from academic references.
U-SQL Query Execution and Performance Tuning, by Michael Rys
This 400-level presentation explains U-SQL query execution in Azure Data Lake and provides several performance tuning tips: what tools are available and some best practices.
This document provides SQL code to populate a date dimension table in a data warehouse. It includes code to create lookup tables for day and month names in multiple languages. It then shows code to create the date dimension table with over 50 date-related fields. Finally, it includes a stored procedure that loops through a date range, inserts rows into the date dimension table, and calculates values for fields like week, month, quarter, and translations based on the date. The code is designed to populate the date dimension table in a standardized way to support date-based analysis in business intelligence projects.
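The loop-and-insert pattern the summary describes can be sketched in Python. This is a minimal sketch covering only a handful of the 50+ fields; the actual table layout, language lookup tables, and stored procedure are defined in the document's SQL, and the field names below are assumptions.

```python
from datetime import date, timedelta

def date_dimension_rows(start, end):
    """Yield one row per day with derived fields, mirroring the
    loop that inserts rows into the date dimension table."""
    d = start
    while d <= end:
        yield {
            "date_key": d.strftime("%Y%m%d"),
            "day_name": d.strftime("%A"),
            "month": d.month,
            "quarter": (d.month - 1) // 3 + 1,  # calendar quarter from month
            "year": d.year,
        }
        d += timedelta(days=1)

rows = list(date_dimension_rows(date(2008, 1, 1), date(2008, 1, 3)))
print(rows[0]["date_key"], rows[0]["quarter"])  # 20080101 1
```

In the SQL version, each computed dictionary entry corresponds to a column value calculated inside the stored procedure before the INSERT, and the multilingual day/month names come from joins to the lookup tables.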
Review this presentation to learn what it means to support a spatial database, and start to see the power of answering spatial questions inside a Postgres database.
Object-relational mapping (ORM) tools address the impedance mismatch between object-oriented programming and relational databases. Hibernate is a popular open-source Java ORM that uses object mapping files to define object-database relationships. It provides an object persistence mechanism and query language to retrieve and manipulate objects while insulating developers from vendor-specific SQL. Hibernate supports inheritance, caching, and concurrency control to improve performance but can introduce complexity.
Running Intelligent Applications inside a Database: Deep Learning with Python..., by Miguel González-Fierro
In this talk we present a new paradigm of computation where the intelligence is computed inside the database. Standard software systems must pull the data out of the database to execute a routine; if the data is large, the data movement itself becomes a source of inefficiency. Stored procedures tried to solve this issue in the past by allowing simple functions to be computed inside the database, but only simple routines can be executed.
To showcase the capabilities of our new system, we created a lung cancer detection algorithm using Microsoft’s Cognitive Toolkit, also known as CNTK. We used transfer learning between ImageNet dataset, which contains natural images, and a lung cancer dataset, which contains scans of horizontal sections of the lung for healthy and sick patients. Specifically, a pretrained Convolutional Neural Network on ImageNet is used on the lung cancer dataset to generate features. Once the features are computed, a boosted tree is applied to predict whether the patient has cancer or not.
All of this is computed inside the database, so data movement is minimized. We are even able to execute the algorithm using the GPU of the virtual machine that hosts the database. Using a GPU, we can compute the featurization in less than 1 hour, in contrast to a CPU, which would take up to 32 hours. Finally, we set up an API to connect the solution to a web app, where a doctor can analyze the images and get a prediction for a patient.
4. AllWorks Data Warehouse
A data warehouse model must consider both the available data and the reporting needs to be supported.
5. Data Modeling Alternatives
Even a small data warehouse might have alternate designs that need to be tested to confirm they support the needs of SSAS or SSRS development.
7. A Table Value Function
The function returns a product and the price as of a given date.

IF EXISTS (SELECT * FROM sys.objects
           WHERE object_id = OBJECT_ID(N'[dbo].[GetStandardCostByDate]')
             AND type in (N'FN', N'IF', N'TF', N'FS', N'FT'))
    DROP FUNCTION [dbo].[GetStandardCostByDate]
GO
CREATE FUNCTION dbo.GetStandardCostByDate
    (@ProdNum NVARCHAR(25), @CostDate AS DATETIME)
RETURNS TABLE
AS
RETURN
    SELECT PP.ProductID
         , PP.ProductNumber
         , PP.Name
         , ROUND(PCH.StandardCost,2) AS RoundedStandCost
    FROM Production.ProductCostHistory PCH
    JOIN Production.Product PP
      ON PCH.ProductID = PP.ProductID
    WHERE StartDate < DATEADD(DAY,1,@CostDate)
      AND PP.ProductNumber = @ProdNum
      AND ( PCH.EndDate IS NULL
         OR PCH.EndDate >= @CostDate )
GO

The function can be used to return the price for a single item:

DECLARE @ProdNum NVARCHAR(25) = 'BK-R89R-58'
      , @CostDate DATETIME = '1/1/2008'
SELECT * FROM dbo.GetStandardCostByDate (@ProdNum, @CostDate)

. . . or it could be used to return a list based on additional selection criteria:

SELECT GET.ProductNumber
     , GET.Name
     , GET.RoundedStandCost
FROM Production.Product PP
CROSS APPLY dbo.GetStandardCostByDate (PP.ProductNumber, GETDATE()) GET
WHERE GET.RoundedStandCost >= 1500
ORDER BY GET.RoundedStandCost DESC

Both uses are potentially valuable in .NET and web-based applications.
8. A Pivot Table with Dense Rank Query
Pivot tables can be produced in many of the BI layers beginning with SQL.
This example provides output that is useful in reporting and applications.
;WITH SaleByShipper AS
( SELECT DATEADD(DAY, 7 - DATEPART(WEEKDAY,OrderDate),OrderDate) AS WeekEnding
, ShipMethodID
, SUM(TotalDue) AS ShipperTotal
FROM Purchasing.PurchaseOrderHeader POH
WHERE YEAR(DATEADD(DAY, 7 - DATEPART(WEEKDAY,OrderDate),OrderDate)) = 2007
GROUP BY DATEADD(DAY, 7 - DATEPART(WEEKDAY,OrderDate),OrderDate)
, ShipMethodID
)
SELECT WeekEnding
, [1] AS XRQ
, [2] AS ZY
, [3] AS OVERSEAS
, [4] AS OVERNIGHT
, [5] AS CARGO
INTO #Pivot
FROM SaleByShipper
PIVOT ( SUM(ShipperTotal) FOR ShipMethodID IN ([1], [2], [3], [4], [5])) AS ShipDol
SELECT TOP 5 REPLACE(CONVERT(CHAR(10),WeekEnding,111),'/','-') AS WeekEnding
, ISNULL(XRQ,0) + ISNULL(ZY,0) + ISNULL(OVERSEAS,0) + ISNULL(OVERNIGHT,0) + ISNULL(CARGO,0) AS GrandTotal
, DENSE_RANK() OVER (ORDER BY ISNULL(XRQ,0) + ISNULL(ZY,0) + ISNULL(OVERSEAS,0) + ISNULL(OVERNIGHT,0) + ISNULL(CARGO,0) DESC) AS [Rank]
, XRQ
, ZY
, OVERSEAS
, OVERNIGHT
, CARGO
FROM #Pivot
ORDER BY [Rank] -- without an ORDER BY, TOP 5 would return an arbitrary five rows
9. Segments of a Stored Procedure
Producing a dual-ranked output that has application uses for reporting.

In a stored procedure, a temporary table is built using a sub-query to create a vendor ranking, and a similar query on Products is joined in a final query to provide a combined ranked listing of vendors and products.

SELECT * INTO #Vendor
FROM
( SELECT VEN.BusinessEntityID
       , VEN.Name AS VendorName
       , DENSE_RANK() OVER (ORDER BY SUM(TotalDue) DESC) AS VendorRank
       , SUM(POH.TotalDue) AS TotalDue
  FROM Purchasing.Vendor VEN
  JOIN Purchasing.PurchaseOrderHeader POH
    ON POH.VendorID = VEN.BusinessEntityID
  WHERE POH.OrderDate >= @StartDate
    AND POH.OrderDate < DATEADD(DAY,1,@EndDate)
  GROUP BY VEN.BusinessEntityID
         , VEN.Name
) vendalias
WHERE VendorRank <= @TopVend

SELECT V.VendorName
     , V.VendorRank
     , V.TotalDue
     , P.ProductName
     , P.ProductRank
     , P.ProductTotalDue
FROM #Vendor V
JOIN #Product P
  ON P.VendorID = V.BusinessEntityID
ORDER BY V.VendorRank
       , P.ProductRank

The procedure is executed as:

EXEC dbo.TopProductForTopVendorSales
     @TopVend = 3, @TopProd = 3,
     @StartDate = '1/1/2007',
     @EndDate = '6/30/2008'
GO
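The #Product temp table used in the final join is not shown among these segments. A plausible sketch of how it could be built, assuming AdventureWorks-style purchase detail tables and the @TopProd parameter from the EXEC example (the table and column choices here are my assumptions, not taken from the slide):

```sql
-- Hypothetical reconstruction: rank each vendor's products by purchase
-- dollars within the date range, keeping the top @TopProd per vendor.
SELECT * INTO #Product
FROM
( SELECT POH.VendorID
       , PP.Name AS ProductName
       , DENSE_RANK() OVER (PARTITION BY POH.VendorID
                            ORDER BY SUM(POD.LineTotal) DESC) AS ProductRank
       , SUM(POD.LineTotal) AS ProductTotalDue
  FROM Purchasing.PurchaseOrderHeader POH
  JOIN Purchasing.PurchaseOrderDetail POD
    ON POD.PurchaseOrderID = POH.PurchaseOrderID
  JOIN Production.Product PP
    ON PP.ProductID = POD.ProductID
  WHERE POH.OrderDate >= @StartDate
    AND POH.OrderDate < DATEADD(DAY,1,@EndDate)
  GROUP BY POH.VendorID
         , PP.Name
) prodalias
WHERE ProductRank <= @TopProd
```

The PARTITION BY restarts the ranking for each vendor, which is what lets the final query list the top products per top vendor.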
10. SQL SERVER INTEGRATION SERVICES (SSIS)
11. The AllWorks Construction Company
A series of real-world projects was undertaken for a fictional construction firm. After creation of the database, a SQL Server Integration Services package was created to process the data.

The first step was the creation of the relational database using SQL. This was followed by the development of an ETL process using SQL Server Integration Services. The process was deployed to SQL Server Agent.

The deployed SSIS process is designed to do an initial load of the data as well as to be used for regular, ongoing processing of data. The sources included Excel, CSV, and XML files.
12. Processing Employee Time Records
This process is used for the initial load of time records, but is designed for scheduled processing that can be done daily or at any selected frequency. The process validates Projects and Employees and produces a single report of invalid records. Records that pass validation are checked for late time, and a separate late-time report is produced. Only records passing all three tests are inserted, and reports are sent by e-mail for invalid records and late time. Multiple source files are received and processed in a ForEach Loop in the Control Flow.
Data Flow
Control Flow
13. Master ETL Control Flow
A master Control Flow was used to ensure that files were processed in an order that loads primary keys before they are used as foreign keys. The process began with control tables and ended with secondary transaction tables. Once processing is complete, database maintenance tasks are done in a final step.
14. SQL SERVER ANALYSIS SERVICES (SSAS)
15. Browsing the All Works Cube Data
The AllWorks OLTP database was staged to provide the OLAP cube shown here in the browser. One of the features of the cube is the ability to click on a county and open a map centered on that county.
16. Creating A Staging Area for OLAP Deployment
A copy of the OLTP Database was loaded to AllWorksDW and T-SQL Programs were
used to create and load Fact and Dimension tables.
This query creates a function to find the week-ending date. A similar function to find the week-ending key was also created.

CREATE FUNCTION [dbo].[WeekEndingDate]
    ( @InputDate date ) -- 9-20-2010
RETURNS Date
AS
BEGIN
    DECLARE @ReturnDate DATE
    SET @ReturnDate =
        dateadd(day, ( @@DateFirst - datepart(weekday,@InputDate)), @InputDate)
    RETURN @ReturnDate
END
GO

Fact and Dimension tables were created using a combination of views and tables created in stored procedures. This query was used in a stored procedure to create the Date Dimension:

;WITH EndDateCTE AS
(
    SELECT DISTINCT [dbo].[WeekEndingDate](JobClosedDate) AS Dates FROM dbo.JobMaster
    UNION
    SELECT DISTINCT [dbo].[WeekEndingDate](WorkDate) FROM dbo.JobTimeSheets
    UNION
    SELECT DISTINCT [dbo].[WeekEndingDate](PurchaseDate) FROM dbo.JobMaterialPurchases
    UNION
    SELECT [dbo].[WeekEndingDate]('Oct 2, 2004') -- DEFAULT JobCloseDate
)
INSERT INTO [AllWorksDW].[dbo].[DimDate]
SELECT dbo.WeekEndingKey(Dates)
     , CONVERT(VARCHAR(12), Dates, 107)
     , CAST(CAST(YEAR(Dates) AS varchar(4)) + CAST(DATEPART(qq,Dates) AS varchar(1)) AS INT)
     , 'Q' + CAST(DATEPART(Q,Dates) AS varchar(1)) + '-' + CAST(YEAR(Dates) AS varchar(4))
     , YEAR(Dates)
FROM EndDateCTE
WHERE Dates > '12/31/2003'
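The companion week-ending-key function referenced by dbo.WeekEndingKey(Dates) is not shown on the slide. A minimal sketch, assuming the key is simply the week-ending date rendered as a yyyymmdd integer (an assumption for illustration, not the author's actual definition):

```sql
-- Hypothetical sketch: surrogate date key as a yyyymmdd integer
CREATE FUNCTION [dbo].[WeekEndingKey]
    ( @InputDate date )
RETURNS INT
AS
BEGIN
    -- CONVERT style 112 formats a date as yyyymmdd
    RETURN CONVERT(INT, CONVERT(CHAR(8), dbo.WeekEndingDate(@InputDate), 112))
END
GO
```

A yyyymmdd integer key is a common convention for date dimensions because it sorts chronologically and is human-readable.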
17. Developing A Cube
The AllWorksOLAP cube build included basic structure, dimension usage, calculated members and sets, KPIs, Actions, Partitions, Aggregations, and Perspectives.
20. Useful Simple MDX Queries
A query producing a weekly moving average:

WITH SET [Weeks] AS
    FILTER ([Dates].[Calendar Tree].[Weekend], [Overhead Cost] <> null)
MEMBER [MovingXWeekAvg] AS
    AVG ( LastPeriods ( 52
                      , [Dates].[Calendar Tree].CurrentMember )
        , ([Overhead Cost]) )
    , FORMAT_STRING = 'Standard'
SELECT {[Overhead Cost],[MovingXWeekAvg]} ON COLUMNS,
       [Weeks] ON ROWS
FROM [Overhead]

Comparing this year to last year:

WITH
MEMBER [Labor Hrs Last Year] AS
    ( [Worked Hours]
    , parallelperiod ( [Dates].[Calendar Tree].[Quarter]
                     , 4
                     , [Dates].[Calendar Tree].CurrentMember ) )
SELECT [Dates].[Quarter] * { [Worked Hours], [Labor Hrs Last Year] }
       ON COLUMNS,
       NON EMPTY [Labor].[Employee Name].Children
       ON ROWS
FROM [Labor]
WHERE [Dates].[Calendar Tree].[Q4-2005]

Some really simple queries produce interesting, useful results, like this TOPPERCENT query:

SELECT [Invoice Amount] ON COLUMNS,
       TOPPERCENT ([Projects].[Project].children, 30, [Invoice Amount]) ON ROWS
FROM [Global]
21. A More Complex MDX Example
This much more complex example calculates the percentage increase in sales over the prior month. It identifies the months having the greatest increase and ranks these. It evaluates cities within each of these months for sales increases and ranks those. Finally, the query reports the top 3 months and the top 3 cities within those months, and reports the amount of sales for the month and the prior month, the percentage increase, the ranking of the months, and the ranking of the cities within those months.

WITH MEMBER [Internet Sales Amount Prior PD] AS
    ( [Internet Sales Amount] , [Date].[Calendar].PrevMember)
    , format_string = 'currency'
MEMBER [% Sales Change] AS
    IIF ( [Internet Sales Amount Prior PD] <> null
        , ([Internet Sales Amount] - [Internet Sales Amount Prior PD])
          / [Internet Sales Amount Prior PD]
        , 0)
    , format_string = 'percent'
SET [TopMonth] AS
    TOPCOUNT ( [Date].[Calendar].[Month], 3, [% Sales Change] )
MEMBER [MonthRank] AS
    RANK ( [Date].[Calendar].CurrentMember, [TopMonth] )
SET [Top3MonthsTop3Cities] AS
    GENERATE( [TopMonth], ( [Date].[Calendar].CurrentMember,
        TOPCOUNT([Customer].[City].Children, 3, [% Sales Change]) ) )
MEMBER [CityRank] AS
    RANK( ( [Date].[Calendar].CurrentMember,
            [Customer].[City].CurrentMember),
          EXISTS( [Top3MonthsTop3Cities], [Date].[Calendar].CurrentMember))
SELECT
    { [Internet Sales Amount], [Internet Sales Amount Prior PD],
      [% Sales Change], [MonthRank], [CityRank] }
    ON COLUMNS,
    [Top3MonthsTop3Cities]
    ON ROWS
FROM [Adventure Works]
23. Example Reports Developed in SSRS
The SSRS project developed reports for deployment in SharePoint.
Sales by State: sales by state with an accompanying chart of the top 10 states by either sales or returns.
Promotional Sales: a simple report requiring custom MDX.
Employee Sales: actually a master report with 4 sub-reports.
25. Excel Power Pivot Against Contoso Operation Cube
Product Category is configured to highlight and preselect valid subcategories. Geography is based on a hierarchy from the cube, so it can be expanded to lower levels. Gridlines and headings are turned off to provide a cleaner appearance when deployed to SharePoint.
27. Performance Point Services
Reports were prepared directly in PPS, and reports deployed from SSRS and Excel were brought into SharePoint to be deployed to the Dashboard. For SSRS reports, deployment to the Dashboard via PPS enabled the use of superior drop-down and tree selections for report parameters.
28. Contoso Retail Scorecard Using SSAS KPIs
KPIs for the Contoso SSAS Operation Cube are used to build a scorecard. Indicators are modified to provide a better web appearance. When building the Dashboard page containing the KPIs, the primary KPIs are linked to a different chart that will appear in SharePoint when that KPI is selected.
29. Other PPS Reports
Other charts and reports are created directly in PPS. The Excel pivot report is added so that it will be included on the Dashboard.

SSRS reports are added and deployed to the SharePoint Dashboard using PPS page controls. This not only allows inclusion on the Dashboard, but also enables these reports to use SharePoint selection controls, which have a better appearance and are easier to use.
32. KPI Scorecard in SharePoint Dashboard
The right-hand chart is set by the KPI selected on the left.
Product Gross Margin selected in KPI.
Channel Revenue selected in KPI.
Returns % selected in KPI.
Machine Downtime Trend selected in KPI.
33. A Deceptively Simple Chart from PPS
But it has impressive drill-down capabilities.
35. SSRS Scheduled Reports
A simple informational report was created in SSRS and deployed in SharePoint; then versions for Maryland and Virginia were set up for daily production.
36. Demonstration and Educational Presentations
These presentations, prepared independently, provide education and demonstrations for anyone interested in T-SQL and MDX query syntax and use. The completed presentations are posted on LinkedIn as SlideShare presentations.
37. Experience Summary
Over 12 years' experience as an analyst, data auditor, and T-SQL programmer for data and financial system implementation and maintenance
Prior experience in accounting and in system specification, selection, and implementation
MS Business Intelligence
T-SQL Programming
SQL Server Integration Services (SSIS)
SQL Server Analysis Services (SSAS)
MDX Programming
SQL Server Reporting Services (SSRS)
Excel, Excel Services and Power Pivot
SharePoint for Business Intelligence
Performance Point Services (PPS)
38. Recommendations
Letter of Recommendation
Robert Litsinger
The following is a letter of reference/recommendation for Robert Litsinger. Robert was enrolled in the SetFocus SQL Server Business Intelligence Master’s
Program in the third quarter of 2011 and will graduate on October 21, 2011. I was Robert’s instructor throughout the 12-week program.
Robert brought substantial experience (25+ years) into the Master’s Program, with a background as a systems analyst, database professional, and management
consultant. Robert demonstrated very early in our Master’s Program that he “knows data”. His prior work on data conversions in particular came through during
many lecture discussions.
Robert performed very well in our Master’s Program curriculum, both during lecture and project weeks. He picked up a great deal of information, performed well
on all areas of the BI stack, and often worked on lab assignments in the evenings/weekends after a long day of class. His attention to detail is exactly what every
professional instructor hopes to see. The road to learning the Microsoft BI stack is paved with countless details: Robert demonstrated he can integrate them into
his thought processes. For our final student team project, we selected Robert as the team lead.
Robert is very intelligent and articulate, and can translate business requirements into results. He has decades of experience in this area that a company will value.
He also constructed some documentation of his own as part of the learning curve for MDX programming – several of his ideas were so good that I plan to utilize
them for the next class.
Having been a hiring manager, I know that companies are equally concerned about the character of an applicant. Without any exaggeration, Robert is a model of
professionalism, especially during periods of adversity. A company will benefit from his experience, capabilities, and work ethic - and will also appreciate his
steady demeanor and common sense. He is a quality person who produces quality work. I enjoyed having him in class and appreciated his insights.
I would recommend Robert without hesitation for any SQL Server Business Intelligence developer/management position. Please contact me at
kgoff@setfocus.com, if you have any additional questions.
Kevin S. Goff | kgoff@setfocus.com
Microsoft SQL Server MVP
SetFocus SQL Server Business Intelligence Practice Manager
About SetFocus: SetFocus, LLC (www.setfocus.com) is a Microsoft Certified Gold Partner for Learning Solutions.
The Master's Program consists of intensive coverage of T-SQL, SSIS, SSAS, MDX, SSRS, PerformancePoint Server,
SharePoint, and Excel Services. Our BI curriculum is one of the most intensive curriculums in the industry. The
projects are based on actual project specifications from industry SQL Server and OLAP/Business Intelligence
applications.