How to quickly find an error in a SQL join
Follow this step-by-step guide to track down the error quickly and easily.
Being methodical can save you hours of pain.
#SQL #SQLTips #SQLTimesavers
2. Debugging Joins
Finding an error in a join between two tables can be very time-consuming.
This query has an error in one of its joins.
Here is a very quick way to find it.
select SalesOrderNumber, OrderQuantity, SalesAmount
, FullDateAlternateKey as OrderDate
, p.ProductAlternateKey as ProductKey, p.EnglishProductName
, psc.EnglishProductSubcategoryName, pc.EnglishProductCategoryName
, c.FirstName, c.LastName
, g.EnglishCountryRegionName, s.SalesTerritoryCountry
from [dbo].[FactInternetSales] f
INNER JOIN [dbo].[DimDate] d ON f.[OrderDateKey] = d.[DateKey]
INNER JOIN [dbo].[DimProduct] p ON f.[ProductKey] = p.[ProductKey]
INNER JOIN [dbo].[DimProductSubcategory] psc ON p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
INNER JOIN [dbo].[DimProductCategory] pc ON psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
INNER JOIN [dbo].[DimCustomer] c ON f.[CustomerKey] = c.[CustomerKey]
INNER JOIN [dbo].[DimGeography] g ON c.[GeographyKey] = g.[GeographyKey]
INNER JOIN [dbo].[DimSalesTerritory] s ON g.[GeographyKey] = s.[SalesTerritoryKey]
3. Start by going back to basics
Copy the whole query.
Then change the select clause to read:
select count(1) as RowCounter
This tells us how many rows the query returns.
select count(1) as RowCounter
from [dbo].[FactInternetSales] f
INNER JOIN [dbo].[DimDate] d ON f.[OrderDateKey] = d.[DateKey]
INNER JOIN [dbo].[DimProduct] p ON f.[ProductKey] = p.[ProductKey]
INNER JOIN [dbo].[DimProductSubcategory] psc ON p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
INNER JOIN [dbo].[DimProductCategory] pc ON psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
INNER JOIN [dbo].[DimCustomer] c ON f.[CustomerKey] = c.[CustomerKey]
INNER JOIN [dbo].[DimGeography] g ON c.[GeographyKey] = g.[GeographyKey]
INNER JOIN [dbo].[DimSalesTerritory] s ON g.[GeographyKey] = s.[SalesTerritoryKey]
4. Next, focus on the fact table
Comment out all the joins.
Then run it and check the result. This is the total number of rows the full query should return,
assuming each join matches exactly one dimension row for every fact row.
select count(1) as RowCounter
from [dbo].[FactInternetSales] f
--INNER JOIN [dbo].[DimDate] d ON f.[OrderDateKey] = d.[DateKey]
--INNER JOIN [dbo].[DimProduct] p ON f.[ProductKey] = p.[ProductKey]
--INNER JOIN [dbo].[DimProductSubcategory] psc ON p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
--INNER JOIN [dbo].[DimProductCategory] pc ON psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
--INNER JOIN [dbo].[DimCustomer] c ON f.[CustomerKey] = c.[CustomerKey]
--INNER JOIN [dbo].[DimGeography] g ON c.[GeographyKey] = g.[GeographyKey]
--INNER JOIN [dbo].[DimSalesTerritory] s ON g.[GeographyKey] = s.[SalesTerritoryKey]
5. Start feeding some of the joins back in
Uncomment a few of the joins and run again. While the count still matches the fact table count, those joins are fine.
select count(1) as RowCounter
from [dbo].[FactInternetSales] f
INNER JOIN [dbo].[DimDate] d ON f.[OrderDateKey] = d.[DateKey]
INNER JOIN [dbo].[DimProduct] p ON f.[ProductKey] = p.[ProductKey]
INNER JOIN [dbo].[DimProductSubcategory] psc ON p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
--INNER JOIN [dbo].[DimProductCategory] pc ON psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
--INNER JOIN [dbo].[DimCustomer] c ON f.[CustomerKey] = c.[CustomerKey]
--INNER JOIN [dbo].[DimGeography] g ON c.[GeographyKey] = g.[GeographyKey]
--INNER JOIN [dbo].[DimSalesTerritory] s ON g.[GeographyKey] = s.[SalesTerritoryKey]
select count(1) as RowCounter
from [dbo].[FactInternetSales] f
INNER JOIN [dbo].[DimDate] d ON f.[OrderDateKey] = d.[DateKey]
INNER JOIN [dbo].[DimProduct] p ON f.[ProductKey] = p.[ProductKey]
INNER JOIN [dbo].[DimProductSubcategory] psc ON p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
INNER JOIN [dbo].[DimProductCategory] pc ON psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
INNER JOIN [dbo].[DimCustomer] c ON f.[CustomerKey] = c.[CustomerKey]
INNER JOIN [dbo].[DimGeography] g ON c.[GeographyKey] = g.[GeographyKey]
--INNER JOIN [dbo].[DimSalesTerritory] s ON g.[GeographyKey] = s.[SalesTerritoryKey]
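The comment-and-uncomment steps above can also be run as one batch. This is a minimal sketch, assuming the same tables and keys as the example query and assuming every fact row should match exactly one row in each dimension. The Stage labels are made up for illustration. The first stage whose RowCounter differs from the fact-only count contains the join to inspect.

```sql
-- Row count after each group of joins is fed back in, one result row per stage.
select '0: fact only' as Stage, count(1) as RowCounter
from [dbo].[FactInternetSales] f
union all
select '1: + date, product, subcategory', count(1)
from [dbo].[FactInternetSales] f
INNER JOIN [dbo].[DimDate] d ON f.[OrderDateKey] = d.[DateKey]
INNER JOIN [dbo].[DimProduct] p ON f.[ProductKey] = p.[ProductKey]
INNER JOIN [dbo].[DimProductSubcategory] psc ON p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
union all
select '2: + category, customer, geography', count(1)
from [dbo].[FactInternetSales] f
INNER JOIN [dbo].[DimDate] d ON f.[OrderDateKey] = d.[DateKey]
INNER JOIN [dbo].[DimProduct] p ON f.[ProductKey] = p.[ProductKey]
INNER JOIN [dbo].[DimProductSubcategory] psc ON p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
INNER JOIN [dbo].[DimProductCategory] pc ON psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
INNER JOIN [dbo].[DimCustomer] c ON f.[CustomerKey] = c.[CustomerKey]
INNER JOIN [dbo].[DimGeography] g ON c.[GeographyKey] = g.[GeographyKey]
union all
-- Last stage adds the sales territory join exactly as written in the query under test.
select '3: + sales territory', count(1)
from [dbo].[FactInternetSales] f
INNER JOIN [dbo].[DimDate] d ON f.[OrderDateKey] = d.[DateKey]
INNER JOIN [dbo].[DimProduct] p ON f.[ProductKey] = p.[ProductKey]
INNER JOIN [dbo].[DimProductSubcategory] psc ON p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
INNER JOIN [dbo].[DimProductCategory] pc ON psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
INNER JOIN [dbo].[DimCustomer] c ON f.[CustomerKey] = c.[CustomerKey]
INNER JOIN [dbo].[DimGeography] g ON c.[GeographyKey] = g.[GeographyKey]
INNER JOIN [dbo].[DimSalesTerritory] s ON g.[GeographyKey] = s.[SalesTerritoryKey]
order by Stage;
```

Grouping the joins mirrors the manual steps. If a stage looks wrong, split that group into one join per stage to narrow it down further.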
6. It's always the last one...
In our example, the error is in the last join.
select count(1) as RowCounter
from [dbo].[FactInternetSales] f
INNER JOIN [dbo].[DimDate] d ON f.[OrderDateKey] = d.[DateKey]
INNER JOIN [dbo].[DimProduct] p ON f.[ProductKey] = p.[ProductKey]
INNER JOIN [dbo].[DimProductSubcategory] psc ON p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
INNER JOIN [dbo].[DimProductCategory] pc ON psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
INNER JOIN [dbo].[DimCustomer] c ON f.[CustomerKey] = c.[CustomerKey]
INNER JOIN [dbo].[DimGeography] g ON c.[GeographyKey] = g.[GeographyKey]
INNER JOIN [dbo].[DimSalesTerritory] s ON g.[GeographyKey] = s.[SalesTerritoryKey]
Examine the join in the last row.
We find it is using an incorrect key: DimGeography should join to DimSalesTerritory on SalesTerritoryKey, not GeographyKey.
select count(1) as RowCounter
from [dbo].[FactInternetSales] f
INNER JOIN [dbo].[DimDate] d ON f.[OrderDateKey] = d.[DateKey]
INNER JOIN [dbo].[DimProduct] p ON f.[ProductKey] = p.[ProductKey]
INNER JOIN [dbo].[DimProductSubcategory] psc ON p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
INNER JOIN [dbo].[DimProductCategory] pc ON psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
INNER JOIN [dbo].[DimCustomer] c ON f.[CustomerKey] = c.[CustomerKey]
INNER JOIN [dbo].[DimGeography] g ON c.[GeographyKey] = g.[GeographyKey]
INNER JOIN [dbo].[DimSalesTerritory] s ON g.[SalesTerritoryKey] = s.[SalesTerritoryKey]
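Once a join is under suspicion, counting the rows that find no partner can help confirm that the key is wrong. This is a sketch, assuming the tables and columns used in the queries above; the UnmatchedRows alias is illustrative. Rows that an inner join silently drops show up here as unmatched, so comparing the two results shows which condition loses data.

```sql
-- Geography rows with no sales territory under the suspect join condition.
select count(1) as UnmatchedRows
from [dbo].[DimGeography] g
LEFT JOIN [dbo].[DimSalesTerritory] s ON g.[GeographyKey] = s.[SalesTerritoryKey]
where s.[SalesTerritoryKey] is null;

-- The same check using the corrected key.
select count(1) as UnmatchedRows
from [dbo].[DimGeography] g
LEFT JOIN [dbo].[DimSalesTerritory] s ON g.[SalesTerritoryKey] = s.[SalesTerritoryKey]
where s.[SalesTerritoryKey] is null;
```

The where clause tests the right-hand table's key for null, which is only true for left-hand rows that the join failed to match.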
7. Fixed...
Finally, replace the incorrect join in your original query.
select SalesOrderNumber, OrderQuantity, SalesAmount
, FullDateAlternateKey as OrderDate
, p.ProductAlternateKey as ProductKey, p.EnglishProductName
, psc.EnglishProductSubcategoryName, pc.EnglishProductCategoryName
, c.FirstName, c.LastName
, g.EnglishCountryRegionName, s.SalesTerritoryCountry
from [dbo].[FactInternetSales] f
INNER JOIN [dbo].[DimDate] d ON f.[OrderDateKey] = d.[DateKey]
INNER JOIN [dbo].[DimProduct] p ON f.[ProductKey] = p.[ProductKey]
INNER JOIN [dbo].[DimProductSubcategory] psc ON p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
INNER JOIN [dbo].[DimProductCategory] pc ON psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
INNER JOIN [dbo].[DimCustomer] c ON f.[CustomerKey] = c.[CustomerKey]
INNER JOIN [dbo].[DimGeography] g ON c.[GeographyKey] = g.[GeographyKey]
INNER JOIN [dbo].[DimSalesTerritory] s ON g.[SalesTerritoryKey] = s.[SalesTerritoryKey]
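As a last sanity check, the fact table count and the count from the corrected joins can be put side by side. This is a sketch under the same assumption as before, that every fact row should match exactly one row in each dimension; FactRows and JoinedRows are illustrative aliases. If the two numbers differ, one of the joins is still dropping or duplicating rows.

```sql
-- One result row: baseline fact count next to the count from the corrected query.
select
  (select count(1)
   from [dbo].[FactInternetSales]) as FactRows,
  (select count(1)
   from [dbo].[FactInternetSales] f
   INNER JOIN [dbo].[DimDate] d ON f.[OrderDateKey] = d.[DateKey]
   INNER JOIN [dbo].[DimProduct] p ON f.[ProductKey] = p.[ProductKey]
   INNER JOIN [dbo].[DimProductSubcategory] psc ON p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
   INNER JOIN [dbo].[DimProductCategory] pc ON psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
   INNER JOIN [dbo].[DimCustomer] c ON f.[CustomerKey] = c.[CustomerKey]
   INNER JOIN [dbo].[DimGeography] g ON c.[GeographyKey] = g.[GeographyKey]
   INNER JOIN [dbo].[DimSalesTerritory] s ON g.[SalesTerritoryKey] = s.[SalesTerritoryKey]) as JoinedRows;
```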