Introduction DAX


- 1. SQL Saturday #264 Ancona, Italy Introduction DAX
- 2. Sponsors & Media Partners
- 3. Marco Pozzan: has worked with SQL Server since version 2000 and in BI since 2005. President of the 1nn0va community (www.innovazionefvg.net). Project manager at Servizi CGN (www.cgn.it). SQL Server and DWH consultant. References: twitter: @marcopozzan, email: info@marcopozzan.it, site: www.marcopozzan.it
- 4. Agenda: What is PowerPivot? Demo of PowerPivot. What is DAX? Calculated columns and calculated fields. Evaluation context. CALCULATE. Demo in DAX.
- 5. What is PowerPivot? Free add-in for Microsoft Excel 2010 and 2013. Different versions for 32/64 bit (4 GB limit). Does not need SQL Server or other prerequisites. Very powerful analysis engine based on the SSAS engine of SQL Server 2012. No API available to control it. No security available. Always impersonates the current user.
- 6. Versions installed: Client-side: inside Excel. Server-side: built on SharePoint or SQL Server 2012 (Tabular), the SSAS engine. In the client-side version the engine runs in-process with Excel.
- 7. SSAS 2012 uses the VertiPaq engine, a columnar database with high compression. Works completely in memory. No I/O, aggregates, or other… IMBI = a new way of thinking about the algorithms.
- 8. PowerPivot: import data, relationships between tables, slicers.
- 9. Advantages (PowerPivot): Fast. No ETL (Power Query). Metadata (model). Integration of heterogeneous sources. Sharing, especially with SharePoint. Expressiveness: relationships and DAX.
- 10. Disadvantages (PowerPivot): There is no ETL to clean the data: what about data quality? Data size. Please note that these are not problems!
- 11. What is DAX? Designed to work within a PivotTable. Programming language of Tabular and PowerPivot. Resembles Excel (so to speak). No concept of «row» and «column». Different type system. A mix of MDX, SQL and Excel.
- 12. DAX types. Non-numerical: String, Binary Objects (Power View). Numerical: Currency, Integer, Real, DateTime (integer part: days since 30/12/1899; decimal part: fraction of a day), Boolean.
- 13. Type handling: operators are not strongly typed ("1"+1 works). Operator overloading (warning): the operator determines the type of the result. Examples: 1 & 2 = "12"; "1" + "2" = 3.
- 14. Columns in DAX 1/2: 'TableName'[ColumnName], e.g. =FactInternetSales[OrderDate]. Quotes can be omitted if the table name does not contain spaces (don't do it).
- 15. Columns in DAX 2/2: TableName can be omitted, and the column is then looked up in the current table, e.g. =[OrderDate]. Avoid this, as it makes formulas hard to understand. Brackets cannot be omitted.
- 16. Calculated columns: computed using DAX and persisted in the database. Can use other columns. Always computed for the current row. FactInternetSales[OrderDate] means: the value of the OrderDate column, in the FactInternetSales table, for the current row. Different for each row.
- 17. Measures (calculated fields): written using DAX. Do not work row by row. Not stored in the database. Use tables and aggregators. Do not have a «current row», so the formula =FactInternetSales[OrderDate] cannot be written as a measure; an aggregator is needed, e.g. :=SUM(FactInternetSales[OrderDate])
- 18. Define the right column names: if you rename a column, the measures that use it must be updated manually. So define the right names from the start.
- 19. Calculated columns and measures: suppose you want to calculate the margin with a calculated column: =FactInternetSales[SalesAmount]-FactInternetSales[TotalProductCost] The Margin column can then be aggregated with a measure: SUMofMargin:=SUM(FactInternetSales[Margin])
- 20. Calculated columns and measures: margin as a percentage of sales (Margine%), written as a calculated column: =FactInternetSales[Margin] / FactInternetSales[SalesAmount] This expression is not correct once aggregated, because the row-level ratios would be summed. Use a measure instead: Margine%:=SUM(FactInternetSales[Margin]) / SUM(FactInternetSales[SalesAmount])
- 21. Measure rules and conventions: Define the name of the table a measure belongs to. Measures are global to the model, so there cannot be two measures with the same name in different tables. A measure can be moved from one table to another; this cannot be done with calculated columns. Do not refer to a measure with its table name: it gets confused with calculated columns.
- 22. Summary 1/2: Columns consume memory; measures consume CPU. They are calculated at different times, have different purposes, are structured differently and are managed in different ways.
- 23. Summary 2/2: Use measures (90%) to calculate ratios, to calculate percentages, or when you need complex aggregations. Use columns (10%) when the result is needed as a slicer or filter value, or when the expression is calculated on the current row.
- 24. Counting values: COUNTROWS: rows in a table. COUNTBLANK: counts blanks. COUNTA: counts anything but blanks. COUNT: only for numeric columns (compatibility with Excel). DISTINCTCOUNT: performs a distinct count. In Multidimensional this needs a measure group with a distinct count measure, slow like a snail.
- 25. Errors in DAX 1/2: 1+2 always works; [SalesAmount]/[Margin] might fail. Causes of errors: conversion errors, arithmetical operations, empty or missing values. ISERROR(Expression) returns TRUE or FALSE, depending on whether an error occurs during evaluation.
- 26. Errors in DAX 2/2: IFERROR(Expression, Alternative) returns Alternative in case of error, otherwise the value of Expression. Useful to avoid writing the expression twice. Both IFERROR and ISERROR are very slow, so be careful when using them in calculated columns.
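
The difference between the two error functions can be sketched with the division from slide 25. This is only an illustration: it assumes the formulas are calculated columns in FactInternetSales and that the Margin column from slide 19 exists; returning BLANK() as the fallback is a choice made here, not something the slides prescribe.

```dax
-- Calculated column in FactInternetSales (assumed location).
-- IFERROR: the expression is written once, BLANK() is the fallback on error.
= IFERROR (
    FactInternetSales[SalesAmount] / FactInternetSales[Margin],
    BLANK ()
)

-- Same result with ISERROR: the expression has to be written twice.
= IF (
    ISERROR ( FactInternetSales[SalesAmount] / FactInternetSales[Margin] ),
    BLANK (),
    FactInternetSales[SalesAmount] / FactInternetSales[Margin]
)
```
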
- 27. Aggregation functions: work only on numeric columns. Aggregation functions: SUM, AVERAGE, MIN, MAX. They aggregate columns only, not expressions: SUM(Orders[Quantity]) is valid; SUM(Orders[Quantity] * Orders[Quantity]) is not.
- 28. The X aggregation functions 1/2: iterate over a table and evaluate an expression for each row. Always take two parameters: the table to iterate and the formula to evaluate. SUMX, AVERAGEX, MINX, MAXX. Example: SUMX ( Sales, Sales[Price] * Sales[Quantity] )
- 29. The X aggregation functions 2/2: first the inner expression is calculated for each row, then the results are summed. The columns must all be in the same table, or use RELATED (if there is a relationship). They are very slow but use no memory.
- 30. An alternative to the X functions: create a calculated column, then aggregate on that column. Very fast, but uses memory.
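
The trade-off between iterating and storing can be sketched with the margin example from slide 19. The measure name SumOfMarginX is hypothetical; the columns are the ones already used in the slides.

```dax
-- Option 1: iterate with SUMX. Nothing is stored; the work happens at query time.
SumOfMarginX :=
SUMX (
    FactInternetSales,
    FactInternetSales[SalesAmount] - FactInternetSales[TotalProductCost]
)

-- Option 2: store the row-level result in a calculated column, then aggregate it.
-- Calculated column FactInternetSales[Margin] (slide 19):
--   = FactInternetSales[SalesAmount] - FactInternetSales[TotalProductCost]
SumOfMargin := SUM ( FactInternetSales[Margin] )
```

Both measures return the same total; the choice is between CPU at query time and memory for the stored column.
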
- 31. Logical functions: AND (little used) or &&; OR (little used) or ||; IF; IFERROR; NOT (little used); SWITCH.
- 32. SWITCH. With nested IFs: Color := IF(DimProduct[Color] = "R", "Red", IF(DimProduct[Color] = "Y", "Yellow", "Other")) With SWITCH: Color := SWITCH ( DimProduct[Color], "R", "Red", "Y", "Yellow", "Other" )
- 33. Information functions. Completely useless (they take only columns, not expressions): ISNUMBER, ISTEXT, ISNONTEXT. Useful: ISBLANK, ISERROR. If we, who created the column, do not know whether it holds a number or text, who should know? (Alberto Ferrari)
- 34. DIVIDE function: checks that the denominator is not 0. IF( Sales[Price] <> 0, Sales[Quantity] / Sales[Price], 0) becomes DIVIDE(Sales[Quantity], Sales[Price], 0)
- 35. Date functions. Many useful functions: DATE, DATEVALUE, DAY, EDATE, EOMONTH, HOUR, MINUTE, MONTH, NOW, SECOND, TIME, TIMEVALUE, TODAY (interesting!!!), WEEKDAY, WEEKNUM, YEAR, YEARFRAC. Time intelligence functions.
- 36. Evaluation context 1/3: distinguishes DAX from any other language. Similar to the “where clause” of an MDX query in SSAS. The contexts under which a formula is evaluated: filter context and row context.
- 37. Evaluation context 2/3. Filter context: the set of active rows for the computation. The filter that comes from the PivotTable, defined by slicers, filters, columns and rows. One for each cell of the PivotTable.
- 38. Evaluation context 3/3. Row context: contains a single row, the current row during iterations. Defined by an X function or a calculated column definition, not by PivotTables. This concept is new compared to MDX, which does not work leaf by leaf but only on the context.
- 39. The two contexts always exist together. Filter context: filters tables. Might be empty (all the tables are visible). Used by aggregate functions. In a calculated column it covers all the tables, because there is no PivotTable. Row context: iterates the active rows in the filter context. Might be empty (there is no iteration running).
- 40. With more tables?
- 41. Evaluation context. Filter context: propagates through relationships, from the one side to the many side. The direction of the relationship is very important. Different from SQL (inner, left, ...). Applied only once (+ performance). Row context: does not propagate over relationships. Use RELATED (opens a new row context on the target). Applied for each row (- performance).
- 42. Example of a filter context
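
Because the row context does not follow relationships on its own, a calculated column has to ask for related data explicitly. A minimal sketch, assuming a one-to-many relationship from DimProduct to FactInternetSales (the slides use both tables but never show the relationship):

```dax
-- Calculated column in FactInternetSales (many side):
-- RELATED follows the relationship to the one side and reads a single value.
= RELATED ( DimProduct[Color] )

-- Calculated column in DimProduct (one side):
-- RELATEDTABLE returns the FactInternetSales rows linked to the current product.
= COUNTROWS ( RELATEDTABLE ( FactInternetSales ) )
```
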
- 43. Table functions: FILTER (adds new conditions; it is an iterator!!!). ALL (removes all conditions from a table and returns all its rows; can also remove all filters from the specified columns of the table; useful to calculate ratios and percentages). VALUES (the values of a column, including blanks). RELATEDTABLE (all the rows related to the current row). All of these functions return a table.
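
As an illustration of why ALL is useful for ratios and percentages, here is a sketch of a share-of-total measure. The measure name PctOfAllSales is hypothetical; DIVIDE is used as on slide 34.

```dax
-- Numerator: sales visible in the current filter context (the PivotTable cell).
-- Denominator: ALL ( FactInternetSales ) ignores the filters on the table,
-- so SUMX iterates every row and returns the grand total.
PctOfAllSales :=
DIVIDE (
    SUM ( FactInternetSales[SalesAmount] ),
    SUMX ( ALL ( FactInternetSales ), FactInternetSales[SalesAmount] )
)
```
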
- 44. Filter
- 45. All
- 46. Mixing Filters
- 47. VALUES: returns a table with a single column containing all the values of the column that are visible in the current context. SelectedYear:=COUNTROWS(VALUES(Dati[Year])) When the result has one column and one row, it can be used as a scalar.
- 48. RELATEDTABLE: returns only the rows of sales (Dati) related to the current store (Store): =COUNTROWS (RELATEDTABLE(Dati))
- 49. Considerations. We have seen that we can: add a filter on a column, remove the filter from a full table, mix filters. But we still cannot: ignore only a part of the filter context rather than all of it; add a condition to the filter context or modify an existing condition.
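
Using VALUES as a scalar is only safe when exactly one value is visible, so a common precaution is to check the row count first. A sketch with a hypothetical measure name (SingleYear) on the Dati[Year] column from the slide:

```dax
-- Returns the year only when exactly one year is visible in the current context;
-- otherwise IF has no else branch and the result is blank.
SingleYear :=
IF (
    COUNTROWS ( VALUES ( Dati[Year] ) ) = 1,
    VALUES ( Dati[Year] )
)
```
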
- 50. CALCULATE: the simplest function to write but complex to understand. CALCULATE( Expression, Filter1, …, FilterN ) The filters are computed first (combined with AND), then the expression. All filters are processed in parallel and are independent of each other. They replace the filter context (on a whole table or on a single column).
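
A minimal sketch of how a CALCULATE filter replaces the existing filter on a column. The measure names WhiteSales and WhiteSalesExpanded are hypothetical, and a relationship between DimProduct and FactInternetSales is assumed.

```dax
-- Sales of white products, whatever color is selected in the PivotTable.
WhiteSales :=
CALCULATE (
    SUM ( FactInternetSales[SalesAmount] ),
    DimProduct[Color] = "White"
)

-- The boolean filter above is shorthand for a FILTER over all values of the column,
-- which is why it replaces any existing filter on DimProduct[Color].
WhiteSalesExpanded :=
CALCULATE (
    SUM ( FactInternetSales[SalesAmount] ),
    FILTER ( ALL ( DimProduct[Color] ), DimProduct[Color] = "White" )
)
```
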
- 51. Calculate
- 52. CALCULATE with FILTER. A filter argument is a boolean condition that works on a single column (e.g. DimProduct[Color] = "White" or DimProduct[ListPrice] > 1000). So this formula is not correct, because the filter involves too many columns (ListPrice and StandardCost): ProductLMC := CALCULATE( SUM(FactInternetSales[SalesAmount]), DimProduct[ListPrice] > DimProduct[StandardCost] ) Use FILTER instead: ProductLMC := CALCULATE( SUM(FactInternetSales[SalesAmount]), FILTER(DimProduct, DimProduct[ListPrice] > DimProduct[StandardCost] ))
- 53. CALCULATE: pay attention to the filter context. ProductM100 := CALCULATE ( SUM(FactInternetSales[SalesAmount]), FILTER( DimProduct, DimProduct[ListPrice] >= 100 ) ) DimProduct is evaluated in the original filter context, before CALCULATE is evaluated: if the original filter context is Color = silver, the new filter context is Color = silver and ListPrice >= 100.
- 54. CALCULATE: pay attention to the filter context. ProductM100_Bis := CALCULATE ( SUM(FactInternetSales[SalesAmount]), FILTER( ALL(DimProduct), DimProduct[ListPrice] >= 100 ) ) ALL(DimProduct) removes the original filter Color = silver, so the new filter context is only ListPrice >= 100 and SUM is computed over the rows of all colors.
- 55. EARLIER: returns a value from the previous (outer) row context: =SUMX( FILTER(Sales; Sales[Date]<=EARLIER(Sales[Date]) && YEAR(Sales[Date]) = YEAR(EARLIER(Sales[Date])) ) ;Sales[Value] ) In a row context only one variable is available. Two nested loops in an ordinary language use two names: FOR A = 1 TO 5 FOR B = 1 TO 5 IF A < B THEN NEXT NEXT In DAX both loops use the same name, and EARLIER reaches the outer one: FOR A = 1 TO 5 FOR A = 1 TO 5 IF EARLIER(A) < A THEN NEXT NEXT
- 56. CALCULATE: context transition. As calculated columns in DimProduct, are these two expressions the same? = SUM(FactInternetSales[SalesAmount]) = CALCULATE(SUM(FactInternetSales[SalesAmount]))
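
The two expressions are not equivalent. The comments below sketch what each one returns when used as a calculated column in DimProduct, assuming a one-to-many relationship from DimProduct to FactInternetSales:

```dax
-- Without CALCULATE: the row context does not filter FactInternetSales,
-- so every row of DimProduct shows the grand total of sales.
= SUM ( FactInternetSales[SalesAmount] )

-- With CALCULATE: the current row is turned into a filter context (context transition),
-- the filter propagates to FactInternetSales, and each row shows its own product's sales.
= CALCULATE ( SUM ( FactInternetSales[SalesAmount] ) )
```
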
- 57. ABC and Pareto analysis: 80% of effects come from 20% of the causes, e.g. 80% of sales come from 20% of customers. Pareto analysis is the basis of ABC classification. Class A contains items for >=70% of total value. Class B contains items for >=20% and <70% of total value. Class C contains items for <20% of total value.
- 58. ABC and Pareto analysis: for each row calculate TotalSales: =CALCULATE( SUM(FactInternetSales[SalesAmount])) Then sum the sales of all products whose total sales are greater than or equal to those of the current row: RunningTotalSales = SUMX( FILTER( DimProduct; DimProduct[TotalSales] >= EARLIER(DimProduct[TotalSales]) ); DimProduct[TotalSales] )
- 59. Pareto analysis and ABC analysis: calculate the running percentage of sales over the total sales (RunningPct): =DimProduct[RunningTotalSales] / SUM(DimProduct[TotalSales]) Then assign the labels A, B, C: =IF( DimProduct[RunningPct] <= 0.7; "A"; IF( DimProduct[RunningPct] <= 0.9; "B"; "C" ) )
- 60. ABC and Pareto analysis: the number of products that generate those sales: =COUNTROWS(DimProduct)
- 61. ABC and Pareto Analysis
- 62. Links and books: PowerPivot http://www.powerpivot.com; SQLBI http://www.sqlbi.com; WebCast (PowerPivot 1.0) http://www.presentation.ialweb.it/p29261115/; Book
- 63. Q&A
- 64. #sqlsat264 #sqlsatancona Thanks!
