An introductory session on DAX and the common analytic patterns that we've built and used in enterprise environments. This session was originally presented at SQL Saturday Silicon Valley 2016.
Power BI, SSAS Tabular, and Excel all use DAX. This presentation is meant to be used with a PBIX notebook found here: https://github.com/IkeEllis/democode/blob/master/IntroToDAX/Power%20BI%20Introduction%20to%20DAX.pbix
4. Data Analysis Expressions (DAX)
What is DAX?
DAX is a language that allows us to write dynamic expressions for relational constructs, using familiar functions
Powerful dynamic data analysis tool for relational data
Expressions can traverse relationships!
Available in PowerPivot, PowerBI, and SSAS Tabular
Classic “Import Mode”
What isn’t DAX?
NOT a programming language
6. Measures
A measure is a formula/expression comprising functions applied to columns and tables
Reusable aggregation evaluated differently, depending on how you use it
Measures can be nested
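A minimal sketch of nesting, using the 'Output' table from the later demos. The 'Output'[Energy Exported] column and the three measure names are assumptions made for illustration, not definitions taken from the deck:

```dax
-- Base measure: sums the column under whatever filter context applies
Energy Generated Total = SUM ( 'Output'[Energy Generated] )

-- Second base measure (the [Energy Exported] column is assumed to exist)
Energy Exported Total = SUM ( 'Output'[Energy Exported] )

-- Nested measure: defined in terms of the two measures above
Export Share = DIVIDE ( [Energy Exported Total], [Energy Generated Total] )
```

Because the nested measure refers to the others by name, a change to a base measure flows through to every measure built on it.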
7. Functions
Logical
IF( <logical test>, <value if true>, <value if false> )
SWITCH( <expression>, <value>, <result>, … ) – evaluates an expression against a list of values, and returns the result corresponding to the first matching value
TRUE() – returns logical true
Aggregate
SUM( <column> ) – adds all the numbers in a column
DIVIDE( <numerator>, <denominator> [, <alternateresult>] ) – division that handles divide-by-zero; the optional alternate result is returned when the denominator is zero (BLANK if omitted)
Statistical
MAX( <column> ) – returns the largest numeric value in a <column>
MIN( <column> ) – returns the smallest value in a <column>
Text
BLANK() – returns a blank
Filter
FILTER(<table>, <filter>) – returns a table representing a subset of another table or expression
ALL( <table> | <column> ) – returns all the rows in a table, or all the values in a column, ignoring any filter context
VALUES( <table or column> ) – returns a one-column table of the distinct values in the specified column (or the rows of the specified table) in the current filter context
CALCULATE( <expression>, <filter1>, <filter2>, … ) – evaluates an expression in a context that is modified by the specified filters
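As a hedged illustration of how CALCULATE and ALL combine with DIVIDE, here is a percent-of-total sketch. It assumes a [Total Sales] measure and a 'Geography'[Continent] column like those used in the demo notes; the measure name is hypothetical:

```dax
-- Share of sales for the current continent relative to all continents.
-- ALL removes any filter on 'Geography'[Continent] for the denominator only.
Sales % of All Continents =
DIVIDE (
    [Total Sales],
    CALCULATE ( [Total Sales], ALL ( 'Geography'[Continent] ) )
)
```

Other filters (for example on product or date) still apply to both numerator and denominator, since only the continent filter is removed.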
8. Functions (cont’d)
Date and Time
TODAY() – returns the current date
NOW() – returns the current date and time in datetime format
Time Intelligence*
DATESBETWEEN(<dates>, <start date>, <end date>) – returns a table of <dates> starting with the <start date> and continuing until the <end date>.
NEXTDAY(<dates>) – returns a table that contains a column with the next dates following each of the <dates> passed
FIRSTDATE(<dates>) – returns the first date in the context of the specified column of dates
LASTDATE(<dates>) – returns the last date in the context of the specified column of dates
SAMEPERIODLASTYEAR(<dates>) – returns a table with a column of dates shifted one year back for each of the <dates> specified
LASTNONBLANK( <column>, <expression> ) – returns the last value in the <column> where the <expression> is not blank
FIRSTNONBLANK( <column>, <expression> ) – returns the first value in the <column> where the <expression> is not blank
*Requires Date Table
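A small sketch of a time intelligence function used as a CALCULATE filter, assuming a 'Date' table related to 'Output' and an existing [Total Energy Generated] measure. The measure name and the 30-day window are illustrative choices:

```dax
-- Energy generated over the 30 days ending on the last visible date.
-- MAX ( 'Date'[Date] ) is evaluated in the original filter context,
-- so it is the last date currently selected.
Energy Generated Last 30 Days =
CALCULATE (
    [Total Energy Generated],
    DATESBETWEEN (
        'Date'[Date],
        MAX ( 'Date'[Date] ) - 29,
        MAX ( 'Date'[Date] )
    )
)
```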
9. Evaluation Contexts
Evaluation Contexts:
Filter Context
Four types of filter context:
1. Row Selection
2. Column Selection
3. Slicer Selection
4. Filter Selection
Defines the subset of data a measure is calculated over, i.e. “Which rows are selected based on which attribute values?”
Applied before anything else
Row Context
All the columns in the Current Row
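To make the two contexts concrete, the following sketch contrasts a calculated column with a measure. The 'Sales'[Quantity] and 'Sales'[UnitPrice] columns are hypothetical and only serve the illustration:

```dax
-- Calculated column on 'Sales': evaluated once per row, in that row's
-- row context, so column references return the current row's values
Line Amount = 'Sales'[Quantity] * 'Sales'[UnitPrice]

-- Measure: evaluated in the filter context of each cell in a visual.
-- SUMX iterates the rows of 'Sales' that survive the filter context
-- and creates a row context for each of them.
Total Line Amount = SUMX ( 'Sales', 'Sales'[Quantity] * 'Sales'[UnitPrice] )
```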
“DAX is simple, it’s not easy, but it’s simple”
- Alberto Ferrari
16. Cumulative Total Measures
Aggregates values of a column for the currently selected date and all previous dates within the specified range
Can be used to derive balances from transactions, e.g.
Inventory Stock
Balances
Cumulative Balances
Does not require use of Time Intelligence functions
17. Cumulative Measure Demo - PowerBI
Cumulative Energy Generated (Checked) =
IF (
    MIN ( 'Date'[Date] ) <= CALCULATE ( MAX ( 'Output'[Date] ), ALL ( 'Output' ) ),
    CALCULATE (
        SUM ( 'Output'[Energy Generated] ),
        FILTER ( ALL ( 'Date'[Date] ), 'Date'[Date] <= MAX ( 'Date'[Date] ) )
    )
)
18. Year-to-Date Total
TOTALYTD function applies the expression for all data from the start of the year to the currently selected date in the filter context
Year To Date =
TOTALYTD( <expression>, <dates> [, <filter>] [, <year end date>] )
19. Year-to-Date Total Demo - PowerBI
Total Energy Generated YTD =
TOTALYTD ( SUM ( 'Output'[Energy Generated] ), 'Date'[Date] )
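The speaker notes suggest guarding measures against dates beyond the last transaction. One way that guard might be applied to the year-to-date measure is sketched below; the "(Checked)" measure name is an assumption modelled on the cumulative demo:

```dax
-- Returns BLANK for periods that start after the last date in 'Output',
-- so the YTD line does not extend into future dates
Total Energy Generated YTD (Checked) =
IF (
    MIN ( 'Date'[Date] ) <= CALCULATE ( MAX ( 'Output'[Date] ), ALL ( 'Output' ) ),
    TOTALYTD ( SUM ( 'Output'[Energy Generated] ), 'Date'[Date] )
)
```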
20. Year Over Year
Use time intelligence to calculate an aggregate for the same period last year
“Last Year” measure is used to compare to “Current Year” measure, and/or to derive a measure of the change year-over-year
21. Year-Over-Year Demo – PowerBI
Total Energy Generated Last Year =
CALCULATE ( [Total Energy Generated], SAMEPERIODLASTYEAR ( 'Date'[Date] ) )
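Building on the “Last Year” measure, a year-over-year change could be derived along the lines of the DIVIDE pattern in the speaker notes. This is a sketch; the measure name is illustrative and [Total Energy Generated] is assumed to be defined already:

```dax
-- Relative change versus the same period last year.
-- DIVIDE returns BLANK when there is no prior-year value to compare against.
Energy Generated YoY % =
DIVIDE (
    [Total Energy Generated] - [Total Energy Generated Last Year],
    [Total Energy Generated Last Year]
)
```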
22. Semi-Additive Measures
Snapshot fact tables hold balance values, such as Inventory or Account Balances
Summing these balances across time gives meaningless results
The solution is to sum across all attributes except for time by filtering for only a single point in time (e.g. last date in the period)
Several functions allow you to adjust filter context to a single point in time, within the original context period
FIRSTDATE / LASTDATE
FIRSTNONBLANK / LASTNONBLANK
OPENING… / CLOSING…
23. Semi-Additive Measure Demo - PowerPivot
Total On Hand Quantity LASTNONBLANK =
CALCULATE (
    SUM ( Inventory[OnHandQuantity] ),
    LASTNONBLANK (
        'Date'[Date],
        CALCULATE ( SUM ( Inventory[OnHandQuantity] ) )
    )
)
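For comparison, a simpler variant using LASTDATE might look like the sketch below (the measure name is illustrative). It assumes the same Inventory and 'Date' tables:

```dax
-- Sums on-hand quantity for the last date of the selected period only
Total On Hand Quantity LASTDATE =
CALCULATE (
    SUM ( Inventory[OnHandQuantity] ),
    LASTDATE ( 'Date'[Date] )
)
```

Unlike the LASTNONBLANK version, this returns a blank when the last date of the period has no inventory rows, which is why the LASTNONBLANK pattern is often preferred for sparse snapshots.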
24. Disconnected Slicers
Allows you to use a slicer to modify measures
Measure Switching
Used to switch between a set of measure values in a container measure
25. Disconnected Slicers Demo - PowerBI
Setup Steps:
1. Create/identify your target measures (e.g. [Energy Exported] & [Energy Generated])
2. Create disconnected table to use in slicer selection
3. Create background “value selection measure” using MAX()
• Hide this!
4. Create SWITCH measure to use in visualizations
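The setup steps above can be sketched as follows. The table name 'disconnected_table' and its [id] column follow the speaker notes; the "name" column, the id values and the two new measure names are assumptions made for illustration:

```dax
-- Step 2: disconnected table (no relationships to the rest of the model),
-- created here as a calculated table
disconnected_table =
DATATABLE (
    "id", INTEGER,
    "name", STRING,
    { { 1, "Energy Generated" }, { 2, "Energy Exported" } }
)

-- Step 3: value selection measure (hide it from report view)
Selected Measure Id = MAX ( 'disconnected_table'[id] )

-- Step 4: switching measure to use in visualizations
Selected Energy Value =
SWITCH (
    [Selected Measure Id],
    1, [Energy Generated],
    2, [Energy Exported]
)
```

Put 'disconnected_table'[name] in the slicer. With nothing selected, MAX returns the highest id, so in this sketch the measure falls back to [Energy Exported].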
27. Value Binning
Used to group similar values, or to bin values for analysis aka Histograms
Can bin values based on equality, or inequality comparisons with the SWITCH() function
Use Cases:
Age groups
Product Groups
Any kind of frequency distributions
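A minimal binning sketch for the age-group use case, written as a calculated column. The 'Customer'[Age] column and the bin boundaries are hypothetical:

```dax
-- SWITCH ( TRUE (), ... ) returns the result for the first condition
-- that evaluates to TRUE, so the inequalities are checked in order
Age Group =
SWITCH (
    TRUE (),
    'Customer'[Age] < 18, "Under 18",
    'Customer'[Age] < 35, "18-34",
    'Customer'[Age] < 65, "35-64",
    "65+"
)
```

Counting rows by the resulting column gives the frequency distribution for a histogram.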
29. Summary
DAX is dynamic because you can write measures that correctly evaluate under their current Evaluation Context
Filter Context
Row Context
Functions are the building blocks of our measures and perform a myriad of tasks, e.g. altering Context, aggregating, logical operations, time intelligence, etc.
Time Intelligence functions require a Date table to operate
Understanding Tabular Data Modeling will go a long way towards helping you understand DAX
How many people have worked with Excel formulas?
How many people have worked with PowerPivot?
Story:
Introduced to data by my Dad (scary DBA types)
Began learning Relational Databases and attending SQL Saturdays
Indexing internals – Kalen Delaney
T-SQL – Kevin Boles
Merge Operators – Ami Levin
Developer/DBA
Really interested in BI (DW)
Realized my data analysis tool belt was lacking
So I decided to present on DAX
Introduction to DAX
Introduction to Measure and Analysis concepts
Introduction to Evaluation Context
Introduction to Measure and Calculated Column Patterns (which happen to often alter Evaluation Context)
“DAX is a language that allows us to write dynamic expressions for relational constructs, using familiar functions”
What is DAX?
Though it shares functions with the Excel formula language, it differs by being intended for analysis of relational data
There are some minor differences between PowerBI and PowerPivot, e.g. the colon: PowerPivot defines a measure with := whereas PowerBI uses =
Review CALCULATE in detail
We need to understand Evaluation Context so that we can begin altering it.
Filter Context
Row Context
Iterator Functions… not going to cover these much
Point out the parts of the measure syntax
What sales does the measure sum?
Sum of all sales
Why do the numbers change when we add color to the rows?
The measure still evaluates sum of all sales
However now, it is operating under the context of a color, and can only sum sales for that color!
The value of a formula depends on its context.
What we put in a context filters the subset of data we can measure, and so we call it the Filter Context!
Diagram View:
Already setup data model ( imported tables )
Created relationships between tables; DAX can traverse these without an explicit join
Data Grid:
Created simple SUM
Shows SUM of all Sales
Add Continent to Rows; filter context changes
Add 'Product'[Product Class] to Columns; filter context changes
Diagram View:
Create SUM that adds Asia sales | CALCULATE ( [Total Sales], 'Geography'[Continent] = "Asia" )
Show results
Create SUM that only adds Asia sales; returns blank for all others | CALCULATE ( [Total Sales], FILTER ( 'Geography', 'Geography'[Continent] = "Asia" ) )
Demo: show basic measures in PowerPivot then PowerBI
Demo: CALCULATE with a simple argument against Geography
To avoid calculating values for dates greater than the max date in the transactions table ('Output'), add a check that the minimum 'Date'[Date] <= maximum 'Output'[Date]
Can also replace the MAX ( 'Output'[Date] ) check with TODAY(), however this is not always valid as it assumes all data is “current”
Paste to exclude future dates:
IF ( MIN ( 'Date'[Date] ) <= CALCULATE ( MAX ( 'Output'[Date] ), ALL ( 'Output' ) ),
TOTALYTD function applies the expression for all data from the start of the year to the currently selected date in the filter context
Paste to exclude future dates:
IF ( MIN ( 'Date'[Date] ) <= CALCULATE ( MAX ( 'Output'[Date] ), ALL ( 'Output' ) ),
DIVIDE ( [Sales] - [Last Year Sales], [Last Year Sales] )
Paste to exclude future dates:
IF ( MIN ( 'Date'[Date] ) <= CALCULATE ( MAX ( 'Output'[Date] ), ALL ( 'Output' ) ),
How can we alter the filter context to only a single point in time?
Using CALCULATE to override the context
Measures using FIRSTDATE / LASTDATE will not return values if the date the period starts or ends on has no data
* Time intelligence functions in PowerBI require the DateKey to be a Date data type
* The expression argument in LASTNONBLANK must be wrapped in a CALCULATE, otherwise it will use the original filter context, and not the LASTNONBLANK column argument context
Set up target measures ( [Energy Exported] & [Energy Generated] )
User-defined table ( disconnected )
Value selection measure ( MAX ( 'disconnected_table'[id] ) )
Switching measure
Demo: Inventory Aging
In this case, I am using the Inventory Semi-Additive Measure as well