7. • Know the row-level granularity of your data
• Know what makes each row unique
The dimension fields represent the LoD of the data source. You cannot drill down further than this.
8. Dimensions determine the viz LoD. The viz LoD becomes less aggregated/more granular as more dimensions are added.
[Diagram: a spectrum running from totally aggregated to totally disaggregated (the granularity of the data source, below which you cannot go); adding dimensions moves the viz toward more granularity and less aggregation, while measures are aggregated.]
11. • Do you have all the dimensions you need? If so, there is no need to derive new dimensions.
• Does the new dimension need to contain only a subset of the data? If the subset needs to update automatically, create a computed (conditional) set; otherwise, create a manual set.
• Does the dimension need to be derived at row level? If so, create a row-level calculation; otherwise, create a FIXED LoD expression (see the sketch below).
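As a sketch of those last two branches, using Superstore-style fields ([Discount], [Customer ID], [Order Date]) purely for illustration:

  // Row-level dimension: each row can be classified on its own
  IF [Discount] > 0 THEN "Discounted" ELSE "Full Price" END

  // FIXED LoD dimension: derived at a level other than the row –
  // here, the date of each customer's first order
  { FIXED [Customer ID] : MIN([Order Date]) }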
12. • Does the measure need mixed levels of aggregation? Break it into sub-measures, each with a fixed level of granularity, then go back to Step 1 and work through the process for each sub-measure.
• Is the measure's level of aggregation the same as the dataset granularity? Create a row-level calculated field.
• Is the measure's aggregation at the same LoD as the viz LoD? Create an aggregated calculated field.
• Is the measure's aggregation less granular than the viz LoD? Create a FIXED or EXCLUDE LoD expression.
• Is the measure's aggregation more granular than the viz LoD? Create a table calculation or an INCLUDE LoD expression, depending on whether the result needs just one mark/value (see the sketches below).
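A minimal sketch of the four outcomes, again assuming Superstore-style fields for illustration:

  // Row-level calculated field (same LoD as the data source)
  [Profit] / [Sales]

  // Aggregated calculated field (same LoD as the viz)
  SUM([Profit]) / SUM([Sales])

  // Less granular than the viz: FIXED or EXCLUDE
  { EXCLUDE [Category] : SUM([Sales]) }

  // More granular than the viz: INCLUDE, re-aggregated per mark
  AVG({ INCLUDE [Product Name] : SUM([Sales]) })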
13.
14.
15. Worked example: discount by category. The flowchart repeats from 12, with this case resolving to a FIXED or INCLUDE LoD expression.
16. Worked example: discount by product category. The flowchart repeats from 12, with this case resolving to an INCLUDE LoD expression.
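The exact calcs were shown live; as an illustration only, assuming Superstore-style fields, the two examples might resolve to something like:

  // Discount by category, pinned to Category regardless of the viz
  { FIXED [Category] : AVG([Discount]) }

  // Discount computed per product, then re-aggregated to the viz LoD
  AVG({ INCLUDE [Product Name] : AVG([Discount]) })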
29. • It is easy to create a ton of LoD expressions
• It is very easy to overwhelm your data model
• It is all too easy to use the wrong LoD expression in your viz
33. Tableau's order of operations (evaluated top to bottom; the first steps run in the database, the last steps locally):
1. Extract Filters
2. Data Source Filters
3. Context Filters
4. FIXED Expressions Evaluated
5. Dimension Filters
6. INCLUDE/EXCLUDE Expressions Evaluated
7. Measure Filters
8. Local Filters (ATTR, geocoding)
9. Table Calc Filters
10. Hide
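Because FIXED expressions are evaluated before dimension filters, a filter can silently disagree with a FIXED calc. A sketch of the classic gotcha, with field names assumed for illustration:

  // Ignores an ordinary Region dimension filter;
  // add Region to context so the filter applies first
  { FIXED [State] : SUM([Sales]) }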
35. The pipeline from query to pixels:
• Query – query database, cache results
• Data – local data joins, local calculations, local filters, totals, forecasting, table calculations, 2nd-pass filters, sort
• Layout – lay out views, compute legends, encode marks
• Render – marks, selection, highlighting, labels
52. What can go wrong?
• Join – some rows might not match. What will happen?
• Join – there might be more than one match. Will you count things twice?
• Left join – it's asymmetric; do you have the right primary?
• Post-aggregate (blend) – it's already aggregated; is the combination of aggregates valid?
Demo from the workbook what these tools are and how they can help
No demo
Now that you know which dimensions are needed to answer your analytical question, the first step is to ensure that you have those dimensions. Use this framework to check that you do.
Options include:
• Sets – manual or computed
• Row-level calculation
• FIXED LoD expression
The reason we used a set rather than a filter is that filtering excludes those values entirely, so they can no longer be used in a calculation.
Show a couple of examples of us working through the framework
Demo building table calculations in v9.3 using INDEX(), and then the new improvements in v10
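A minimal sketch of the v9.3-era pattern, assuming a simple top-N use case:

  // Table calculation: position of each mark within its partition
  INDEX()

  // Used as a filter to keep, say, the top 10
  // (computed using SUM([Sales]) descending)
  INDEX() <= 10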
Show a quick example for a dimension or a measure, and for filtering on a FIXED LoD
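A quick sketch of filtering on a FIXED LoD, with field names assumed for illustration:

  // Keep only customers whose lifetime sales reach 1,000; because FIXED
  // is evaluated before dimension filters, the cohort stays stable
  { FIXED [Customer ID] : SUM([Sales]) } >= 1000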
Add comments to calculations
Add comments to fields by copying the description
Demo % of Total
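The % of total demo is typically a one-line table calculation, for example:

  // Each mark's share of the table total
  SUM([Sales]) / TOTAL(SUM([Sales]))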
Examples to discuss:
• Font – layout (lay out views)
• Filter – query, but could be data if it is a table calculation or a 2nd-pass filter
• Marks – selecting a mark is simply a render operation, but with action filters you could trigger the whole pipeline
Show an example of interactive vs. non-interactive interaction
One of the big ideas I want you to understand is that you can use the techniques of visualization to make your job of debugging easier and faster.
To do this I’m going to take an example from Mrunal’s LOD expressions.
A few minutes ago, Mrunal was making sure that his calc for profit ratio on categories matched the total calculation for profit ratio.
I’m going to build a visualization that shows both the new calc and the total.
If they align I know the calc is correct.
Do they align?
Yes. The calc is what I was expecting.
In this case, I was only trying to see if three numbers match, but the technique of using a visualization will scale up.
If there were 5 or 15 or 500 categories, I could tell in a glance if there was a difference between the bar and the reference line.
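A sketch of the two calcs being compared, assuming Superstore-style fields:

  // Per-category profit ratio, aggregated at the viz LoD
  SUM([Profit]) / SUM([Sales])

  // Overall profit ratio, one value for the whole data source,
  // usable as a reference line
  MIN({ FIXED : SUM([Profit]) / SUM([Sales]) })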
The next section is about joins.
In a normal join operation, you start with the columns on the left and augment them with columns from the right where you have a match on the join key.
In this example, the data set on the left has Sales for each State and the data set on the right has Population for each state.
When I join them by state, I’m going to end up with State, Region, Category, Sales and Profit, as well as Time Zone and Population.
For the first two rows, California, the data I get in Time Zone and Population is West and 38 million something.
Similar things happen for Colorado and Illinois.
For Missouri, Montana and Nevada, there’s just one row each.
For Texas and Washington it’s two rows each.
I can build a map of the sales per region.
(BTW, do you know how to create custom territories from existing geographic fields? Right-click the field → Geographic Role → Create From → select the appropriate field.)
I can easily build a map of sales per time zone.
Cool, my join added some useful data.
Let’s build another viz.
How many potential customers are there in each state?
Wait! How can that be?
There are only 38M people in California.
What’s going wrong?
Remember, as part of the join, Population gets put into two rows for California.
When I drag out Population, the default aggregation is SUM.
So my viz, which aggregates by state, sums all the Population values for each state.
That's wrong for 5 of the 8 states in this data set.
How would I detect this might be a problem?
Do a sanity check, and make it easy for yourself.
You can add reference lines.
You can build a dashboard with the original viz.
You might be tempted to use a different aggregation, but you will need to sanity check it repeatedly.
LODs usually work nicely.
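For example, a sketch of the LoD fix for the duplicated population rows (field names assumed):

  // One Population value per state, no matter how many joined rows;
  // aggregate it with MIN or AVG in the state-level viz
  { FIXED [State] : MIN([Population]) }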
Blending is almost magical in how it works. It very often does exactly what you want.
Another concern with joins is NULLs.
When you do a left or a right join there can be Nulls.
In this case, if I do a left join, the row for Oregon will get NULL for Region, Category, Sales and Profit.
If I do an inner join, I don’t even get a row for Oregon, but let’s stick to left joins for now.
Let’s look at Sales for the time zone.
If I didn’t already know there were NULLs for Oregon, how might I find out?
I can build a bar chart to look at how many records there are.
I can build a heat map to see where the NULLs are.
This works well even when there are lots and lots of records.
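A sketch of a calc that makes the NULLs easy to see in such a chart, using the [Sales] field from the example:

  // Flag rows that failed to match in the join
  IF ISNULL([Sales]) THEN "Missing" ELSE "Present" END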
As you are looking, there are questions you can ask yourself.
Blends are a post-aggregate left join.
An aggregate of aggregates is sometimes fine.
A sum of sums is valid.
A min of mins is valid.
An average of sums is not the same as the average of the underlying items.
An average of averages is not either, but it's often close and can fool you easily.
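A tiny worked example of the trap: if group A holds the values 1, 1, and 1 (average 1) and group B holds the single value 10 (average 10), the average of averages is (1 + 10) / 2 = 5.5, while the true average of the four underlying values is (1 + 1 + 1 + 10) / 4 = 3.25.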
Here’s an example that I was involved with.
You may ask why different data sources? Because I didn’t have control of all of them. They were from different departments.
You may ask why I didn’t use a cross data base join? Because when I started this, Tableau didn’t have the features.
These are my motivations.
How can I check for dirty data?
How can I check my assumptions?
I’ve been talking about blends for a while, but there are similar concepts that apply to joins.
Most of the time you don't expect many-to-many joins, and usually they are not the right thing.
However, sometimes the data surprises you and you get some unexpected joins.
In the Intern example, I did not expect Stephanie to have to mentor two interns.
Here's an example of how to detect that in the join case.
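One way to surface the surprise, sketched with hypothetical [Mentor] and [Intern] fields from the example:

  // TRUE for any mentor matched to more than one intern
  { FIXED [Mentor] : COUNTD([Intern]) } > 1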