How to Sharpen Your Investigative Analysis with PowerPivot

DAHub Europe 2015

  1. How to Sharpen Your Investigative Analysis with the New Excel (a PowerPivot intro). Carmen Mardiros, navabi GmbH. DA Hub 2015.
  2. What is Power Pivot?
      • Core component of the Microsoft self-serve BI stack: fast and intelligent data modelling for the Excel pro. The stack also includes PowerQuery (getting and cleaning data), PowerView (reporting) and PowerBI (online report publishing).
      • Integrated in Excel: in many ways it feels very familiar (especially if you use pivot tables and charts extensively).
      • It’s FREE: well, as long as you have Excel Professional Plus 2010 or 2013. Highly recommended: get the 2013 64-bit version.
  3. What today is about
      • How PowerPivot helps you become a better analyst: analytics tools can’t replace you, but they can help you become more efficient, unlock your true potential and get the recognition you deserve. Today is about lots of examples.
      • Ready-to-use formulas: there is not enough time to break every formula down or explain how PowerPivot works in detail, but I will explain how and when to use them, what to change about them, and what your data must look like for them to work.
      • Resources to develop your PowerPivot skills: a few titles to help you build on what you’ve learned today.
  4. After today, pivot tables will never look the same again. So what’s wrong with Excel anyway?
  5. 1. Excel can’t handle lots of data…
  6. … PowerPivot handles many millions of rows easily
  7. 2. Regular pivot calculated fields are very basic…
  8. … PowerPivot bends the “normal” pivot rules to its will
  9. 3. You must re-create the formatting every time you add a metric to a regular pivot… It takes 8 clicks to set the formatting for Transactions and 2 more to change the title. Remove it from the pivot and add it again? Start all over. Every. Single. Time.
  10. … in PowerPivot you change it once and the formatting stays the same
  11. Flash intro to Power Query
  12. PowerQuery: Getting multiple CSVs into PowerPivot. Connectors for many databases, Facebook, Salesforce, Hadoop, feeds, Excel files, CSV files etc., and very soon Google Analytics.
  13. PowerQuery: Getting multiple CSVs into PowerPivot. PowerQuery has its own language as well as an intuitive UI. We use a formula to keep only one header row from our folder of CSV files.
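  14. PowerQuery: Getting multiple CSVs into PowerPivot
          let
              // List every file in the folder
              Source = Folder.Files("C:\Users\moo\Desktop\dahub"),
              // Parse each file as CSV (code page 1252) and promote its first row to headers
              Tables = List.Transform(Source[Content], each Table.PromoteHeaders(Csv.Document(_, null, null, null, 1252))),
              // Append all the tables into one
              SingleTable = Table.Combine(Tables)
          in
              SingleTable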
  15. PowerQuery: Getting multiple CSVs into PowerPivot. CSV files are combined on the fly into a single table. NOTE: *All* CSV files must have the same structure.
  16. Load to PowerPivot
  17. Flash intro to PowerPivot
  18. Enable the add-in and it’s all systems go. Google “enable powerpivot addin”.
  19. The PowerPivot window is where the magic happens
  20. Has calculated columns like Excel, but that’s where the similarity ends. NOTE: avoid calculated columns unless you absolutely have to. They are costly in terms of performance because they are stored in memory.
  21. Portable “measures” are the unit of work for PowerPivot
          # Sessions:=SUM('dahub_sessions'[sessions])
      • # Sessions (the measure name): keep this explicit and eye-friendly
      • := is a special equal operator
      • 'dahub_sessions'[sessions]: full column reference that is being summarised
  22. Every measure is simply a building block
          % Conversion Rate:= [# Transactions]/[# Sessions]
  23. Why measures are so amazing
      • Allows you to build sophisticated formulas: each measure is made up of other measures. PowerPivot resolves all the dependencies and calculates them in the right order.
      • One change trickles through your entire reporting: if the field was named ‘visits’ and is now ‘sessions’, you make one change and all your measures update like magic.
      • Calculated on the fly, not stored in memory: until you actually use them in a pivot, they add no performance overhead. Maintainability heaven at no extra cost.
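      As a rough sketch of the building-block idea, the conversion rate from slide 22 can be assembled from a base measure plus the existing [# Sessions]. The slides never define [# Transactions], so the [transactions] column below is an assumed name; DIVIDE is used in place of the `/` operator so that a cell with no sessions shows a blank rather than an error.

      ```dax
      # Transactions:=SUM('dahub_sessions'[transactions])  -- [transactions] is a hypothetical column name

      % Conversion Rate:=DIVIDE([# Transactions], [# Sessions])  -- built only from other measures
      ```

      If the underlying column is later renamed, only the base measure needs to change; the ratio keeps working untouched.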
  24. Just drag to the Pivot and voilà
  25. Real-world examples
  26. DISTINCTCOUNT function magic!
          # Unique Campaigns:= DISTINCTCOUNT('dahub_sessions'[campaign_id])

          # Sessions per Campaign:= DIVIDE([# Sessions], [# Unique Campaigns])
  27. Which channels have a wide portfolio of active campaigns, and which have a very active narrow one? This is impossible to answer with regular pivot tables.
  28. Has traffic gone up or down today because we have fewer campaigns bringing in traffic?
  29. How many campaigns are bringing in a minimum of 1000 sessions each day? Mind-blowing.
  30. This is the formula… but don’t try to take it in yet.
          # Unique Campaigns min 1000 sessions:=
          CALCULATE(
              [# Unique Campaigns],
              FILTER(
                  VALUES('dahub_sessions'[campaign_id]),
                  [# Sessions] >= 1000
              )
          )
  31. First, PowerPivot sets the pivot coordinates and calculates the “base” measure [# Sessions].
  32. Then, *before* calculating [# Unique Campaigns], it adds an additional filter that keeps only the campaigns that fit the criteria.
  33. Let’s break the formula down…
          # Unique Campaigns min 1000 sessions:=
          CALCULATE(
              [# Unique Campaigns],
              FILTER(
                  VALUES('dahub_sessions'[campaign_id]),
                  [# Sessions] >= 1000
              )
          )
      1. Pivot coordinates are set and the underlying data is filtered accordingly.
      2. The additional FILTER is applied.
      3. Only *afterwards* is [# Unique Campaigns] calculated.
  34. What % of all campaigns are bringing in a minimum of 100 sessions each day?
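      One possible way to answer this, sketched under the assumption that the measures [# Unique Campaigns] and [# Sessions] from the earlier slides exist: reuse the CALCULATE + FILTER pattern with a threshold of 100, then divide by the unfiltered campaign count. Both measure names below are illustrative, not taken from the slides.

      ```dax
      # Unique Campaigns min 100 sessions:=
      CALCULATE(
          [# Unique Campaigns],
          FILTER(
              VALUES('dahub_sessions'[campaign_id]),
              [# Sessions] >= 100  -- threshold from the question on this slide
          )
      )

      % Campaigns min 100 sessions:=
      DIVIDE([# Unique Campaigns min 100 sessions], [# Unique Campaigns])  -- share within the current pivot cell
      ```

      For the “each day” reading, the date would need to be on pivot rows so that each cell covers a single day.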
  35. Variations
      • Campaigns / ad groups / keywords / landing pages, with a condition such as [# Sessions] >= 50: the size of your effectively active SEO/PPC portfolio and how that changes over time.
      • Use Cost per Conversion instead: if you have cost in your data, create a [£ Cost per Conversion] measure and swap [# Sessions] with it. Monitor the number of ad groups / keywords exceeding the maximum budget.
      • Combine multiple conditions in the FILTER:
            FILTER(
                VALUES('dahub_sessions'[keyword_id]),
                [£ Cost per Conversion] >= 50 && [# Clicks] >= 10
            )
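      To show where that FILTER fragment sits, here is a sketch of a complete measure. It assumes [£ Cost per Conversion] and [# Clicks] already exist in your model (the slides name them but do not define them); [# Unique Keywords] and the final measure name are hypothetical helpers that follow the [# Unique Campaigns] pattern.

      ```dax
      # Unique Keywords:=DISTINCTCOUNT('dahub_sessions'[keyword_id])  -- hypothetical helper measure

      # Unique Keywords over budget:=
      CALCULATE(
          [# Unique Keywords],
          FILTER(
              VALUES('dahub_sessions'[keyword_id]),
              [£ Cost per Conversion] >= 50 && [# Clicks] >= 10  -- both measures are assumed to exist
          )
      )
      ```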
  36. More variations…
      • Campaigns and channels bringing most of the high spenders: which channels or campaigns bring the highest number of transactions over a certain revenue threshold? (Requires a dataset with source_medium, campaign and transaction_id, and a measure [# Unique Transactions] created from transaction_id.)
      • Campaigns and channels bringing *predominantly* high spenders.
  37. Let’s break the formula down…
          # Unique Transactions min £500:=
          CALCULATE(
              [# Unique Transactions],
              FILTER(
                  VALUES('dahub_sessions'[transaction_id]),
                  [£ Transaction Revenue] >= 500
              )
          )
      NOTE: In the pivot you need source_medium and/or campaign on rows, and you need transaction_id in your data.
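      For the “predominantly high spenders” question, one plausible approach is to express the count above as a share of all transactions. This is a sketch: the ratio's name is illustrative, and [# Unique Transactions] is written the way slide 36 describes it. It assumes transaction_id is filled in on every row, because DISTINCTCOUNT counts a blank as one distinct value.

      ```dax
      # Unique Transactions:=DISTINCTCOUNT('dahub_sessions'[transaction_id])  -- helper measure described on slide 36

      % Transactions min £500:=DIVIDE([# Unique Transactions min £500], [# Unique Transactions])  -- illustrative name
      ```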
  38. Banding. The problem: too many unique values to analyse. The solution: create dynamic groups to “cluster” very granular data into a small number of groups.
  39. The nested IF way… (pseudocode: “contains” stands for a text search such as ISNUMBER(SEARCH(…)))
          =IF( [keyword_id] contains "<your brand name>", "brand",
              IF( [keyword_id] contains "not provided", "not provided",
                  IF( [keyword_id] contains "not set", "not set",
                      "generic" ) ) )
  40. The nested IF way in PowerPivot, using the SWITCH function. Nested IFs forever gone.
          =SWITCH(
              TRUE(),
              IFERROR(SEARCH("<your brand name>", 'dahub_sessions'[keyword_id]), -1) <> -1, "brand",
              IFERROR(SEARCH("not provided", 'dahub_sessions'[keyword_id]), -1) <> -1, "not provided",
              IFERROR(SEARCH("not set", 'dahub_sessions'[keyword_id]), -1) <> -1, "not set",
              "generic"
          )
  41. Which landing pages attract predominantly branded / generic traffic?
  42. CALCULATE is a super SUMIF. The single most powerful feature in PowerPivot.
          # Sessions Branded:=
          CALCULATE(
              [# Sessions],
              'dahub_sessions'[brand_group] = "brand"
          )

          # Sessions Non Branded:=
          CALCULATE(
              [# Sessions],
              'dahub_sessions'[brand_group] = "non brand"
          )
      The condition is a CALCULATE filter that gets added *before* [# Sessions] is calculated.
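      These two measures filter on a brand_group column holding the values "brand" and "non brand", which the slides do not define. A minimal sketch of such a calculated column, assuming it is two-valued and reuses the search from slide 40:

      ```dax
      =IF(
          IFERROR(SEARCH("<your brand name>", 'dahub_sessions'[keyword_id]), -1) <> -1,
          "brand",
          "non brand"  -- assumed label; it must match the text used in the CALCULATE filter exactly
      )
      ```

      If your column instead uses the four labels from slide 40, the non-branded filter would have to compare against those labels.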
  43. CALCULATE allows segmentation you could never do before. If a CALCULATE filter is on a column that is already in the pivot, the pivot’s own filter on that column is overridden. Remove the column from the pivot and the calculation still works!
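      The override is easier to see when the filter is written out in full. In DAX, a simple `column = value` argument to CALCULATE behaves like a FILTER over ALL of that column, which is why whatever the pivot has selected for that column is ignored. The measure name below is illustrative.

      ```dax
      # Sessions Branded (explicit):=
      CALCULATE(
          [# Sessions],
          FILTER(
              ALL('dahub_sessions'[brand_group]),  -- ALL ignores the brand_group selected by the pivot
              'dahub_sessions'[brand_group] = "brand"
          )
      )
      ```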
  44. CALCULATE filters have countless uses
      • Determine hidden biases in A/B testing: see if your variations had a comparable % of branded / non-branded traffic, which might skew the results. Also works with device, mobile traffic and any other dimension in your data.
      • Works best when you create custom “clusters” using SWITCH: the formulas work with any dimension in your dataset, but to really unlock CALCULATE’s filtering potential it pays to create custom calculated columns using SWITCH.
      • Create horizontal conversion funnels: CALCULATE is the essential building block for taking conversion funnel analysis to the next step.
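      For the A/B testing check, a sketch of the share of branded traffic, assuming [# Sessions Branded] from slide 42 exists and that your data has a column identifying the test variation to put on pivot rows (that column is not part of the slides):

      ```dax
      % Sessions Branded:=DIVIDE([# Sessions Branded], [# Sessions])  -- illustrative name; compare across variations
      ```

      Variations with clearly different percentages would suggest the traffic mix, rather than the change being tested, may explain part of the result.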
  45. Step 1. The right data for Horizontal Funnels. To get a Funnel Step column, create a segment for each step in your web analytics tool and export each one as a CSV file. Then import them into PowerPivot using Power Query and the multiple-CSV import method.
  46. Step 1. The right data for Horizontal Funnels. You need these segments: All Sessions (unsegmented), Category Pages, Products, Add to Basket, Basket, Secure Login, Address, Confirm Order, Payment.
  47. Step 2. Use CALCULATE on each funnel step. Create a new measure for each step in the funnel.
          # Sessions All:=
          CALCULATE(
              [# Sessions Funnel],
              'dahub_funnel'[funnel_step] = "All"
          )

          …

          # Sessions Payment:=
          CALCULATE(
              [# Sessions Funnel],
              'dahub_funnel'[funnel_step] = "Payment"
          )
  48. Step 2. Use CALCULATE on each funnel step. This allows you to create custom “goals” on the fly out of *ANY* segment.
  49. Step 3. Create ratios for each funnel step
      Use All Sessions as a base for division:
          % Sessions Address:= DIVIDE([# Sessions Address], [# Sessions All])
      Use the previous funnel step as a base for division:
          % Sessions Address progress:= DIVIDE([# Sessions Address], [# Sessions Secure Login])
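      The two ratios rely on step measures that the slides do not spell out. A sketch following the Step 2 pattern, with two loud assumptions: the base measure sums a hypothetical [sessions] column in 'dahub_funnel', and the funnel_step values match the segment names exactly.

      ```dax
      # Sessions Funnel:=SUM('dahub_funnel'[sessions])  -- assumed base measure; [sessions] is a hypothetical column

      # Sessions Secure Login:=
      CALCULATE(
          [# Sessions Funnel],
          'dahub_funnel'[funnel_step] = "Secure Login"  -- assumed to match the segment name in your export
      )

      # Sessions Address:=
      CALCULATE(
          [# Sessions Funnel],
          'dahub_funnel'[funnel_step] = "Address"  -- assumed to match the segment name in your export
      )
      ```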
  50. Step 4. Add measures to the Pivot and analyse. You can use *ANY* dimension available in your dataset on rows. Here it’s Landing Page, but you can use date, channel dimensions, device etc. You can EVEN add an additional segmentation level like user type (newly acquired, loyal etc).
  51. Resources. Best book for PowerPivot novices, with a gradual learning curve. By the end it gets pretty advanced: you learn about relationships, how to model multiple tables, time intelligence functions and much more.
  52. Resources. All-in-one reference with formulas for almost any scenario, all explained and broken down. You need a good understanding of PowerPivot to begin with, so don’t get this first.
  53. Questions?
  54. Bonus: Lifecycle metrics. Essential for comparing business entities (users, customers) as well as assets (content, landing pages, promos etc).
  55. Step 1. Find First Date for each landing page
          First Date Landing Page:=
          CALCULATE(
              MIN('dahub_sessions'[session_date]),
              ALL('dahub_sessions'[session_date]),
              VALUES('dahub_sessions'[landing_page_id])
          )
  56. Step 2. Find [# Sessions] on first day
          # Sessions in first day:=
          CALCULATE(
              [# Sessions],
              FILTER(
                  ALL('dahub_sessions'[session_date]),
                  'dahub_sessions'[session_date] = [First Date Landing Page]
              ),
              VALUES('dahub_sessions'[landing_page_id])
          )
  57. Step 3. Find [# Sessions] in first 7 days
          # Sessions in first 7 days:=
          CALCULATE(
              [# Sessions],
              FILTER(
                  ALL('dahub_sessions'[session_date]),
                  'dahub_sessions'[session_date] >= [First Date Landing Page]
                      && 'dahub_sessions'[session_date] < [First Date Landing Page] + 7  -- first date plus the 6 days after it
              ),
              VALUES('dahub_sessions'[landing_page_id])
          )
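      Once both lifecycle measures exist, they can be combined like any other building blocks. As one illustrative example (the measure name is not from the slides), the share of first-week sessions that arrived on launch day lets you compare landing pages regardless of when each one went live:

      ```dax
      % First-week sessions on first day:=
      DIVIDE([# Sessions in first day], [# Sessions in first 7 days])  -- illustrative name; blank when the page has no sessions
      ```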