
Slides for automate or die (presentation)


Slides for webinar Automate or Die: 5 ways to massively improve quant researcher productivity, presented by Matthew Steele on 17 July 2017

Published in: Data & Analytics

  1. 1. MATT STEELE PRESENTS: AUTOMATE OR DIE. If you have any questions, enter them into the Questions field. Questions will be answered at the end. If we do not have time to get to your question, we will email you. We will email you a link to the video, slides, and data. Get a free one-month trial of Q from www.q-researchsoftware.com
  2. 2. Why do we need to think about automation via software? 2
  3. 3. The pressures facing researchers: more data; shorter turnaround; tighter budgets; clients can DIY. How to set yourself apart as a researcher?
  4. 4. What automation can deliver ✓ Increase productivity (because you do things faster) ✓ Lower costs (saving sweat and tears) ✓ Opportunities for new analysis (help you be more effective) ✓ Higher quality • Avoiding human errors • Automating expertise • More time to think and play ✓ Avoiding non-value-add work
  5. 5. “Software is eating the world” (Marc Andreessen)
  6. 6. How software can play a role in data setup, analysis and reporting THE FOCUS OF TODAY 6
  7. 7. Agenda • Introduction • 8 areas of opportunity for automation in market research: (1) Automated data checking/cleaning/tidying/coding/updating; (2) Automatic identification of tables that contain interesting results; (3) Using standardized residuals to highlight interesting cells on a table; (4) Automatic updating of analyses with new data; (5) Automatic charting; (6) Automatic writing of PowerPoint slides; (7) Automatic updating of PowerPoint slides; (8) Using Dashboards for self-service • Live Q&A session
  8. 8. Key point to realise upfront: automation is not black and white. The automation continuum runs from “Blood, sweat, & tears” to “Turn-key” automation.
  9. 9. What degree of automation do we want from software? Stages: Data setup, Analysis, Reporting. The jobs to be done: extracting the data; data tidying/cleaning/coding/variables; new brands/options/questions*; crosstabs/analysis; constructed tables (e.g., brand health); updating charts; updating tables in reports; updating text. Options along the automation continuum (“Blood, sweat, & tears” to “Turn-key” automation): analysis automation; automated charting; PowerPoint automation; automated reporting; automatically-updated dashboards that export to PowerPoint. *Cannot be entirely automated
  10. 10. The automation continuum, from “Blood, sweat, & tears” to “Turn-key” automation:
      • No automation (software: N/A): tables in Excel, lots of cutting and pasting.
      • Checklists/QA processes (no software): human effort.
      • Scripting (SPSS, VBA for Excel): computer code is written to automate how a program works (e.g., SPSS Syntax).
      • Computer programming (VB, R, and Python, and their programmers): new apps are written in computer code designed to work on multiple projects.
      • Self-documenting point-and-click automation (Q): an app is used that does the heavy lifting without the user having to write any code, except for unusual things. Code not mandatory (can use buttons and menu items). Clear, accessible, linked record of what was done in the graphical user interface (GUI).
  11. 11. The automation continuum (same five stages as slide 10, from No automation to Self-documenting point-and-click automation): THE FOCUS OF TODAY
  12. 12. The automation continuum, from “Blood, sweat, & tears” to “Turn-key” automation, extended:
      • No automation (software: N/A): tables in Excel, lots of cutting and pasting.
      • Checklists/QA processes (no software): human effort.
      • Scripting (SPSS, VBA for Excel): computer code is written to automate how a program works (e.g., SPSS Syntax).
      • Computer programming (VB, R, and Python, and their programmers): new apps are written in computer code designed to work on multiple projects.
      • Self-documenting point-and-click automation (Q): an app is used that does the heavy lifting without the user having to write any code, except for unusual things.
      • Self-service (SurveyMonkey, Qualtrics, Displayr): an app that allows the users to completely DIY (avoids need for suppliers altogether).
      • Artificial intelligence (nobody): magic.
  13. 13. AUTOMATE OR DIE: 8 WAYS TO AUTOMATE QUANT RESEARCH. Automated data checking/cleaning/tidying; Using standardized residuals to highlight interesting cells on a table or chart; Automatic identification of tables that contain interesting results; Automatic updating of analyses with new data; Automatic charting; Automatic writing of PowerPoint slides; Automatic updating of PowerPoint slides; Using Dashboards for self-service
  14. 14. The jobs to be done. Stages: Data setup, Analysis, Reporting. Jobs: extracting the data; data tidying/cleaning/coding/variables; new brands/options/questions*; crosstabs/analysis; constructed tables (e.g., brand health); updating charts; updating tables in reports; updating text
  15. 15. You have a lot of dirty laundry to do with survey data: outliers; flatliners; survey skips; HTML stuck in labels; empty variable labels; binary variables; missing values; speedsters; survey loops; variable type
  16. 16. Automated data setup: CHECKING SURVEY SKIPS Survey skips 16
  17. 17. Checking total hole counts

      Question 1 - Awareness
      Brand          %      n
      Coca-Cola     100%   600
      Diet Coke     100%   600
      Coke Zero      94%   563
      Pepsi         100%   600
      Diet Pepsi     78%   470
      Pepsi Max      93%   559

      Question 5 – Brand attitude (Row %)
      Brand         Hate   Dislike   Neutral   Like   Love   Base n
      Coca-Cola      4%      7%       15%      32%    42%     600
      Diet Coke     14%     29%       23%      11%    24%     600
      Coke Zero     13%     19%       24%      17%    26%     563
      Pepsi          9%     12%       34%       8%    38%     600
      Diet Pepsi    17%     24%       42%       5%    12%     470
      Pepsi Max     12%     16%       30%      18%    24%     559
  18. 18. Checking – cross-tabbing questions. Example cross-tab: Brand Attitude - Diet Pepsi by Q1 - Aware (Diet Pepsi)

      Brand Attitude - Diet Pepsi (n)    Q1 - Aware (Diet Pepsi)
                                         Not Aware    Aware
      Hate                                   0          81
      Dislike                                0         112
      Neutral                                0         198
      Like                                   0          23
      Love                                   0          56
      Column n                               0         470
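The cross-tab check on this slide can also be scripted. The following is a minimal Python sketch, not the presenter's method. It assumes each respondent is a dict with the hypothetical keys `aware_diet_pepsi` (a boolean) and `attitude_diet_pepsi` (a rating label, or `None` when the question was skipped); neither name comes from the survey file.

```python
from collections import Counter

def skip_violations(respondents, filter_key, answer_key):
    """Return respondents whose answers contradict the skip logic.

    A violation is either an answer from someone who should have skipped
    the question, or a missing answer from someone who should have seen it.
    """
    bad = []
    for r in respondents:
        should_answer = bool(r[filter_key])
        answered = r[answer_key] is not None
        if should_answer != answered:
            bad.append(r)
    return bad

def crosstab(respondents, row_key, col_key):
    """Count respondents in each (row value, column value) combination."""
    return Counter((r[row_key], r[col_key]) for r in respondents)

# Hypothetical respondents, for illustration only.
sample = [
    {"id": 1, "aware_diet_pepsi": True, "attitude_diet_pepsi": "Like"},
    {"id": 2, "aware_diet_pepsi": False, "attitude_diet_pepsi": None},
    {"id": 3, "aware_diet_pepsi": False, "attitude_diet_pepsi": "Hate"},
    {"id": 4, "aware_diet_pepsi": True, "attitude_diet_pepsi": None},
]

violations = skip_violations(sample, "aware_diet_pepsi", "attitude_diet_pepsi")
print([r["id"] for r in violations])  # [3, 4]
```

A clean file gives an empty list, which corresponds to the all-zero "Not Aware" column in the cross-tab.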
  19. 19. Checking – using a Sankey diagram 19
  20. 20. Introducing the SANKEY diagram 20
  21. 21. How different software can create a Sankey diagram, along the automation continuum. Computer programming: code (example: https://cran.r-project.org/web/packages/riverplot/). Self-documenting point-and-click automation: Create > Charts > Visualization
  22. 22. Automated data setup: CHECKING FOR FLATLINERS. [Chart: flat-liners on an attitude battery, scale 0 to 7. Statements: I drink cola whenever I can; Cola is for people with unhealthy diets; I try to avoid cola whenever possible; Cola drinks are the best drinks around; I would live on cola if I could; Cola is destroying our kids’ health; Cola is only appropriate for parties; I need a cola when I eat pizza]
  23. 23. The steps required to figure out who’s flat-lining: a) Identify variables that belong to the same scale question set. b) Compute which respondents gave flat-line answers on any scale point. c) Optional: additional computations to see if they were flat-lining elsewhere in the survey. d) How to then delete them from the data? An additional deletion stage?
  24. 24. Example of SPSS Code to detect flat-liners

      compute sd.brand = sd( vara1 to vara9 ).
      compute sd.att1 = sd( varb1 to varb8 ).
      compute sd.att2 = sd( varc1 to varc5).
      * More batteries potentially.
      freq sd.brand sd.att1 sd.att2
        /format=notable
        /statistics
        /percentiles= 5 10 25 75 95 .
      compute d.brand = ( sd.brand < 0.5).
      compute d.att1 = ( sd.att1 < 0.5).
      compute d.att2 = ( sd.att2 < 0.5).
      count flatlines = d.brand to d.att2 (1).
      freq flatlines.
      * Remove those with 60% or more flatlined batteries.
      compute clean.sample = (flatlines <= 1).
      exe.
      filter by clean.sample.
      * Check across y.
      use all.
      format flatlines clean.sample (F1.0).
      * split file by qcountry2.
      set tnumbers = labels.
      ctables
        /table clean.sample > flatlines by y
        /CATEGORIES VARIABLES=flatlines EMPTY=EXCLUDE
        /CATEGORIES VARIABLES=clean.sample [1, 0] EMPTY=INCLUDE
        /table flatlines [COLPCT.COUNT] by y.
      set tnumbers = both.
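For comparison, the same idea can be sketched in Python: flag a battery as flat-lined when the standard deviation of its answers falls below 0.5, then keep respondents with at most one flat-lined battery. This is an illustrative sketch rather than a translation of the SPSS run. The data layout (a dict of batteries per respondent) and all names are assumptions, and missing values are not handled.

```python
from statistics import stdev

SD_THRESHOLD = 0.5   # same cut-off as the SPSS example
MAX_FLATLINED = 1    # keep respondents with at most one flat-lined battery

def count_flatlined(batteries, threshold=SD_THRESHOLD):
    """Count batteries whose answers have a sample standard deviation below threshold."""
    return sum(1 for answers in batteries.values() if stdev(answers) < threshold)

def clean_sample(respondents, max_flatlined=MAX_FLATLINED):
    """Keep respondents who flat-lined on no more than max_flatlined batteries."""
    return [
        r for r in respondents
        if count_flatlined(r["batteries"]) <= max_flatlined
    ]

# Hypothetical respondents with three batteries each.
respondents = [
    {"id": 1, "batteries": {"brand": [1, 5, 3, 2, 4], "att1": [2, 2, 5, 1, 4], "att2": [3, 1, 4]}},
    {"id": 2, "batteries": {"brand": [3, 3, 3, 3, 3], "att1": [4, 4, 4, 4, 4], "att2": [1, 5, 2]}},
    {"id": 3, "batteries": {"brand": [2, 2, 2, 2, 2], "att1": [1, 3, 5, 2, 4], "att2": [2, 4, 3]}},
]

kept = clean_sample(respondents)
print([r["id"] for r in kept])  # [1, 3]
```

Respondent 2 flat-lines on two of three batteries and is dropped; respondent 3 flat-lines on only one and is kept, matching the `flatlines <= 1` rule.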
  25. 25. Detecting flatliners along the automation continuum (Scripting, Computer programming, Self-documenting point-and-click automation): • Syntax • QScript: “Project Setup - Identify Questions with Straight-lining”
  26. 26. Automated data setup: other opportunities along the automation continuum? Specialized for survey analysis: library of scripts built in; existing scripts can be modified and/or new bespoke scripts can be done DIY. Generalist analytic package: DIY code; additional code/syntax can be found on the web and tweaked, or you can write your own custom code from scratch.
  28. 28. A striking table Standardised residuals help identify interesting results! 28
  29. 29. Standardized residuals can be applied to any table with numbers

      Column %      18 - 29   30 - 39   40 - 49   50 +
      Coca-Cola       55%       53%       28%     38%
      Diet Coke        3%       15%       14%     11%
      Coke Zero       16%       16%       23%     17%
      Pepsi            5%        8%       22%      6%
      Diet Pepsi       1%        0%        3%      6%
      Pepsi Max       18%        7%       11%     23%
  30. 30. Which do you prefer?

      Version with arrows:
      Column %      18 - 29   30 - 39   40 - 49   50 +
      Coca-Cola      55% ↑     53% ↑     28% ↓    38%
      Diet Coke       3% ↓     15%       14%      11%
      Coke Zero      16%       16%       23%      17%
      Pepsi           5%        8%       22% ↑     6%
      Diet Pepsi      1%        0%        3%       6% ↑
      Pepsi Max      18%        7% ↓     11%      23% ↑

      Version with column letters (column names: A = 18 - 29, B = 30 - 39, C = 40 - 49, D = 50+):
      Column %      18 - 29   30 - 39   40 - 49   50+    Letters shown
      Coca-Cola       55%       53%       28%     38%    C d C d
      Diet Coke        3%       15%       14%     11%    a a a
      Coke Zero       16%       16%       23%     17%
      Pepsi            5%        8%       22%      6%    A b D
      Diet Pepsi       1%        0%        3%      6%    a b
      Pepsi Max       18%        7%       11%     23%    b B c
  31. 31. Standardized residual: $Z = \dfrac{\text{Observed \%} - \text{Expected \%}}{\sqrt{\text{Expected \%}\,(1 - \text{Column Total \%})\,(1 - \text{Row Total \%})/N}}$. Does the cell differ significantly from “expectation”? For p < 0.05, the Z-statistic for the cell will be greater than +1.96 or less than -1.96.
  32. 32. Formatting can be applied to the original tables. We can then transplant that same colour coding and/or arrows back to the original column-% table.

      Column %      18 - 29   30 - 39   40 - 49   50 +
      Coca-Cola      55% ↑     53% ↑     28% ↓    38%
      Diet Coke       3% ↓     15%       14%      11%
      Coke Zero      16%       16%       23%      17%
      Pepsi           5%        8%       22% ↑     6%
      Diet Pepsi      1%        0%        3%       6% ↑
      Pepsi Max      18%        7% ↓     11%      23% ↑

      z-Statistic   18 - 29   30 - 39   40 - 49   50 +
      Coca-Cola       3.3 ↑     2.5 ↑    -3.7 ↓   -2.2
      Diet Coke      -3.2 ↓     2.0       1.5       .0
      Coke Zero       -.5       -.6       1.5      -.3
      Pepsi          -1.9       -.4       4.9 ↑   -2.0
      Diet Pepsi     -1.7      -2.2        .0      3.5 ↑
      Pepsi Max       1.0      -3.2 ↓    -1.7      3.3 ↑
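As a worked illustration of the formula on slide 31, here is a short Python sketch that computes a Z-statistic for every cell of a table of counts and flags cells beyond ±1.96. It assumes the percentages in the formula are shares of the table total N, and that no row or column total is 0 or equal to N (otherwise the denominator is zero). The counts are hypothetical, since the slides show percentages only.

```python
from math import sqrt

def standardized_residuals(counts):
    """Z-statistic for each cell of a table of counts (a list of rows)."""
    n = sum(sum(row) for row in counts)
    row_share = [sum(row) / n for row in counts]        # Row Total %
    col_share = [sum(col) / n for col in zip(*counts)]  # Column Total %
    z = []
    for i, row in enumerate(counts):
        z_row = []
        for j, cell in enumerate(row):
            observed = cell / n                      # Observed %
            expected = row_share[i] * col_share[j]   # Expected % under independence
            se = sqrt(expected * (1 - col_share[j]) * (1 - row_share[i]) / n)
            z_row.append((observed - expected) / se)
        z.append(z_row)
    return z

def flag(z, critical=1.96):
    """Arrow for each cell: up or down when |Z| exceeds the critical value."""
    return [
        ["↑" if v > critical else "↓" if v < -critical else "" for v in row]
        for row in z
    ]

# Hypothetical 2 x 2 table of counts.
counts = [[30, 10], [10, 30]]
z = standardized_residuals(counts)
flags = flag(z)
print([[round(v, 2) for v in row] for row in z])  # [[4.47, -4.47], [-4.47, 4.47]]
```

For a 2 x 2 table the squared Z equals the Pearson chi-square statistic (here 20), which is a quick sanity check on the arithmetic.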
  33. 33. The new-school table? 33
  34. 34. The new-school table? 34
  35. 35. 35
  36. 36. The new-school table? Yes please. 36
  37. 37. How different software can compute/format standardized residuals, along the automation continuum (Scripting, Computer programming, Self-documenting point-and-click automation): • Table of standardized residuals (via menu) • Instant Z-statistics in table • Auto-formatting of tables based on the Z-statistic • Table of standardized residuals (via code) • Formatting with specific functions (e.g., mosaic())
  38. 38. Performing the calculation in SPSS 38
  39. 39. Performing the calculation in SPSS. Formatting then with: SPSS Output Management System
  40. 40. Performing the calculation in R 40
  41. 41. Formatting in R via the mosaic() function 41
  42. 42. Formatting in R via the formattable() function 42
  44. 44. 44
  45. 45. 45
  46. 46. Screen out the “junk” tables: from ALL the tables to tables with significant results ONLY (& potentially ranked by significance)
  47. 47. How different software can screen tables automatically (based on statistical significance): • QScript (remove uninteresting tables) • Built-in function (Smart Tables). Placed on the automation continuum (“Blood, sweat, & tears” to “Turn-key” automation): Scripting, Computer programming, Self-documenting point-and-click automation
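To make the screening idea concrete, here is a hedged Python sketch that keeps only tables whose Pearson chi-square statistic exceeds the 5% critical value, then ranks them. It is not how QScript or Smart Tables work internally; it is a generic illustration. The table names and counts are hypothetical, only tables with 1 to 6 degrees of freedom are supported, and every expected count is assumed to be non-zero.

```python
# 5% critical values of the chi-square distribution, by degrees of freedom.
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}

def chi_square(counts):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    n = sum(sum(row) for row in counts)
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(col) for col in zip(*counts)]
    stat = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(counts) - 1) * (len(counts[0]) - 1)
    return stat, df

def interesting_tables(tables):
    """Return (name, statistic) for significant tables, largest statistic first.

    Sorting by the raw statistic is only a rough ranking; when tables differ
    in size, a p-value would be the better sort key.
    """
    kept = []
    for name, counts in tables.items():
        stat, df = chi_square(counts)
        if stat > CHI2_CRITICAL_05[df]:
            kept.append((name, stat))
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Hypothetical tables of counts.
tables = {
    "Preference by region": [[20, 20], [20, 20]],
    "Preference by age": [[25, 15], [15, 25]],
    "Preference by gender": [[30, 10], [10, 30]],
}

for name, stat in interesting_tables(tables):
    print(name, round(stat, 1))
```

The "Preference by region" table has no association at all and is screened out; the other two are kept and ordered by strength of evidence.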
  49. 49. (1) Segment reworking; (2) Product reworking; (3) Data revisions; (4) Tracking. Stages: Data setup, Analysis, Reporting
  50. 50. A key difference in reporting automation: Scripting vs. Self-documenting point-and-click. The automation continuum (“Blood, sweat, & tears” to “Turn-key” automation): No automation; Checklists/QA processes; Scripting; Computer programming; Self-documenting point-and-click automation; Self-service; Artificial intelligence
  51. 51. Scripting approach: Example from SPSS Two potential problems: ➢ Opportunity for misspecification ➢ Opportunity for miscommunication 51
  52. 52. What if our data file updates? What if we got an updated data file from our supplier that had an extra quarter? E.g., “Cola Tracking – Jan to Sept”.sav -->> “Cola Tracking – Jan to Dec”.sav. Options on the automation continuum (No automation; Checklists/QA processes; Scripting; Computer programming; Self-documenting point-and-click automation; Self-service; Artificial intelligence): re-run syntax; update data
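One way to picture "re-run on updated data" in code is to write the analysis as a function of the data file, so that a new file only requires calling the function again. The Python sketch below assumes the tracking data has been exported to CSV with the hypothetical columns `quarter` and `brand` (the slides use .sav files, which the standard library cannot read); the file and column names are illustrative only.

```python
import csv
from collections import Counter, defaultdict

def preference_by_quarter(path):
    """Percentage of respondents choosing each brand, per quarter, from a CSV file."""
    counts = defaultdict(Counter)
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            counts[row["quarter"]][row["brand"]] += 1
    return {
        quarter: {
            brand: round(100 * n / sum(brand_counts.values()))
            for brand, n in brand_counts.items()
        }
        for quarter, brand_counts in counts.items()
    }

# When the supplier sends a file with an extra quarter, nothing is rewritten;
# the same function is called on the new file (hypothetical file name):
# shares = preference_by_quarter("cola_tracking_jan_to_dec.csv")
```

The design choice is that every table or chart downstream reads from the function's result, so an extra quarter shows up as one more key rather than as manual rework.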
  54. 54. 4 ways to think about automatic charting (trading off quality and speed): #1 Chart template files; #2 Templating within apps; #3 Specific charting tools; #4 Bespoke visualizations. Tools named: OfficeReports, e-Tabs Graphique, think-cell; D3
  55. 55. 55
  56. 56. #1 Chart template files. An easy win in PowerPoint: chart template files (the .crtx files)
  57. 57. TECHNICAL ELEMENTS OF PPT: Charts in PowerPoint offer greater updatability than tables. [Chart and table: Coke, Pepsi, Fanta, Sprite by age group (18-30, 31-45, 45+). Values shown: 40%, 25%, 15%, 20%; 30%, 45%, 15%, 10%; 20%, 50%, 20%, 10%]
  58. 58. #2 Templating within apps. Reference: https://www.ibm.com/support/knowledgecenter/en/SSLVMB_23.0.0/spss/tutorials/outputtut_msapps.html
  59. 59. #3 Specific charting tools: more automation with greater upfront work. Example: Plotly package
  60. 60. AUTO CHARTING #3 & 4: Programmed charts. #3 Specific charting tools, example: Plotly package. #4 Bespoke visualizations, example: “Palm Trees” (within Q)
  61. 61. AUTO CHARTING #4: New charts. The four approaches on the quality and speed axes: chart template files; templating within data analysis apps; specific charting tools; bespoke visualizations
  63. 63. Manually charting in PowerPoint is a chore 63
  64. 64. Some considerations about PowerPoint o PowerPoint is not a uniform whole – composition of parts o “Automatic” ≠ 100% automation o PowerPoint has limitations – can only be “asked” to do stuff o One-step or multi-step process? o Replication and preservation of your design o Alterable in PowerPoint after export/update? 64
  65. 65. Automatic writing of PowerPoint charts, along the automation continuum (“Blood, sweat, & tears” to “Turn-key” automation). Scripting: syntax (static, limited). Computer programming: code (static, limited). Self-documenting point-and-click automation: point and click to use • PowerPoint template (.pptx file) • Chart template (.crtx file)
  66. 66. Automatic writing of PowerPoint charts

      Temporary.
      Select if d1 <= 30.
      FREQUENCIES VARIABLES=d2 d3 d4
        /ORDER=ANALYSIS.

      Then click File > Export and choose to export to PowerPoint.
  67. 67. Automatic writing of PowerPoint charts 67
  68. 68. Automatic writing of PowerPoint charts (reporteR) 68
  69. 69. 69
  70. 70. The automation continuum (“Blood, sweat, & tears” to “Turn-key” automation). Scripting: computer code is written to automate how a program works (e.g., SPSS Syntax). Computer programming: new apps are written in computer code designed to work on multiple projects. Self-documenting point-and-click automation: an app is used that does the heavy lifting without the user having to write any code, except for unusual things. Self-service: an app that allows the users to completely DIY (avoids need for suppliers altogether)
  72. 72. The jobs to be done: Reporting. Stages: Data setup, Analysis, Reporting. (1) Segment reworking; (2) Product reworking; (3) Data revisions; (4) Tracking
  73. 73. Data setup Analysis Reporting 73
  74. 74. 74
  76. 76. Using Dashboards for self-service 76
  77. 77. A couple of concluding thoughts. AGENDA | AUTOMATE OR DIE
  78. 78. ✓ Opportunities for new analysis ✓ Increase productivity ✓ Lower costs ✓ Higher quality ✓ Avoiding non-value-add work. Planning for retirement; Surviving; Transforming; Dreaming. The automation continuum (“Blood, sweat, & tears” to “Turn-key” automation): No automation; Checklists/QA processes; Scripting; Computer programming; Self-documenting point-and-click automation; Self-service; Artificial intelligence
