Working With Big Data

  1. Seth Familian Founder + Principal, Familian&1 WORKING WITH BIG DATA FOLLOW ALONG! familian1.com/wwbd
  2. INTRODUCTION SETH FAMILIAN FOUNDER + PRINCIPAL, FAMILIAN&1 Corporate Strategy · User Experience Design · Creative Procraftination · Teaching + Education · Growth Hacking
  3. AGENDA ‣ Context: What’s big data? ‣ Building dashboards ‣ Useful tools ‣ Inferring segments ‣ Final thoughts
  4. WORKING WITH BIG DATA CONTEXT: WHAT’S BIG DATA?
  5. CONTEXT: WHAT’S BIG DATA? WELCOME TO DATA OBESITY! http://www.datasciencecentral.com/profiles/blogs/basic-understanding-of-big-data-what-is-this-and-how-it-is-going
  6. CONTEXT: WHAT’S BIG DATA? HOW BIG IS BIG? http://www.domo.com/blog/2013/05/the-physical-size-of-big-data/ in 1 year! creates enough data to fill
  7. CONTEXT: WHAT’S BIG DATA? BIG IN GROWTH, TOO. http://www.infosysblogs.com/brandedge/2013/04/20130419Infographc.html https://studentforce.wordpress.com/2013/09/21/umuc-big-data-revolution-is-here/
  8. CONTEXT: WHAT’S BIG DATA? 9 SOURCES https://studentforce.wordpress.com/2013/09/21/umuc-big-data-revolution-is-here/
  9. CONTEXT: WHAT’S BIG DATA? 6 TYPES { "created_at": "Thu Sep 15 16:29:08 +0000 2016", "id": 776457834095644700, "id_str": "776457834095644672", "text": "I love @glip because it makes me more productive and reliant on far fewer tools! #gliplove #goglip #gliptastic :)", "truncated": false, "entities": { "hashtags": [ { "text": "gliplove", "indices": [ 82, 91 ] }, { "text": "goglip", "indices": [ 92,
  10. CONTEXT: WHAT’S BIG DATA? 6 TYPES
  11. CONTEXT: WHAT’S BIG DATA? 6 TYPES
  12. CONTEXT: WHAT’S BIG DATA? THE FOUR V’S http://www.slideshare.net/gschmutz/ukoug2013-big-datafastdata 9 Data Sources · 6 Data Types
  13. CONTEXT: WHAT’S BIG DATA? COHESIVE ASSESSMENT https://datafloq.com/read/understanding-sources-big-data-infographic/338
  14. WORKING WITH BIG DATA BUILDING DASHBOARDS FROM BIG DATA
  15. BUILDING DASHBOARDS SMALL DATA EXAMPLE via
  16. USEFUL TOOLS SPLUNK SPLUNK.COM FOR ANY MACHINE DATA
  17. BUILDING DASHBOARDS BIG DATA VISUALIZATION Raw “Header” File: 26M rows · 250K affiliate IDs · 28 sub-channels
  18. BUILDING DASHBOARDS THE “OLD SCHOOL” APPROACH EXTRACT · TRANSFORM · LOAD · Raw “Header” File · “Long” Data · Affiliates Lookup File · Update Loop · Summary Index · saved searches · scheduled searches · Instantly Generates Channel Dashboards
  19. BUILDING DASHBOARDS ANATOMY OF A SPLUNK SEARCH
  20. BUILDING DASHBOARDS SCHEDULED SEARCHES + INDICES by sourceoforder by af_type by af_source by af_name by af_name + ppc_s SUMMARY INDEX INDEX_MAIN_SOURCES INDEX_A.COM INDEX_AFFILS INDEX_PAID_SEARCH INDEX_SHOPPING_ENGINES 12M+ TRANSACTIONS FULL DB METRICS SAVED SEARCHES DATA DELTAS METRICS DATA DELTAS METRICS DATA DELTAS METRICS DATA DELTAS METRICS DATA DELTAS
  21. BUILDING DASHBOARDS NESTED CHARTS + SMALL MULTIPLES
  22. BUILDING DASHBOARDS SEGMENT SEGMENT.COM FOR DATA ROUTING
  23. BUILDING DASHBOARDS CLEARBIT CLEARBIT.COM FOR DATA ENRICHMENT
  24. BUILDING DASHBOARDS SEGMENT-CLEARBIT ENRICHMENT RAW DATA · EXTRACT · Core Website · Enrichment of email addresses
  25. USEFUL TOOLS MIXPANEL MIXPANEL.COM FOR USER-EVENT DATA
  26. USEFUL TOOLS MIXPANEL ANALYTICS VIA INSIGHT-SCREENSHOT REPS
  27. MARKETING + OPS INTERCOM A BETTER CRM INTERCOM.IO
  28. BUILDING DASHBOARDS AUTOMATED BIG DATA FLOW (EXAMPLE 1) RAW DATA · EXTRACT · TRANSFORM · LOAD · Traffic Sources & Session Stats · Behavioral Segments, Funnels, Retention & LTV · Additional aggregation and data refinement · Core Website · Social Engagement Footprint · Unified social footprint metrics · Enrichment of email addresses · CRM data store for easy segmentation + analysis · Additional context on Twitter followers · More flexible segments, funnels + retention metrics
  29. BUILDING DASHBOARDS AUTOMATED BIG DATA FLOW (EXAMPLE 2) RAW DATA · EXTRACT · TRANSFORM · LOAD · Custom dashboards synced with 70+ APIs · Traffic Sources & Session Stats · Realtime (RT) · Core Website · Social Engagement Footprint · heavy-duty query tools already in place · App Databases · Custom aggregation scripts · Postgres or Redshift DB · Daily Pull · Internally-reported metrics summarized for triangulation · Daily CSV · Behavioral segmentation + in-app messaging · RT · Behavioral Segments, Funnels, Retention & LTV · RT · Unified social footprint metrics · Unified app downloads & ratings metrics · App Store Activity · email address enrichment
  30. BUILDING DASHBOARDS AUTOMATED BIG DATA FLOW (EXAMPLE 3) RAW DATA · EXTRACT · TRANSFORM · LOAD · Custom dashboards synced with 70+ APIs · Traffic Sources & Session Stats · Realtime (RT) · Core Website · MongoDB · Custom aggregation scripts · MySQL Presence Table · Internally-reported metrics summarized for triangulation · Weekly CSV · Behavioral segmentation + in-app messaging · RT · Behavioral Segments, Funnels, Retention & LTV · RT · Unified app downloads & ratings metrics · App Store Activity · email address enrichment · Daily Dump · Instant dashboards for Intercom
  31. BUILDING DASHBOARDS MORE ADVANCED AUTOMATION
  32. AUTOMATION ZAPIER ZAPIER.COM FOR COMPLEX ACTIONS
  33. MARKETING + OPS TYPEFORM GORGEOUS + POWERFUL SURVEYS
  34. AUTOMATION EXAMPLE WITH INTERCOM + TYPEFORM ZAPIER
  35. AUTOMATION
  36. MIND MELTED YET? LET’S TAKE 15. BREAK TIME! ‣ Stretch your legs ‣ Hydrate or grab a snack ‣ We’ll start again in 15 mins!
  37. Seth Familian Founder + Principal, Familian&1 WORKING WITH BIG DATA FOLLOW ALONG! familian1.com/wwbd
  38. WORKING WITH BIG DATA USEFUL TOOLS FOR BIG DATA
  39. USEFUL TOOLS A BUSY LANDSCAPE
  40. USEFUL TOOLS LET’S SIMPLIFY
  41. USEFUL TOOLS AND LET’S REFRAME IT EVENT-BASED ANALYTICS +TEXTUAL VISUAL ANALYTICS + INSIGHT PROCESSING + NORMALIZATION DATA TRANSFORMATION (ETL) ACTIVITY MODALITY DATA DISPLAY + DASHBOARDING STATISTICAL ANALYTICS VISUAL ANALYTICS
  42. USEFUL TOOLS POWER PLAYERS EVENT-BASED ANALYTICS +TEXTUAL VISUAL ANALYTICS + INSIGHT PROCESSING + NORMALIZATION VISUAL ANALYTICS STATISTICAL ANALYTICS DATA TRANSFORMATION (ETL) DATA DISPLAY + DASHBOARDING ACTIVITY MODALITY
  43. USEFUL TOOLS SPLUNK SPLUNK.COM FOR ANY MACHINE DATA
  44. USEFUL TOOLS SPLUNK SPLUNK.COM FOR ANY MACHINE DATA
  45. USEFUL TOOLS CHARTED CHARTED.CO FOR SUPER SIMPLE CHARTS
  46. USEFUL TOOLS C3.JS C3JS.ORG FOR CUSTOM CHARTING
  47. USEFUL TOOLS TAGUL TAGUL.COM FOR GORGEOUS WORD CLOUDS
  48. USEFUL TOOLS QUID QUID.COM FOR UNSTRUCTURED ANALYSIS
  49. USEFUL TOOLS QUINTLY QUINTLY.COM FOR SOCIAL MEDIA DATA
  50. USEFUL TOOLS GOOGLE ANALYTICS ANALYTICS.GOOGLE.COM FOR WEBSITE TRAFFIC
  51. USEFUL TOOLS GOOGLE SHEETS SHEETS.GOOGLE.COM
  52. USEFUL TOOLS GOOGLE SHEETS SHEETS.GOOGLE.COM
  53. USEFUL TOOLS GECKOBOARD GECKOBOARD.COM
  54. USEFUL TOOLS KLIPFOLIO KLIPFOLIO.COM
  55. WORKING WITH BIG DATA INFERRING SEGMENTS FROM BIG DATA
  56. INFERRING SEGMENTS FROM BIG DATA RFM SCORING 1,164,927 customers. 1. All customers are independently ranked into equal-sized “tiles” three times over: Recency (R) Ranking 1 2 3 4 5, Frequency (F) Ranking 1 2 3 4 5, Monetary (M) Ranking 1 2 3 4 5. 2. M scores are multiplied by 100 and F scores are multiplied by 10 to create unique ranking values: M 100 200 300 400 500, F 10 20 30 40 50, R 1 2 3 4 5. 3. MFR scores are added up for each customer to yield 125 unique MFR segments, from 111 (most recent, frequent, and highest-value customers) to 555 (least recent, frequent, and lowest-value customers).
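
The three scoring steps on slide 56 can be sketched in plain Python. This is only an illustration under stated assumptions: the field names (`total_spent`, `orders`, `days_since_last_order`), the helper names, and the five sample customers are hypothetical and do not come from the deck, and tile 1 is assumed to hold the best customers on each measure.

```python
def quintile_ranks(values, higher_is_better=True, tiles=5):
    """Rank values into equal-sized tiles; tile 1 is assumed to be the best."""
    # Sort positions so the best value comes first.
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0] * len(values)
    for position, index in enumerate(order):
        # Map the sorted position onto tiles 1..tiles of equal size.
        ranks[index] = position * tiles // len(values) + 1
    return ranks


def mfr_scores(customers):
    """Combine the three independent rankings into one MFR score per customer."""
    m = quintile_ranks([c["total_spent"] for c in customers])
    f = quintile_ranks([c["orders"] for c in customers])
    # Fewer days since the last order means a more recent customer.
    r = quintile_ranks([c["days_since_last_order"] for c in customers],
                       higher_is_better=False)
    # M x 100 + F x 10 + R, as described on the slide.
    return {c["id"]: 100 * mi + 10 * fi + ri
            for c, mi, fi, ri in zip(customers, m, f, r)}


# Hypothetical sample data, for illustration only.
customers = [
    {"id": "a", "days_since_last_order": 3, "orders": 40, "total_spent": 5000.0},
    {"id": "b", "days_since_last_order": 30, "orders": 12, "total_spent": 900.0},
    {"id": "c", "days_since_last_order": 10, "orders": 5, "total_spent": 1500.0},
    {"id": "d", "days_since_last_order": 200, "orders": 2, "total_spent": 300.0},
    {"id": "e", "days_since_last_order": 400, "orders": 1, "total_spent": 50.0},
]
scores = mfr_scores(customers)
print(scores)
```

Because each digit of the score is one of the three rankings, a segment such as 323 can be read directly as M3, F2, R3.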
  57. INFERRING SEGMENTS FROM BIG DATA WHY QUANTILES? NORMAL DISTRIBUTION · SKEWED-DISTRIBUTION
  58. INFERRING SEGMENTS FROM BIG DATA F + R = LOYALTY INSIGHTS Grid: X11 X21 X31 X41 X12 X22 X32 X42 X13 X23 X33 X43 X14 X24 X34 X44 X15 X25 X35 X45 X51 X52 X53 X54 X55. Axes: High Frequency to Low Frequency, High Recency to Low Recency. Quadrants: Still Loyal · Once Loyal · New · Old
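
One way to read the F + R grid on slide 58 is as a two-by-two lookup. The sketch below assumes that frequent and recent customers are "Still Loyal", frequent but not recent are "Once Loyal", recent but not frequent are "New", and the rest are "Old". The slide does not state where "high" ends, so the `high_cutoff` parameter (ranks 1 and 2 count as high) is a hypothetical choice, as is the function name.

```python
def loyalty_segment(f_rank, r_rank, high_cutoff=2):
    """Map Frequency and Recency ranks (1 = best) onto a loyalty quadrant.

    high_cutoff is an assumed threshold, not a value taken from the deck.
    """
    frequent = f_rank <= high_cutoff
    recent = r_rank <= high_cutoff
    if frequent and recent:
        return "Still Loyal"
    if frequent:
        return "Once Loyal"
    if recent:
        return "New"
    return "Old"


# Walk the four corners of the 5 x 5 grid.
for f_rank, r_rank in [(1, 1), (1, 5), (5, 1), (5, 5)]:
    print(f_rank, r_rank, loyalty_segment(f_rank, r_rank))
```

Keeping the cutoff as a parameter makes it easy to test how sensitive the segment sizes are to where the line is drawn.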
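  59. INFERRING SEGMENTS FROM BIG DATA M = VALUE INSIGHTS Average total spent ($) by new MFR quantiles, rerun for non-outlier M1 + M2 customers (chart axis: $0 $1,500 $3,000 $4,500 $6,000). M1 (top 20% of M1+2): segment size 93,134, avg. $ spent $3,337, total $ spent $345,234,826, 53% of total revs. M2 (2nd 20%): segment size 93,139, avg. $ spent $1,137, total $ spent $105,573,528, 32% of total revs. M3 (3rd 20%): segment size 92,861, avg. $ spent $642, total $ spent $59,348,459, 18% of total revs. M4 (4th 20%): segment size 93,406, avg. $ spent $412, total $ spent $38,398,553, 11% of total revs. M5 (bottom 20%): segment size 93,143, avg. $ spent $276, total $ spent $25,537,936, 8% of total revs. High-Value Customers · Low-Value Customers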
  60. INFERRING SEGMENTS FROM BIG DATA COMBINING INSIGHTS High Value Customers Low Value Customers Still Loyal Once Loyal New Old M1 M2 M3 M4 M5 111 121 112 122 113 123 211 221 212 222 311 321 312 322 411 421 412 422 511 521 512 522 114 124 115 125 213 223 214 224 215 225 313 323 314 324 315 325 413 423 414 424 415 425 513 523 514 524 515 525 131 132 141 142 151 152 231 232 241 242 251 252 331 332 341 342 351 352 431 432 441 442 451 452 531 532 541 542 551 552 133 134 143 144 153 154 135 145 155 233 234 243 244 253 254 235 245 255 333 334 343 344 353 354 335 345 355 433 434 443 444 453 454 435 445 455 533 534 543 544 553 554 535 545 555 1 2 3 4 5 6 7 8
  61. INFERRING SEGMENTS FROM BIG DATA PRICE-POINT CUTOFFS Best Camera/Lens Purchased, by Segment Name and Relationship to Photography. Memory Keepers (use cameras to record family memories and milestones): DSLR Body less than $650, DSLR Lens less than $300, DSLR Body + Lens less than $950, Point-and-Shoot less than $450. Hobbyists (enjoy the picture-taking process; understand and use camera controls): DSLR Body $650 - $1725, DSLR Lens $300 - $750, DSLR Body + Lens $950 - $2300, Point-and-Shoot $450 - $700. Prosumers (advanced skills, but do not make a living from photography): DSLR Body $1725 - $2750, DSLR Lens $750 - $3000, DSLR Body + Lens $2300 - $4200, Point-and-Shoot $700 - $2500. Pros (rely on photography as a profession): DSLR Body $2750+, DSLR Lens $3000+, DSLR Body + Lens $4200+, Point-and-Shoot $2500+.
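
The price-point cutoffs on slide 61 amount to a lookup of the best purchase price against three boundaries per product type. The sketch below uses the dollar figures from the slide. The function name is hypothetical, and treating each boundary price as belonging to the higher segment (for example, exactly $650 counts as Hobbyist) is an assumption, since the slide's ranges share their endpoints.

```python
from bisect import bisect_right

SEGMENTS = ["Memory Keepers", "Hobbyists", "Prosumers", "Pros"]

# Boundaries between adjacent segments, in dollars, per product type.
CUTOFFS = {
    "DSLR Body": [650, 1725, 2750],
    "DSLR Lens": [300, 750, 3000],
    "DSLR Body + Lens": [950, 2300, 4200],
    "Point-and-Shoot": [450, 700, 2500],
}


def photography_segment(product_type, best_price):
    """Return the segment implied by the best camera/lens price purchased.

    Assumption: a price exactly on a boundary falls into the higher segment.
    """
    # bisect_right counts how many boundaries are <= best_price.
    return SEGMENTS[bisect_right(CUTOFFS[product_type], best_price)]


print(photography_segment("DSLR Body", 499.00))
print(photography_segment("DSLR Lens", 1200.00))
print(photography_segment("Point-and-Shoot", 2500.00))
```

Keeping the boundaries in a table separate from the lookup means a revised cutoff only touches one number.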
  62. INFERRING SEGMENTS FROM BIG DATA CROSSING RFM W/ CATEGORIES 1. Aggregate 72 months of Internet channel transaction data, organizing by key variables (2,246,094 Internet Channel transactions · 1,164,927 customers · 807 categories). 2. Isolate the top customers and categories by total dollars spent, frequency, and recency (RFM) measures (465,683 top customers account for 88% of sales · 9 key categories account for 81% of sales). 3. Cross-Tabulate top customers and categories to create behavioral and loyalty-based segments: Memory Keepers 1-8, Hobbyists 9-16, Prosumers 17-24, Professionals 25-32, each numbered across Low Value (Still Loyal, Once Loyal, New, Old) and then High Value (Still Loyal, Once Loyal, New, Old). 4. Generate segment-specific marketing recommendations which can be further targeted by brand. YIELDS SOLID TARGETS FOR TACTICAL PLANNING
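
The cross-tabulation on slide 62 numbers its 32 segments by category, then value tier, then loyalty. A minimal sketch of that numbering and of a segment count follows. The function names and the three sample customers are hypothetical; only the category, value, and loyalty labels and their order come from the slide.

```python
from collections import Counter

CATEGORIES = ["Memory Keepers", "Hobbyists", "Prosumers", "Professionals"]
VALUE_TIERS = ["Low Value", "High Value"]
LOYALTY = ["Still Loyal", "Once Loyal", "New", "Old"]


def segment_number(category, value_tier, loyalty):
    """Return the 1-32 segment number in the order shown on the slide."""
    return (CATEGORIES.index(category) * len(VALUE_TIERS) * len(LOYALTY)
            + VALUE_TIERS.index(value_tier) * len(LOYALTY)
            + LOYALTY.index(loyalty)
            + 1)


def cross_tabulate(customers):
    """Count customers per segment number."""
    return Counter(segment_number(*customer) for customer in customers)


# Hypothetical customers, already labeled on all three dimensions.
sample = [
    ("Memory Keepers", "Low Value", "Still Loyal"),
    ("Hobbyists", "Low Value", "Still Loyal"),
    ("Hobbyists", "Low Value", "Still Loyal"),
]
counts = cross_tabulate(sample)
print(counts)
```

Each count can then be paired with a segment-specific recommendation, which is the fourth step on the slide.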
  63. INFERRING SEGMENTS FROM BIG DATA DASHBOARD INTEGRATION
  64. INFERRING SEGMENTS FROM BIG DATA REPORTING ≠ STRATEGY!
  65. WORKING WITH BIG DATA FINAL THOUGHTS
  66. FINAL THOUGHTS A NEW TYPE OF KNOWLEDGE WORKER http://www.doclens.com/87922/think-issue-7-2014/
  67. FINAL THOUGHTS AN INCREDIBLY VALUABLE SKILL https://studentforce.wordpress.com/2013/09/21/umuc-big-data-revolution-is-here/
  68. FINAL THOUGHTS THE CORNERSTONE OF A DAUNTING FUTURE? https://studentforce.wordpress.com/2013/09/21/umuc-big-data-revolution-is-here/
  69. FINAL THOUGHTS DATA AS INTERFACE for using Made Visual BACKGROUND TITLES + BUTTONS TEXT + LINES Your data + brand up to 100,000 objects Anywhere on the Web using 1 line of code
  70. FINAL THOUGHTS DATA AS INTERFACE for using
  71. FINAL THOUGHTS DATA AS INTERFACE
  72. FINAL THOUGHTS START HERE CHARTED.CO
  73. FINAL THOUGHTS OR HERE SEGMENT.IO
  74. FINAL THOUGHTS OR HERE MIXPANEL.COM
  75. FINAL THOUGHTS OR HERE HBR.ORG
  76. FINAL THOUGHTS OR HERE
  77. DISCUSSION TIME WORKING WITH BIG DATA QUESTIONS · FEEDBACK · IDEAS · INSIGHTS
  78. THANK YOU KEEP IN TOUCH! SETH@FAMILIAN1.COM · @SETHFAM1