Analytics at Carbonite: presentation to Snowplow Meetup Boston April 2016


Overview of analytics at Carbonite and the role played by Snowplow, given by Robert Johnson in April 2016

Published in: Data & Analytics

  1. PROJECT HOLOCRON: Carbonite Analytics Platform Overview
  2. ROBERT JOHNSON, Director, Analytics Platform
  3. THE VISION: A brief intro of where we started and where we wanted to go…
  4. November 2014 - State of Carbonite Analytics • SQL Server Warehouse (Death Star) • Numerous Pipelines • Replication from Production Systems • Reporting Systems • Cognos • Tableau • Digital Analytics • IBM Coremetrics • Tealium • C3 Metrics
  5. We needed more… • Which of our marketing campaigns are effective? • Where should we be allocating our marketing spend? • What are the weak points in the customer acquisition lifecycle? • What features are customers using in our products? • How do we optimize web, product, and mobile workflows? What works? • What are our customers doing in our products? • How do we connect the digital analytics world with our traditional BI Warehouse? • We needed a scalable, cost-effective solution • We wanted a lambda architecture (batch and stream processing) • We wanted to use AWS • We didn’t want a proprietary internal system • We wanted to use and create as much open source technology as possible • Do we build or buy?
  6. Project Holocron - Build a modern Data Warehouse solution • Our research • Hortonworks / Hadoop • AWS EMR, Kinesis, Redshift • re:Invent, Hadoop Summit, etc. • We found the winning infrastructure • AWS • Looker • Snowplow • Redshift
  7. OUR JOURNEY: How this awesome open source project propelled us forward…
  8. 2015, Q1 - Onboarding • Goals • Get Snowplow up and running • Get Web data flowing (tracking beacons) • Set up Looker • Create proof-of-concept sample reports • How we achieved our quick wins • Managed Snowplow Hosting (1 week!) • Deployed Beacon to all of our sites • Worked with Looker and Snowplow to create a new Event model in Looker • Created basic page view and session reports
  9. 2015, Q2 - Digging In • Goals • Replace Coremetrics • Create a web marketing attribution framework • Create a stable operations platform • How we achieved our quick wins • Created dozens of reports in Looker based on customer requirements • Took advantage of Snowplow’s built-in Web Events (page views, link clicks, etc.) • Worked with our CMO to create a best-in-class marketing measurement framework (PCT) • Started managing links in Excel (yuck) • Migrated Redshift to a separate AWS prod account (protecting corporate-side data) • Cluster management with Ansible and CloudFormation • Ansible management of IAM • Implemented a Blacklist
  10. 2015, Q3 - Adoption Hurdles • Goals • Solve our Looker adoption issues • Get Link attribution info into Redshift • Standardize Event and Tag management across projects • Support Cart, Form Tracking, and Custom Events • Ensure Operational Integrity of Platform • How we achieved our quick wins • Created a Django API (Viceroy) for managing and storing PCT attributed links • Updated API to support Blacklist management • Set aside a strike team to sit with Marketing to help communication and adoption • Used Viper to standardize all of the analytics libraries we use (Google, HotJar, Optimizely, Tealium, etc.) • Used Viper to provide a standardized API for Custom Iglu events • Created Operational processes to watchdog our data (with Looker Reports)
  11. 2015, Q4 - Adoption Hurdles • Goals • Provide Link Management capabilities as a self-service utility to Marketing • Find a Cost Management solution for AWS • Create a means of increasing confidence in our platform data • Find a utility that will help us with more complex ETL tasks such as Click Streaming and Data Ingestion • How we achieved our quick wins • We released Alpha and Beta of Project Viceroy Link Builder using ReactJS/Redux • We chose CloudHealth for cost management; it’s awesome • We created a prototype weekly “Ion Cannon” email to help us determine what we want to automate later • We implemented Databricks so that we could perform advanced analytics using Spark
  12. 2016 and beyond… • Goals • Implement a system for monitoring our Marketing Tags for performance issues and auditing • Instrument our Products with Event Tracking • Convert many of our Looker PDTs to Spark / EMR • How we achieved our quick wins • We’re implementing TagInspector Realtime and Scanner • Viper 2.0 for Endpoint and all sites • Android Tracker for Mobile • Custom Events and Contexts for all! • We’re implementing Informatica Cloud • Working with the Snowplow team to customize the Enrichment process to use Spark • Databricks for Dev and Test, EMR for Prod
  13. Project Viper • The Analytics Team “Tag” • Decouple our efforts from the Web Teams • Single place to manage all of the various Analytics Tags • Single line of JavaScript for all of our needs • An Open Source Dev Side “Tag Manager” • To be Open Sourced in 2016 • Event Driven Framework • Built in JS ES6/2015 • Custom NPM Modules for Viper Plugins • Snowplow Support • Tealium Support • TagInspector Support • … and more
  14. Project Viceroy • Web Attribution Management • Built on our PCT Framework • Marketing Manager friendly, easy to use • Manage Marketing Attribution • REST API • ReactJS Frontend • Create Placements • Ad Server Templates (Marin, DCM, etc.) • Email Systems (Responsys, ExactTarget, etc.) • Direct Links • links • Standardized Attribution • No more typos • No more missing parameters • No more malformed links
  15. THANK YOU! QUESTIONS? Robert Johnson, Director, Analytics Platform
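
The Project Viceroy slide promises standardized attribution: no more typos, missing parameters, or malformed links. A minimal Python sketch of that kind of validation follows. It is not Viceroy's actual code, and the slides do not list the PCT framework's fields, so the UTM-style parameter names, the `REQUIRED_PARAMS` tuple, and the `build_attributed_link` function are all hypothetical stand-ins.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical required fields; the real PCT parameters are not given in the slides.
REQUIRED_PARAMS = ("utm_source", "utm_medium", "utm_campaign")


def build_attributed_link(base_url: str, params: dict[str, str]) -> str:
    """Return base_url with validated, normalized attribution parameters appended."""
    parts = urlsplit(base_url)
    # Reject malformed links: an http(s) scheme and a host are both required.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"malformed base URL: {base_url!r}")

    # Reject links with missing (or blank) attribution parameters.
    missing = [name for name in REQUIRED_PARAMS if not params.get(name, "").strip()]
    if missing:
        raise ValueError(f"missing attribution parameters: {', '.join(missing)}")

    # Normalize values so "Newsletter " and "newsletter" attribute identically.
    cleaned = {name: value.strip().lower() for name, value in params.items()}

    # Keep any query string already on the URL, then append the parameters
    # in sorted order so the same inputs always produce the same link.
    query = parse_qsl(parts.query, keep_blank_values=True) + sorted(cleaned.items())
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


if __name__ == "__main__":
    print(
        build_attributed_link(
            "https://www.example.com/backup",
            {"utm_source": " Newsletter ", "utm_medium": "email", "utm_campaign": "Spring Sale"},
        )
    )
```

Generating every link through one function like this, rather than by hand in a spreadsheet, is what lets a tool enforce the rules at creation time instead of discovering bad links later in the warehouse.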