Why Big Query is so Powerful - Trusted Conf

A basic introduction to BigQuery: how it works and what it can do. It then looks at a BigQuery use case that combines Google Analytics and CRM data to create a powerful remarketing list.

Published in: Data & Analytics

  1. Why BigQuery is so Powerful
  2. What we’ll cover ● Introduction to BigQuery ● How BQ works ● Use Case 1 - Integrate with CRM data to get meaningful insights ● Use Case 2 - Create a remarketing list using BQ ML
  3. Part 1: What is BigQuery? Introduction to BigQuery
  4. What is BigQuery? ▹ An enterprise data warehouse ▹ Super-fast SQL queries ▹ Leverages Google's infrastructure ▹ Cheap, pay-on-demand solution (easy to scale)
  5. No hardware, easy to set up ▹ You can have a full data warehouse running within minutes, with virtually ZERO ongoing operational overhead. BigQuery sits somewhere in between
  6. So exactly how fast?
  7. A query uses over 3000 CPU cores, hundreds of disks and a 330Gb network to run a regex over 4TB of data in under 30 secs.
  8. And... how cheap?
  9. Why is it so powerful for marketers? ▹ We usually have limited dev resources to configure and maintain a data warehouse ▹ Works seamlessly with Google products ▸ Data visualisation (Data Studio) ▸ Machine learning (BQML*, Cloud AutoML) ▹ BIG PLUS: access to raw Google Analytics, AdWords and YouTube data* (if you have a $150k/year Google 360 subscription)
  10. Part 2: How does BigQuery work? Introduction to BigQuery
  11. BigQuery Structure
  12. Loading Data
  13. Loading Data
  14. Functions & Operators: BigQuery supports both 1. Legacy SQL* 2. Standard SQL. *Not all functions are supported in both dialects. A prominent example is SELECT DISTINCT, which legacy SQL does not support (users need GROUP BY or PARTITION BY instead), while Standard SQL does. https://cloud.google.com/bigquery/docs/reference/standard-sql/
  15. GA Data in BigQuery ▹ A new table per day, "ga_sessions_YYYYMMDD" ▹ Each row represents one session ▹ Session data is nested and stored in one massive table; the nested fields work more like arrays/subsets. Tips: - Query by day/month/year - Unnest only the “subsets” of data needed
  16. GA Data in BigQuery
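
The two tips above (query by date range, unnest only what is needed) could be sketched as a small query builder. This is an illustrative sketch, not the speaker's code: the function name and the `my-project.my_dataset` location are hypothetical, while `_TABLE_SUFFIX`, `UNNEST`, `fullVisitorId`, `visitId` and `hits.page.pagePath` come from BigQuery Standard SQL and the GA export schema.

```python
from datetime import date


def ga_sessions_query(dataset, start, end):
    """Build a Standard SQL query over the daily ga_sessions_YYYYMMDD tables.

    `dataset` is a hypothetical "project.dataset" location of the GA export.
    """
    if start > end:
        raise ValueError("start must not be after end")
    # The wildcard table plus _TABLE_SUFFIX limits the scan to the requested
    # days; UNNEST(hits) flattens only the nested field we actually need.
    return (
        "SELECT date, fullVisitorId, visitId, hit.page.pagePath AS page_path\n"
        f"FROM `{dataset}.ga_sessions_*`, UNNEST(hits) AS hit\n"
        f"WHERE _TABLE_SUFFIX BETWEEN '{start:%Y%m%d}' AND '{end:%Y%m%d}'"
    )


print(ga_sessions_query("my-project.my_dataset", date(2019, 7, 1), date(2019, 7, 31)))
```

Restricting the table suffix matters for cost as well as speed, since on-demand queries are billed by the data they scan.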
  17. Marketing use case 1: Integrate with CRM data to obtain meaningful data
  18. Case Study: a recruitment company that relies heavily on offline activity to qualify incoming leads (applicants). Problems: ▹ Misalignment in business metrics (P&L) - digital marketing and business metrics are not aligned ▹ Attribution - marketing channels and campaigns cannot be accurately attributed to each placement
  19. Solution ▹ An end-to-end bespoke dashboard to show the right data: ▸ How marketing channels contribute to each placement (sale) & to quality candidates ▸ How offline and online activities work together
  20. Prerequisites ▹ You (or your client) are on Google 360 premium to access raw Google Analytics data ▹ Basic Standard SQL ▹ A configured integration between Google Analytics and the CRM (Customer Relationship Management)
  21. Configure GA & CRM integration ▹ User/client ID https://support.google.com/analytics/answer/7584446?hl=en ▹ Use the data layer to send the CRM user ID back to GA so it can be matched to particular sessions
  22. Approach: 1. Integrate GA & CRM 2. Import CRM data into BigQuery 3. Join GA session data & CRM data using the unique identifier 4. Attribute GA sessions 5. Connect with Data Studio
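
The join-and-attribute steps can be illustrated in plain Python on a few made-up rows. Everything here is an assumption for illustration: the sample data, the field names, and in particular the last-touch attribution rule, since the slides do not say which attribution model was used. In practice the same logic would run as a SQL join inside BigQuery.

```python
from collections import Counter

# Hypothetical GA sessions, each carrying the CRM user ID sent via the data layer.
ga_sessions = [
    {"crm_user_id": "u1", "date": "20190701", "channel": "Organic Search"},
    {"crm_user_id": "u1", "date": "20190705", "channel": "Paid Search"},
    {"crm_user_id": "u2", "date": "20190703", "channel": "Email"},
    {"crm_user_id": "u3", "date": "20190704", "channel": "Direct"},
]
# Hypothetical CRM export: whether each applicant ended up being placed.
crm_placements = [
    {"crm_user_id": "u1", "placed": True},
    {"crm_user_id": "u2", "placed": True},
    {"crm_user_id": "u3", "placed": False},
]


def placements_by_last_channel(sessions, placements):
    """Join CRM rows to GA sessions on the shared ID, crediting the last session."""
    last_session = {}
    # YYYYMMDD strings sort chronologically, so later sessions overwrite earlier ones.
    for session in sorted(sessions, key=lambda s: s["date"]):
        last_session[session["crm_user_id"]] = session
    counts = Counter()
    for row in placements:
        session = last_session.get(row["crm_user_id"])
        if row["placed"] and session is not None:
            counts[session["channel"]] += 1
    return dict(counts)


print(placements_by_last_channel(ga_sessions, crm_placements))
```

The resulting per-channel counts are the kind of table a Data Studio dashboard would then visualise.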
  23. Marketing use case 2: Create remarketing & lookalike audiences using BigQuery ML
  24. Case Study. Sales problem: ▹ Need a better way to prioritise leads: the business is offline-sales oriented and the lead volume per month is enormous (tens of thousands). Marketing problems: ▹ Struggle to improve ROI of remarketing efforts ▹ Lack of understanding of the traits of high-value converters
  25. Process overview: prepare, extract, train, evaluate (repeat)
  26. 1 - Collect & create datasets ▹ Started with supervised learning ▹ Created datasets that required a lot of data cleaning and labelling ▹ Tried to obtain as many data points as possible, incl. email, sales consultant performance, seasonality (e.g. school holidays)
  27. What data is available in GA that we can use? - Desktop_flag - Tablet_flag - Mobile_flag - Total time on site - Total count of sessions - Sum of morning visits - Sum of daytime visits - Sum of time spent on Product Page * - Sum of time spent on resource page - Channel coming through first touch - Channel coming through last touch - Count of pages visited - Days difference since the first session - Sum hits number
  28. 2 - Clean and categorise data: Verify validity and clean the data to make it ready to process. Look out for data quality problems like: ▹ Missing data ▹ Invalid or inconsistent data types ▹ Distribution bias or uniformity. Tips: ▹ Use the IF function to handle missing values ▹ Use CASE statements to convert and transform data into dummy variables
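
The two tips above can be sketched as helpers that generate the corresponding Standard SQL expressions. The helper names and output aliases are hypothetical; `totals.timeOnSite` and `device.deviceCategory` are fields in the GA export schema, and the flag names mirror the Desktop_flag, Tablet_flag and Mobile_flag features listed earlier.

```python
def fill_missing(expression, alias, default=0):
    """SQL expression that replaces a NULL value with `default` (the IF tip)."""
    return f"IF({expression} IS NULL, {default}, {expression}) AS {alias}"


def dummy_flags(expression, categories):
    """One CASE expression per category, producing 0/1 dummy variables."""
    return [
        f"CASE WHEN {expression} = '{category}' THEN 1 ELSE 0 END AS {category}_flag"
        for category in categories
    ]


columns = [fill_missing("totals.timeOnSite", "total_time_on_site")]
columns += dummy_flags("device.deviceCategory", ["desktop", "tablet", "mobile"])
print(",\n".join(columns))
```

Generating the expressions from a list keeps the dummy variables consistent when a new category has to be added.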
  29. 3 - Cluster & categorise different groups of users: Based on our analysis and business context, we clustered the data into 4 main categories: 1. Not interested, non-converters 2. Interested, non-converters 3. Not yet interested, converters 4. Interested, converters
  30. 4 - Create & train model: Create a model for each group of users: ▹ BQ has an amazing integration with RStudio & Python (using code lab) ▹ Alternatively, we can use BigQuery ML (using standard SQL functions!) ▸ Linear regression ▸ Logistic regression (in this case) ▸ Multiclass logistic regression for classification https://cloud.google.com/bigquery/docs/bigqueryml-intro
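
For the BigQuery ML route, training a logistic regression model is a single `CREATE MODEL` statement. The sketch below only builds that statement as a string. The model name, training table, `converted` label and `fullVisitorId` column are hypothetical placeholders, not names from the talk; `model_type = 'logistic_reg'` and `input_label_cols` are BigQuery ML options.

```python
def create_model_sql(model, training_table, label="converted", id_column="fullVisitorId"):
    """Build a BigQuery ML CREATE MODEL statement for logistic regression."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS(model_type = 'logistic_reg', input_label_cols = ['{label}']) AS\n"
        # Exclude the visitor ID so it is not treated as a training feature.
        f"SELECT * EXCEPT({id_column})\n"
        f"FROM `{training_table}`"
    )


print(create_model_sql("my_dataset.lead_model", "my_dataset.training_data"))
```

The generated statement can be pasted into the BigQuery console or submitted through any BigQuery client.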
  31. Advantages of using BQ ML ● Decreases complexity ● Increases speed ● Simple, easy to learn ● No need to export data from BigQuery, which may be prevented by legal restrictions (such as HIPAA guidelines)
  32. 5 - Evaluate your model ▹ BQ ML allows you to evaluate your model with a simple function, “ML.EVALUATE” ▹ Depending on the model you have chosen, you will get different results. Example of logistic regression:
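
For a logistic regression model, ML.EVALUATE reports metrics such as precision, recall, accuracy and F1 score. The sketch below shows how those four numbers are derived from actual versus predicted labels. It runs on made-up labels and is only meant to make the metrics concrete; it is not the BigQuery implementation.

```python
def classification_metrics(actual, predicted):
    """Compute precision, recall, accuracy and F1 for binary 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1_score": f1}


# Hypothetical labels: 1 = converted, 0 = did not convert.
print(classification_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0]))
```

For lead scoring, precision indicates how many flagged leads really convert, while recall indicates how many converters the model manages to find.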
  33. 6 - Predict outcome: Example of using the ML.PREDICT function to predict ecommerce purchases:
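
A prediction query wraps the trained model and a table of new visitors in ML.PREDICT. The sketch below builds such a query as a string. The model and table names and the `converted` label are hypothetical. BigQuery ML names the output column `predicted_<label>` and passes input columns such as the visitor ID through to the result.

```python
def predict_sql(model, input_table, label="converted", id_column="fullVisitorId"):
    """Build an ML.PREDICT query returning one predicted label per visitor."""
    return (
        f"SELECT {id_column}, predicted_{label}\n"
        f"FROM ML.PREDICT(MODEL `{model}`,\n"
        f"  (SELECT * FROM `{input_table}`))"
    )


print(predict_sql("my_dataset.lead_model", "my_dataset.new_visitors"))
```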
  34. ACTIVATE
  35. Marketing activation 1. Create & test lookalike audiences based on your findings, e.g. demographic features, location 2. Create a remarketing list based on lead score - Import lead score & visitor ID into GA > create a custom metric for lead score > create a remarketing audience https://github.com/GoogleCloudPlatform/google-analytics-premium-bigquery-statistics/blob/master/README.md
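
Selecting visitors for the remarketing list from their lead scores might look like the sketch below. The scores, visitor IDs and the 0.7 cut-off are invented for illustration; the talk does not state which threshold was used.

```python
def remarketing_list(lead_scores, threshold=0.7):
    """Return visitor IDs whose lead score meets the threshold, highest score first."""
    selected = [(visitor, score) for visitor, score in lead_scores.items() if score >= threshold]
    selected.sort(key=lambda pair: pair[1], reverse=True)
    return [visitor for visitor, _ in selected]


# Hypothetical lead scores keyed by GA visitor ID.
scores = {"v1": 0.91, "v2": 0.35, "v3": 0.72}
print(remarketing_list(scores))
```

The resulting IDs, together with their scores, are what would be imported into GA to define the audience.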
  36. Conclusion 1. BigQuery is a cheap, easy-to-set-up data warehouse / exploration / preparation / machine learning(!) tool 2. Data brings people together (sales & marketing) & goes a long way if used appropriately 3. Creating and using a machine learning model is easier than you think (with the right tool*)
  37. LET’S BQ THIS Any queries? You can find me at gabriella@inmarketingwetrust.com.au linkedin.com/in/gabriella-wong
