[LondonSEO 2020] BigQuery & SQL for SEOs

In this talk, Areej will share her learning process, how SEOs can get acquainted with the world of BigQuery, and why SQL is the new and improved Excel. The audience will walk away with a handful of scripts and tips to get their BigQuery journey started!

Published in: Marketing

  1. BigQuery & SQL for SEOs @areej_abuali linkedin.com/in/areejabuali/
  2. HELLO! I’m here to talk to you about how (5 months ago) I started using BigQuery & SQL in my day-to-day job.
  3. HELLO! I’m here to talk to you about how (5 months ago) I was forced to start using BigQuery & SQL in my day-to-day job.
  4. Agency-side, Client-side
  5. 35,000,000 Indexed Pages; 55,000,000 Monthly Visits
  6. Don’t get me started on Excel...
  7. And using GA wasn’t going to cut it!
  8. Note: This GA interface screenshot is not Zoopla data and I do not plan on showing any Zoopla data throughout the slides...
  9. ...I just got my job 5 months ago and would like to keep it!
  10. So this talk is about how we can all adopt BigQuery and start doing really cool stuff with it...
  11. “It’s about learning to feel comfortable with the uncomfortable.”
  12. My 24/7 state of mind! And that it’s okay to feel this way!
  13. Let’s set the scene!
  14. Google Cloud Platform
  15. What is BigQuery? “BigQuery is an enterprise data warehouse that stores and queries massive datasets by enabling super-fast SQL queries using the processing power of Google’s infrastructure.”
  16. What is BigQuery? “BigQuery is an enterprise data warehouse that stores and queries massive datasets by enabling super-fast SQL queries using the processing power of Google’s infrastructure.” TOO MUCH GIBBERISH!
  17. So what is it then? It’s a thing that will help you analyse massive datasets quickly and easily via SQL!
  18. Why is it useful? ▸ It’s cloud-based (super scalable)
  19. Why is it useful? ▸ It’s cloud-based (super scalable) ▸ Unlimited access to historical data
  20. Why is it useful? ▸ It’s cloud-based (super scalable) ▸ Unlimited access to historical data ▸ It’s pay as you go (1TB = $5)
  21. Why is it useful? ▸ It’s cloud-based (super scalable) ▸ Unlimited access to historical data ▸ It’s pay as you go (1TB = $5) ▸ Simple interface and setup
  22. And as for SQL... ▸ It’s a language used for extracting and analysing data stored in databases
  23. And as for SQL... ▸ It’s a language used for extracting and analysing data stored in databases ▸ It’s way faster than Excel because the data you’re analysing is stored separately
  24. And as for SQL... ▸ It’s a language used for extracting and analysing data stored in databases ▸ It’s way faster than Excel because the data you’re analysing is stored separately ▸ Your code is reusable
  25. It’s pseudo-codish! SELECT * FROM example_table WHERE example_column = "value"
  26. What did I need? Data query (repeatedly)
  27. What did I need? Data query (repeatedly), Advanced filtering
  28. What did I need? Data query (repeatedly), Advanced filtering, Sort large datasets
  29. Let’s get started!
  30. console.cloud.google.com/bigquery
  31. Query Editor
  32. Query Editor. Your datasets are stored here.
  33. Query Editor. Your datasets are stored here. You can see your Job History & Query History here.
  34. https://cloud.google.com/bigquery/docs/loading-data
  35. BigQuery Cookbook: support.google.com/analytics/answer/4419694
  36. GA Sample Dataset: https://bigquery.cloud.google.com/table/bigquery-public-data:google_analytics_sample.ga_sessions_20170801 https://support.google.com/analytics/answer/7586738
  37. GA Sample Dataset: https://bigquery.cloud.google.com/table/bigquery-public-data:google_analytics_sample.ga_sessions_20170801 https://support.google.com/analytics/answer/7586738 Because if I use Zoopla data...
  38. SQL Query: SELECT, FROM, WHERE, ORDER BY, LIMIT
  39. SQL Query - Select: What columns do you want to pull?
  40. SQL Query - Select: 338 columns in total
  41. SQL Query - Select ▸ SELECT * ▸ SELECT date, visitNumber ▸ SELECT visitNumber as Number
  42. SQL Query - Select: SELECT date as Date, channelGrouping as Channel, totals.visits as Visits, totals.transactionRevenue as Revenue
  43. SQL Query - From: Which data source do you want to pull from?
  44. SQL Query - From: PROJECT ID, DATASET, TABLE
  45. SQL Query - From: PROJECT ID, DATASET, TABLE: bigquery-public-data.google_analytics_sample.ga_sessions_20170801
  46. SQL Query - From: SELECT date as Date, channelGrouping as Channel, totals.visits as Visits, totals.transactionRevenue as Revenue FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801`
  47. SQL Query - Where: What filters do you want to apply?
  48. SQL Query - Where ▸ WHERE channelGrouping = 'Organic Search' ▸ WHERE channelGrouping IN ('Organic Search', 'Direct') ▸ WHERE channelGrouping = 'Organic Search' AND date = '20170701'
  49. SQL Query - Where: SELECT date as Date, channelGrouping as Channel, totals.visits as Visits, totals.transactionRevenue as Revenue FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801` WHERE channelGrouping = 'Organic Search'
  50. SQL Query - Order By: How do you want to sort your data?
  51. SQL Query - Order By: SELECT date as Date, channelGrouping as Channel, totals.visits as Visits, totals.transactionRevenue as Revenue FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801` WHERE channelGrouping = 'Organic Search' ORDER BY totals.visits desc
  52. SQL Query - Limit: How many rows do you want to return?
  53. SQL Query - Limit: SELECT date as Date, channelGrouping as Channel, totals.visits as Visits, totals.transactionRevenue as Revenue FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801` WHERE channelGrouping = 'Organic Search' ORDER BY Revenue desc LIMIT 100
  54. Standard vs Legacy: #standardSQL SELECT date as Date, channelGrouping as Channel, totals.visits as Visits, totals.transactionRevenue as Revenue FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801` WHERE channelGrouping = 'Organic Search' ORDER BY Revenue desc LIMIT 100
  55. (no text on this slide)
  56. Your typical process... ▸ Open GA ▸ Filter data in GA ▸ Export GA data
  57. Your typical process... ▸ Open GA ▸ Filter data in GA ▸ Export GA data ▸ Open Excel ▸ Clean data ▸ Filter data ▸ Sort data
  58. Your typical process... ▸ Open GA ▸ Filter data in GA ▸ Export GA data ▸ Open Excel ▸ Clean data ▸ Filter data ▸ Sort data ▸ Cry because everything breaks and you get the spinning wheel of death
  59. Your typical process...
  60. 2.1 seconds!
  61. But what if I want to sum up some of my values?
  62. SQL Query - Select: SELECT date as Date, channelGrouping as Channel, sum(totals.visits) as Visits, sum(totals.transactionRevenue) as Revenue
  63. SQL Query - Group By: SELECT date as Date, channelGrouping as Channel, sum(totals.visits) as Visits, sum(totals.transactionRevenue) as Revenue GROUP BY Date, Channel (every non-aggregated column in SELECT must also appear in GROUP BY)
  64. SELECT date as Date, channelGrouping as Channel, sum(totals.visits) as Visits, sum(totals.transactionRevenue) as Revenue FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801` WHERE channelGrouping = 'Organic Search' GROUP BY Date, Channel ORDER BY Revenue desc LIMIT 100
  65. (no text on this slide)
  66. Real Life Challenge
  67. Challenge: What if all of our data wasn’t living in the same table?
  68. Sessions Data, Transaction Data
  69. Excel ▸ VLOOKUP ▸ INDEX/MATCH (for VLOOKUP haters)
  70. Join: Takes all the data from your first table and joins rows from a second table (using a common metric)
  71. (no text on this slide)
  72. Left Join: FROM `Table 1` a LEFT JOIN `Table 2` b ON (a.metric = b.metric)
  73. Left Join: FROM `project-1234.analytics.ga_sessions` a LEFT JOIN `project-1234.analytics.ga_transactions` b ON (a.ga_session_id = b.ga_session_id)
  74. SELECT a.channelGrouping as Channel, sum(a.totals.visits) as Visits, sum(b.totals.transactionRevenue) as Revenue FROM `project-1234.analytics.ga_sessions` a LEFT JOIN `project-1234.analytics.ga_transactions` b ON (a.ga_session_id = b.ga_session_id) WHERE a.channelGrouping = 'Organic Search' GROUP BY Channel
  75. SQL Query: SELECT, FROM, WHERE, GROUP BY, JOIN, ORDER BY
  76. Covered: ▸ SELECT ▸ FROM ▸ WHERE ▸ ORDER BY ▸ GROUP BY ▸ JOIN ▸ LIMIT. Not Covered: ▸ HAVING ▸ WINDOW ▸ UNION ▸ WITH
  77. https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax
  78. Supported Functions ▸ Aggregate ▸ Arithmetic ▸ Comparison ▸ Date & Time ▸ Logical Operators ▸ Regular Expressions ▸ String
  79. So, What’s Next?
  80. In this talk, we’ve simply scratched the surface by analysing Analytics data...
  81. There’s so much more to do! Crawl Data
  82. There’s so much more to do! Crawl Data, Link Data
  83. There’s so much more to do! Crawl Data, Link Data, Log Files
  84. And there are so many smart(er) people and resources to help you do that!
  85. Attend more advanced talks! ▸ BristolSEO (Jan 28th): https://www.meetup.com/bristol-seo/ ▸ ReadingSEO (Feb 13th): https://www.meetup.com/SEO-Meetup-Reading/ ▸ Hayden Roche (@HaydenRoche3) - Technical SEO at Scale
  86. Read everything Dom writes! @dom_woodman ▸ How to Use BigQuery for Large-Scale SEO: https://moz.com/blog/how-to-bigquery-large-scale-seo ▸ Guide to Log Analysis with Big Query: https://www.distilled.net/log-file-analysis/
  87. Beautiful Dom Slide! ▸ How long does it take for a page to be discovered after being published? ▸ Which pages have requests from Googlebot? ▸ What are the top non-canonical pages being crawled? ▸ What are the most crawled parameters? ▸ Which directories have the most 404 error codes? ▸ Which pages are crawled with and without parameters? https://www.slideshare.net/DominicWoodman/a-guide-to-log-analysis-with-big-query
  88. Learn more SQL! https://www.codecademy.com/catalog/language/sql
  89. More great resources/courses ▸ Coursera - From Data to Insights with Google Cloud ▸ QwikLabs - BigQuery for Marketing Analysts ▸ Coding is for Losers - Learning BigQuery SQL ▸ OnCrawl - Why SEOs Should Ditch Excel & Learn SQL ▸ Book - Google BigQuery: The Definitive Guide ▸ Google - BigQuery Documentation ▸ Google - BigQuery Cookbook
  90. A few final points... Try out every random SQL query you come across (and create a library of saved queries)
  91. A few final points... Mash up different datasets together (it helps answer tons of questions)
  92. A few final points... Share cool things you learn with the rest of us (and don’t worry about that one idiot on Twitter who labels it as ‘old news’)
  93. A few final points... It’s okay to feel overwhelmed learning something new (maybe in 5 months you’ll be giving a talk about it too!)
  94. THANKS! Any Questions? ▸ @areej_abuali ▸ linkedin.com/in/areejabuali
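
Slide 76 lists HAVING and WITH among the clauses the talk did not cover. As a rough sketch of how they could extend the slide 64 query on the same public sample table: WITH names a subquery so it can be reused, and HAVING filters on an aggregated value (something WHERE cannot do). The threshold of 100 visits and the name `channel_totals` are illustrative assumptions, not something from the talk.

```sql
#standardSQL
-- WITH: give the per-channel aggregation a name (channel_totals is a made-up name).
WITH channel_totals AS (
  SELECT
    channelGrouping AS Channel,
    SUM(totals.visits) AS Visits,
    SUM(totals.transactionRevenue) AS Revenue
  FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801`
  GROUP BY Channel
  -- HAVING: filter on the aggregate; 100 is an arbitrary example threshold.
  HAVING SUM(totals.visits) > 100
)
SELECT Channel, Visits, Revenue
FROM channel_totals
ORDER BY Visits DESC
```

A saved query like this is one candidate for the "library of saved queries" suggested on slide 90.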

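Slide 78 lists Date & Time and Regular Expressions among the supported function groups without showing them in use. The sketch below is one possible illustration on the same sample table: `PARSE_DATE` turns the GA export's `date` string (YYYYMMDD) into a real DATE, and `REGEXP_CONTAINS` keeps every channel whose name ends in "Search". The alias `SessionDate` and the pattern are assumptions chosen for the example.

```sql
#standardSQL
SELECT
  -- The GA export stores date as a STRING such as '20170801'.
  PARSE_DATE('%Y%m%d', date) AS SessionDate,
  channelGrouping AS Channel,
  SUM(totals.visits) AS Visits
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801`
-- Regular expression filter: matches channel names ending in "Search".
WHERE REGEXP_CONTAINS(channelGrouping, r'Search$')
GROUP BY SessionDate, Channel
ORDER BY Visits DESC
```

The alias is deliberately not called `Date` here: because the expression differs from the underlying `date` column, grouping by an alias with the same name would be ambiguous.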
×
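
One of the log-file questions quoted on slide 87 is "Which directories have the most 404 error codes?". A minimal sketch of how that might look in BigQuery follows. The table `project-1234.logs.requests` and its columns `path` and `status_code` are entirely hypothetical; a real log import will have its own schema, so treat this as a shape to adapt rather than a working query for any particular dataset.

```sql
#standardSQL
-- Hypothetical table and columns: project-1234.logs.requests, path, status_code.
SELECT
  -- First path segment, e.g. 'blog' from '/blog/post-1'; NULL for root-level paths.
  REGEXP_EXTRACT(path, r'^/([^/?]+)/') AS Directory,
  COUNT(*) AS NotFoundRequests
FROM `project-1234.logs.requests`
WHERE status_code = 404
GROUP BY Directory
ORDER BY NotFoundRequests DESC
LIMIT 20
```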