Google BigQuery 101
Data Architect @ Globant
César Orozco Manotas
The Products logos contained in this icon library may be used freely and without permission to accurately reference Google's technology and tools, for instance in books or architecture diagrams.
About me
● Data architect and engineer.
● Python developer.
● Working with the GCP stack for the past year.
● Cloud technology enthusiast.
@manotasce
https://www.linkedin.com/in/manotasce/
Agenda
● Basic concepts
● Best Practices
● Demo
● Learning resources
Basic Concepts
Enterprise Data Warehouse (EDW)
EDW systems consist of huge databases, containing
historical data on volumes from multiple gigabytes to
terabytes of storage.
Mark Sweiger: Scalable Computer Architectures for Data Warehousing, p. 1.
BigQuery: A serverless, highly-scalable, and
cost-effective cloud data warehouse with an
in-memory BI Engine and AI Platform built in.
BigQuery offers...
1 Interactive analysis of petabyte-scale databases
2 SQL:2011 query language and functions
3 Many ways to ingest, transform, load, and export data to and from BigQuery
4 Nested and repeated fields; user-defined functions in JavaScript
5 Inexpensive data storage; queries charged by the amount of data processed
Data Engineering Course 05 Bigquery Analysis
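Point 5 above says queries are charged by the amount of data processed. A minimal sketch of that arithmetic follows. The function name, the `price_per_tib_usd` parameter, and the rate in the example are assumptions made for illustration; check the BigQuery pricing page for actual rates and billing units.

```python
TIB = 2 ** 40  # bytes in one tebibyte (assumed billing unit for this sketch)


def estimate_query_cost(bytes_processed: int, price_per_tib_usd: float) -> float:
    """Return an estimated on-demand cost in USD for a single query.

    Assumes cost scales linearly with bytes processed; ignores free tiers,
    minimum charges, and rounding rules.
    """
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    return bytes_processed / TIB * price_per_tib_usd


# Hypothetical rate of 5.0 USD per TiB, not a quoted price.
cost = estimate_query_cost(bytes_processed=2 * TIB, price_per_tib_usd=5.0)
print(f"{cost:.2f}")
```

Because the charge follows bytes scanned, selecting only the columns a query needs is the simplest way to lower the estimate.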
Why is BigQuery so popular?
9The Products logos contained in this icon library may be used freely and without permission to accurately reference Google's technology and tools, for instance in books or architecture diagrams.
BigQuery Tech Partners
Welcome to Cloud onBoard - slide 37 http://bit.ly/2XiJ5MB
BigQuery pricing
BigQuery pricing (cont.)
https://cloud.google.com/bigquery/pricing/
BigQuery architecture
Dremel is the execution engine.
Jupiter is the network.
Colossus is the distributed storage.
Borg is the compute.
https://cloud.google.com/blog/products/gcp/bigquery-under-the-hood
How is BigQuery organized?
A project contains users and datasets
A dataset contains tables and views
A table is a collection of columns
A job is a potentially long-running action
Data Engineering Course 05 Bigquery Analysis
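The project, dataset, and table hierarchy above shows up in queries as a `project.dataset.table` identifier. A minimal sketch of splitting such an identifier into its parts; the `TableRef` and `parse_table_id` names and the sample identifier are hypothetical, and the sketch ignores project IDs that themselves contain dots.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    """The three levels of the hierarchy that locate one table."""
    project: str
    dataset: str
    table: str


def parse_table_id(table_id: str) -> TableRef:
    """Split a 'project.dataset.table' string into its three parts."""
    parts = table_id.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'project.dataset.table', got {table_id!r}")
    return TableRef(*parts)


# Hypothetical identifier used only for illustration.
ref = parse_table_id("my-project.sales.orders")
print(ref.project, ref.dataset, ref.table)
```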
Interacting with BigQuery
Google Cloud SDK - bq (Python-based command-line tool for BigQuery)
BigQuery API
Cloud console - BigQuery UI
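As an illustration of the `bq` route, the sketch below assembles a `bq query` command line from Python. The helper name `build_bq_query_command` is hypothetical. It assumes the Cloud SDK is installed wherever the command is eventually run, and it only builds the arguments without executing anything.

```python
import shlex
from typing import List


def build_bq_query_command(sql: str, dry_run: bool = False) -> List[str]:
    """Build the argument list for a standard SQL query run through bq."""
    args = ["bq", "query", "--use_legacy_sql=false"]
    if dry_run:
        # Ask bq to validate the query without actually running it.
        args.append("--dry_run")
    args.append(sql)
    return args


cmd = build_bq_query_command("SELECT 1 AS x", dry_run=True)
print(shlex.join(cmd))  # shell-quoted form, handy for copy and paste
```

The returned list can be handed to `subprocess.run` on a machine where `bq` is on the PATH and authenticated.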
BigQuery: 4TB in 28 secs!
Taken from https://cloudonair.withgoogle.com/events/onboard-core-infrastructure?expand=talk:intermission-2
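To put the headline in perspective, the implied scan rate can be worked out directly. This is a back-of-the-envelope sketch that assumes the slide's figures describe data scanned by a single query and that 1 TB is 1000 GB; the function name is made up for illustration.

```python
def scan_rate_gb_per_s(terabytes: float, seconds: float) -> float:
    """Average throughput in GB/s, using decimal units (1 TB = 1000 GB)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return terabytes * 1000 / seconds


rate = scan_rate_gb_per_s(terabytes=4, seconds=28)
print(round(rate, 1))  # roughly 142.9 GB/s
```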
Best practices
Where to store data on GCP
BigQuery ETL reference architecture
Taken from Google BigQuery: The Definitive Guide by Valliappa Lakshmanan and Jordan Tigani (O’Reilly). Copyright
2020 Valliappa Lakshmanan and Jordan Tigani, 978-1-492-04446-8.
BigQuery EL reference architecture
Taken from Google BigQuery: The Definitive Guide by Valliappa Lakshmanan and Jordan Tigani (O’Reilly). Copyright
2020 Valliappa Lakshmanan and Jordan Tigani, 978-1-492-04446-8.
BigQuery ELT reference architecture
Taken from Google BigQuery: The Definitive Guide by Valliappa Lakshmanan and Jordan Tigani (O’Reilly). Copyright
2020 Valliappa Lakshmanan and Jordan Tigani, 978-1-492-04446-8.
Demo
Learning resources
https://cloud.google.com/bigquery/docs/
https://www.reddit.com/r/bigquery/
https://cloud.google.com/free/
