Agenda
 Why BigQuery
 What is BigQuery
 Architecture
 BigQuery Organization
 Use case
 Pros and cons
Why BigQuery?
● Generating big data reports requires expensive servers and
skilled database administrators.
● Interacting with big data has been expensive, slow and
inefficient.
● BigQuery changes all that
 It reduces the time and expense of querying data.
What’s BigQuery?
● Service for interactive analysis of massive datasets (TBs)
 Query billions of rows, seconds to write, seconds to return
 Uses a SQL-style query syntax
● Reliable and secure
 Replicated across multiple sites
 Secured through Access Control Lists
● Scalable
 Store hundreds of terabytes
 Pay only for what you use
● Fast
 Run ad hoc queries on multi-terabyte data sets in seconds
BigQuery is Google Cloud Platform’s interactive big data service. You can analyze terabytes of
data in a matter of seconds through SQL-like queries.
Architecture
At a high level, a query from the client is received at the
root server. The workload is divided and distributed to a
number of intermediate servers, which in turn divide it
further among the leaf servers. The leaf servers read from the
columnar storage.
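The tree-shaped execution described in this section can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names (`root_count`, `intermediate_count`, `leaf_count`), the in-memory shards, and the fan-out of 2 are hypothetical and are not BigQuery internals. It shows a root splitting work across intermediate servers, leaves scanning a single column, and partial results being merged on the way back up.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical in-memory "columnar storage": each column is kept as its own list,
# so a leaf only touches the column that the query names.
Shard = Dict[str, List[str]]


def leaf_count(shard: Shard, column: str) -> Counter:
    """Leaf server: scan one column of one shard and return partial counts."""
    return Counter(shard[column])


def intermediate_count(shards: List[Shard], column: str) -> Counter:
    """Intermediate server: fan out to its leaves and merge their partial results."""
    merged: Counter = Counter()
    for shard in shards:
        merged += leaf_count(shard, column)
    return merged


def root_count(shards: List[Shard], column: str, fanout: int = 2) -> Counter:
    """Root server: split the shards into groups, one group per intermediate server."""
    groups = [shards[i:i + fanout] for i in range(0, len(shards), fanout)]
    total: Counter = Counter()
    for group in groups:
        total += intermediate_count(group, column)
    return total


# Made-up sample data spread over three shards.
shards = [
    {"language": ["en", "de"], "title": ["A", "B"]},
    {"language": ["en", "fr"], "title": ["C", "D"]},
    {"language": ["en"], "title": ["E"]},
]
result = root_count(shards, "language")
print(dict(result))
```

The "title" column is never read in this run, which is the point of column-oriented storage: a query pays only for the columns it references.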
BigQuery Organization
BigQuery is structured as a hierarchy with 4 levels:
● Project. All data in BigQuery belongs inside a project
● Dataset. Holds one or more tables
● Table. Row-column structure that contains actual data
● Job. The tasks you are performing on the data
Datasets and Tables
A table name is represented as follows:
● Current Project
<dataset>.<table>
● Different Project
<project>:<dataset>.<table>
e.g. publicdata:samples.wikipedia
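The naming scheme above can be made concrete with a small parser. This is a minimal sketch, not part of any BigQuery client library: `TableRef`, `parse_table_ref`, and `format_table_ref` are hypothetical helpers, and the validation is deliberately simple.

```python
from typing import NamedTuple, Optional


class TableRef(NamedTuple):
    project: Optional[str]  # None means "use the current project"
    dataset: str
    table: str


def parse_table_ref(name: str) -> TableRef:
    """Split '<project>:<dataset>.<table>' or '<dataset>.<table>' into parts."""
    # Everything before the last ':' is the project; it is empty if there is no ':'.
    project, colon, rest = name.rpartition(":")
    dataset, dot, table = rest.partition(".")
    if not dot or not dataset or not table or "." in table or (colon and not project):
        raise ValueError(f"not a valid table name: {name!r}")
    return TableRef(project or None, dataset, table)


def format_table_ref(ref: TableRef) -> str:
    """Inverse of parse_table_ref: omit the project prefix when it is None."""
    prefix = f"{ref.project}:" if ref.project else ""
    return f"{prefix}{ref.dataset}.{ref.table}"


other_project = parse_table_ref("publicdata:samples.wikipedia")
current_project = parse_table_ref("samples.wikipedia")
print(other_project)
print(current_project)
```

Keeping the project optional mirrors the two forms on the slide: the short form resolves against the current project, the long form names the project explicitly.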
Use cases
● Log Analysis. Making sense of computer generated records
● Retailer. Using data to forecast product sales
● Ads Targeting. Targeting proper customer sections
● Sensor Data. Collect and visualize ambient data
● Data Mashup. Query terabytes of heterogeneous data
Some customer case studies
● Uses BigQuery to hone ad targeting and gain insights into
their business
● Dashboards using BigQuery to analyze booking and
inventory data
● Uses BigQuery to provide their customers ways to expand
game engagement and find new channels for
monetization
● Used BigQuery, App Engine and the Visualization API to
build a business intelligence solution
Pros
 Automatically optimises queries to fetch data quickly.
 Allows efficient management of data across multiple
databases.
 Relatively easy to use interface once you get used to it.
 Data volume management.
 BigQuery Omni allows for querying data across cloud
platforms, including Azure, AWS, and GCP
Cons
 Queries that have not been performance-tuned and queries
returning a lot of redundant data can become costly very quickly
 Tooling support outside of the GCP ecosystem is often lacking
compared to other platforms
 Works best with flat tables, which can make managing an
enterprise data model difficult
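The first point, that queries returning redundant data become costly, can be illustrated with a rough estimate. This is a sketch that assumes a pricing model charging per byte scanned, combined with the columnar storage described earlier. The column sizes and `PRICE_PER_TB` are hypothetical placeholders, not real measurements or current prices, and `bytes_scanned` and `estimated_cost` are illustrative helpers rather than a BigQuery API.

```python
from typing import Dict, Iterable

# Hypothetical column sizes in bytes for a single table (not real measurements).
COLUMN_BYTES: Dict[str, int] = {
    "title": 40 * 10**9,
    "language": 2 * 10**9,
    "text": 900 * 10**9,
}

# Hypothetical price per terabyte scanned; check current pricing before relying on it.
PRICE_PER_TB = 5.0


def bytes_scanned(columns: Iterable[str]) -> int:
    """With columnar storage, only the columns a query references are read."""
    return sum(COLUMN_BYTES[c] for c in set(columns))


def estimated_cost(columns: Iterable[str]) -> float:
    """Estimated cost of one query under the assumed per-terabyte price."""
    return bytes_scanned(columns) / 10**12 * PRICE_PER_TB


narrow = estimated_cost(["language"])     # query that names one small column
wide = estimated_cost(COLUMN_BYTES)       # equivalent of selecting every column
print(f"one column: ${narrow:.2f}, all columns: ${wide:.2f}")
```

Under these made-up numbers, selecting every column costs several hundred times more than selecting the one column actually needed, which is why trimming the column list is usually the first tuning step.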
Thank you!
Questions?
