SlideShare a Scribd company logo
Mark Rittman, Independent Analyst + Product Manager
BIGQUERY, LOOKER AND BIG DATA ANALYTICS
AT PETABYTE-SCALE
BUDAPEST DATA FORUM
May 2017
•Mark Rittman, Independent Analyst for Big Data Analytics
•Currently working with Qubit as Analytic Product Manager
•20 years in the BI, DW, ETL and now Big Data industry
•Implementor, CTO, company founder and author
•On Twitter at @markrittman
•Linkedin at https://uk.linkedin.com/in/markrittman
•mark@rittman.co.uk and http://www.mjr-analytics.com
About the Presenter
2
•Responsible for building + managing an analytics
product on personalization platform for marketers
•Operates in same market as Adobe Marketing
Cloud, Google Analytic 360, Optimizely, Monetate
•Real-time ingest of 10TB+/day of web activity data,
used for personalization
•Built on Looker BI tool and Google Cloud Platform
Current Role - Analytics PM for Marketing Tech Startup
3
Big Data Analytics on Google Cloud Platform
4
Cloud Big Data Analytics 2.0
5
Also use as my personal dev platform
1TB of BigQuery query usage/month free
•Started back in 1996 on a bank Oracle DW project
•Our tools were Oracle 7.3.4, SQL*Plus, shell scripts
•Data warehouses provided a unified view of the business
•Single place to store key data and metrics
•Joined-up view of the business
•Aggregates and conformed dimensions
•ETL routines to load, cleanse and conform data
•BI tools for simple, guided access to information
Before This … 20 Years in Traditional DW Consulting
7
GOOGLE BIG QUERY
AND DISTRIBUTED, CLOUD COMPUTE + STORAGE
•New generation of big data platform services from Google, Amazon, Oracle
•Combines three key innovations from earlier technologies:
•Organising of data into tables and columns (from RDBMS DWs)
•Massively-scalable and distributed storage and query (from Big Data)
•Elastically-scalable Platform-as-a-Service (from Cloud)
Elastically-Scalable Data Warehouse-as-a-Service
9
•And things come full-circle … analytics
typically requires tabular data
•Google BigQuery based-on DremelX
massively-parallel query engine
•But stores data columnarprovides SQL
interface
•Solves the problem of providing DW-like
functionality at scale, as-a-service
BigQuery : Big Data Meets Data Warehousing
10
Google Cloud Platform
11
Cloud Dataflow - A fully managed, auto-scalable service for
pipeline data processing in batch or streaming mode
BigQuery - A fully managed, petabyte scale, low-cost enterprise
data warehouse for analytics
BigTable - A fully managed, petabyte scale, low-latency, high-
throughput wide column store for analytics
Cloud Pub/Sub - A fully managed, global and scalable publish
and subscribe service with guaranteed at-least-once message delivery
Google Cloud Platform Big Data Reference Architecture
12
All delivered as auto-scaling fully-managed services
Google BigQuery - Key Platform Technologies
13
Dremel - Massively-parallel real-time
query engine with SQL + REST API
Collosus - Distributed Storage layer
for Dremel, successor to GFS/HDFS
Capacitor - Columnar nested +
compressed file format for Dremel,
inspiration for Parquet etc
Borg - Large-scale cluster management,
runs Dremel jobs on 10,000+ servers
Jupiter - Google’s high-capacity
low-latency internal network
•BigQuery is a distributed, column-store query engine
•Denormalize your tables where possible as joins are relatively expensive
•Optimal query is one that filters, aggregates and selects subsets of columns
•Use table partitioning so that full scans of whole columns are minimized
Data Modeling with BigQuery
14
•BigQuery is a distributed, column-store query engine
•Denormalize your tables where possible as joins are relatively expensive
•Use nested repeated fields for dimension lookups to align with Capacitor storage
•Optimal query is one that filters, aggregates and selects subsets of columns
•Use table partitioning so that full scans of whole columns are minimized
BigQuery Storage Formats + Data Modeling
15
SELECT
zipcode, count(zipcode_trips.trip_count) as total_trips,
zipcode_incidents.call_type,
count(zipcode_incidents.call_type) as total_calls
FROM
`aerial-vehicle-148023.personal_metrics.sf_nested`
LEFT JOIN UNNEST (trips) as zipcode_trips
LEFT JOIN UNNEST (incidents) AS zipcode_incidents
WHERE
zipcode_incidents.call_type = 'Traffic Collision'
GROUP BY 1,3
Query Results in Seconds with No Indexes or
Summary Tables
BI + ANALYTICS TOOLING FOR GOOGLE BIGQUERY
AND DISTRIBUTED, CLOUD COMPUTE + STORAGE
BI and Analytics Tools for Google Cloud Platform
18
Looker for BI and Analytics -
BI for Data Engineers, Semantic Models and LookML
LookML : BI Modeling for Data Engineers
20
• Query building using business semantic model
• Self-Service data analytics with agile dev model
• Dashboards, reports (“looks”), action links,
scheduling
Further Reading at https://medium.com/mark-rittman
22
Mark Rittman, Independent Analyst + Product Manager
BIGQUERY, LOOKER AND BIG DATA ANALYTICS
AT PETABYTE-SCALE
BUDAPEST DATA FORUM
May 2017

More Related Content

What's hot

Scaling to Infinity - Open Source meets Big Data
Scaling to Infinity - Open Source meets Big DataScaling to Infinity - Open Source meets Big Data
Scaling to Infinity - Open Source meets Big Data
Treasure Data, Inc.
 
Snowflakes in the Cloud Real world experience on a new approach for Big Data
Snowflakes in the Cloud Real world experience on a new approach for Big DataSnowflakes in the Cloud Real world experience on a new approach for Big Data
Snowflakes in the Cloud Real world experience on a new approach for Big Data
DevFest DC
 
Big query
Big queryBig query
Big query
Tanvi Parikh
 
An Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaAn Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and Kibana
ObjectRocket
 
Redshift VS BigQuery
Redshift VS BigQueryRedshift VS BigQuery
Redshift VS BigQuery
Kostas Pardalis
 
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQueryIntro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Chris Schalk
 
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Fwdays
 
Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020
Harald Erb
 
Accelerating Delivery of Data Products - The EBSCO Way
Accelerating Delivery of Data Products - The EBSCO WayAccelerating Delivery of Data Products - The EBSCO Way
Accelerating Delivery of Data Products - The EBSCO Way
MongoDB
 
Building Pinterest Real-Time Ads Platform Using Kafka Streams
Building Pinterest Real-Time Ads Platform Using Kafka Streams Building Pinterest Real-Time Ads Platform Using Kafka Streams
Building Pinterest Real-Time Ads Platform Using Kafka Streams
confluent
 
Big Data in the Cloud with Azure Marketplace Images
Big Data in the Cloud with Azure Marketplace ImagesBig Data in the Cloud with Azure Marketplace Images
Big Data in the Cloud with Azure Marketplace Images
Mark Kromer
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics
SingleStore
 
Connecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud PlatformConnecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud Platform
ConnectaDigital
 
Дмитрий Попович "How to build a data warehouse?"
Дмитрий Попович "How to build a data warehouse?"Дмитрий Попович "How to build a data warehouse?"
Дмитрий Попович "How to build a data warehouse?"
Fwdays
 
Delivering rapid-fire Analytics with Snowflake and Tableau
Delivering rapid-fire Analytics with Snowflake and TableauDelivering rapid-fire Analytics with Snowflake and Tableau
Delivering rapid-fire Analytics with Snowflake and Tableau
Harald Erb
 
tecFinal 451 webinar deck
tecFinal 451 webinar decktecFinal 451 webinar deck
tecFinal 451 webinar deck
Basho Technologies
 
SLC Snowflake User Group - Mar 12, 2020
SLC Snowflake User Group - Mar 12, 2020SLC Snowflake User Group - Mar 12, 2020
SLC Snowflake User Group - Mar 12, 2020
Nathan Skousen
 
Bridging to a hybrid cloud data services architecture
Bridging to a hybrid cloud data services architectureBridging to a hybrid cloud data services architecture
Bridging to a hybrid cloud data services architecture
IBM Analytics
 
Building the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeBuilding the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free Life
SingleStore
 
Chicago Data Summit: Extending the Enterprise Data Warehouse with Hadoop
Chicago Data Summit: Extending the Enterprise Data Warehouse with HadoopChicago Data Summit: Extending the Enterprise Data Warehouse with Hadoop
Chicago Data Summit: Extending the Enterprise Data Warehouse with HadoopCloudera, Inc.
 

What's hot (20)

Scaling to Infinity - Open Source meets Big Data
Scaling to Infinity - Open Source meets Big DataScaling to Infinity - Open Source meets Big Data
Scaling to Infinity - Open Source meets Big Data
 
Snowflakes in the Cloud Real world experience on a new approach for Big Data
Snowflakes in the Cloud Real world experience on a new approach for Big DataSnowflakes in the Cloud Real world experience on a new approach for Big Data
Snowflakes in the Cloud Real world experience on a new approach for Big Data
 
Big query
Big queryBig query
Big query
 
An Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaAn Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and Kibana
 
Redshift VS BigQuery
Redshift VS BigQueryRedshift VS BigQuery
Redshift VS BigQuery
 
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQueryIntro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
 
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
 
Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020
 
Accelerating Delivery of Data Products - The EBSCO Way
Accelerating Delivery of Data Products - The EBSCO WayAccelerating Delivery of Data Products - The EBSCO Way
Accelerating Delivery of Data Products - The EBSCO Way
 
Building Pinterest Real-Time Ads Platform Using Kafka Streams
Building Pinterest Real-Time Ads Platform Using Kafka Streams Building Pinterest Real-Time Ads Platform Using Kafka Streams
Building Pinterest Real-Time Ads Platform Using Kafka Streams
 
Big Data in the Cloud with Azure Marketplace Images
Big Data in the Cloud with Azure Marketplace ImagesBig Data in the Cloud with Azure Marketplace Images
Big Data in the Cloud with Azure Marketplace Images
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics
 
Connecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud PlatformConnecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud Platform
 
Дмитрий Попович "How to build a data warehouse?"
Дмитрий Попович "How to build a data warehouse?"Дмитрий Попович "How to build a data warehouse?"
Дмитрий Попович "How to build a data warehouse?"
 
Delivering rapid-fire Analytics with Snowflake and Tableau
Delivering rapid-fire Analytics with Snowflake and TableauDelivering rapid-fire Analytics with Snowflake and Tableau
Delivering rapid-fire Analytics with Snowflake and Tableau
 
tecFinal 451 webinar deck
tecFinal 451 webinar decktecFinal 451 webinar deck
tecFinal 451 webinar deck
 
SLC Snowflake User Group - Mar 12, 2020
SLC Snowflake User Group - Mar 12, 2020SLC Snowflake User Group - Mar 12, 2020
SLC Snowflake User Group - Mar 12, 2020
 
Bridging to a hybrid cloud data services architecture
Bridging to a hybrid cloud data services architectureBridging to a hybrid cloud data services architecture
Bridging to a hybrid cloud data services architecture
 
Building the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeBuilding the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free Life
 
Chicago Data Summit: Extending the Enterprise Data Warehouse with Hadoop
Chicago Data Summit: Extending the Enterprise Data Warehouse with HadoopChicago Data Summit: Extending the Enterprise Data Warehouse with Hadoop
Chicago Data Summit: Extending the Enterprise Data Warehouse with Hadoop
 

Similar to Budapest Data Forum 2017 - BigQuery, Looker And Big Data Analytics At Petabyte-Scale

bigquery.pptx
bigquery.pptxbigquery.pptx
bigquery.pptx
Harissh16
 
Data exposure in Azure - production use-case
Data exposure in Azure - production use-caseData exposure in Azure - production use-case
Data exposure in Azure - production use-case
Alexander Laysha
 
mod 2.pdf
mod 2.pdfmod 2.pdf
Big Query - Women Techmarkers (Ukraine - March 2014)
Big Query - Women Techmarkers (Ukraine - March 2014)Big Query - Women Techmarkers (Ukraine - March 2014)
Big Query - Women Techmarkers (Ukraine - March 2014)
Ido Green
 
Amazon DynamoDB - Auto Scaling Webinar - v3.pptx
Amazon DynamoDB - Auto Scaling Webinar - v3.pptxAmazon DynamoDB - Auto Scaling Webinar - v3.pptx
Amazon DynamoDB - Auto Scaling Webinar - v3.pptx
Amazon Web Services
 
Google BigQuery is the future of Analytics! (Google Developer Conference)
Google BigQuery is the future of Analytics! (Google Developer Conference)Google BigQuery is the future of Analytics! (Google Developer Conference)
Google BigQuery is the future of Analytics! (Google Developer Conference)
Rasel Rana
 
Estimating the Total Costs of Your Cloud Analytics Platform 
Estimating the Total Costs of Your Cloud Analytics Platform Estimating the Total Costs of Your Cloud Analytics Platform 
Estimating the Total Costs of Your Cloud Analytics Platform 
DATAVERSITY
 
Webinar: Faster Big Data Analytics with MongoDB
Webinar: Faster Big Data Analytics with MongoDBWebinar: Faster Big Data Analytics with MongoDB
Webinar: Faster Big Data Analytics with MongoDB
MongoDB
 
Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data Analytics
Mark Kromer
 
Supercharge your data analytics with BigQuery
Supercharge your data analytics with BigQuerySupercharge your data analytics with BigQuery
Supercharge your data analytics with BigQuery
Márton Kodok
 
Difference between data warehouse and data mining
Difference between data warehouse and data miningDifference between data warehouse and data mining
Difference between data warehouse and data mining
maxonlinetr
 
Executive Intro to BigQuery
Executive Intro to BigQueryExecutive Intro to BigQuery
Executive Intro to BigQuery
William M. Cohee
 
Levelling up your data infrastructure
Levelling up your data infrastructureLevelling up your data infrastructure
Levelling up your data infrastructure
Simon Belak
 
Boosting the Performance of your Rails Apps
Boosting the Performance of your Rails AppsBoosting the Performance of your Rails Apps
Boosting the Performance of your Rails Apps
Matt Kuklinski
 
Webinar: Introducing the MongoDB Connector for BI 2.0 with Tableau
Webinar: Introducing the MongoDB Connector for BI 2.0 with TableauWebinar: Introducing the MongoDB Connector for BI 2.0 with Tableau
Webinar: Introducing the MongoDB Connector for BI 2.0 with Tableau
MongoDB
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
Rishikese MR
 
[Webinar] Getting Started with BigQuery: Basics, Its Appilcations & Use Cases
[Webinar] Getting Started with BigQuery: Basics, Its Appilcations & Use Cases[Webinar] Getting Started with BigQuery: Basics, Its Appilcations & Use Cases
[Webinar] Getting Started with BigQuery: Basics, Its Appilcations & Use Cases
Tatvic Analytics
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
DATAVERSITY
 
How Auto Microcubes Work with Indexing & Caching to Deliver a Consistently Fa...
How Auto Microcubes Work with Indexing & Caching to Deliver a Consistently Fa...How Auto Microcubes Work with Indexing & Caching to Deliver a Consistently Fa...
How Auto Microcubes Work with Indexing & Caching to Deliver a Consistently Fa...
Remy Rosenbaum
 
Enterprise Data World 2018 - Building Cloud Self-Service Analytical Solution
Enterprise Data World 2018 - Building Cloud Self-Service Analytical SolutionEnterprise Data World 2018 - Building Cloud Self-Service Analytical Solution
Enterprise Data World 2018 - Building Cloud Self-Service Analytical Solution
Dmitry Anoshin
 

Similar to Budapest Data Forum 2017 - BigQuery, Looker And Big Data Analytics At Petabyte-Scale (20)

bigquery.pptx
bigquery.pptxbigquery.pptx
bigquery.pptx
 
Data exposure in Azure - production use-case
Data exposure in Azure - production use-caseData exposure in Azure - production use-case
Data exposure in Azure - production use-case
 
mod 2.pdf
mod 2.pdfmod 2.pdf
mod 2.pdf
 
Big Query - Women Techmarkers (Ukraine - March 2014)
Big Query - Women Techmarkers (Ukraine - March 2014)Big Query - Women Techmarkers (Ukraine - March 2014)
Big Query - Women Techmarkers (Ukraine - March 2014)
 
Amazon DynamoDB - Auto Scaling Webinar - v3.pptx
Amazon DynamoDB - Auto Scaling Webinar - v3.pptxAmazon DynamoDB - Auto Scaling Webinar - v3.pptx
Amazon DynamoDB - Auto Scaling Webinar - v3.pptx
 
Google BigQuery is the future of Analytics! (Google Developer Conference)
Google BigQuery is the future of Analytics! (Google Developer Conference)Google BigQuery is the future of Analytics! (Google Developer Conference)
Google BigQuery is the future of Analytics! (Google Developer Conference)
 
Estimating the Total Costs of Your Cloud Analytics Platform 
Estimating the Total Costs of Your Cloud Analytics Platform Estimating the Total Costs of Your Cloud Analytics Platform 
Estimating the Total Costs of Your Cloud Analytics Platform 
 
Webinar: Faster Big Data Analytics with MongoDB
Webinar: Faster Big Data Analytics with MongoDBWebinar: Faster Big Data Analytics with MongoDB
Webinar: Faster Big Data Analytics with MongoDB
 
Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data Analytics
 
Supercharge your data analytics with BigQuery
Supercharge your data analytics with BigQuerySupercharge your data analytics with BigQuery
Supercharge your data analytics with BigQuery
 
Difference between data warehouse and data mining
Difference between data warehouse and data miningDifference between data warehouse and data mining
Difference between data warehouse and data mining
 
Executive Intro to BigQuery
Executive Intro to BigQueryExecutive Intro to BigQuery
Executive Intro to BigQuery
 
Levelling up your data infrastructure
Levelling up your data infrastructureLevelling up your data infrastructure
Levelling up your data infrastructure
 
Boosting the Performance of your Rails Apps
Boosting the Performance of your Rails AppsBoosting the Performance of your Rails Apps
Boosting the Performance of your Rails Apps
 
Webinar: Introducing the MongoDB Connector for BI 2.0 with Tableau
Webinar: Introducing the MongoDB Connector for BI 2.0 with TableauWebinar: Introducing the MongoDB Connector for BI 2.0 with Tableau
Webinar: Introducing the MongoDB Connector for BI 2.0 with Tableau
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
[Webinar] Getting Started with BigQuery: Basics, Its Appilcations & Use Cases
[Webinar] Getting Started with BigQuery: Basics, Its Appilcations & Use Cases[Webinar] Getting Started with BigQuery: Basics, Its Appilcations & Use Cases
[Webinar] Getting Started with BigQuery: Basics, Its Appilcations & Use Cases
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
How Auto Microcubes Work with Indexing & Caching to Deliver a Consistently Fa...
How Auto Microcubes Work with Indexing & Caching to Deliver a Consistently Fa...How Auto Microcubes Work with Indexing & Caching to Deliver a Consistently Fa...
How Auto Microcubes Work with Indexing & Caching to Deliver a Consistently Fa...
 
Enterprise Data World 2018 - Building Cloud Self-Service Analytical Solution
Enterprise Data World 2018 - Building Cloud Self-Service Analytical SolutionEnterprise Data World 2018 - Building Cloud Self-Service Analytical Solution
Enterprise Data World 2018 - Building Cloud Self-Service Analytical Solution
 

More from Rittman Analytics

From Zero to One with Rittman Analytics
From Zero to One with Rittman AnalyticsFrom Zero to One with Rittman Analytics
From Zero to One with Rittman Analytics
Rittman Analytics
 
Where Digital Analytics is taking BI and Big Data
Where Digital Analytics is taking BI and Big DataWhere Digital Analytics is taking BI and Big Data
Where Digital Analytics is taking BI and Big Data
Rittman Analytics
 
User Engagement Analysis using the new Looker System Activity Model
User Engagement Analysis using the new Looker System Activity ModelUser Engagement Analysis using the new Looker System Activity Model
User Engagement Analysis using the new Looker System Activity Model
Rittman Analytics
 
From BI Developer to Data Engineer with Oracle Analytics Cloud, Data Lake
From BI Developer to Data Engineer with Oracle Analytics Cloud, Data LakeFrom BI Developer to Data Engineer with Oracle Analytics Cloud, Data Lake
From BI Developer to Data Engineer with Oracle Analytics Cloud, Data Lake
Rittman Analytics
 
Planning a Strategy for Autonomous Analytics and Data Warehousing
Planning a Strategy for Autonomous Analytics and Data WarehousingPlanning a Strategy for Autonomous Analytics and Data Warehousing
Planning a Strategy for Autonomous Analytics and Data Warehousing
Rittman Analytics
 
Where Digital Analytics is taking BI and Big Data
Where Digital Analytics is taking BI and Big DataWhere Digital Analytics is taking BI and Big Data
Where Digital Analytics is taking BI and Big Data
Rittman Analytics
 
Data Warehouse Like a Tech Startup with Oracle Autonomous Data Warehouse
Data Warehouse Like a Tech Startup with Oracle Autonomous Data WarehouseData Warehouse Like a Tech Startup with Oracle Autonomous Data Warehouse
Data Warehouse Like a Tech Startup with Oracle Autonomous Data Warehouse
Rittman Analytics
 
From BI Developer to Data Engineer with Oracle Analytics Cloud, Data Lake
From BI Developer to Data Engineer with Oracle Analytics Cloud, Data LakeFrom BI Developer to Data Engineer with Oracle Analytics Cloud, Data Lake
From BI Developer to Data Engineer with Oracle Analytics Cloud, Data Lake
Rittman Analytics
 
Using Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations Data
Using Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations DataUsing Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations Data
Using Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations Data
Rittman Analytics
 
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake EditionFrom BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
Rittman Analytics
 
Using Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations Data
Using Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations DataUsing Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations Data
Using Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations Data
Rittman Analytics
 
Using Data & Analytics To Find Out How Much Daily Mail Readers Hate Me (and W...
Using Data & Analytics To Find Out How Much Daily Mail Readers Hate Me (and W...Using Data & Analytics To Find Out How Much Daily Mail Readers Hate Me (and W...
Using Data & Analytics To Find Out How Much Daily Mail Readers Hate Me (and W...
Rittman Analytics
 
Analytics, BigQuery, Looker and How I Became an Internet Meme for 48 Hours
Analytics, BigQuery, Looker and How I Became an Internet Meme for 48 HoursAnalytics, BigQuery, Looker and How I Became an Internet Meme for 48 Hours
Analytics, BigQuery, Looker and How I Became an Internet Meme for 48 Hours
Rittman Analytics
 
Analytics is Taking over the World (Again) - UKOUG Tech'17
Analytics is Taking over the World (Again) - UKOUG Tech'17Analytics is Taking over the World (Again) - UKOUG Tech'17
Analytics is Taking over the World (Again) - UKOUG Tech'17
Rittman Analytics
 
Petabytes to Personalization - Data Analytics with Qubit and Looker
Petabytes to Personalization - Data Analytics with Qubit and LookerPetabytes to Personalization - Data Analytics with Qubit and Looker
Petabytes to Personalization - Data Analytics with Qubit and Looker
Rittman Analytics
 
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Rittman Analytics
 
How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017
Rittman Analytics
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Rittman Analytics
 

More from Rittman Analytics (18)

From Zero to One with Rittman Analytics
From Zero to One with Rittman AnalyticsFrom Zero to One with Rittman Analytics
From Zero to One with Rittman Analytics
 
Where Digital Analytics is taking BI and Big Data
Where Digital Analytics is taking BI and Big DataWhere Digital Analytics is taking BI and Big Data
Where Digital Analytics is taking BI and Big Data
 
User Engagement Analysis using the new Looker System Activity Model
User Engagement Analysis using the new Looker System Activity ModelUser Engagement Analysis using the new Looker System Activity Model
User Engagement Analysis using the new Looker System Activity Model
 
From BI Developer to Data Engineer with Oracle Analytics Cloud, Data Lake
From BI Developer to Data Engineer with Oracle Analytics Cloud, Data LakeFrom BI Developer to Data Engineer with Oracle Analytics Cloud, Data Lake
From BI Developer to Data Engineer with Oracle Analytics Cloud, Data Lake
 
Planning a Strategy for Autonomous Analytics and Data Warehousing
Planning a Strategy for Autonomous Analytics and Data WarehousingPlanning a Strategy for Autonomous Analytics and Data Warehousing
Planning a Strategy for Autonomous Analytics and Data Warehousing
 
Where Digital Analytics is taking BI and Big Data
Where Digital Analytics is taking BI and Big DataWhere Digital Analytics is taking BI and Big Data
Where Digital Analytics is taking BI and Big Data
 
Data Warehouse Like a Tech Startup with Oracle Autonomous Data Warehouse
Data Warehouse Like a Tech Startup with Oracle Autonomous Data WarehouseData Warehouse Like a Tech Startup with Oracle Autonomous Data Warehouse
Data Warehouse Like a Tech Startup with Oracle Autonomous Data Warehouse
 
From BI Developer to Data Engineer with Oracle Analytics Cloud, Data Lake
From BI Developer to Data Engineer with Oracle Analytics Cloud, Data LakeFrom BI Developer to Data Engineer with Oracle Analytics Cloud, Data Lake
From BI Developer to Data Engineer with Oracle Analytics Cloud, Data Lake
 
Using Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations Data
Using Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations DataUsing Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations Data
Using Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations Data
 
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake EditionFrom BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
 
Using Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations Data
Using Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations DataUsing Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations Data
Using Google Cloud Dataprep to Wrangle Strava, Fitbit and Google Locations Data
 
Using Data & Analytics To Find Out How Much Daily Mail Readers Hate Me (and W...
Using Data & Analytics To Find Out How Much Daily Mail Readers Hate Me (and W...Using Data & Analytics To Find Out How Much Daily Mail Readers Hate Me (and W...
Using Data & Analytics To Find Out How Much Daily Mail Readers Hate Me (and W...
 
Analytics, BigQuery, Looker and How I Became an Internet Meme for 48 Hours
Analytics, BigQuery, Looker and How I Became an Internet Meme for 48 HoursAnalytics, BigQuery, Looker and How I Became an Internet Meme for 48 Hours
Analytics, BigQuery, Looker and How I Became an Internet Meme for 48 Hours
 
Analytics is Taking over the World (Again) - UKOUG Tech'17
Analytics is Taking over the World (Again) - UKOUG Tech'17Analytics is Taking over the World (Again) - UKOUG Tech'17
Analytics is Taking over the World (Again) - UKOUG Tech'17
 
Petabytes to Personalization - Data Analytics with Qubit and Looker
Petabytes to Personalization - Data Analytics with Qubit and LookerPetabytes to Personalization - Data Analytics with Qubit and Looker
Petabytes to Personalization - Data Analytics with Qubit and Looker
 
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
 
How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
 

Recently uploaded

一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
AlejandraGmez176757
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
alex933524
 
Uber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis ReportUber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis Report
SatyamNeelmani2
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
Using PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDBUsing PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDB
Alireza Kamrani
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
Introduction-to-Cybersecurit57hhfcbbcxxx
Introduction-to-Cybersecurit57hhfcbbcxxxIntroduction-to-Cybersecurit57hhfcbbcxxx
Introduction-to-Cybersecurit57hhfcbbcxxx
zahraomer517
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
correoyaya
 
How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?
DOT TECH
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 

Recently uploaded (20)

一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
Uber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis ReportUber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis Report
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
Using PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDBUsing PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDB
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Introduction-to-Cybersecurit57hhfcbbcxxx
Introduction-to-Cybersecurit57hhfcbbcxxxIntroduction-to-Cybersecurit57hhfcbbcxxx
Introduction-to-Cybersecurit57hhfcbbcxxx
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 

Budapest Data Forum 2017 - BigQuery, Looker And Big Data Analytics At Petabyte-Scale

  • 1. Mark Rittman, Independent Analyst + Product Manager BIGQUERY, LOOKER AND BIG DATA ANALYTICS AT PETABYTE-SCALE BUDAPEST DATA FORUM May 2017
  • 2. •Mark Rittman, Independent Analyst for Big Data Analytics •Currently working with Qubit as Analytic Product Manager •20 years in the BI, DW, ETL and now Big Data industry •Implementor, CTO, company founder and author •On Twitter at @markrittman •Linkedin at https://uk.linkedin.com/in/markrittman •mark@rittman.co.uk and http://www.mjr-analytics.com About the Presenter 2
  • 3. •Responsible for building + managing an analytics product on personalization platform for marketers •Operates in same market as Adobe Marketing Cloud, Google Analytic 360, Optimizely, Monetate •Real-time ingest of 10TB+/day of web activity data, used for personalization •Built on Looker BI tool and Google Cloud Platform Current Role - Analytics PM for Marketing Tech Startup 3
  • 4. Big Data Analytics on Google Cloud Platform 4
  • 5. Cloud Big Data Analytics 2.0 5
  • 6. Also use as my personal dev platform 1TB of BigQuery query usage/month free
  • 7. •Started back in 1996 on a bank Oracle DW project •Our tools were Oracle 7.3.4, SQL*Plus, shell scripts •Data warehouses provided a unified view of the business •Single place to store key data and metrics •Joined-up view of the business •Aggregates and conformed dimensions •ETL routines to load, cleanse and conform data •BI tools for simple, guided access to information Before This … 20 Years in Traditional DW Consulting 7
  • 8. GOOGLE BIG QUERY AND DISTRIBUTED, CLOUD COMPUTE + STORAGE
  • 9. •New generation of big data platform services from Google, Amazon, Oracle •Combines three key innovations from earlier technologies: •Organising of data into tables and columns (from RDBMS DWs) •Massively-scalable and distributed storage and query (from Big Data) •Elastically-scalable Platform-as-a-Service (from Cloud) Elastically-Scalable Data Warehouse-as-a-Service 9
  • 10. •And things come full-circle … analytics typically requires tabular data •Google BigQuery based-on DremelX massively-parallel query engine •But stores data columnarprovides SQL interface •Solves the problem of providing DW-like functionality at scale, as-a-service BigQuery : Big Data Meets Data Warehousing 10
  • 11. Google Cloud Platform 11 Cloud Dataflow - A fully managed, auto-scalable service for pipeline data processing in batch or streaming mode BigQuery - A fully managed, petabyte scale, low-cost enterprise data warehouse for analytics BigTable - A fully managed, petabyte scale, low-latency, high- throughput wide column store for analytics Cloud Pub/Sub - A fully managed, global and scalable publish and subscribe service with guaranteed at-least-once message delivery
  • 12. Google Cloud Platform Big Data Reference Architecture 12 All delivered as auto-scaling fully-managed services
  • 13. Google BigQuery - Key Platform Technologies 13 Dremel - Massively-parallel real-time query engine with SQL + REST API Collosus - Distributed Storage layer for Dremel, successor to GFS/HDFS Capacitor - Columnar nested + compressed file format for Dremel, inspiration for Parquet etc Borg - Large-scale cluster management, runs Dremel jobs on 10,000+ servers Jupiter - Google’s high-capacity low-latency internal network
  • 14. •BigQuery is a distributed, column-store query engine •Denormalize your tables where possible as joins are relatively expensive •Optimal query is one that filters, aggregates and selects subsets of columns •Use table partitioning so that full scans of whole columns are minimized Data Modeling with BigQuery 14
  • 15. •BigQuery is a distributed, column-store query engine •Denormalize your tables where possible as joins are relatively expensive •Use nested repeated fields for dimension lookups to align with Capacitor storage •Optimal query is one that filters, aggregates and selects subsets of columns •Use table partitioning so that full scans of whole columns are minimized BigQuery Storage Formats + Data Modeling 15 SELECT zipcode, count(zipcode_trips.trip_count) as total_trips, zipcode_incidents.call_type, count(zipcode_incidents.call_type) as total_calls FROM `aerial-vehicle-148023.personal_metrics.sf_nested` LEFT JOIN UNNEST (trips) as zipcode_trips LEFT JOIN UNNEST (incidents) AS zipcode_incidents WHERE zipcode_incidents.call_type = 'Traffic Collision' GROUP BY 1,3
  • 16. Query Results in Seconds with No Indexes or Summary Tables
  • 17. BI + ANALYTICS TOOLING FOR GOOGLE BIGQUERY AND DISTRIBUTED, CLOUD COMPUTE + STORAGE
  • 18. BI and Analytics Tools for Google Cloud Platform 18
  • 19. Looker for BI and Analytics - BI for Data Engineers, Semantic Models and LookML
  • 20. LookML : BI Modeling for Data Engineers 20
  • 21. • Query building using business semantic model • Self-Service data analytics with agile dev model • Dashboards, reports (“looks”), action links, scheduling
  • 22. Further Reading at https://medium.com/mark-rittman 22
  • 23. Mark Rittman, Independent Analyst + Product Manager BIGQUERY, LOOKER AND BIG DATA ANALYTICS AT PETABYTE-SCALE BUDAPEST DATA FORUM May 2017

Editor's Notes

  1. The services we will highlight in this talk are Cloud Pub/Sub, Cloud Dataflow, BigQuery and BigTable. all fully managed scalable solutions Cloud Pub/Sub - global publish and subscribe service with guaranteed at-least-once message delivery. Cloud Dataflow - service for pipeline data processing in batch and streaming mode. It provides an innovative programming model, the result of our focus on developer productivity & experience, that equips you to address your processing needs at a high level of abstraction. BigQuery is our low-cost enterprise data warehouse for analytics. BigTable - low-latency petabyte scale analytics store, fast table scans, high read/write throughput