AI-Based Analytics in the Cloud
Karl Weinmeister {Developer Advocacy Manager}
@kweinmeister
AI skills shortage still persists
2020 State of the CIO report: Data science is one of the most difficult roles to fill
2020 RELX Emerging Tech Executive Report: A leading reason for companies not using AI is lack of technical expertise
Spreadsheets are everywhere!
https://twitter.com/jsoltero/status/1238496597064863744
How does data make its way into spreadsheets?
Report
Query
API
Extract
What are some issues with this approach?
1. Data freshness
2. Data size
3. Anything else?
Authorization to the data - what can happen?
Source Table (authorized users: A, B, C)
➔ Extracted to: Spreadsheet
➔ Emailed to: users A, D
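The authorization gap on this slide can be sketched with plain set arithmetic. The user labels A to D come from the slide; the variable names are illustrative assumptions, not part of any Google API.

```python
# Users allowed to read the source table (labels taken from the slide).
authorized_on_source = {"A", "B", "C"}

# Users who received the extracted spreadsheet by email.
spreadsheet_recipients = {"A", "D"}

# Anyone holding the extract who was never authorized on the source table.
unauthorized_readers = spreadsheet_recipients - authorized_on_source

print(sorted(unauthorized_readers))  # prints ['D']
```

Once data is extracted and emailed, the source table's access controls no longer apply to the copy, which is why user D ends up with data they were never granted.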
Introducing Connected Sheets
Combining the best of BigQuery and the familiarity of Sheets to empower workforces and assist with:
● Unlocking big data insights
● Accelerating data workflows
● Improving cost-efficiency
● Strengthening data security
Sheets
Easy to use & shareable
Familiar interface
Light-weight analysis
BigQuery
Analyze petabytes of data
Complex queries
Reduce time to insight
Connected Sheets
Analyze billions of rows of data in Sheets, without any need for specialized knowledge.
Google BigQuery
Google Cloud Platform’s enterprise data warehouse for analytics
Petabyte-scale storage and queries
Encrypted, durable and highly available
Real-time analytics on streaming data
Convenience of standard SQL
Fully managed and serverless
Google BigQuery
BigQuery ML
Train and deploy ML models in SQL
Execute ML workflows without moving data from BigQuery
Automate common ML tasks
Built-in infrastructure management, security & compliance
Supported models in BigQuery ML
Classification
● Logistic regression
● XGBoost
● DNN classifier (TensorFlow)
● AutoML Tables (NDA)
Regression
● Linear regression
● XGBoost
● DNN regressor (TensorFlow)
● AutoML Tables
Other Models
● k-means clustering
● Time series forecasting
● Recommendation: Matrix factorization
Model Import/Export
● Importing TensorFlow models for batch prediction
● Exporting models from BigQuery ML for online prediction
Demo!
Case Study: Demand Forecasting
Transportation (image: https://unsplash.com/photos/9qQTUYm4ss4)
➔ Predict ticket sales
Telecommunications (image: https://unsplash.com/photos/M5tzZtFCOfs)
➔ Predict network traffic
Media/Gaming (image: https://unsplash.com/photos/4P0zdOSstqI)
➔ Predict # active players/time
➔ Predict content viewership
Retail (image: https://unsplash.com/photos/nHRXNv2qeDE)
[Figure: supply vs. actual demand over time. Underallocation = revenue loss; overallocation = revenue waste.]
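The underallocation and overallocation gaps in the chart can be worked through with a few numbers. This is a minimal sketch: the function name `allocation_gaps` and the demand and supply figures are hypothetical, chosen only to illustrate the arithmetic.

```python
def allocation_gaps(demand, supply):
    """Return (total shortfall, total surplus) across paired periods."""
    if len(demand) != len(supply):
        raise ValueError("demand and supply must cover the same periods")
    # Shortfall: demand that supply could not meet (underallocation).
    shortfall = sum(max(d - s, 0) for d, s in zip(demand, supply))
    # Surplus: supply that nobody asked for (overallocation).
    surplus = sum(max(s - d, 0) for d, s in zip(demand, supply))
    return shortfall, surplus


# Hypothetical units per period, for illustration only.
actual_demand = [100, 140, 90, 120]
flat_supply = [110, 110, 110, 110]

shortfall, surplus = allocation_gaps(actual_demand, flat_supply)
print(shortfall, surplus)  # prints 40 30
```

A flat supply plan misses 40 units of demand and wastes 30 units here; a forecast that tracks demand more closely shrinks both numbers.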
Iowa Liquor Sales data
Transactional data:
Iowa Liquor Sales data, BigQuery Public Datasets
https://console.cloud.google.com/marketplace/details/iowa-department-of-commerce/iowa-liquor-sales
`bigquery-public-data.iowa_liquor_sales.sales`
Training data
SELECT
  date,
  item_description AS item_name,
  SUM(bottles_sold) AS total_amount_sold
FROM
  `bigquery-public-data.iowa_liquor_sales.sales`
WHERE
  date BETWEEN DATE('2016-01-01') AND DATE('2017-06-01')
GROUP BY
  date, item_name
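To make the aggregation concrete, here is a small pure-Python sketch of what the training-data query computes: one row per date and item, with bottles summed. The sample rows and item names are invented for illustration and do not come from the Iowa dataset.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows shaped like the three columns the query reads.
rows = [
    {"date": date(2016, 1, 4), "item_description": "Item X", "bottles_sold": 12},
    {"date": date(2016, 1, 4), "item_description": "Item X", "bottles_sold": 6},
    {"date": date(2016, 1, 4), "item_description": "Item Y", "bottles_sold": 3},
    {"date": date(2017, 7, 1), "item_description": "Item X", "bottles_sold": 9},
]

start, end = date(2016, 1, 1), date(2017, 6, 1)

# Sum bottles per (date, item), keeping only dates inside the window.
totals = defaultdict(int)
for row in rows:
    if start <= row["date"] <= end:  # BETWEEN is inclusive on both ends
        totals[(row["date"], row["item_description"])] += row["bottles_sold"]

training_data = [
    {"date": d, "item_name": item, "total_amount_sold": total}
    for (d, item), total in sorted(totals.items())
]

for record in training_data:
    print(record["date"], record["item_name"], record["total_amount_sold"])
```

The two "Item X" sales on the same day collapse into a single row of 18 bottles, and the July 2017 sale falls outside the training window.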
Developer Days
Build and train with CREATE MODEL
https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-create-time-series
CREATE OR REPLACE MODEL
  iowaliquor.forecast_by_product
OPTIONS(
  MODEL_TYPE='ARIMA',
  TIME_SERIES_TIMESTAMP_COL='date',
  TIME_SERIES_DATA_COL='total_amount_sold',
  TIME_SERIES_ID_COL='item_name',
  HOLIDAY_REGION='US'
) AS
SELECT
  date,
  item_name,
  total_amount_sold
FROM
  iowaliquor.training_data
Behind-the-scenes
● Pre-processing
● Holiday effects
● Seasonal and trend decomposition
● Trend modeling with ARIMA and auto-ARIMA
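The CREATE MODEL statement above can also be submitted from Python. The sketch below assumes the `google-cloud-bigquery` client library is installed and that default credentials are configured; the helper names `build_create_model_sql` and `train_model` are hypothetical, and the model and table names are the ones from the slide.

```python
def build_create_model_sql(model, source_table, holiday_region="US"):
    """Assemble the CREATE MODEL statement shown on the slide.

    Identifiers are interpolated directly, so pass only trusted names.
    """
    return f"""
CREATE OR REPLACE MODEL {model}
OPTIONS(
  MODEL_TYPE='ARIMA',
  TIME_SERIES_TIMESTAMP_COL='date',
  TIME_SERIES_DATA_COL='total_amount_sold',
  TIME_SERIES_ID_COL='item_name',
  HOLIDAY_REGION='{holiday_region}'
) AS
SELECT date, item_name, total_amount_sold
FROM {source_table}
""".strip()


def train_model(sql):
    """Run the statement and block until training finishes."""
    # Imported lazily so the SQL builder works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes application default credentials
    client.query(sql).result()


create_sql = build_create_model_sql(
    "iowaliquor.forecast_by_product", "iowaliquor.training_data"
)
print(create_sql)
```

Calling `train_model(create_sql)` would start training in BigQuery. Later BigQuery ML documentation names this model type ARIMA_PLUS; the sketch keeps 'ARIMA' exactly as the slide has it.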
Making forecasts with ML.FORECAST
SELECT
  *
FROM
  ML.FORECAST(MODEL iowaliquor.forecast_by_product,
              STRUCT(30 AS horizon,
                     0.90 AS confidence_level)
  )
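As a sketch of how the forecast query might be parameterized and run from Python, the block below builds the ML.FORECAST statement and fetches its rows. The helper names are hypothetical, the `google-cloud-bigquery` client library is assumed to be installed and authenticated, and the range check on `confidence_level` reflects my reading of the documented valid range rather than anything stated on the slide.

```python
def build_forecast_sql(model, horizon=30, confidence_level=0.90):
    """Assemble the ML.FORECAST query from the slide."""
    if horizon < 1:
        raise ValueError("horizon must be a positive number of time points")
    # Assumed valid range for confidence_level: 0 <= value < 1.
    if not 0 <= confidence_level < 1:
        raise ValueError("confidence_level must be in [0, 1)")
    return (
        "SELECT * FROM ML.FORECAST("
        f"MODEL {model}, "
        f"STRUCT({horizon} AS horizon, {confidence_level} AS confidence_level))"
    )


def fetch_forecast(sql):
    """Run the query and return each forecast row as a plain dict."""
    # Imported lazily so the SQL builder works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes application default credentials
    return [dict(row.items()) for row in client.query(sql).result()]


forecast_sql = build_forecast_sql("iowaliquor.forecast_by_product")
print(forecast_sql)
```

Each returned row should carry the series identifier (here `item_name`) together with columns such as `forecast_timestamp`, `forecast_value`, and prediction interval bounds, which is what a supply planner would compare against current allocation.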
More info on the demand forecasting use case
Google Cloud Blog YouTube
Building your own AI solution
Solution Requirements
Consider a typical enterprise use case where custom machine learning
models need to be built and shared with people throughout your org
Requirements:
● Build custom ML models quickly
● Easily expose ML Models to people throughout the org
● Allow users with little or no ML experience to run model analysis
● Iterate and adapt quickly
Solution technologies:
● Sheets - Familiar easy data access for entire org
● BigQuery - Enterprise data storage and quick analysis
● BigQuery ML - Streamlined ML model creation on BigQuery data using SQL
● Connected Sheets - Access BigQuery data directly from Sheets
● Apps Script - Connect Sheets/Workspace to ML Models
Apps Script
➔ Scripting language based on JavaScript
➔ Automates tasks across Google products and services
➔ Serverless - code editor in your browser, and scripts run on Google’s servers
script.google.com
Data Analyst Workflow
Sheets ➔ Apps Script ➔ BigQuery ML
Data Scientist Workflows
R ➔ BigQuery
● R-BigQuery integration (bigrquery package)
● RStudio
● AI Platform Notebooks on GCP
Python ➔ BigQuery
● pandas-BigQuery integration
● Colab, other Jupyter notebook tools
● AI Platform Notebooks on GCP
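For the Python path, a minimal sketch of the pandas-BigQuery integration might look like the following. It assumes the `google-cloud-bigquery` client library (with its pandas extras) is installed and authenticated; the helper name `query_to_dataframe`, the `client` parameter, and the sample query are illustrative assumptions.

```python
SAMPLE_SQL = """
SELECT date, item_description, bottles_sold
FROM `bigquery-public-data.iowa_liquor_sales.sales`
LIMIT 1000
""".strip()


def query_to_dataframe(sql, client=None):
    """Run a BigQuery SQL query and return the result as a pandas DataFrame."""
    if client is None:
        # Imported lazily so this module loads without the client installed.
        from google.cloud import bigquery

        client = bigquery.Client()  # assumes application default credentials
    return client.query(sql).to_dataframe()
```

In a notebook, `df = query_to_dataframe(SAMPLE_SQL)` would pull the rows into pandas for further analysis. Accepting an optional `client` keeps the helper easy to exercise with a stand-in object during testing.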
Demo!
More info on building your own ML solution in Sheets
Google Cloud Blog, Code Sample, Demo Spreadsheet
Access all assets from blog post at: bit.ly/ml-sheet
Questions?
