Serverless SQL
Torsten Steinbach
@torsstei
IBM
SQL on Object
Storage
DM Gartner
Hype Cycle
2018
Evolution of Form Factors
For Big Data Analytics
Enterprise Data
Warehouses
Tightly integrated and
optimized systems
Hadoop
Introduced open data formats & easy
scaling on commodity HW
Cloud-Native:
Serverless Analytics-aaS
• Seamless elasticity
• Pay-per-query consumption
• Analyze data as it sits in an object store
• Disaggregated architecture
• No more infrastructure headaches
The 1990s / The 2000s / Today
Ingredient 3: Serverless Data Transformation
Ingredient 4: Serverless Analytics
Ingredient 5: Serverless Automation
Ingredient 2: Serverless Data Ingest
Sharing Economy for Analytics
Ingredient 1: Serverless Storage
Object Storage
IBM Cloud Object Storage
Objects
Objects
Objects
At Rest
On the Wire
Buckets
Encrypted
Pennies per GB
REST
Elastic
Durable
Flexible
Resiliency Choices
Storage Classes
User Managed
Encryption Keys
S3 Compatible
High Speed Data
Transfer
Aspera
SQL Queries
Data Ingest Options
High Customizability
Degree of Serverless-ness
IBM Event Streams
(Kafka aaS)
IBM Cloud Functions
Out-of-the-Box
IBM Streaming Analytics
(IBM Streams aaS)
via Cloud Object Storage API
SQL Query ETL
Cloudant Replication
Blockchain Synch
Cloud Data
Data
Transformation
Serverless SQL
Analytics
IBM SQL Query
Object
Storage
Db2
+
Developers
Data
Engineers
Data Analysts
✓ Perfect for Machine Generated Data
✓ Ad-hoc Data Exploration
✓ Operationalizing Data Pipelines
✓ Big Data Lakes
✓ Flexible Data Transformation
✓ Extremely affordable: $5/TB scanned
✓ 100% API enabled
✓ Analytics on Object Storage
✓ Big Data Scale-Out, running on Spark
✓ 100% Self service – No Setup
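The pay-per-query price in the list above lends itself to a quick back-of-the-envelope estimate. The following is only a sketch: it assumes that 1 TB means 10^12 bytes and that the charge is linear in the bytes scanned, neither of which the slide states. The function name is hypothetical.

```python
# Assumptions (not stated on the slide): 1 TB = 10**12 bytes,
# and the charge scales linearly with the number of bytes scanned.
PRICE_PER_TB_USD = 5.0   # from the slide: $5 per TB scanned
BYTES_PER_TB = 10**12


def estimate_query_cost(bytes_scanned: int) -> float:
    """Return the estimated cost in USD for scanning the given number of bytes."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_USD


if __name__ == "__main__":
    # Scanning 250 GB under the assumptions above
    print(estimate_query_cost(250 * 10**9))
```

Because the cost depends only on bytes scanned, anything that reduces the scan (columnar formats, partitioning, data skipping) should reduce the bill proportionally under these assumptions.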
2. Read data
4. Read
results
Application
3. Write results
IBM Cloud
Object Storage
Result Set
Data Set
Data Set
Data Set
1. Submit SQL
SQL
Archive / Export
IBM Cloud Streaming
IBM Streams
Event Streams
Land
Query
IBM Cloud Functions
IBM SQL Query
Architecture
IBM Cloud Databases
Db2 on Cloud
Geospatial SQL
Data Skipping
Timeseries SQL
Upload
Data Center 2
Analytics Engine Cluster
20 Kernels
Node 1
Node 3
Node 2
Node 3
…
20
Kernels
…
Data Center 3
Analytics Engine Cluster
20 Kernels
Node 1
Node 3
Node 2
Node 3
…
20
Kernels
…
SQL 1 SQL 1
Data Center 1
IBM Cloud SQL Query – Very High Level Architecture (MVP 1Q 2018)
Analytics Engine Cluster
20 Kernels
Cluster
Pool
Request Queue
Node 1
Node 3
Node 2
Node 3
…
Kernel
Pools
20
Kernels
…
SQL 1 SQL 2 SQL 3 SQL 4 SQL 5
Cloud Object Storage
SQL 6 …
JKG (Web Sockets)
IBM Cloud Query – Spark Cluster Architecture
SQL REST API
Create
Query
SQL Web Console
Watson
Studio
Notebooks
SQL Cloud Function
Integrate Explore
Deploy
IBM Cloud Query – Access Patterns
Node SDK
Python SDK
JDBC
Looker
Best of breed Spark SQL Reference
• Complete, intuitive and interactive SQL Reference
• Each sample SQL can immediately be executed as is
https://cloud.ibm.com/docs/services/sql-query/sqlref/sql_reference.html#sql-reference
Analytics using full Power of Spark SQL
IBM SQL Query – Timeseries SQL 1/2
§ Intuitive first-of-a-kind SQL extensions for timeseries operations
§ Industry leading differentiators, including:
• Timeseries transformation functions:
• Correlation, Fourier transformation,
z-normalization, Granger, interpolation,
and distances
• Temporal Joins: SQL support for
Left/Right/Full Inner and Outer joins
of multiple timeseries
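To make one of the transformation functions listed above concrete, here is a plain-Python illustration of z-normalization. It is not the SQL Query timeseries function itself, whose name and signature the slide does not show. It assumes the population standard deviation is used, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def z_normalize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant series has no spread; return zeros instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    print(z_normalize([2.0, 4.0, 6.0]))
```

Normalizing two series this way puts them on a common scale, which is what makes distance and correlation measures between them comparable.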
Alignment & Joining:
§ Further Industry leading differentiators
• Numerical and categorical timeseries types
• Timeseries data skipping for fast queries
• Forecasting:
• ARIMA, BATS, Anomaly detection, etc.
• Subsequence Mining:
• Train & match models for event sequences
• Segmentation:
• Time-based, Record-based, Anchor-based, Burst, and silence
Segmentation:
IBM SQL Query – Timeseries SQL 2/2
IBM SQL Query – Spatial SQL
§ SQL/MM standard to store & analyze spatial data in RDBMS
§ Migration of PostGIS compliant SQL queries
§ Aggregation, computation and join via native SQL syntax
§ Industry leading differentiators
• Geodetic Full Earth support
• Increased developer productivity
• Avoid piece-wise planar projections
• High precision calculations anywhere on the earth
• Very large polygons (e.g. countries), polar caps, crossing the anti-meridian
• Spatial data skipping for fast queries
• Native and fine-granular geohash support
• Fast spatial aggregation
Example: Spatio-Temporal Processing of Sensor Data
IBM Cloud Object Storage
Sensor
Data
Query
Location
Analytics
Mobile
Cars
Devices
Land
Location
Filtering
Spatial
Aggregation
GPS
SQL/MM
Sensor
Metrics
t
t
t
Timeseries
Assembly
Timeseries
Join
Timeseries SQL
t
Serverless
Storage
Serverless
Runtimes
Serverless
Analytics
Object
Storage
Cloud
Functions
Query
A Completely Serverless Stack for Data & Analytics Solutions
Unstructured Data Prep
SQL Query
Cloud
Functions
Analyze
COS
COS
Extract Features
Automated/Scheduled SQL Execution
SQL Query
Cloud
Functions
Develop SQL Deploy as SQL Cloud Function
Set up Cloud
Function
Trigger/Schedule
Shield Data From Direct Access
SQL Query
Cloud
Functions
Deploy Cloud Function
with COS API Key
User Calls
Function to
Access Data
COS
Grant Execute on SQL
Cloud Function to User
Configure SQL Pipelines
SQL Query
Cloud
Functions
User creates function
sequence to automate flow
of consecutive SQLs
Sequence
SQL Query
Cloud
Functions
1.
2.
Use Cases of Cloud Functions Adding Value to SQL
Ingredient 3: Serverless Data Transformation ✓
Ingredient 4: Serverless Analytics ✓
Ingredient 5: Serverless Automation ✓
Ingredient 2: Serverless Data Ingest ✓
Ingredient 1: Serverless Storage ✓
Now, what is this all good for?
IBM Cloud Object Storage
Acquire
Query
Data Warehouses &
Databases
Db2 on Cloud
Process Analyze
Applications
Applications
Applications
IoT
Streaming
Devices
Devices
Devices
BI & AI
Land
Log Messages
Cleanse
Filter
Merge
Aggregate
Compress
Watson Studio
Looker
Cognos
WML
Explore
Analyze Analyze
Promote
Use for Data Pipelines to fuel BI & AI
Data-Driven Decisions
☛ Understanding system health, user behavior & workload status
Collecting & Analyzing Log Data
☛ Is NOT an afterthought but rather the foundation for decisions on
system and feature design.
Data Volume Growing Rapidly
☛ Growth rates and data volume at rest can jump dramatically. Very
high elasticity is required.
Competitive Advantage
☛ Is based on short runways for turning data into actions
Turn your Logs into Business – Log Data Is The Cloud-Native Currency
Logs
Your Cloud
Application/Solution
IBM Cloud Object Storage
Query
Transform
Compress
Aggregate
Repartition
Analyze
Anomaly Detection
User Segmentation
Customer Support
Resource Planning
• Build & run data pipelines and analytics of your log message data
• Flexible log data analytics with full power of SQL
• Seamless scalability & elasticity according to your log message volume
Use for analyzing application logs
IDUG Db2 Tech Conference
Charlotte, NC | June 2 – 6, 2019
Data Lake in IBM Cloud – How it works
IBM Cloud Data Lake
Data
Streaming
Upload
ETL
DB2
Feature
Extraction
Data
Prep
ICD
DB2
ICD
OLAP
Analytics WML
ETL
Federate
Aspera
Cloudant
Replication
Secure
Sync
IBM
Blockchain
Applications
Applications
Watson
Studio
Knowledge
Catalog
METASTORE
AI
ICP for Data
Analytics
Engine
IBM Cloud
Functions
Land Process Integrate
Key Protect
Index
Creation
Getting started: https://www.ibm.com/cloud/sql-query
SQL Query Intro Video: https://youtu.be/s-FznfHJpoU
SQL Query Starter Notebook in Watson Studio: https://ibm.biz/BdYNrN
SQL Reference: https://ibm.biz/Bd2jF7
SQL Query API doc: https://cloud.ibm.com/apidocs/sql-query
Big Data Layout Best Practices for COS: https://ibm.biz/Bd2jRg
Serverless Data & Analytics: https://ibm.biz/Bd2jF5
Further Resources
Backup
1. Identify friction points in users’ digital journey, e.g.:
• Clicks-2-purchase ratio
• Unexpected repeated page visits per user
• E.g. entering payment data should only happen once
• Last page visited per session
2. Identify click sequences for successful purchase
• Sequence matching using timeseries analysis
3. Identify customers/segments likely to churn or expand
• Look for typical page visits, actions or flows
• E.g. Terms & conditions, invite additional users etc.
4. Determining your most important content online
What Insights can I extract from a Clickstream?
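Two of the friction-point metrics on this slide, "last page visited per session" and "unexpected repeated page visits", can be sketched in a few lines. The event layout `(session_id, timestamp, page)` and both function names are assumptions made for illustration; the deck does not specify a clickstream schema.

```python
from collections import Counter

# Hypothetical click events: (session_id, timestamp, page)


def last_page_per_session(events):
    """Map each session to the page of its latest event."""
    last = {}
    for session_id, _ts, page in sorted(events, key=lambda e: e[1]):
        last[session_id] = page  # later events overwrite earlier ones
    return last


def repeated_visits(events, page):
    """Return sessions that hit `page` more than once, with the visit count."""
    counts = Counter(sid for sid, _ts, p in events if p == page)
    return {sid: n for sid, n in counts.items() if n > 1}


if __name__ == "__main__":
    sample = [
        ("s1", 1, "home"), ("s1", 2, "cart"),
        ("s1", 3, "payment"), ("s1", 4, "payment"),
        ("s2", 1, "home"), ("s2", 2, "terms"),
    ]
    print(last_page_per_session(sample))
    print(repeated_visits(sample, "payment"))
```

In a SQL engine the same logic would typically be a window function over each session and a GROUP BY with a HAVING filter; the Python version only shows the intent.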
Building IBM Cloud-Native Data Lake
Serverless SQL
Serverless Storage
Serverless Pipeline
Automation ✓
✓
✓
Orchestration
Processing
Persistency Data Ingest
✓
Data Catalog ✓
Serverless
Unstructured Data
Processing ✓
• Traditional analytics systems
• Fixed capacities of appliances
• Specialized teams of data engineers & DBAs who manage data model, access and ETL
• BI analysts who have access only to the curated data sets in EDW
• Innovative enterprises today
• Wide range of teams that require direct access to the same data set at all stages of the data pipeline: BI analysts, data scientists, quantitative marketers, dev/ops, developers
• Data engineers that support these teams need a much more scalable and cost-effective platform to ensure all teams have the access they need, when they need it
• Building analytics platforms in the cloud because of the scale and cost-efficiencies that come with serverless analytics over object stores
Serverless – The key to IT Sharing Economy ... also for Analytics
Proper data organization → better performance and lower cost
The key factors are:
• Number of bytes shipped
• Number of REST requests
Best practices for structured data:
• Choose the right object size (sweet spot: 128 MB)
• Choose the right format
• Choose the right data layout
• Avoid gzip compressed formats
Applies to SQL Query but also
applies to other Big Data engines
To learn more: https://www.ibm.com/blogs/bluemix/2018/06/big-data-layout/
Which Format is Query-Friendly?
2. Use Hive style partitioning
GPMeterStream/dt=2017-08-17/part-00085.csv
GPMeterStream/dt=2017-08-17/part-00086.csv
GPMeterStream/dt=2017-08-17/part-00087.csv
GPMeterStream/dt=2017-08-17/part-00088.csv
GPMeterStream/dt=2017-08-17/part-00089.csv
GPMeterStream/dt=2017-08-18/part-00001.csv
GPMeterStream/dt=2017-08-18/part-00002.csv
GPMeterStream/dt=2017-08-18/part-00003.csv
Avoid reading unnecessary objects altogether
Technique has limitations
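The effect of Hive-style partitioning can be illustrated with the object names above: a filter on the partition column is answered by looking at object keys alone, before any data is read. This is a simplified sketch of the idea, not how SQL Query or Spark implements partition pruning; the function name and the fixed `dt=` pattern are assumptions.

```python
import re

# Matches the dt=YYYY-MM-DD path segment used in the object names above.
PARTITION_RE = re.compile(r"/dt=(\d{4}-\d{2}-\d{2})/")


def prune_by_date(object_keys, wanted_date):
    """Keep only keys whose dt= partition equals wanted_date."""
    kept = []
    for key in object_keys:
        match = PARTITION_RE.search(key)
        if match and match.group(1) == wanted_date:
            kept.append(key)
    return kept


if __name__ == "__main__":
    keys = [
        "GPMeterStream/dt=2017-08-17/part-00085.csv",
        "GPMeterStream/dt=2017-08-17/part-00086.csv",
        "GPMeterStream/dt=2017-08-18/part-00001.csv",
    ]
    print(prune_by_date(keys, "2017-08-18"))
```

The limitation noted on the slide shows up here as well: pruning only helps when the query filters on the column that is encoded in the path.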
Best Practice: minimize bytes scanned
1. Use Parquet
• Column based
• Only read the columns you need
• Column wise compression
• Min/max metadata
Table Locators
cos://<endpoint>/<bucket>/[<prefix>] <format definition>
Endpoint – of your object storage bucket or a short alias
E.g. s3.us-south.objectstorage.appdomain.cloud or alias us-south
Bucket – name in object storage
Prefix – one or multiple objects (i.e. table partitions) with same prefix
Used in FROM clauses for input data and in target field for result set data
Examples:
cos://us-south/myBucket/myFolder/mySubFolder/myData.parquet
cos://us/otherBucket/myData
cos://us/otherBucket/myData/part
cos://eu/newBucket/
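The locator structure described above can be taken apart with a small helper. This is an illustrative sketch only: the function name and the returned dictionary keys are hypothetical, and it handles just the `cos://<endpoint>/<bucket>/[<prefix>]` part, not the format definition that may follow it.

```python
def parse_table_locator(locator):
    """Split a cos:// table locator into endpoint, bucket and prefix."""
    scheme = "cos://"
    if not locator.startswith(scheme):
        raise ValueError("table locator must start with cos://")
    rest = locator[len(scheme):]
    # At most three pieces: endpoint, bucket, and everything after as the prefix.
    parts = rest.split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("table locator needs an endpoint and a bucket")
    prefix = parts[2] if len(parts) == 3 else ""
    return {"endpoint": parts[0], "bucket": parts[1], "prefix": prefix}


if __name__ == "__main__":
    print(parse_table_locator("cos://us/otherBucket/myData/part"))
    print(parse_table_locator("cos://eu/newBucket/"))
```

An empty prefix, as in the last example on the slide, addresses every object in the bucket.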
<Table Locator> [JOBPREFIX JOBID | NONE]
[STORED AS CSV | PARQUET | JSON]
• Specifies the data format of the input data
• Table schema is automatically inferred at SQL execution time
• STORED AS Clause is optional, the default is CSV
• Additional parameters for CSV:
• E.g.: FIELDS TERMINATED BY '\t' NOHEADER
• JOBPREFIX only for targets: defines unique prefix to append. Default is JOBID.
Table Format Definition
SELECT … INTO
<Table Locator> [STORED AS CSV | PARQUET | JSON]
[PARTITIONED [BY (<column list>)]
[INTO <num> BUCKETS]
[EVERY <num> ROWS]]
[SORT BY (<column list>)]
BY: Produces Hive Style Partitioning
INTO: Produces a fixed number of partitions (hash partitioned)
EVERY: Produces partitions of even size (e.g. for pagination)
SORT BY: Exact result order & clustering when combined with PARTITIONED
Table Partitioning Definition
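The clause grammar on this slide can be assembled programmatically, which is handy when SQL statements are generated by a pipeline. The helper below is a sketch that follows the clause order shown on the slide; its name and parameters are hypothetical, and it assumes that only one of the BY, INTO and EVERY options is used per statement, which the slide does not state.

```python
def build_into_clause(locator, fmt="CSV", partition_by=None,
                      buckets=None, every_rows=None, sort_by=None):
    """Build the INTO ... part of a SELECT statement as a string."""
    if fmt not in ("CSV", "PARQUET", "JSON"):
        raise ValueError("fmt must be CSV, PARQUET or JSON")
    chosen = [o for o in (partition_by, buckets, every_rows) if o is not None]
    if len(chosen) > 1:
        # Assumption for this sketch: one partitioning option at a time.
        raise ValueError("choose only one of partition_by, buckets, every_rows")
    parts = [f"INTO {locator} STORED AS {fmt}"]
    if partition_by is not None:
        parts.append("PARTITIONED BY (" + ", ".join(partition_by) + ")")
    elif buckets is not None:
        parts.append(f"PARTITIONED INTO {buckets} BUCKETS")
    elif every_rows is not None:
        parts.append(f"PARTITIONED EVERY {every_rows} ROWS")
    if sort_by is not None:
        parts.append("SORT BY (" + ", ".join(sort_by) + ")")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_into_clause("cos://us/otherBucket/myData", "PARQUET",
                            partition_by=["dt"]))
```

The generated text still has to be validated by the service; the helper only mirrors the slide's syntax summary.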
Submit a SQL query
POST https://api.sql-query.cloud.ibm.com/v2/sql_jobs
Runs the SQL in the background and returns a job_id
Detailed info for a SQL query (e.g. status, result location)
GET https://api.sql-query.cloud.ibm.com/v2/sql_jobs/{job_id}
Returns JSON with query execution details
List of recent SQL query executions
GET https://api.sql-query.cloud.ibm.com/v2/sql_jobs
Returns JSON array with last 30 SQL submissions and outcomes
IBM SQL Query REST API
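The submit-then-poll pattern behind these three calls can be sketched with the Python standard library. The endpoint URLs are taken from the slide; everything else is an assumption made for illustration and should be checked against the API documentation: the JSON field name `statement`, the bearer-token `Authorization` header, and the absence of further required parameters such as an instance identifier. The functions only build request objects and send nothing.

```python
import json
import urllib.request

# Endpoint from the slide.
API_BASE = "https://api.sql-query.cloud.ibm.com/v2/sql_jobs"


def build_submit_request(statement, bearer_token):
    """Build (but do not send) the POST that submits a SQL statement."""
    # Assumption: the request body is JSON with a "statement" field.
    body = json.dumps({"statement": statement}).encode("utf-8")
    return urllib.request.Request(
        API_BASE,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authentication.
            "Authorization": f"Bearer {bearer_token}",
        },
    )


def build_status_request(job_id, bearer_token):
    """Build (but do not send) the GET that fetches details for one job."""
    return urllib.request.Request(
        f"{API_BASE}/{job_id}",
        method="GET",
        headers={"Authorization": f"Bearer {bearer_token}"},
    )


if __name__ == "__main__":
    req = build_submit_request("SELECT 1", "<token>")
    print(req.get_method(), req.full_url)
```

A caller would send the first request with `urllib.request.urlopen`, read the returned job_id, and then repeat the status request until the job reports completion.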
Scaling Analytics: Data Skipping Saving you Time and $
Index All
Objects
IBM Cloud Object Storage
Data Set Objects
SQL
Query
Data Skipping
Indexing
Candidate
Objects
WHERE Clause
Saving Time
and $
SQL Query learns which objects are not relevant to a query
using a data skipping index
CREATE METAINDEX stores index summary metadata for
each object. Much smaller than the data.
SQL queries skip irrelevant objects, significantly reducing I/O
E.g.:
Independent of data formats
Index Types: Min/Max, Value List, Bounding Box
Get location and time of heat waves (>40 Celsius)
SELECT lat, long, city, temp, date
FROM weather
WHERE temp > 40.0
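The idea behind a Min/Max data skipping index can be shown in miniature for the heat wave query above. This is a conceptual sketch, not SQL Query's index implementation: each object gets a small summary (minimum and maximum temperature), and a query with `temp > 40.0` only needs to read objects whose maximum exceeds the threshold. Object names, sample values and function names are made up for illustration.

```python
def build_minmax_index(objects):
    """Summarize each object as (min, max); objects maps name -> list of temps."""
    return {name: (min(vals), max(vals)) for name, vals in objects.items() if vals}


def candidate_objects(index, threshold):
    """Return objects that may contain a value greater than threshold."""
    return sorted(name for name, (_lo, hi) in index.items() if hi > threshold)


if __name__ == "__main__":
    # Hypothetical objects with their temperature readings
    data = {
        "weather/part-0": [12.5, 18.0, 21.3],
        "weather/part-1": [35.1, 41.7, 39.9],
        "weather/part-2": [-3.0, 4.2, 9.8],
    }
    index = build_minmax_index(data)
    print(candidate_objects(index, 40.0))
```

The index is far smaller than the data it summarizes, and skipping is conservative: an object that survives the check may still contain no matching rows, but a skipped object never does.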
• JDBC compliant driver library that wraps REST API
• Wrapping both the SQL Query and COS REST APIs
• Exposing regular session interface (JDBC Connection)
• Enabling custom JDBC application support
• Enabling BI application support
• Early adopter: Looker
• Support for stored table meta data (simple catalog)
• Stored as json in COS and referenced via JDBC
connection string
• I.e. DatabaseMetaData interface also supported
JDBC Driver for BI Applications
Apply for Beta Now
Query
JDBC Driver
REST
COS
JDBC
API
Data
Result Sets
Table
Catalog
E.g. Looker
Using SQL Query JDBC Driver
Define table catalog
• JSON file in COS containing:
• Table name
• Location of table objects on COS
• Object format
• Column names
• Column types
• INT, FLOAT, VARCHAR, TIMESTAMP
JDBC Connection String:
jdbc:SQLQuery:<sql-query instance crn>
?schemabucket=<COS bucket with json catalog>
?schemafile=<COS object with json catalog>
&apikey=<api key for your account>
&targetcosurl=<COS URL for result set>
Think 2019 / 2263 / February 2019 / © 2019 IBM Corporation
IBM Cloud Functions
Fair Never pay for idle
Polyglot
Elastic
Automation
Triggers
Open Source
CLOUD
FUNCTIONS
Schedules
Sequences

More Related Content

What's hot

Is there a way that we can build our Azure Synapse Pipelines all with paramet...
Is there a way that we can build our Azure Synapse Pipelines all with paramet...Is there a way that we can build our Azure Synapse Pipelines all with paramet...
Is there a way that we can build our Azure Synapse Pipelines all with paramet...
Erwin de Kreuk
 
SQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 QuestionsSQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 Questions
Mike Broberg
 
Azure Databricks is Easier Than You Think
Azure Databricks is Easier Than You ThinkAzure Databricks is Easier Than You Think
Azure Databricks is Easier Than You Think
Ike Ellis
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
Raul Chong
 
Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2
Amazon Web Services
 
Moving to the cloud; PaaS, IaaS or Managed Instance
Moving to the cloud; PaaS, IaaS or Managed InstanceMoving to the cloud; PaaS, IaaS or Managed Instance
Moving to the cloud; PaaS, IaaS or Managed Instance
Thomas Sykes
 
Leveraging Azure Databricks to minimize time to insight by combining Batch an...
Leveraging Azure Databricks to minimize time to insight by combining Batch an...Leveraging Azure Databricks to minimize time to insight by combining Batch an...
Leveraging Azure Databricks to minimize time to insight by combining Batch an...
Microsoft Tech Community
 
Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2
Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2
Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2
Amazon Web Services
 
Accessing Google Cloud APIs
Accessing Google Cloud APIsAccessing Google Cloud APIs
Accessing Google Cloud APIs
wesley chun
 
NoSQL, which way to go?
NoSQL, which way to go?NoSQL, which way to go?
NoSQL, which way to go?
Ahmed Elharouny
 
Data Design for Microservices - DevDay Austin 2017 Day 2
Data Design for Microservices - DevDay Austin 2017 Day 2Data Design for Microservices - DevDay Austin 2017 Day 2
Data Design for Microservices - DevDay Austin 2017 Day 2
Amazon Web Services
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Lucas Jellema
 
Bi and AI updates in the Microsoft Data Platform stack
Bi and AI updates in the Microsoft Data Platform stackBi and AI updates in the Microsoft Data Platform stack
Bi and AI updates in the Microsoft Data Platform stack
Ivan Donev
 
Discovery Day 2019 Sofia - Big data clusters
Discovery Day 2019 Sofia - Big data clustersDiscovery Day 2019 Sofia - Big data clusters
Discovery Day 2019 Sofia - Big data clusters
Ivan Donev
 
Microsoft SQL server 2017 Level 300 technical deck
Microsoft SQL server 2017 Level 300 technical deckMicrosoft SQL server 2017 Level 300 technical deck
Microsoft SQL server 2017 Level 300 technical deck
George Walters
 
Discovery Day 2019 Sofia - What is new in SQL Server 2019
Discovery Day 2019 Sofia - What is new in SQL Server 2019Discovery Day 2019 Sofia - What is new in SQL Server 2019
Discovery Day 2019 Sofia - What is new in SQL Server 2019
Ivan Donev
 
Introduction to Amazon Kinesis Analytics
Introduction to Amazon Kinesis AnalyticsIntroduction to Amazon Kinesis Analytics
Introduction to Amazon Kinesis Analytics
Amazon Web Services
 
Modern ETL: Azure Data Factory, Data Lake, and SQL Database
Modern ETL: Azure Data Factory, Data Lake, and SQL DatabaseModern ETL: Azure Data Factory, Data Lake, and SQL Database
Modern ETL: Azure Data Factory, Data Lake, and SQL Database
Eric Bragas
 
PaaSport to Paradise: Lifting & Shifting with Azure SQL Database/Managed Inst...
PaaSport to Paradise: Lifting & Shifting with Azure SQL Database/Managed Inst...PaaSport to Paradise: Lifting & Shifting with Azure SQL Database/Managed Inst...
PaaSport to Paradise: Lifting & Shifting with Azure SQL Database/Managed Inst...
Sandy Winarko
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
James Serra
 

What's hot (20)

Is there a way that we can build our Azure Synapse Pipelines all with paramet...
Is there a way that we can build our Azure Synapse Pipelines all with paramet...Is there a way that we can build our Azure Synapse Pipelines all with paramet...
Is there a way that we can build our Azure Synapse Pipelines all with paramet...
 
SQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 QuestionsSQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 Questions
 
Azure Databricks is Easier Than You Think
Azure Databricks is Easier Than You ThinkAzure Databricks is Easier Than You Think
Azure Databricks is Easier Than You Think
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
 
Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2
 
Moving to the cloud; PaaS, IaaS or Managed Instance
Moving to the cloud; PaaS, IaaS or Managed InstanceMoving to the cloud; PaaS, IaaS or Managed Instance
Moving to the cloud; PaaS, IaaS or Managed Instance
 
Leveraging Azure Databricks to minimize time to insight by combining Batch an...
Leveraging Azure Databricks to minimize time to insight by combining Batch an...Leveraging Azure Databricks to minimize time to insight by combining Batch an...
Leveraging Azure Databricks to minimize time to insight by combining Batch an...
 
Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2
Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2
Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2
 
Accessing Google Cloud APIs
Accessing Google Cloud APIsAccessing Google Cloud APIs
Accessing Google Cloud APIs
 
NoSQL, which way to go?
NoSQL, which way to go?NoSQL, which way to go?
NoSQL, which way to go?
 
Data Design for Microservices - DevDay Austin 2017 Day 2
Data Design for Microservices - DevDay Austin 2017 Day 2Data Design for Microservices - DevDay Austin 2017 Day 2
Data Design for Microservices - DevDay Austin 2017 Day 2
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
 
Bi and AI updates in the Microsoft Data Platform stack
Bi and AI updates in the Microsoft Data Platform stackBi and AI updates in the Microsoft Data Platform stack
Bi and AI updates in the Microsoft Data Platform stack
 
Discovery Day 2019 Sofia - Big data clusters
Discovery Day 2019 Sofia - Big data clustersDiscovery Day 2019 Sofia - Big data clusters
Discovery Day 2019 Sofia - Big data clusters
 
Microsoft SQL server 2017 Level 300 technical deck
Microsoft SQL server 2017 Level 300 technical deckMicrosoft SQL server 2017 Level 300 technical deck
Microsoft SQL server 2017 Level 300 technical deck
 
Discovery Day 2019 Sofia - What is new in SQL Server 2019
Discovery Day 2019 Sofia - What is new in SQL Server 2019Discovery Day 2019 Sofia - What is new in SQL Server 2019
Discovery Day 2019 Sofia - What is new in SQL Server 2019
 
Introduction to Amazon Kinesis Analytics
Introduction to Amazon Kinesis AnalyticsIntroduction to Amazon Kinesis Analytics
Introduction to Amazon Kinesis Analytics
 
Modern ETL: Azure Data Factory, Data Lake, and SQL Database
Modern ETL: Azure Data Factory, Data Lake, and SQL DatabaseModern ETL: Azure Data Factory, Data Lake, and SQL Database
Modern ETL: Azure Data Factory, Data Lake, and SQL Database
 
PaaSport to Paradise: Lifting & Shifting with Azure SQL Database/Managed Inst...
PaaSport to Paradise: Lifting & Shifting with Azure SQL Database/Managed Inst...PaaSport to Paradise: Lifting & Shifting with Azure SQL Database/Managed Inst...
PaaSport to Paradise: Lifting & Shifting with Azure SQL Database/Managed Inst...
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
 

Similar to Serverless SQL

IBM Cloud Day January 2021 Data Lake Deep Dive
IBM Cloud Day January 2021 Data Lake Deep DiveIBM Cloud Day January 2021 Data Lake Deep Dive
IBM Cloud Day January 2021 Data Lake Deep Dive
Torsten Steinbach
 
Coud-based Data Lake for Analytics and AI
Coud-based Data Lake for Analytics and AICoud-based Data Lake for Analytics and AI
Coud-based Data Lake for Analytics and AI
Torsten Steinbach
 
IBM Cloud Day January 2021 - A well architected data lake
IBM Cloud Day January 2021 - A well architected data lakeIBM Cloud Day January 2021 - A well architected data lake
IBM Cloud Day January 2021 - A well architected data lake
Torsten Steinbach
 
IBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM Cloud
IBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM CloudIBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM Cloud
IBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM Cloud
Torsten Steinbach
 
IBM Cloud Native Day April 2021: Serverless Data Lake
IBM Cloud Native Day April 2021: Serverless Data LakeIBM Cloud Native Day April 2021: Serverless Data Lake
IBM Cloud Native Day April 2021: Serverless Data Lake
Torsten Steinbach
 
Cloud-based Data Lake for Analytics and AI
Cloud-based Data Lake for Analytics and AICloud-based Data Lake for Analytics and AI
Cloud-based Data Lake for Analytics and AI
Torsten Steinbach
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS
Amazon Web Services
 
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftData warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Amazon Web Services
 
Azure Data Factory for Azure Data Week
Azure Data Factory for Azure Data WeekAzure Data Factory for Azure Data Week
Azure Data Factory for Azure Data Week
Mark Kromer
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SF
Amazon Web Services
 
Webinar: SQL for Machine Data?
Webinar: SQL for Machine Data?Webinar: SQL for Machine Data?
Webinar: SQL for Machine Data?
Crate.io
 
The Best of re:invent 2016
The Best of re:invent 2016The Best of re:invent 2016
The Best of re:invent 2016
Amazon Web Services
 
Estimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformEstimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics Platform
DATAVERSITY
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
Amazon Web Services
 
Building a Big Data & Analytics Platform using AWS
Building a Big Data & Analytics Platform using AWS Building a Big Data & Analytics Platform using AWS
Building a Big Data & Analytics Platform using AWS
Amazon Web Services
 
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
DEVNET-1140	InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...DEVNET-1140	InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
Cisco DevNet
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft Azure
Dmitry Anoshin
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
DATAVERSITY
 
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWSAWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
Amazon Web Services
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
Amazon Web Services
 

Similar to Serverless SQL (20)

IBM Cloud Day January 2021 Data Lake Deep Dive
IBM Cloud Day January 2021 Data Lake Deep DiveIBM Cloud Day January 2021 Data Lake Deep Dive
IBM Cloud Day January 2021 Data Lake Deep Dive
 
Coud-based Data Lake for Analytics and AI
Coud-based Data Lake for Analytics and AICoud-based Data Lake for Analytics and AI
Coud-based Data Lake for Analytics and AI
 
IBM Cloud Day January 2021 - A well architected data lake
IBM Cloud Day January 2021 - A well architected data lakeIBM Cloud Day January 2021 - A well architected data lake
IBM Cloud Day January 2021 - A well architected data lake
 
IBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM Cloud
IBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM CloudIBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM Cloud
IBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM Cloud
 
IBM Cloud Native Day April 2021: Serverless Data Lake
IBM Cloud Native Day April 2021: Serverless Data LakeIBM Cloud Native Day April 2021: Serverless Data Lake
IBM Cloud Native Day April 2021: Serverless Data Lake
 
Cloud-based Data Lake for Analytics and AI
Cloud-based Data Lake for Analytics and AICloud-based Data Lake for Analytics and AI
Cloud-based Data Lake for Analytics and AI
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS
 
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Serverless SQL

  • 2. SQL on Object Storage DM Gartner Hype Cycle 2018
  • 3. Evolution of Form Factors for Big Data Analytics The 90s: Enterprise Data Warehouses: tightly integrated and optimized systems 2000: Hadoop: introduced open data formats & easy scaling on commodity HW Today: Cloud-Native: Serverless Analytics-aaS • Seamless elasticity • Pay-per-query consumption • Analyze data as it sits in an object store • Disaggregated architecture • No more infrastructure headaches
  • 4. Ingredient 3: Serverless Data Transformation Ingredient 4: Serverless Analytics Ingredient 5: Serverless Automation Ingredient 2: Serverless Data Ingest Sharing Economy for Analytics Ingredient 1: Serverless Storage
  • 5. Object Storage IBM Cloud Object Storage Objects Objects Objects At Rest On the Wire Buckets Encrypted Pennies per GB REST Elastic Durable Flexible Resiliency Choices Storage Classes User Managed Encryption Keys S3 Compatible High Speed Data Transfer Aspera SQL Queries
  • 6. Data Ingest Options 6 High Customizability Degree of Serverless-ness IBM Event Streams (Kafka aaS) IBM Cloud Functions Out-of-the-Box IBM Streaming Analytics (IBM Streams aaS) via Cloud Object Storage API SQL Query ETL Cloudant Replication Blockchain Synch
  • 7. Cloud Data Data Transformation Serverless SQL Analytics IBM SQL Query Object Storage Db2 + Developers Data Engineers Data Analysts ✓ Perfect for Machine Generated Data ✓ Ad-hoc Data Exploration ✓ Operationalizing Data Pipelines ✓ Big Data Lakes ✓ Flexible Data Transformation ✓ Extremely affordable: $5/TB scanned ✓ 100% API enabled ✓ Analytics on Object Storage ✓ Big Data Scale-Out. Running on Spark ✓ 100% Self service – No Setup
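The pay-per-scan price quoted on this slide can be turned into a rough estimate. The sketch below is only illustrative: the $5/TB figure comes from the slide, while the decimal terabyte (10^12 bytes) and the function name `scan_cost_usd` are assumptions, and actual billing rules may differ.

```python
PRICE_PER_TB_USD = 5.0   # from the slide: $5 per TB scanned
BYTES_PER_TB = 10 ** 12  # assumption: decimal terabyte


def scan_cost_usd(bytes_scanned: int) -> float:
    """Estimate the cost of a query from the number of bytes it scans."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_USD
```

Under these assumptions, a query that scans 200 GB would cost roughly one dollar, which is why reducing bytes scanned is the main cost lever.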
  • 8. 2. Read data 4. Read results Application 3. Write results IBM Cloud Object Storage Result Set Data Set Data Set Data Set 1. Submit SQL SQL Archive / Export IBM Cloud Streaming IBM Streams Event Streams Land Query IBM Cloud Functions IBM SQL Query Architecture IBM Cloud Databases Db2 on Cloud Geospatial SQL Data Skipping Timeseries SQL Upload
  • 9. Data Center 2 Analytics Engine Cluster 20 Kernels Node 1 Node 3 Node 2 Node 3 … 20 Kernels … Data Center 3 Analytics Engine Cluster 20 Kernels Node 1 Node 3 Node 2 Node 3 … 20 Kernels … SQL 1 SQL 1 Data Center 1 IBM Cloud SQL Query – Very High Level Architecture (MVP 1Q 2018) Analytics Engine Cluster 20 Kernels Cluster Pool Request Queue Node 1 Node 3 Node 2 Node 3 … Kernel Pools 20 Kernels … SQL 1 SQL 2 SQL 3 SQL 4 SQL 5 Cloud Object Storage SQL 6 … JKG (Web Sockets) IBM Cloud Query – Spark Cluster Architecture
  • 10. SQL REST API Create Query SQL Web Console Watson Studio Notebooks SQL Cloud Function Integrate Explore Deploy IBM Cloud Query – Access Patterns Node SDK Python SDK JDBC Looker
  • 11. Best of breed Spark SQL Reference • Complete, intuitive and interactive SQL Reference • Each sample SQL can immediately be executed as is https://cloud.ibm.com/docs/services/sql-query/sqlref/sql_reference.html#sql-reference Analytics using full Power of Spark SQL
  • 12. IBM SQL Query – Timeseries SQL 1/2 § Intuitive first-of-a-kind SQL extensions for timeseries operations § Industry leading differentiators, including: • Timeseries transformation functions: • Correlation, Fourier transformation, z-normalization, Granger, interpolation, and distances • Temporal Joins: SQL support for Left/Right/Full Inner and Outer joins of multiple timeseries Alignment & Joining:
  • 13. § Further Industry leading differentiators • Numerical and categorical timeseries types • Timeseries data skipping for fast queries • Forecasting: • ARIMA, BATS, Anomaly detection, etc. • Subsequence Mining: • Train & match models for event sequences • Segmentation: • Time-based, Record-based, Anchor-based, Burst, and silence Segmentation: IBM SQL Query – Timeseries SQL 2/2
  • 14. IBM SQL Query – Spatial SQL § SQL/MM standard to store & analyze spatial data in RDBMS § Migration of PostGIS compliant SQL queries § Aggregation, computation and join via native SQL syntax § Industry leading differentiators • Geodetic Full Earth support • Increased developer productivity • Avoid piece-wise planar projections • High precision calculations anywhere on the earth • Very large polygons (e.g. countries), polar caps, crossing the anti-meridian • Spatial data skipping for fast queries • Native and fine-granular geohash support • Fast spatial aggregation
  • 15. Example: Spatio-Temporal Processing of Sensor Data IBM Cloud Object Storage Sensor Data Query Location Analytics Mobile Cars Devices Land Location Filtering Spatial Aggregation GPS SQL/MM Sensor Metrics t t t Timeseries Assembly Timeseries Join Timeseries SQL t
  • 17. Unstructured Data Prep SQL Query Cloud Functions Analyze COS COS Extract Features Automated/Scheduled SQL Execution SQL Query Cloud Functions Develop SQL Deploy as SQL Cloud Function Set up Cloud Function Trigger/Schedule Shield Data From Direct Access SQL Query Cloud Functions Deploy Cloud Function with COS API Key User Calls Function to Access Data COS Grant Execute on SQL Cloud Function to User Configure SQL Pipelines SQL Query Cloud Functions User creates function sequence to automate flow of consecutive SQLs Sequence SQL Query Cloud Functions 1. 2. Use Cases of Cloud Functions Adding Value to SQL
  • 18. Ingredient 3: Serverless Data Transformation ✓ Ingredient 4: Serverless Analytics ✓ Ingredient 5: Serverless Automation ✓ Ingredient 2: Serverless Data Ingest ✓ Ingredient 1: Serverless Storage ✓ Now, what is this all good for?
  • 19. IBM Cloud Object Storage Acquire Query Data Warehouses & Databases Db2 on Cloud Process Analyze Applications Applications Applications IoT Streaming Devices Devices Devices BI & AI Land Log Messages Cleanse Filter Merge Aggregate Compress Watson Studio Looker Cognos WML Explore Analyze Analyze Promote Use for Data Pipelines to fuel BI & AI
  • 20. Data-Driven Decisions ☛ Understanding system health, user behavior & workload status Collecting & Analyzing Log Data ☛ Is NOT an afterthought but rather the foundation for decisions on system and feature design. Data Volume Growing Rapidly ☛ Growth rates and data volume at rest can jump dramatically. Very high elasticity is required. Competitive Advantage ☛ Is based on short runways for turning data into actions Turn your Logs into Business – Log Data Is The Cloud-Native Currency
  • 21. Logs Your Cloud Application/Solution IBM Cloud Object Storage Query Transform Compress Aggregate Repartition Analyze Anomaly Detection User Segmentation Customer Support Resource Planning • Build & run data pipelines and analytics of your log message data • Flexible log data analytics with full power of SQL • Seamless scalability & elasticity according to your log message volume Use for analyzing application logs
  • 22. IDUG Db2 Tech Conference Charlotte, NC | June 2 – 6, 2019 Data Lake in IBM Cloud – How it works IBM Cloud Data Lake Data Streaming Upload ETL DB2 Feature Extraction Data Prep ICD DB2 ICD OLAP Analytics WML ETL Federate Aspera Cloudant Replication Secure Sync IBM Blockchain Applications Applications Watson Studio Knowledge Catalog METASTORE AI ICP for Data Analytics Engine IBM Cloud Functions Land Process Integrate Key Protect Index Creation
  • 23. Getting started: https://www.ibm.com/cloud/sql-query SQL Query Intro Video: https://youtu.be/s-FznfHJpoU SQL Query Starter Notebook in Watson Studio: https://ibm.biz/BdYNrN SQL Reference: https://ibm.biz/Bd2jF7 SQL Query API doc: https://cloud.ibm.com/apidocs/sql-query Big Data Layout Best Practices for COS: https://ibm.biz/Bd2jRg Serverless Data & Analytics: https://ibm.biz/Bd2jF5 Further Resources
  • 26. 1. Identify friction points in users’ digital journey, e.g.: • Clicks-2-purchase ratio • Unexpected repeated page visits per user • E.g. entering payment data should only happen once • Last page visited per session 2. Identify click sequences for successful purchase • Sequence matching using timeseries analysis 3. Identify customers/segments likely to churn or expand • Look for typical page visits, actions or flows • E.g. Terms & conditions, invite additional users etc. 4. Determining your most important content online What Insights can I extract from a Clickstream?
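Two of the friction-point metrics listed on this slide (last page visited per session, and unexpected repeated page visits) can be sketched in a few lines. This is a minimal illustration rather than the presenter's implementation: the event layout `(session_id, timestamp, page)`, the sample data, and both function names are hypothetical, and sessions stand in for users here.

```python
from collections import Counter

# Hypothetical click events: (session_id, timestamp, page)
clicks = [
    ("s1", 1, "/home"),
    ("s1", 2, "/cart"),
    ("s1", 3, "/payment"),
    ("s1", 4, "/payment"),
    ("s2", 1, "/home"),
    ("s2", 2, "/pricing"),
]


def last_page_per_session(events):
    """Return the page with the highest timestamp for each session."""
    last = {}
    for session_id, _, page in sorted(events, key=lambda e: (e[0], e[1])):
        last[session_id] = page  # later timestamps overwrite earlier ones
    return last


def repeated_visits(events, page):
    """Return sessions that hit `page` more than once, with their visit counts."""
    counts = Counter(sid for sid, _, p in events if p == page)
    return {sid: n for sid, n in counts.items() if n > 1}
```

With the sample data, session `s1` visits the payment page twice, which is the kind of repeated visit the slide flags as a friction point.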
  • 27. Building IBM Cloud-Native Data Lake Serverless SQL Serverless Storage Serverless Pipeline Automation ✓ ✓ ✓ Orchestration Processing Persistency Data Ingest ✓ Data Catalog ✓ Serverless Unstructured Data Processing ✓
  • 28. • Traditional analytics systems • Fixed capacities of appliances • Specialized teams of data engineers & DBAs who manage data model, access and ETL • BI analysts who have access only to the curated data sets in EDW • Innovative enterprises today • Wide range of teams that require direct access to same data set at all stages of the data pipeline: BI analysts, data scientists, quantitative marketers, dev/ops, developers • Data engineers that support these teams need a much, much more scalable and cost-effective platform to ensure all teams have access they need and when needed • Building analytics platforms in the cloud because of the scale and cost-efficiencies that come with serverless analytics over object stores Serverless – The key to IT Sharing Economy ... also for Analytics
  • 29. Proper data organization → better performance and lower cost The key factors are: • Number of bytes shipped • Number of REST requests Best practices for structured data: • Choose the right object size (sweet spot: 128 MB) • Choose the right format • Choose the right data layout • Avoid gzip compressed formats Applies to SQL Query but also applies to other Big Data engines To learn more: https://www.ibm.com/blogs/bluemix/2018/06/big-data-layout/
  • 30. Which Format is Query-Friendly?
  • 31. Best Practice: minimize bytes scanned 1. Use Parquet • Column-based • Only read the columns you need • Column-wise compression • Min/max metadata 2. Use Hive style partitioning GPMeterStream/dt=2017-08-17/part-00085.csv GPMeterStream/dt=2017-08-17/part-00086.csv GPMeterStream/dt=2017-08-17/part-00087.csv GPMeterStream/dt=2017-08-17/part-00088.csv GPMeterStream/dt=2017-08-17/part-00089.csv GPMeterStream/dt=2017-08-18/part-00001.csv GPMeterStream/dt=2017-08-18/part-00002.csv GPMeterStream/dt=2017-08-18/part-00003.csv Avoid reading unnecessary objects altogether Technique has limitations
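Hive style partitioning lets an engine discard whole objects by looking only at their names. The sketch below illustrates that idea using the object keys from the slide; the function name `prune_by_partition` is hypothetical, and a real engine derives the filter from the WHERE clause rather than taking it as an argument.

```python
def prune_by_partition(object_keys, column, value):
    """Keep only keys whose path contains the Hive-style segment column=value."""
    segment = f"{column}={value}"
    return [key for key in object_keys if segment in key.split("/")]


# Object keys taken from the slide
keys = [
    "GPMeterStream/dt=2017-08-17/part-00085.csv",
    "GPMeterStream/dt=2017-08-17/part-00086.csv",
    "GPMeterStream/dt=2017-08-17/part-00087.csv",
    "GPMeterStream/dt=2017-08-17/part-00088.csv",
    "GPMeterStream/dt=2017-08-17/part-00089.csv",
    "GPMeterStream/dt=2017-08-18/part-00001.csv",
    "GPMeterStream/dt=2017-08-18/part-00002.csv",
    "GPMeterStream/dt=2017-08-18/part-00003.csv",
]

to_read = prune_by_partition(keys, "dt", "2017-08-18")
```

A filter on `dt = '2017-08-18'` then touches three of the eight objects. The limitation the slide hints at is that this only helps for predicates on the partition column.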
  • 32. Table Locators cos://<endpoint>/<bucket>/[<prefix>] <format definition> Endpoint – of your object storage bucket or a short alias E.g. s3.us-south.objectstorage.appdomain.cloud or alias us-south Bucket – name in object storage Prefix – one or multiple objects (i.e. table partitions) with same prefix Used in FROM clauses for input data and in target field for result set data Examples: cos://us-south/myBucket/myFolder/mySubFolder/myData.parquet cos://us/otherBucket/myData cos://us/otherBucket/myData/part cos://eu/newBucket/
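To make the locator layout concrete, the following sketch splits a `cos://` locator into the three parts named on the slide. It is an illustrative parser only: the `TableLocator` type and `parse_table_locator` function are hypothetical names, and the service may apply validation rules not covered here.

```python
from typing import NamedTuple


class TableLocator(NamedTuple):
    endpoint: str  # full endpoint host name or a short alias such as "us-south"
    bucket: str
    prefix: str    # may be empty, which addresses the whole bucket


def parse_table_locator(locator: str) -> TableLocator:
    """Split cos://<endpoint>/<bucket>/[<prefix>] into its components."""
    scheme = "cos://"
    if not locator.startswith(scheme):
        raise ValueError(f"not a COS table locator: {locator!r}")
    parts = locator[len(scheme):].split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"endpoint and bucket are required: {locator!r}")
    prefix = parts[2] if len(parts) == 3 else ""
    return TableLocator(parts[0], parts[1], prefix)
```

Applied to the slide's examples, `cos://eu/newBucket/` yields an empty prefix, while `cos://us/otherBucket/myData/part` yields the prefix `myData/part`, which matches every object whose name starts with it.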
  • 33. <Table Locator> [JOBPREFIX JOBID | NONE] [STORED AS CSV | PARQUET | JSON] • Specifies the data format of the input data • Table schema is automatically inferred at SQL execution time • STORED AS clause is optional, the default is CSV • Additional parameters for CSV: • E.g.: FIELDS TERMINATED BY '\t' NOHEADER • JOBPREFIX only for targets: defines unique prefix to append. Default is JOBID. Table Format Definition
  • 34. SELECT … INTO <Table Locator> [STORED AS CSV | PARQUET | JSON] [PARTITIONED [BY (<column list>)] [INTO <num> BUCKETS] [EVERY <num> ROWS]] [SORT BY (<column list>)] BY: Produces Hive style partitioning INTO: Produces a fixed number of partitions (hash partitioned) EVERY: Produces partitions of even size (e.g. for pagination) SORT BY: Exact result order & clustering when combined with PARTITIONED Table Partitioning Definition
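The difference between the INTO and EVERY options can be sketched on an in-memory list of rows. This is a conceptual illustration only: the slide does not say which hash function the service uses, so CRC32 is an assumption chosen for determinism, and both function names are hypothetical.

```python
import zlib


def partition_into_buckets(rows, key, num_buckets):
    """INTO <num> BUCKETS: a fixed number of partitions, assigned by hashing a column."""
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        # Assumption: CRC32 stands in for whatever hash the service applies.
        digest = zlib.crc32(str(row[key]).encode("utf-8"))
        buckets[digest % num_buckets].append(row)
    return buckets


def partition_every(rows, rows_per_partition):
    """EVERY <num> ROWS: partitions of even size, e.g. one per result page."""
    return [
        rows[i:i + rows_per_partition]
        for i in range(0, len(rows), rows_per_partition)
    ]


rows = [{"id": i} for i in range(10)]
pages = partition_every(rows, 4)
buckets = partition_into_buckets(rows, "id", 3)
```

Hash buckets keep rows with the same key together but may be uneven in size, whereas EVERY gives predictable page sizes with only the last partition possibly shorter.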
  • 35. Submit a SQL query POST https://api.sql-query.cloud.ibm.com/v2/sql_jobs Runs the SQL in the background and returns a job_id Detailed info for a SQL query (e.g. status, result location) GET https://api.sql-query.cloud.ibm.com/v2/sql_jobs/{job_id} Returns JSON with query execution details List of recent SQL query executions GET https://api.sql-query.cloud.ibm.com/v2/sql_jobs Returns JSON array with last 30 SQL submissions and outcomes IBM SQL Query REST API
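The three endpoints above can be exercised from any HTTP client. The sketch below only builds the requests and does not send them. The URLs and methods come from the slide; the JSON body field `statement`, the bearer-token header, and the helper names are assumptions that should be checked against the SQL Query API documentation before use.

```python
import json
import urllib.request

BASE_URL = "https://api.sql-query.cloud.ibm.com/v2/sql_jobs"  # from the slide


def build_submit_request(statement, token):
    """POST: submit a SQL statement; the service is expected to return a job_id."""
    # Assumption: the body carries the SQL text in a "statement" field.
    body = json.dumps({"statement": statement}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumption: bearer-token auth
        },
    )


def build_status_request(job_id, token):
    """GET: fetch execution details (status, result location) for one job."""
    return urllib.request.Request(
        f"{BASE_URL}/{job_id}",
        method="GET",
        headers={"Authorization": f"Bearer {token}"},
    )


def build_list_request(token):
    """GET: list recent SQL submissions and their outcomes."""
    return urllib.request.Request(
        BASE_URL, method="GET", headers={"Authorization": f"Bearer {token}"}
    )
```

A client would pass a built request to `urllib.request.urlopen`, read the `job_id` from the response, and then poll the status request until the job finishes.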
  • 37. Scaling Analytics: Data Skipping Saving you Time and $ Index All Objects IBM Cloud Object Storage Data Set Objects SQL Query Data Skipping Indexing Candidate Objects WHERE Clause Saving Time and $ SQL Query learns which objects are not relevant to a query using a data skipping index CREATE METAINDEX stores index summary metadata for each object. Much smaller than the data. SQLs skipping irrelevant objects to significantly reduce I/O Independent of data formats Index Types: Min/Max, Value List, Bounding Box E.g.: Get location and time of heat waves (>40 Celsius) SELECT lat, long, city, temp, date FROM weather WHERE temp > 40.0
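The min/max index type named on this slide can be illustrated with a small in-memory model. This is a conceptual sketch, not the service's implementation: the object names, sample temperatures, and function names are hypothetical, and every object is assumed to hold at least one row.

```python
def build_minmax_index(objects, column):
    """Store (min, max) of `column` per object; far smaller than the data itself."""
    index = {}
    for name, rows in objects.items():
        values = [row[column] for row in rows]  # assumes each object is non-empty
        index[name] = (min(values), max(values))
    return index


def candidate_objects_gt(index, threshold):
    """Objects that may contain rows with column > threshold; the rest are skipped."""
    return [name for name, (_, high) in index.items() if high > threshold]


# Hypothetical weather objects
objects = {
    "weather/part-0": [{"temp": 12.5}, {"temp": 30.1}],
    "weather/part-1": [{"temp": 35.0}, {"temp": 41.2}],
    "weather/part-2": [{"temp": -3.0}, {"temp": 8.4}],
}

index = build_minmax_index(objects, "temp")
candidates = candidate_objects_gt(index, 40.0)
```

For the heat-wave query `WHERE temp > 40.0`, only `weather/part-1` has a maximum above the threshold, so the other two objects never need to be read.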
  • 38. • JDBC compliant driver library that wraps REST API • Wrapping both SQL Query and COS REST API • Exposing regular session interface (JDBC Connection) • Enabling custom JDBC application support • Enabling BI application support • Early adopter: Looker • Support for stored table metadata (simple catalog) • Stored as JSON in COS and referenced via JDBC connection string • I.e. DatabaseMetaData interface also supported JDBC Driver for BI Applications Apply for Beta Now Query JDBC Driver REST COS JDBC API Data Result Sets Table Catalog E.g. Looker
  • 39. Using SQL Query JDBC Driver Define table catalog • JSON file in COS containing: • Table name • Location of table objects on COS • Object format • Column names • Column types • INT, FLOAT, VARCHAR, TIMESTAMP JDBC Connection String: jdbc:SQLQuery:<sql-query instance crn> ?schemabucket=<COS bucket with json catalog> ?schemafile=<COS object with json catalog> &apikey=<api key for your account> &targetcosurl=<COS URL for result set>
  • 40. Think 2019 / 2263 / February 2019 / © 2019 IBM Corporation
  • 41. IBM Cloud Functions Fair Never pay for idle Polyglot Elastic Automation Triggers Open Source CLOUD FUNCTIONS Schedules Sequences