BIGGER DATA ON GPUS: APPROACHES, CHALLENGES, SUCCESSES
Jake Wheat
Arnon Shimoni
INTRODUCING SQREAM DB
GPU-ACCELERATED DATA WAREHOUSE
• Queries: 100x faster
• Cost: 10% of resources
• Analyze: 20x more data
ECOSYSTEM
FAST TO GET LOTS OF DATA IN
• Use GPU for loading
• 900 GB/s Memory Bandwidth
• Compress all the data
• Collect metadata
FAST TO GET LOTS OF DATA OUT
• Access with easy-to-use SQL
• Support standards like ODBC and JDBC
• 900 GB/s Memory Bandwidth for SQL operations
• Access raw data directly, without cubes, indexes
• SQream DB reads less data from disk, with compression
GPUS FOR DATA PROCESSING
ARE GPUS INTERESTING FOR RUNNING SQL?
• Can they run SQL?
• Can they run SQL faster?
– If the answer is a qualified yes, in what situations?
• Are there other issues to consider?
CAN GPUS RUN SQL?
| Example SQL | Physical operator | Implementation |
|---|---|---|
| select a+b, c * 5 from t | select (a.k.a. project/extend/rename) | thrust::transform |
| select a, count(*), sum(b), avg(b) from t group by a | stream aggregate | thrust::reduce_by_key |
| select a, b from t where a > 0.5 | filter | thrust::remove_if |
| select distinct a from t | stream distinct | thrust::unique |
| select a, b, c, d from t order by a,b | sort | thrust::sort |
| select * from t union all select * from u | union all | - |
| select * from t inner join u using (a) | sort merge join (smj) | simple implementation: thrust::upper_bound, thrust::lower_bound, unnest, gather |
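To make the stream aggregate row concrete, here is a minimal CPU sketch in plain Python. It is not SQream DB code and it does not call Thrust; `reduce_by_key` and `stream_aggregate` are illustrative names that mimic the behaviour of `thrust::reduce_by_key`, which reduces runs of consecutive equal keys and therefore expects input that is already sorted or grouped by key.

```python
from functools import reduce
from itertools import groupby
from operator import add, itemgetter


def reduce_by_key(keys, values, op):
    """Reduce each run of consecutive equal keys with `op`.

    Mirrors the idea behind thrust::reduce_by_key: the input must already
    be sorted (or at least grouped) by key. Hypothetical helper.
    """
    out_keys, out_values = [], []
    for key, run in groupby(zip(keys, values), key=itemgetter(0)):
        out_keys.append(key)
        out_values.append(reduce(op, (value for _, value in run)))
    return out_keys, out_values


def stream_aggregate(a, b):
    """select a, count(*), sum(b), avg(b) from t group by a"""
    # Sort step: bring equal keys next to each other.
    order = sorted(range(len(a)), key=a.__getitem__)
    sorted_a = [a[i] for i in order]
    sorted_b = [b[i] for i in order]
    # count(*) is a sum over a column of ones; sum(b) is a sum over b.
    keys, counts = reduce_by_key(sorted_a, [1] * len(sorted_a), add)
    _, sums = reduce_by_key(sorted_a, sorted_b, add)
    # avg(b) is derived from the two reductions.
    return [(k, c, s, s / c) for k, c, s in zip(keys, counts, sums)]
```

For example, `stream_aggregate([2, 1, 2, 1, 3], [10, 1, 20, 3, 5])` groups the rows into keys 1, 2 and 3. The point of the design is that every aggregate reduces to the same sorted, key-wise reduction.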
MARKETING HURDLES
• PCI-bottleneck means it will never work
• Columnar databases can't do joins
• GPUs can't accelerate SQL operations
• No-one will put a GPU in a server
• GPUs are not actually faster than CPUs
• A startup cannot make a production-ready SQL DBMS
OTHER ISSUES
• Can you make a convincing demo?
• Can you turn it into a real product?
• Can you put GPUs in a data centre?
• Are GPUs a safe bet in the medium/long term?
INSPIRATION
EARLY RESEARCH
• MonetDB/X100 talk: youtu.be/yrLd-3lnZ58
• Relational Joins on Graphics Processors: www.cse.ust.hk/catalac/papers/gpujoin_sigmod08.pdf
• Relational Query Co-Processing on Graphics Processors: dl.acm.org/citation.cfm?id=1620588
• Several Daniel Abadi papers: www.cs.umd.edu/~abadi/
THE EARLY SQREAM DB PROTOTYPES
• Original brief: OpenCL + Erlang + Haskell streaming IoT = World Domination!
• Generate Thrust code at query time
• SQL server plugin
• A real (but simple) DBMS with storage
OUR FIRST DBMS
• Run on data on disk
• Create and drop table
• Insert, insert select (and truncate)
• A wide range of queries:
e.g. select lists, joins, where, aggregates, order by, distinct
• Lots of external algorithms
WHY NOT POSTGRES?
Some downsides to Postgres
• Not columnar, in either engine or storage
• No threads, not distributed
• A big complex system
Some non-benefits:
• Parsing, syntax, and similar - Haskell makes this easy
• The storage and execution engine – very row based
Some things we miss:
• Wide range of features, data types, operations
• Extensibility
• Cost based optimiser
• Protocol/client compatibility
STEPS TOWARDS TODAY'S PRODUCT
[Diagram: system components]
• Haskell Compiler: Parse SQL, Desugar to Relational Algebra, Optimize, Desugar to Statement Plan
• Network Server
• Runtime
• Metadata Database
• Columnar Storage
• Tree Interpreter, Building Blocks, I/O, Task Runner
SQREAM DB ARCHITECTURE
[Architecture diagram]
• Connection & Session Manager
• Concurrency & Admission Control
• Statement Compiler: SQL Parser, Desugar & Optimize, Relational Algebra, Desugar & Optimize, Low-level stages
• Execution Engine: Statement Tree Interpreter; Task Runners (I/O, CPU, GPU) and their building blocks; Tasks; Queue & Thread Manager; Profiling Support
• Memory Managers: Small, Chunk, Spool
• Storage Layer: Metadata Database + low-level transactions (server or in-process); Bulk Data Layer (Extent, Extent, Extent, …); Storage Reorganizer
• Linux FS Cache Prodder
SOME ARCHITECTURE DETAILS
• Haskell has the intelligence
• C++/CUDA does the heavy lifting
• Message passing, worker pools
• Bulk data memory centric
• Storage is append-only with background reorganization
STORAGE AND TRANSACTIONS
• Metadata database with relatively conventional transactions
• Append-only storage layer with background reorganization
Transactions
• Serializable, with any kind of statement
• Run multiple queries concurrently with anything
• Run multiple inserts to the same table at the same time
• Cannot run multiple statements in a single transaction
• Other operations such as delete, truncate, and DDL use coarse-grained exclusive locking
USING GPUS EFFECTIVELY
• Good kernels
• Optimise around GPU memory
• Use large chunks, rechunk where necessary
• Avoid PCI transfers where possible
• Profiling
• Partitioning
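The slides do not say how rechunking is done, so the following is only a generic illustration of the "use large chunks, rechunk where necessary" bullet. The function name `rechunk` and the row-count target are assumptions. The sketch regroups a stream of uneven chunks into chunks of a fixed target size, which is the kind of step that keeps each device transfer and kernel launch working on a large batch.

```python
def rechunk(chunks, target_rows):
    """Regroup an iterable of row lists into lists of `target_rows` rows.

    Every yielded chunk has exactly `target_rows` rows except possibly
    the last one. Hypothetical helper, not part of any SQream DB API.
    """
    if target_rows <= 0:
        raise ValueError("target_rows must be positive")
    buffer = []
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit as many full chunks as the buffer currently holds.
        while len(buffer) >= target_rows:
            yield buffer[:target_rows]
            buffer = buffer[target_rows:]
    if buffer:
        # Flush the remainder as a final, shorter chunk.
        yield buffer
```

A filter with low selectivity is a typical producer of small chunks: `list(rechunk([[1, 2], [3], [4, 5, 6, 7], [8]], 3))` regroups four uneven chunks into three.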
VECTORED BINARY SEARCH
[Figure: vectored binary search matching the key values of Table A against Table B]
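The join recipe listed earlier (upper and lower bounds, unnest, gather) can be sketched on the CPU as follows. This is an illustration in plain Python under stated assumptions, not the SQream DB implementation: `bisect_left` and `bisect_right` stand in for the vectorised `thrust::lower_bound` and `thrust::upper_bound`, and `binary_search_join` is a hypothetical name.

```python
from bisect import bisect_left, bisect_right


def binary_search_join(a_keys, b_keys):
    """Inner join on equal keys.

    Returns (row in A, row in B) pairs, ordered by row in A.
    """
    # Sort B once, remembering each value's original row number.
    b_order = sorted(range(len(b_keys)), key=b_keys.__getitem__)
    b_sorted = [b_keys[i] for i in b_order]

    # Vectored binary search: one lower and one upper bound per key of A.
    # On a GPU each of these searches would run in its own thread.
    lower = [bisect_left(b_sorted, k) for k in a_keys]
    upper = [bisect_right(b_sorted, k) for k in a_keys]

    # Unnest: expand every [lower, upper) range into one row per match.
    a_rows, b_positions = [], []
    for a_row, (lo, hi) in enumerate(zip(lower, upper)):
        for pos in range(lo, hi):
            a_rows.append(a_row)
            b_positions.append(pos)

    # Gather: map positions in sorted B back to original B row numbers.
    b_rows = [b_order[pos] for pos in b_positions]
    return list(zip(a_rows, b_rows))
```

The width of each range, `upper[i] - lower[i]`, is the number of matches for row `i` of A, so a prefix sum over those widths gives the output offsets a parallel unnest would need.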
HASH JOINS
• Can hashing run fast on the GPU?
• Answer from NVIDIA experts:
– in principle probably yes
– in practice, difficult to compete with sort-based algorithms
COMPRESSION
• GPU compression for typical columnar data
– e.g. dictionary, RLE, delta, PFOR, and combinations of these
– Helps speed up IO and PCI transfer times
– In-house code
• CPU compression for general data
– Helps speed up IO, but not PCI transfer times
– We use things like Snappy and LZ4
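As a reminder of what the lightweight columnar schemes named above do, here is a small plain-Python sketch of dictionary, run-length (RLE) and delta encoding. The function names are illustrative. SQream DB's in-house GPU code is not public in these slides, and this sketch makes no claim about its formats.

```python
def dictionary_encode(values):
    """Replace each value with its index in a sorted list of distinct values."""
    dictionary = sorted(set(values))
    code = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [code[value] for value in values]


def rle_encode(values):
    """Collapse runs of equal values into (value, run_length) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, length) for value, length in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original list."""
    out = []
    for value, length in runs:
        out.extend([value] * length)
    return out


def delta_encode(values):
    """Keep the first value, then store differences between neighbours."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    out, total = [], 0
    for delta in deltas:
        total += delta
        out.append(total)
    return out
```

Combinations matter because one scheme often creates the pattern another one exploits: a regularly spaced column such as `[10, 20, 30, 40]` becomes `[10, 10, 10, 10]` after delta encoding, which RLE then stores as a single run.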
SOME FINAL THOUGHTS
• SQL analytics and GPUs are a natural fit
• GPUs can be very effective for big data/external algorithms
• Lots of exciting work being done in non-SQL analytics (not just on GPUs)
• Haskell is a big positive
• Building a commercial SQL DBMS is very difficult
• Building a SQL DBMS is a really satisfying thing to do
HIGH THROUGHPUT, CONVERGED
• SQream DB is designed for high-throughput devices
• IBM Power Systems is the only NVLink CPU-to-GPU enabled architecture,
unlocking the potential of high-throughput accelerated computing
• The IBM AC922, with POWER9 and NVLink, can transfer data at up to 300 GB/s, almost 9.5x faster than the PCIe 3.0 found in x86-based architectures, reducing classic I/O bottlenecks
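The 9.5x figure can be checked with a little arithmetic. The sketch below assumes the comparison is against a x16 PCIe 3.0 link counted in both directions; the slide does not state this. PCIe 3.0 runs at 8 GT/s per lane with 128b/130b encoding, and the 300 GB/s NVLink figure is taken from the slide.

```python
def pcie3_bandwidth_gb_s(lanes=16):
    """Theoretical PCIe 3.0 bandwidth in GB/s for one direction."""
    transfers_per_second = 8e9   # 8 GT/s per lane
    encoding_efficiency = 128 / 130  # 128b/130b line encoding
    bits_per_second = transfers_per_second * encoding_efficiency * lanes
    return bits_per_second / 8 / 1e9


# Figure quoted on the slide (bidirectional).
nvlink_bidirectional_gb_s = 300.0

# Assumption: compare against both directions of a x16 PCIe 3.0 link.
pcie3_bidirectional_gb_s = 2 * pcie3_bandwidth_gb_s(lanes=16)

ratio = nvlink_bidirectional_gb_s / pcie3_bidirectional_gb_s
print(f"PCIe 3.0 x16: {pcie3_bidirectional_gb_s:.1f} GB/s bidirectional")
print(f"NVLink / PCIe 3.0: {ratio:.1f}x")
```

Under that assumption the x16 link comes to roughly 31.5 GB/s in both directions combined, and the ratio lands at about 9.5.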
[Diagram: two IBM Power 9 CPUs, each paired with 2x NVIDIA Tesla V100]
HIGH THROUGHPUT ARCHITECTURE
IT’S NOT JUST CORES
[Diagram: two Power9 CPUs linked by the IBM SMP bus, each with its own RAM and two Tesla V100 GPUs with VRAM]
• RAM to CPU: 170 GB/s per CPU
• CPU to GPU: NVLink, 300 GB/s bidirectional
• GPU to VRAM: 900 GB/s
UP TO 3.7X FASTER QUERIES
[Chart: SQream DB performance, IBM Power9 vs Intel Xeon (Skylake). Query time in seconds; lower is better. Series assigned in legend order.]

| Query | Dell PowerEdge R740 | IBM Power9 AC922 |
|---|---|---|
| TPC-H Query 8 | 52.83 | 14.06 |
| TPC-H Query 6 | 10.35 | 2.8 |
| TPC-H Query 19 | 84.5 | 30.29 |
| TPC-H Query 17 | 78.57 | 29.01 |

IBM Power9 AC922:
2x POWER9 16C @ 3.8GHz | 256 GB DDR4 2666 MHz | SSD storage | 4x NVIDIA Tesla V100 (SXM2 NVLINK - 16GB)
Dell PowerEdge R740:
2x Intel Xeon Silver 4112 CPU @ 2.60GHz | 256GB DDR4 2666MHz | SSD storage | 4x NVIDIA Tesla V100 (PCIe - 16GB)
• In our testing, SQream DB on Power9 is between 150% and 370% faster than comparable x86 architectures, especially on large data sets. For example, on the TPC-H (SF 10,000) dataset, Query 8 ran in a quarter of the time on the IBM Power9 compared to the x86 competitor.
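The per-query speedups behind the headline can be recomputed from the bar values on this slide. The sketch assumes the first series of numbers belongs to the Dell PowerEdge R740 and the second to the IBM Power9 AC922, following the legend order; the slide text does not state this mapping explicitly.

```python
# Query times in seconds as (x86, power9).
# Series assignment is an assumption based on the chart legend order.
query_times = {
    "TPC-H Query 8": (52.83, 14.06),
    "TPC-H Query 6": (10.35, 2.8),
    "TPC-H Query 19": (84.5, 30.29),
    "TPC-H Query 17": (78.57, 29.01),
}


def speedups(times):
    """Return x86 time divided by Power9 time for every query."""
    return {query: x86 / power9 for query, (x86, power9) in times.items()}


for query, factor in speedups(query_times).items():
    print(f"{query}: {factor:.2f}x faster on Power9")
```

With this mapping the speedups range from about 2.7x (Query 17) to just under 3.8x (Query 8), and the Query 8 time on Power9 is about 27% of the x86 time, in line with the "quarter of the time" statement.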
UNDERSTAND 40 MILLION CUSTOMERS
TELECOM
[Comparison graphic: Ingest time, Reporting time, Ownership Cost]
• HP DL380g9 with NVIDIA Tesla GPU, 96 GB RAM + 6 TB storage: $200K
• Greenplum, 80 nodes, 5 full racks, 7600 CPU cores: $10,000,000
• Values shown: 20M, 10M, 300M, 120M
ARCHITECTURE BEFORE SQREAM DB
[Pipeline diagram]
• Sources: 3G, 4G, CDRs, Others
• ETL: 1-2 hours
• GP: Daily aggr., …, Profiles
• GP: Daily report, 3 hours (max)
• Daily reports: #1 #2 #3 #4 ••• #31
• Monthly reports (7 days): Monthly #1, Monthly #2, Monthly #N
• Timings shown: 5hr, 3hr, 0.5hr
• Billing
• Pre-aggregations
SIMPLIFIED WITH SQREAM DB
[Pipeline diagram]
• Sources: 3G, 4G, CDRs, Others
• Daily reports: #1 #2 #3 #4 ••• #31
• Monthly reports (1 day): Monthly #1, Monthly #2, Monthly #N
• Timings shown: 10m, 4m, 2m

SQream DB - Bigger Data On GPUs: Approaches, Challenges, Successes

  • 1. BIGGER DATA ON GPUS: APPROACHES, CHALLENGES, SUCCESSES • Jake Wheat • Arnon Shimoni
  • 2. INTRODUCING SQREAM DB: GPU-ACCELERATED DATA WAREHOUSE • Queries: 100x faster • Cost: 10% of resources • Analyze: 20x more data
  • 4. FAST TO GET LOTS OF DATA IN • Use GPU for loading • 900 GB/s Memory Bandwidth • Compress all the data • Collect metadata
  • 5. FAST TO GET LOTS OF DATA OUT • Access with easy-to-use SQL • Support standards like ODBC and JDBC • 900 GB/s Memory Bandwidth for SQL operations • Access raw data directly, without cubes, indexes • SQream DB reads less data from disk, with compression
  • 7. ARE GPUS INTERESTING FOR RUNNING SQL? • Can they run SQL? • Can they run SQL faster? – If a qualified yes, in what situations? • Are there other issues to consider?
  • 8. CAN GPUS RUN SQL?
      Example SQL | Physical operator | Implementation
      select a+b, c * 5 from t | select (a.k.a. project/extend/rename) | thrust::transform
      select a, count(*), sum(b), avg(b) from t group by a | stream aggregate | thrust::reduce_by_key
      select a, b from t where a > 0.5 | filter | thrust::remove_if
      select distinct a from t | stream distinct | thrust::unique
      select a, b, c, d from t order by a,b | sort | thrust::sort
      select * from t union all select * from u | union all | -
      select * from t inner join u using (a) | sort merge join (smj) | simple implementation: thrust::upper_bound, thrust::lower_bound, unnest, gather
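
The last row of the table above (sort merge join built from bounds searches, unnest and gather) can be sketched on the CPU. This is a minimal Python illustration, not SQream DB's implementation: `bisect_left` and `bisect_right` stand in for the vectorised lower-bound and upper-bound searches, and the function names `sort_merge_join` and `gather` are made up for the example.

```python
from bisect import bisect_left, bisect_right


def gather(column, indices):
    """Gather: pick column[i] for every i in indices."""
    return [column[i] for i in indices]


def sort_merge_join(t_keys, u_keys):
    """Inner join on one key column; returns (t_row, u_row) index pairs."""
    # Sort phase: compute the row order that sorts each side by key.
    t_order = sorted(range(len(t_keys)), key=t_keys.__getitem__)
    u_order = sorted(range(len(u_keys)), key=u_keys.__getitem__)
    u_sorted = gather(u_keys, u_order)

    # Bounds phase: for each t key, find the half-open range [lo, hi)
    # of equal keys in the sorted u column.
    lo = [bisect_left(u_sorted, t_keys[i]) for i in t_order]
    hi = [bisect_right(u_sorted, t_keys[i]) for i in t_order]

    # Unnest phase: expand each range into one output row per match.
    pairs = []
    for t_row, start, stop in zip(t_order, lo, hi):
        for pos in range(start, stop):
            pairs.append((t_row, u_order[pos]))
    return pairs


if __name__ == "__main__":
    t_a = [3, 1, 2, 1]
    u_a = [1, 1, 4, 3]
    pairs = sort_merge_join(t_a, u_a)
    # Gather the joined key column from t using the matched row indices.
    print(gather(t_a, [t_row for t_row, _ in pairs]))
```

On a GPU each phase would be a bulk data-parallel primitive over whole columns; the loops here only show what each phase computes.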
  • 9. MARKETING HURDLES • PCI-bottleneck means it will never work • Columnar databases can't do joins • GPUs can't accelerate SQL operations • No-one will put a GPU in a server • GPUs are not actually faster than CPUs • A startup cannot make a production ready SQL DBMS
  • 10. OTHER ISSUES • Can you make a convincing demo? • Can you turn it into a real product? • Can you put GPUs in a data centre? • Are GPUs a safe bet in the medium/long term?
  • 12. EARLY RESEARCH • MonetDB/X100 talk youtu.be/yrLd-3lnZ58 • Relational Joins on Graphics Processors www.cse.ust.hk/catalac/papers/gpujoin_sigmod08.pdf • Relational Query Co-Processing on Graphics Processors dl.acm.org/citation.cfm?id=1620588 • Several Daniel Abadi papers www.cs.umd.edu/~abadi/
  • 13. THE EARLY SQREAM DB PROTOTYPES • Original brief: OpenCL + Erlang + Haskell streaming IoT = World Domination! • Generate thrust at query time • SQL server plugin • A real (but simple) DBMS with storage
  • 14. OUR FIRST DBMS • Run on data on disk • Create and drop table • Insert, insert select (and truncate) • A wide range of queries: e.g. select lists, joins, where, aggregates, order by, distinct • Lots of external algorithms
  • 15. WHY NOT POSTGRES?
      Some downsides to Postgres: • No columnar engine or storage • No threads, not distributed • A big, complex system
      Some non-benefits: • Parsing, syntax, and similar (Haskell makes this easy) • The storage and execution engine (very row based)
      Some things we miss: • Wide range of features, data types, operations • Extensibility • Cost-based optimiser • Protocol/client compatibility
  • 16. STEPS TOWARDS TODAY'S PRODUCT • [Diagram labels] Haskell, Compiler, Parse SQL, Desugar to Relational Algebra, Optimize, Desugar to Statement Plan, Network Server, Runtime, Metadata Database, Columnar Storage, Tree Interpreter, Building Blocks, I/O, Task Runner
  • 17. SQREAM DB ARCHITECTURE • [Diagram labels] Statement Compiler: SQL Parser, Desugar & Optimize, Relational Algebra, Desugar & Optimize, Low-level stages • Execution Engine: Statement Tree Interpreter, Task Runners (I/O, CPU, GPU) • Storage Layer: Metadata Database + Low-level transactions (server or in-process), Bulk Data Layer (Extent, Extent, Extent, …), Storage Reorganizer • Other components: Tasks Queue & Thread Manager, Profiling Support, Memory Managers (Small, Chunk, Spool), Building blocks, Connection & Session Manager, Concurrency & Admission Control, Linux FS Cache Prodder
  • 18. SOME ARCHITECTURE DETAILS • Haskell has the intelligence • C++/CUDA does the heavy lifting • Message passing, worker pools • Bulk data memory centric • Storage is append-only with background reorganization
  • 19. STORAGE AND TRANSACTIONS • Metadata database with relatively conventional transactions • Append-only storage layer with background reorganization • Transactions: – Serializable, with any kind of statement – Run multiple queries concurrently with anything – Run multiple inserts to the same table at the same time – Cannot run multiple statements in a single transaction – Other operations such as delete, truncate, and DDL use coarse-grained exclusive locking
  • 20. USING GPUS EFFECTIVELY • Good kernels • Optimise around GPU memory • Use large chunks, rechunk where necessary • Avoid PCI transfers where possible • Profiling • Partitioning
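
The "use large chunks, rechunk where necessary" bullet can be illustrated with a small sketch. It assumes a chunk is just a list of rows and that the goal is to regroup a stream of uneven chunks into chunks of a fixed target size; the `rechunk` function and its `target_rows` parameter are hypothetical names, not SQream DB APIs.

```python
def rechunk(chunks, target_rows):
    """Regroup a stream of uneven chunks into chunks of target_rows rows.

    Every yielded chunk has exactly target_rows rows, except possibly the
    last one, which carries whatever rows are left over.
    """
    if target_rows <= 0:
        raise ValueError("target_rows must be positive")
    buffer = []
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit as many full chunks as the buffer currently holds.
        while len(buffer) >= target_rows:
            yield buffer[:target_rows]
            buffer = buffer[target_rows:]
    if buffer:
        yield buffer


if __name__ == "__main__":
    # A selective filter often leaves many small chunks behind.
    small_chunks = [[1, 2], [3], [4, 5, 6, 7], [8]]
    for chunk in rechunk(small_chunks, 3):
        print(chunk)
```

Larger, evenly sized chunks would mean fewer kernel launches and fewer transfers per row, which is presumably why rechunking matters after operators that shrink their input.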
  • 22. HASH JOINS • Can hashing run fast on the GPU? • Answer from NVIDIA experts: – in principle probably yes – in practice, difficult to compete with sort-based algorithms
  • 23. COMPRESSION • GPU compression for typical columnar data – e.g. dictionary, RLE, delta, PFOR, and combinations – Helps speed up I/O and PCI transfer times – In-house code • CPU compression for general data – Helps speed up I/O, but not PCI transfer times – We use things like Snappy and LZ4
  • 24. SOME FINAL THOUGHTS • SQL analytics and GPUs are a natural fit • GPUs can be very effective for big data/external algorithms • Lots of exciting work being done in non-SQL analytics (not just on GPUs) • Haskell is a big positive • Building a commercial SQL DBMS is very difficult • Building a SQL DBMS is a really satisfying thing to do
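
Three of the columnar schemes named on the compression slide (dictionary, RLE, delta) and one combination can be sketched as follows. These are textbook CPU versions written for illustration; the slide describes SQream DB's GPU compression as in-house code, so nothing here reflects its actual formats, and all function names are assumptions.

```python
def rle_encode(values):
    """Run-length encoding: collapse repeats into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def rle_decode(runs):
    return [v for v, n in runs for _ in range(n)]


def delta_encode(values):
    """Delta encoding: keep the first value, then successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def dictionary_encode(values):
    """Dictionary encoding: replace each value by its index in a sorted dictionary."""
    dictionary = sorted(set(values))
    code = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [code[v] for v in values]


if __name__ == "__main__":
    # Combination: regularly spaced timestamps become one long run after delta.
    timestamps = [100, 105, 110, 115, 120]
    packed = rle_encode(delta_encode(timestamps))
    print(packed)
    print(delta_decode(rle_decode(packed)) == timestamps)
```

The combination shows why stacking schemes pays off on sorted or regularly spaced columns: delta turns the column into mostly identical values, which RLE then collapses.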
  • 25. HIGH THROUGHPUT, CONVERGED • SQream DB is designed for high-throughput devices • IBM Power Systems is the only NVLink CPU-to-GPU enabled architecture, unlocking the potential of high-throughput accelerated computing • The IBM AC922, with POWER9 and NVLink, can transfer data at up to 300GB/s, almost 9.5x faster than the PCIe 3.0 found in x86-based architectures, reducing classic I/O bottlenecks • [Diagram: two IBM Power 9 CPUs, each paired with 2x NVIDIA Tesla V100]
  • 26. HIGH THROUGHPUT ARCHITECTURE: IT’S NOT JUST CORES • [Diagram: two Power9 CPUs joined by the IBM SMP bus, each with its own RAM and two Tesla V100 GPUs, each GPU with its own VRAM] • Bandwidth labels: 170GB/s per CPU • NVLink: 300GB/s BiDi • 900GB/s
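
The "almost 9.5x" figure on the slide above can be sanity-checked with a little arithmetic. The sketch assumes the comparison is against a PCIe 3.0 x16 link counted in both directions (8 GT/s per lane, 128b/130b encoding); the slide does not say which PCIe configuration it used, so treat the constants as assumptions.

```python
# Assumption: PCIe 3.0 x16, both directions counted together.
PCIE3_GT_PER_S_PER_LANE = 8.0      # giga-transfers per second, per lane
PCIE3_ENCODING = 128 / 130         # 128b/130b line-code efficiency
LANES = 16
DIRECTIONS = 2

# Bits to bytes: divide by 8.
pcie3_gb_per_s = (
    PCIE3_GT_PER_S_PER_LANE * PCIE3_ENCODING / 8 * LANES * DIRECTIONS
)

NVLINK_GB_PER_S = 300.0            # figure quoted on the slide

ratio = NVLINK_GB_PER_S / pcie3_gb_per_s


def transfer_seconds(gigabytes, gb_per_s):
    """Ideal time to move a payload at a given sustained bandwidth."""
    return gigabytes / gb_per_s


if __name__ == "__main__":
    print(f"PCIe 3.0 x16 (bidirectional): {pcie3_gb_per_s:.1f} GB/s")
    print(f"NVLink / PCIe ratio: {ratio:.2f}x")
    print(f"100 GB over PCIe:   {transfer_seconds(100, pcie3_gb_per_s):.2f} s")
    print(f"100 GB over NVLink: {transfer_seconds(100, NVLINK_GB_PER_S):.2f} s")
```

These are peak link rates; sustained throughput on real hardware is lower on both sides.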
  • 27. UP TO 3.7X FASTER QUERIES
      [Bar chart] SQream DB performance, IBM Power9 vs Intel Xeon (Skylake). Query time in seconds, lower is better.
      Query | Dell PowerEdge R740 | IBM Power9 AC922
      TPC-H Query 8 | 52.83 | 14.06
      TPC-H Query 6 | 10.35 | 2.8
      TPC-H Query 19 | 84.5 | 30.29
      TPC-H Query 17 | 78.57 | 29.01
      IBM Power9 AC922: 2x POWER9 16C @ 3.8GHz | 256 GB DDR4 2666 MHz | SSD storage | 4x NVIDIA Tesla V100 (SXM2 NVLink - 16GB)
      Dell PowerEdge R740: 2x Intel Xeon Silver 4112 CPU @ 2.60GHz | 256GB DDR4 2666MHz | SSD storage | 4x NVIDIA Tesla V100 (PCIe - 16GB)
      • In our testing, SQream DB on Power9 is between 150% and 370% faster than comparable x86 architectures, especially on large data sets. For example, on the TPC-H (SF 10,000) dataset, Query 8 ran in about a quarter of the time on the IBM Power9 compared to the x86 competitor.
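
The per-query speedups behind the chart on the slide above can be recomputed from its data labels. The sketch assumes the first series of values (52.83, 10.35, 84.5, 78.57) belongs to the Dell PowerEdge R740 and the second (14.06, 2.8, 30.29, 29.01) to the IBM Power9 AC922, which matches the legend order and the slide's remark about Query 8.

```python
# (x86 seconds, Power9 seconds) per query, as read off the chart labels.
timings = {
    "TPC-H Query 8": (52.83, 14.06),
    "TPC-H Query 6": (10.35, 2.8),
    "TPC-H Query 19": (84.5, 30.29),
    "TPC-H Query 17": (78.57, 29.01),
}


def speedup(x86_seconds, power9_seconds):
    """How many times faster the Power9 run is than the x86 run."""
    return x86_seconds / power9_seconds


speedups = {query: speedup(*pair) for query, pair in timings.items()}

if __name__ == "__main__":
    for query, factor in speedups.items():
        x86, p9 = timings[query]
        print(f"{query}: {factor:.2f}x faster, {p9 / x86:.0%} of the x86 time")
```

Under that reading, Query 8 on Power9 takes roughly 27% of the x86 time, in line with the "quarter of the time" statement.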
  • 28. UNDERSTAND 40 MILLION CUSTOMERS • TELECOM • HP DL380g9 with NVIDIA Tesla GPU, 96 GB RAM + 6 TB storage: $200K • 80 nodes, 5 full racks, 7600 CPU cores: $10,000,000 • [Chart labels] Ingest time, Reporting time, Ownership Cost; values 20M, 10M, 300M, 120M
  • 29. ARCHITECTURE BEFORE SQREAM DB • [Diagram labels] Greenplum • Sources: 3G, 4G, CDRs, Others • ETL: 1-2 hours • GP daily aggr. … Profiles • GP daily report: 3 hours (max) • Daily reports: #1, #2, #3, #4 ... #31 • Monthly reports (7 days): Monthly #1, Monthly #2 ... Monthly #N • Billing • Pre-aggregations • Timings shown: 5hr, 3hr, 0.5hr
  • 30. SIMPLIFIED WITH SQREAM DB • [Diagram labels] Sources: 3G, 4G, CDRs, Others • Daily reports: #1, #2, #3, #4 ... #31 • Monthly reports: Monthly #1, Monthly #2 ... Monthly #N • Timings shown: 1 day, 10m, 4m, 2m