© Copyright 2016 Pivotal. All rights reserved.
GPORCA OSS 101
Xin Zhang (xzhang@pivotal.io)
GPDB Meetup
April 2017
1 WHO, WHERE, WHEN
Data Team at Palo Alto
GPORCA was open-sourced in January 2016
At http://github.com/greenplum-db/gporca
BLOG: http://engineering.pivotal.io/post/gporca-open-source/
2 WHAT, WHY, HOW
WHAT is the GPORCA architecture?
WHY use GPORCA (pros and cons)?
HOW to contribute to GPORCA?
3 At the End of the Talk, You CAN
Add a feature
Unit test it
Contribute back to the community
GPORCA = Greenplum ORCA = ...
Jul 2015, PQO: Pivotal Query Optimizer (Released in GPDB 4.3.5.0)
Jun 2014, ORCA: SIGMOD 2014 Paper
In source code
GPOPT: src/backend/gpopt, select gp_opt_version();
Outline
Introduction
GPORCA Architecture
Develop
Build/Test/Debug
How to contribute
Query Plan Example 1/2
SELECT s.beer, s.price
FROM Bars b, Sells s
WHERE b.name = s.bar
AND b.city = 'San Francisco'
• Bars is distributed randomly
• Sells is distributed by bar
Query Plan Example 2/2
[Figure: plan for the same query, with Bars distributed randomly and Sells distributed by bar, divided into Slice 1, Slice 2, and Slice 3]
Query Execution Example
[Figure: Slice 1 runs on the Master; Slice 2 and Slice 3 run on each of Segment 1 … Segment N]
GPORCA can optimize all 111 TPC-DS queries, according to the SIGMOD 2014 paper
Great SQL Surface of GPORCA
~60 Logical Operators
~50 Physical Operators
~40 Scalar Operators
~110 Transformation Rules
Key Features vs. Planner
• Smarter partition elimination
• Multi-level partitioning support (4.3.6)
• Subquery unnesting
• Common table expressions (CTE)
• Join on castable data types
• Improved join ordering (better join order)
• Join-Aggregate reordering (push/pull AGG over join)
• Sort order optimization (avoid sort multiple times)
• Skew awareness (redistribution vs. broadcast)
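DXL API and Multi-Host Support
[Figure: data flow between the Host System and GPORCA]
• Host System: Parser turns SQL into a query; Q2DXL translates it into a DXL Query for GPORCA
• GPORCA sends MD requests to the MD Provider, which reads the Catalog and returns DXL MD
• GPORCA returns a DXL Plan; DXL2Plan translates it into a plan for the Executor, which produces the Results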
GPORCA Architecture
KERNEL
Optimization Steps in Orca
• [Step 0] Pre-Process: e.g., predicate pushdown
• [Step 1] Exploration: All equivalent logical plans
• [Step 2] Statistics Derivation: histograms
• [Step 3] Implementation: Logical to Physical
• [Step 4] MPP Optimization: Enforcing and Costing
Step 1 Exploration 1/2: Memoization
• Compact in-memory data structure capturing plan space
– Group: Container of equivalent expressions
– Group Expression: Operator that has other groups as its children
Logical Expression
Inner Join (T1.a = T2.b)
├── Get(T1)
└── Get(T2)
Memo
GROUP 0
0: Inner Join [1,2]
GROUP 1
0: Get (T1) [ ]
GROUP 2
0: Get (T2) [ ]
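A minimal sketch of the Memo idea, assuming a toy representation: GroupExpr, Group, Memo, WellFormed, and BuildSlideMemo are hypothetical names for illustration, not GPORCA classes. The point it shows is that a group expression stores child group ids rather than child expressions, which is what keeps the plan space compact.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of the Memo; these types are hypothetical, not GPORCA classes.
struct GroupExpr {
    std::string op;                 // operator, e.g. "Inner Join" or "Get(T1)"
    std::vector<std::size_t> kids;  // ids of child groups, not child expressions
};

struct Group {
    std::vector<GroupExpr> exprs;   // logically equivalent alternatives
};

struct Memo {
    std::vector<Group> groups;
};

// True when every child id refers to an existing group.
inline bool WellFormed(const Memo &m) {
    for (const Group &g : m.groups)
        for (const GroupExpr &e : g.exprs)
            for (std::size_t kid : e.kids)
                if (kid >= m.groups.size()) return false;
    return true;
}

// Copies Inner Join(Get(T1), Get(T2)) into a Memo with the slide's numbering:
// group 0 holds the join, groups 1 and 2 hold the two Gets.
inline Memo BuildSlideMemo() {
    Memo m;
    m.groups.resize(3);
    m.groups[1].exprs.push_back({"Get(T1)", {}});
    m.groups[2].exprs.push_back({"Get(T2)", {}});
    m.groups[0].exprs.push_back({"Inner Join", {1, 2}});
    assert(WellFormed(m));
    return m;
}
```

Calling BuildSlideMemo() yields three groups, with the join in group 0 pointing at groups 1 and 2, matching the [1,2] notation on the slide.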
Step 1 Exploration 2/2: Transformation
Before
Logical Expression
Inner Join (T1.a = T2.b)
├── Get(T1)
└── Get(T2)
Memo
GROUP 0
0: Inner Join [1,2]
GROUP 1
0: Get (T1) [ ]
GROUP 2
0: Get (T2) [ ]
After
Logical Expression
Inner Join (T1.a = T2.b)
├── Get(T2)
└── Get(T1)
Memo
GROUP 0
0: Inner Join [1,2]
1: Inner Join [2,1]
GROUP 1
0: Get (T1) [ ]
GROUP 2
0: Get (T2) [ ]
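The child swap on this slide can be sketched as a rule that adds a new group expression to the same group. This is a simplified illustration under assumed types (GroupExpr, Group, and ApplyJoinCommutativity are hypothetical names, not GPORCA's xform API); the duplicate check stands in for the Memo's duplicate detection.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy types; a Group is just a list of equivalent expressions.
struct GroupExpr {
    std::string op;
    std::vector<std::size_t> kids;  // child group ids
    bool operator==(const GroupExpr &o) const { return op == o.op && kids == o.kids; }
};
using Group = std::vector<GroupExpr>;

// Swap the children of every binary "Inner Join" in the group and add the
// result as a new alternative. Returns the number of expressions added.
// The duplicate check keeps [1,2] from being regenerated out of [2,1].
inline std::size_t ApplyJoinCommutativity(Group &g) {
    std::size_t added = 0;
    const std::size_t n = g.size();  // only visit expressions present on entry
    for (std::size_t i = 0; i < n; ++i) {
        if (g[i].op != "Inner Join") continue;
        assert(g[i].kids.size() == 2);
        GroupExpr swapped{g[i].op, {g[i].kids[1], g[i].kids[0]}};
        if (std::find(g.begin(), g.end(), swapped) == g.end()) {
            g.push_back(swapped);
            ++added;
        }
    }
    return added;
}
```

Applied to a group holding only Inner Join [1,2], the first call adds Inner Join [2,1]; a second call adds nothing because both orders are already present.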
Step 2 Statistics Derivation 1/2
[Figure: Memo inside Orca, next to the Back-end System]
GROUP 0: Reqd Stats = { }
0: Inner Join(a=b) [1,2]
1: Inner Join(a=b) [2,1]
GROUP 1: Reqd Stats = {a}
0: Get(T1) [ ]
GROUP 2: Reqd Stats = {b}
0: Get(T2) [ ]
Step 2 Statistics Derivation 2/2
[Figure: same Memo inside Orca; the requests for {a} and {b} each go through an MD Accessor to the MDCache, which is filled by the MD Provider from the Catalog of the Back-end System]
GROUP 0: Reqd Stats = { }
0: Inner Join(a=b) [1,2]
1: Inner Join(a=b) [2,1]
GROUP 1: Reqd Stats = {a}, via MD Accessor
0: Get(T1) [ ]
GROUP 2: Reqd Stats = {b}, via MD Accessor
0: Get(T2) [ ]
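The accessor, cache, and provider chain can be sketched as a small read-through cache. This is an illustration only: the MDCache class below, the ColStats alias, and the provider callback are assumptions made for the example and do not reproduce GPORCA's metadata classes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a column statistic (for example a distinct count).
using ColStats = long;

class MDCache {
public:
    // `provider` plays the MD Provider role: it fetches from the back-end catalog.
    explicit MDCache(std::function<ColStats(const std::string &)> provider)
        : provider_(std::move(provider)) {
        assert(provider_ != nullptr);
    }

    // Accessor path: serve from the cache, call the provider only on a miss.
    ColStats Get(const std::string &column) {
        auto it = cache_.find(column);
        if (it == cache_.end()) {
            ++provider_calls_;
            it = cache_.emplace(column, provider_(column)).first;
        }
        return it->second;
    }

    std::size_t provider_calls() const { return provider_calls_; }

private:
    std::function<ColStats(const std::string &)> provider_;
    std::map<std::string, ColStats> cache_;
    std::size_t provider_calls_ = 0;
};
```

Requesting statistics for a, then b, then a again reaches the provider twice; the repeated request for a is answered from the cache.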
Step 3 Implementation
Logical Expression
Inner Join (T1.a=T2.b) [1,2]
├── GROUP 1
└── GROUP 2
Physical Expression
Inner Hash Join (T1.a=T2.b) [1,2]
├── GROUP 1
└── GROUP 2
Step 4 MPP Optimization 1/3 Requirement
Optimization Request (Distribution, Order)
Reqd Props: {Singleton, <T1.a>}
Inner Hash Join (T1.a=T2.b) [1,2]
├── GROUP 1
└── GROUP 2
Drvd Props: {Hashed(T1.a), < >}
Step 4 MPP Optimization 2/3 Costing
Optimization Request (Distribution, Order)
Reqd Props: {Singleton, <T1.a>}
Inner Hash Join (T1.a=T2.b) [1,2]
├── Scan(T1), child request {Hashed(T1.a), Any}
└── Redistribute(T2.b), child request {Hashed(T2.b), Any}
    └── Scan(T2)
Drvd Props: {Hashed(T1.a), < >}
Step 4 MPP Optimization 3/3 Enforcement
Optimization Request (Distribution, Order)
Reqd Props: {Singleton, <T1.a>}
Inner Hash Join (T1.a=T2.b) [1,2]
├── Scan(T1), child request {Hashed(T1.a), Any}
└── Redistribute(T2.b), child request {Hashed(T2.b), Any}
    └── Scan(T2)
Drvd Props: {Hashed(T1.a), < >}
Property Enforcing, alternative 1
GatherMerge(T1.a)
└── Sort(T1.a)
    └── Inner Hash Join
        ├── Scan(T1)
        └── Redistribute(T2.b)
            └── Scan(T2)
Property Enforcing, alternative 2
Sort(T1.a)
└── Gather
    └── Inner Hash Join
        ├── Scan(T1)
        └── Redistribute(T2.b)
            └── Scan(T2)
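The enforcement step can be sketched as comparing derived properties against required ones and listing the enforcer chains that close the gap. The types and the Enforcers function below are hypothetical and cover only requests for a Singleton distribution, as on the slide; GPORCA's own property classes are richer than this.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Dist { Hashed, Singleton };

struct Props {
    Dist dist;
    std::string order;  // sort column; empty means no order
};

using Chain = std::vector<std::string>;  // enforcers listed bottom-up

// Alternatives for turning `drvd` into `reqd`. This sketch only covers
// requests for a Singleton distribution.
inline std::vector<Chain> Enforcers(const Props &drvd, const Props &reqd) {
    assert(reqd.dist == Dist::Singleton);
    const bool need_gather = drvd.dist != Dist::Singleton;
    const bool need_sort = !reqd.order.empty() && drvd.order != reqd.order;
    const std::string sort = "Sort(" + reqd.order + ")";
    const std::string merge = "GatherMerge(" + reqd.order + ")";

    std::vector<Chain> alts;
    if (need_gather && need_sort) {
        alts.push_back({sort, merge});     // sort on segments, then merge
        alts.push_back({"Gather", sort});  // gather first, then sort
    } else if (need_gather) {
        // Input already has the required order, or no order is required.
        alts.push_back({reqd.order.empty() ? std::string("Gather") : merge});
    } else if (need_sort) {
        alts.push_back({sort});
    } else {
        alts.emplace_back();               // nothing to enforce
    }
    return alts;
}
```

For derived {Hashed, no order} and required {Singleton, T1.a}, this returns the two chains drawn on the slide; costing then picks between them.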
Walkthrough
Split an aggregate into a pair of local and global aggregates.
Schema: CREATE TABLE foo (a int, b int, c int) distributed by (a);
Query: SELECT sum(c) FROM foo GROUP BY b
Do local aggregation on the segments
Then do global aggregation on the master
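Why the split is safe for sum(c) can be shown with a small two-stage aggregation. This is a sketch of the idea, not executor code: Row, Partial, LocalAgg, GlobalAgg, and SplitAgg are hypothetical names, and each inner vector stands for the rows stored on one segment. Because foo is distributed by a, rows with the same b can sit on different segments, so the partial sums must be merged.

```cpp
#include <cassert>
#include <map>
#include <vector>

struct Row { int a, b, c; };
using Partial = std::map<int, long>;  // b -> sum(c)

// Local stage: each segment aggregates its own rows.
inline Partial LocalAgg(const std::vector<Row> &segment_rows) {
    Partial p;
    for (const Row &r : segment_rows) p[r.b] += r.c;
    return p;
}

// Global stage: merge the partial sums coming from all segments.
inline Partial GlobalAgg(const std::vector<Partial> &partials) {
    Partial out;
    for (const Partial &p : partials)
        for (const auto &kv : p) out[kv.first] += kv.second;
    return out;
}

// Runs both stages and checks the result against a single-stage aggregate.
inline Partial SplitAgg(const std::vector<std::vector<Row>> &segments) {
    std::vector<Partial> partials;
    std::vector<Row> all;
    for (const auto &seg : segments) {
        partials.push_back(LocalAgg(seg));
        all.insert(all.end(), seg.begin(), seg.end());
    }
    Partial result = GlobalAgg(partials);
    assert(result == LocalAgg(all));  // splitting must not change the answer
    return result;
}
```

The global stage sees at most one row per group per segment, which is where the saving comes from when groups are large.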
Split Groupby Aggregate
Before
GpAgg (b)
├── Get(foo)
└── Sum(c)
After
GpAgg (b)
├── GpAgg (b)
│   ├── Get(foo)
│   └── Sum(c)
└── Sum(c)
CXformSplitGbAgg
// HEADER FILES
~/orca/libgpopt/include/gpopt/xforms
// SOURCE FILES
~/orca/libgpopt/src/xforms
Transformation Trigger
• Pattern
• Pre-Condition Check
Pattern
GPOS_NEW(pmp)
CExpression
    (
    pmp,
    // logical aggregate operator
    GPOS_NEW(pmp) CLogicalGbAgg(pmp),
    // relational child
    GPOS_NEW(pmp) CExpression(pmp, GPOS_NEW(pmp) CPatternLeaf(pmp)),
    // scalar project list
    GPOS_NEW(pmp) CExpression(pmp, GPOS_NEW(pmp) CPatternTree(pmp))
    );
Matches
GpAgg (b)
├── Get(foo)
└── Sum(c)
What's wrong with this pattern?
GPOS_NEW(pmp)
CExpression
    (
    pmp,
    // logical aggregate operator
    GPOS_NEW(pmp) CLogicalGbAgg(pmp),
    // relational child
    GPOS_NEW(pmp) CExpression(pmp, GPOS_NEW(pmp) CPatternLeaf(pmp)),
    // scalar project list
    GPOS_NEW(pmp) CExpression(pmp, GPOS_NEW(pmp) CPatternTree(pmp))
    );
GpAgg (b)
├── GpAgg (b)
│   ├── Get(foo)
│   └── Sum(c)
└── Sum(c)
Pre-Condition Check
Do not fire this rule on a logical operator produced by the same rule.
(Avoid Infinite Recursion)
// Compatibility function for splitting aggregates
virtual
BOOL FCompatible(CXform::EXformId exfid)
{
    return (CXform::ExfSplitGbAgg != exfid);
}
GpAgg (b)
├── GpAgg (b)
│   ├── Get(foo)
│   └── Sum(c)
└── Sum(c)
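The effect of the compatibility check can be sketched with a toy exploration loop. Everything here is a simplified assumption: the two-value XformId enum, the GbAgg struct, and ExploreSplit are hypothetical, and a real Memo would place the local aggregate in a new group rather than in the same list. Only the shape of FCompatible follows the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum XformId { ExfInvalid, ExfSplitGbAgg };  // hypothetical two-value id set

struct GbAgg {
    XformId origin;  // rule that produced this operator; ExfInvalid for query input
};

// Same shape as the slide's check: refuse operators this rule produced itself.
inline bool FCompatible(XformId exfid) { return exfid != ExfSplitGbAgg; }

// Explore a list of aggregates to a fixed point; returns how often the rule fired.
inline std::size_t ExploreSplit(std::vector<GbAgg> &group) {
    assert(!group.empty());
    std::size_t fired = 0;
    // The loop also visits operators appended below, so without the
    // FCompatible test it would keep splitting its own output forever.
    for (std::size_t i = 0; i < group.size(); ++i) {
        if (!FCompatible(group[i].origin)) continue;
        group.push_back({ExfSplitGbAgg});  // global aggregate
        group.push_back({ExfSplitGbAgg});  // local aggregate
        ++fired;
    }
    return fired;
}
```

Starting from one aggregate that came from the query, the rule fires once and the loop stops after visiting the two aggregates it produced.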
The Actual Transformation
void Transform
    (
    CXformContext *pxfctxt, // update
    CXformResult *pxfres, // output
    CExpression *pexpr // input
    )
    const;
Register Transformation
void CXformFactory::Instantiate()
{
    …
    Add(GPOS_NEW(m_pmp) CXformSplitGbAgg(m_pmp));
    …
}
Top Level
.
├── cmake
├── concourse
├── data
├── libgpdbcost
├── libgpopt
├── libgpos
├── libnaucrates
├── patches
├── scripts
└── server
libgpos: memory, task scheduler, exception handling, unit-test framework
libgpos
├── include
├── src
│   ├── common
│   ├── error
│   ├── io
│   ├── memory
│   ├── net
│   ├── string
│   ├── sync
│   ├── task
│   └── test
└── server
    ├── include
    └── src
        ├── startup
        └── unittest
            └── gpos
                ├── common
                ├── error
                ├── io
                ├── memory
                ├── string
                ├── sync
                ├── task
                └── test
libnaucrates: DXL, metadata, statistics, traceflags
libnaucrates
├── include
│   └── naucrates
│       ├── base
│       ├── dxl
│       │   ├── operators
│       │   ├── parser
│       │   └── xml
│       ├── md
│       ├── statistics
│       └── traceflags
└── src
    ├── base
    ├── md
    ├── operators
    ├── parser
    ├── statistics
    └── xml
libgpopt: engine, metadata cache, minidump, operators, memo, xform rules
libgpopt
├── include
│   └── gpopt
└── src
    ├── base
    ├── engine
    ├── eval
    ├── mdcache
    ├── metadata
    ├── minidump
    ├── operators
    ├── optimizer
    ├── search
    ├── translate
    └── xforms
libgpdbcost: cost model
libgpdbcost
├── CMakeLists.txt
├── include
│   └── gpdbcost
└── src
    ├── CCostModelGPDB.cpp
    ├── CCostModelGPDBLegacy.cpp
    ├── CCostModelParamsGPDB.cpp
    ├── CCostModelParamsGPDBLegacy.cpp
    └── ICostModel.cpp
server: unit tests server
├── include
└── src
    ├── startup
    └── unittest
        ├── dxl
        │   ├── base
        │   └── statistics
        └── gpopt
            ├── base
            ├── cost
            ├── csq
            ├── engine
            ├── eval
            ├── mdcache
            ├── metadata
            ├── minidump
            ├── operators
            ├── search
            ├── translate
            └── xforms
data: all the test data
├── dxl
│   ├── cost
│   ├── csq_tests
│   ├── expressiontests
│   ├── indexjoin
│   ├── metadata
│   ├── minidump
│   │   ├── CArrayExpansionTest
│   │   ├── CJoinOrderDPTest
│   │   ├── CPhysicalParallelUnionAllTest
│   │   ├── CPruneColumnsTest
│   │   └── sql
│   ├── multilevel-partitioning
│   ├── parse_tests
│   ├── plstmt
│   ├── query
│   ├── search
│   ├── statistics
│   ├── tpcds
│   ├── tpcds-partitioned
│   ├── tpch
│   └── tpch-partitioned
Dependencies
GP-XERCES: https://github.com/greenplum-db/gp-xerces
CMake 3.0+
CMake Build
mkdir build
cd build
cmake ../
make && make install
Concourse Pipeline
https://ci.orca.pivotalci.info/teams/main/pipelines/gporca
Test GPORCA (<2min)
# run all unit tests
ctest
# run all unit tests in parallel with 7 jobs
ctest -j7
# run only one unit test called CAggTest
./server/gporca_test -U CAggTest
Test with GPDB OSS in Docker
Follow instructions from:
https://github.com/d/bug-free-fortnight
It's very useful to verify installcheck-good locally with the latest GPORCA changes.
GPORCA Traceflags and GPDB GUC
GPORCA relies on Traceflags to change runtime behavior: Traceflag.h
Exposed in GPDB as GUC (Grand Unified Configuration): guc_gp.c
-- turn on GPORCA
set optimizer=on;
-- print input query (GPORCA TF 101000)
set optimizer_print_query=on;
Minidump: DXL document
Turn on the minidump
set client_min_messages='log';
set optimizer=on;
set optimizer_enable_constant_expression_evaluation=off;
set optimizer_enumerate_plans=on;
set optimizer_minidump=always;
Run a query
GPORCA creates a *.mdp file in $MASTER_DATA_DIRECTORY/minidump
# run only one minidump directly
./server/gporca_test -d ../data/dxl/minidump/TVFRandom.mdp
Input Query, Output Plan
Debug the plans
set client_min_messages='log';
set optimizer=on;
set optimizer_print_query=on; -- input query, and preprocessed query
set optimizer_print_plan=on; -- output final physical plan
Plan Enumeration
Turn on the plan enumerations
set client_min_messages='log';
set optimizer=on;
set optimizer_enumerate_plans=on;
Pick a plan out of search space
set optimizer=on;
set client_min_messages='log';
set optimizer_enumerate_plans=on;
set optimizer_plan_id=1;
Optimization Stats and Xform Rules
Debug optimizer stages
set client_min_messages='log';
set optimizer=on;
set optimizer_print_optimization_stats=on;
Debug the transformation rule details
set client_min_messages='log';
set optimizer=on;
set optimizer_print_xform=on;
MEMO Groups
set optimizer_print_memo_after_exploration=on;
set optimizer_print_memo_after_implementation=on;
set optimizer_print_memo_after_optimization=on;
ROOT group is indicated as `ROOT`
Way to Disable Xform Rules
select disable_xform('CXformJoinAssociativity');
select enable_xform('CXformJoinAssociativity');
All xform rules can be found by class name under libgpopt/include/gpopt/xforms
CXformFactory::Instantiate lists all the activated xform rules (~130 rules)
Useful Breakpoints
# Entry point of optimizer
COptimizer::PdxlnOptimize
# DXL: Translate DXL into Query
CTranslatorDXLToExpr::PexprTranslateQuery
# Step 0: Pre-processor
CExpressionPreprocessor::PexprPreprocess
# Steps 1-4: Optimization
COptimizer::PexprOptimize
# Individual rule transformation, all CXform* classes
CXformSplitGbAgg::Transform
# Enforceable Property
CEngine::FCheckEnfdProps
CPartitionPropagationSpec::AppendEnforcers
# DXL: Translate Plan back into DXL
CTranslatorExprToDXL::PdxlnTranslate
1-2-3
1. Fork GPORCA at https://github.com/greenplum-db/gporca
2. Pick an issue https://github.com/greenplum-db/gporca/issues
3. Send a Pull Request (PR)
Join GPORCA mailing list gpdb-users+subscribe@greenplum.org
THANKS!
xzhang@pivotal.io
FAQ
Q: What happens to the SORT generated in Enforcement?
A: There is only ONE implementation of SORT, so it is a physical operator and won't be pushed anywhere after enforcement.
Q: Why does CXformResult have more than one alternative?
A: Some rules (e.g. CXformExpandNAryJoinDP) can produce multiple choices after transformation.
Q: How hard is it to make GPORCA adapt to a new host?
A: The hard part is the MD translation. So far, GPORCA is very Postgres friendly.
Q: Why is there no separate SQL parser included in GPORCA?
A: GPORCA focuses on relational algebra and lets the host handle binding, view expansion, and permissions.
Q: How to add a new property like 'reliability' or 'dollar cost' to this optimizer?
A: That can be added to Property Enforcement, like Order/Distribution/Partition/Rewindability. For example, if people want to favor a more 'reliable' data source, they can add a CEnfdReliability class to cost that choice. 'Reliability' and 'dollar cost' make an interesting combination: usually, the more reliable option is also the more expensive one, so there is a balance to achieve.
Q: How long does `ctest` take to run?
A: Around 5 min on a 2.8 GHz Intel i7. Running with `ctest -j7` finishes in < 2 min.
Q: Is GPORCA multi-threaded?
A: It's multi-thread READY, but currently we run with a single thread. There are still a few caveats (thread-safety issues) to iron out before we can fully turn it on.
Publications
Orca: A Modular Query Optimizer Architecture for Big Data, SIGMOD 2014
Mohamed A. Soliman, Lyublena Antova, Venkatesh Raghavan, Amr El-Helw, Zhongxian Gu, Entong Shen, George C. Caragea, Carlos Garcia-Alvarado, Foyzur Rahman, Michalis Petropoulos, Florian Waas, Sivaramakrishnan Narayanan, Konstantinos Krikellas, Rhonda Baldwin
Optimization of Common Table Expressions in MPP Database Systems, VLDB 2015
Amr El-Helw, Venkatesh Raghavan, Mohamed A. Soliman, George C. Caragea, Zhongxian Gu, Michalis Petropoulos.
Optimizing Queries over Partitioned Tables in MPP Systems, SIGMOD 2014
Lyublena Antova, Amr El-Helw, Mohamed Soliman, Zhongxian Gu, Michalis Petropoulos, Florian Waas
Reversing Statistics for Scalable Test Databases Generation, DBTest 2013
Entong Shen, Lyublena Antova
Total Operator State Recall - Cost-Effective Reuse of Results in Greenplum Database, ICDE Workshops 2013
George C. Caragea, Carlos Garcia-Alvarado, Michalis Petropoulos, Florian M. Waas
Testing the Accuracy of Query Optimizers, DBTest 2012
Zhongxian Gu, Mohamed A. Soliman, Florian M. Waas
Automatic Capture of Minimal, Portable, and Executable Bug Repros using AMPERe, DBTest 2012
Lyublena Antova, Konstantinos Krikellas, Florian M. Waas
Automatic Data Placement in MPP Databases, ICDE Workshops 2012
Carlos Garcia-Alvarado, Venkatesh Raghavan, Sivaramakrishnan Narayanan, Florian M. Waas
Antonios Katsarakis
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
DianaGray10
 
AWS Certified Solutions Architect Associate (SAA-C03)
AWS Certified Solutions Architect Associate (SAA-C03)AWS Certified Solutions Architect Associate (SAA-C03)
AWS Certified Solutions Architect Associate (SAA-C03)
HarpalGohil4
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
AstuteBusiness
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
Neo4j
 
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin..."$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
Fwdays
 
What is an RPA CoE? Session 2 – CoE Roles
What is an RPA CoE?  Session 2 – CoE RolesWhat is an RPA CoE?  Session 2 – CoE Roles
What is an RPA CoE? Session 2 – CoE Roles
DianaGray10
 

Recently uploaded (20)

GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
 
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptxPRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
 
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
"Scaling RAG Applications to serve millions of users",  Kevin Goedecke"Scaling RAG Applications to serve millions of users",  Kevin Goedecke
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
 
GlobalLogic Java Community Webinar #18 “How to Improve Web Application Perfor...
GlobalLogic Java Community Webinar #18 “How to Improve Web Application Perfor...GlobalLogic Java Community Webinar #18 “How to Improve Web Application Perfor...
GlobalLogic Java Community Webinar #18 “How to Improve Web Application Perfor...
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
 
Must Know Postgres Extension for DBA and Developer during Migration
Must Know Postgres Extension for DBA and Developer during MigrationMust Know Postgres Extension for DBA and Developer during Migration
Must Know Postgres Extension for DBA and Developer during Migration
 
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
 
Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
 
AWS Certified Solutions Architect Associate (SAA-C03)
AWS Certified Solutions Architect Associate (SAA-C03)AWS Certified Solutions Architect Associate (SAA-C03)
AWS Certified Solutions Architect Associate (SAA-C03)
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
 
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin..."$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
 
What is an RPA CoE? Session 2 – CoE Roles
What is an RPA CoE?  Session 2 – CoE RolesWhat is an RPA CoE?  Session 2 – CoE Roles
What is an RPA CoE? Session 2 – CoE Roles
 

GPDB Meetup GPORCA OSS 101

  • 1. GPORCA OSS 101
      Xin Zhang (xzhang@pivotal.io)
      GPDB Meetup, April 2017
  • 2. 1 WHO, WHERE, WHEN
      Data Team at Palo Alto
      GPORCA open-sourced in Jan 2016
      At http://github.com/greenplum-db/gporca
      Blog: http://engineering.pivotal.io/post/gporca-open-source/
  • 3. 2 WHAT, WHY, HOW
      WHAT is the GPORCA architecture?
      WHY use GPORCA (pros and cons)?
      HOW to contribute to GPORCA?
  • 4. 3 At the End of the Talk, You CAN
      Add a feature
      Unit test it
      Contribute back to the community
  • 5. GPORCA = Greenplum ORCA = ...
      Jul 2015, PQO: Pivotal Query Optimizer (released in GPDB 4.3.5.0)
      Jun 2014, ORCA: SIGMOD 2014 paper
      In source code, GPOPT: src/backend/gpopt, select gp_opt_version();
  • 6. Outline
      Introduction
      GPORCA Architecture
      Develop
      Build/Test/Debug
      How to contribute
  • 7. Query Plan Example 1/2
      SELECT s.beer, s.price
      FROM Bars b, Sells s
      WHERE b.name = s.bar
      AND b.city = 'San Francisco'
      • Bars is distributed randomly
      • Sells is distributed by bar
  • 8. Query Plan Example 2/2
      Same query and table distributions as in example 1/2.
      [Plan diagram split into Slice 1, Slice 2, and Slice 3]
  • 9. Query Execution Example
      [Diagram: the Master runs Slice 1; each of Segment 1 … Segment N runs Slice 2 and Slice 3]
  • 10. GPORCA can optimize all 111 TPC-DS queries, according to the SIGMOD 2014 paper
  • 11. Great SQL Surface of GPORCA
      ~60 Logical Operators
      ~50 Physical Operators
      ~40 Scalar Operators
      ~110 Transformation Rules
  • 12. Key Features vs. Planner
      • Smarter partition elimination
      • Multi-level partitioning support (4.3.6)
      • Subquery unnesting
      • Common table expressions (CTE)
      • Join on castable data types
      • Improved join ordering (better join order)
      • Join-Aggregate reordering (push/pull AGG over join)
      • Sort order optimization (avoid sorting multiple times)
      • Skew awareness (redistribution vs. broadcast)
  • 14. DXL API and Multi-Host Support
      [Diagram. Host system components: Parser, Q2DXL, MD Provider, Catalog, DXL2Plan, Executor. Flow: SQL -> Parser -> Q2DXL -> DXL Query -> GPORCA -> DXL Plan -> DXL2Plan -> Executor -> Results. GPORCA sends MD requests to the MD Provider, which reads the Catalog and returns DXL MD.]
  • 15. GPORCA Architecture
      [Diagram label: KERNEL]
  • 16. Optimization Steps in Orca
      • [Step 0] Pre-Process: e.g., predicate pushdown
      • [Step 1] Exploration: all equivalent logical plans
      • [Step 2] Statistics Derivation: histograms
      • [Step 3] Implementation: logical to physical
      • [Step 4] MPP Optimization: enforcing and costing
  • 17. Step 1 Exploration 1/2: Memoization
      • Compact in-memory data structure capturing the plan space
        – Group: container of equivalent expressions
        – Group Expression: operator that has other groups as its children
      Logical Expression:
        Inner Join (T1.a = T2.b)
          Get(T1)
          Get(T2)
      Memo:
        GROUP 0   0: Inner Join [1,2]
        GROUP 1   0: Get (T1) [ ]
        GROUP 2   0: Get (T2) [ ]
  • 18. Step 1 Exploration 2/2: Transformation
      Before:
        Logical Expression: Inner Join (T1.a = T2.b) over Get(T1), Get(T2)
        Memo:
          GROUP 0   0: Inner Join [1,2]
          GROUP 1   0: Get (T1) [ ]
          GROUP 2   0: Get (T2) [ ]
      After:
        Logical Expression: Inner Join (T1.a = T2.b) over Get(T2), Get(T1)
        Memo:
          GROUP 0   0: Inner Join [1,2]
                    1: Inner Join [2,1]
          GROUP 1   0: Get (T1) [ ]
          GROUP 2   0: Get (T2) [ ]
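
The Memo and the join-commutativity transformation on slides 17 and 18 can be sketched in a few lines of C++. This is a minimal illustration, not GPORCA code: the types `GroupExpr`, `Group`, `Memo` and the functions `ApplyJoinCommutativity` and `BuildSlideMemo` are hypothetical names invented here, and operators are reduced to plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified memo types; the real GPORCA classes are far richer.
struct GroupExpr {
    std::string op;                 // operator name, e.g. "Inner Join" or "Get(T1)"
    std::vector<std::size_t> kids;  // ids of child GROUPS, not child expressions
};

struct Group {
    std::vector<GroupExpr> exprs;   // logically equivalent alternatives
};

struct Memo {
    std::vector<Group> groups;

    // Add an expression to a group unless an identical one is already present.
    // Returns true if the expression was new.
    bool Insert(std::size_t groupId, const GroupExpr &candidate) {
        assert(groupId < groups.size());
        for (const GroupExpr &e : groups[groupId].exprs) {
            if (e.op == candidate.op && e.kids == candidate.kids) return false;
        }
        groups[groupId].exprs.push_back(candidate);
        return true;
    }
};

// Join commutativity: for every two-child "Inner Join" in the group, add the
// same join with its child groups swapped. Returns how many were added.
std::size_t ApplyJoinCommutativity(Memo &memo, std::size_t groupId) {
    std::size_t added = 0;
    // Copy first: Insert may grow the vector we would otherwise iterate over.
    const std::vector<GroupExpr> snapshot = memo.groups[groupId].exprs;
    for (const GroupExpr &e : snapshot) {
        if (e.op == "Inner Join" && e.kids.size() == 2) {
            GroupExpr swapped{e.op, {e.kids[1], e.kids[0]}};
            if (memo.Insert(groupId, swapped)) ++added;
        }
    }
    return added;
}

// Build the memo from the slide: GROUP 0 holds the join, GROUPs 1 and 2 the scans.
Memo BuildSlideMemo() {
    Memo memo;
    memo.groups.resize(3);
    memo.Insert(0, GroupExpr{"Inner Join", {1, 2}});
    memo.Insert(1, GroupExpr{"Get(T1)", {}});
    memo.Insert(2, GroupExpr{"Get(T2)", {}});
    return memo;
}
```

Calling `ApplyJoinCommutativity(memo, 0)` on the slide's memo adds `Inner Join [2,1]` to GROUP 0; a second call adds nothing because duplicate detection keeps the memo compact.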
  • 19. Step 2 Statistics Derivation 1/2
      [Diagram, Orca side]
        GROUP 0   0: Inner Join(a=b) [1,2]
                  1: Inner Join(a=b) [2,1]      Reqd Stats = { }
        GROUP 1   0: Get(T1) [ ]                Reqd Stats = {a}
        GROUP 2   0: Get(T2) [ ]                Reqd Stats = {b}
      [Diagram, other side: Back-end System]
  • 20. Step 2 Statistics Derivation 2/2
      [Same memo and required statistics as in 1/2. The requests for {a} and {b} each go through an MD Accessor to the MDCache on the Orca side, which is backed by the MD Provider and the Catalog in the Back-end System.]
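
The accessor, cache, and provider chain on slide 20 can be pictured as a lookup that only falls through to the host system on a cache miss. The sketch below is an assumption-laden illustration: `ColumnStats`, `StatsKey`, and `MDAccessorSketch` are hypothetical names, and a single number stands in for a real histogram.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for column statistics; real histograms carry buckets.
struct ColumnStats {
    double distinctValues;
};

// Key: (table name, column name).
using StatsKey = std::pair<std::string, std::string>;

// Sketch of an accessor that consults a cache before asking the host system.
class MDAccessorSketch {
public:
    // The provider plays the role of the MD Provider reading the catalog.
    using Provider = std::function<ColumnStats(const StatsKey &)>;

    explicit MDAccessorSketch(Provider provider) : provider_(std::move(provider)) {}

    // Return stats for one column, calling the provider only on a cache miss.
    const ColumnStats &Stats(const std::string &table, const std::string &column) {
        StatsKey key(table, column);
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            ++providerCalls_;
            it = cache_.emplace(key, provider_(key)).first;
        }
        return it->second;
    }

    // Number of times the provider was actually consulted.
    std::size_t ProviderCalls() const { return providerCalls_; }

private:
    Provider provider_;
    std::map<StatsKey, ColumnStats> cache_;
    std::size_t providerCalls_ = 0;
};
```

Requesting statistics for `T1.a` twice and `T2.b` once results in two provider calls, which is the point of keeping a cache between the optimizer and the catalog.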
  • 21. Step 3 Implementation
      Logical Expression:  Inner Join (T1.a=T2.b) [1,2] over GROUP 1, GROUP 2
      Physical Expression: Inner Hash Join (T1.a=T2.b) [1,2] over GROUP 1, GROUP 2
  • 22. Step 4 MPP Optimization 1/3: Requirement
      Optimization Request (Distribution, Order):
        Reqd Props: {Singleton, <T1.a>}
      Inner Hash Join (T1.a=T2.b) [1,2] over GROUP 1, GROUP 2
        Drvd Props: {Hashed(T1.a), < >}
  • 23. Step 4 MPP Optimization 2/3: Costing
      Optimization Request (Distribution, Order):
        Reqd Props: {Singleton, <T1.a>}
      Inner Hash Join (T1.a=T2.b) [1,2]
        Drvd Props: {Hashed(T1.a), < >}
        Child request {Hashed(T1.a), Any}: Scan(T1)
        Child request {Hashed(T2.b), Any}: Redistribute(T2.b) over Scan(T2)
  • 24. Step 4 MPP Optimization 3/3: Enforcement
      Same request, derived properties, and child requests as in 2/3.
      Property Enforcing, alternative 1:
        GatherMerge(T1.a)
          Sort(T1.a)
            Inner Hash Join
              Scan(T1)
              Redistribute(T2.b)
                Scan(T2)
      Property Enforcing, alternative 2:
        Sort(T1.a)
          Gather
            Inner Hash Join
              Scan(T1)
              Redistribute(T2.b)
                Scan(T2)
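
The enforcement step on slide 24 compares what the plan delivers ({Hashed(T1.a), no order}) with what is required ({Singleton, ordered on T1.a}) and lists the enforcer chains that close the gap. A minimal sketch of that comparison follows; `Props`, `EnforcerChain`, and `EnforcerAlternatives` are hypothetical names, only the Gather and Sort enforcers from the slide are modelled, and costing of the alternatives is left out.

```cpp
#include <string>
#include <vector>

enum class Distribution { Hashed, Singleton };

// Hypothetical property bundle: a distribution plus an optional sort column.
struct Props {
    Distribution dist;
    std::string sortCol;  // empty string means "no order"
};

// An enforcer chain listed bottom-up: element 0 sits directly above the child plan.
using EnforcerChain = std::vector<std::string>;

// Enumerate enforcer chains that turn `derived` into `required`.
// Only gathering to a singleton and sorting are covered; a request that
// would need a Redistribute yields no alternatives in this sketch.
std::vector<EnforcerChain> EnforcerAlternatives(const Props &derived, const Props &required) {
    std::vector<EnforcerChain> alternatives;
    if (required.dist == Distribution::Hashed && derived.dist != Distribution::Hashed) {
        return alternatives;  // out of scope for this sketch
    }
    const bool needGather = required.dist == Distribution::Singleton &&
                            derived.dist != Distribution::Singleton;
    const bool needSort = !required.sortCol.empty() && derived.sortCol != required.sortCol;
    const std::string col = required.sortCol;

    if (needGather && needSort) {
        // Sort on every segment, then merge the sorted streams while gathering.
        alternatives.push_back(EnforcerChain{"Sort(" + col + ")", "GatherMerge(" + col + ")"});
        // Gather everything first, then sort once on the single node.
        alternatives.push_back(EnforcerChain{"Gather", "Sort(" + col + ")"});
    } else if (needGather) {
        // An existing required order must survive the gather, hence GatherMerge.
        alternatives.push_back(col.empty() ? EnforcerChain{"Gather"}
                                           : EnforcerChain{"GatherMerge(" + col + ")"});
    } else if (needSort) {
        alternatives.push_back(EnforcerChain{"Sort(" + col + ")"});
    } else {
        alternatives.push_back(EnforcerChain{});  // nothing to enforce
    }
    return alternatives;
}
```

For the slide's request this returns the two chains shown there: Sort then GatherMerge, and Gather then Sort. Which one wins is a costing decision.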
  • 26. Walkthrough
      Split an aggregate into a pair of local and global aggregates.
      Schema: CREATE TABLE foo (a int, b int, c int) distributed by (a);
      Query:  SELECT sum(c) FROM foo GROUP BY b
      Do the local aggregation on the segments, then the global aggregation on the master.
  • 27. Split Groupby Aggregate
      Before:
        GpAgg (b)
          Get(foo)
          Sum(c)
      After:
        GpAgg (b)
          GpAgg (b)
            Get(foo)
            Sum(c)
          Sum(c)
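
Why the split is safe for `sum(c)` can be checked with a small simulation: partial sums computed per segment and merged afterwards equal the sum computed in one pass. The sketch below is illustrative only; `Row`, `Sums`, `LocalAgg`, `GlobalAgg`, `Distribute`, and `TwoStageSum` are hypothetical names, and "distributed by (a)" is approximated by taking `a` modulo the segment count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

struct Row { int a; int b; int c; };
using Sums = std::map<int, long long>;  // group key b -> sum(c)

// Local stage: each segment aggregates only the rows it stores.
Sums LocalAgg(const std::vector<Row> &segmentRows) {
    Sums partial;
    for (const Row &r : segmentRows) partial[r.b] += r.c;
    return partial;
}

// Global stage: merge the partial sums produced by all segments.
Sums GlobalAgg(const std::vector<Sums> &partials) {
    Sums total;
    for (const Sums &p : partials) {
        for (const auto &kv : p) total[kv.first] += kv.second;
    }
    return total;
}

// Place rows on segments by column a, a stand-in for "distributed by (a)".
std::vector<std::vector<Row>> Distribute(const std::vector<Row> &rows, std::size_t numSegments) {
    assert(numSegments > 0);
    std::vector<std::vector<Row>> segments(numSegments);
    for (const Row &r : rows) {
        const std::size_t seg = static_cast<std::size_t>(std::abs(r.a)) % numSegments;
        segments[seg].push_back(r);
    }
    return segments;
}

// SELECT sum(c) FROM foo GROUP BY b, evaluated as local + global aggregation.
Sums TwoStageSum(const std::vector<Row> &rows, std::size_t numSegments) {
    std::vector<Sums> partials;
    for (const std::vector<Row> &segmentRows : Distribute(rows, numSegments)) {
        partials.push_back(LocalAgg(segmentRows));
    }
    return GlobalAgg(partials);
}
```

The benefit in an MPP setting is that only one partial row per group and segment has to travel over the interconnect, instead of every input row.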
  • 28. CXformSplitGbAgg
      // HEADER FILES
      ~/orca/libgpopt/include/gpopt/xforms
      // SOURCE FILES
      ~/orca/libgpopt/src/xforms
  • 29. Transformation Trigger
      • Pattern
      • Pre-Condition Check
  • 30. Pattern
      GPOS_NEW(pmp) CExpression
          (
          pmp,
          // logical aggregate operator
          GPOS_NEW(pmp) CLogicalGbAgg(pmp),
          // relational child
          GPOS_NEW(pmp) CExpression(pmp, GPOS_NEW(pmp) CPatternLeaf(pmp)),
          // scalar project list
          GPOS_NEW(pmp) CExpression(pmp, GPOS_NEW(pmp) CPatternTree(pmp))
          ));
      Matches:
        GpAgg (b)
          Get(foo)
          Sum(c)
  • 31. What's wrong with this pattern?
      Same pattern code as on slide 30.
      It matches the original expression:
        GpAgg (b)
          Get(foo)
          Sum(c)
      but also the GpAgg operators in the output of the split:
        GpAgg (b)
          GpAgg (b)
            Get(foo)
            Sum(c)
          Sum(c)
  • 32. Pre-Condition Check
      Do not fire this rule on a logical operator produced by the same rule (avoids infinite recursion).
      // Compatibility function for splitting aggregates
      virtual
      BOOL FCompatible(CXform::EXformId exfid)
      {
          return (CXform::ExfSplitGbAgg != exfid);
      }
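
The effect of the compatibility check on slide 32 can be demonstrated with a toy exploration loop: without the guard, the split rule keeps matching the aggregates it just produced. This is a sketch under stated assumptions, not the GPORCA engine: `XformId`, `Expr`, `SplitCompatible`, and `Explore` are hypothetical names, and an expression is reduced to an operator name plus the rule that created it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class XformId { None, SplitGbAgg };

// Hypothetical expression record: operator name plus the rule that produced it.
struct Expr {
    std::string op;
    XformId origin;
};

// Mirrors the idea of FCompatible: the split rule refuses its own output.
bool SplitCompatible(XformId origin) { return origin != XformId::SplitGbAgg; }

// Apply the split rule to every aggregate in the group, including ones added
// along the way, and return how many times the rule fired. maxSteps caps the
// loop so the unguarded variant terminates in this demonstration.
std::size_t Explore(std::vector<Expr> &group, bool useGuard, std::size_t maxSteps) {
    std::size_t fired = 0;
    for (std::size_t i = 0; i < group.size() && fired < maxSteps; ++i) {
        const Expr current = group[i];  // copy: push_back below may reallocate
        if (current.op != "GbAgg") continue;
        if (useGuard && !SplitCompatible(current.origin)) continue;
        // The rule's output is again a GbAgg, tagged with the rule that made it.
        group.push_back(Expr{"GbAgg", XformId::SplitGbAgg});
        ++fired;
    }
    return fired;
}
```

With the guard the rule fires exactly once on a single input aggregate; without it the loop only stops because of the `maxSteps` cap.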
  • 33. The Actual Transformation
      void Transform
          (
          CXformContext *pxfctxt, // update
          CXformResult *pxfres,   // output
          CExpression *pexpr      // input
          ) const;
  • 34. Register Transformation
      void CXformFactory::Instantiate()
      {
          …
          Add(GPOS_NEW(m_pmp) CXformSplitGbAgg(m_pmp));
          …
      }
  • 36. Top Level
      .
      ├── cmake
      ├── concourse
      ├── data
      ├── libgpdbcost
      ├── libgpopt
      ├── libgpos
      ├── libnaucrates
      ├── patches
      ├── scripts
      └── server
  • 37. libgpos: memory, task scheduler, exception handling, unit-test framework
      libgpos
      ├── include
      └── src
          ├── common
          ├── error
          ├── io
          ├── memory
          ├── net
          ├── string
          ├── sync
          ├── task
          └── test

      ├── server
      │   ├── include
      │   └── src
      │       ├── startup
      │       └── unittest
      │           └── gpos
      │               ├── common
      │               ├── error
      │               ├── io
      │               ├── memory
      │               ├── string
      │               ├── sync
      │               ├── task
      │               └── test
  • 38. libnaucrates: DXL, metadata, statistics, traceflags
      libnaucrates
      ├── include
      │   └── naucrates
      │       ├── base
      │       ├── dxl
      │       │   ├── operators
      │       │   ├── parser
      │       │   └── xml
      │       ├── md
      │       ├── statistics
      │       └── traceflags
      └── src
          ├── base
          ├── md
          ├── operators
          ├── parser
          ├── statistics
          └── xml
  • 39. libgpopt: engine, metadata cache, minidump, operators, memo, xform rules
      libgpopt
      ├── include
      │   └── gpopt
      └── src
          ├── base
          ├── engine
          ├── eval
          ├── mdcache
          ├── metadata
          ├── minidump
          ├── operators
          ├── optimizer
          ├── search
          ├── translate
          └── xforms
  • 40. libgpdbcost: cost model
      libgpdbcost
      ├── CMakeLists.txt
      ├── include
      │   └── gpdbcost
      └── src
          ├── CCostModelGPDB.cpp
          ├── CCostModelGPDBLegacy.cpp
          ├── CCostModelParamsGPDB.cpp
          ├── CCostModelParamsGPDBLegacy.cpp
          └── ICostModel.cpp
  • 41. server: unit tests
      server
      ├── include
      └── src
          ├── startup
          └── unittest
              ├── dxl
              │   ├── base
              │   └── statistics
              └── gpopt
                  ├── base
                  ├── cost
                  ├── csq
                  ├── engine
                  ├── eval
                  ├── mdcache
                  ├── metadata
                  ├── minidump
                  ├── operators
                  ├── search
                  ├── translate
                  └── xforms
  • 42. data: all the test data
      data
      ├── dxl
      │   ├── cost
      │   ├── csq_tests
      │   ├── expressiontests
      │   ├── indexjoin
      │   ├── metadata
      │   ├── minidump
      │   │   ├── CArrayExpansionTest
      │   │   ├── CJoinOrderDPTest
      │   │   ├── CPhysicalParallelUnionAllTest
      │   │   ├── CPruneColumnsTest
      │   │   └── sql
      │   ├── multilevel-partitioning
      │   ├── parse_tests
      │   ├── plstmt
      │   ├── query
      │   ├── search
      │   ├── statistics
      │   ├── tpcds
      │   ├── tpcds-partitioned
      │   ├── tpch
      │   └── tpch-partitioned
  • 44. Dependencies
      GP-XERCES: https://github.com/greenplum-db/gp-xerces
      CMake 3.0+
  • 45. CMake Build
      mkdir build
      cd build
      cmake ../
      make && make install
  • 46. Concourse Pipeline
      https://ci.orca.pivotalci.info/teams/main/pipelines/gporca
  • 48. Test GPORCA (<2min)
      # run all unit tests
      ctest
      # run all unit tests in parallel with 7 threads
      ctest -j7
      # run only one unit test called CAggTest
      ./server/gporca_test -U CAggTest
  • 49. Test with GPDB OSS in Docker
      Follow the instructions from: https://github.com/d/bug-free-fortnight
      It's very useful for verifying installcheck-good locally with the latest GPORCA changes.
  • 51. GPORCA Traceflags and GPDB GUC
      GPORCA relies on traceflags to change runtime behavior: Traceflag.h
      Exposed in GPDB as GUC (Grand Unified Configuration): guc_gp.c
      -- turn on GPORCA
      set optimizer=on;
      -- print input query (GPORCA TF 101000)
      set optimizer_print_query=on;
  • 52. Minidump: DXL document
      Turn on the minidump:
      set client_min_messages='log';
      set optimizer=on;
      set optimizer_enable_constant_expression_evaluation=off;
      set optimizer_enumerate_plans=on;
      set optimizer_minidump=always;
      Run a query. GPORCA creates a *.mdp file in $MASTER_DATA_DIRECTORY/minidump
      # run only one minidump directly
      ./server/gporca_test -d ../data/dxl/minidump/TVFRandom.mdp
  • 53. Input Query, Output Plan
      Debug the plans:
      set client_min_messages='log';
      set optimizer=on;
      set optimizer_print_query=on; -- input query, and preprocessed query
      set optimizer_print_plan=on;  -- output final physical plan
  • 54. Plan Enumeration
      Turn on plan enumeration:
      set client_min_messages='log';
      set optimizer=on;
      set optimizer_enumerate_plans=on;
      Pick a plan out of the search space:
      set optimizer=on;
      set client_min_messages='log';
      set optimizer_enumerate_plans=on;
      set optimizer_plan_id=1;
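
Picking a plan by id presupposes that the plans encoded in the Memo can be counted. The slides do not describe how GPORCA numbers its plans, so the following is only a textbook-style sketch of counting over a memo-like structure: the number of plans rooted in a group is the sum, over its alternatives, of the product of the counts of their child groups. `Alt`, `MemoSketch`, and `CountPlans` are hypothetical names, and property requests and enforcers are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical alternative inside a group: just the ids of its child groups.
struct Alt {
    std::vector<std::size_t> kids;
};

// Group id -> alternatives in that group. Assumed acyclic: children of an
// alternative never lead back to the group that contains it.
using MemoSketch = std::vector<std::vector<Alt>>;

// Count the distinct plans rooted at the given group.
unsigned long long CountPlans(const MemoSketch &memo, std::size_t groupId) {
    assert(groupId < memo.size());
    unsigned long long total = 0;
    for (const Alt &alt : memo[groupId]) {
        unsigned long long product = 1;  // a leaf alternative counts as one plan
        for (std::size_t kid : alt.kids) product *= CountPlans(memo, kid);
        total += product;
    }
    return total;
}
```

For a root group with alternatives `[1,2]` and `[2,1]`, where group 1 has two leaf alternatives and group 2 has one, the count is 2·1 + 1·2 = 4, so plan ids in such a scheme would range over four plans.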
  • 55. Optimization Stats and Xform Rules
      Debug optimizer stages:
      set client_min_messages='log';
      set optimizer=on;
      set optimizer_print_optimization_stats=on;
      Debug the details of the transformation rules:
      set client_min_messages='log';
      set optimizer=on;
      set optimizer_print_xform=on;
  • 56. MEMO Groups
      set optimizer_print_memo_after_exploration=on;
      set optimizer_print_memo_after_implementation=on;
      set optimizer_print_memo_after_optimization=on;
      The ROOT group is indicated as `ROOT`
  • 57. Way to Disable Xform Rules
      select disable_xform('CXformJoinAssociativity');
      select enable_xform('CXformJoinAssociativity');
      All the xform rules can be found from the class names under libgpopt/include/gpopt/xforms
      CXformFactory::Instantiate lists all the activated xform rules (~130 rules)
  • 58. Useful Breakpoints
      # Entry point of optimizer
      COptimizer::PdxlnOptimize
      # DXL: Translate DXL into Query
      CTranslatorDXLToExpr::PexprTranslateQuery
      # Step 1: Pre-processor
      CExpressionPreprocessor::PexprPreprocess
      # Step 2-3-4: Optimization
      COptimizer::PexprOptimize
      # Individual rule transformation, all CXform* classes
      CXformSplitGbAgg::Transform
      # Enforceable Property
      CEngine::FCheckEnfdProps
      CPartitionPropagationSpec::AppendEnforcers
      # DXL: Translate Plan back in DXL
      CTranslatorExprToDXL::PdxlnTranslate
  • 60. 1-2-3
      1. Fork GPORCA at https://github.com/greenplum-db/gporca
      2. Pick an issue: https://github.com/greenplum-db/gporca/issues
      3. Send a Pull Request (PR)
      THANKS!
      Join the GPORCA mailing list: gpdb-users+subscribe@greenplum.org
      xzhang@pivotal.io
  • 61. FAQ
      Q: What happens to the SORT generated in Enforcement?
      A: There is only ONE implementation of SORT, so it is a physical operator and won't be pushed anywhere after enforcement.
      Q: Why can CXformResult hold more than one alternative?
      A: Some rules (e.g. CXformExpandNAryJoinDP) can produce multiple choices after transformation.
      Q: How hard is it to adapt GPORCA to a new host?
      A: The hard part is the MD translation. So far, GPORCA is very Postgres friendly.
      Q: Why is there no separate SQL parser included in GPORCA?
      A: GPORCA focuses on relational algebra and lets the host handle binding, view expansion, and permissions.
      Q: How can a new property like 'reliability' or 'dollar cost' be added to this optimizer?
      A: It can be added to Property Enforcement alongside Order/Distribution/Partition/Rewindability. For example, if people want to favor a more 'reliable' data source, they can add a CEnfdReliability class to cost that choice. 'Reliability' and 'dollar cost' make an interesting combination: usually, the more reliable option is the more expensive one, so there is a balance to achieve.
      Q: How long does `ctest` run?
      A: Around 5 min on a 2.8 GHz Intel i7. Running with `ctest -j7` finishes in < 2 min.
      Q: Is GPORCA multi-threaded?
      A: It's multi-thread READY, but currently we run with a single thread. There are still a few caveats (thread-safety issues) to iron out before we can fully turn it on.
  • 62. Publications
      Orca: A Modular Query Optimizer Architecture for Big Data, SIGMOD 2014. Mohamed A. Soliman, Lyublena Antova, Venkatesh Raghavan, Amr El-Helw, Zhongxian Gu, Entong Shen, George C. Caragea, Carlos Garcia-Alvarado, Foyzur Rahman, Michalis Petropoulos, Florian Waas, Sivaramakrishnan Narayanan, Konstantinos Krikellas, Rhonda Baldwin
      Optimization of Common Table Expressions in MPP Database Systems, VLDB 2015. Amr El-Helw, Venkatesh Raghavan, Mohamed A. Soliman, George C. Caragea, Zhongxian Gu, Michalis Petropoulos
      Optimizing Queries over Partitioned Tables in MPP Systems, SIGMOD 2014. Lyublena Antova, Amr El-Helw, Mohamed Soliman, Zhongxian Gu, Michalis Petropoulos, Florian Waas
      Reversing Statistics for Scalable Test Databases Generation, DBTest 2013. Entong Shen, Lyublena Antova
      Total Operator State Recall - Cost-Effective Reuse of Results in Greenplum Database, ICDE Workshops 2013. George C. Caragea, Carlos Garcia-Alvarado, Michalis Petropoulos, Florian M. Waas
      Testing the Accuracy of Query Optimizers, DBTest 2012. Zhongxian Gu, Mohamed A. Soliman, Florian M. Waas
      Automatic Capture of Minimal, Portable, and Executable Bug Repros using AMPERe, DBTest 2012. Lyublena Antova, Konstantinos Krikellas, Florian M. Waas
      Automatic Data Placement in MPP Databases, ICDE Workshops 2012. Carlos Garcia-Alvarado, Venkatesh Raghavan, Sivaramakrishnan Narayanan, Florian M. Waas
  • 63. A NEW PLATFORM FOR A NEW ERA