© Copyright 2019 Pivotal Software, Inc. All rights Reserved.
September 25, 2019
Pivotal
Introducing Pivotal Greenplum, Which Continues to Evolve as a Cloud DWH
Agenda
Ø Pivotal Greenplum
Ø Greenplum - Pivotal Greenplum 6
● DWH Pivotal Greenplum
Ø DWH
Pivotal Greenplum
• Pivotal Data Suite (CPU )
• ( ) K8s
• MPP DB
• ( etc..)
Pivotal Greenplum
MPP (Massively Parallel Processing)
[Figure: MPP architecture diagram; surviving labels: SQL, gNet, x 2]
[Figure: CPU and I/O resources on hardware (HW); surviving labels: CPU, I/O, HW, RDB, DB]
[Figure (1/2): cluster hardware diagram. Six hosts, each with 256GB RAM and Intel E5-2680v3 (2 CPU, 24 cores). Two hosts use 6 x 1.8TB 10Krpm SAS HDD in RAID5 (4 + 1 + 1). Hosts #1 to #4 use RAID5 groups of 12 x 1.8TB 10Krpm SAS HDD (10 + 1 + 1), eight groups in total. Two switches, #1 and #2 (10Gbps x 52), connected with mLAG.]
[Figure (2/2): cluster hardware diagram. Six hosts, each with 64GB RAM and Intel E5-2660 (2 CPU, 16 cores). Two hosts use 6 x 300GB 10Krpm SAS HDD in RAID5 (4 + 1 + 1). Hosts #1 to #4 use RAID5 groups of 12 x 300GB 10Krpm SAS HDD (10 + 1 + 1), eight groups in total. Two switches, #1 and #2 (10Gbps x 52), connected with mLAG; other labels: HA, Bonding, ports #1, #9, #17, #25.]
[Figure: Greenplum version timeline. Labels: OLTP, OLAP, BI; versions v1, v4, v5, v6; years 2003 - 2009, 2010, 2013, 2015-2018, 2019]
DWH
• PB
• SLA
• OSS
#1 DWH
September 4, 2019: Now Generally Available
Greenplum 6 Postgres
v8.4 – 2314 commits
v9.0 – 1859 commits
v9.1 – 2035 commits
v9.2 – 1945 commits
v9.3 – 1603 commits
v9.4 – 1964 commits
TOTAL: 11,720 Commits Merged
Code Quality via Open Source
Optimized for Big Data in Greenplum
“Customers frequently called out the open-source alignment with PostgreSQL as a strong and cost-effective positive”
-- Gartner MQ 2019
Greenplum 6 OLTP
70
● OLTP
● OLTP
● 24,448 TPS for Update transactions in GP6
● 46,570 TPS for Single Row Insert in GP6
● 140,000 TPS for Select Only Query in GP6
●
Real world analytical database and data warehouse use cases require a mixed workload of long and short queries as well as updates and deletes
“Replicated”
“DISTRIBUTED REPLICATED”
● Greenplum
● /
● Replicated
REPLICATED
DIMENSION TABLES
FOR FAST LOCAL JOIN
Greenplum 6 Replicated Tables
create table table_replicated (a int , b text)
distributed replicated;
insert into table_replicated
select id, 'val ' || id
from generate_series (1,10000) id;
select pg_relation_size('table_replicated');
pg_relation_size
------------------
917504
create table table_non_replicated (a int , b text)
distributed randomly;
insert into table_non_replicated
select id, 'val ' || id
from generate_series (1,10000) id;
select pg_relation_size('table_non_replicated');
pg_relation_size
------------------
458752
Replicated table vs. non-replicated table: the size of the replicated table is multiplied by the number of primaries (917504 bytes = 2 x 458752 bytes on this two-segment cluster).
select gp_segment_id, count(*) from table_replicated
group by 1;
ERROR: column "gp_segment_id" does not exist
LINE 1: select gp_segment_id, count(*) from ...
               ^
The field gp_segment_id doesn't exist in replicated tables.
select gp_segment_id, count(*) from
table_non_replicated group by 1;
gp_segment_id | count
---------------+-------
0 | 5011
1 | 4989
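The behaviour described above can be tried with a short sketch. The first query reuses `table_replicated` from the slide. The second statement uses a hypothetical dimension table `dim_region` and assumes Greenplum 6's `ALTER TABLE ... SET DISTRIBUTED REPLICATED` form for converting an existing table; neither the table nor the statement appears in the original deck.

```sql
-- A plain aggregate still works on a replicated table: each row is counted
-- once even though every primary segment stores a full copy.
SELECT count(*) FROM table_replicated;

-- Assumption: dim_region is a hypothetical, small dimension table that already
-- exists with hash or random distribution. This switches it to replicated
-- distribution so joins against it can run locally on each segment.
ALTER TABLE dim_region SET DISTRIBUTED REPLICATED;
```

Replication trades storage for join locality, so it tends to suit small dimension tables rather than large fact tables.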
Greenplum 6 Replicated Tables Query Plan
With Replicated table (1 slice, no redistribution):
explain select count(*) from table_fact f inner join table_replicated d on f.a = d.a;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Aggregate (cost=0.00..874.73 rows=1 width=8)
  -> Gather Motion 2:1 (slice1; segments: 2) (cost=0.00..874.73 rows=1 width=8)
    -> Aggregate (cost=0.00..874.73 rows=1 width=8)
      -> Hash Join (cost=0.00..874.73 rows=50000 width=1)
           Hash Cond: (table_fact.a = table_replicated.a)
        -> Seq Scan on table_fact (cost=0.00..432.15 rows=50000 width=4)
        -> Hash (cost=431.23..431.23 rows=10000 width=4)
          -> Seq Scan on table_replicated (cost=0.00..431.23 rows=10000 width=4)
Optimizer: PQO version 3.29.0
With Non Replicated table (3 slices, two Redistribute Motions):
explain select count(*) from table_fact f inner join table_non_replicated d on f.a = d.a;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Aggregate (cost=0.00..874.31 rows=1 width=8)
  -> Gather Motion 2:1 (slice3; segments: 2) (cost=0.00..874.31 rows=1 width=8)
    -> Aggregate (cost=0.00..874.31 rows=1 width=8)
      -> Hash Join (cost=0.00..874.31 rows=50000 width=1)
           Hash Cond: (table_fact.a = table_non_replicated.a)
        -> Redistribute Motion 2:2 (slice1; segments: 2) (cost=0.00..433.15 rows=50000 width=4)
             Hash Key: table_fact.a
          -> Seq Scan on table_fact (cost=0.00..432.15 rows=50000 width=4)
        -> Hash (cost=431.22..431.22 rows=5000 width=4)
          -> Redistribute Motion 2:2 (slice2; segments: 2) (cost=0.00..431.22 rows=5000 width=4)
               Hash Key: table_non_replicated.a
            -> Seq Scan on table_non_replicated (cost=0.00..431.12 rows=5000 width=4)
Optimizer: PQO version 3.29.0
1 slice vs 3 slices
H/W
● zStandard
● Open-source compression algorithm from Facebook
● Specified at CREATE TABLE time in the WITH clause: WITH (compresstype=zstd)
zStandard
Zstd
1-9 (1)
(9)
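As a minimal sketch of the option named above, the statement below creates an append-optimized, column-oriented table compressed with zstd. Only `compresstype=zstd` comes from the slide. The table and column names (`sales_zstd`, `id`, `payload`) are hypothetical, and the `appendonly`, `orientation`, and `compresslevel` settings are assumptions based on Greenplum's storage options.

```sql
-- Hypothetical table; only compresstype=zstd appears in the original slide.
-- Compression applies to append-optimized tables, so appendonly=true is assumed.
CREATE TABLE sales_zstd (id int, payload text)
WITH (appendonly=true, orientation=column, compresstype=zstd, compresslevel=1)
DISTRIBUTED BY (id);

-- Load the same kind of sample data used in the replicated-table slides.
INSERT INTO sales_zstd
SELECT i, 'val ' || i FROM generate_series(1, 10000) i;

-- Report the on-disk size; the ratio achieved depends on the data.
SELECT pg_size_pretty(pg_total_relation_size('sales_zstd'));
```

A lower compression level favors speed and a higher one favors size, so the level is worth tuning per table.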
SQL CTE
Using RECURSIVE, a WITH query can refer to its own output
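A small illustration of the statement above, using the standard PostgreSQL form rather than anything specific to the original slide: the CTE `t` starts from a seed row and keeps referring to its own output until the `WHERE` condition stops it.

```sql
-- Sum the integers 1..10 with a recursive CTE.
WITH RECURSIVE t(n) AS (
    SELECT 1                            -- non-recursive (seed) term
    UNION ALL
    SELECT n + 1 FROM t WHERE n < 10    -- recursive term refers to t itself
)
SELECT sum(n) FROM t;                   -- sum of 1..10 is 55
```

The same pattern is commonly used to walk hierarchies such as organization charts or bills of materials.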
ETL Writable CTE
Data modifying CTE allows several different operations in the same query
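One way to picture this in an ETL flow is a single statement that moves rows from a staging table to an archive table. The tables `staging_events` and `archive_events` and the column `event_date` are hypothetical; the sketch assumes both tables have identical columns.

```sql
-- Hypothetical tables: staging_events and archive_events share the same columns.
-- The DELETE runs inside the CTE and hands the removed rows to the INSERT,
-- so the move happens in one statement and one transaction.
WITH moved AS (
    DELETE FROM staging_events
    WHERE event_date < DATE '2019-01-01'
    RETURNING *
)
INSERT INTO archive_events
SELECT * FROM moved;
```

Doing both operations in one query avoids a window in which rows exist in neither table or in both.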
Unlogged :
● WAL Unlogged
● :
● DB
create unlogged table
table_unlogged
(a int , b text)
distributed randomly;
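To confirm how a table was created, the catalog can be queried, as in the sketch below. It assumes the `relpersistence` column of `pg_class`, which comes from the PostgreSQL 9.1+ catalog that Greenplum 6 inherits, and reuses table names from the slides.

```sql
-- relpersistence is 'u' for unlogged tables and 'p' for permanent ones.
SELECT relname, relpersistence
FROM pg_class
WHERE relname IN ('table_unlogged', 'table_replicated');
```

Because unlogged tables skip the WAL, they suit intermediate data that can be rebuilt rather than data that must survive a crash.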
Bare-Metal, Private Cloud, Public Cloud
Greenplum Building Blocks
• The most performant way to run Greenplum on premise
• Pivotal Blueprint for Dell reference hardware configs
• Superior price/performance; no expensive proprietary hardware
• Certified and supported by Pivotal
Run Greenplum in Any Environment
Greenplum for Kubernetes
Other Kubernetes
(on VMs or not)
Google
Container Engine
Enterprise & Essentials(OSS K8s)
•
• : 100%
•
Public Cloud
Run Greenplum in Any Environment
○ AI
Pivotal Greenplum
○ DR AZ
○ HA
● 1
● 5
● pgBouncer
DB
● gpsnap/gpcronsnap
IaaS
Azure Resource Group Deployment
AWS CloudFormation
GCP Deployment Manager
[Figure: VMs with data volumes; Snapshot and Restore]
Run Greenplum in Any Environment
Greenplum for Kubernetes
Other Kubernetes
(on VMs or not)
Google
Container Engine
Enterprise & Essentials(OSS K8s)
●
● K8s
● K8s
●
day-2
PKS
Container
Operator
Bringing Cloud Databases On-Premises
● Greenplum
(Postgres) / Pod /
VM(vMotion)
● Greenplum
K8s worker 1 K8s worker n
PKS/K8s cluster
pod pod
K8s worker VMs: 8 to 32 GB
● Greenplum
○
○
● Greenplum
○
Pod
● VM K8s
Greenplum Pod
○ Pod
Persistent Volume 1 . . n
K8s worker 1 K8s worker n
PKS/K8s cluster
pod pod
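Use case: find two people whose names sound like ‘Peter’ and ‘Pavan’, who work at Pivotal, who are linked to each other, and who each withdrew more than 200 within 24 hours at an ATM within 2km of a reference location.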
drop function if exists get_people(text,text,integer,integer,float,float);
CREATE FUNCTION get_people(text,text,integer,integer,float,float) RETURNS integer
AS $$
declare
linkchk integer; v1 record; v2 record;
begin
execute 'truncate table results;';
for v1 in select distinct a.id,a.firstname,a.lastname,amount,tran_date,c.lat,c.lng,address,a.description,d.score from people a,transactions b,location c,
(SELECT w.id, q.score FROM people w, gptext.search(TABLE(SELECT 1 SCATTER BY 1), 'gpadmin.public.people' , 'Pivotal', null) q
WHERE (q.id::integer) = w.id order by 2 desc) d
where soundex(firstname)=soundex($1) and a.id=b.id and amount > $3 and (extract(epoch from now()) - extract(epoch from tran_date))/3600 < $4
and st_distance_sphere(st_makepoint($5, $6),st_makepoint(c.lng, c.lat))/1000.0 <= 2.0 and b.locid=c.locid and a.id=d.id
loop
for v2 in select distinct a.id,a.firstname,a.lastname,amount,tran_date,c.lat,c.lng,address,a.description,d.score from people a,transactions b,location c,
(SELECT w.id, q.score FROM people w, gptext.search(TABLE(SELECT 1 SCATTER BY 1), 'gpadmin.public.people' , 'Pivotal', null) q
WHERE (q.id::integer) = w.id order by 2 desc) d
where soundex(firstname)=soundex($2) and a.id=b.id and amount > $3 and (extract(epoch from now()) - extract(epoch from tran_date))/3600 < $4
and st_distance_sphere(st_makepoint($5, $6),st_makepoint(c.lng, c.lat))/1000.0 <= 2.0 and b.locid=c.locid and a.id=d.id
loop
execute 'DROP TABLE IF EXISTS out, out_summary;';
execute 'SELECT madlib.graph_bfs(''people'',''id'',''links'',NULL,'||v1.id||',''out'');' ;
select 1 into linkchk from out where dist=1 and id=v2.id;
if linkchk is not null then
insert into results values (v1.id,v1.firstname,v1.lastname,v1.amount,v1.tran_date,v1.lat,v1.lng,v1.address,v1.description,v1.score);
insert into results values (v2.id,v2.firstname,v2.lastname,v2.amount,v2.tran_date,v2.lat,v2.lng,v2.address,v2.description,v2.score);
end if;
end loop;
end loop;
return 0;
end
$$ LANGUAGE plpgsql;
-- person 1, person 2, amount, duration in hours, longitude, latitude (in question)
select get_people('Pavan','Peter',200,24,103.912680, 1.309432) ;
Greenplum PostGIS functions st_distance_sphere() and st_makepoint() calculate the distance between the ATM location and the reference lat/long (< 2 km)
GPText.search() function is used to know if both people work at ‘Pivotal’
Greenplum and Apache MADlib BFS search to know if there are direct or indirect links between people
Greenplum Fuzzy String Match function soundex() to know if a person's name sounds like ‘Pavan’ or ‘Peter’
Greenplum time functions to calculate whether the amount was withdrawn within 24 hours
Amount > $200
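Two of the building blocks above can be tried in isolation, as sketched here. The sketch assumes the fuzzystrmatch module and PostGIS are installed. The first point reuses the reference longitude and latitude from the `get_people` call; the misspelled name ‘Pavaan’ and the second coordinate pair are made-up values for illustration.

```sql
-- soundex() comes from the fuzzystrmatch module; names that sound alike
-- map to the same four-character code, so this comparison yields true.
SELECT soundex('Pavan') = soundex('Pavaan') AS sounds_alike;

-- Distance in km between two longitude/latitude points.
-- The second point is hypothetical.
SELECT st_distance_sphere(st_makepoint(103.912680, 1.309432),
                          st_makepoint(103.920000, 1.310000)) / 1000.0 AS km;
```

Note that st_makepoint() takes longitude first and latitude second, which matches the argument order used in the function above.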
[Figure: query conditions mapped to components: ‘Pivotal’ - GPText; ‘Peter’, ‘Pavan’ - Fuzzy String Match; links between people - Apache MADlib; ‘2km ATM’ - PostGIS; ‘24’; ‘200’]
Lines of code: 3,000+ vs 34
“Investigate a crime suspect whose name sounds like ‘Pavan’, who knows Peter directly, who withdrew Peter’s $500 at an ATM located 2km from Changi yesterday.”
Using a Hadoop Ecosystem: 10 steps, 3000+ lines of code across 4 different systems
1. LOAD customer data from HDFS and put to HIVE
2. DESCRIPTION column needs to be indexed
3. SEARCH in column & WRITE result to HDFS
4. WRITE CODE: pulling data into Spark Data Frame
5. WRITE CODE: CHECK Soundex
6. WRITE CODE: MATCH SOLR result
7. WRITE CODE: GRAPH LINK analysis
8. WRITE CODE: POSTGIS distance calculation
9. WRITE CODE: GRAPH LINK analysis
10. WRITE CODE: WRITE RESULTS TO HIVE TABLE
Using Greenplum: 1 step, 1 query, 34 lines of code
One query using built-in functions: Soundex (sounds like), NLP (work at same company), Machine Learning MADlib (know directly), Time (yesterday), PostGIS (within 2km)
Greenplum
Greenplum
BI
Pivotal Greenplum OSS
Apache MADlib
• 50
• Pivotal Greenplum
• Apache (July 2017): http://madlib.apache.org
Apache Solr
• http://lucene.apache.org/solr/
PostGIS
• PostgreSQL OSS
• http://postgis.net/
R
• https://www.r-project.org/
Python
• https://www.python.org/
In-DB
• Open source https://github.com/apache/madlib
• Downloads and docs http://madlib.apache.org/
• Wiki https://cwiki.apache.org/confluence/display/MADLIB/
Apache MADlib: SQL
Apache
PostgreSQL &
Greenplum
Functions
Data Types and Transformations
Array and Matrix Operations
Matrix Factorization
• Low Rank
• Singular Value Decomposition (SVD)
Norms and Distance Functions
Sparse Vectors
Encoding Categorical Variables
Path Functions
Pivot
Sessionize
Stemming
May 2018
Graph
All Pairs Shortest Path (APSP)
Breadth-First Search
Hyperlink-Induced Topic Search (HITS)
Average Path Length
Closeness Centrality
Graph Diameter
In-Out Degree
PageRank and Personalized PageRank
Single Source Shortest Path (SSSP)
Weakly Connected Components
Model Selection
Cross Validation
Prediction Metrics
Train-Test Split
Statistics
Descriptive Statistics
• Cardinality Estimators
• Correlation and Covariance
• Summary
Inferential Statistics
• Hypothesis Tests
Probability Functions
Supervised Learning
Neural Networks
Support Vector Machines (SVM)
Conditional Random Field (CRF)
Regression Models
• Clustered Variance
• Cox-Proportional Hazards Regression
• Elastic Net Regularization
• Generalized Linear Models
• Linear Regression
• Logistic Regression
• Marginal Effects
• Multinomial Regression
• Naïve Bayes
• Ordinal Regression
• Robust Variance
Tree Methods
• Decision Tree
• Random Forest
Time Series Analysis
• ARIMA
Unsupervised Learning
Association Rules (Apriori)
Clustering (k-Means)
Principal Component Analysis (PCA)
Topic Modelling (Latent Dirichlet Allocation)
Utility Functions
Columns to Vector
Conjugate Gradient
Linear Solvers
• Dense Linear Systems
• Sparse Linear Systems
Mini-Batching
PMML Export
Term Frequency for Text
Vector to Columns
Nearest Neighbors
• k-Nearest Neighbors
Sampling
Balanced
Random
Stratified
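To give a feel for how the functions listed above are called from SQL, here is a minimal linear regression sketch. The table `houses` and its columns are hypothetical (modelled on the example in the MADlib documentation), and `houses_linregr` is the output table name chosen for this illustration.

```sql
-- Hypothetical training table:
--   houses(id int, tax int, bath float8, size int, price int)
SELECT madlib.linregr_train(
    'houses',                      -- source table
    'houses_linregr',              -- output table for the fitted model
    'price',                       -- dependent variable
    'ARRAY[1, tax, bath, size]'    -- independent variables (1 = intercept)
);

-- Score rows in-database with the fitted coefficients.
SELECT h.id,
       madlib.linregr_predict(m.coef, ARRAY[1, h.tax, h.bath, h.size]) AS predicted_price
FROM houses h, houses_linregr m;
```

Training and scoring both run inside the database, so the data does not have to be exported to a separate analytics system.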
[Figure: Greenplum cluster. Master Host and Standby Master receive SQL and connect over the Interconnect to Segment Hosts Node1, Node2, Node3, ..., Node N; each segment host has GPU 1 ... GPU N]
In-Database Functions: machine learning & statistics & math & graph & utilities
Massively Parallel Processing
Best of both worlds: GPU-focused and CPU-focused data science workloads
● Unified platform for full range of data science workloads
● Higher productivity due to no data movement
● Persistent data storage and management integrated with core machine learning & API compute engine
Supporting the full spectrum of data science workloads:
Data preparation, feature generation, machine learning, geospatial, deep learning, etc
Data Types: Structured Data, Unstructured Data, Geographic Data, Real Time Data, Natural Language Data, Time Series Data, Event Data, Network Data, Linked Data
[Figure: data tiers. Labels: In-Memory Database, RDBMS, Data Lake; HOT DATA, WARM DATA, COLD DATA]
Platform Extension Framework (PXF)
PXF
(GIS)
: (NICT)
• ( )
• Pivotal Greenplum
• PostGIS: OSS
• Apache MADlib: OSS
Hadoop
GPText
Pivotal Greenplum
[Figure: Pivotal Greenplum architecture overview; surviving labels listed below]
Data Temperature: Hot, Warm, Cold
PIVOTAL GEMFIRE; PIVOTAL GREENPLUM (Data Warehouse); PIVOTAL GREENPLUM
Structured Data: JDBC, ODBC; SQL; ANSI SQL
RDBMS, Spark, GemFire, HDFS
JSON, Apache AVRO, Apache Parquet and XML
Teradata SQL; DB SQL
Apache MADlib; Python, R, Java, Perl, C; Apache SOLR; PostGIS
Custom Apps, BI / Reporting, Machine Learning, AI
Pivotal Greenplum: Kafka, ETL, Spring Cloud Data Flow; (MPP); PostgreSQL; (GPORCA); Command Center; SQL (Hyper-Q)
IT
● Pivotal Greenplum 3
Ø
●
Ø &
Ø PostgreSQL DWH
● Greenplum 6
● DWH Pivotal Greenplum
●
● TW: @greenplummy
● connpass: https://pivotal-japan.connpass.com/
Thank You

 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 

Introducing Pivotal Greenplum, Continuing to Evolve as a Cloud DWH

  • 1. © Copyright 2019 Pivotal Software, Inc. All rights Reserved. 2019 9 25 Pivotal DWH Pivotal Greenplum
  • 2. Agenda ● Ø Pivotal Greenplum Ø Greenplum - Pivotal Greenplum 6 ● DWH Pivotal Greenplum Ø Ø DWH Ø ●
  • 3. 3© Copyright 2019 Pivotal. All rights reserved. Pivotal Greenplum • Pivotal Data Suite (CPU ) • • • ( ) K8s • MPP DB • • ( etc..) • • •
  • 4. 4© Copyright 2019 Pivotal. All rights reserved. Pivotal Greenplum MPP (Massively Parallel Processing) ... ... x 2 x 2 SQL SQL gNet
  • 5. 5© Copyright 2019 Pivotal. All rights reserved. CPU I/O CPU CPU CPU CPU CPU CPU CPU CPU CPU I/O I/O HW RDB DB
  • 6. 6© Copyright 2019 Pivotal. All rights reserved. ( 1/2) 256GB RAM 1.8TB 10Krpm SAS HDD 6 RAID5 (4 + 1 + 1 ) 256GB RAM256GB RAM 256GB RAM 256GB RAM 256GB RAM 1.8TB 10Krpm SAS HDD 6 RAID5 (4 + 1 + 1 ) #1 (10Gbps x 52 ) #2 (10Gbps x 52 ) #1 #2 #3 #4 Intel E5-2680v3 2CPU,24Core Intel E5-2680v3 2CPU,24Core Intel E5-2680v3 2CPU,24Core Intel E5-2680v3 2CPU,24Core Intel E5-2680v3 2CPU,24Core Intel E5-2680v3 2CPU,24Core 1.8TB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 1.8TB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 1.8TB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 1.8TB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 1.8TB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 1.8TB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 1.8TB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 1.8TB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) mLAG
  • 7. 7© Copyright 2019 Pivotal. All rights reserved. ( 2/2) 64GB RAM 300GB 10Krpm SAS HDD 6 RAID5 (4 + 1 + 1 ) 64GB RAM64GB RAM 64GB RAM 64GB RAM 64GB RAM 300GB 10Krpm SAS HDD 6 RAID5 (4 + 1 + 1 ) #1 (10Gbps x 52 ) #2 (10Gbps x 52 ) #1 #2 #3 #4 Intel E5-2660 2CPU,16Core Intel E5-2660 2CPU,16Core Intel E5-2660 2CPU,16Core Intel E5-2660 2CPU,16Core Intel E5-2660 2CPU,16Core Intel E5-2660 2CPU,16Core 300GB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 300GB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 300GB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 300GB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 300GB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 300GB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 300GB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) 300GB 10Krpm SAS HDD 12 RAID5 (10 + 1 + 1 ) HA Bonding #1 #1 #9 #9 #17 #17 5 #25 #25 mLAG
  • 9. Greenplum v1 v4 v5 v6 2003 - 2009 2010 2015-2018 2013 2019 BI
  • 10. Cover w/ Image DWH • PB • : • SLA • • • OSS #1 DWH
  • 11. September 4, 2019: Now Generally Available
  • 12. Greenplum 6 Postgres v8.4 – 2314 commits v9.0 – 1859 commits v9.1 – 2035 commits v9.2 – 1945 commits v9.3 – 1603 commits v9.4 – 1964 commits TOTAL: 11,720 Commits Merged Code Quality via Open Source Optimized for Big Data in Greenplum “Customers frequently called out the open-source alignment with PostgreSQL as a strong and cost-effective positive” -- Gartner MQ 2019
  • 13. Greenplum 6 OLTP 70 ● OLTP ● OLTP ● 24,448 TPS for Update transactions in GP6 ● 46,570 TPS for Single Row Insert in GP6 ● 140,000 TPS for Select Only Query in GP6 ● Real world analytical database and data warehouse use cases require a mixed workload of long and short queries as well as updates and deletes
  • 14. “Replicated” “DISTRIBUTED REPLICATED” ● Greenplum ● / ● Replicated REPLICATED DIMENSION TABLES FOR FAST LOCAL JOIN
  • 15. Greenplum 6 Replicated Tables

    With replicated tables (size is multiplied by the number of primaries):

        create table table_replicated (a int, b text) distributed replicated;
        insert into table_replicated select id, 'val ' || id from generate_series(1,10000) id;
        select pg_relation_size('table_replicated');
         pg_relation_size
        ------------------
                   917504

    With a non-replicated table:

        create table table_non_replicated (a int, b text) distributed randomly;
        insert into table_non_replicated select id, 'val ' || id from generate_series(1,10000) id;
        select pg_relation_size('table_non_replicated');
         pg_relation_size
        ------------------
                   458752

    The field gp_segment_id doesn't exist in replicated tables:

        select gp_segment_id, count(*) from table_replicated group by 1;
        ERROR: column "gp_segment_id" does not exist
        LINE 1: select gp_segment_id, count(*) from ...
                       ^

        select gp_segment_id, count(*) from table_non_replicated group by 1;
         gp_segment_id | count
        ---------------+-------
                     0 |  5011
                     1 |  4989
  • 16. Greenplum 6 Replicated Tables Query Plan (1 slice vs 3 slices)

    With a replicated table (1 slice, no redistribution):

        explain select count(*) from table_fact f inner join table_replicated d on f.a = d.a;
                                          QUERY PLAN
        ------------------------------------------------------------------------------------------
         Aggregate (cost=0.00..874.73 rows=1 width=8)
           -> Gather Motion 2:1 (slice1; segments: 2) (cost=0.00..874.73 rows=1 width=8)
                -> Aggregate (cost=0.00..874.73 rows=1 width=8)
                     -> Hash Join (cost=0.00..874.73 rows=50000 width=1)
                          Hash Cond: (table_fact.a = table_replicated.a)
                          -> Seq Scan on table_fact (cost=0.00..432.15 rows=50000 width=4)
                          -> Hash (cost=431.23..431.23 rows=10000 width=4)
                               -> Seq Scan on table_replicated (cost=0.00..431.23 rows=10000 width=4)
         Optimizer: PQO version 3.29.0

    With a non-replicated table (3 slices):

        explain select count(*) from table_fact f inner join table_non_replicated d on f.a = d.a;
                                          QUERY PLAN
        ------------------------------------------------------------------------------------------
         Aggregate (cost=0.00..874.31 rows=1 width=8)
           -> Gather Motion 2:1 (slice3; segments: 2) (cost=0.00..874.31 rows=1 width=8)
                -> Aggregate (cost=0.00..874.31 rows=1 width=8)
                     -> Hash Join (cost=0.00..874.31 rows=50000 width=1)
                          Hash Cond: (table_fact.a = table_non_replicated.a)
                          -> Redistribute Motion 2:2 (slice1; segments: 2) (cost=0.00..433.15 rows=50000 width=4)
                               Hash Key: table_fact.a
                               -> Seq Scan on table_fact (cost=0.00..432.15 rows=50000 width=4)
                          -> Hash (cost=431.22..431.22 rows=5000 width=4)
                               -> Redistribute Motion 2:2 (slice2; segments: 2) (cost=0.00..431.22 rows=5000 width=4)
                                    Hash Key: table_non_replicated.a
                                    -> Seq Scan on table_non_replicated (cost=0.00..431.12 rows=5000 width=4)
         Optimizer: PQO version 3.29.0
  • 17. H/W ● zStandard ● Facebook OSS ● ● ● CREATE TABLE ... WITH (compresstype=zstd)
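A minimal sketch of the `compresstype=zstd` option named on this slide might look like the following. The table name, columns, and compression level are illustrative assumptions, not taken from the deck. The sketch also assumes an append-optimized, column-oriented table, since Greenplum applies table-level compression to append-optimized storage.

```sql
-- Hypothetical fact table; the name, columns, and compresslevel are assumptions.
CREATE TABLE sales_zstd (
    sale_id   bigint,
    sale_date date,
    amount    numeric(12,2)
)
WITH (appendonly=true, orientation=column, compresstype=zstd, compresslevel=5)
DISTRIBUTED BY (sale_id);
```

Higher `compresslevel` values generally trade CPU time for a smaller on-disk footprint, so the right level depends on the workload.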
  • 19. SQL CTE Using RECURSIVE, a WITH query can refer to its own output
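As a small illustration of a `WITH RECURSIVE` query referring to its own output, the sketch below generates the integers 1 through 10 and sums them. The CTE name `counter` is made up for this example, and no tables are required.

```sql
WITH RECURSIVE counter(n) AS (
    SELECT 1                        -- seed (non-recursive) term
    UNION ALL
    SELECT n + 1 FROM counter       -- recursive term reads the CTE's own output
    WHERE n < 10                    -- termination condition
)
SELECT sum(n) AS total FROM counter;  -- sums the integers 1 through 10
```

The same pattern is commonly used to walk hierarchies such as org charts or bills of materials, with the recursive term joining a table back to the CTE.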
  • 20. ETL Writable CTE Data modifying CTE allows several different operations in the same query
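One way a data-modifying CTE can combine several operations in an ETL step is sketched below: rows are deleted from a staging table and inserted into an archive table in a single statement. Both table names are hypothetical and are assumed to share the same column layout. The form follows the PostgreSQL writable-CTE syntax, so any Greenplum-specific restrictions should be checked against the Greenplum documentation.

```sql
-- Hypothetical tables events_staging and events_archive with identical columns.
WITH moved AS (
    DELETE FROM events_staging
    WHERE event_date < DATE '2019-01-01'
    RETURNING *                     -- hand the deleted rows to the outer statement
)
INSERT INTO events_archive
SELECT * FROM moved;
```

Because the delete and the insert run as one statement, the move is atomic: either both happen or neither does.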
  • 21. Unlogged : ● WAL Unlogged ● : ● DB create unlogged table table_unlogged (a int , b text) distributed randomly;
  • 22.
  • 23. Private Cloud Bare-Metal Public Cloud Greenplum Building Blocks • The most performant way to run Greenplum on premises • Pivotal Blueprint for Dell reference hardware configs • Superior price/performance; no expensive proprietary hardware • Certified and supported by Pivotal Run Greenplum in Any Environment Greenplum for Kubernetes Other Kubernetes (on VMs or not) Google Container Engine Enterprise & Essentials (OSS K8s) • • : 100% •
  • 24. Public Cloud Run Greenplum in Any Environment
  • 25. ● ○ AI Pivotal Greenplum ○ ( ) ● ○ ○ ○ ○ ● ○ ○ DR AZ ○ HA
  • 26. ● 1 ● ● ( ) ● ● 5 ○ ● ● pgBouncer DB ● gpsnap/gpcronsnap - ● IaaS ● ● ● Azure Resource Group Deployment AWS CloudFormation GCP Deployment Manager V M V M V M V M V M X Data Volume Snapshot Restore
  • 27. Run Greenplum in Any Environment Greenplum for Kubernetes Other Kubernetes (on VMs or not) Google Container Engine Enterprise & Essentials(OSS K8s)
  • 29. ● Greenplum (Postgres) / Pod / VM(vMotion) ● Greenplum ● ● ● ● K8s worker 1 K8s worker n PKS/K8s cluster pod pod K8s worker VMs: 8 to 32 GB
  • 30. ● Greenplum ○ ○ ● Greenplum ○ Pod ● VM K8s Greenplum Pod ○ Pod Persistent Volume 1 . . n K8s worker 1 K8s worker n PKS/K8s cluster pod pod
  • 32. Pivotal 2km ATM 24 200 Peter Pavan

        drop function if exists get_people(text,text,integer,integer,float,float);

        CREATE FUNCTION get_people(text,text,integer,integer,float,float) RETURNS integer AS $$
        declare
          linkchk integer;
          v1 record;
          v2 record;
        begin
          execute 'truncate table results;';
          -- Candidates whose first name sounds like $1
          for v1 in
            select distinct a.id,a.firstname,a.lastname,amount,tran_date,c.lat,c.lng,address,a.description,d.score
            from people a, transactions b, location c,
                 (SELECT w.id, q.score
                  FROM people w,
                       gptext.search(TABLE(SELECT 1 SCATTER BY 1), 'gpadmin.public.people', 'Pivotal', null) q
                  WHERE (q.id::integer) = w.id
                  order by 2 desc) d
            where soundex(firstname)=soundex($1)
              and a.id=b.id
              and amount > $3
              and (extract(epoch from tran_date) - extract(epoch from now()))/3600 < $4
              and st_distance_sphere(st_makepoint($5, $6),st_makepoint(c.lng, c.lat))/1000.0 <= 2.0
              and b.locid=c.locid
              and a.id=d.id
          loop
            -- Candidates whose first name sounds like $2
            for v2 in
              select distinct a.id,a.firstname,a.lastname,amount,tran_date,c.lat,c.lng,address,a.description,d.score
              from people a, transactions b, location c,
                   (SELECT w.id, q.score
                    FROM people w,
                         gptext.search(TABLE(SELECT 1 SCATTER BY 1), 'gpadmin.public.people', 'Pivotal', null) q
                    WHERE (q.id::integer) = w.id
                    order by 2 desc) d
              where soundex(firstname)=soundex($2)
                and a.id=b.id
                and amount > $3
                and (extract(epoch from tran_date) - extract(epoch from now()))/3600 < $4
                and st_distance_sphere(st_makepoint($5, $6),st_makepoint(c.lng, c.lat))/1000.0 <= 2.0
                and b.locid=c.locid
                and a.id=d.id
            loop
              -- Breadth-first search from v1; keep the pair if v2 is at distance 1
              execute 'DROP TABLE IF EXISTS out, out_summary;';
              execute 'SELECT madlib.graph_bfs(''people'',''id'',''links'',NULL,'||v1.id||',''out'');';
              select 1 into linkchk from out where dist=1 and id=v2.id;
              if linkchk is not null then
                insert into results values (v1.id,v1.firstname,v1.lastname,v1.amount,v1.tran_date,v1.lat,v1.lng,v1.address,v1.description,v1.score);
                insert into results values (v2.id,v2.firstname,v2.lastname,v2.amount,v2.tran_date,v2.lat,v2.lng,v2.address,v2.description,v2.score);
              end if;
            end loop;
          end loop;
          return 0;
        end
        $$ LANGUAGE plpgsql;

        -- person 1, person 2, amount, duration in hours, longitude, latitude (in question)
        select get_people('Pavan','Peter',200,24,103.912680, 1.309432);

    ● Greenplum PostGIS functions st_distance_sphere() and st_makepoint() calculate the distance between the ATM location and the reference lat/long (< 2 km)
    ● The GPText.search() function is used to check whether both people work at ‘Pivotal’
    ● Greenplum and Apache MADlib BFS search to check whether there are direct or indirect links between the people
    ● Greenplum fuzzy string match function soundex() to check whether a person's name sounds like ‘Pavan’ or ‘Peter’
    ● Greenplum time functions to calculate whether the time of the withdrawal is within 24 hours
    ● Amount > $200

    “Pivotal - GPText Peter Pavan - Fuzzy String Match - Apache MADlib 2km ATM” - PostGIS 24 ” / 200 ”
  • 33. : 3,000+ vs 34 LOAD customer data from HDFS and put to HIVE DESCRIPTION Column needs to be indexed SEARCH IN Column & WRITE Result to HDFS WRITE CODE : Pulling Data Into Spark Data Frame WRITE CODE : CHECK Soundex WRITE CODE : MATCH SOLR Result WRITE CODE : GRAPH LINK Analysis WRITE CODE : PostGIS Distance Calculation WRITE CODE : GRAPH LINK Analysis WRITE CODE : WRITE RESULTS TO HIVE TABLE “Investigate a crime suspect whose name sounds like ‘Pavan’, who knows Peter directly, who withdrew Peter’s $500 at an ATM located 2km from Changi yesterday.” Using a Hadoop Ecosystem: 10 steps, 3000+ Lines of code across 4 different systems 1 2 3 4 5 6 7 8 9 10 Using Greenplum: 1 step, 1 query – 34 Lines of Code One query – using built-in functions: Soundex (sounds like), NLP (work at same company), Machine Learning MADlib (know directly), Time (yesterday), PostGIS (within 2km)
  • 35. Pivotal Greenplum OSS • • 50 • Pivotal Greenplum • Apache 2017 7 : http://madlib.apache.org Apache MADlib • • • http://lucene.apache.org/solr/ Apache Solr • PostgreSQL OSS • • • http://postgis.net/ PostGIS • • • R • https://www.r-project.org/ R • • • • • https://www.python.org/ Python
  • 36. In-DB • Open source https://github.com/apache/madlib • Downloads and docs http://madlib.apache.org/ • Wiki https://cwiki.apache.org/confluence/display/MADLIB/ Apache MADlib: SQL Apache PostgreSQL & Greenplum
  • 37. Functions Data Types and Transformations Array and Matrix Operations Matrix Factorization • Low Rank • Singular Value Decomposition (SVD) Norms and Distance Functions Sparse Vectors Encoding Categorical Variables Path Functions Pivot Sessionize Stemming May 2018 Graph All Pairs Shortest Path (APSP) Breadth-First Search Hyperlink-Induced Topic Search (HITS) Average Path Length Closeness Centrality Graph Diameter In-Out Degree PageRank and Personalized PageRank Single Source Shortest Path (SSSP) Weakly Connected Components Model Selection Cross Validation Prediction Metrics Train-Test Split Statistics Descriptive Statistics • Cardinality Estimators • Correlation and Covariance • Summary Inferential Statistics • Hypothesis Tests Probability Functions Supervised Learning Neural Networks Support Vector Machines (SVM) Conditional Random Field (CRF) Regression Models • Clustered Variance • Cox-Proportional Hazards Regression • Elastic Net Regularization • Generalized Linear Models • Linear Regression • Logistic Regression • Marginal Effects • Multinomial Regression • Naïve Bayes • Ordinal Regression • Robust Variance Tree Methods • Decision Tree • Random Forest Time Series Analysis • ARIMA Unsupervised Learning Association Rules (Apriori) Clustering (k-Means) Principal Component Analysis (PCA) Topic Modelling (Latent Dirichlet Allocation) Utility Functions Columns to Vector Conjugate Gradient Linear Solvers • Dense Linear Systems • Sparse Linear Systems Mini-Batching PMML Export Term Frequency for Text Vector to Columns Nearest Neighbors • k-Nearest Neighbors Sampling Balanced Random Stratified
  • 38. Greenplum Standby Master … Master Host SQL Interconnect Segment Host Node1 Segment Host Node2 Segment Host Node3 Segment Host Node N GPU N … GPU 1 GPU N … GPU 1 GPU N … GPU 1 … GPU N … GPU 1 In-Database Functions Machine learning & statistics & math & graph & utilities Massively Parallel Processing Best of both worlds: GPU-focused and CPU-focused data science workloads ● Unified platform for full range of data science workloads ● Higher productivity due to no data movement ● Persistent data storage and management integrated with core machine learning & API compute engine Supporting the full spectrum of data science workloads: Data preparation, feature generation, machine learning, geospatial, deep learning, etc.
  • 40. ? 40
  • 43. PXF
  • 44. 44© Copyright 2019 Pivotal. All rights reserved. (GIS) : (NICT) • ( ) • Pivotal Greenplum • PostGIS: OSS • Apache MADlib: OSS • • • • •
  • 46. PIVOTAL GREENPLUM Structured Data JDBC, ODBC SQL ANSI SQL RDBMS Spark GemFire HDFS JSON, Apache Avro, Apache Parquet and XML Teradata SQL DB SQL Apache MADlib / / Python, R, Java, Perl, C Apache Solr PostGIS Custom Apps BI / Reporting Machine Learning AI Pivotal Greenplum Kafka ETL Spring Cloud Data Flow (MPP) PostgreSQL (GPORCA) Command Center SQL (Hyper-Q) IT
  • 47. ● Pivotal Greenplum 3 Ø ● Ø & Ø PostgreSQL DWH ● Greenplum 6 ● DWH Pivotal Greenplum ● ● TW: @greenplummy ● connpass: https://pivotal-japan.connpass.com/
  • 48. © Copyright 2019 Pivotal Software, Inc. All rights Reserved. Thank You