A NEW PLATFORM FOR A NEW ERA
SK Krishnamurthy
2© Copyright 2013 Pivotal. All rights reserved.
Agenda
 HAWQ failover and HA now
 HAWQ HA upcoming release
 What’s new in PHD 1.1
 Pivotal Command Center new features
 Discuss roadmap in conjunction with AMEX requirements
 Open discussion: SAW, PHD 1.1 upgrade, …
3© Copyright 2013 Pivotal. All rights reserved.
HAWQ - Availability
Nov 25, 2013
4© Copyright 2013 Pivotal. All rights reserved.
Deployment Model – Sample HAWQ Cluster
[Diagram: HAWQ Primary Master (PM), HAWQ Standby Master (SM), HDFS Primary NameNode (PNN) and Secondary NameNode (SNN), plus segment hosts, each running a DataNode (DN) and multiple HAWQ segment servers (SS).]
5© Copyright 2013 Pivotal. All rights reserved.
HAWQ Master Fails
Action | Availability | Notes
HAWQ Cluster | Yes (with downtime) | HAWQ cluster available. How do clients connect to the SM? Manual process to connect to the standby master, similar to GPDB.
Current “SELECT” queries | Aborted | Users need to restart the query.
Current Transaction | Aborted | Dirty data & temp files will be removed.
New “SELECT” & transaction | Yes | SM will continue to process queries.
6© Copyright 2013 Pivotal. All rights reserved.
HAWQ Master Fails
• Execution coordinator resides on master
• Distributed transaction master resides on master
• Log copied up to last committed transaction
• Run gpactivatestandby on secondary master
• Either VIP or DNS hostname change to re-route client connections
7© Copyright 2013 Pivotal. All rights reserved.
HAWQ Master & Standby Master Fail

Action | Availability | Notes
HAWQ Cluster | Un-Available | Cluster is considered to be down.
Current “SELECT” queries | Aborted | Can’t restart the query.
Current Transaction | Aborted | Dirty data & temp files will be removed.
New “SELECT” & Transaction query | Not possible |
8© Copyright 2013 Pivotal. All rights reserved.
HAWQ Master & Standby Master Fail
• Configure RAID 10 for HAWQ master so primary segment data directory is never lost
9© Copyright 2013 Pivotal. All rights reserved.
PNN Fails
Action | Availability | Notes
HAWQ Cluster | Yes (with downtime) | Metadata queries can be carried out, but no other queries. No DDL or DML.
Current “SELECT” queries | Aborted | Users need to restart the query.
Current Transaction | Aborted | After the PNN is up, dirty data & temp files will be removed.
New “SELECT” & Transaction query | Not possible |
• PHD 1.1:
• (Option 1) Manually bring up the PNN. HAWQ cannot switch to the secondary name node.
• (Option 2) The HDFS admin changes the FQDN or IP address of the secondary NN to that of the PNN.
• The HAWQ master keeps trying to connect to the PNN; when it finds one, the cluster becomes operational.
• PHD 1.1.1 (Dec ’13)
• QA-verified testing of the above 2 options.
10© Copyright 2013 Pivotal. All rights reserved.
PNN Fails
• Normal HDFS failover process
• Change DNS name of secondary NN to the current NN
• Namenode service will be supported in PHD 1.2 (February)
11© Copyright 2013 Pivotal. All rights reserved.
PNN & Secondary NN Fail
Action | Availability | Notes
HAWQ Cluster | No | Metadata queries can be carried out, but no other queries. No DDL or DML.
Current “SELECT” queries | Aborted | Users need to restart the query.
Current Transaction | Aborted | After the PNN is up, dirty data & temp files will be removed.
New “SELECT” & Transaction query | Not possible |
12© Copyright 2013 Pivotal. All rights reserved.
PNN & Secondary NN Fail
• No split information
• No transactions
13© Copyright 2013 Pivotal. All rights reserved.
Secondary NN Fail
Action | Availability | Notes
HAWQ Cluster | Yes | Fully available
Current “SELECT” queries | Yes |
Current Transaction | Yes |
New “SELECT” & Transaction query | Yes |
14© Copyright 2013 Pivotal. All rights reserved.
A Segment Fails
Action | Availability | Notes
HAWQ Cluster | Yes | HAWQ Cluster available.
Current “SELECT” queries | Aborted | Users need to restart the query.
Current Transaction | Aborted | Dirty data & temp files will be removed.
New “SELECT” & Transaction query | Yes | Remaining segments will handle the query.
15© Copyright 2013 Pivotal. All rights reserved.
A Segment Fails
• Segment QEs (Query Executors) are killed
• HAWQ does not materialize intermediate results
• Local actions by QEs are not committed
• Segment QEs are started by other segments in subsequent queries
• QE substitution is random
• A future release will add an option to materialize work files
16© Copyright 2013 Pivotal. All rights reserved.
Multiple Segments Fail

Action | Availability | Notes
HAWQ Cluster | Yes | HAWQ Cluster available.
Current “SELECT” queries | Aborted | Users need to restart the query.
Current Transaction | Aborted | Dirty data & temp files will be removed.
New “SELECT” & Transaction query | Yes | Remaining segments will handle the query.
17© Copyright 2013 Pivotal. All rights reserved.
DN Fails
Action | Availability | Notes
HAWQ Cluster | Yes | HAWQ Cluster available.
Current “SELECT” queries | Yes | SS will automatically connect to a remote DN in the middle of the currently executing query.
Current Transaction | Yes | Transaction will finish successfully.
New “SELECT” & Transaction query | Yes |
• PHD 1.1:
• No impact. SS will continue to work with a remote DN.
• Loss of data locality might introduce a slight performance impact. On a 10G network the impact is measured at around 10% for large queries; simple queries might experience a 50% performance impact.
18© Copyright 2013 Pivotal. All rights reserved.
DN Fails
• libhdfs falls back to reading from another HDFS replica
• Short-term performance loss until the NN marks the DN as dead
19© Copyright 2013 Pivotal. All rights reserved.
Segment Host Dies
Action | Availability | Notes
HAWQ Cluster | Yes | HAWQ Cluster available.
Current “SELECT” queries | Aborted | Users need to restart the query.
Current Transaction | Aborted | Dirty data & temp files will be removed.
New “SELECT” & Transaction query | Yes | Remaining segments will handle the query.
20© Copyright 2013 Pivotal. All rights reserved.
Single Disk Failure in DN
 JBOD
– If tempdata is not on the failed disk, there is no impact on the cluster or queries.
– If tempdata is configured to be on the failed disk:
▪ Small queries will run, but large queries with too much temporary data will be impacted.
▪ Transactions will be aborted; new transactions will continue if multiple disks are configured to contain tempdata.
 RAID 5
– No impact.
– Possible performance loss.
 RAID 10
– No Impact & no performance loss.
21© Copyright 2013 Pivotal. All rights reserved.
HAWQ HA on roadmap
 Automatic Namenode HA supported on PHD now
 Automatic Namenode HA (name service) supported by HAWQ in
February release
 PXF to also support NN service
 No interruption in query execution during NN failure
 HAWQ HA unchanged
22© Copyright 2013 Pivotal. All rights reserved.
What’s New in
Pivotal HD 1.1
November 7th, 2013
23© Copyright 2013 Pivotal. All rights reserved.
Key Themes of PivotalHD 1.1 Release
 Leverage more data, in real time, more easily to gain
competitive advantage
 Richer services and tools to create broader set of
applications
 Deeper, streamlined administrative capabilities for enterprise
deployments
24© Copyright 2013 Pivotal. All rights reserved.
Pivotal HD Architecture
[Architecture diagram: Pivotal HD Enterprise]
• Apache components: HDFS, HBase, Pig, Hive, Mahout, MapReduce, Sqoop, Flume, YARN (resource management & workflow), Zookeeper, Oozie, Vaidya
• Pivotal components: Pivotal Command Center (configure, deploy, monitor, manage), Data Loader, Spring, Unified Storage Service, Xtension Framework, Hadoop Virtualization Extension
• HAWQ – Advanced Database Services: ANSI SQL + Analytics, Dynamic Pipelining, Query Optimizer, Catalog Services, MADlib Algorithms
• GemFire XD – Real-Time Database Services: ANSI SQL + In-Memory, Distributed In-memory Store, Query, Transactions, Ingestion, Processing, Hadoop Driver – Parallel with Compaction
25© Copyright 2013 Pivotal. All rights reserved.
GemFire XD : Delivers
Enterprise real-time data processing platform for SLA-critical applications; enables users to rapidly and reliably analyze & react to high volumes of events while leveraging 10s of TBs of in-memory reference data.

Cloud Scale Real-Time Platform
• Very low & predictable latencies at high & variable loads
• 10s of TBs in-memory (Memscale)
• Multi-tiered caching
• Efficient in-memory M-R

Optimized for Real-Time Analytics
• Real-time event processing
• Continuous querying
• SQL based queries
• Support for structured and semi-structured* data
• Java stored procedures
• Deep Spring Data integration
• Native support for JSON and Objects (Java, C++, C#)*

Seamless Pivotal HD Integration
• Scale to HDFS with policy-driven in-memory data retention
• Online and offline querying of HDFS data
• ETL-less bi-directional integration with other Pivotal HD services

Enterprise-Class Reliability
• JTA distributed transactions
• HA through in-memory redundancy
• Reliable event propagation
• Active-active deployments across WAN

* EA / Not in 1.0
26© Copyright 2013 Pivotal. All rights reserved.
What’s New in Pivotal HD 1.1

Feature | Benefit
Command Center:
  Install Wizard | Faster, easier set up and configuration of HD cluster
  Start/Stop Services | Point/click control of multiple services through a central interface
HAWQ:
  UDF (partial) – C, PL/pgsql; pgcrypto, orafce | Enable richer data processing and analytics functionality leveraging existing SQL skill sets
  Kerberos Support | Tightly integrated security with HDFS
  PXF: Writable HDFS Table Support | Easily export HAWQ data to HDFS for external consumption
  HAWQ Input Format Reader | Directly leverage HAWQ data in MapReduce, Pig and Hive
  Diagnostic Tools | Lower administration costs
  Improved Query Planner | “Orca” enabled to provide more efficient query plans
27© Copyright 2013 Pivotal. All rights reserved.
What’s New in Pivotal HD 1.1

Feature | Benefit
Install/Config (ICM) CLI:
  Add/Remove Services | Faster, easier set up and administration of services (e.g. HBase, GemFire XD, etc.)
  Upgrade | Streamlined, low-risk upgrade from 1.0.1 to 1.1
Apache Hadoop Components:
  Hadoop to 2.0.5 and select 2.0.6 patches | Greater stability and lower risk based on critical defect fixes incorporated
  Oozie 3.3.2 | Orchestrate data processing (e.g. MR, Pig) job pipelines with dependencies
  Hive 11 (incl. HCatalog and Hiveserver2) | Significant improvements in functionality, scalability and security
  HBase 0.94.8 | Enables snapshots of tables without overhead to the Region Servers
  RHEL 6.4 Certification | Enhanced performance optimizations and security improvements
28© Copyright 2013 Pivotal. All rights reserved.
What’s New in Pivotal HD 1.1

Feature | Benefit
Platform and Security:
  Kerberos Support – HDFS, HAWQ, Unified Storage Service (PXF to be supported in Dec 2013) | Tighter governance, risk and compliance
  JRE 1.7.0.15 support | Supported platform; JRE 1.6 is end of life
  RHEL 6.4 (FIPS) certification | Federal standard for cryptography modules
  pgcrypto for HAWQ | Flexible and robust encryption of sensitive data
Tools:
  Unified Storage Service: CDH4 as a data source | Stream data from CDH4
  Data Loader – Push Stream API, Spring XD front end for Twitter | Integration support for wider variety of data sources
29© Copyright 2013 Pivotal. All rights reserved.
Command Center Cluster Deployment Wizard
• Performs “Host
Verification” to determine
host eligibility to be added
to cluster
30© Copyright 2013 Pivotal. All rights reserved.
Command Center Cluster Deployment Wizard
• Easily Add Eligible Nodes to
Roles
• Basic Validation of Layout
• Checkbox Add/Remove
Services
• Ability to Download
Configuration Locally
Recorded Demo can be found -> Here
31© Copyright 2013 Pivotal. All rights reserved.
Orca - Improved Optimizer
 Pluggable architecture, allowing faster innovation and quicker iteration on
quality improvements
 Subset of improved functionality:
• Parity with Planner
• Improved join-ordering
• Join-Aggregate re-ordering
• Sub-query de-correlation
• Optimal sort-orders
• Full integration of data (re-)distribution
• Contradiction detection
• Elimination of redundant joins
• Smarter Partition scan
• Star-join optimization
• Skew aware
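A hedged sketch of how the two planners can be compared, assuming the GUC is named "optimizer" as in later HAWQ/Greenplum releases (the exact setting in PHD 1.1 may differ, and the table is hypothetical):

SET optimizer = on;    -- route planning through Orca for this session
EXPLAIN SELECT region, count(*) FROM sales GROUP BY region;    -- inspect the Orca-generated plan
SET optimizer = off;   -- fall back to the legacy planner for comparison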
32© Copyright 2013 Pivotal. All rights reserved.
What’s new in PXF
 Profiles
 Writable external tables
 Hive partition pruning, HBase filtration
 Additional connectors & CSV support
 Complete extensibility
 Roadmap
– Security & authentication
– Multi-FS support & other distributions via OS
– Stand-alone service
33© Copyright 2013 Pivotal. All rights reserved.
Why Pivotal HD?
 Big Data + Fast Data
 The first enterprise grade platform that provides OLAP
and OLTP with HDFS as the common data substrate
 Enables closed loop analytics, real-time event
processing and high speed data ingest
34© Copyright 2013 Pivotal. All rights reserved.
HAWQ Format Reader

[Diagram: a Java program (e.g. a MapReduce job) uses the HAWQ Reader (jar file) to read HAWQ-format files from HDFS]
1. A request is made for where the files for a specific “table” exist.
2. The location of those files is returned.
3. HDFS files in HAWQ format are streamed to the Reader.

Recorded Demo can be found -> Here
35© Copyright 2013 Pivotal. All rights reserved.
Oozie now Included and Supported with PHD
 Oozie is a workflow scheduler system to manage Apache Hadoop jobs.
 Oozie Workflow jobs are Directed Acyclical Graphs (DAGs) of actions.
 Oozie Coordinator jobs are recurrent Oozie Workflow jobs triggered by time
(frequency) and data availability.
 Oozie is integrated with the rest of the Hadoop stack supporting several types of
Hadoop jobs out of the box (such as Java map-reduce, Streaming map-reduce,
Pig, Hive, Sqoop and Distcp) as well as system specific jobs (such as Java
programs and shell scripts).
 Oozie is a scalable, reliable and extensible system.
36© Copyright 2013 Pivotal. All rights reserved.
Matrix of what is supported via Install method
37© Copyright 2013 Pivotal. All rights reserved.
Security Dashboard (items in bold tested; rest are scheduled)

Component | Supports secure cluster | Supports Kerberos for authentication | Supports LDAP for authentication
HDFS | Yes | Yes | Linux OS supports
MapReduce/Pig | Yes | N/A |
Hive | Yes (standalone mode) | N/A |
Hiveserver | No | No |
Hiveserver2 | Yes | Yes | Yes
HBase | Yes | Yes | Yes
HAWQ* | Yes | Yes | Yes
GemFire XD | Yes | Yes | Yes

* Except PXF; scheduled for Dec (PHD 1.1.1 release)
38© Copyright 2013 Pivotal. All rights reserved.
Vaidya
39© Copyright 2013 Pivotal. All rights reserved.
Roadmap
Open Discussion
Nov 25, 2013
40© Copyright 2013 Pivotal. All rights reserved.
Roadmap – Action Items
 Error tables released in PHD 1.2 (February)
– Current workaround
 PCC new features?!
 SAW integration
 PHD 1.1 upgrade planning
41© Copyright 2013 Pivotal. All rights reserved.
Appendix
Nov 25, 2013
42© Copyright 2013 Pivotal. All rights reserved.
HAWQ
Nov 25, 2013
43© Copyright 2013 Pivotal. All rights reserved.
History
 HAWQ 1.0 (March release)
– True SQL Engine in Hadoop
▪ SQL 92, 99 & 2003 OLAP extensions
▪ JDBC/ODBC
– Basic SQL functionalities
▪ DDL and DML
– High availability feature
– Transaction support
 HAWQ 1.1 (June release)
– JBOD support feature
 HAWQ 1.1.1 (August release)
– HDFS access layer read fault tolerance support
– HAWQ diagnosis tool
– ORCA enabled
 HAWQ 1.1.2 (September release)
– HAWQ MR Inputformat for AO tables
– HDFS access layer write fault tolerance support
– HDFS 2.0.5 support
 HAWQ 1.1.3 (Oct release)
– HAWQ Kerberos support
– HAWQ on secure HDFS
– UDF
 HAWQ 1.1.4 (Dec release)
– Gptoolkit
– UDF enhancement
– Manual failover for HDFS HA
 HAWQ 1.2 (Feb release)
– Parquet storage support
– HAWQ MR Inputformat
– Automatic failover for HDFS HA
– …
44© Copyright 2013 Pivotal. All rights reserved.
[Diagram: HAWQ & HDFS master servers (planning & dispatch) connect over the network interconnect to segment servers (query execution), which sit on top of storage (HDFS, HBase, …).]
45© Copyright 2013 Pivotal. All rights reserved.
[Diagram: the master host issues metadata operations to the Namenode; segment hosts each run several segments alongside a Datanode; segments read/write blocks that HDFS replicates across Rack1 and Rack2, and communicate over the GPDB interconnect.]
46© Copyright 2013 Pivotal. All rights reserved.
Query execution flow
47© Copyright 2013 Pivotal. All rights reserved.
Parallel Query Optimizer
• Converts SQL into a physical execution plan
– Cost-based optimization looks for the most efficient plan
– Physical plan contains scans, joins, sorts, aggregations, etc.
– Global planning avoids sub-optimal ‘SQL pushing’ to segments
– Directly inserts ‘motion’ nodes for inter-segment communication
• ‘Motion’ nodes for efficient non-local join processing
(Assume table A is distributed across all segments – i.e. each has AK)
– Broadcast Motion (N:N)
• Every segment sends AK to all other segments
– Redistribute Motion (N:N)
• Every segment rehashes AK (by join column) and redistributes each row
– Gather Motion (N:1)
• Every segment sends its AK to a single node (usually the master)
48© Copyright 2013 Pivotal. All rights reserved.
Example of Parallel Query Optimization
select
c_custkey, c_name,
sum(l_extendedprice * (1 - l_discount)) as revenue,
c_acctbal, n_name, c_address, c_phone, c_comment
from
customer, orders, lineitem, nation
where
c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate >= date '1994-08-01'
and o_orderdate < date '1994-08-01'
+ interval '3 month'
and l_returnflag = 'R'
and c_nationkey = n_nationkey
group by
c_custkey, c_name, c_acctbal,
c_phone, n_name, c_address, c_comment
order by
revenue desc
Gather Motion 4:1 (slice 3)
  Sort
    HashAggregate
      HashJoin
        Redistribute Motion 4:4 (slice 1)
          HashJoin
            Seq Scan on lineitem
            Hash
              Seq Scan on orders
        Hash
          HashJoin
            Seq Scan on customer
            Hash
              Broadcast Motion 4:4 (slice 2)
                Seq Scan on nation
49© Copyright 2013 Pivotal. All rights reserved.
Interconnect
• UDP based
• Flow control
50© Copyright 2013 Pivotal. All rights reserved.
Metadata dispatch
• Metadata dispatch
• Stateless segments
– Read only metadata on segment
51© Copyright 2013 Pivotal. All rights reserved.
Transaction
 Full transaction support for tables on HDFS
– When a load transaction is aborted, some garbage data is left at the end of the file. On HDFS-like systems, data cannot be truncated or overwritten.
 Methods to process the partial data to support transactions:
– Option 1: Load data into a separate HDFS file. Unlimited number of files.
– Option 2: Use metadata to record the boundary of garbage data, and implement a kind of vacuum mechanism.
– Option 3: Implement HDFS truncation.
 HDFS truncate was added to support transactions
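A minimal, hypothetical illustration of the load-abort case above (table names are made up): an aborted load leaves unreferenced bytes at the end of the HDFS file, which the options above bound or reclaim.

BEGIN;
INSERT INTO sales SELECT * FROM staging_sales;   -- hypothetical load into an append-only table on HDFS
ROLLBACK;   -- load aborted: rows are not visible; trailing garbage in the HDFS file is handled as described above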
52© Copyright 2013 Pivotal. All rights reserved.
Transaction
 Snapshot isolation
 Simplified Transaction Model Support
– Simplified two phase commit
53© Copyright 2013 Pivotal. All rights reserved.
Transaction support
• Methods to process the partial data to support transactions:
– Option 1: Load data into a separate HDFS file. Unlimited number of files.
– Option 2: Use metadata to record the boundary of garbage data, and implement a kind of vacuum mechanism.
– Option 3: Implement HDFS truncation.
54© Copyright 2013 Pivotal. All rights reserved.
Pluggable storage
• Read Optimized/Append only storage
• Column store
– Compressions: quicklz, zlib, RLE
– Partitioned tables hit HDFS limitation
• Parquet
– Open source format
– PAX like column store
– Snappy, gzip
• MR Input/Output format
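As a rough sketch of the append-only, column-oriented storage options above, using the table syntax familiar from Greenplum-derived databases (table and column names are illustrative; option names should be checked against the target HAWQ release):

-- Append-only, column-oriented table with quicklz compression (illustrative names)
CREATE TABLE sales_ao_col (
    sale_id   bigint,
    sale_date date,
    amount    numeric
)
WITH (appendonly=true, orientation=column, compresstype=quicklz, compresslevel=1)
DISTRIBUTED BY (sale_id);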
55© Copyright 2013 Pivotal. All rights reserved.
HDFS C client: why
• libhdfs (Current HDFS c client) is based on JNI. It is difficult to make
HAWQ support a large number of concurrent queries.
• Example:
– 4 segments on each segment hosts
– 50 concurrent queries
– each query has 16 QE processes that do scan
– there will be about 800 processes that start 800 JVMs to access HDFS.
– If each JVM uses 500MB memory, the JVMs will consume 800 * 500M =
400G memory.
– Thus naïve usage of libhdfs is not suitable for HAWQ. Currently we have three options to solve this problem.
56© Copyright 2013 Pivotal. All rights reserved.
HDFS client: three options
• Option 1: use HDFS FUSE. HDFS FUSE introduces some
performance overhead. And the scalability is not verified yet.
• Option 2 (libhdfs2): implement a webhdfs-based C client. webhdfs is based on HTTP, which also introduces some cost; performance should be benchmarked. The webhdfs-based method has several benefits, such as ease of implementation and low maintenance cost.
• Option 3 (libhdfs3): implement a C RPC interface that directly communicates with the NameNode and DataNode. Requires many changes whenever the RPC protocol changes.
57© Copyright 2013 Pivotal. All rights reserved.
PXF
Nov 25, 2013
58© Copyright 2013 Pivotal. All rights reserved.
PXF is...
A fast extensible framework
connecting Hawq to a data
store of choice that exposes a
parallel API
59© Copyright 2013 Pivotal. All rights reserved.
Hawq External Tables
• gpfdist
– remote delimited text (or csv) files.
• file
– text files on segment filesystem.
• execute
– script execution and produced data
• pxf
– text and binary data from available pxf connectors (mostly HD based).
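For context, a minimal readable external table over the gpfdist protocol listed above might look like the sketch below (host, port, path and columns are placeholders, not values from this deck):

CREATE EXTERNAL TABLE ext_events (event_id int, event_time timestamp, payload text)
LOCATION ('gpfdist://etl-host:8081/events/*.csv')
FORMAT 'CSV' (HEADER);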
60© Copyright 2013 Pivotal. All rights reserved.
Steps
• Step 1: GRANT ON PROTOCOL pxf
• Step 2: Define a PXF table
– Pick built-in plugins right for the job
– Specify data source of choice
– Map remote data fields to Hawq db attributes (plugin
dependent)
• Step 3: Query the PXF table.
– Directly
– Or copy to a Hawq table first
CREATE EXTERNAL TABLE foo(<col list>)
LOCATION (‘pxf://<host:port>/<data source>?<plugin options>’)
FORMAT ‘<type>’ (<params>)
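Putting the three steps together, a hedged end-to-end example could look like the following; the role, host:port, path, columns and profile are placeholders rather than values from the deck:

-- Step 1: allow a (hypothetical) role to read through the pxf protocol
GRANT SELECT ON PROTOCOL pxf TO analyst;

-- Step 2: define a PXF external table over delimited text on HDFS
CREATE EXTERNAL TABLE ext_sales (sale_id int, region text, amount numeric)
LOCATION ('pxf://<host:port>/data/sales?profile=HdfsTextSimple')
FORMAT 'TEXT' (DELIMITER ',');

-- Step 3: query it directly, or copy it into a HAWQ table first
SELECT region, sum(amount) FROM ext_sales GROUP BY region;
CREATE TABLE sales AS SELECT * FROM ext_sales;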
61© Copyright 2013 Pivotal. All rights reserved.
62© Copyright 2013 Pivotal. All rights reserved.
63© Copyright 2013 Pivotal. All rights reserved.
64© Copyright 2013 Pivotal. All rights reserved.
New Features
Main additions since PHD1.0
65© Copyright 2013 Pivotal. All rights reserved.
User Experience
66© Copyright 2013 Pivotal. All rights reserved.
User Experience
• Improved/Informative error messages.
• Profiles
LOCATION(‘pxf://<host:port>/sales?fragmenter=HiveFragmenter&accessor=HiveAccessor&resolver=HiveResolver’)

LOCATION(‘pxf://<host:port>/sales?profile=Hive’)
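A hedged example of the profile shorthand inside a full table definition; the Hive table name, column list and custom formatter are assumptions, not values from the slide:

CREATE EXTERNAL TABLE hive_sales (product text, qty int, price numeric)
LOCATION ('pxf://<host:port>/sales?profile=Hive')
FORMAT 'CUSTOM' (formatter='pxfwritable_import');

SELECT product, sum(qty * price) FROM hive_sales GROUP BY product;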
67© Copyright 2013 Pivotal. All rights reserved.
profiles.xml
<profile>
<name>HBase</name>
<description>Used for connecting to an HBase data store engine</description>
<plugins>
<fragmenter>HBaseDataFragmenter</fragmenter>
<accessor>HBaseAccessor</accessor>
<resolver>HBaseResolver</resolver>
<myidentifier>MyValue</myidentifier>
</plugins>
</profile>
68© Copyright 2013 Pivotal. All rights reserved.
profiles.xml
<profile>
<name>HdfsTextSimple</name>
<description>Used when reading delimited single line records from plain text files on HDFS
</description>
<plugins>
<fragmenter>HdfsDataFragmenter</fragmenter>
<accessor>LineBreakAccessor</accessor>
<resolver>StringPassResolver</resolver>
<analyzer>HdfsAnalyzer</analyzer> <!-- soon to be added -->
</plugins>
</profile>
69© Copyright 2013 Pivotal. All rights reserved.
profiles.xml
<profile>
<name>MyCustomProfile</name>
<description>Used with a new set of plugins I wrote</description>
<plugins>
<fragmenter>MyFragmenter</fragmenter>
<accessor>MyAccessor</accessor>
<resolver>MyResolver</resolver>
<analyzer>MyAnalyzer</analyzer>
</plugins>
</profile>
Add your own profiles
70© Copyright 2013 Pivotal. All rights reserved.
Export to HDFS
71© Copyright 2013 Pivotal. All rights reserved.
Writable PXF
• gphdfs-like functionality
– but extensible…
– currently supports text, csv, SequenceFile
– supports various Hadoop compression codecs

CREATE WRITABLE EXTERNAL TABLE ...
LOCATION(‘pxf://<host:port>/sales?profile=HdfsTextSimple&COMPRESSION_CODEC=org.apache.hadoop.io.compress.GzipCodec')
FORMAT ‘text’ (delimiter ‘,’);

You can create a new profile “HdfsTextSimpleGZipped” that includes compression_codec:
LOCATION(‘pxf://<host:port>/sales?profile=HdfsTextSimpleGZipped')
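A minimal export sketch based on the snippet above; the source table "sales" and the target HDFS path are hypothetical:

-- Export a HAWQ table to gzip-compressed delimited text on HDFS
CREATE WRITABLE EXTERNAL TABLE sales_export (LIKE sales)
LOCATION ('pxf://<host:port>/exports/sales?profile=HdfsTextSimple&COMPRESSION_CODEC=org.apache.hadoop.io.compress.GzipCodec')
FORMAT 'TEXT' (DELIMITER ',');

INSERT INTO sales_export SELECT * FROM sales;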
72© Copyright 2013 Pivotal. All rights reserved.
New Connectors
73© Copyright 2013 Pivotal. All rights reserved.
New Connectors
• GemFire XD (Released. GA February)
• JSON (On github. GA February (r+w))
• Accumulo (On github. GA version being coded by Clearedge. GA February)
• Cassandra (On github. Alpha)
None of them was written by the PXF Dev team… a testament to extensibility.
74© Copyright 2013 Pivotal. All rights reserved.
Feature Summary
★ HBase (w/filter pushdown)
★ Hive (w/partition exclusion. various storage file types)
★ HDFS Files: read (delimited text, csv, Sequence, Avro)
★ HDFS Files: write (delimited text, csv, Sequence, various compression
codecs and options)
★ GemFireXD, JSON format, Cassandra, Accumulo (currently Beta)
★ Stats collection
★ Automatic data locality optimizations
★ Extensibility!
75© Copyright 2013 Pivotal. All rights reserved.
Coming Up Very Soon...
★ Isilon Integration
★ Kerberized HDFS Support
★ Namenode High Availability
76© Copyright 2013 Pivotal. All rights reserved.
Limitations
• Local metadata of external data
– Will be made more transparent when UCS exists.
• Authentication and Authorization of external systems
– Will be made simpler when centralized user mgmt exists.
• Currently supporting local PHD only
• Error tables not yet supported
• Sharing space with Name/DataNode
77© Copyright 2013 Pivotal. All rights reserved.
Writing a plugin
steps and guidelines
78© Copyright 2013 Pivotal. All rights reserved.
Main Steps
1. Verify P-HD running and PXF installed
a. SingleCluster, AllInAll, SingleNode VM
2. Implement the PXF plugin API for your connector
(Java)
a. Use the PXF API doc as a reference
3. Compile your connector classes and add them to the
hadoop classpath on all nodes
4. Restart PHD (won’t be necessary in the future)
5. Add a profile (optional)
79© Copyright 2013 Pivotal. All rights reserved.
Plugins
• Fragmenter – returns a list of source data fragments and their locations
• Accessor – accesses a given list of fragments, reads them and returns records
• Resolver – deserializes each record according to a given schema or technique
• Analyzer – returns statistics about the source data
80© Copyright 2013 Pivotal. All rights reserved.
Thanks!
Nov 25, 2013
More Related Content

What's hot

(SPOT305) Event-Driven Computing on Change Logs in AWS | AWS re:Invent 2014
(SPOT305) Event-Driven Computing on Change Logs in AWS | AWS re:Invent 2014(SPOT305) Event-Driven Computing on Change Logs in AWS | AWS re:Invent 2014
(SPOT305) Event-Driven Computing on Change Logs in AWS | AWS re:Invent 2014Amazon Web Services
 
Scaling paypal workloads with oracle rac ss
Scaling paypal workloads with oracle rac ssScaling paypal workloads with oracle rac ss
Scaling paypal workloads with oracle rac ssAnil Nair
 
Netezza workload management
Netezza workload managementNetezza workload management
Netezza workload managementBiju Nair
 
Oracle RAC 12c Practical Performance Management and Tuning OOW13 [CON8825]
Oracle RAC 12c Practical Performance Management and Tuning OOW13 [CON8825]Oracle RAC 12c Practical Performance Management and Tuning OOW13 [CON8825]
Oracle RAC 12c Practical Performance Management and Tuning OOW13 [CON8825]Markus Michalewicz
 
Christo Kutrovsky - Maximize Data Warehouse Performance with Parallel Queries
Christo Kutrovsky - Maximize Data Warehouse Performance with Parallel QueriesChristo Kutrovsky - Maximize Data Warehouse Performance with Parallel Queries
Christo Kutrovsky - Maximize Data Warehouse Performance with Parallel QueriesChristo Kutrovsky
 
A5 oracle exadata-the game changer for online transaction processing data w...
A5   oracle exadata-the game changer for online transaction processing data w...A5   oracle exadata-the game changer for online transaction processing data w...
A5 oracle exadata-the game changer for online transaction processing data w...Dr. Wilfred Lin (Ph.D.)
 
Real-Time Query for Data Guard
Real-Time Query for Data Guard Real-Time Query for Data Guard
Real-Time Query for Data Guard Uwe Hesse
 
Using Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceUsing Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceBiju Nair
 
Fllow con 2014
Fllow con 2014 Fllow con 2014
Fllow con 2014 gbgruver
 
RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)
RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)
RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)Kristofferson A
 
Oracle Replication with DBvisit
Oracle Replication with DBvisitOracle Replication with DBvisit
Oracle Replication with DBvisitAnton An
 
NoCOUG_201411_Patel_Managing_a_Large_OLTP_Database
NoCOUG_201411_Patel_Managing_a_Large_OLTP_DatabaseNoCOUG_201411_Patel_Managing_a_Large_OLTP_Database
NoCOUG_201411_Patel_Managing_a_Large_OLTP_DatabaseParesh Patel
 

What's hot (13)

(SPOT305) Event-Driven Computing on Change Logs in AWS | AWS re:Invent 2014
(SPOT305) Event-Driven Computing on Change Logs in AWS | AWS re:Invent 2014(SPOT305) Event-Driven Computing on Change Logs in AWS | AWS re:Invent 2014
(SPOT305) Event-Driven Computing on Change Logs in AWS | AWS re:Invent 2014
 
Scaling paypal workloads with oracle rac ss
Scaling paypal workloads with oracle rac ssScaling paypal workloads with oracle rac ss
Scaling paypal workloads with oracle rac ss
 
Netezza workload management
Netezza workload managementNetezza workload management
Netezza workload management
 
Oracle RAC 12c Practical Performance Management and Tuning OOW13 [CON8825]
Oracle RAC 12c Practical Performance Management and Tuning OOW13 [CON8825]Oracle RAC 12c Practical Performance Management and Tuning OOW13 [CON8825]
Oracle RAC 12c Practical Performance Management and Tuning OOW13 [CON8825]
 
Cloud DWH deep dive
Cloud DWH deep diveCloud DWH deep dive
Cloud DWH deep dive
 
Christo Kutrovsky - Maximize Data Warehouse Performance with Parallel Queries
Christo Kutrovsky - Maximize Data Warehouse Performance with Parallel QueriesChristo Kutrovsky - Maximize Data Warehouse Performance with Parallel Queries
Christo Kutrovsky - Maximize Data Warehouse Performance with Parallel Queries
 
A5 oracle exadata-the game changer for online transaction processing data w...
A5   oracle exadata-the game changer for online transaction processing data w...A5   oracle exadata-the game changer for online transaction processing data w...
A5 oracle exadata-the game changer for online transaction processing data w...
 
Real-Time Query for Data Guard
Real-Time Query for Data Guard Real-Time Query for Data Guard
Real-Time Query for Data Guard
 
Using Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceUsing Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve Performace
 
Fllow con 2014
Fllow con 2014 Fllow con 2014
Fllow con 2014
 
RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)
RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)
RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)
 
Oracle Replication with DBvisit
Oracle Replication with DBvisitOracle Replication with DBvisit
Oracle Replication with DBvisit
 
NoCOUG_201411_Patel_Managing_a_Large_OLTP_Database
NoCOUG_201411_Patel_Managing_a_Large_OLTP_DatabaseNoCOUG_201411_Patel_Managing_a_Large_OLTP_Database
NoCOUG_201411_Patel_Managing_a_Large_OLTP_Database
 

Viewers also liked

Big Data - In-Memory Index / Sub Second Query engine - Roxie - HPCC Systems
Big Data - In-Memory Index / Sub Second Query engine - Roxie - HPCC SystemsBig Data - In-Memory Index / Sub Second Query engine - Roxie - HPCC Systems
Big Data - In-Memory Index / Sub Second Query engine - Roxie - HPCC SystemsFujio Turner
 
基于Spring batch的大数据量并行处理
基于Spring batch的大数据量并行处理基于Spring batch的大数据量并行处理
基于Spring batch的大数据量并行处理Jacky Chi
 
Spring Batch Workshop (advanced)
Spring Batch Workshop (advanced)Spring Batch Workshop (advanced)
Spring Batch Workshop (advanced)lyonjug
 
Hadoop vs Java Batch Processing JSR 352
Hadoop vs Java Batch Processing JSR 352Hadoop vs Java Batch Processing JSR 352
Hadoop vs Java Batch Processing JSR 352Armel Nene
 
gsoc_mentor for Shivram Mani
gsoc_mentor for Shivram Manigsoc_mentor for Shivram Mani
gsoc_mentor for Shivram ManiShivram Mani
 
PXF HAWQ Unmanaged Data
PXF HAWQ Unmanaged DataPXF HAWQ Unmanaged Data
PXF HAWQ Unmanaged DataShivram Mani
 
Hawq Hcatalog Integration
Hawq Hcatalog IntegrationHawq Hcatalog Integration
Hawq Hcatalog IntegrationShivram Mani
 
Managing Apache HAWQ with Apache AMBARI
Managing Apache HAWQ with Apache AMBARIManaging Apache HAWQ with Apache AMBARI
Managing Apache HAWQ with Apache AMBARIMithun (Matt) Mathew
 
Apache Zeppelin Meetup Christian Tzolov 1/21/16
Apache Zeppelin Meetup Christian Tzolov 1/21/16 Apache Zeppelin Meetup Christian Tzolov 1/21/16
Apache Zeppelin Meetup Christian Tzolov 1/21/16 PivotalOpenSourceHub
 
Apache HAWQ : An Introduction
Apache HAWQ : An IntroductionApache HAWQ : An Introduction
Apache HAWQ : An IntroductionSandeep Kunkunuru
 
HAWQ: a massively parallel processing SQL engine in hadoop
HAWQ: a massively parallel processing SQL engine in hadoopHAWQ: a massively parallel processing SQL engine in hadoop
HAWQ: a massively parallel processing SQL engine in hadoopBigData Research
 
Pivotal Strata NYC 2015 Apache HAWQ Launch
Pivotal Strata NYC 2015 Apache HAWQ LaunchPivotal Strata NYC 2015 Apache HAWQ Launch
Pivotal Strata NYC 2015 Apache HAWQ LaunchVMware Tanzu
 
S2GX 2012 - Introduction to Spring Integration and Spring Batch
S2GX 2012 - Introduction to Spring Integration and Spring BatchS2GX 2012 - Introduction to Spring Integration and Spring Batch
S2GX 2012 - Introduction to Spring Integration and Spring BatchGunnar Hillert
 
Massively Parallel Processing with Procedural Python - Pivotal HAWQ
Massively Parallel Processing with Procedural Python - Pivotal HAWQMassively Parallel Processing with Procedural Python - Pivotal HAWQ
Massively Parallel Processing with Procedural Python - Pivotal HAWQInMobi Technology
 
Apache HAWQ and Apache MADlib: Journey to Apache
Apache HAWQ and Apache MADlib: Journey to ApacheApache HAWQ and Apache MADlib: Journey to Apache
Apache HAWQ and Apache MADlib: Journey to ApachePivotalOpenSourceHub
 
Parallel batch processing with spring batch slideshare
Parallel batch processing with spring batch   slideshareParallel batch processing with spring batch   slideshare
Parallel batch processing with spring batch slideshareMorten Andersen-Gott
 
Phd tutorial hawq_v0.1
Phd tutorial hawq_v0.1Phd tutorial hawq_v0.1
Phd tutorial hawq_v0.1seungdon Choi
 
Ahea Team Spring batch
Ahea Team Spring batchAhea Team Spring batch
Ahea Team Spring batchSunghyun Roh
 

Viewers also liked (20)

Big Data - In-Memory Index / Sub Second Query engine - Roxie - HPCC Systems
Big Data - In-Memory Index / Sub Second Query engine - Roxie - HPCC SystemsBig Data - In-Memory Index / Sub Second Query engine - Roxie - HPCC Systems
Big Data - In-Memory Index / Sub Second Query engine - Roxie - HPCC Systems
 
基于Spring batch的大数据量并行处理
基于Spring batch的大数据量并行处理基于Spring batch的大数据量并行处理
基于Spring batch的大数据量并行处理
 
Spring Batch Workshop (advanced)
Spring Batch Workshop (advanced)Spring Batch Workshop (advanced)
Spring Batch Workshop (advanced)
 
Hadoop vs Java Batch Processing JSR 352
Hadoop vs Java Batch Processing JSR 352Hadoop vs Java Batch Processing JSR 352
Hadoop vs Java Batch Processing JSR 352
 
gsoc_mentor for Shivram Mani
gsoc_mentor for Shivram Manigsoc_mentor for Shivram Mani
gsoc_mentor for Shivram Mani
 
PXF BDAM 2016
PXF BDAM 2016PXF BDAM 2016
PXF BDAM 2016
 
PXF HAWQ Unmanaged Data
PXF HAWQ Unmanaged DataPXF HAWQ Unmanaged Data
PXF HAWQ Unmanaged Data
 
Hawq Hcatalog Integration
Hawq Hcatalog IntegrationHawq Hcatalog Integration
Hawq Hcatalog Integration
 
Managing Apache HAWQ with Apache AMBARI
Managing Apache HAWQ with Apache AMBARIManaging Apache HAWQ with Apache AMBARI
Managing Apache HAWQ with Apache AMBARI
 
Apache Zeppelin Meetup Christian Tzolov 1/21/16
Apache Zeppelin Meetup Christian Tzolov 1/21/16 Apache Zeppelin Meetup Christian Tzolov 1/21/16
Apache Zeppelin Meetup Christian Tzolov 1/21/16
 
Apache HAWQ : An Introduction
Apache HAWQ : An IntroductionApache HAWQ : An Introduction
Apache HAWQ : An Introduction
 
HAWQ: a massively parallel processing SQL engine in hadoop
HAWQ: a massively parallel processing SQL engine in hadoopHAWQ: a massively parallel processing SQL engine in hadoop
HAWQ: a massively parallel processing SQL engine in hadoop
 
Pivotal Strata NYC 2015 Apache HAWQ Launch
Pivotal Strata NYC 2015 Apache HAWQ LaunchPivotal Strata NYC 2015 Apache HAWQ Launch
Pivotal Strata NYC 2015 Apache HAWQ Launch
 
S2GX 2012 - Introduction to Spring Integration and Spring Batch
S2GX 2012 - Introduction to Spring Integration and Spring BatchS2GX 2012 - Introduction to Spring Integration and Spring Batch
S2GX 2012 - Introduction to Spring Integration and Spring Batch
 
Build & test Apache Hawq
Build & test Apache Hawq Build & test Apache Hawq
Build & test Apache Hawq
 
Massively Parallel Processing with Procedural Python - Pivotal HAWQ
Massively Parallel Processing with Procedural Python - Pivotal HAWQMassively Parallel Processing with Procedural Python - Pivotal HAWQ
Massively Parallel Processing with Procedural Python - Pivotal HAWQ
 
Apache HAWQ and Apache MADlib: Journey to Apache
Apache HAWQ and Apache MADlib: Journey to ApacheApache HAWQ and Apache MADlib: Journey to Apache
Apache HAWQ and Apache MADlib: Journey to Apache
 
Parallel batch processing with spring batch slideshare
Parallel batch processing with spring batch   slideshareParallel batch processing with spring batch   slideshare
Parallel batch processing with spring batch slideshare
 
Phd tutorial hawq_v0.1
Phd tutorial hawq_v0.1Phd tutorial hawq_v0.1
Phd tutorial hawq_v0.1
 
Ahea Team Spring batch
Ahea Team Spring batchAhea Team Spring batch
Ahea Team Spring batch
 

Similar to Pivotal HAWQ - High Availability (2014)

Live traffic capture and replay in cassandra 4.0
Live traffic capture and replay in cassandra 4.0Live traffic capture and replay in cassandra 4.0
Live traffic capture and replay in cassandra 4.0Vinay Kumar Chella
 
The Data Center and Hadoop
The Data Center and HadoopThe Data Center and Hadoop
The Data Center and HadoopDataWorks Summit
 
Performance Whack-a-Mole Tutorial (pgCon 2009)
Performance Whack-a-Mole Tutorial (pgCon 2009) Performance Whack-a-Mole Tutorial (pgCon 2009)
Performance Whack-a-Mole Tutorial (pgCon 2009) PostgreSQL Experts, Inc.
 
M|18 Battle of the Online Schema Change Methods
M|18 Battle of the Online Schema Change MethodsM|18 Battle of the Online Schema Change Methods
M|18 Battle of the Online Schema Change MethodsMariaDB plc
 
Operating and supporting HBase Clusters
Operating and supporting HBase ClustersOperating and supporting HBase Clusters
Operating and supporting HBase Clustersenissoz
 
Operating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and ImprovementsOperating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and ImprovementsDataWorks Summit/Hadoop Summit
 
Extreme Availability using Oracle 12c Features: Your very last system shutdown?
Extreme Availability using Oracle 12c Features: Your very last system shutdown?Extreme Availability using Oracle 12c Features: Your very last system shutdown?
Extreme Availability using Oracle 12c Features: Your very last system shutdown?Toronto-Oracle-Users-Group
 
A DBA’s guide to using TSA
A DBA’s guide to using TSAA DBA’s guide to using TSA
A DBA’s guide to using TSAFrederik Engelen
 
Oracle Drivers configuration for High Availability, is it a developer's job?
Oracle Drivers configuration for High Availability, is it a developer's job?Oracle Drivers configuration for High Availability, is it a developer's job?
Oracle Drivers configuration for High Availability, is it a developer's job?Ludovico Caldara
 
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA Solutions
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA SolutionsNagios Conference 2014 - Andy Brist - Nagios XI Failover and HA Solutions
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA SolutionsNagios
 
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACPerformance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACKristofferson A
 
Inside Microsoft's FPGA-Based Configurable Cloud
Inside Microsoft's FPGA-Based Configurable CloudInside Microsoft's FPGA-Based Configurable Cloud
Inside Microsoft's FPGA-Based Configurable Cloudinside-BigData.com
 
Redundancy for Big Hadoop Clusters is hard - Stuart Pook
Redundancy for Big Hadoop Clusters is hard  - Stuart PookRedundancy for Big Hadoop Clusters is hard  - Stuart Pook
Redundancy for Big Hadoop Clusters is hard - Stuart PookEvention
 
The Data Center and Hadoop
The Data Center and HadoopThe Data Center and Hadoop
The Data Center and HadoopMichael Zhang
 
20141011 my sql clusterv01pptx
20141011 my sql clusterv01pptx20141011 my sql clusterv01pptx
20141011 my sql clusterv01pptxIvan Ma
 
Jboss World 2011 Infinispan
Jboss World 2011 InfinispanJboss World 2011 Infinispan
Jboss World 2011 Infinispancbo_
 
Critical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseCritical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseScyllaDB
 
Five Lessons in Distributed Databases
Five Lessons  in Distributed DatabasesFive Lessons  in Distributed Databases
Five Lessons in Distributed Databasesjbellis
 

Similar to Pivotal HAWQ - High Availability (2014) (20)

Live traffic capture and replay in cassandra 4.0
Live traffic capture and replay in cassandra 4.0Live traffic capture and replay in cassandra 4.0
Live traffic capture and replay in cassandra 4.0
 
The Data Center and Hadoop
The Data Center and HadoopThe Data Center and Hadoop
The Data Center and Hadoop
 
Performance Whack-a-Mole Tutorial (pgCon 2009)
Performance Whack-a-Mole Tutorial (pgCon 2009) Performance Whack-a-Mole Tutorial (pgCon 2009)
Performance Whack-a-Mole Tutorial (pgCon 2009)
 
M|18 Battle of the Online Schema Change Methods
M|18 Battle of the Online Schema Change MethodsM|18 Battle of the Online Schema Change Methods
M|18 Battle of the Online Schema Change Methods
 
Operating and supporting HBase Clusters
Operating and supporting HBase ClustersOperating and supporting HBase Clusters
Operating and supporting HBase Clusters
 
Operating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and ImprovementsOperating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and Improvements
 
Extreme Availability using Oracle 12c Features: Your very last system shutdown?
Extreme Availability using Oracle 12c Features: Your very last system shutdown?Extreme Availability using Oracle 12c Features: Your very last system shutdown?
Extreme Availability using Oracle 12c Features: Your very last system shutdown?
 
A DBA’s guide to using TSA
A DBA’s guide to using TSAA DBA’s guide to using TSA
A DBA’s guide to using TSA
 
Oracle Drivers configuration for High Availability, is it a developer's job?
Oracle Drivers configuration for High Availability, is it a developer's job?Oracle Drivers configuration for High Availability, is it a developer's job?
Oracle Drivers configuration for High Availability, is it a developer's job?
 
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA Solutions
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA SolutionsNagios Conference 2014 - Andy Brist - Nagios XI Failover and HA Solutions
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA Solutions
 
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACPerformance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
 
Kudu austin oct 2015.pptx
Kudu austin oct 2015.pptxKudu austin oct 2015.pptx
Kudu austin oct 2015.pptx
 
nZDM.ppt
nZDM.pptnZDM.ppt
nZDM.ppt
 
Inside Microsoft's FPGA-Based Configurable Cloud
Inside Microsoft's FPGA-Based Configurable CloudInside Microsoft's FPGA-Based Configurable Cloud
Inside Microsoft's FPGA-Based Configurable Cloud
 
Redundancy for Big Hadoop Clusters is hard - Stuart Pook
Redundancy for Big Hadoop Clusters is hard  - Stuart PookRedundancy for Big Hadoop Clusters is hard  - Stuart Pook
Redundancy for Big Hadoop Clusters is hard - Stuart Pook
 
The Data Center and Hadoop
The Data Center and HadoopThe Data Center and Hadoop
The Data Center and Hadoop
 
20141011 my sql clusterv01pptx
20141011 my sql clusterv01pptx20141011 my sql clusterv01pptx
20141011 my sql clusterv01pptx
 
Jboss World 2011 Infinispan
Jboss World 2011 InfinispanJboss World 2011 Infinispan
Jboss World 2011 Infinispan
 
Critical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseCritical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency Database
 
Five Lessons in Distributed Databases
Five Lessons  in Distributed DatabasesFive Lessons  in Distributed Databases
Five Lessons in Distributed Databases
 

Recently uploaded

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfOverkill Security
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 


Pivotal HAWQ - High Availability (2014)

  • 10. 10© Copyright 2013 Pivotal. All rights reserved. PNN Fails HAWQ PM HAWQ SM PNN SNN DN SS SS SS SS SS SS DN SS SS SS SS SS SS DN SS SS SS SS SS SS • PHD 1.1: • (option 1)Manually bring up PNN. HAWQ cannot switch to secondary name node. • (option 2)HDFS admin should change the FQDN or IP address of secondary NN to the PNN. • HAWQ master keeps on trying to connect PNN and when it finds one, the cluster becomes operational. • PHD 1.1.1 (Dec,13) • QA verified testing of above 2 options. • Normal HDFS failover process • Change DNS name of secondary NN to the current NN • Namenode service will be supported in PHD 1.2 (February)
  • 11. 11© Copyright 2013 Pivotal. All rights reserved. PNN & Secondary NN Fail HAWQ PM HAWQ SM PNN SNN Action Availability Notes HAWQ Cluster No Metadata queries can be carried out, but no other queries. No DDL or DML. Current “SELECT” queries Aborted Users need to restart the query. Current Transaction Aborted After the PNN is up, dirty data & temp files will be removed. New “SELECT” & Transaction query Not possible DN SS SS SS SS SS SS DN SS SS SS SS SS SS DN SS SS SS SS SS SS • PHD 1.1: • (option 1) Manually bring up PNN. HAWQ cannot switch to the secondary name node. • (option 2) HDFS admin should change the FQDN or IP address of the secondary NN to the PNN. • HAWQ master keeps trying to connect to the PNN and, once it finds one, the cluster becomes operational. • PHD 1.1.1 (Dec '13) • QA-verified testing of the above 2 options.
  • 12. 12© Copyright 2013 Pivotal. All rights reserved. PNN & Secondary NN Fail HAWQ PM HAWQ SM PNN SNN DN SS SS SS SS SS SS DN SS SS SS SS SS SS DN SS SS SS SS SS SS • No split information • No transactions
  • 13. 13© Copyright 2013 Pivotal. All rights reserved. Secondary NN Fails HAWQ PM HAWQ SM PNN SNN Action Availability Notes HAWQ Cluster Yes Fully available Current “SELECT” queries Yes Current Transaction Yes New “SELECT” & Transaction query Yes DN SS SS SS SS SS SS DN SS SS SS SS SS SS DN SS SS SS SS SS SS
  • 14. 14© Copyright 2013 Pivotal. All rights reserved. A Segment Fails HAWQ PM HAWQ SM PNN SNN Action Availability Notes HAWQ Cluster Yes HAWQ Cluster available. Current “SELECT” queries Aborted Users need to restart the query. Current Transaction Aborted Dirty data & temp files will be removed. New “SELECT” & Transaction query Yes Remaining segments will handle the query. DN SS SS SS SS SS SS DN SS SS SS SS SS SS DN SS SS SS SS SS SS
  • 15. 15© Copyright 2013 Pivotal. All rights reserved. A Segment Fails HAWQ PM HAWQ SM PNN SNN DN SS SS SS SS SS SS DN SS SS SS SS SS SS DN SS SS SS SS SS SS • Segment QEs (Query Executors) are killed • HAWQ does not materialize intermediate results • Local actions by a QE are not committed • Segment QEs are started by other segments in subsequent queries • QE substitution is random • Future release for option to materialize work files
  • 16. 16© Copyright 2013 Pivotal. All rights reserved. Multiple Segments Fail HAWQ PM HAWQ SM PNN SNN Action Availability Notes HAWQ Cluster Yes HAWQ Cluster available. Current “SELECT” queries Aborted Users need to restart the query. Current Transaction Aborted Dirty data & temp files will be removed. New “SELECT” & Transaction query Yes Remaining segments will handle the query. DN SS SS SS SS SS SS DN SS SS SS SS SS SS DN SS SS SS SS SS SS
  • 17. 17© Copyright 2013 Pivotal. All rights reserved. DN Fails HAWQ PM HAWQ SM PNN SNN DN SS SS SS SS SS SS DN SS SS SS SS SS SS DN SS SS SS SS SS SS Action Availability Notes HAWQ Cluster Yes HAWQ Cluster available. Current “SELECT” queries Yes SS will automatically connect to a remote DN in the middle of the currently executing query. Current Transaction Yes Transaction will finish successfully. New “SELECT” & Transaction query Yes • PHD 1.1: • No impact. SS will continue to work with a remote DN • Loss of data locality might introduce a slight performance impact. In a 10G network the performance impact is measured to be around 10% for large queries. Simple queries might experience a 50% performance impact.
  • 18. 18© Copyright 2013 Pivotal. All rights reserved. DN Fails HAWQ PM HAWQ SM PNN SNN DN SS SS SS SS SS SS DN SS SS SS SS SS SS DN SS SS SS SS SS SS • PHD 1.1: • No impact. SS will continue to work with a remote DN • Loss of data locality might introduce a slight performance impact. In a 10G network the performance impact is measured to be around 10% for large queries. Simple queries might experience a 50% performance impact. • libhdfs falls back to reading from an HDFS replica • Short-term performance loss until the NN marks the DN as dead
  • 19. 19© Copyright 2013 Pivotal. All rights reserved. Segment Host Dies HAWQ PM HAWQ SM PNN SNN DN SS SS SS SS SS SS DN SS SS SS SS SS SS DN SS SS SS SS SS SS Action Availability Notes HAWQ Cluster Yes HAWQ Cluster available. Current “SELECT” queries Aborted Users need to restart the query. Current Transaction Aborted Dirty data & temp files will be removed. New “SELECT” & Transaction query Yes Remaining segments will handle the query.
  • 20. 20© Copyright 2013 Pivotal. All rights reserved. Single Disk Failure in DN  JBOD – If tempdata is not on the failed disk, there is no impact on the cluster or query. – If tempdata is configured to be on the failed disk: ▪ Small queries will run, but large queries with too much temporary data will be impacted. ▪ Transactions will be aborted, and new transactions will continue if multiple disks are configured to contain tempdata.  RAID 5 – No impact. – Possible performance loss.  RAID 10 – No impact & no performance loss.
  • 21. 21© Copyright 2013 Pivotal. All rights reserved. HAWQ HA on roadmap  Automatic Namenode HA supported on PHD now  Automatic Namenode HA (name service) supported by HAWQ in February release  PXF to also support NN service  No interruption in query execution during NN failure  HAWQ HA unchanged
  • 22. 22© Copyright 2013 Pivotal. All rights reserved. 22© Copyright 2013 Pivotal. All rights reserved. What’s New in Pivotal HD 1.1 November 7th, 2013
  • 23. 23© Copyright 2013 Pivotal. All rights reserved. Key Themes of Pivotal HD 1.1 Release  Leverage more data, in real time, more easily to gain competitive advantage  Richer services and tools to create a broader set of applications  Deeper, streamlined administrative capabilities for enterprise deployments
  • 24. 24© Copyright 2013 Pivotal. All rights reserved. Pivotal HD Architecture HDFS HBase Pig, Hive, Mahout Map Reduce Sqoop Flume Resource Management & Workflow Yarn Zookeeper Apache Pivotal Command Center Configure, Deploy, Monitor, Manage Data Loader Pivotal HD Enterprise Spring Unified Storage Service Xtension Framework Catalog Services Query Optimizer Dynamic Pipelining ANSI SQL + Analytics HAWQ – Advanced Database Services Hadoop Virtualization Extension Distributed In-memory Store Query Transactions Ingestion Processing Hadoop Driver – Parallel with Compaction ANSI SQL + In-Memory GemFire XD – Real-Time Database Services MADlib Algorithms Oozie Vaidya
  • 25. 25© Copyright 2013 Pivotal. All rights reserved. GemFire XD : Delivers an enterprise real-time data processing platform for SLA-critical applications; enables users to rapidly and reliably analyze & react to high volumes of events while leveraging 10s of TBs of in-memory reference data. Cloud Scale Real-Time Platform Seamless Pivotal HD Integration Optimized for Real-Time Analytics • Very low & predictable latencies at high & variable loads • 10s of TBs in-memory (Memscale) • Multi-tiered caching • Efficient in-memory M-R • Real-time event processing • Continuous querying • SQL based queries • Support structured and semi-structured* data • Java stored procedures • Deep Spring Data integration • Native support for JSON and Objects (Java, C++, C#)* • Scale to HDFS with policy driven in-memory data retention • Online and offline querying of HDFS data • ETL-less bi-directional integration with other Pivotal HD services Enterprise-Class Reliability • JTA distributed transactions • HA through in-memory redundancy • Reliable event propagation • Active-active deployments across WAN * EA / Not in 1.0
  • 26. 26© Copyright 2013 Pivotal. All rights reserved. Feature Benefit Command Center: Install Wizard Faster, easier set up and configuration of HD cluster Start/Stop Services Point/click control of multiple services through a central interface HAWQ UDF (Partial) - C, PL/pgsql - pgcrypto, orafce Enable richer data processing and analytics functionality leveraging existing SQL skill sets Kerberos Support Tightly integrated security with HDFS PXF: Writable HDFS Table Support Easily export HAWQ data to HDFS for external consumption HAWQ Input Format Reader Directly leverage HAWQ data in MapReduce, Pig and Hive Diagnostic Tools Lower administration costs Improved Query Planner “Orca” Enabled to provide more efficient query plans What’s New in Pivotal HD 1.1
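  The HAWQ UDF row above (C and PL/pgSQL support, plus pgcrypto and orafce) is easiest to picture with a small example. The sketch below is illustrative only; the function name is invented, and the lineitem columns are borrowed from the TPC-H-style query used later in this deck.
    -- Minimal PL/pgSQL UDF sketch (hypothetical function name)
    CREATE OR REPLACE FUNCTION discounted_price(price numeric, discount numeric)
    RETURNS numeric AS $$
    BEGIN
        RETURN price * (1 - discount);
    END;
    $$ LANGUAGE plpgsql;

    SELECT discounted_price(l_extendedprice, l_discount) FROM lineitem LIMIT 10;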
  • 27. 27© Copyright 2013 Pivotal. All rights reserved. Feature Benefit Install/Config (ICM) CLI Add/Remove Services Faster, easier set up and administration of services (e.g. Hbase, GemfireXD etc) Upgrade Streamlined, low risk upgrade from 1.0.1 to 1.1 Apache Hadoop Components Hadoop to 2.0.5 and select 2.0.6 patches Greater stability and lower risk based on critical defect fixes incorporated Oozie 3.3.2 Orchestrate data processing (e.g. MR, Pig) job pipelines with dependencies Hive 11 (incl. HCatalog and Hiveserver2) Significant improvements in functionality, scalability and security. Hbase 0.94.8 Enables snapshots of tables without overhead to the Region Servers RHEL 6.4 Certification Enhanced performance optimizations and security improvements What’s New in Pivotal HD 1.1
  • 28. 28© Copyright 2013 Pivotal. All rights reserved. Feature Benefit Platform and Security Kerberos Support - HDFS - HAWQ - Unified Storage Service - PXF to be supported in Dec 2013 Tighter governance, risk and compliance JRE 1.7.0.15 support Supported platform. JRE 1.6 is end of life. RHEL 6.4 (FIPS) certification Federal standard for cryptography modules Pgcrypto for HAWQ Flexible and robust encryption of sensitive data Tools Unified Storage Service: CDH4 as a data source Stream data from CDH4 Data Loader - Push Stream API - Spring XD front end for Twitter Integration support for wider variety of data sources What’s New in Pivotal HD 1.1
  • 29. 29© Copyright 2013 Pivotal. All rights reserved. Command Center Cluster Deployment Wizard • Performs “Host Verification” to determine host eligibility to be added to cluster
  • 30. 30© Copyright 2013 Pivotal. All rights reserved. Command Center Cluster Deployment Wizard • Easily Add Eligible Nodes to Roles • Basic Validation of Layout • Checkbox Add/Remove Services • Ability to Download Configuration Locally Recorded Demo can be found -> Here
  • 31. 31© Copyright 2013 Pivotal. All rights reserved. Orca - Improved Optimizer  Pluggable architecture, allowing faster innovation and quicker iteration on quality improvements  Subset of improved functionality: • Parity with Planner • Improved join-ordering • Join-Aggregate re-ordering • Sub-query de-correlation • Optimal sort-orders • Full integration of data (re-)distribution • Contradiction detection • Elimination of redundant joins • Smarter Partition scan • Star-join optimization • Skew aware
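  As a rough illustration of how the improved optimizer can be exercised, the sketch below assumes the session-level "optimizer" setting HAWQ exposes for Orca; check the release notes for the exact GUC name and default before relying on it.
    -- Sketch: comparing plans with and without Orca (GUC name assumed)
    SET optimizer = on;    -- route planning through Orca
    EXPLAIN SELECT c_nationkey, count(*) FROM customer GROUP BY c_nationkey;
    SET optimizer = off;   -- fall back to the legacy planner and compare the plan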
  • 32. 32© Copyright 2013 Pivotal. All rights reserved. What’s new in PXF  Profiles  Writable external tables  Hive partition pruning, HBase filtration  Additional connectors & CSV support  Complete extensibility  Roadmap – Security & authentication – Multi-FS support & other distributions via OS – Stand-alone service
  • 33. 33© Copyright 2013 Pivotal. All rights reserved. Why Pivotal HD?  Big Data + Fast Data  The first enterprise grade platform that provides OLAP and OLTP with HDFS as the common data substrate  Enables closed loop analytics, real-time event processing and high speed data ingest
  • 34. 34© Copyright 2013 Pivotal. All rights reserved. Hawq Format Reader Java Program (i.e. MapReduce Job) HDFS Hawq Hawq Reader (Jar file) 1. Request is made for where the files for a specific “Table” exist 2. Location of the files is returned 3. HDFS files with Hawq format are streamed to the Reader Recorded Demo can be found -> Here
  • 35. 35© Copyright 2013 Pivotal. All rights reserved. Oozie now Included and Supported with PHD  Oozie is a workflow scheduler system to manage Apache Hadoop jobs.  Oozie Workflow jobs are Directed Acyclical Graphs (DAGs) of actions.  Oozie Coordinator jobs are recurrent Oozie Workflow jobs triggered by time (frequency) and data availability.  Oozie is integrated with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Java map-reduce, Streaming map-reduce, Pig, Hive, Sqoop and Distcp) as well as system specific jobs (such as Java programs and shell scripts).  Oozie is a scalable, reliable and extensible system.
  • 36. 36© Copyright 2013 Pivotal. All rights reserved. Matrix of what is supported via Install method
  • 37. 37© Copyright 2013 Pivotal. All rights reserved. Security Dashboard (items in bold tested; rest are scheduled) Support secure cluster Supports Kerberos for Authentication Support LDAP for Authentication HDFS Yes Yes Linux OS supports MapReduce/Pig Yes N/A Hive Yes (standalone mode) N/A Hiveserver No No Hiveserver2 Yes Yes Yes Hbase Yes Yes Yes HAWQ* Yes Yes Yes GemfireXD Yes Yes Yes * Except PXF; scheduled for Dec (PHD 1.1.1 release)
  • 38. 38© Copyright 2013 Pivotal. All rights reserved. Vaidya
  • 39. 39© Copyright 2013 Pivotal. All rights reserved. 39© Copyright 2013 Pivotal. All rights reserved. Roadmap Open Discussion Nov 25, 2013
  • 40. 40© Copyright 2013 Pivotal. All rights reserved. Roadmap – Action Items  Error tables released in PHD 1.2 (February) – Current workaround  PCC new features?!  SAW integration  PHD 1.1 upgrade planning
  • 41. 41© Copyright 2013 Pivotal. All rights reserved. 41© Copyright 2013 Pivotal. All rights reserved. Appendix Nov 25, 2013
  • 42. 42© Copyright 2013 Pivotal. All rights reserved. 42© Copyright 2013 Pivotal. All rights reserved. HAWQ Nov 25, 2013
  • 43. 43© Copyright 2013 Pivotal. All rights reserved. History  HAWQ 1.0 (March release) – True SQL Engine in Hadoop ▪ SQL 92, 99 & 2003 OLAP extensions ▪ JDBC/ODBC – Basic SQL functionalities ▪ DDL and DML – High availability feature – Transaction support  HAWQ 1.1 (June release) – JBOD support feature  HAWQ 1.1.1 (August release) – HDFS access layer read fault tolerance support – HAWQ diagnosis tool – ORCA enabled  HAWQ 1.1.2 (September release) – HAWQ MR Inputformat for AO tables – HDFS access layer write fault tolerance support – HDFS 2.0.5 support  HAWQ 1.1.3 (Oct release) – HAWQ Kerberos support – HAWQ on secure HDFS – UDF  HAWQ 1.1.4 (Dec release) – Gptoolkit – UDF enhancement – Manual failover for HDFS HA  HAWQ 1.2 (Feb release) – Parquet storage support – HAWQ MR Inputformat – Automatic failover for HDFS HA – …
  • 44. 44© Copyright 2013 Pivotal. All rights reserved. HAWQ & HDFS [Diagram: Master Servers handle planning & dispatch, Segment Servers handle query execution, and storage is HDFS, HBase, …, all connected over the Network Interconnect]
  • 45. 45© Copyright 2013 Pivotal. All rights reserved. [Diagram: HAWQ on HDFS – the master host issues metadata ops to the Namenode; segment hosts run HAWQ segments co-located with HDFS Datanodes across Rack1 and Rack2; segments read/write blocks and communicate over the GPDB interconnect; block replication spans racks]
  • 46. 46© Copyright 2013 Pivotal. All rights reserved. Query execution flow
  • 47. 47© Copyright 2013 Pivotal. All rights reserved. Parallel Query Optimizer • Converts SQL into a physical execution plan – Cost-based optimization looks for the most efficient plan – Physical plan contains scans, joins, sorts, aggregations, etc. – Global planning avoids sub-optimal ‘SQL pushing’ to segments – Directly inserts ‘motion’ nodes for inter-segment communication • ‘Motion’ nodes for efficient non-local join processing (Assume table A is distributed across all segments – i.e. each has AK) – Broadcast Motion (N:N) • Every segment sends AK to all other segments – Redistribute Motion (N:N) • Every segment rehashes AK (by join column) and redistributes each row – Gather Motion (N:1) • Every segment sends its AK to a single node (usually the master)
  • 48. 48© Copyright 2013 Pivotal. All rights reserved. Example of Parallel Query Optimization
    select c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) as revenue,
           c_acctbal, n_name, c_address, c_phone, c_comment
    from customer, orders, lineitem, nation
    where c_custkey = o_custkey
      and l_orderkey = o_orderkey
      and o_orderdate >= date '1994-08-01'
      and o_orderdate < date '1994-08-01' + interval '3 month'
      and l_returnflag = 'R'
      and c_nationkey = n_nationkey
    group by c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_comment
    order by revenue desc
  Plan operators as listed on the slide: Gather Motion 4:1 (slice 3), Sort, HashAggregate, HashJoin, Redistribute Motion 4:4 (slice 1), HashJoin, Seq Scan on lineitem, Hash, Seq Scan on orders, Hash, HashJoin, Seq Scan on customer, Hash, Broadcast Motion 4:4 (slice 2), Seq Scan on nation
  • 49. 49© Copyright 2013 Pivotal. All rights reserved. Interconnect • UDP based • Flow control
  • 50. 50© Copyright 2013 Pivotal. All rights reserved. Metadata dispatch • Metadata dispatch • Stateless segments – Read only metadata on segment
  • 51. 51© Copyright 2013 Pivotal. All rights reserved. Transaction  Full transaction support for tables on HDFS – When a load transaction is aborted, there will be some garbage data left at the end of the file. For HDFS-like systems, data cannot be truncated or overwritten.  Methods to process the partial data to support transactions: – Option 1: Load data into a separate HDFS file. Unlimited number of files. – Option 2: Use metadata to record the boundary of garbage data, and implement a kind of vacuum mechanism. – Option 3: Implement HDFS truncation.  HDFS truncate is added to support transactions
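  To make the garbage-data problem concrete, the hedged sketch below shows an aborted load; the table name and file path are placeholders. Rows appended before the rollback remain at the end of the HDFS file until the truncate/metadata mechanism reclaims them.
    BEGIN;
    COPY sales FROM '/data/sales_2013.csv' CSV;   -- hypothetical table and path
    ROLLBACK;  -- rows already appended become garbage at the end of the HDFS file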
  • 52. 52© Copyright 2013 Pivotal. All rights reserved. Transaction  Snapshot isolation  Simplified Transaction Model Support – Simplified two phase commit
  • 53. 53© Copyright 2013 Pivotal. All rights reserved. Transaction support • Methods to process the partial data to support transactions: – Option 1: Load data into a separate HDFS file. Unlimited number of files. – Option 2: Use metadata to record the boundary of garbage data, and implement a kind of vacuum mechanism. – Option 3: Implement HDFS truncation.
  • 54. 54© Copyright 2013 Pivotal. All rights reserved. Pluggable storage • Read Optimized/Append only storage • Column store – Compressions: quicklz, zlib, RLE – Partitioned tables hit HDFS limitation • Parquet – Open source format – PAX like column store – Snappy, gzip • MR Input/Output format
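  A hedged sketch of the storage options above; the table definitions are invented, and the exact option spellings may vary by release (Parquet arrives with HAWQ 1.2 per the history slide).
    -- Append-only, column-oriented table with quicklz compression
    CREATE TABLE sales_col (id bigint, amount numeric, region text)
      WITH (appendonly=true, orientation=column, compresstype=quicklz);

    -- Parquet-backed table with snappy compression (HAWQ 1.2)
    CREATE TABLE sales_parquet (id bigint, amount numeric, region text)
      WITH (appendonly=true, orientation=parquet, compresstype=snappy);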
  • 55. 55© Copyright 2013 Pivotal. All rights reserved. HDFS C client: why • libhdfs (Current HDFS c client) is based on JNI. It is difficult to make HAWQ support a large number of concurrent queries. • Example: – 4 segments on each segment hosts – 50 concurrent queries – each query has 16 QE processes that do scan – there will be about 800 processes that start 800 JVMs to access HDFS. – If each JVM uses 500MB memory, the JVMs will consume 800 * 500M = 400G memory. – Thus naïve usage of libhdfs is not suitable for HAWQ. Currently we have three options to solve this problem
  • 56. 56© Copyright 2013 Pivotal. All rights reserved. HDFS client: three options • Option 1: use HDFS FUSE. HDFS FUSE introduces some performance overhead. And the scalability is not verified yet. • Option 2 (libhdfs2): implement a webhdfs based C client. webhdfs is based on HTTP. It also introduces some costs. Performance should be benchmarked. Webhdfs based method has several benefits, such as ease of implementation and low maintenance cost. • Option 3 (libhdfs3): implement a C RPC interface that directly communicates with NameNode and DataNode. Many changes are required when the RPC protocol is changed.
  • 57. 57© Copyright 2013 Pivotal. All rights reserved. 57© Copyright 2013 Pivotal. All rights reserved. PXF Nov 25, 2013
  • 58. 58© Copyright 2013 Pivotal. All rights reserved. PXF is... A fast extensible framework connecting Hawq to a data store of choice that exposes a parallel API
  • 59. 59© Copyright 2013 Pivotal. All rights reserved. Hawq External Tables • gpfdist – remote delimited text (or csv) files. • file – text files on segment filesystem. • execute – script execution and produced data • pxf – text and binary data from available pxf connectors (mostly HD based).
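  For comparison with the pxf protocol, a minimal gpfdist readable external table might look like the sketch below; the host, port, and file pattern are placeholders.
    CREATE EXTERNAL TABLE ext_sales (id int, amount numeric, region text)
      LOCATION ('gpfdist://etl-host:8081/sales*.csv')
      FORMAT 'csv' (DELIMITER ',');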
  • 60. 60© Copyright 2013 Pivotal. All rights reserved. Steps • Step 1: GRANT ON PROTOCOL pxf • Step 2: Define a PXF table – Pick built-in plugins right for the job – Specify data source of choice – Map remote data fields to Hawq db attributes (plugin dependent) • Step 3: Query the PXF table. – Directly – Or copy to a Hawq table first CREATE EXTERNAL TABLE foo(<col list>) LOCATION (‘pxf://<host:port>/<data source>?<plugin options>’) FORMAT ‘<type>’(<params>)
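  Putting the three steps together, a worked sketch follows; the role name, host:port, path, and columns are placeholders, and the GRANT form is the assumed spelling of the "GRANT ON PROTOCOL pxf" step named above.
    GRANT SELECT ON PROTOCOL pxf TO analyst;                     -- Step 1

    CREATE EXTERNAL TABLE pxf_sales (id int, amount numeric)     -- Step 2
      LOCATION ('pxf://<host:port>/sales/2013?profile=HdfsTextSimple')
      FORMAT 'text' (DELIMITER ',');

    SELECT count(*) FROM pxf_sales;                              -- Step 3: query directly
    CREATE TABLE sales_local AS SELECT * FROM pxf_sales;         -- ...or copy into a HAWQ table first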
  • 61. 61© Copyright 2013 Pivotal. All rights reserved.
  • 62. 62© Copyright 2013 Pivotal. All rights reserved.
  • 63. 63© Copyright 2013 Pivotal. All rights reserved.
  • 64. 64© Copyright 2013 Pivotal. All rights reserved. 64© Copyright 2013 Pivotal. All rights reserved. New Features Main additions since PHD1.0
  • 65. 65© Copyright 2013 Pivotal. All rights reserved. 65© Copyright 2013 Pivotal. All rights reserved. User Experience
  • 66. 66© Copyright 2013 Pivotal. All rights reserved. User Experience • Improved/Informative error messages. • Profiles LOCATION(‘pxf://<host:port>/sales?fragmenter=HiveFragmenter&accessor=HiveAccessor&resolver=HiveResolver’) LOCATION(‘pxf://<host:port>/sales?profile=Hive’)
  • 67. 67© Copyright 2013 Pivotal. All rights reserved. profiles.xml <profile> <name>HBase</name> <description>Used for connecting to an HBase data store engine</description> <plugins> <fragmenter>HBaseDataFragmenter</fragmenter> <accessor>HBaseAccessor</accessor> <resolver>HBaseResolver</resolver> <myidentifier>MyValue</myidentifier> </plugins> </profile>
  • 68. 68© Copyright 2013 Pivotal. All rights reserved. profiles.xml <profile> <name>HdfsTextSimple</name> <description>Used when reading delimited single line records from plain text files on HDFS </description> <plugins> <fragmenter>HdfsDataFragmenter</fragmenter> <accessor>LineBreakAccessor</accessor> <resolver>StringPassResolver</resolver> <analyzer>HdfsAnalyzer</analyzer> <!-- soon to be added --> </plugins> </profile>
  • 69. 69© Copyright 2013 Pivotal. All rights reserved. profiles.xml <profile> <name>MyCustomProfile</name> <description>Used with a new set of plugins I wrote</description> <plugins> <fragmenter>MyFragmenter</fragmenter> <accessor>MyAccessor</accessor> <resolver>MyResolver</resolver> <analyzer>MyAnalyzer</analyzer> </plugins> </profile> Add your own profiles
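  Once a profile such as MyCustomProfile is registered in profiles.xml, an external table can reference it by name. This is a hedged sketch: the host:port, path, and columns are placeholders, and the 'custom' format with the pxfwritable_import formatter is the form typically paired with custom resolvers.
    CREATE EXTERNAL TABLE my_data (k text, v text)
      LOCATION ('pxf://<host:port>/my/source/path?profile=MyCustomProfile')
      FORMAT 'custom' (formatter='pxfwritable_import');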
  • 70. 70© Copyright 2013 Pivotal. All rights reserved. 70© Copyright 2013 Pivotal. All rights reserved. Export to HDFS
  • 71. 71© Copyright 2013 Pivotal. All rights reserved. Writable PXF • gphdfs-like functionality – but extensible… – currently supports text, csv, SequenceFile – supports various hadoop compression Codecs CREATE WRITABLE EXTERNAL TABLE ... LOCATION(‘pxf://<host:port>/sales?profile=HdfsTextSimple&COMPRESSION_CODEC=org.apache.hadoop.io.compress.GzipCodec') FORMAT ‘text’(delimiter ‘,’); can create a new profile “HdfsTextSimpleGZipped” that includes compression_codec LOCATION(‘pxf://<host:port>/sales?profile=HdfsTextSimpleGZipped')
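  As a usage note, a writable PXF table like the one defined above is populated with INSERT once created; a minimal sketch with placeholder table and column names:
    INSERT INTO pxf_sales_out
      SELECT id, amount, region FROM sales WHERE sale_date >= date '2013-01-01';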
  • 72. 72© Copyright 2013 Pivotal. All rights reserved. 72© Copyright 2013 Pivotal. All rights reserved. New Connectors
  • 73. 73© Copyright 2013 Pivotal. All rights reserved. New Connectors • GemFire XD (Released. GA February) • JSON (On github. GA February (r+w)) • Accumulo (On github. GA version being coded by Clearedge. GA February) • Cassandra (On github. Alpha) None of them were written by the PXF Dev team… a testament to extensibility.
  • 74. 74© Copyright 2013 Pivotal. All rights reserved. Feature Summary ★ HBase (w/filter pushdown) ★ Hive (w/partition exclusion. various storage file types) ★ HDFS Files: read (delimited text, csv, Sequence, Avro) ★ HDFS Files: write (delimited text, csv, Sequence, various compression codecs and options) ★ GemFireXD, JSON format, Cassandra, Accumulo (currently Beta) ★ Stats collection ★ Automatic data locality optimizations ★ Extensibility!
  • 75. 75© Copyright 2013 Pivotal. All rights reserved. Coming Up Very Soon... ★ Isilon Integration ★ Kerberized HDFS Support ★ Namenode High Availability
  • 76. 76© Copyright 2013 Pivotal. All rights reserved. Limitations • Local metadata of external data – Will be made more transparent when UCS exists. • Authentication and Authorization of external systems – Will be made simpler when centralized user mgmt exists. • Currently supporting local PHD only • Error tables not yet supported • Sharing space with Name/DataNode
  • 77. 77© Copyright 2013 Pivotal. All rights reserved. 77© Copyright 2013 Pivotal. All rights reserved. Writing a plugin steps and guidelines
  • 78. 78© Copyright 2013 Pivotal. All rights reserved. Main Steps 1. Verify P-HD running and PXF installed a. SingleCluster, AllInAll, SingleNode VM 2. Implement the PXF plugin API for your connector (Java) a. Use the PXF API doc as a reference 3. Compile your connector classes and add them to the hadoop classpath on all nodes 4. Restart PHD (won’t be necessary in the future) 5. Add a profile (optional)
  • 79. 79© Copyright 2013 Pivotal. All rights reserved. Plugins • Fragmenter – returns a list of source data fragments and their location • Accessor – accesses a given list of fragments, reads them and returns records • Resolver – deserializes each record according to a given schema or technique • Analyzer – returns statistics about the source data
  • 80. 80© Copyright 2013 Pivotal. All rights reserved. 80© Copyright 2013 Pivotal. All rights reserved. Thanks! Nov 25, 2013