Leveraging Hadoop in your PostgreSQL Environment

Who am I?
Jim Mlodgenski
CTO, OpenSCG
Co-organizer, NYCPUG
Co-organizer, Philly PUG
Co-chair, PGConf US
jim@openscg.com
@jim_mlodgenski

Agenda
Strengths of PostgreSQL
Strengths of Hadoop
Hadoop Community
Use Cases

Best of Both Worlds
Postgres
World’s most advanced open source database solution
Enterprise class including MVCC, streaming replication & rich data type support (to name a few!)
Robust transaction support with strong ANSI-SQL compliance
Hadoop
Big data distributed framework
Reliable, massively scalable & proven
Failures handled at the application layer allowing commodity hardware

Strengths of PostgreSQL
Strong Data Types
Concurrency
Transactions
Security
Indexes
Connectors

Components of PostgreSQL
Database
Connectors
– JDBC
– ODBC
– libpq
Foreign Data Wrappers
And more...

Strengths of Hadoop
Parallelism
Flexibility
Redundancy
Scalability

Components of Hadoop
HDFS
Hive
Flume
Sqoop
ZooKeeper
HBase
And many more...

HDFS
Hadoop Distributed File System

HBase
Modeled after Google Bigtable
Column-oriented database on top of HDFS

ZooKeeper
Distributed Configuration Service
Supports synchronization and distributed locking
Automatic leader election

Hive
Adds SQL on Hadoop
Converts SQL (HQL) to MapReduce jobs

Flume
Streams data into HDFS
Distributed and Highly Available

Sqoop
Allows for bulk transfers of data between Hadoop and an RDBMS

Hadoop Community
Much more like the Linux community than the PostgreSQL community
Some competing commercial interests make the direction unclear to some

Hive Metastore
All of the metadata for the Hive tables resides in an RDBMS
The default is to use Derby
– Limited to a single connection

Hive Metastore (cont.)
Use PostgreSQL for scalability and reliability
Many concurrent users
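As a rough sketch of what pointing the metastore at PostgreSQL involves, the snippet below renders the standard javax.jdo.option.* connection properties into a hive-site.xml fragment. The host, database name, user, and password are placeholders chosen for illustration, not values from the talk.

```python
import xml.etree.ElementTree as ET


def metastore_config(host, dbname, user, password, port=5432):
    """Build a hive-site.xml <configuration> element for a PostgreSQL metastore."""
    props = {
        "javax.jdo.option.ConnectionURL": f"jdbc:postgresql://{host}:{port}/{dbname}",
        "javax.jdo.option.ConnectionDriverName": "org.postgresql.Driver",
        "javax.jdo.option.ConnectionUserName": user,
        "javax.jdo.option.ConnectionPassword": password,
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return root


if __name__ == "__main__":
    # Placeholder connection details, for illustration only.
    conf = metastore_config("localhost", "metastore", "hive", "secret")
    print(ET.tostring(conf, encoding="unicode"))
```

The PostgreSQL JDBC driver jar also has to be on Hive's classpath for these settings to take effect.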

PostgreSQL Backups
PostgreSQL's WAL archiving and Point-in-Time Recovery are powerful
– But they require a lot of storage
Typically used with some sort of NFS

PostgreSQL Backups (cont.)
Use HDFS
– Redundancy & Scalability

PostgreSQL Backups (cont.)
Archive Command
archive_command = 'hadoop dfs -copyFromLocal %p /user/postgres/wal/%f'
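PostgreSQL expects the archive command to return zero only on success and to refuse to overwrite a segment that is already archived. A minimal sketch of a wrapper that adds the existence check is shown below. The function name and the injectable run parameter are hypothetical, and it assumes the hadoop command is on the PATH.

```python
import subprocess

WAL_DIR = "/user/postgres/wal"  # HDFS directory used on the slide


def archive_wal(local_path, file_name, run=subprocess.call):
    """Copy one WAL segment to HDFS; return 0 on success, non-zero on failure."""
    dest = f"{WAL_DIR}/{file_name}"
    # `hadoop fs -test -e` exits 0 when the path already exists.
    if run(["hadoop", "fs", "-test", "-e", dest]) == 0:
        return 1  # never overwrite an already archived segment
    # Exit status of the copy is passed straight back to the caller.
    return run(["hadoop", "fs", "-copyFromLocal", local_path, dest])
```

A small command-line entry point that passes %p and %f to archive_wal and exits with its return value would be needed to call this from archive_command.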

Log Files
Maintain log files for months or years
May use Syslog to consolidate multiple database logs
Turning on query logging makes the log file huge

Log Files (cont.)
Use Flume
Consolidates logs across databases
MapReduce allows for parallel analysis

Log Files (cont.)
Set up Syslog to forward messages to Flume
rsyslog.conf:
*.* @127.0.0.1:5140
Configure Flume to act as a Syslog server
pglogs.sources.sl.type = syslogudp
pglogs.sources.sl.port = 5140
pglogs.sources.sl.host = 0.0.0.0
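One way to check that the Flume source is listening is to send it a single test datagram. The sketch below is a minimal example assuming the UDP port 5140 configured above. The function name and the message text are made up, and the payload is a bare-bones syslog-style line (priority, ident, message) rather than a fully formatted one.

```python
import socket


def send_syslog(message, host="127.0.0.1", port=5140, ident="postgres"):
    """Send one syslog-style datagram over UDP and return the payload sent."""
    # <134> = facility local0 (16) * 8 + severity info (6)
    payload = f"<134>{ident}: {message}".encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload
```

Calling send_syslog("statement: SELECT 1") on the database host should result in one event arriving at the Flume sink if the forwarding chain is working.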

Log Files (cont.)
MapReduce jobs can quickly analyze the logs
public static class MapClass extends MapReduceBase
        implements Mapper<StatementOffset, Text, Text, LongWritable> {
    private final static String STATEMENT_DELIM = "statement: ";
    private final static String SYSLOG_IDENT = "postgres";
    private final static LongWritable one = new LongWritable(1);
    public void map(StatementOffset key, Text value,
                    OutputCollector<Text, LongWritable> output,
                    Reporter reporter) throws IOException {
        String line = value.toString();
        if (line.startsWith(SYSLOG_IDENT) &&
                line.contains(STATEMENT_DELIM)) {
            output.collect(getStatementType(line), one);
        }
    }
...
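The same map-and-count logic can be sketched in a few lines of Python, which is handy for trying the filter on a sample of log lines before writing the job. The body of getStatementType is not shown on the slide, so statement_type below is a hypothetical stand-in that takes the first keyword after the delimiter.

```python
from collections import Counter

STATEMENT_DELIM = "statement: "
SYSLOG_IDENT = "postgres"


def statement_type(line):
    """Hypothetical stand-in for getStatementType: first keyword of the statement."""
    sql = line.split(STATEMENT_DELIM, 1)[1].strip()
    return sql.split(None, 1)[0].upper() if sql else "UNKNOWN"


def count_statements(lines):
    """Emit (type, 1) for each logged statement and sum the counts per type."""
    counts = Counter()
    for line in lines:
        # Same filter as the Java mapper: syslog ident plus statement marker.
        if line.startswith(SYSLOG_IDENT) and STATEMENT_DELIM in line:
            counts[statement_type(line)] += 1
    return counts
```

Feeding it an iterable of log lines returns a Counter keyed by statement type, which plays the role of the reducer's summed output.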

Transaction History
History tables grow very rapidly
Maintaining the tables over time is a huge undertaking
Partitioning is frequently used

Transaction History (cont.)
Use Sqoop
– Add a sequence to the table for fast incremental loads
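With a sequence column in place, each run only needs to import rows above the last value already loaded. The sketch below builds the argument list for an append-mode incremental Sqoop import. The function name is hypothetical, and the JDBC URL, table, column, and target directory in the usage note are placeholders.

```python
def sqoop_incremental_args(jdbc_url, table, check_column, last_value, target_dir):
    """Build the argument list for an append-mode incremental Sqoop import."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--incremental", "append",      # only rows newer than --last-value
        "--check-column", check_column,  # the sequence-backed column
        "--last-value", str(last_value),
    ]
```

For example, sqoop_incremental_args("jdbc:postgresql://localhost/sales", "order_history", "history_id", 1000, "/user/postgres/order_history") yields a list that can be passed to subprocess.call.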

OLAP Cubes
Can take a very long time to build
PostgreSQL will use only a single CPU
Drilling down to the details can be a very long query

OLAP Cubes (cont.)
Use a Foreign Data Wrapper
Looks like a native table to reporting tools
Drill down takes place on Hadoop

OLAP Cubes (cont.)
Create a Foreign Server
CREATE EXTENSION hadoop_fdw;
CREATE SERVER hadoop_server
    FOREIGN DATA WRAPPER hadoop_fdw
    OPTIONS (address '127.0.0.1', port '10000');
CREATE USER MAPPING
    FOR PUBLIC SERVER hadoop_server;

OLAP Cubes (cont.)
Create a Foreign Table
CREATE FOREIGN TABLE order_line (
ol_w_id integer,
ol_d_id integer,
ol_o_id integer,
ol_number integer,
ol_i_id integer,
ol_delivery_d timestamp,
ol_amount decimal(6,2),
ol_supply_w_id integer,
ol_quantity decimal(2,0),
ol_dist_info varchar(24)
) SERVER hadoop_server
OPTIONS (table 'order_line');

OLAP Cubes (cont.)
Loading PostgreSQL aggregate tables is a simple SQL statement
Use Hive views for more complex aggregations
INSERT INTO item_sale_month
SELECT ol_i_id as i_id,
EXTRACT(YEAR FROM ol_delivery_d) as year,
EXTRACT(MONTH FROM ol_delivery_d) as month,
sum(ol_amount) as amount
FROM order_line
GROUP BY 1, 2, 3;

OLAP Cubes (cont.)
Drill downs pass the processing down to Hive
postgres=# explain verbose select sum(ol_amount) from order_line where ol_i_id = 34928;
                                QUERY PLAN
 Aggregate  (cost=11002.50..11002.51 rows=1 width=14)
   Output: sum(ol_amount)
   ->  Foreign Scan on public.order_line  (cost=10000.00..11000.00 rows=1000 width=14)
         Output: ol_w_id, ol_d_id, ol_o_id, ol_number, ol_i_id, ol_delivery_d, ol_amount, ol_supply_w_id, ol_quantity, ol_dist_info
         Remote SQL: SELECT * FROM order_line WHERE ((ol_i_id = 34928))
(5 rows)

Audit History
All database access should be audited and autonomously logged
Must be maintained for years

Audit History (cont.)
Use the Hadoop Foreign Data Wrapper to Flume

Audit History (cont.)
Create a writable foreign table
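CREATE FOREIGN TABLE audit (
    audit_id bigint,
    event_d timestamp,
    "table" varchar,   -- reserved word, so it must be quoted
    action varchar,
    "user" varchar     -- reserved word, so it must be quoted
) SERVER hadoop_server -- assumes the server created for the OLAP example
OPTIONS (table 'audit',
    flume_port '44444');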

Message Queue
Tables have a lot of churn with many updates and deletes
Causes a lot of table and index bloat in PostgreSQL
AKA a vacuuming nightmare

Message Queue (cont.)
Use an FDW to HBase
HBase is not an “eventually consistent” architecture, so it is well suited to message queues

Message Queue (cont.)
Create a writable foreign table
CREATE FOREIGN TABLE hbase_table (
    key varchar,
    value varchar
) SERVER hadoop_server -- assumes the server created for the OLAP example
OPTIONS (table 'hbase_table', hbase_address 'localhost',
    hbase_port '9090',
    hbase_mapping ':key,cf:val');
INSERT INTO hbase_table VALUES ('key1', 'value1');
INSERT INTO hbase_table VALUES ('key2', 'value2');
UPDATE hbase_table SET value = 'update'
    WHERE key = 'key2';
DELETE FROM hbase_table WHERE key = 'key1';
SELECT * FROM hbase_table;

High Availability
When setting up replication for high availability, many necessary components are not provided by PostgreSQL
– Failure detection
– Split brain prevention
– Replica promotion
– Notification to clients of failover

High Availability (cont.)
ZooKeeper with a custom background worker can handle all of the missing components

Failure Detection – Replicas watch an ephemeral lock created by the master
void watch_master() {
    ...
    sprintf(root_path, "%s/lock", zookeeper_path);
    while (!found_master && !got_sigterm) {
        elog(DEBUG1, "Looking for the master lock...");
        rc = zoo_get_children(zh,
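The slide's C code is cut off, so the exact check it performs is not visible. As a loose sketch of the polling idea only, the Python below looks for child nodes under the lock path and treats any child as the master's lock, which is an assumption. It takes any client object exposing get_children(path), as kazoo's KazooClient does. The function name and the attempts, delay, and sleep parameters are hypothetical.

```python
import time


def wait_for_master_lock(zk, zookeeper_path, attempts=5, delay=1.0, sleep=time.sleep):
    """Poll <zookeeper_path>/lock until a child node appears or attempts run out."""
    root_path = f"{zookeeper_path}/lock"
    for _ in range(attempts):
        # get_children returns the names of the child znodes under root_path.
        if zk.get_children(root_path):
            return True
        sleep(delay)
    return False
```

Because the master's lock node is ephemeral, it disappears when the master's ZooKeeper session ends, which is what lets replicas notice a failure.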

Split brain prevention – master grabs an exclusive ZooKeeper lock on startup. Shuts down immediately if unsuccessful
char *create_lock() {
    char path[PATH_LEN];
    char *buffer;
    int rc;
    buffer = (char *) palloc(PATH_LEN);
    ensure_connected();

Replica promotion – use ZooKeeper for the ballots of an election. Highest LSN wins
void elect_master() {
...
recptr = GetWalRcvWriteRecPtr(NULL, NULL);
sprintf(lsn, "%X/%08X", (uint32) (recptr >> 32), (uint32) recptr);
elog(DEBUG1, "Entering a ballot with an LSN of: %s", lsn);
sprintf(path, "%s/lock/%s", zookeeper_path, replica_id);
rc = zoo_create(zh, path, lsn, strlen(lsn), &ZOO_OPEN_ACL_UNSAFE,
ZOO_EPHEMERAL, buffer, sizeof(buffer)-1);
if (rc) {
elog(FATAL, "Failure creating zooKeeper path: %s", path);
}
elog(DEBUG1, "Created a zooKeeper ephemeral path at: %s", buffer);
all_votes_in = false;
while (!all_votes_in && !got_sigterm) {
sprintf(path, "%s/replica", zookeeper_path);
rc = zoo_get_children(zh, path, 0, &replicas);
if (rc == ZOK) {
sprintf(path, "%s/lock", zookeeper_path);
rc = zoo_get_children(zh, path, 0, &ballots);
if (rc == ZOK) {
all_votes_in = true;
for(i=0; i < replicas.count; i++) {
found = false;
for(j=0; j < ballots.count; j++) {
if (strcmp(replicas.data[i], ballots.data[j]) == 0) {
found = true;
break;
}
}
if (!found) {
all_votes_in = false;
break;
}
}
}
}
…
}
if (strcmp(ballots.data[j], replica_id) != 0) {
sprintf(path, "%s/lock/%s", zookeeper_path, ballots.data[j]);
memset(buffer, 0, sizeof(buffer));
bufferlen= sizeof(buffer);
rc = zoo_get(zh, path, 0, buffer, &bufferlen, NULL);
if (rc != ZOK) {
elog(LOG, "Unable to get %s. New master probably already found...", path);
}
elog(DEBUG1, "Comparing the LSN: %s", buffer);
if (strcmp(lsn, buffer) < 0) {
elog(DEBUG1, "Found an LSN greater than mine. I am not the winner.");
return;
} else if (strcmp(lsn, buffer) == 0) {
elog(DEBUG1, "Found an LSN equal to mine. See if I was the first to the start.");
if (strcmp(replica_id, ballots.data[j]) > 0 ) {
elog(DEBUG1, "Found an LSN equal to mine and a sequence earlier than mine. I am not the winner.");
return;
}
}
}
}
elog(LOG, "Becoming the new master. Acquiring the proper locks.");
lock = create_lock();
elog(DEBUG1, "Removing ballot at %s", path);
rc = zoo_delete(zh, path, -1);
if (rc != ZOK) {
elog(LOG, "Unable to delete %s", path);
}
}
if (!has_lock(lock)) {
elog(LOG, "Unable to acquire a zooKeeper lock. Shutting down to prevent a split brain scenario");
do_stop();
} else {
elog(LOG, "Promoting to become the new master.");
do_promote();
}
publish_master_info();
}
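The winner rule in the code above (highest LSN, ties going to the smaller replica id) can be sketched compactly in Python. Note one deliberate difference: this sketch parses each LSN into an integer before comparing, whereas the C code compares the "%X/%08X" strings with strcmp, which can misorder values whose high parts differ in length (for example "A/..." against "10/..."). The function names and the dictionary of ballots are hypothetical.

```python
def parse_lsn(text):
    """Convert an LSN such as '1/0A000028' to a single 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def is_winner(my_id, ballots):
    """Return True if my_id holds the highest LSN; ties go to the smaller id.

    ballots maps replica id -> LSN string in the "%X/%08X" form.
    """
    mine = parse_lsn(ballots[my_id])
    for other_id, lsn in ballots.items():
        if other_id == my_id:
            continue
        other = parse_lsn(lsn)
        # Lose to a higher LSN, or to an equal LSN held by a smaller id.
        if other > mine or (other == mine and other_id < my_id):
            return False
    return True
```

Each replica would evaluate is_winner with its own id once all ballots are in, and only the single winner goes on to take the master lock.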

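Client notification – Python (or others) can watch the master and act appropriately
from kazoo.client import KazooClient
# Constructor of the watcher class (the class statement is not shown on the slide)
def __init__(self, zkHosts, pathName):
    self.zkHosts = zkHosts
    self.pathName = pathName
    self.watchPath = pathName + "/master"
    self.zk = KazooClient(hosts=zkHosts)
    self.zk.start()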

Getting the Components
http://hadoop.apache.org/
http://hive.apache.org/
http://flume.apache.org/
http://sqoop.apache.org/
http://zookeeper.apache.org/
http://hbase.apache.org/
http://www.postgresql.org/
http://jdbc.postgresql.org/
http://openjdk.java.net/
http://openscg.com/se/hadoop-fdw/
