1. HAWQ is an open-source MPP database for Hadoop that provides SQL querying capabilities and integration with data in HDFS and other sources.
2. It uses a master-segment architecture with dynamic resource management through YARN to enable high-performance SQL queries across large datasets.
3. The document discusses HAWQ's architecture, performance advantages, extensions for querying external data through PXF, and integration with Hive through different connectors and a unified catalog.
15. HAWQ eXtension Framework (aka PXF)
● Uniform tabular view of heterogeneous data sources
● Exploits parallelism for data access
● Pluggable framework for custom connectors (profiles)
● Built-in connectors for various data sources/formats
16. PXF Architecture
➔ Independent JVM
➔ Runs alongside namenode and datanodes
[Diagram labels: External Tables, REST API, Tomcat (Webapp), PXF, Java API, Java/Thrift]
Data sources shown in the diagram:
● JDBC
● Solr
● Redis
● Cassandra
● GemfireXD
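The slide lists External Tables as the SQL-side entry point to PXF. As a rough illustration of how such a table might be declared, the Python sketch below assembles a `CREATE EXTERNAL TABLE` statement with a `pxf://` location. Nothing here comes from the slides: the helper `pxf_external_table_ddl`, the table and column names, the host `namenode`, the port `51200`, and the profile `HdfsTextSimple` are all illustrative assumptions. Check the exact DDL options against the HAWQ documentation for your version.

```python
from urllib.parse import urlencode


def pxf_external_table_ddl(table, columns, host, path, profile,
                           port=51200, delimiter=","):
    """Build a CREATE EXTERNAL TABLE statement that points at a PXF location.

    Hypothetical helper: the port default and profile names are assumptions,
    not values taken from the slides.
    """
    # Column definitions, e.g. "id int, amount float8"
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    # The profile selects which PXF connector handles the data
    query = urlencode({"PROFILE": profile})
    location = f"pxf://{host}:{port}/{path.lstrip('/')}?{query}"
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols})\n"
        f"LOCATION ('{location}')\n"
        f"FORMAT 'TEXT' (DELIMITER '{delimiter}');"
    )


# Example with made-up names; the result is only printed, not executed.
ddl = pxf_external_table_ddl(
    table="sales_ext",
    columns=[("id", "int"), ("amount", "float8")],
    host="namenode",
    path="/data/sales",
    profile="HdfsTextSimple",
)
print(ddl)
```

Once such a table exists, it would be queried with ordinary SQL, which is the "uniform tabular view" the previous slide refers to. Pointing the table at a different source would mainly mean changing the profile and path.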