Managing Apache HAWQ with Apache Ambari
Apache Ambari Meetup - June 27, 2016
Alexander Denissov
Bhuvnesh Chaudhary
Mithun Mathew
APACHE HAWQ (incubating)
Hadoop-native SQL query engine and advanced analytics MPP database that offers:
1. interactive query execution
2. high performance
3. machine learning algorithms
4. tools for Data Analysts and Data Scientists
5. processing for large and complex data sets
APACHE HAWQ (incubating) ARCHITECTURE
HAWQ - AMBARI INTEGRATION SCOPE
Installation and configuration
Topology and configuration recommendations and validations
Kerberos and High Availability support
HAWQ Master - HAWQ Standby failover
Service and Component Alerts
Visual Widgets
HAWQ - AMBARI INTEGRATION EFFORT
Praises
Ambari’s pluggable architecture makes integrations like this possible and easy
Kerberos setup is fully metadata driven — major kudos!
Challenges
HAWQ is not part of the HDP stack and is not available in Ambari out of the box
Advanced features and wizards require JavaScript code modifications
Driven by the team of engineers at Pivotal
Developed integrations from basic to more advanced
Invaluable support from Ambari Community
THANK YOU!
RECOMMEND SERVICE TOPOLOGY
VALIDATE SERVICE TOPOLOGY
RECOMMEND AND VALIDATE CONFIGS
HAWQ SERVICE SUMMARY PAGE
HAWQ SERVICE ACTIONS
ACTIVATE HAWQ STANDBY WIZARD
Activate HAWQ Standby Wizard (Manual Operation)
HAWQ Standby Master promoted to HAWQ Master
Add HAWQ Standby Master action becomes visible
HAWQ ALERTS
Status of HAWQ Components
Communication issues between HAWQ Components
HAWQ AMBARI FUTURE INTEGRATION
HAWQ Upgrade
Support automated upgrade independent of stack
Ongoing related work: AMBARI-14854, AMBARI-12885
Dynamic Configuration Reload
Ambari requires a service restart to push configuration changes.
What if the service could reload configurations without a restart?
Ongoing related work: AMBARI-17241
HAWQ View
Display query history
Manage resource queues
DYNAMIC CONFIGURATION RELOAD
Currently Ambari does not support configuration changes without restarting the service
Consequence of restarting the service: DOWNTIME!!!
Some parameters do NOT require restart!
HDFS: dfs.heartbeat.interval, dfs.namenode.heartbeat.recheck-interval
HAWQ: default_hash_table_bucket_number, hawq_rm_memory_limit_perseg
With dynamic configuration reload: No more DOWNTIME!!!
resources/common-services/HAWQ/2.0.0/configurations/hawq-site.xml
<property>
  <name>default_hash_table_bucket_number</name>
  <value>6</value>
  <supports-reload>true</supports-reload>
</property>
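One way a tool could discover which properties carry the proposed flag is to scan the configuration metadata. The sketch below is illustrative only: the `<configuration>` wrapper, the second sample property, and the helper name `reloadable_properties` are assumptions, and `supports-reload` is the tag proposed on this slide rather than an established Ambari feature.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that mirrors the hawq-site.xml fragment from the slide.
# The <configuration> root and example_restart_only_property are assumptions
# added for illustration.
SAMPLE_HAWQ_SITE = """
<configuration>
  <property>
    <name>default_hash_table_bucket_number</name>
    <value>6</value>
    <supports-reload>true</supports-reload>
  </property>
  <property>
    <name>example_restart_only_property</name>
    <value>1</value>
  </property>
</configuration>
"""


def reloadable_properties(xml_text):
    """Return the names of properties whose supports-reload flag is 'true'."""
    root = ET.fromstring(xml_text)
    names = set()
    for prop in root.iter("property"):
        name = prop.findtext("name")
        # A missing flag is treated as "requires restart".
        flag = (prop.findtext("supports-reload") or "").strip().lower()
        if name and flag == "true":
            names.add(name.strip())
    return names


if __name__ == "__main__":
    print(sorted(reloadable_properties(SAMPLE_HAWQ_SITE)))
```

Treating an absent flag as "requires restart" keeps the default behaviour identical to what Ambari does today, so only properties that opt in would be affected.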
resources/common-services/HAWQ/2.0.0/package/scripts/hawqmaster.py
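from resource_management.core.resources.system import Execute
from resource_management.libraries.script.script import Script

class HawqMaster(Script):
    def start(self, env):
        ...  # body elided on the slide
    def stop(self, env):
        ...  # body elided on the slide
    def reload(self, env):
        # Apply the new configuration, then reload instead of restarting.
        self.configure(env)
        Execute('hawq master reload', ...)  # remaining arguments elided on the slide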
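The choice between the proposed reload and a full restart could be sketched as follows. This is a minimal sketch of one possible policy, not Ambari's implementation: the function name `choose_action`, the property `example_restart_only_property`, and the rule that any non-reloadable change forces a restart are all assumptions.

```python
def choose_action(changed, reloadable):
    """Pick the lightest action that applies every changed property.

    changed    -- names of properties whose values were modified
    reloadable -- names of properties marked as supporting reload
    """
    changed = set(changed)
    if not changed:
        return "none"
    # Reload is only enough when every changed property supports it.
    if changed <= set(reloadable):
        return "reload"
    return "restart"


if __name__ == "__main__":
    # The two HAWQ parameters named in the talk as not requiring a restart.
    reloadable = {
        "default_hash_table_bucket_number",
        "hawq_rm_memory_limit_perseg",
    }
    print(choose_action(["hawq_rm_memory_limit_perseg"], reloadable))
    print(choose_action(
        ["hawq_rm_memory_limit_perseg", "example_restart_only_property"],
        reloadable,
    ))
```

Under this policy a restart subsumes a reload, which is only one possible answer to whether the two actions should be mutually exclusive.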
HOW TO USE
COLLABORATION DIAGRAM
[Diagram. Ambari Web (Ambari UI): Show Reload Button. REST API: POST, PUT, GET. Ambari Server: Desired Configs Updater, HeartBeat Processor, ServiceComponentHost (updates requires_reload), Request Handling and Execution. Ambari Agents (four shown): Reload Method. Arrow labels: Heartbeats; invalidates requires_reload; user specifies.]
THINGS TO DECIDE
Reload vs Restart - Are they mutually exclusive?
Feedback? AMBARI-17241
Seriously, purple?
