This presentation summarizes recommendations for managing Apache Hadoop efficiently and effectively, based on observations from analyzing over 1,000 customer bundles. It covers common operational mistakes such as inconsistent operating system configuration (locale, transparent huge pages, NTP) and legacy kernel issues. It also gives recommendations for tuning HDFS NameNode and DataNode settings, YARN ResourceManager and NodeManager memory settings, and YARN ATS timeline storage. The presentation encourages adopting the recommendations built into the SmartSense analytics product to improve cluster operations and prevent issues.
/sys/kernel/mm/redhat_transparent_hugepage/enabled [always] never
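In the line above, the value in square brackets is the active setting, so this node still has transparent huge pages set to "always". A minimal sketch of how that check could be scripted is shown here; the function name `active_thp_mode` is hypothetical, and the sketch parses only the file contents, so it makes no assumption about which path a given kernel exposes.

```python
import re


def active_thp_mode(contents: str) -> str:
    """Return the active (bracketed) value from a THP 'enabled' file.

    The kernel lists every allowed value and wraps the one currently
    in effect in square brackets, e.g. "[always] never".
    """
    match = re.search(r"\[(\w+)\]", contents)
    if match is None:
        raise ValueError(f"no bracketed value found in {contents!r}")
    return match.group(1)


# Contents exactly as shown on the slide.
mode = active_thp_mode("[always] never")
print(mode)
```

On a real node the string would be read from the path on the slide; a result other than "never" means the setting still needs to be changed.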
Usernames, User IDs, Group Names, Group IDs, as well as DNS
positive-time-to-live
If the NFS file system is hard mounted, the NFS daemons try repeatedly to contact the server. The NFS daemon retries will not time out unless the nfstimeout value is set and reached.
SmartSense bundles include configuration and metrics. Bundles used for support case troubleshooting include configuration, metrics, and log files. This data is captured for the operating system of each cluster node, as well as for all of the installed HDP services.
The capture process can be configured to exclude specific files from capture, or specific Hadoop properties within HDP configuration files. To protect organization-specific data such as customer IDs, patient IDs, and credit card numbers, we provide the capability to specify a regular expression; any text that matches it is removed or replaced in every file that SmartSense captures. This protects sensitive data in the event that it is unintentionally leaked into log files.
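The idea of a regular-expression rule can be illustrated with a short sketch. This is not SmartSense's rule syntax or implementation; the pattern, the `[REDACTED]` replacement text, and the `mask_line` helper are all hypothetical and only show the general technique of replacing matches before a captured line is stored.

```python
import re

# Hypothetical rule: four groups of four digits, optionally separated
# by a space or a hyphen, which is the shape of many card numbers.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b")


def mask_line(line: str,
              pattern: re.Pattern = CARD_PATTERN,
              replacement: str = "[REDACTED]") -> str:
    """Replace every match of the pattern in one captured line."""
    return pattern.sub(replacement, line)


log_line = "payment accepted for card 4111-1111-1111-1111 at 10:42"
print(mask_line(log_line))
```

A pattern this broad can also match unrelated 16-digit values, so a real rule would be tested against sample logs before being relied on.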
By default we remove all properties associated with cleartext passwords. Ambari, Hive, and Oozie store DB credentials as cleartext by default unless they have been configured to encrypt them. In case Hadoop operators have not taken the time to do so, we exclude those properties by default.
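As a rough illustration of excluding password properties, the sketch below drops any property whose name contains "password". The matching rule and the `drop_password_properties` helper are assumptions made for this example; the notes do not describe how SmartSense actually identifies these properties, and the values shown are made up.

```python
def drop_password_properties(properties: dict) -> dict:
    """Return a copy of the properties without password-like keys.

    Assumption: a property is treated as a credential when its name
    contains "password", compared case-insensitively.
    """
    return {
        name: value
        for name, value in properties.items()
        if "password" not in name.lower()
    }


# Illustrative Hive properties with made-up values.
hive_site = {
    "javax.jdo.option.ConnectionUserName": "hive",
    "javax.jdo.option.ConnectionPassword": "example-secret",
    "hive.execution.engine": "tez",
}
print(sorted(drop_password_properties(hive_site)))
```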