Troubleshooting Puppet Enterprise
Celia Cottle
Support Engineer | Puppet Labs
celia@puppetlabs.com
@celiaPDX
Troubleshooting the Puppet Enterprise Stack

A guide to where to look for errors when they happen in the various parts of Puppet Enterprise (the console, Live Management, puppet master, ActiveMQ, MCollective, agent), what some of those errors mean, and which warnings and errors are red herrings that occur normally.

Celia Cottle
Support Engineer, Puppet Labs
Celia Cottle is a Support Engineer at Puppet Labs, where she troubleshoots and resolves issues for Puppet Enterprise customers. She comes from Portland State University, where she worked for the College of Engineering and Computer Science doing technical support, while getting her degree in Communication. She’s been working in IT for over five years and enjoys problem solving, working with a wide range of OSes and software, and the variety of challenges that supporting Puppet Enterprise brings. She currently resides in Portland, Oregon.

Published in: Technology, Business


  1. Troubleshooting Puppet Enterprise
     Celia Cottle, Support Engineer | Puppet Labs
     celia@puppetlabs.com, @celiaPDX
  2. The Stack
     Console: Puppet Enterprise’s web GUI.
     MCollective/Live Management: Live Management (LM) is an interface to PE’s orchestration engine (MCollective).
     PuppetDB: collects data generated by Puppet.
     Master/Agent: the master is the central Puppet server; the agent retrieves the client configuration from the puppet master and applies it to the local host.
  3. The Console
  4. Console
     Logs:
       /var/log/pe-httpd/puppetdashboard.error.log
       /var/log/pe-httpd/puppetdashboard.access.log
       /var/log/pe-httpd/puppetmaster.error.log
     Configuration:
       /etc/puppetlabs/puppet/puppet.conf
  5. Console: Common Problems
     No nodes are reporting
     •  Stop the pe-puppet-dashboard-workers.
     •  Check /opt/puppet/share/puppet-dashboard/tmp/pids for files ending in .pid.
     •  Restart the pe-puppet-dashboard-workers.
     •  Run ps aux | grep delayed_job and see if entries like dashboard/delayed_job.1 and delayed_job.1_monitor appear. If they do, the dashboard has started up properly again.
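The pid-file check on slide 5 can be sketched as a small shell function. This is a minimal illustration, not part of the original deck: the function name list_stale_pids is hypothetical, and it only lists pid files whose recorded process is gone. It does not remove anything.

```shell
# List pid files in a directory whose recorded process is no longer running.
# Usage: list_stale_pids DIR
# Run as root: without permission to signal a process, kill -0 fails
# even when the process is alive, so it would be reported as stale.
list_stale_pids() {
    for pidfile in "$1"/*.pid; do
        # Skip the unexpanded glob when the directory has no .pid files.
        [ -e "$pidfile" ] || continue
        pid=$(cat "$pidfile")
        # kill -0 sends no signal; it only tests whether the process exists.
        if ! kill -0 "$pid" 2>/dev/null; then
            echo "$pidfile"
        fi
    done
}
```

With the workers stopped, running list_stale_pids against the pids directory named on the slide would show which leftover files refer to dead processes.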
  6. Console: Common Problems
     No Facts Are Listed For Nodes / Node Manager Won’t Display
     /var/log/pe-httpd/puppetmaster.error.log:
       [Fri Aug 16 22:49:20 2013] [error] [client 172.16.0.2] Certificate Verification: Error (23): certificate revoked
  7. Console Authentication
     Logs:
       /var/log/pe-httpd/access.log
       /var/log/pe-httpd/error.log
       /var/log/pe-console-auth/cas.log
     Configuration Files:
       /etc/puppetlabs/console-auth/cas_client_config.yml
       /etc/puppetlabs/rubycas-server/config.yml
  8. Console Auth: Common Problems
     Can’t Log In
     /var/log/pe-console-auth/cas.log:
       Invalid credentials given for user 'console@puppetlabs.test'
     Possible Cause: Bad Credentials/Lost Credentials
       $ cd /opt/puppet/share/console-auth
       $ sudo /opt/puppet/bin/rake db:create_user USERNAME="adminuser@example.com" PASSWORD="<password>" ROLE="Admin"
     Alternatively, if using 3rd Party Auth, check /var/log/pe-httpd/access.log
  9. PuppetDB
  10. PuppetDB
     Log Files:
       /var/log/messages
       /var/log/pe-puppetdb/puppetdb.log
     Config Files:
       /etc/puppetlabs/puppet/puppetdb.conf
  11. PuppetDB: Common Problems
     SSL Errors
     * /var/log/messages:
       Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Failed to submit 'replace facts' command for agent1.vm to PuppetDB at master0.vm:8081: Server hostname 'master0.vm' did not match server certificate; expected one of master1.vm
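A hostname mismatch like the one on slide 11 can be confirmed by reading the names a certificate actually carries. The sketch below is an illustration that assumes the openssl command-line tool is installed; the function name cert_names is hypothetical, and the certificate path is whatever file you choose to inspect.

```shell
# Print the subject and any DNS alternative names recorded in a PEM certificate.
# Usage: cert_names /path/to/cert.pem
cert_names() {
    # The subject line carries the certificate's common name (CN).
    openssl x509 -in "$1" -noout -subject
    # DNS alt names appear in the text dump as "DNS:name" entries.
    # "|| true" keeps the exit status zero when the certificate has none.
    openssl x509 -in "$1" -noout -text | grep -o 'DNS:[^,]*' || true
}
```

If the name the agent or master connects to is missing from this output, the certificate needs to be regenerated with the right hostname or alt names.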
  12. PuppetDB: Common Problems
     PuppetDB Won’t Start, Fails Silently
     /var/log/pe-puppetdb/puppetdb.log
     ***/var/log/pe-puppetdb/puppetdb-oom.hprof
       java.lang.OutOfMemoryError: Java heap space
     Fix: Edit the defaults in /etc/default/pe-puppetdb or /etc/sysconfig/pe-puppetdb, and change the 256m to 1024m:
       JAVA_ARGS="-Xmx256m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/pe-puppetdb/puppetdb-oom.hprof -Xms256m"
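The heap change on slide 12 can be scripted with sed. This is a sketch under one stated assumption: the slide does not say which of the two 256m values to change, so the function below (hypothetical name set_puppetdb_max_heap) rewrites only the maximum heap, -Xmx, and leaves -Xms alone.

```shell
# Rewrite the maximum heap size (-Xmx) in a PuppetDB defaults file.
# Usage: set_puppetdb_max_heap FILE SIZE   (for example SIZE=1024m)
# Assumption: only -Xmx is changed; the initial heap (-Xms) is left as is.
set_puppetdb_max_heap() {
    # -i.bak edits the file in place and keeps the original as FILE.bak.
    sed -i.bak "s/-Xmx[0-9]*[mMgG]/-Xmx$2/" "$1"
}
```

PuppetDB reads JAVA_ARGS at startup, so the service has to be restarted before the new heap size takes effect.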
  13. Live Management
  14. Live Management/MCollective
     Logs:
       /var/log/pe-activemq/activemq.log
       /var/log/pe-mcollective/mcollective.log
       /var/log/pe-httpd/error.log
     Configuration:
       /etc/puppetlabs/mcollective/server.cfg
  15. MCollective: Common Problems
     * None of the Nodes Show Up In Live Management
     /var/log/pe-httpd/error.log:
       No MCollective servers responded. Either MCollective is not yet configured and operational or all MCollective servers are off-line. Check that you can reach your servers with `mco ping`. It may also help to increase the LM_DISCOVERY_TIMEOUT or LM_INVENTORY_RETRIES variables in your Apache configuration.
  16. Live Management: Common Problems And What They Look Like
     * None of the Nodes Show Up In Live Management
     /var/log/pe-activemq/activemq.log:
       | WARN | Transport Connection to: tcp://000.00.000.00:0000 failed: java.lang.SecurityException: User name [mcollective] or password is invalid.
  17. MCollective: Common Problems
     * The Number of Nodes reporting from MCollective commands, or Live Management, varies
     /var/log/pe-activemq/activemq.log:
       javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
     Solution: On the master, edit /opt/puppet/share/puppet/modules/pe_mcollective/server.cfg.erb and edit the line registerinterval =
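The three Live Management signatures on slides 15 to 17 can be checked for in one pass over a log file. The sketch below is illustrative only: the function name lm_signatures and the short labels it prints are hypothetical, while the search strings are copied from the messages quoted on the slides.

```shell
# Report which known Live Management / MCollective failure signatures
# appear in a log file. Prints one label per signature found.
# Usage: lm_signatures LOGFILE
lm_signatures() {
    # -F treats the pattern as a fixed string, so brackets are literal.
    grep -q -F 'No MCollective servers responded' "$1" &&
        echo "no-servers-responded"
    grep -q -F 'User name [mcollective] or password is invalid' "$1" &&
        echo "activemq-credentials-rejected"
    grep -q -F 'SSLHandshakeException' "$1" &&
        echo "ssl-handshake-failed"
    # A missing signature is not an error for the caller.
    return 0
}
```

Running it against each of the logs listed on slide 14 gives a quick indication of which slide's problem applies.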
  18. Live Management: Common Problems And What They Look Like
     * Nothing displays but a 500 error
  19. Master/Agent
     Logs:
       * /var/log/messages
       * /var/log/pe-httpd/error.log
     Configuration:
       /etc/puppetlabs/puppet/puppet.conf
  20. Master/Agent: Common Problems And What They Look Like
     * Nodes are failing runs
     /var/log/messages:
       err: /File[/var/opt/lib/pe-puppet/lib]: Failed to generate additional resources using 'eval_generate: Connection timed out - connect(2)
       err: Could not retrieve plugin: execution expired
     Solution: Splay: http://docs.puppetlabs.com/references/latest/configuration.html#splay
  21. Master/Agent: Common Problems And What They Look Like
     * Nodes are failing runs
     /var/log/messages:
       Error: Could not request certificate: The certificate retrieved from the master does not match the agent's private key.
     To fix this, remove the certificate from both the master and the agent and then start a puppet run, which will automatically regenerate a certificate.
     On the master:
       puppet cert clean agentname
       Restart pe-httpd
     On the agent:
       rm -f /etc/puppetlabs/puppet/ssl/certs/agentname.pem
       puppet agent -t
  22. Master/Agent: Common Problems And What They Look Like
     * Nodes can’t reach the master
       Error: Could not request certificate: getaddrinfo: Name or service not known
     Troubleshooting:
       1. telnet master 8140
       2. Check /etc/hosts or DNS
       3. ping master
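The "Check /etc/hosts" step on slide 22 can be sketched as a lookup function. This is a minimal illustration with a hypothetical name, hosts_lookup; it only reads a hosts file and does not cover the DNS half of that step.

```shell
# Look up a hostname in a hosts file and print the matching IP address(es).
# Exits non-zero when the name is not listed.
# Usage: hosts_lookup NAME [FILE]   (FILE defaults to /etc/hosts)
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }              # drop comments
        {
            # Field 1 is the address; every later field is a name or alias.
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; found = 1 }
        }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}
```

If the master's name is absent here and also fails to resolve through DNS, the getaddrinfo error on the slide is the expected result.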
  23. Red Herrings
     /var/log/pe-httpd/error.log:
       config.ru:9: warning: already initialized constant argv
     /var/log/pe-httpd/puppetdashboard.error.log:
       [warn] RSA server certificate CommonName (CN) `pe-internal-dashboard' does NOT match server name!?
     /var/log/pe-console-auth/auth.log:
       INFO 2013-08-20 01:07 UTC: User (anonymous) accessed read-write url /reports/upload
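When reading these logs, it can help to hide the red herrings from slide 23 so that real errors stand out. The filter below is a sketch with a hypothetical name, filter_red_herrings; its fixed strings are fragments of the three messages on the slide, and the first one is deliberately broad, so it would hide the same warning for any constant.

```shell
# Drop known-harmless lines from a log stream read on standard input.
# Usage: filter_red_herrings < LOGFILE
filter_red_herrings() {
    # -v inverts the match; -F treats each -e pattern as a fixed string.
    grep -v -F \
        -e 'already initialized constant' \
        -e 'does NOT match server name' \
        -e 'accessed read-write url /reports/upload'
}
```

For example, piping the output of tail on /var/log/pe-httpd/error.log through the filter leaves only lines that are not on the slide's list.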
  24. SSL Errors
     Where your certs (mostly) live:
       /etc/puppetlabs/puppet/ssl
       /opt/puppet/share/puppet-dashboard/certs
       /etc/puppetlabs/puppetdb/ssl
  25. Regenerating The CA And The Master
     1. Delete the contents of the /etc/puppetlabs/puppet/ssl directory on the master.
     2. Run `puppet cert list` to regenerate the CA.
     3. Stop pe-httpd.
     4. Run `puppet master --no-daemonize --verbose` to regenerate the master cert and create a cert request.
     5. Check that `puppet cert list -a` returned the master cert.
     6. Restart pe-httpd.
  26. Regenerating the PuppetDB Certs
     1. Stop the PuppetDB service.
     2. Remove agent certs from /etc/puppetlabs/puppet/ssl/ if on a separate server, and the PuppetDB ones from /etc/puppetlabs/puppetdb/ssl/.
     3. Run `puppet cert clean puppetdbhost.yourdomain` on the master (if not cleaned already and on a separate host).
     4. Regenerate the Puppet Agent certs by performing a Puppet run on the PuppetDB host, signing them on the master if necessary.
     5. Run /opt/puppet/sbin/puppetdb-ssl-setup -f on the PuppetDB host.
     6. Restart the PuppetDB service on its host, and the pe-httpd service on your master.
  27. Regenerating The Console’s Certificate
     1. cd /opt/puppet/share/puppet-dashboard/certs, and remove any existing contents.
     2. sudo /opt/puppet/bin/rake RAILS_ENV=production cert:create_key_pair
     3. sudo /opt/puppet/bin/rake RAILS_ENV=production cert:request
     4. sudo puppet cert sign pe-internal-dashboard
     5. sudo /opt/puppet/bin/rake RAILS_ENV=production cert:retrieve
     6. sudo chown -R puppet-dashboard:puppet-dashboard certs/
     7. /etc/init.d/pe-httpd restart
  28. Regenerating The Agent’s Certificate
     On the master:
       1. puppet cert clean agenthostname
       2. Restart pe-httpd
     On the agent:
       1. rm -rf /etc/puppetlabs/puppet/ssl
       2. puppet agent -t
     On the master:
       1. puppet cert sign agenthostname
  29. Regenerating Your Master’s Certificate
     1. Edit your puppet.conf to update any changes to the hostname or alt names.
     2. `puppet cert clean mastername`
     3. Stop pe-httpd (/etc/init.d/pe-httpd stop).
     4. Run `puppet master --no-daemonize --verbose`.
  30. Certs that Puppet can Regenerate
       pe-internal-broker
       pe-internal-mcollective-servers
       pe-internal-peadmin-mcollective-client
       pe-internal-puppet-console-mcollective-client
  31. Regenerating All The Certificates
     http://showterm.io/f41a4b7bb5b0b006d8a80
  32. Q&A
  33. Resources
     Ask.Puppetlabs.com
     Irc.freenode.net #puppet
     PE-Users Mailing List: https://groups.google.com/a/puppetlabs.com/group/pe-users/topics
