X64 Workshop Linux Information Gathering


  1. <ul><li>Thorsten Kellermann </li></ul><ul><li>Sun Microsystems </li></ul>X64 Workshop Linux Information Gathering
  2. Agenda <ul><li>Linux Support Overview </li></ul><ul><ul><li>Support model and structure </li></ul></ul><ul><li>Data Collection </li></ul><ul><ul><li>Red Hat sysreport </li></ul></ul><ul><ul><li>SuSE siga / config.sh </li></ul></ul><ul><ul><li>Linux Explorer </li></ul></ul><ul><li>Data Analysis </li></ul>
  3. Agenda (cont) <ul><li>Advanced Troubleshooting </li></ul><ul><ul><li>System Core Dump Capturing </li></ul></ul><ul><ul><li>Linux SysRq </li></ul></ul><ul><ul><li>Hanging System </li></ul></ul><ul><li>Linux Analysis </li></ul><ul><ul><li>CDA </li></ul></ul>
  4. Linux Support Overview <ul><li>Linux Support from System TSC organization: </li></ul><ul><ul><li>EMEA: System TSC VSP </li></ul></ul><ul><ul><ul><li>coverage from 9am - 5pm </li></ul></ul></ul><ul><ul><li>AMER/APAC: System TSC OS </li></ul></ul>
  5. Linux Support Overview (cont) <ul><li>Supported Linux Versions: </li></ul><ul><ul><li>Red Hat Enterprise Linux (RHEL) </li></ul></ul><ul><ul><ul><li>Version 3 and 4 </li></ul></ul></ul><ul><ul><ul><li>AS, WS, ES, DESKTOP </li></ul></ul></ul><ul><ul><ul><li>Existing contracts only; no new contracts after 30.09.2006. </li></ul></ul></ul><ul><ul><li>Novell/SuSE Linux Enterprise (SLES) </li></ul></ul><ul><ul><ul><li>Version 8, 9 and 10 </li></ul></ul></ul><ul><li>Back-line support from the vendor is available </li></ul><ul><ul><li>We have a path to escalate issues to Red Hat or Novell/SuSE. </li></ul></ul>
  6. Linux Support Overview (cont) <ul><li>What is covered by support? </li></ul><ul><ul><li>Bugs within the OS or with core applications </li></ul></ul><ul><li>What is not covered? </li></ul><ul><ul><li>Configuration of the system </li></ul></ul><ul><ul><li>HowTo questions </li></ul></ul><ul><ul><li>3rd-party applications </li></ul></ul><ul><li>Other limitations </li></ul><ul><ul><li>Self-compiled kernels, tainted modules </li></ul></ul><ul><ul><li>Sun does not fix bugs within any distribution; that is up to the vendor. </li></ul></ul>
  7. Data Collection <ul><li>Entitlement information </li></ul><ul><ul><li>We need the entitlement for the Linux distribution the customer installed </li></ul></ul><ul><li>General thoughts about data collection </li></ul><ul><ul><li>The issue must be visible within the data. </li></ul></ul><ul><ul><li>Data must be current. </li></ul></ul><ul><ul><ul><li>Anything changed on the system? Collect new data! </li></ul></ul></ul><ul><ul><li>And it must be understandable. </li></ul></ul><ul><ul><ul><li>If not, try SGRT. </li></ul></ul></ul>
  8. Data Collection (cont) <ul><li>Samples: </li></ul><ul><ul><li>Customer has a working and a non-working system </li></ul></ul><ul><ul><ul><li>Collect data from both systems </li></ul></ul></ul><ul><ul><li>Customer has changed the configuration on the advice of Sun Support, but the change doesn't work. </li></ul></ul><ul><ul><ul><li>Collect all relevant data from the system again to see what was changed. </li></ul></ul></ul><ul><ul><li>Customer applies online updates to the system, but the issue isn't fixed. </li></ul></ul><ul><ul><ul><li>We need the data from the system again to see which updates were applied. </li></ul></ul></ul>
  9. Data Collection (cont) <ul><li>Red Hat: </li></ul><ul><ul><li>sysreport </li></ul></ul><ul><ul><ul><li>Mandatory for escalating to Red Hat </li></ul></ul></ul><ul><ul><ul><li>File system hierarchy </li></ul></ul></ul><ul><ul><ul><li>Lacks some interesting information. </li></ul></ul></ul><ul><li>SuSE: </li></ul><ul><ul><li>siga </li></ul></ul><ul><ul><ul><li>Insufficient messages etc. </li></ul></ul></ul><ul><ul><li>config.sh (preferred) </li></ul></ul><ul><ul><ul><li>Collects much more information than siga. </li></ul></ul></ul><ul><ul><ul><li>Encapsulates the siga report </li></ul></ul></ul>
  10. Data Collection (cont) <ul><li>Others </li></ul><ul><ul><li>Linux Explorer </li></ul></ul><ul><ul><ul><li>Most complete data collection </li></ul></ul></ul><ul><ul><ul><li>Not a Sun tool </li></ul></ul></ul><ul><ul><ul><li>We are in discussion with SuSE about accepting this data set instead of siga/config.sh </li></ul></ul></ul><ul><ul><ul><li>Not accepted by Red Hat for escalation </li></ul></ul></ul>
  11. Data Analysis <ul><li>There is no automatic tool! </li></ul><ul><li>This presentation is by no means complete. </li></ul><ul><li>Determine the Linux version: </li></ul><ul><ul><li>uname -a </li></ul></ul><ul><ul><li>/etc/*release* </li></ul></ul><ul><li>Looking up messages: </li></ul><ul><ul><li>messages </li></ul></ul><ul><ul><li>dmesg </li></ul></ul><ul><ul><li>boot.log </li></ul></ul>
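The version check and a first pass over the logs can be sketched in shell. This is a minimal sketch, not part of any vendor tool: the function names and the keyword list (`error`, `fail`, `panic`, `oops`) are assumptions chosen for illustration, and the log file path is passed in by the caller.

```shell
#!/bin/sh
# Print kernel details and every distribution release file that exists.
show_linux_version() {
    uname -a
    # /etc/*release* matches files such as /etc/redhat-release or /etc/SuSE-release
    for f in /etc/*release*; do
        [ -f "$f" ] || continue
        echo "== $f"
        cat "$f"
    done
}

# Print the lines of a log file that mention typical trouble keywords.
# The keyword list is an assumption; adjust it to the issue at hand.
scan_log() {
    grep -i -E 'error|fail|panic|oops' "$1"
}
```

A typical call would be `scan_log messages` inside an unpacked data collection, or `dmesg > dmesg.txt; scan_log dmesg.txt` on a live system.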
  12. Data Analysis (cont) <ul><li>What packages are installed? Which version? </li></ul><ul><ul><li>RPM is the package manager of RHEL and SLES. </li></ul></ul><ul><ul><ul><li>rpm -qa </li></ul></ul></ul><ul><ul><ul><li>rpm -qaV (takes some time) </li></ul></ul></ul><ul><li>SAR report </li></ul><ul><ul><li>The sar data (package sysstat) shows the system load at the time an issue occurred </li></ul></ul>
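When a working and a non-working system are available, comparing their package lists is often a quick first step. The following is a sketch under stated assumptions: the function names are made up for this example, and the sar data path `/var/log/sa/saDD` is the usual sysstat default, which may differ on a given installation.

```shell
#!/bin/sh
# Save a sorted list of installed packages (one name-version-release per line).
save_pkg_list() {
    rpm -qa | LC_ALL=C sort > "$1"
}

# Show packages that differ between two saved, sorted lists.
# Lines only in the first file are unindented, lines only in the second are tab-indented.
compare_pkg_lists() {
    LC_ALL=C comm -3 "$1" "$2"
}

# Show CPU utilisation recorded by sysstat for a given day of the month, e.g. "07".
# Assumption: the daily data files live in /var/log/sa.
sar_for_day() {
    sar -u -f "/var/log/sa/sa$1"
}
```

Run `save_pkg_list good.txt` on the working system and `save_pkg_list bad.txt` on the other, then `compare_pkg_lists good.txt bad.txt` on either one.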
  13. Data Analysis (cont) <ul><li>Hardware/Firmware information </li></ul><ul><ul><li>lspci [-v[v[v[v[v]]]]] </li></ul></ul><ul><ul><li>lsusb </li></ul></ul><ul><ul><li>dmidecode </li></ul></ul><ul><ul><ul><li>not part of sysreport! </li></ul></ul></ul><ul><ul><ul><li>hardware.py </li></ul></ul></ul><ul><ul><ul><ul><li>Python script wrapping dmidecode (RH only, may be included in sysreport) </li></ul></ul></ul></ul><ul><ul><li>dmesg or /proc related </li></ul></ul><ul><ul><ul><li>e.g. firmware of SCSI disk in /proc/scsi/scsi </li></ul></ul></ul>
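Because dmidecode output is not part of sysreport, it can help to collect the hardware information separately. A minimal sketch, assuming a hypothetical helper name and output layout (one text file per tool in a directory chosen by the caller):

```shell
#!/bin/sh
# Collect hardware and firmware information into the directory given as $1.
# Tools that are missing are noted instead of aborting the collection.
collect_hw_info() {
    out=$1
    mkdir -p "$out"
    for cmd in lspci lsusb dmidecode; do
        if command -v "$cmd" >/dev/null 2>&1; then
            # dmidecode usually needs root; keep any error text in the file
            "$cmd" > "$out/$cmd.txt" 2>&1 || true
        else
            echo "$cmd not found" > "$out/$cmd.txt"
        fi
    done
    # Vendor, model and firmware revision of SCSI devices, where available
    if [ -r /proc/scsi/scsi ]; then
        cat /proc/scsi/scsi > "$out/scsi.txt"
    fi
    return 0
}
```

For example, `collect_hw_info /tmp/hwinfo` run as root leaves `lspci.txt`, `lsusb.txt` and `dmidecode.txt` in that directory.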
  14. Data Analysis (cont) <ul><li>Overview </li></ul>
  15. System Core Dump Capturing <ul><li>No standard at the moment </li></ul><ul><ul><li>Kdump has found its way into the mainline kernel. </li></ul></ul><ul><li>RHEL 3 / 4 use their own mechanisms </li></ul><ul><ul><li>netdump (preferred) </li></ul></ul><ul><ul><li>diskdump </li></ul></ul><ul><li>RHEL 5 uses Kdump </li></ul><ul><ul><li>A separate resident kernel with a small footprint </li></ul></ul><ul><ul><li>Highly flexible and reliable </li></ul></ul>
  16. System Core Dump Capturing (cont) <ul><li>SLES 8 / 9 use LKCD </li></ul><ul><ul><li>Based on an IBM/SGI implementation. </li></ul></ul><ul><li>SLES 10 uses Kdump </li></ul><ul><ul><li>A separate resident kernel with a small footprint </li></ul></ul><ul><ul><li>Highly flexible and reliable </li></ul></ul>
  17. Setting up RHEL 3 & 4 Netdump <ul><li>Install Netdump Server </li></ul><ul><ul><li>Install package netdump-server </li></ul></ul><ul><ul><li>Normally no configuration needed. </li></ul></ul><ul><ul><li>Start service </li></ul></ul><ul><li>Install Netdump Client </li></ul><ul><ul><li>Install package netdump-client </li></ul></ul><ul><ul><li>Configure /etc/sysconfig/netdump </li></ul></ul><ul><ul><li>&quot;service netdump propagate&quot; </li></ul></ul><ul><ul><li>Start service </li></ul></ul>
  18. Setting up RHEL 5 Kdump <ul><li>Installed by default </li></ul><ul><li>Configuration dialog </li></ul><ul><ul><li>enable / disable kdump </li></ul></ul><ul><ul><li>configure dump locations </li></ul></ul><ul><ul><ul><li>local: file </li></ul></ul></ul><ul><ul><ul><li>net: nfs / ssh </li></ul></ul></ul><ul><ul><ul><li>partitions: ext2 / ext3 / raw </li></ul></ul></ul><ul><li>Quite easy to set up with the GUI dialog </li></ul>
  19. Setting up SLES 8/9 LKCD <ul><li>Install required package: </li></ul><ul><ul><li>lkcdutils </li></ul></ul><ul><li>Edit /etc/sysconfig/dump </li></ul><ul><li>Write configuration </li></ul><ul><ul><li># lkcd config </li></ul></ul><ul><li>Activate service </li></ul><ul><ul><li># insserv /etc/init.d/boot.lkcd </li></ul></ul>
  20. Setting up SLES 10 Kdump <ul><li>Install needed packages: </li></ul><ul><ul><li>kexec-tools </li></ul></ul><ul><ul><li>kernel-kdump </li></ul></ul><ul><ul><li>kernel-*-debuginfo </li></ul></ul><ul><li>Edit /etc/sysconfig/kdump </li></ul><ul><li>Enable kdump init service </li></ul><ul><ul><li>via YaST runlevel editor </li></ul></ul><ul><ul><li>&quot;chkconfig kdump on&quot; </li></ul></ul><ul><li>Add boot option &quot;crashkernel=64M@16M&quot; </li></ul>
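After the reboot it is worth confirming that the running kernel was really started with the crashkernel option, since kdump has no reserved memory to load its kernel into otherwise. A minimal sketch, with a hypothetical function name; the optional argument exists only so the check can be run against a saved copy of the command line:

```shell
#!/bin/sh
# Succeed if the kernel command line contains a crashkernel= reservation.
# $1 defaults to /proc/cmdline, the command line of the running kernel.
has_crashkernel() {
    grep -q 'crashkernel=' "${1:-/proc/cmdline}"
}
```

Usage: `has_crashkernel && echo "memory reserved for kdump" || echo "crashkernel= missing from boot options"`.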
  21. Checking dump setup <ul><li>Check that everything fits together: </li></ul><ul><ul><li>Enable Magic SysRq feature temporarily </li></ul></ul><ul><ul><ul><li>echo &quot;1&quot; > /proc/sys/kernel/sysrq </li></ul></ul></ul><ul><ul><li>Force the system to dump </li></ul></ul><ul><ul><ul><li>echo &quot;c&quot; > /proc/sysrq-trigger </li></ul></ul></ul>
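The two commands above can be wrapped in one function for a planned test. This is only a sketch: the function name and the two overridable path variables are assumptions added so the logic can be tried against ordinary files. With the default paths, calling the function crashes the kernel immediately, so use it only on a system that may go down.

```shell
#!/bin/sh
# Paths are overridable for dry runs; the defaults are the real kernel interfaces.
SYSRQ_CTL=${SYSRQ_CTL:-/proc/sys/kernel/sysrq}
SYSRQ_TRIGGER=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}

# WARNING: with the default paths this crashes the running kernel on purpose.
force_test_dump() {
    sync                        # flush file system buffers before the crash
    echo 1 > "$SYSRQ_CTL"       # enable Magic SysRq until the next reboot
    echo c > "$SYSRQ_TRIGGER"   # 'c' forces a crash and, if configured, a dump
}
```

After the system comes back, a dump file in the configured dump location confirms that the setup works.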
  22. Linux SysRq Feature <ul><li>The Magic SysRq Feature is somewhat similar to Stop-A on Solaris </li></ul><ul><li>It can force the kernel to print out or dump information about the system </li></ul><ul><li>Sometimes really helpful for troubleshooting </li></ul><ul><li>May even work if the system seems to hang </li></ul>
  23. Linux SysRq Feature (cont) <ul><li>Disabled by default, needs to be enabled </li></ul><ul><ul><li>temporarily until next reboot </li></ul></ul><ul><ul><ul><li>echo &quot;1&quot; > /proc/sys/kernel/sysrq </li></ul></ul></ul><ul><ul><li>permanently </li></ul></ul><ul><ul><ul><li>edit /etc/sysctl.conf to add the line: kernel.sysrq = 1 </li></ul></ul></ul><ul><li>Issue locally on the keyboard with Alt+SysRq+&lt;hotkey&gt; </li></ul><ul><li>Issue remotely with &quot;echo &lt;hotkey&gt; > /proc/sysrq-trigger&quot; </li></ul>
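The permanent setting can be added without creating duplicate entries. A minimal sketch, assuming a hypothetical function name; the configuration file is a parameter so the edit can be tried on a copy first:

```shell
#!/bin/sh
# Append "kernel.sysrq = 1" to a sysctl configuration file unless a
# kernel.sysrq line is already present. $1 defaults to /etc/sysctl.conf.
persist_sysrq() {
    conf=${1:-/etc/sysctl.conf}
    if grep -q '^[[:space:]]*kernel\.sysrq[[:space:]]*=' "$conf" 2>/dev/null; then
        echo "kernel.sysrq already set in $conf"
    else
        echo 'kernel.sysrq = 1' >> "$conf"
    fi
}
```

Running `sysctl -p` as root afterwards loads /etc/sysctl.conf so the setting takes effect without a reboot. If the file already contains `kernel.sysrq = 0`, the function only reports it and the value still has to be changed by hand.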
  24. Linux SysRq Feature (cont) <ul><li>Some Hotkeys: </li></ul><ul><ul><li>k </li></ul></ul><ul><ul><ul><li>Calls the Secure Attention Key (SAK) function. SAK kills every process running on the current console to clean up the terminal. </li></ul></ul></ul><ul><ul><li>s </li></ul></ul><ul><ul><ul><li>Synchronizes all hard disks. </li></ul></ul></ul><ul><ul><li>u </li></ul></ul><ul><ul><ul><li>Remounts all hard disks read-only. This helps prevent data loss when the system is in an unstable state. </li></ul></ul></ul><ul><ul><li>t </li></ul></ul><ul><ul><ul><li>Shows the current task list. </li></ul></ul></ul>
  25. Linux SysRq Feature (cont) <ul><li>Some Hotkeys (cont): </li></ul><ul><ul><li>b </li></ul></ul><ul><ul><ul><li>Reboots the system immediately. You should synchronize the hard disks and remount them read-only before restarting the system. </li></ul></ul></ul><ul><ul><li>p </li></ul></ul><ul><ul><ul><li>Prints out the current register contents. </li></ul></ul></ul><ul><ul><li>m </li></ul></ul><ul><ul><ul><li>Prints out the memory information. </li></ul></ul></ul><ul><li>For a complete list, look up sysrq.txt in the kernel documentation </li></ul>
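The recommended order of sync, read-only remount and reboot can be issued remotely as one sequence. This is a sketch under assumptions: the function name, the `SYSRQ_DELAY` pause and the overridable trigger path are made up for illustration. With the default path the last step reboots the machine at once.

```shell
#!/bin/sh
# The trigger path is overridable for dry runs; the default is the real interface.
SYSRQ_TRIGGER=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}

# Send s (sync), u (remount read-only) and b (reboot) in that order,
# pausing between keys so each action has time to complete.
emergency_reboot() {
    for key in s u b; do
        echo "$key" > "$SYSRQ_TRIGGER"
        echo "sent $key"
        sleep "${SYSRQ_DELAY:-2}"
    done
}
```

The pause length is a guess; on a system with slow or busy disks a longer delay before `b` is safer.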
  26. Crash Dump Analysis <ul><li>Crash utility </li></ul><ul><ul><li>Supports various dump formats </li></ul></ul><ul><ul><ul><li>Kdump, LKCD, Net/Disk dump </li></ul></ul></ul><ul><ul><li>Integrated GDB </li></ul></ul><ul><ul><li>Can examine the kernel of a live system </li></ul></ul><ul><li>http://people.redhat.com/~anderson/ </li></ul>
  27. Crash Dump Analysis (cont) <ul><li>You need to have the debug information of the kernel </li></ul><ul><li>Crash package needs to be installed </li></ul><ul><li>Load vmcore for analysis </li></ul><ul><ul><li>crash System.map vmlinux vmcore </li></ul></ul><ul><ul><ul><li>dmesg </li></ul></ul></ul><ul><ul><ul><li>ps list </li></ul></ul></ul><ul><ul><ul><li>stack traces </li></ul></ul></ul><ul><ul><ul><li>etc. </li></ul></ul></ul>
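A first look at a dump usually covers the same few items, so the crash commands can be kept in a small input file. This is a sketch, not a fixed procedure: the function name and the choice of commands are assumptions, and it relies on crash's `-i` option for reading commands from a file, which should be checked against the installed crash version.

```shell
#!/bin/sh
# Write a crash command file to the path given as $1.
#   sys   - system and panic summary
#   log   - kernel message buffer (the dmesg of the crashed kernel)
#   ps    - process list at the time of the crash
#   bt -a - stack traces of the active task on every CPU
write_crash_script() {
    cat > "$1" <<'EOF'
sys
log
ps
bt -a
exit
EOF
}

# Example call, with the file names from the slide standing in for real paths:
#   write_crash_script first-look.cmd
#   crash -i first-look.cmd System.map vmlinux vmcore > first-look.txt
```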
  28. Troubleshoot a Hanging System <ul><li>Hard to troubleshoot due to lack of information </li></ul><ul><li>If the kernel is deadlocked, the NMI watchdog may help </li></ul><ul><ul><li>Add the kernel boot parameter nmi_watchdog=1 to the GRUB configuration. </li></ul></ul><ul><ul><li>When a system lockup is detected, a kernel panic is initiated. </li></ul></ul><ul><li>There might be a chance to force a dump (SysRq) while the system is hanging </li></ul>
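Whether the NMI watchdog is actually running can be checked from the NMI counters in /proc/interrupts, which should keep rising while the watchdog is active. A minimal sketch with hypothetical function names; both functions accept an alternative file so they can be run against saved copies:

```shell
#!/bin/sh
# Succeed if the kernel was booted with an nmi_watchdog= parameter.
has_nmi_watchdog() {
    grep -q 'nmi_watchdog=' "${1:-/proc/cmdline}"
}

# Print the sum of the per-CPU NMI counters from /proc/interrupts (or the file in $1).
nmi_count() {
    awk '$1 == "NMI:" {
             s = 0
             for (i = 2; i <= NF; i++) if ($i ~ /^[0-9]+$/) s += $i
             print s
         }' "${1:-/proc/interrupts}"
}
```

Calling `nmi_count` twice a few seconds apart and comparing the two numbers shows whether NMIs are being delivered; a counter that stays at zero suggests the watchdog is not working on that hardware.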
  29. Links <ul><li>External Sources: </li></ul><ul><ul><li>Linux Explorer http://www.unix-consultants.co.uk/examples/scripts/linux/linux-explorer/ </li></ul></ul><ul><ul><li>LKCD Setup on SLES http://www.novell.com/coolsolutions/feature/15284.html </li></ul></ul><ul><ul><li>Crash Utility http://people.redhat.com/~anderson/ </li></ul></ul><ul><li>Internal Sources </li></ul><ul><ul><li>System TSC Linux pages http://systems-tsc/twiki/bin/view/Teams/LinuxDataGathering </li></ul></ul><ul><ul><li>PTS Linux pages (outdated) http://barentz.germany.sun.com/ptsvs/Wiki.jsp?page=LinuxHowTos </li></ul></ul>
  30. Links <ul><li>Did you know http://www.google.com/linux? </li></ul>
  31. X64 Workshop Linux Information Gathering <ul><li>Thorsten Kellermann </li></ul><ul><ul><li>[email_address] </li></ul></ul>
