Wcl303 russinovich
 

  • Procexp\\vmware hang
  • procexp\\iexplorercpu spike
  • \\bsod\\Intel wireless>

Wcl303 russinovich Presentation Transcript

  • Case of the Unexplained 3
    Mark Russinovich
    Technical Fellow
    Microsoft Corporation
    Session Code: WCL303
  • About Me
    Technical Fellow, Microsoft
    Co-founder and chief software architect of Winternals Software
    Co-author of Windows Internals 4th and 5th edition and Inside Windows 2000 3rd edition with David Solomon
    Author of TechNet Sysinternals
    Home of blog and forums
    Contributing Editor TechNet Magazine, Windows IT Pro Magazine
    Ph.D. in Computer Engineering
  • Outline
    Introduction
    Sluggish Performance
    Application Hangs
    Error Messages
    Application Crashes
    Blue Screens
  • Case of the Unexplained…
    This is the 2009 version of the “case of the unexplained” talk series
    2007 & 2008 versions covered different cases
    Webcasts of both are available at Sysinternals->Mark’s webcasts
    Based on real case studies
    Some of these have been written up on my blog
  • Troubleshooting
    Most applications do a poor job of reporting unexpected errors
    Locked, missing or corrupt files
    Missing or corrupt registry data
    Permissions problems
    Errors manifest in several different ways
    Misleading error messages
    Crashes or hangs
  • Purpose of Talk
    Show you how to solve these classes of problems by peering beneath the surface
    Interpreting file and registry activity
    Interpreting call stacks
    You’ll learn tools and techniques to help you solve seemingly unsolvable problems
  • Tools We’ll Use
    Sysinternals: www.microsoft.com/technet/sysinternals
    Process Explorer – process/thread viewer
    Process Monitor – file/registry/process/thread tracing
    Autoruns – displays all autostart locations
    SigCheck – shows file version information
    PsExec – execute processes remotely or in the system account
    Pslist – list process information
    Strings – dumps printable strings in any file
    ADInsight – real time LDAP (Active Directory) monitor
    Zoomit – presentation tool I’m using
    Microsoft downloads:
    Kernrate – sample-based system profiler
    Visual Studio: Spy++ - Window analysis utility
    Debugging Tools for Windows: Windbg application and kernel debugger: www.microsoft.com/whdc/devtools/debugging/Windbg
  • Outline
    Sluggish Performance
    Application Hangs
    Error Messages
    Application Crashes
    Blue Screens
  • The Case of the Slow Outlook Attachment
    User would see CPU burst and Outlook would hang for 15+ seconds whenever they received an attachment:
  • Process Monitor
    Process Monitor is a real-time file, registry, process and thread monitor
    It requires Windows 2000 SP4 w/Update Rollup 1, XP SP2 or higher, Server 2003 SP1 or higher, Vista, or Server 2008 (including 64-bit versions of Windows)
    It replaces Filemon and Regmon, but you can use Filemon and Regmon on older operating systems
    Enhancements over Filemon/Regmon include:
    More advanced filtering
    Operation call stacks
    Boot-time logging
    Data mining views
    Process tree to see short-lived processes
    When in doubt, run Process Monitor!
    It will often show you the cause for error messages
    It often tells you what is causing sluggish performance
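As a rough illustration of mining a trace outside the tool, the sketch below reads a Process Monitor trace saved as CSV, lists operations that did not succeed, and counts events per process. The column names are assumed to match the tool's default export, and the sample rows (including the `avscan.exe` process name) are invented for the example, not taken from the talk.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt shaped like a Process Monitor CSV export.
# Column names and every row are assumptions made for illustration.
SAMPLE_CSV = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"10:01:02.1","OUTLOOK.EXE","1234","CreateFile","C:\Temp\a.doc","SUCCESS",""
"10:01:02.2","avscan.exe","888","ReadFile","C:\Temp\a.doc","SUCCESS",""
"10:01:02.3","avscan.exe","888","ReadFile","C:\Temp\a.doc","SUCCESS",""
"10:01:02.4","avscan.exe","888","ReadFile","C:\Temp\a.doc","SUCCESS",""
"10:01:02.5","OUTLOOK.EXE","1234","RegOpenKey","HKCU\Software\Example","ACCESS DENIED",""
'''


def load_events(text):
    """Parse exported CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def failures(events):
    """Return events whose Result column is anything other than SUCCESS."""
    return [event for event in events if event["Result"] != "SUCCESS"]


def busiest_processes(events):
    """Count events per process name, most active first."""
    return Counter(event["Process Name"] for event in events).most_common()
```

Calling `failures(load_events(SAMPLE_CSV))` returns the single ACCESS DENIED row, and `busiest_processes` puts the hypothetical scanner first. Many non-SUCCESS results in a real trace are benign, so treat the list as a starting point rather than a verdict.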
  • The Case of the Slow Outlook Attachment (Continued)
    Process Monitor trace of next received attachment implicated antivirus:
  • The Case of the Slow Outlook Attachment: Solved
    Searched web for confirmation:
    Checked AV settings found problematic option and disabled scanning:
  • Process Explorer
    Process Explorer is a Task Manager replacement
    You can literally replace Task Manager with Options->Replace Task Manager
    Enable hide-when-minimized to always have it handy
    Hover the mouse to see a tooltip showing the process consuming the most CPU
    Open System Information graph to see CPU usage history
    Graphs are time stamped with hover showing biggest consumer at point in time
    Also includes other activity such as I/O, kernel memory limits
  • The Case of the Periodic VMWare Freezes
    Noticed that the CPU pegged every 10 seconds and the desktop froze when running VMWare
    Saw in the Process Explorer System Information graph that it was the System process:
  • Processes and Threads
    A process represents an instance of a running program
    Address space
    Resources (e.g., open handles)
    Security profile (token)
    A thread is an execution context within a process
    Unit of scheduling (threads run, processes don’t run)
    All threads in a process share the same per-process address space
    The System process is the default home for kernel mode system threads
    Functions in OS and some drivers that need to run as real threads
    E.g., need to run concurrently with other system activity, wait on timers, perform background “housekeeping” work
    Other host processes: svchost, Iexplore, mmc, dllhost
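The point that threads in one process share a single address space can be seen in a small sketch. It is a Python illustration of the concept only, not of Windows internals; the names `shared`, `worker` and `run_threads` are made up for the example.

```python
import threading

# One list owned by the process; every thread sees the same object.
shared = []
lock = threading.Lock()


def worker(name):
    """Each thread appends to the same process-wide list."""
    with lock:  # serialize access so the appends cannot interleave badly
        shared.append(name)


def run_threads(count):
    """Start `count` threads, wait for them, and return what they wrote."""
    shared.clear()
    threads = [
        threading.Thread(target=worker, args=(f"thread-{i}",))
        for i in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(shared)
```

`run_threads(3)` returns the three thread names: all three threads wrote into the same list, which they could not do if each had its own address space.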
  • Viewing Threads
    Task Manager doesn’t show thread details within a process
    Process Explorer does on “Threads” tab
    Displays thread details such as ID, CPU usage, start time, state, priority
    Start address is where the thread began running (not where it is now)
    Click Module to get details on module containing thread start address
  • Thread Start Functions and Symbol Information
    Process Explorer can map the addresses within a module to the names of functions
    This can help identify which component within a process is responsible for CPU usage
    Requires symbol information:
    Download the latest Debugging Tools for Windows from Microsoft (free)
    Configure Process Explorer’s symbol engine:
    Use dbghelp.dll from the Debugging Tools
    Point at the Microsoft public symbol server (or internal symbol server if you have access)
    Can configure multiple symbol paths separated by “;”
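A symbol path that points at a symbol server usually takes the form `srv*<local cache>*<server>`, with multiple entries joined by ";". The helper below only assembles such a string. The cache directory and the internal server share are hypothetical placeholders; the public server URL is Microsoft's published one.

```python
# Microsoft's public symbol server.
MS_PUBLIC_SYMBOLS = "https://msdl.microsoft.com/download/symbols"


def srv_entry(local_cache, server):
    """Build one symbol-server entry: srv*<cache dir>*<server>."""
    return f"srv*{local_cache}*{server}"


def symbol_path(entries):
    """Join several symbol path entries with ';'."""
    return ";".join(entries)


# Hypothetical cache directory and internal share, for illustration only.
EXAMPLE_PATH = symbol_path([
    srv_entry(r"C:\symbols", MS_PUBLIC_SYMBOLS),
    srv_entry(r"C:\symbols", r"\\symserver\symbols"),
])
```

The resulting string is what you would paste into the symbol path field of the tool's symbol configuration dialog.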
  • The Case of the Periodic VMWare Freezes: Solved
    Opened Threads tab for System process and paused after a spike:
    Ftser2k was XM Radio USB/Serial driver
    Stopping it didn’t remove spikes
    Http.sys is IIS kernel-mode cache driver
    Went to device manager and showed hidden devices
    Stopped http.sys and hangs went away
    Didn’t care about dependent services
  • The Case of the Runaway Internet Explorer
    Noticed a CPU spike and hovered over Process Explorer to see culprit:
    That was unexpected, because had just installed Adobe Acrobat Reader and exited Internet Explorer
    IE’s window wasn’t visible, but it was still in the process list
  • The Case of the Runaway Internet Explorer: Investigation
    The thread had a generic start address:
    Required deeper investigation…
  • Call Stacks
    Sometimes a thread start address doesn’t tell you what a thread is doing
    The stack might provide a hint:
    The stack is a per-thread region of memory that records a history of function nesting
    The bottom frame (Function 3) is where the thread will continue executing
    Function 1
    Function 2
    Function 3
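The same nesting can be reproduced in a few lines. This sketch uses Python's own call stack as a stand-in for a Windows thread stack; the function names simply mirror the slide.

```python
import inspect


def function_1():
    return function_2()


def function_2():
    return function_3()


def function_3():
    # inspect.stack(0) returns the current frames, innermost first,
    # without reading source context lines.
    frames = inspect.stack(0)
    return [frame.function for frame in frames]


def nested_functions():
    """Return the three innermost function names on the stack."""
    return function_1()[:3]
```

`nested_functions()` yields `function_3`, `function_2`, `function_1`: the most recently entered function comes first, and it is where execution resumes.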
  • Viewing Call Stacks
    Click Stack on the Threads tab to view a thread’s call stack
    Lists functions in reverse chronological order
    Note that start address on Threads tab is different than first function shown in stack
    This is because all threads created by Windows programs start in a library function in Kernel32.dll which calls the programmed start address
  • The Case of the Runaway Internet Explorer: Stack Investigation
    I double-clicked on the thread to see its stack:
  • The Case of the Runaway Internet Explorer: What is GP.OCX?
    Opened DLL view to see DLL’s version information:
    DLL Search Online didn’t return any useful results
  • The Case of the Runaway Internet Explorer: Solved
    Searched for NOS Microsystems:
    Conclusion: Adobe uses gp.ocx, which had hit an infinite-loop bug
    Terminated IE process to stop CPU usage
  • Outline
    Sluggish Performance
    Application Hangs
    Error Messages
    Application Crashes
    Blue Screens
  • The Case of the Logon Script Hangs
    Multiple users complained that logon would take three minutes
    Investigation revealed that all complaints were from Dell Precision 670 workstations
    But only some of the 670 workstations were affected
    User configured Process Explorer to run during logon and saw Lisa Client consuming CPU:
    Lisa Client was custom logon application that checked system for installed applications
    Lisa Client CPU then went idle for several minutes, then exited and system would start acting normally
  • The Case of the Logon Script Hangs (Continued)
    User captured a Process Monitor trace after manually running Lisa Client
    Saw three-minute delay correspond to device error:
    Details column showed IOCTL_SCSI_PASS_THROUGH
    Captured trace on working system and looked for IOCTL_SCSI_PASS_THROUGH operation
    No device error and no delay:
  • The Case of the Logon Script Hangs: Solved
    Device error led user to look at disks:
    Working systems had Fujitsu disks
    Systems with hangs had Seagate
    Solution:
    Temporary: wrote WMI script that queried disk type and would not launch Lisa Client on Seagate systems
    Final: Application developers changed Lisa Client to avoid performing problematic command
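The user's WMI script is not shown in the slides. The sketch below is one possible shape for the temporary workaround, written in Python rather than whatever the user actually used. It assumes the `wmic diskdrive get model` command is available on the machine, and the marker strings used to recognize a Seagate drive are hypothetical: many drives report only a model number, so real markers would have to come from inspecting the affected systems.

```python
import subprocess


def query_disk_models():
    """Windows only: ask WMI (through the wmic CLI) for disk model strings."""
    output = subprocess.run(
        ["wmic", "diskdrive", "get", "model"],
        capture_output=True, text=True, check=True,
    ).stdout
    lines = [line.strip() for line in output.splitlines()]
    # Skip the "Model" header row and any blank lines.
    return [line for line in lines if line and line.lower() != "model"]


def should_launch_lisa_client(models, blocked_markers=("seagate",)):
    """Return False if any disk model contains a blocked marker.

    blocked_markers is an assumption; adjust it to how the problem
    disks actually identify themselves.
    """
    for model in models:
        lowered = model.lower()
        if any(marker in lowered for marker in blocked_markers):
            return False
    return True
```

A logon script would call `should_launch_lisa_client(query_disk_models())` and start Lisa Client only when it returns `True`.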
  • Outline
    Sluggish Performance
    Application Hangs
    Error Messages
    Application Crashes
    Blue Screens
    Undocumented Settings
  • The Case of the MMC Startup Failure
    User would get an error every time they started an MMC snapin:
  • The Case of the MMC Startup Failure: Solved
    Ran Process Monitor and saw an Access Denied error on an IE registry key:
    Checked permissions and Administrators had no access
    Solution: added full-access for Administrators and MMC started successfully
  • The Case of the Favorite that Wouldn’t Save
    User tried to change the URL for one of his IE favorites:
    Trying to save a new favorite resulted in a similar error:
  • The Case of the Favorite that Wouldn’t Save: Solved
    Captured a Process Monitor trace:
    AccessChk showed that folder was Medium Integrity (IE requires Low):
    Fixed integrity with Icacls and problem solved
  • The Case of the Persistent Executable
    Noticed that opening volumes in Explorer was really slow
    Volume context menu indicated presence of Autorun.inf
  • The Case of the Persistent Executable (Continued)
    Files reappeared after deleting, so monitored activity with Process Monitor
    File was recreated by Explorer, so looked at stack
  • Viewing Autostarts
    Use Autoruns to see what’s configured to start when the system boots and you login
    Windows MsConfig shows only a subset of the defined autostart locations
    MsConfig doesn’t show as much information
  • The Case of the Persistent Executable (Solved)
    Process Explorer DLL search showed that amvo.dll loaded into Explorer and all its children
    Found amv0.exe and used Autoruns to delete it from the system Run key
  • Outline
    Sluggish Performance
    Application Hangs
    Error Messages
    Application Crashes
    Blue Screens
  • Application Crashes
    In most cases, there’s nothing you can do about application crashes
    They are caused by a bug in the program
    Only the developer can fix a bug
    However, the crash may be caused by misconfiguration or an extension (a plugin)
    Monitor the application’s crash with Process Monitor if it’s reproducible
    Look for extensions in the crash file with Windbg
  • Finding the Crash Dump
    On pre-Vista systems, finding the dump file is easy:
  • Attaching to the Dying Process
    Vista doesn’t save crash dumps for most crashes
    Only if Microsoft requests a dump for study and you send it in
    When a crash occurs, don’t dismiss the crash dialog:
    Launch Windbg and attach to the process
    You can save a dump with the .dump command
  • Identifying the Crashed Process
    On Vista, the process name might not be enough to identify the instance that’s crashed:
    To determine the PID of the crashed instance, look at WerFault’s command line:
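Once you have copied WerFault's command line (for example from Process Explorer's process properties), extracting the PID is a one-line parse. This sketch assumes the crashed process is passed after a `-p` switch, which is how such command lines commonly look; the sample command line in the usage note is invented.

```python
import re


def crashed_pid(werfault_command_line):
    """Return the integer after a '-p' switch, or None if there is none.

    Assumes the crashed process ID follows '-p' on the command line.
    """
    match = re.search(r"(?:^|\s)-p\s+(\d+)", werfault_command_line)
    return int(match.group(1)) if match else None
```

For a command line such as `C:\Windows\system32\WerFault.exe -u -p 4960 -s 1388`, the function returns `4960`, which you can then match against the PID column to pick the right instance.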
  • Enabling Dump Archiving on Vista and Windows Server 2008
    Or you can configure Vista SP1 and Windows Server 2008 to always generate and save a dump file
    Create a key named: HKLM\Software\Microsoft\Windows\Windows Error Reporting\LocalDumps
    Dumps go to %LOCALAPPDATA%\CrashDumps
    Override with a DumpFolder value (REG_EXPAND_SZ)
    Limit dump history with a DumpCount value (DWORD)
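One way to apply these settings is with the built-in `reg add` command from an elevated prompt. The sketch below only builds the command lines and leaves running them to you. The folder and count shown in the usage note are example values, not recommendations from the talk.

```python
LOCAL_DUMPS_KEY = (
    r"HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps"
)


def local_dumps_commands(dump_folder=None, dump_count=None):
    """Build 'reg add' argument lists that enable local dump archiving.

    The first command creates the key; the optional ones set the
    DumpFolder (REG_EXPAND_SZ) and DumpCount (REG_DWORD) values.
    """
    commands = [["reg", "add", LOCAL_DUMPS_KEY, "/f"]]
    if dump_folder is not None:
        commands.append([
            "reg", "add", LOCAL_DUMPS_KEY, "/v", "DumpFolder",
            "/t", "REG_EXPAND_SZ", "/d", dump_folder, "/f",
        ])
    if dump_count is not None:
        commands.append([
            "reg", "add", LOCAL_DUMPS_KEY, "/v", "DumpCount",
            "/t", "REG_DWORD", "/d", str(dump_count), "/f",
        ])
    return commands
```

For example, `local_dumps_commands(r"D:\Dumps", 10)` returns three argument lists that can be passed to `subprocess.run` on the target machine, or printed and run by hand.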
  • Analyzing a Crash
    Basic crash dump analysis is easy and it might tell you the cause
    Requires Windbg and symbol configuration
    Once the dump is loaded, find the faulting thread
    The debugger might identify it
    If the debugger doesn’t, examine each thread stack looking for “fault”, “exception”, or “error” names
    Examine the stack of the faulting thread to look for third-party plugins
    If you suspect an extension:
    Check for a new version
    Uninstall it if the problem persists
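The manual scan described above can be sketched as two small filters over stack text copied out of the debugger. The input format (a mapping from thread number to `module!function` strings) and all sample frames, including the `exampleext` module, are hypothetical; a real stack listing would need parsing first.

```python
import re

KEYWORDS = re.compile(r"fault|exception|error", re.IGNORECASE)

# Hypothetical stacks: thread number -> frames, innermost first.
SAMPLE_STACKS = {
    0: ["ntdll!WaitFrame", "user32!GetMessageFrame", "explorer!MainLoop"],
    7: [
        "kernel32!UnhandledExceptionFilter",
        "ntdll!KiUserExceptionDispatcher",
        "exampleext!ContextMenuHandler",
        "shell32!QueryContextMenu",
    ],
}


def suspicious_frames(stacks):
    """Return, per thread, the frames whose names hint at a fault."""
    hits = {}
    for thread_id, frames in stacks.items():
        matched = [frame for frame in frames if KEYWORDS.search(frame)]
        if matched:
            hits[thread_id] = matched
    return hits


def third_party_modules(frames, known_modules):
    """List modules on a stack that are not in the known (OS) set."""
    modules = []
    for frame in frames:
        module = frame.split("!", 1)[0].lower()
        if module not in known_modules and module not in modules:
            modules.append(module)
    return modules
```

Here `suspicious_frames(SAMPLE_STACKS)` flags thread 7, and `third_party_modules(SAMPLE_STACKS[7], {"kernel32", "ntdll", "shell32"})` leaves `exampleext` as the extension to investigate.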
  • The Case of the Explorer Context Menu Crash
    Explorer would randomly crash when the user right-clicked on a file
    Attached to process and executed !analyze -v:
    Didn’t know what muangys.dll was and because module was unloaded, Windbg provided no information
  • The Case of the Explorer Context Menu Crash (Cont)
    Ran Process Explorer and looked at Explorer DLL view to find muangys.dll:
    File had no version information, but Strings identified the company and application:
  • The Case of the Explorer Context Menu Crash: Solved
    Was part of Icon editing software, which developer relied upon
    No newer version
    Solution: disable shell extension with Autoruns
  • Outline
    Sluggish Performance
    Application Hangs
    Error Messages
    Application Crashes
    Blue Screens
  • Crashes and Hangs
    Windows has various components that run in Kernel Mode, the highest privilege mode of the OS
    OS components: Ntoskrnl.exe, Hal.dll
    Drivers: Ntfs.sys, Tcpip.sys, device drivers
    Kernel-mode components are privileged extensions to the OS that have to adhere to various rules
    Not accessing invalid memory
    Accessing memory at the right “Interrupt Request Level”
    Not causing resource deadlocks
    When a kernel-mode component performs an illegal operation, Windows crashes (blue screens)
    Crashing helps preserve the integrity of user data
    A resource deadlock can hang the system
  • Online Crash Analysis
    When you reboot after a crash, Windows offers to upload it to Microsoft Online Crash Analysis (OCA)
    Automated server generates a thumbprint of the crash and uses it as a key in a database
    If the database has an entry, the user is told the cause and directed at a fix
  • Basic Crash Dump Analysis
    Many times OCA doesn’t know the cause:
    Basic crash dump analysis is easy and it might tell you the cause
    Requires Windbg and symbol configuration
    Dump files are in either:
    \Windows\Memory.dmp: Vista and servers
    \Windows\Minidump: Windows 2000 Pro and Windows XP
  • The Case of the Crashed Phone Call
    Laptop crashed during a Skype VOIP call
    User reconnected and system crashed again
    Minidump file pointed at Intel wireless driver:
  • The Case of the Crashed Phone Call (Cont)
    Looked at file properties to determine what device the driver was for:
    Found device in Device Manager:
  • The Case of the Crashed Phone Call (Cont)
    Right-clicked and checked Windows Update for newer driver:
    Need to check OEM site, so had to find version number
  • The Case of the Crashed Phone Call: Solved
    OEM site had older version:
    Intel site had newer one:
    Installed and crashes stopped
  • Summary and More Information
    A few basic tools and techniques can solve seemingly impossible problems
    I learn by always trying to determine the root cause
    Resources:
    Webcasts of the two previous “Case of the Unexplained” talks
    Sysinternals->Mark’s Webcasts
    Sysinternals Video Library: in-depth dive on tools and troubleshooting
    My blog
    Windows Internals: understand the way the OS works
    If you’ve solved one, send me a description, screenshots and log files!
    I’ll send you a signed copy of Windows Internals
  • www.microsoft.com/teched
    Sessions On-Demand & Community
    www.microsoft.com/learning
    Microsoft Certification & Training Resources
    http://microsoft.com/technet
    Resources for IT Professionals
    http://microsoft.com/msdn
    Resources for Developers
  • Track Resources
    • Want to find out which Windows Client sessions are best suited to help you in your deployment lifecycle?
    • Want to talk face-to-face with folks from
    the Windows Product Team?
    Meet us today at the
    Springboard Series Lounge, or visit us at www.microsoft.com/springboard
    Springboard Series
    The Springboard Series empowers you to select the right resources, at the right technical level, at the right point in your Windows® Client adoption and management process. Come see why Springboard Series is your destination for Windows 7.
  • Required Slide
    © 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
    The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.