3. YOUR PRESENTER
• John Q Martin
o Sales Engineer for SQL Sentry
o Worked with SQL Server for ~10 years
o Consultant, SQL DBA, Dev & BI Developer
o Former Microsoft Premier Field Engineer
• Contact Information
o Email: Jmartin@SQLSentry.com
o Blog: http://blogs.sqlsentry.com/author/JohnMartin/
o Twitter: @SQLDiplomat
o LinkedIn: https://uk.linkedin.com/in/johnqmartin
4. AGENDA
• CPU Monitoring
• Memory Monitoring
• Storage Monitoring
• SQL Server Monitoring
o Monitoring Counters
o Wait Stats
o DMVs
o Events
6. MONITORING FUNDAMENTALS
• Monitor over time, keep the captured data as it will be invaluable
o Don’t just grab everything “just in case”
• Use historical data to create baselines
o Baselines let you spot when regular events or time periods are ‘out of band’
• Historical monitoring data can be used to perform trend analysis and capacity planning.
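A minimal sketch of the baseline idea above, assuming historical samples have already been grouped by hour of day. The function names, the sample values, and the three-standard-deviation tolerance are illustrative assumptions, not something taken from the slides.

```python
from statistics import mean, stdev

def build_baseline(samples_by_hour):
    """Return {hour: (mean, stdev)} for each hour with at least two samples."""
    return {
        hour: (mean(values), stdev(values))
        for hour, values in samples_by_hour.items()
        if len(values) >= 2
    }

def is_out_of_band(baseline, hour, value, tolerance=3.0):
    """True when value is more than `tolerance` standard deviations from the mean."""
    avg, sd = baseline[hour]
    return abs(value - avg) > tolerance * sd

# Hypothetical CPU % history: a quiet morning hour and a nightly batch window.
history = {
    9: [40.0, 42.0, 38.0, 41.0, 39.0],
    2: [85.0, 88.0, 90.0, 87.0, 86.0],
}
baseline = build_baseline(history)

print(is_out_of_band(baseline, 9, 85.0))  # True: unusual for 09:00
print(is_out_of_band(baseline, 2, 85.0))  # False: normal for the 02:00 batch window
```

The same reading (85%) is out of band at one time of day and normal at another, which is why the baseline is kept per time period rather than as a single global threshold.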
7. CPU METRICS
• Process
o % Processor Time
o % Privileged Time
• Processor
o % Processor Time
o % Privileged Time
o % DPC Time
8. CPU METRICS
• Important to monitor each CPU as well as the total CPU usage.
o Helps identify potential MAXDOP issues.
o Lets you see whether there are possible misconfigurations in the system outside of SQL
Server
• Excessive DPC and Privileged time can indicate issues elsewhere in the
system such as networking or storage.
• Monitoring the SQL Server Processes will allow you to see how much time is
spent on SQL Server
o Depending on storage you can capture more process instances
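The sketch below shows why per-CPU values matter alongside the total. It assumes one sample of per-CPU % Processor Time values has already been collected; the function names and the 40-point skew threshold are hypothetical choices for illustration.

```python
def cpu_summary(per_cpu_pct, skew_threshold=40.0):
    """Summarise one sample of per-CPU '% Processor Time' values."""
    if not per_cpu_pct:
        raise ValueError("need at least one CPU sample")
    total = sum(per_cpu_pct) / len(per_cpu_pct)
    spread = max(per_cpu_pct) - min(per_cpu_pct)
    return {
        "total_pct": total,
        "spread_pct": spread,
        "skewed": spread > skew_threshold,  # assumed threshold, tune per system
    }

def privileged_share(privileged_pct, processor_pct):
    """Fraction of busy time spent in privileged (kernel) mode."""
    if processor_pct <= 0:
        return 0.0
    return privileged_pct / processor_pct

# Hypothetical sample: one core saturated while the other three are idle.
summary = cpu_summary([95.0, 5.0, 6.0, 4.0])
print(summary)                       # total looks healthy at 27.5%, but the spread is 91 points
print(privileged_share(30.0, 60.0))  # 0.5
```

A low total with a large spread is the pattern that the total-only view hides.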
10. MEMORY METRICS
• NUMA Node Memory
o Available MBytes
o Total MBytes
• Memory
o Available MBytes
o Page Faults/sec
o Page Reads/sec
o Page Writes/sec
11. MEMORY MONITORING
• Understand what volumes of data are being read into and out of memory.
• Tracking memory use by NUMA node can have benefits depending on the
configuration of the system.
o Differences in the amount of memory allocated to each NUMA node can affect processing in
the CPUs within each node.
• Ensuring that there is sufficient free memory is important to maintaining a
stable system.
13. STORAGE METRICS
• Logical Disk
o Same counters as Physical Disk
o Depends on your disk configuration: if there is a 1:1 mapping between physical & logical, then use Physical metrics.
• Physical Disk
o Disk Read Bytes/sec
o Disk Write Bytes/sec
o Disk Reads/sec
o Disk Writes/sec
o Split IO/sec
o Current Disk Queue Length
14. STORAGE MONITORING
• Key monitoring elements for storage
o IOPS
o Throughput
o Latency
• Monitor amount of space used
o Sample rate does not need to be frequent; it can be minutes or hours rather than seconds.
• Understand the configuration of the disks as to whether you need to use
Logical and/or Physical Disk counters.
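As a rough sketch, IOPS and throughput can be derived from the Physical Disk counters listed earlier. The function name and sample values below are assumptions. Latency is not derivable from those four counters and would need a separate latency counter.

```python
def disk_summary(reads_per_sec, writes_per_sec,
                 read_bytes_per_sec, write_bytes_per_sec):
    """Combine Disk Reads/Writes/sec and Disk Read/Write Bytes/sec samples."""
    iops = reads_per_sec + writes_per_sec
    throughput = read_bytes_per_sec + write_bytes_per_sec
    avg_io_bytes = throughput / iops if iops else 0.0
    return {
        "iops": iops,
        "throughput_mb_s": throughput / (1024 * 1024),
        "avg_io_kb": avg_io_bytes / 1024,
    }

# Hypothetical sample: 800 reads/sec and 200 writes/sec.
disk = disk_summary(800.0, 200.0, 52_428_800.0, 13_107_200.0)
print(disk)  # {'iops': 1000.0, 'throughput_mb_s': 62.5, 'avg_io_kb': 64.0}
```

The average IO size falls out of the other two values and helps tell a small-random workload from a large-sequential one.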
15. SQL SERVER STORAGE DMVS
• sys.dm_io_virtual_file_stats()
o Gives depth to the reads & writes into each database file
o Allows you to derive Read/Write balance for data files
o IO operations and Bytes written
• sys.dm_io_pending_io_requests
o Shows outstanding file IOs for SQL Server database files.
• sys.dm_db_index_physical_stats()
o Gather index fragmentation details
o Can cause lots of IO, use sparingly on large databases
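The values returned by sys.dm_io_virtual_file_stats() are cumulative, so per-interval figures come from the difference between two snapshots. The sketch below assumes the snapshots have already been fetched into simple records; the `FileStats` class and `interval_metrics` function are hypothetical helpers, while the field names mirror columns of the DMV.

```python
from dataclasses import dataclass

@dataclass
class FileStats:
    """One snapshot of a database file's cumulative IO counters."""
    sample_ms: int
    num_of_reads: int
    num_of_writes: int
    io_stall_read_ms: int
    io_stall_write_ms: int

def interval_metrics(prev, curr):
    """Derive per-interval IOPS, latency and read/write balance from two snapshots."""
    seconds = (curr.sample_ms - prev.sample_ms) / 1000.0
    reads = curr.num_of_reads - prev.num_of_reads
    writes = curr.num_of_writes - prev.num_of_writes
    total = reads + writes
    return {
        "iops": total / seconds if seconds > 0 else 0.0,
        "read_latency_ms": (curr.io_stall_read_ms - prev.io_stall_read_ms) / reads if reads else 0.0,
        "write_latency_ms": (curr.io_stall_write_ms - prev.io_stall_write_ms) / writes if writes else 0.0,
        "read_pct": 100.0 * reads / total if total else 0.0,
    }

# Two made-up snapshots taken 60 seconds apart.
prev = FileStats(1_000_000, 10_000, 5_000, 50_000, 10_000)
curr = FileStats(1_060_000, 13_000, 6_000, 74_000, 13_000)
metrics = interval_metrics(prev, curr)
print(metrics["read_latency_ms"], metrics["write_latency_ms"], metrics["read_pct"])  # 8.0 3.0 75.0
```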
16. SQL SERVER METRICS
• Buffer Manager
o Buffer Cache Hit Ratio
o Checkpoint pages/sec
o Page Reads/sec
o Page Writes/sec
• Access Methods
o Forwarded Records/sec
o FreeSpace Scans/sec
o Page Splits/sec
o Workfiles Created/sec
o Worktables Created/sec
17. SQL SERVER METRICS
• Buffer Node
o Page Life Expectancy
o Local node page lookups/sec
o Remote node page lookups/sec
• Databases
o Active Transactions
o Log Bytes Flushed/sec
o Log Flush Wait Time
o Log Flush Waits/sec
o Log Flushes/sec
o Percent Log Used
18. PAGE LIFE EXPECTANCY
• PLE value on its own is meaningless. Discuss.
• Value needs to be given context
o How large is the buffer pool
o What is my IO sub-system capability
o What % of the IO Channel is used to maintain the PLE value
• Investigate changes
o What happened when PLE suddenly dropped?
• Monitor at the Buffer Node Level
o Global PLE value will not equal the arithmetic mean of the node values.
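One way to give a PLE value context is to turn it into an approximate read rate and compare that with what the IO sub-system can deliver. This sketch assumes the whole buffer pool turns over once per PLE interval, which is a rough approximation; the function names and figures are illustrative.

```python
def churn_mb_per_sec(buffer_pool_mb, ple_seconds):
    """Approximate read rate needed to sustain a given PLE (assumes full turnover)."""
    if ple_seconds <= 0:
        raise ValueError("PLE must be positive")
    return buffer_pool_mb / ple_seconds

def io_channel_pct(churn_mb_s, io_capacity_mb_s):
    """Share of the IO sub-system's throughput consumed by that churn."""
    return 100.0 * churn_mb_s / io_capacity_mb_s

# The same PLE of 512 seconds on two hypothetical servers with 512 MB/s of IO capacity.
large = churn_mb_per_sec(64 * 1024, 512)  # 64 GB buffer pool
small = churn_mb_per_sec(4 * 1024, 512)   # 4 GB buffer pool

print(large, io_channel_pct(large, 512.0))  # 128.0 25.0
print(small, io_channel_pct(small, 512.0))  # 8.0 1.5625
```

An identical PLE reading costs a quarter of the IO channel on the larger server and under 2% on the smaller one, which is the sense in which the raw value needs context.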
19. PAGE LIFE EXPECTANCY
• Look for changes and see what
else was going on
o Large batch job/report
o Someone runs DBCC DROPCLEANBUFFERS
• Frequent monitoring required as
changes can happen fast
o Seconds to minutes for monitoring interval.
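A small sketch of spotting sudden PLE drops in a series of samples, so the timestamps can be lined up with whatever else was running. The function name, the 50% drop threshold, and the sample data are assumptions for illustration.

```python
def find_ple_drops(samples, drop_fraction=0.5):
    """Return timestamps where PLE fell by more than drop_fraction since the previous sample.

    samples: list of (timestamp_seconds, ple_seconds) in time order.
    """
    drops = []
    for (_, prev_ple), (ts, curr_ple) in zip(samples, samples[1:]):
        if prev_ple > 0 and (prev_ple - curr_ple) / prev_ple > drop_fraction:
            drops.append(ts)
    return drops

# Hypothetical 15-second samples: PLE climbs steadily, then collapses at t=45.
ple_samples = [(0, 3000), (15, 3015), (30, 3030), (45, 40), (60, 55)]
drop_times = find_ple_drops(ple_samples)
print(drop_times)  # [45]
```

With a sampling interval of several minutes, the collapse and the start of the recovery could fall between samples and be missed.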
21. SQL SERVER EVENTS
• Monitor SQL Agent for Failed Jobs
• Monitor for 823, 824, 825 errors
o Can indicate storage or corruption issues
o Make use of Agent Alerts or tools to scan the agent log
• Monitor and manage the dbo.suspect_pages table in MSDB
o SQL Server will track incidences of corrupt pages here
o Limited to 1,000 rows, so it needs to be managed if there is anything here
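Where Agent Alerts are not available, a log scan can be sketched along these lines. It assumes log entries contain text of the form "Error: 824, Severity: 24, State: 2."; the sample lines below are made up for illustration and the function name is hypothetical.

```python
import re

# Matches the error number in entries such as "Error: 824, Severity: 24, State: 2."
IO_ERROR_PATTERN = re.compile(r"Error:\s*(823|824|825)\b")

def find_io_errors(log_lines):
    """Return (error_number, line) for each line reporting error 823, 824 or 825."""
    hits = []
    for line in log_lines:
        match = IO_ERROR_PATTERN.search(line)
        if match:
            hits.append((int(match.group(1)), line.strip()))
    return hits

# Made-up sample lines, not real log output.
sample_log = [
    "2016-01-01 02:00:00.00 spid55 Error: 824, Severity: 24, State: 2.",
    "2016-01-01 02:00:01.00 Logon Error: 18456, Severity: 14, State: 8.",
    "2016-01-01 02:05:00.00 spid61 Error: 823, Severity: 24, State: 2.",
]
io_errors = find_io_errors(sample_log)
print([number for number, _ in io_errors])  # [824, 823]
```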
22. SUMMARY
• Identify base metrics that you should be capturing and a capture frequency
o Understand why you are collecting them and how to use them effectively
• Identify specific business events and cycles and create baselines to allow for
tracking performance over multiple iterations and time
• Look for correlation between performance metrics
o Make use of CORREL function in Excel if needed
• Track changes to the environment, code, applications, etc. This will help
supplement the monitoring data.
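For correlation outside Excel, the Pearson coefficient that CORREL computes can be sketched in a few lines. The function name and the two sample series are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Hypothetical samples taken at the same five points in time.
cpu_pct = [20.0, 35.0, 50.0, 65.0, 80.0]
ple_seconds = [3000.0, 2400.0, 1800.0, 1200.0, 600.0]

r = pearson(cpu_pct, ple_seconds)
print(round(r, 6))  # -1.0: PLE falls as CPU rises in this made-up data
```

A value near +1 or -1 suggests the two metrics move together; it does not by itself show which one is driving the other.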
24. THANK YOU!
• Slides will be available at http://blogs.sqlsentry.com
• More information at:
o SQLSkills, et al
• E-mail ebooks@sqlsentry.com for free copies of our e-books:
o Just tell them where you met me
• My contact info for other questions:
o Email: Jmartin@SQLSentry.com
o Twitter: @SQLDiplomat