1. SharePoint Pro: Six Steps for Proactively Monitoring SharePoint
• Ron Charity
• roncharity@gmail.com
• 416-300-6033
2. Read me (Remove when presenting)
• This is a draft document
– Reviews are required by Penton and Metalogix
– Review each page and notes
– Edit as you see fit and highlight change in RED
• The presenter has 15-20 minutes to present
• The presentation contains 15-20 slides to
fit the time slot
3. Abstract (Remove when presenting)
This webinar takes you through a six-step process for
proactively monitoring SharePoint health and capacity
and for troubleshooting the root cause of problems.
SharePoint is a platform of services and functionality
that enables organizations to create valuable
solutions, but with that value come technical
complexity and risk. Common problems that companies
face are missed SLAs, lack of visibility into
performance and health, and an inability to
proactively manage technical risks or to speed up and
optimize troubleshooting.
4. BIO
Ron Charity
A published technologist with 20+ years in
infrastructure and application consulting.
Experience working in the US, Canada,
Australia and Europe. Has worked with
SharePoint and related technologies since
2000.
Plays in several bands, rides a Harley Nightster,
and enjoys travel, especially to beach
destinations.
5. Agenda
• Points of view on the problem
• Impact to your organization
• How tools can help
• Six steps to address your monitoring needs
• Next steps
• Suggested reading
• Contact information
6.
7. Points of view on the problem
• The business user expects perfect and consistent service and
when things go wrong it should be easy to fix
– Not experts
– Just want it to work so they can do their day job
• IT is held/measured based on an SLA
– Expectation of service which is difficult to manage
– Lack visibility and insight
• The SharePoint team needs experienced staff, funding, tools and
process to proactively manage
– Capacity and performance insight
– Identify problems, troubleshoot and find root cause
– Provide tangible proof of problem
8. Impact to organization
• The Business experiences
–Productivity loss
–Frustration with performance and outages
–Damage to brand
• IT / vendor not able to meet SLA
–Noise and pain as a result
–Financial penalty impacts and scores
–Loss of credibility
• SharePoint team
–Reacts instead of acting proactively
–Can’t predict impact of changes
–Wastes time and resources finding root cause
–Loses credibility
9. How tools can help
• Provides IT and business with factual reporting against SLAs
(Confidence IT / vendor is delivering)
–Removes FUD / guessing from conversation
• Enables the IT / SharePoint team to:
–Assess impact of changes in Quality Assurance
–Gain visibility into operational issues
–Proactively monitor and prevent capacity and performance issues
before they happen - Alert on issues
–Report on performance and capacity / trends, predict impact of
changes
• Provides techs with visibility into health regarding the
–Physical hardware and OS
–Application Server and SharePoint
–SQL Server
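To make "factual reporting against SLAs" concrete, here is a minimal Python sketch that turns recorded outage minutes into a monthly availability figure. The 99.9% target, the function names, the report shape and the outage numbers are all illustrative assumptions, not values taken from any particular SLA or monitoring product.

```python
def availability_pct(total_minutes: int, outage_minutes: float) -> float:
    """Percentage of the reporting period the service was available."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return 100.0 * (total_minutes - outage_minutes) / total_minutes


def sla_report(total_minutes: int, outages: list, target_pct: float) -> dict:
    """Summarize outages against an availability target (hypothetical report shape)."""
    downtime = sum(outages)
    achieved = availability_pct(total_minutes, downtime)
    return {
        "downtime_minutes": downtime,
        "availability_pct": round(achieved, 3),
        "target_pct": target_pct,
        "sla_met": achieved >= target_pct,
    }


# Made-up example: a 30-day month with two outages of 25 and 40 minutes.
report = sla_report(total_minutes=30 * 24 * 60, outages=[25, 40], target_pct=99.9)
print(report)
```

A report like this gives IT and the business the same number to discuss, which is the point of removing guesswork from the conversation.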
10. Common mistakes
• Using out of box settings without testing
–False Alerts – loses credibility
• Monitoring everything
–Not what’s critical
–Key Information gets lost
–No training and awareness
–Nobody knows about or uses it
• Not integrated in change mgmt
–How to measure impact of change
–Review changes based on historic data / reports
• Not integrating into problem mgmt and escalation process
–Reports go ignored - Alerts as well
• Not assigning alerts to techs, event tracking, follow up
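One common way to cut false alerts from untested default settings is to alert only when a counter stays above its threshold for several samples in a row. The sketch below illustrates that idea under stated assumptions: the 85% CPU threshold, the three-sample window and the sample data are hypothetical, and real monitoring tools expose this behaviour through their own settings.

```python
def sustained_breaches(samples, threshold, min_consecutive):
    """Return the indexes at which a value has stayed above `threshold`
    for `min_consecutive` samples in a row (one alert per sustained episode)."""
    alerts = []
    run = 0
    for i, value in enumerate(samples):
        if value > threshold:
            run += 1
            if run == min_consecutive:
                alerts.append(i)
        else:
            run = 0  # the breach ended, start counting again
    return alerts


# Hypothetical CPU % samples with short spikes and one sustained episode.
cpu = [35, 92, 40, 88, 91, 95, 97, 50, 93, 94]

naive = [i for i, v in enumerate(cpu) if v > 85]          # alert on every breach
tuned = sustained_breaches(cpu, threshold=85, min_consecutive=3)

print(len(naive), "alerts without tuning;", len(tuned), "alert with tuning")
```

The trade-off is a slower alert in exchange for fewer alerts that nobody trusts, so the window length is something to test against your own baseline rather than copy.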
11. Six steps
• The steps are solution-oriented
• Conduct a gap analysis of the steps and apply what you need
• The six steps:
–1 - Technical and process documentation
–2 - Establish a Quality Assurance (QA) Discipline
–3 - Baseline your SharePoint environment
–4 - Operational readiness
–5 - Deploy monitoring tools
–6 – Launch, monitor and refine ongoing
12. Step 1 - Technical and process
documentation
• Document your SharePoint environment
–Servers, storage, network
–Existing process and policy
• Document your process and policy
–Incident tracking and Problem mgmt
–Governance / Escalation process
–Grievances and complaints
–Change mgmt
• Review your third party contracts
–Outsourcing
–Contractors
–Providers
13. Step 2 - Establish a Quality Assurance
Discipline
• Governed and enforced by Executive Mgmt
• Must be integrated into Change Mgmt
• Goal = Enforceable / accountable
• Consists of the following
–Experienced and trained QA staff
–SharePoint environment
–Defect tracking, load generation and monitoring tools
• Formalized documentation and workflow
–Requirements form
–Test Plan
–Test Scripts
–Report and recommendations
14. Step 2 - Establish a Quality Assurance
Discipline
[Figure: before and after comparison (images not recoverable)]
15. Step 3 - Baseline your SharePoint
Environment
• Utilizing QA disciplines
–For new deployments
–For existing
• Use PAL / Vendor recommended counters to start
• Load environment to expected levels incrementally
• For new environments use reports for tweaking SP to deliver
expected performance
• For existing environments use reports as historic comparison
for performance and capacity
• Store reports and learnings for future use – leverage core
learnings / share knowledge
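As a rough illustration of baselining at incremental load levels, the following Python sketch summarizes response-time samples per load level and finds the first level at which the 95th percentile exceeds a target. The sample data, the 2000 ms target and the function names are assumptions made for the example; in practice the samples would come from your load generation and monitoring tools.

```python
import math
import statistics


def summarize(samples):
    """Mean, nearest-rank 95th percentile and max for one load level."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(0.95 * len(ordered)))  # nearest-rank method
    return {
        "mean": statistics.mean(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


# Hypothetical page response times (ms), keyed by concurrent users.
runs = {
    10: [180, 200, 210, 190, 220],
    100: [300, 320, 350, 310, 400],
    500: [900, 1200, 1500, 2600, 1100],
}

baseline = {users: summarize(times) for users, times in runs.items()}

target_ms = 2000  # assumed performance target
first_over = next(
    (users for users in sorted(baseline) if baseline[users]["p95"] > target_ms),
    None,
)

for users in sorted(baseline):
    print(users, baseline[users])
print("First load level over target:", first_over)
```

Storing the `baseline` dictionary alongside the test report gives later changes something factual to be compared against.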
16. Step 4 - Operational Readiness
• Formalized and documented process and policy are put into use
• OPs provided with orientation / training
• Tool training provided
–One to many
–Recordings
–Manuals
• Staffing models adjusted accordingly
• Tool must
–Become integrated into daily work routine
- Alerting, Troubleshooting, Reporting
• Be optimized so it's reliable
–Mix of baselines and actual use in production
–Stakeholders updated
17. Step 5 - Deploy Monitoring Tools
• Physical hardware required is deployed
–Console
–Server
• Tool installed
–Console and Agents on servers
–Security – admins, users
–Counters and thresholds
–Reporting
• Testing to verify correct operation
• Necessary documentation created / updated
• Stakeholders updated
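When configuring counters and thresholds, one possible starting point is to derive warning and critical limits from baseline samples rather than accept defaults. The sketch below uses mean plus two and three standard deviations, which is an assumption chosen for illustration and not a vendor or PAL recommendation; the baseline values are made up.

```python
import statistics


def thresholds_from_baseline(samples, warn_sigmas=2.0, crit_sigmas=3.0):
    """Derive warning/critical limits from baseline samples (assumed rule:
    mean plus a multiple of the population standard deviation)."""
    mean = statistics.mean(samples)
    spread = statistics.pstdev(samples)
    return {
        "warning": mean + warn_sigmas * spread,
        "critical": mean + crit_sigmas * spread,
    }


def classify(value, limits):
    """Map a counter reading to 'ok', 'warning' or 'critical'."""
    if value >= limits["critical"]:
        return "critical"
    if value >= limits["warning"]:
        return "warning"
    return "ok"


# Hypothetical baseline for the Windows counter below (percent CPU).
counter = r"\Processor(_Total)\% Processor Time"
limits = thresholds_from_baseline([40, 45, 50, 55, 60])

for reading in (62, 68, 80):
    print(counter, reading, classify(reading, limits))
```

This rule only suits counters where higher is worse; counters such as available memory would need the comparison reversed.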
19. Step 6 – Launch, Monitor and Refine
• Monitoring service is officially launched
• Formal communications
• OPs staff expected to use tool
• Monitoring tools integrated into process and teams
–Feedback applied to monitoring, troubleshooting etc.
–Threshold accuracy / changes?
–Need for monitoring other counters
• Important to
–Demonstrate value
–Ensure monitoring is accurate
–Alerting is done when required
–Ensure key counters are monitored
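To review threshold accuracy after launch, it can help to measure how many alerts per counter turned out to be real issues. This is a minimal sketch of that review, assuming an alert log in which someone has marked each alert as real or false; the log format, the counter names and the 50% cut-off are hypothetical.

```python
from collections import defaultdict


def alert_precision(alert_log):
    """Fraction of alerts per counter that were confirmed as real issues."""
    totals = defaultdict(int)
    real = defaultdict(int)
    for counter, was_real in alert_log:
        totals[counter] += 1
        if was_real:
            real[counter] += 1
    return {counter: real[counter] / totals[counter] for counter in totals}


# Hypothetical month of alerts: (counter, confirmed as a real issue?)
log = [
    ("cpu", True), ("cpu", False), ("cpu", False), ("cpu", False),
    ("disk_latency", True), ("disk_latency", True), ("disk_latency", False),
]

precision = alert_precision(log)
needs_review = sorted(c for c, p in precision.items() if p < 0.5)

print(precision)
print("Thresholds to revisit:", needs_review)
```

Counters that land in `needs_review` are candidates for a threshold change, which is the kind of feedback this step asks the team to apply.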
21. Next Steps
• Document the SLA for your SharePoint service
offering – get sign-off from stakeholders
• Create a business case to obtain funding
• Create project controls for a POC (charter,
communication plan, schedule, risk plan, test
plan etc.)
• Build a POC environment, carry out tests and
document results
• Review findings and decide on next steps
22. Next Steps
• Install monitoring tools and use best guess and
vendor input as a start
• Baseline and trend environment to optimize
your monitoring
• Incorporate baselining into your change control
• Integrate alerts into your help desk software to
register issues and action
• Review reports monthly with stakeholders
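Trending can be as simple as fitting a straight line through periodic capacity readings and projecting when a limit will be reached. The sketch below does this with an ordinary least-squares fit; the weekly storage readings and the 500 GB capacity are invented for the example, and real growth is rarely perfectly linear.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(days, used_gb, capacity_gb):
    """Days from the last reading until the trend line reaches capacity.
    Returns None when usage is flat or shrinking."""
    slope, intercept = linear_fit(days, used_gb)
    if slope <= 0:
        return None
    day_full = (capacity_gb - intercept) / slope
    return max(0.0, day_full - days[-1])


# Hypothetical weekly readings of content database storage.
days = [0, 7, 14, 21, 28]
used_gb = [400, 414, 428, 442, 456]

remaining = days_until_full(days, used_gb, capacity_gb=500)
print("Estimated days until storage is full:", remaining)
```

A figure like this is useful in the monthly stakeholder review because it turns a capacity graph into a date that someone has to plan for.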
23. Further Reading
• Best practices operational excellence -
http://technet.microsoft.com/en-us/library/cc850692(v=office.14).aspx
• Managing application life cycle -
http://msdn.microsoft.com/en-us/library/ff649081.aspx
• Plan for monitoring -
http://technet.microsoft.com/en-us/library/jj219701(v=office.15).aspx
• Planning worksheets -
http://technet.microsoft.com/en-us/library/cc262451(v=office.15).aspx
• Performance Analysis of Logs (PAL) Tool -
https://pal.codeplex.com/
24. Contact Information
• Questions? Ideas or suggestions you want to
share?
• Text chat or contact me at
– roncharity@gmail.com
– ca.linkedin.com/in/ronjcharity/
Editor's Notes
Draft
Version 1.0
Date 1/16/2015
Left blank intentionally
Mental note >> What's the point? Why should they care?
There are many ways to approach this topic
Being allotted 20 minutes, I must briefly touch on important areas
I’m a consultant / architect – take a holistic approach – multiple view points
Your level of success depends on the success criteria you are managing to
I will be prescriptive throughout the webinar and will be available through email
Lots to cover…
Successful people usually
Have a strategy
Have a solid network
Have some help
Senior sponsor
You require a strategy and plan to be truly successful.
Success often depends on specific points of view.
Maneuver carefully around fiefdoms and other politics.
Think of it this way…
Coyote as your sponsor and Gorn as all the politics
The business user expects perfect and consistent service and when things go wrong it should be easy to fix
Not experts
Just want it to work so they can do their day job
IT is held/measured based on an SLA
Expectation of service which is difficult to manage
Lack visibility and insight
Lacks factual data
Mostly hearsay or word of mouth
The SharePoint team needs experienced staff, funding, tools and process to proactively manage
capacity and performance
identify problems, troubleshoot and find root cause
Provide tangible proof of problem
Especially when it's not a SP problem
Network or storage
Nasty site customization or list
The Business experiences
Productivity loss
Frustration with performance and outages
Damage to brand
IT / vendor not able to meet SLA
Noise and pain as a result
Financial penalty impacts and scores
Loss of credibility
SharePoint team
reacts instead of acting proactively
can't predict impact of changes
wastes time and resources finding root cause
loses credibility
Provides IT and business with factual reporting against SLAs (Confidence IT / vendor is delivering)
Removes FUD from conversation
Enables the IT / SharePoint team to:
Assess impact of changes in Quality Assurance
Gain visibility into operational issues
Proactively monitor and prevent capacity and performance issues before they happen
Alert on issues
Record and send events to help desk software
Report on performance and capacity / trends, predict impact of changes
Factual data, not pure opinion – substantiated information
Demonstrates competence
Provides techs with visibility into Health regarding the
Physical hardware and OS – CPU, Disk, LAN I/O etc.
Application Server and SharePoint – key IIS and .Net (pages served, .Net Garbage collection etc.)
SQL Server – disk I/O, buffering etc.
Using out of box settings without testing
False Alerts – loses credibility
Monitoring everything
Not what's critical
Key Information gets lost
No training and awareness
Nobody knows about or uses it
Not integrated in change mgmt
How to measure impact of change
Review changes based on historic data / reports
Not integrating into problem mgmt and escalation process
Reports go ignored
Alerts as well
Not sending / assigning alerts to techs
No event tracking
No accountability
No follow up
The steps are solution-oriented
You might conduct a gap analysis and find you have some steps addressed, or none
Based on work performed over several years stabilizing internal and external environments that experienced $1-2 million in additional costs annually due to outages and performance issues
Step 1 - Technical and process documentation
Step 2 - Establish a Quality Assurance (QA) Discipline
Step 3 - Baseline your SharePoint environment
Step 4 - Operational readiness
Step 5 - Deploy monitoring tools
Step 6 – Launch, monitor and refine ongoing
Step 1 - Technical and process documentation
Document your SharePoint environment
Servers, storage, network
Existing process and policy
Document your process and policy
Incident tracking and Problem mgmt
How incidents are recorded and tracked
Who is assigned to incidents
Governance / Escalation process
Grievances and complaints
Issues dragging out
When to involve third parties (e.g. Microsoft)
Change mgmt
Review your third party contracts
Outsourcing
Contractors
Providers
Step 2 - Establish a Quality Assurance (QA) Discipline
Governed and enforced by Executive Mgmt
Must be integrated into Change Mgmt
Goal = Enforceable / accountable
Consists of the following
Experienced and trained QA staff
SharePoint environment
Defect tracking tools
Load generation tools
Monitoring tools
Formalized documentation and workflow
Requirements form
Request form to register testing and provide status
Perhaps list with workflows and email notifications
Formalizes and streamlines
Test Plan
What's being tested, scheduling, priority, outcomes
Specific test cases (memory, CPU, .Net Garbage Collection, Disk I/O)
Test Scripts
Scripts used for testing (automating load generation, actual steps to carry out)
Report and recommendations
Report that documents results of testing, findings, recommendations, learning etc.
Step 2 - Establish a Quality Assurance (QA) Discipline
Sample test case
Sample graph
Step 3 - Baseline your SharePoint environment
Utilizing QA disciplines
For new deployments
Load test environment
Use that as baseline
For existing
Use model QA environment
Work to understand delta between production and QA
Use that as baseline
Use PAL / Vendor recommended counters to start
Load environment to expected levels
Increment to see impacts 10, 100, 500, 1000, 2000 etc.
For new environments
Use reports for tweaking SP to deliver expected performance
Example,
Confirm expected scalability / merit of architecture and configuration
Tweak environment as required and re-baseline
For existing environments
Use reports as historic comparison for performance and capacity
Decline / approve deployment of SPs or new custom code
Prove architectural changes / configuration settings
Example
.NET garbage collection increased with new code deployed
Disk I/O increased with use of new webpart
Use PAL, Microsoft and Vendor recommendations to assess results
Store reports and learnings for future use – leverage core learnings
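The "decline / approve" decision described in these notes can be sketched as a comparison of post-change counter values against the stored baseline. The counter names, the values and the 10% tolerance below are hypothetical; the point is only to show how a baseline turns a deployment review into a factual check.

```python
def review_change(baseline, candidate, tolerance_pct=10.0):
    """Compare post-change counter values with the baseline.
    Returns ('approve' | 'decline', {counter: percent increase over baseline})."""
    regressions = {}
    for counter, base_value in baseline.items():
        change_pct = 100.0 * (candidate[counter] - base_value) / base_value
        if change_pct > tolerance_pct:
            regressions[counter] = round(change_pct, 1)
    decision = "decline" if regressions else "approve"
    return decision, regressions


# Hypothetical QA measurements before and after deploying new custom code.
baseline = {
    "gen2_collections_per_min": 4.0,
    "disk_reads_per_sec": 120.0,
    "avg_page_ms": 350.0,
}
candidate = {
    "gen2_collections_per_min": 9.0,
    "disk_reads_per_sec": 126.0,
    "avg_page_ms": 365.0,
}

decision, regressions = review_change(baseline, candidate)
print(decision, regressions)
```

In this made-up run the garbage collection rate more than doubles, so the change would go back to the developers with the measurement attached.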
Step 4 - Operational readiness
Formalized and documented process and policy are put into use
OPs provided with orientation / training
Tool training provided
One to many
Recordings
Manuals
Staffing models adjusted accordingly
Tool must
Become integrated into daily work routine
Alerting
Troubleshooting
Reporting
Be optimized so it's reliable
Mix of baselines and actual use in production
Necessary documentation created / updated
Stakeholders updated
Step 5 - Deploy monitoring tools
Physical hardware required is deployed
Console
Server
Tool installed
Console and Agents on servers
Security – admins, users
Counters and thresholds
Reporting
Testing to verify correct operation
Necessary documentation created / updated
Stakeholders updated
Step 5 - Deploy monitoring tools
Physical hardware required is deployed
Console
Server
Agents
Help Desk integrated
for event alerting and assignment
Events dispatched to tech
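Help desk integration for alerting and assignment can be pictured as a small routing step between the monitoring tool and the ticketing system. The sketch below is a hypothetical illustration: the team names, the `Ticket` fields and the routing table are assumptions, and a real integration would use whatever interface your help desk software provides.

```python
from dataclasses import dataclass

# Assumed mapping from monitored layer to the team that owns it.
ROUTING = {
    "hardware_os": "infrastructure-team",
    "sharepoint": "sharepoint-team",
    "sql": "dba-team",
}


@dataclass
class Ticket:
    title: str
    assigned_to: str
    severity: str


def alert_to_ticket(layer, counter, value, severity):
    """Turn a monitoring alert into a help desk ticket assigned to an owner."""
    owner = ROUTING.get(layer, "service-desk")  # unknown layers go to triage
    return Ticket(
        title=f"[{severity.upper()}] {counter} = {value}",
        assigned_to=owner,
        severity=severity,
    )


ticket = alert_to_ticket("sql", "Avg. Disk sec/Read", 0.045, "warning")
print(ticket)
```

Assigning every alert to a named owner is what makes event tracking and follow-up possible later.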
Step 6 – Launch, monitor and refine ongoing
Monitoring service is officially launched
Formal communications
OPs staff expected to use tool
Monitoring tools integrated into
Change Controls meetings
Architecture discussions
Problem mgmt
Governance meetings
Feedback from above reviewed and applied to monitoring
Feedback from troubleshooting
Threshold accuracy / changes?
Need for monitoring other counters
Important to
Demonstrate value
Ensure monitoring is accurate
Alerting is done when required
Ensure key counters are monitored
CMM
Assess capabilities and next steps based on maturity
Where are the biggest gaps?
Tools?
Structure?
Processes?
Policy?
Staffing?
Document the SLA for your SharePoint service offering – get sign off from stakeholders
Create a business case to obtain funding
Create project controls for a POC (Charter, Communication plan, schedule, risk plan, test plan etc.)
Build POC environment and carry out tests and document results
Review findings and decide on next steps
Install monitoring tools and use best guess and vendor input as a start
Baseline and trend environment to optimize your monitoring
Incorporate baselining into your change control
Integrate alerts into your help desk software to register issues and action
Review reports monthly with stakeholders