Preventing Server Sickness


Preventing Server Sickness Becoming A Pandemic - Benelux March 2013


  1. Prevent Server Sickness Becoming a Pandemic!
     Gabriella Davis, The Turtle Partnership
     Twitter: gabturtle
  2. Fixing Your Server
     What causes server sickness
     Tools to spot sickness
     Getting Your Server Back to Full Health
  3. Server Sickness
  4. Server Sickness
     The problem with Domino
     How does a server get sick?
       – Vulnerabilities
       – Aging Configurations
       – Bad Habits
       – Developers Gone Wild
  5. The Problem With Domino
     “My Server Is Running Fine”
     Server Stability
       – Often despite our best efforts
     Tasks that just run
       – even without being properly configured
  6. Vulnerabilities
     Start with the OS
       – patch levels
       – unnecessary processes with exposed ports
       – disk and data security
     Then the hardware
       – It’s all about disk performance
       – Using a SAN? Is the SAN configured for Domino?
       – Transaction logs configured?
  7. Vulnerabilities
     Security
       – ACLs
         • -Default- and Anonymous
         • LocalDomainServers
     HTTP vs HTTPS
     LDAP
     DIIOP
     Sametime
  8. Aging Configurations
     What can give you problems over time
       – Database sizes
       – More users
       – More tasks and features
  9. Bad Habits
     What are your users doing?
       – what features are they using
       – how are they using them
         • are they creating repeating 10yr appointments for instance
         • are they copying themselves on emails
     Password quality for HTTP passwords
  10. Giving Developers Power
      Allowing development to dictate replication and agent scheduling
      The curse of not production tested XPages code
      Demands for “LDAP” or “DIIOP” for an application to work
  11. Tools to Spot Sickness
  12. Tools to Spot Sickness
      Understanding Priorities
      DDM Probes and Event Analysis
      Statistics
      Catalog.nsf
      QoS - new with Domino 9
      Enhanced Fault Reporting - new with Domino 9
  13. Understanding Priorities
      Server role
        – What do you want from your server
        – What are statistics telling you
      Warning Levels
        – Is it safe to ignore ‘Warning (Low)’ and focus on ‘Fatal’ or ‘Failure’
  14. Bringing Problems to You
      Event Handlers, Event Generators, Statistics, Fault Reports and DDM Probes - where to start
      Setting Statistic Thresholds
      Choosing and configuring probes
      Reviewing Faults
      Setting up QoS behaviour
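
Setting a statistic threshold comes down to comparing each collected value against a limit and reporting only the breaches. The sketch below assumes the statistics have already been exported as name/value pairs; the statistic names and limits are illustrative placeholders, not recommended values, and nothing here talks to a Domino server.

```python
# Hypothetical thresholds: statistic name -> (kind, limit).
# "max" means the value must not rise above the limit,
# "min" means it must not fall below it.
THRESHOLDS = {
    "Mail.Dead": ("max", 0),
    "Mail.Waiting": ("max", 50),
    "Server.AvailabilityIndex": ("min", 40),
}


def breached(stats, thresholds=THRESHOLDS):
    """Return the sorted names of statistics that cross their threshold."""
    problems = []
    for name, (kind, limit) in thresholds.items():
        if name not in stats:
            continue  # statistic not collected on this server
        value = stats[name]
        if (kind == "max" and value > limit) or (kind == "min" and value < limit):
            problems.append(name)
    return sorted(problems)


# Illustrative sample values only.
sample = {"Mail.Dead": 3, "Mail.Waiting": 12, "Server.AvailabilityIndex": 85}
print(breached(sample))
```

Keeping the limits in one table makes it easy to hold a different set per server role.
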
  15. Bringing Problems To You
      Why we set up collection hierarchies for DDM
        – and how
      Daily and Weekly DDM reviews
        – What to look out for
  16. Probes for Mail Servers
      Security - Weekly
      Directory Performance
      Critical mail routes
      Mail ‘Slack’
  17. Probes for Application Servers
      Agent run times
        – agent cpu usage
      Security and Web Configuration
  18. Probes for Struggling Servers
      OS level
        – disk performance (beware of reported SAN problems)
        – memory
        – network
  19. What to look for
      Fatal problems
      Persistent Warnings
      Peak activity behaviour
        – uptick in problems at 9am, 1pm etc
      Repetitive low level ‘annoyances’
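
One way to spot the peak-activity pattern is to count events per hour of day and flag hours that stand well above the typical hour. This is a minimal sketch assuming the event timestamps have been exported from wherever you review events; the `factor` of 2 and the sample data are arbitrary illustrations.

```python
from collections import Counter
from datetime import datetime
from statistics import median


def peak_hours(timestamps, factor=2.0):
    """Return hours of day whose event count exceeds factor x the median hourly count."""
    per_hour = Counter(ts.hour for ts in timestamps)
    if not per_hour:
        return []
    typical = median(per_hour.values())
    return sorted(hour for hour, count in per_hour.items() if count > factor * typical)


# Illustrative data: bursts at 9am and 1pm, two events in each other hour.
events = (
    [datetime(2013, 3, 4, 9, m) for m in range(0, 60, 5)]
    + [datetime(2013, 3, 4, 13, m) for m in range(0, 60, 6)]
    + [datetime(2013, 3, 4, h, 30) for h in (8, 10, 11, 14, 15, 16)]
    + [datetime(2013, 3, 4, h, 45) for h in (8, 10, 11, 14, 15, 16)]
)
print(peak_hours(events))
```

Using the median rather than the mean keeps one very busy hour from hiding the others.
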
  20. Catalog.nsf
      Not every database is immediately visible but they are all there (just hidden with selection formulae)
      It’s a good place to start looking for multiple replicas
      It’s a good place to find ACL issues
      Replicates around your domain and updates overnight
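
Looking for multiple replicas can be sketched as a grouping exercise. The example assumes the catalog entries have been exported as (server, file path, replica ID) rows; that row layout, the server names and the replica IDs are all hypothetical.

```python
from collections import defaultdict


def duplicate_replicas(rows):
    """Map (server, replica_id) -> sorted paths where one server holds several replicas."""
    seen = defaultdict(set)
    for server, path, replica_id in rows:
        seen[(server, replica_id)].add(path)
    return {key: sorted(paths) for key, paths in seen.items() if len(paths) > 1}


# Hypothetical export rows.
rows = [
    ("Mail01/Acme", "mail/jsmith.nsf", "80257B2A:0041C3D1"),
    ("Mail01/Acme", "backup/jsmith.nsf", "80257B2A:0041C3D1"),
    ("Mail02/Acme", "mail/jsmith.nsf", "80257B2A:0041C3D1"),
    ("Mail01/Acme", "names.nsf", "80257A10:00512E77"),
]
print(duplicate_replicas(rows))
```

A replica on a second server is expected; two copies with the same replica ID on one server are the ones worth investigating.
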
  21. QoS - Quality of Service
      Monitors server health and performance
      Monitors application behavior, stability and hangs
      Restarts Domino if it thinks there are memory issues or an application is hung
      Shuts down Domino if a clean shutdown doesn’t happen and the server hangs
      Controlled via notes.ini settings and dcontroller.ini
      Requires Domino to be running under the Java Controller
        • nserver -jc
  22. QoS Configuration
      Starting Domino under Java Controller should create a dcontroller.ini file
      QOS_Enable=1
      In Notes.Ini
        • QOS_ProbeInterval (defaults to 1 min)
        • QOS_ProbeTimeout (defaults to 5 mins)
        • QOS_ShutDown_Timeout
        • QOS_Apps_Timeout
        • QOS_Shutdown_Timeout
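
Before relying on QoS it helps to know which of these settings are explicitly set and which are left at their defaults. The sketch below reads NAME=value text and reports the settings named on the slide. It assumes names can be compared case-insensitively and that you pass in a copy of the ini text; the sample values are illustrative, not recommendations.

```python
# Setting names as listed on the slide.
QOS_KEYS = [
    "QOS_ProbeInterval",
    "QOS_ProbeTimeout",
    "QOS_ShutDown_Timeout",
    "QOS_Apps_Timeout",
]


def read_ini(text):
    """Parse NAME=value lines into a dict keyed by lower-cased name."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue  # skips blank lines and section headers such as [Notes]
        name, value = line.split("=", 1)
        settings[name.strip().lower()] = value.strip()
    return settings


def qos_report(ini_text, keys=QOS_KEYS):
    """Return {setting: value, or None when the setting is absent}."""
    settings = read_ini(ini_text)
    return {key: settings.get(key.lower()) for key in keys}


# Illustrative ini text only.
sample_ini = """[Notes]
QOS_ProbeInterval=1
QOS_APPS_TIMEOUT=10
"""
for name, value in qos_report(sample_ini).items():
    print(name, "=", value if value is not None else "(not set)")
```
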
  23. QOS - Potential Problems
      QOS doesn’t support passwords on server IDs; the restart will pause at the password entry screen
      QOS timeouts being too low
      Don’t enable QOS on servers without transaction logging
  24. Enhanced Fault Reporting
      Fault Reporting Database - lndfr.nsf
      Expanded to include a by Disposition view
        – all faults when analyzed have a disposition value that categorises as
          • Problem
          • Possible Problem (possibly actionable)
          • Possible Problem (likely NOT actionable)
          • Informational
          • Unknown (investigate)
  25. Possible Problem - Actionable
      Out Of Memory: Represents a crash in which the Java virtual machine (JVM) ran out of a memory resource such as heap space.
      Launched Notes multiple times: Indicates that the user quickly launched multiple instances of the Notes client
      Possible hang: Indicates that the Notes client was manually terminated while it appeared to be doing useful work.
      User Kill: Indicates that the user manually terminated the client while it appeared to be waiting for input or network timeout
  26. Back to Full Health
      Getting Control
        – Mail, Databases and ECLs
        – SMTP
        – Agent Scheduling
        – Directories
        – Adminp
        – LDAP
        – Tasks and Internet Site Documents
      Domino Configuration Tuner
  27. Back to Full Health
      Getting Control
        – Mail, Databases and ECLs
        – SMTP
        – Agent Scheduling
        – Directories
        – Adminp
        – LDAP
        – Tasks and Internet Site Documents
      Domino Configuration Tuner
  28. Getting Control - Mail and Databases
      Setting ACLs at directory level (Editor)
      Lock down ECLs via Policies
      Introducing quotas alongside server based archiving
      Consider archiving files to a dedicated server
      Upgrade to 8 and enable OOO router instead of agents
      Disable forwarding rules set up by users
      Use message tracking and mail rules very sparingly
      Disable on the fly searching of non indexed databases
  29. Database Management Tools
      DBMT Server Command
        • runs copy-style compact operations
        • purges deletion stubs
        • expires soft deleted entries
        • updates views
        • reorganizes folders
        • merges full-text indexes
        • updates unread lists
        • ensures that critical views are created for failover
        – Replaces Updall
          • Load updall -nodbmt tells updall to run but not perform the functions that DBMT already does
  30. DBMT Parameters
      -compactThreads
      -updallThreads
      -ftiThreads
      -timeLimit refers to compact timeout for DBMT
      -range starttime stoptime
      -compactNdays (run Compact every x days)
      -ftiNdays (run FT Index every x days)
      -force d (day, Sunday = 1) fixup if compact fails for consecutive days
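
To keep DBMT settings consistent across servers, the command line can be assembled from a small set of options. This sketch only joins strings. It assumes the task is started with `load dbmt` (by analogy with `load updall`), uses the flag names from the slide written with a plain hyphen, and passes the `-range` times through unchanged because their format is not given here. The values are illustrative; check the result against your Domino version's documentation before using it.

```python
def dbmt_command(compact_threads=None, updall_threads=None, fti_threads=None,
                 time_limit=None, run_range=None,
                 compact_n_days=None, fti_n_days=None):
    """Build a console command string from the options that are set."""
    options = [
        ("-compactThreads", compact_threads),
        ("-updallThreads", updall_threads),
        ("-ftiThreads", fti_threads),
        ("-timeLimit", time_limit),
        ("-compactNdays", compact_n_days),
        ("-ftiNdays", fti_n_days),
    ]
    parts = ["load dbmt"]  # assumed way of starting the task
    for flag, value in options:
        if value is not None:
            parts.append(f"{flag} {value}")
    if run_range is not None:
        start, stop = run_range  # passed through as given; format is not validated
        parts.append(f"-range {start} {stop}")
    return " ".join(parts)


# Illustrative values only.
print(dbmt_command(compact_threads=4, updall_threads=4, compact_n_days=7))
```
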
  31. Getting Control - SMTP
      Restrict relaying to specific IP addresses not network ranges
      Beware of allowing authenticated relaying and opening up to dictionary attacks
      Restrict rights to send to internal groups from internet addresses
      Don’t accept mail for local part matches
      Configure your server for HTML mail not plain text
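
A quick audit of a relay allow-list can flag entries that cover a whole range rather than a single address. The sketch assumes the list has been written out in CIDR notation, which is not the syntax of the Domino configuration fields themselves; the addresses are documentation examples.

```python
import ipaddress


def relay_ranges(entries):
    """Return the allow-list entries that cover more than one address."""
    too_broad = []
    for entry in entries:
        network = ipaddress.ip_network(entry, strict=False)
        if network.num_addresses > 1:
            too_broad.append(entry)
    return too_broad


# Illustrative allow-list in CIDR form.
allow_list = ["192.0.2.25", "192.0.2.0/24", "198.51.100.7/32", "10.0.0.0/8"]
print(relay_ranges(allow_list))
```
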
  32. Getting Control - SMTP (more)
      Don’t allow all connecting hosts to deliver mail inbound; if you use a service, restrict to those hosts
      Use services / tools to spot attacks such as
        – persistent attempts to mass deliver within a time period
        – continual failures by a host to deliver to a correct address
      Move responsibility for that first line of defense away from native Domino
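
The two attack patterns above are both "too many events from one host inside a time window", which a sliding window can detect. This is a sketch of the idea only, assuming failed-delivery records are available as (timestamp, host) pairs from your logs or filtering service; the limit, window and sample hosts are arbitrary.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def noisy_hosts(failures, limit=5, window=timedelta(minutes=10)):
    """Return hosts with more than `limit` failures inside any `window`."""
    recent = defaultdict(deque)
    flagged = set()
    for when, host in sorted(failures):
        queue = recent[host]
        queue.append(when)
        # Drop failures that have fallen out of the window.
        while when - queue[0] > window:
            queue.popleft()
        if len(queue) > limit:
            flagged.add(host)
    return sorted(flagged)


# Illustrative records: one host failing every minute, one failing occasionally.
start = datetime(2013, 3, 4, 2, 0)
failures = [(start + timedelta(minutes=i), "203.0.113.9") for i in range(8)]
failures += [(start + timedelta(minutes=30 * i), "198.51.100.4") for i in range(3)]
print(noisy_hosts(failures))
```
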
  33. Getting Control - Agent Scheduling
      When are agents set to run
        – amgr_newmaileventdelay
        – amgr_newmailagentmininterval
      If you’re using OOO agents how often are they scheduled
      Do users have private agents running
        – Sh Agents [DBName]
          • All shared and private agents in a database
      Who has rights to run agents
  34. Getting Control - Directories
      Avoid adding additional views to the Domino Directory
      The risk of allowing local replicas with Author rights
      Directory Assistance
        – Sh xdir
  35. Getting Control - Adminp
      Purge old documents
      Requests awaiting approval
      Tell adminp process NEW not ALL
  36. Getting Control - LDAP
      Allowing anonymous access to query LDAP
      Authenticating LDAP queries
      Extended Directory Catalog used by LDAP
      Relying on DNS
      Not configuring the LDAP task correctly to allow large searches with no timeouts
      Maintaining schema.nsf
  37. Getting Control - Tasks and Program Documents
      Disable tasks you don’t need
      Schedule overnight tasks so they don’t overlap
        – and don’t conflict with backups
      Use program documents so you can review and manage easily
        – sh config servertasksat*
      Keeping templates on every server
      Using compact -B
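
Checking that overnight tasks do not overlap each other or the backup is easy to automate once the schedule is written down. The sketch assumes you have listed each task with an estimated start and end time on the same night; the task names and times are illustrative, and windows that cross midnight are not handled.

```python
from itertools import combinations


def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def overlaps(schedule):
    """Return pairs of task names whose run windows overlap."""
    clashes = []
    for (name_a, start_a, end_a), (name_b, start_b, end_b) in combinations(schedule, 2):
        if to_minutes(start_a) < to_minutes(end_b) and to_minutes(start_b) < to_minutes(end_a):
            clashes.append((name_a, name_b))
    return clashes


# Illustrative schedule with estimated run windows.
schedule = [
    ("compact", "01:00", "03:00"),
    ("updall", "02:30", "04:00"),
    ("backup", "04:00", "06:00"),
]
print(overlaps(schedule))
```

A task that ends exactly when the next one starts is not counted as a clash, so leave a margin if run times vary.
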
  38. Getting Control - Internet Site Documents
      Web Configuration means TCPIP tasks are configured in the server document and are server wide
        – often enabled by default
      Internet site documents require you to opt in for TCPIP services
        – configured by hostname
  39. Domino Configuration Tuner
      Domino Configuration Tuner is an analysis tool based on a set of pre-configured best practice/worst practice rules
      The Rules are shipped by IBM with the Lotus installs and are updated via a public update site
      Makes recommendations on configuration changes to enhance performance and security and reduce TCO
  40. How does it work?
      Run and installed via the Domino Configuration Tuner database
      Updated by online template updates and rule updates
      DCT rules and results are held in a local database and will require a restart of the client for changes to take effect
      Scans
        – Server documents
        – notes.ini settings
        – advanced database properties
      Intended to scan servers in a single domain
  41. How does it work?
      Creates reports on each scanned server based on the rules you select
      Each report contains
        – Issues
        – recommendations for adjustments
        – links to supporting documentation
  42. Pre-requisites
      v8 Notes client (standard or basic) or administrator
      dct.nsf database and dct.ntf template
      servers 7.x or higher
  43. Setup
      DCT.NSF
      StdDominoConfigTuner Template (dct.ntf)
      ID must have reader access to names.nsf
      ID must have ‘View Administrator’ rights
      Requires no server or domain changes
  44. View Administrator Rights
      Server Document
      Security Tab
      View Administrator is a subset of ‘Administrator’ rights
      Think of it as ‘Show’ not ‘Tell’ rights
        – Sh users - YES
        – tell http refresh - NO
  45. DCT Preferences
      List of all rules
      Review rule, description and supporting documentation
      All rules are enabled by default for all scans
      Enable and Disable rules
  46. DCT Updates
      Connects to the IBM site to download
        – must have outbound connectivity
  47. DCT Updates
      Click ‘check for updates’
      Connects to an external IBM site to identify any template or rule updates
  48. DCT Updates
      Accept license and updates download
      It’s not possible to selectively download
  49. DCT Updates - Finished
      “Successful” screen will notify you to restart your client
      You may need to do 2 client restarts before DCT can be used
  50. Running the tuner
      First select the servers in your current domain you want to run against
      The list of servers is retrieved from the domain of the home server identified in your location document
      Change locations to scan a different domain
  51. Running the tuner
      You can manually type in the full hierarchical names of any other servers you want to scan as part of this analysis
      Separate multiple server names with commas, semicolons or new lines
      You can only scan servers you can reach so you need a connection document to any you list
        – or the server needs to be available via your passthru server in your location
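
If you keep the list of extra servers in a text file, the separator rule (commas, semicolons or new lines) is simple to reproduce when preparing or checking that list. This sketch illustrates the rule only and is not how DCT itself parses the field; the server names are hypothetical.

```python
import re


def parse_server_names(text):
    """Split on commas, semicolons or new lines; trim and de-duplicate, keeping order."""
    names = []
    for part in re.split(r"[,;\n]+", text):
        name = part.strip()
        if name and name not in names:
            names.append(name)
    return names


# Hypothetical hierarchical server names.
raw = "Mail01/Servers/Acme, Mail02/Servers/Acme;App01/Servers/Acme\nHub01/Servers/Acme"
print(parse_server_names(raw))
```
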
  52. Understanding the Results
      Summary results
      Issues by criticality
  53. Understanding the Results
      Summary results
      Servers that failed to scan
        – reason why scan failed
  54. Understanding the Results
      Summary results
      Detailed list of rules evaluated
  55. Understanding the Results
      View the current report
      Select ‘change’ to view a different report
  56. Understanding the Results
      Filter results to make analysis easier
        – by server
        – by specific rules
        – by severity
  57. Understanding the results
      Categorised results of recommendations
      Sorted by criticality and then by server name
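
If you copy recommendations out of a report for your own tracking, the same filter-then-sort view (criticality first, then server name) can be reproduced in a few lines. The record layout, severity labels and rule names below are hypothetical and not DCT's internal format.

```python
# Hypothetical severity labels, most critical first.
SEVERITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def order_results(results, server=None, severity=None):
    """Filter by server and/or severity, then sort by criticality and server name."""
    selected = [
        r for r in results
        if (server is None or r["server"] == server)
        and (severity is None or r["severity"] == severity)
    ]
    return sorted(selected, key=lambda r: (SEVERITY_ORDER[r["severity"]], r["server"]))


# Hypothetical recommendations copied from a report.
results = [
    {"server": "Mail02/Acme", "severity": "Low", "rule": "example rule A"},
    {"server": "Mail01/Acme", "severity": "High", "rule": "example rule B"},
    {"server": "App01/Acme", "severity": "High", "rule": "example rule C"},
    {"server": "Mail01/Acme", "severity": "Medium", "rule": "example rule D"},
]
for row in order_results(results):
    print(row["severity"], row["server"], row["rule"])
```
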
  58. Understanding the results
      Each recommendation comes with an explanation so you can evaluate on a result by result basis if you want to make the change
  59. Understanding the results
      Each recommendation is provided with a link to best / worst practices supporting documentation
  60. Working with Rules
      Disabling and enabling rules can be done through the ‘Preferences’
  61. Working with Rules
      Selecting a rule shows the description and links to the best / worst practice documentation
  62. Making Changes
      Advanced Database Properties
        – assigned en masse via Domino Admin
      notes.ini settings
        – assigned via the command set config xxx = x
        – shown via the command sh config xxx = x
      Many recommendations refer to ‘some databases’ but don’t specify which ones - check which ones will be affected
  63. Resources
      Domino Configuration Tuner blog
        – details and explanations of new rules published each month
  64. Summary
      • No matter how well your servers are configured, they will continue to degrade in performance over time unless you proactively monitor and fix them
      • Many server performance issues will be seen by your users first, before they filter down to you
      • Make reviewing your server configuration using DDM probes, followed by a DCT analysis, part of every server upgrade
      • Enable probes that are specific to the server role: Mail and Directory probes on Mail servers and Agent probes on Application servers
      • Use Security and Database probes configured in DDM to stay on top of any low level warnings that could cause larger problems in the future
      • Don’t over configure your servers to monitor everything or you’ll be looking for a needle in a haystack. Ask your servers to tell you only what you need to be aware of immediately
      • Use the built in tools (DCT, Statistics, DDM, Catalog, Activity Trends) to monitor your servers and gain a good understanding of their ‘normal’ behaviour so you can more easily spot when something goes wrong
  65. Questions
      How to contact me:
      Gabriella Davis
      gabriella@turtlepartnership.com
      Twitter: gabturtle