Managing and Monitoring TeamPage


Chris Nuzum, Traction Software. Traction User Group, Oct 15 2010, Newport RI. TUG 2010 Newport slides, agenda and more see

  1. Managing TeamPage
  2. Topics
      • Process structure
      • Non-plugin local server modifications
      • Settings overview
      • Setup interfaces
      • Q&A
  3. TeamPage: Processes
      • 3 Java processes
          • StartTraction wrapper process (java.exe, lower process ID)
              • Invokes Traction, monitors and proxies its output, restarts it
          • Traction process (java.exe, higher process ID)
          • JavaDB Network Server (javaw.exe)
      • Plus native Windows Service/Linux Daemon
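
The wrapper behavior described on this slide (invoke the server, proxy its output, restart it) can be sketched as a small supervisor loop. This is a minimal illustration in Python, not the StartTraction implementation: the `supervise` function, the `max_restarts` bound, and the value of `RESTART_EXIT_CODE` are all assumptions made for the example.

```python
import subprocess
import sys

# Hypothetical value: the slides do not say which exit code means "restart me".
RESTART_EXIT_CODE = 10

def supervise(command, max_restarts=3):
    """Run `command`, relaunching it while it exits with RESTART_EXIT_CODE.

    Returns the list of exit codes observed, oldest first.
    """
    codes = []
    while True:
        with subprocess.Popen(command, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, text=True) as proc:
            # Proxy the child's console output line by line.
            for line in proc.stdout:
                sys.stdout.write(line)
        codes.append(proc.returncode)
        # Stop on any other exit code, or once the restart budget is spent.
        if codes[-1] != RESTART_EXIT_CODE or len(codes) > max_restarts:
            return codes
```

A child that exits with any other code ends the loop, which mirrors the idea that only an explicit restart request causes a relaunch.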
  4. Locking
      • Only one server should ever run against a journal at a time
          • Otherwise journal file can be inconsistently numbered, requiring manual repair
      • Two safeguards
          • Lock file: if present, server won't start
              • Removed by JVM on clean exit
          • Socket: if bound, server won't start
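
The two safeguards above can be sketched as a startup check. This is a minimal illustration in Python rather than TeamPage's Java code; the names `acquire_guards`, `release_guards`, and `AlreadyRunning`, and the choice of a loopback TCP port, are assumptions for the example.

```python
import os
import socket

class AlreadyRunning(Exception):
    """Raised when either safeguard suggests another server may be active."""

def acquire_guards(lock_path, port, host="127.0.0.1"):
    """Create the lock file and bind the guard socket, or raise AlreadyRunning."""
    # Safeguard 1: lock file. O_EXCL makes creation fail if the file exists.
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyRunning("lock file present: %s" % lock_path)
    os.close(fd)
    # Safeguard 2: socket. Binding fails if another process holds the port.
    guard = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        guard.bind((host, port))
    except OSError:
        guard.close()
        os.remove(lock_path)  # undo the lock file we created a moment ago
        raise AlreadyRunning("port %d already bound" % port)
    return guard

def release_guards(lock_path, guard):
    """Clean exit: close the socket and remove the lock file."""
    guard.close()
    os.remove(lock_path)
```

The socket covers the case the lock file cannot: a crashed process leaves a stale lock file behind, but the operating system releases its socket, so a bound socket is evidence of a live process.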
  5. Windows Startup
      • Automatic
          • Windows Service (runs as System)
              • Unlock Traction on Boot
                  • removes lockfile
              • Traction
                  • runs StartTraction wrapper
                      • runs Traction
          • console output - traction.out.txt
      • Manual
          • Windows Application
              • TractionApplication
                  • runs StartTraction wrapper
                      • runs Traction
          • console output - console
  6. Unix Startup
      • Automatic
          • /etc/init.d/traction start
              • TractionDaemon
                  • runs StartTraction wrapper
                      • runs Traction
          • console output - traction.out.txt
      • Manual
          • command line
              • TractionApplication
                  • runs StartTraction wrapper
                      • runs Traction
          • console output - console
  7. Shutdown
      • Windows
          • Stop service
              • Stops StartTraction
                  • Traction listens for heartbeat from StartTraction
                      • If no heartbeat for specified period, exits with restart code
                      • If no StartTraction, restart = exit
      • Unix
          • /etc/init.d/traction stop
              • kill -2 {Traction process ID}
                  • invokes clean shutdown handler
      • All, Preferred
          • Shutdown Traction button in Server Setup
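
The heartbeat rule on the Windows path can be sketched as a small watchdog. This is an illustration only: the class name, the injectable clock, and the value of `RESTART_EXIT_CODE` are assumptions, and the slides do not say how the heartbeat is actually transported.

```python
import time

# Hypothetical value: the slides do not give the real restart code.
RESTART_EXIT_CODE = 10

class HeartbeatWatchdog:
    """Tracks the last heartbeat and reports when the wrapper has gone quiet."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self.last_beat = clock()

    def beat(self):
        """Record a heartbeat from the wrapper."""
        self.last_beat = self.clock()

    def check(self):
        """Return RESTART_EXIT_CODE if the timeout has elapsed, else None."""
        if self.clock() - self.last_beat > self.timeout:
            return RESTART_EXIT_CODE
        return None
```

When the wrapper has been stopped, nothing is left to act on the restart code, so exiting with it simply ends the server. That is one reading of "restart = exit" on the slide.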
  8. Shutdown Process
      • Finish current work, then halt thread
          • Appending journal
          • Reading mailbox
          • Serving files
      • In error state, work may be stuck
      • Force shutdown via web page
      • Last resort, force shutdown via operating system
          • Manually clear lock file after ensuring processes are gone
  9. Local Mods
      • Most configuration is managed by TeamPage interfaces and can be overridden or extended in plug-ins.
      • Exceptions
          • Adding support for new mimetypes: modify mime.types
          • Supporting new user-agents, modify
          • Re-apply modifications after upgrade
  10. Settings Overview
  11. Tour of Admin Interfaces
      • Server Setup
      • Project Setup
      • Personal Setup
      • Skin Setup
      • Plug-in Options
  12. TeamPage Monitoring & Debugging
  13. Normal EKG
  14. Flatlining
  15. Support2478: I use FAST Search and my TeamPage server has suddenly become very slow or has reported OutOfMemory errors. How do I recover?
  16. Walkthrough of Tools
      • Console output - traction.out.txt
      • Log files
          • logs/debug.log
          • logs/statistics.log
              • graphing memory usage - Support668
      • Thread manager
      • kill -3: Java thread+monitor dump
      • JConsole: install JDK, run with
  17. Backup & Availability Planning
  18.
      • Considerations
      • Recommendations
      • Discussion - learn from each other
  19. Considerations
      • Backup FAQ is Doc19
          • Back up the entire installation, not just journal data
          • Considerable time is invested in server settings and config files; you don't want to have to recreate them
      • Open files: Windows Volume Shadow Copy
      • Time window
          • Files can change during backup, requiring a rebuild on restore to ensure consistency
  20. Recommendations
      • VMware snapshots address the consistency issue, can be run online, allow rollback, and can be mirrored remotely
          • Recommend capturing machine state as well as disks, otherwise a rebuild is required on restore
      • Scheduled rsync
          • Incremental remote mirror
      • ZFS snapshots, export
      • AWS EC2 snapshots
  21. Traction Authentication & Authorization
  22. Security Principals
      • Identify users & groups
          • traction:u:18, traction:g:24, ad:g:52
      • Groups defined by principal, recursively
          • traction:g:24 = { ad:g:52, traction:u:42 }
      • ACLs defined over principals
          • traction:g:18 allow publish own
      • Each user has exactly one security principal
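
Because groups are defined over principals recursively, a membership check has to walk nested groups. The sketch below illustrates that idea in Python under stated assumptions: the dictionary layout, the `is_member` and `allowed` helpers, the membership of `ad:g:52`, and the ACL tuple format are all hypothetical; only the `traction:g:24` definition comes from the slide.

```python
GROUPS = {
    "traction:g:24": {"ad:g:52", "traction:u:42"},  # from the slide
    "ad:g:52": {"ad:u:7"},                          # hypothetical membership
}

def is_member(user, principal, groups, _seen=None):
    """True if `user` is `principal` or is reachable through nested groups."""
    if user == principal:
        return True
    seen = _seen if _seen is not None else set()
    if principal in seen:  # guard against groups that contain each other
        return False
    seen.add(principal)
    return any(is_member(user, member, groups, seen)
               for member in groups.get(principal, ()))

def allowed(user, action, acl, groups):
    """True if any ACL entry grants `action` to a principal covering `user`."""
    return any(act == action and is_member(user, principal, groups)
               for principal, act in acl)

# Hypothetical ACL entry in the spirit of "traction:g:18 allow publish own".
ACL = [("traction:g:24", "publish own")]
```

With these definitions, `allowed("ad:u:7", "publish own", ACL, GROUPS)` is true because `ad:u:7` reaches `traction:g:24` through `ad:g:52`.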
  23. Local Users & Groups
      • Stored in journal
      • Cached in memory
  24. External Users & Groups
      • Defined, managed externally
      • Active Directory, LDAP, and others supported
      • Cached in Principal Cache
          • Downloaded at startup, updated asynchronously at defined interval
          • Force update by clearing cache with /type cachemanager
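
The cache lifecycle described above (load at startup, refresh at an interval, force a reload by clearing) can be sketched as follows. This is a simplified illustration, not the Principal Cache itself: the class shape and the `fetch` callback are assumptions, and the refresh here happens synchronously on read, whereas the slide says the real update is asynchronous.

```python
import time

class PrincipalCache:
    """Holds directory data in memory and reloads it when stale or cleared."""

    def __init__(self, fetch, refresh_seconds, clock=time.monotonic):
        self._fetch = fetch              # hypothetical callback returning a dict
        self._refresh = refresh_seconds
        self._clock = clock
        self._data = None
        self._loaded_at = None
        self._load()                     # downloaded at startup

    def _load(self):
        self._data = dict(self._fetch())
        self._loaded_at = self._clock()

    def get(self, principal):
        """Return the cached value, reloading first if cleared or stale."""
        stale = self._clock() - self._loaded_at >= self._refresh
        if self._data is None or stale:
            self._load()
        return self._data.get(principal)

    def clear(self):
        """Force the next lookup to reload from the directory."""
        self._data = None
```

Clearing rather than reloading in place keeps `clear` cheap; the cost of the directory query is paid by the next lookup.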
  25. Customizable Queries
      • User directory configuration defines how lookups are done
          • Depending on directory server, changing queries can dramatically improve performance
  26. Extensible Architecture
      • Login Manager
          • Determine credentials based on request
      • Authenticator
          • Determine whether credentials are valid
  27. Hybrid Login Managers
      • Handle different types of connections differently
      • Dispatch based on skin, user-agent, URI path
      • Realms: HTTP basic auth, login always required
      • OpenRealms: HTTP basic auth, login optional

      hybrid_realms=com.traction.admin.RealmsLoginManager
      hybrid_realms_useragents=securerobot,attachmentsrobot
      hybrid_realms_servlet_paths=/webdav,/db
      hybrid_open_realms=com.traction.admin.OpenRealmsLoginManager
      hybrid_open_realms_skins=rss,rss091,rss092,rss10,rss20,atom,ical
      hybrid_open_realms_useragents=rss,robot,calendar
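
To make the dispatch idea concrete, the sketch below parses the properties from this slide and picks a login manager for a request. The matching rules are assumptions, not documented TeamPage behavior: user-agents are matched as case-insensitive substrings, servlet paths as prefixes, skins exactly, and the Realms rules are checked before the OpenRealms rules. The `"default"` return value and the `html` skin in the usage note are placeholders.

```python
CONFIG = """
hybrid_realms=com.traction.admin.RealmsLoginManager
hybrid_realms_useragents=securerobot,attachmentsrobot
hybrid_realms_servlet_paths=/webdav,/db
hybrid_open_realms=com.traction.admin.OpenRealmsLoginManager
hybrid_open_realms_skins=rss,rss091,rss092,rss10,rss20,atom,ical
hybrid_open_realms_useragents=rss,robot,calendar
"""

def parse_properties(text):
    """Parse simple key=value lines into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            props[key.strip()] = value.strip()
    return props

def _csv(props, key):
    """Return the comma-separated values for `key`, or an empty list."""
    return [v for v in props.get(key, "").split(",") if v]

def choose_login_manager(props, skin, user_agent, path, default="default"):
    """Pick a login manager class name for one request (assumed rules)."""
    ua = user_agent.lower()
    for prefix in ("hybrid_realms", "hybrid_open_realms"):
        if (skin in _csv(props, prefix + "_skins")
                or any(t in ua for t in _csv(props, prefix + "_useragents"))
                or any(path.startswith(p)
                       for p in _csv(props, prefix + "_servlet_paths"))):
            return props[prefix]
    return default
```

For example, `choose_login_manager(parse_properties(CONFIG), "html", "Mozilla", "/webdav/docs")` selects the Realms manager because of the `/webdav` path. Checking Realms first matters here: `securerobot` also contains `robot`, which would otherwise match the OpenRealms list.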
  28. Hybrid Authenticator
      • Switches based on principal
      • Handles both local Traction users and AD/LDAP users
  29. Simple Login Managers
      • Cookies: encrypted cookie sent to browser
          • Can be encoded with IP address of client
      • Realms: HTTP Basic
      • Most secure over HTTPS
  30. Single Sign-on
      • LDAP X.509 Client Certificates
          • HTTPS provides cert, determines principal
          • Look up user in LDAP, make sure cert matches
      • NTLM
          • After handshake, browser provides hash code
          • Validate hash with AD server
  31. Single Sign-on
      • NTLMv2
          • Via commercial library
          • Emulates protocol Windows workstations use to allow users to log in
          • More secure, more robust
  32. Federation
      • NTLM RunAs authenticator for use with existing Enterprise Search federators, e.g. Vivisimo
      • Authenticate service account via NTLM, run as user performing the search
  33. Performance Tuning
  34. Memory
      • Garbage collection burns CPU
      • Flushing caches requires reloading caches from disk
      • Run 64-bit
          • 32-bit limited to ~1.5GB heap
          • For best performance, heap should be less than physical RAM
  35. Caching in Traction
      • Cache
          • config/**, plugins/**/config/**
          • Entry tokens
          • Permissions
          • Principals
              • Users
              • Group membership
  36. Finding what's slow on a page
      • Enable timing debug
      • View page source
  37. Tuning Label Driven Sections
      • Use label-based queries when searching/filtering for labels
          • :todo i(:todo and :r42)
  38. Use a Smaller Default Timeslice
  39. Turn off Project Counts
      • Permission filtered, can be expensive to calculate
  40. Hide WebDAV Sidebar
      • Page doesn't complete drawing until WebDAV request completes
  41. Disable WebDAV Auto Refresh
  42. Offload JavaDB
      • Run in a different process
      • Run on a different computer
      • Customer2050
  43. Offload Metrics Reports
      • Export entire JavaDB
      • Run in a different instance
  44. Making the Most of Metrics
  45. Making the Most of Metrics
      • Tour of metrics
          • Hit counter
          • Top articles
          • Viewed by
          • Browsing History
      • Controlling who can access detailed metrics reports
      • Report controls
      • Report details
      • Exporting CSV
      • Rebuilding indexes
      • Q&A