system automation, integration and recovery


WebMD Health Corp.: agile system automation, integration and recovery using HP Server Automation and HP Operations Orchestration

Published in: Technology
  1. 1. WebMD Health Corp.: agile system automation, integration and recovery using HP Server Automation and HP Operations Orchestration – Derek Chang, Manager, WebMD – Roger Hsu, Manager, WebMD. ©2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice
  2. 2. Introduction & Agenda. Topics: – Who we are and where we stand – Infrastructure Layout – Middleware Integration – HP OO Preparation – Application Administration Automation – Build Deployment Automation – Unattended WebMD Content Backup – Maintenance Free System – Results from HP SA/OO Implementation – Q/A
  3. 3. cmsops. Responsibility: – Provide maintenance and 24x7 support for CMS applications and their subsystems in the production environment – Perform production system patches, bug fixes, software releases, and other build deployments – Support ongoing releases and development in non-production environments – Define and document production support requirements, escalation procedures, issue tracking, and guidelines for troubleshooting and build deployments. Resource: 4.5 headcount*. Universe: – 300+ internal users – SDLC environments: dev/devint/qa00/qa01/qa02/perf/production – 130 servers – 4.4 TB of NAS storage for raw contents and site contents – Infrastructure: Zenoss, HPSA, HPOO, Serena TeamTrack, MOSS, MSSQL/Oracle. Core technology: – EMC Documentum – Proprietary applications
  4. 4. Documentum: An enterprise content management platform, now delivered by EMC Corporation; also the name of the software company that originally developed the technology. A flexible, versatile, powerful yet complex platform. Implementation in WebMD: – 2 major portal sites – 6 Documentum products – Proprietary content editor for advanced features – Proprietary page transformer – Proprietary utilities: 15 applications
  5. 5. Challenges: – Documentum is a new technology – Documentum expertise is rare – The CMS is complex – cmsops supports users within the company – WebMD is a fast-growing company
  6. 6. Life in cmsops. Sampling duration: Oct 11, 2007 – Jul 24, 2009 (653 days/426 working days). Source: customized TeamTrack reports and emails. Summary: – 1772 TeamTrack requests – 479 email requests* – 5.3 tickets/working day
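The tickets-per-day figure on this slide can be checked by hand, assuming it counts both TeamTrack and email requests and divides by working days (the slide does not state the formula, so this is an inference):

```latex
\[
\frac{1772 + 479}{426} = \frac{2251}{426} \approx 5.28 \approx 5.3 \ \text{tickets per working day}
\]
```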
  7. 7. Our Approach: – Develop and use process templates – Standardize and adopt the development model – Identify which processes to automate: routine/mundane activities; steps where human interaction causes errors or failures; processes whose lifecycle/service time is much longer than their development time
  8. 8. Infrastructure Layout
  9. 9. Infrastructure Layout [diagram; only the component labels survive]: Corporate infrastructure; Active Directory; Exchange server; TeamTrack; Win2K3; Opsware OO; Opsware SAS; Middleware integration; PAS LAB; RHEL4u6_64BIT VM; JBoss 5.0; Central and RAS; Business mashup engine; App server(s); Web interface; Web services; TeamTrack client; Repository; Server repository; Code base; OO client; Workflow engine; Software repository; Opsware agent; Email adapter; Scheduler engine; OCLI engine; Email sender; NRAS; Web services engine; LDAP module; JRAS; Java API; XML module; SAS OCLI client; Twister; Data modeling engine; SAS web services client; Build server; RHEL4u4_32BIT VM; Rpm/msi package tools; QA/DEV clients; NAS/Build repository; OCLI 1.0
  10. 10. Middleware Integration
  11. 11. Middleware Integration: Description – The core of the automation system – Connects the ticketing, monitoring, and system administration tools within WebMD operations – Provides operational tools so that users do not access the underlying systems/tools directly
  12. 12. Middleware Integration: Ticketing system integration – Uses web services to connect to Serena Business Mashup (TeamTrack) – Pulls information from tickets and passes the data to other systems such as HP OO – Updates tickets after each automation operation
  13. 13. Middleware Integration: System Administration (HP SA/OO) integration – A Java bean uses the OO library to trigger an OO workflow:
          RSFlowInvoke rsf = new RSFlowInvoke();
          rsf.setUrl(url + flowName + paraString);
          rsf.setUsername(user);
          rsf.setPassword(pw);
          result = rsf.invoke();
      – Parse the workflow result (XML format) to get: • OO flow id and report URL • Start time and end time • OO flow response and result
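The parsing step on this slide could look roughly like the sketch below. It uses only the JDK's DOM parser and assumes the result is a flat XML document whose root has one child element per field. The deck does not show the real OO result schema, so `FlowResultParser` and any tag names a caller looks up (for example `runId` or `reportUrl`) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Sketch only: assumes a flat result document. The real HP OO result schema
// is not shown in the deck, so no specific tag names are hard-coded here.
final class FlowResultParser {

    // Collects the text of each direct child element of the root, keyed by tag name.
    static Map<String, String> parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> fields = new LinkedHashMap<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    fields.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            return fields;
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers deal with one unchecked type.
            throw new IllegalArgumentException("Unparseable workflow result", e);
        }
    }
}
```

A caller would pass the string returned by `rsf.invoke()` to `FlowResultParser.parse(...)` and read the flow id, report URL, times, and response from the returned map.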
  14. 14. Middleware Integration: Web Application – Lets users run the automation tools from a web browser over the network, so they do not access underlying systems/tools such as HP OO directly – Uses Ajax and RichFaces to provide a dynamic and intuitive user experience – Developed on the JBoss Seam framework – Uses Hibernate as the database-layer framework
  15. 15. Middleware Integration: Security and User Authorization – Integrates with the WebMD LDAP servers so that users can access the system with their WebMD id/password – The JBoss Rules engine provides access control based on each user's WebMD LDAP groups
  16. 16. HP OO Preparation
  17. 17. HP OO Preparation: Identify basic/out-of-the-box OO operations – SSH – Windows Remote Command Execution – Change IIS status – Change Windows service status – OCLI to access HP SA – Iterator, Email CDO, etc. – Database operations (Oracle/MSSQL). Modularization and utility workflows – Use OO operations to build utility workflows that will be reused frequently
  18. 18. HP OO Preparation: HostsSSH: run Linux commands in a list of hosts. Flow diagram labels: given a list of hosts; Iterator (PAS out-of-box operation); SSH Command (PAS out-of-box operation); Call Error Notice flow
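The HostsSSH pattern (iterate over hosts, run a command on each, call an error-notice flow on failure) can be sketched in plain Java as below. This is not OO workflow code: `runner` stands in for the SSH Command operation and `errorNotice` for the Error Notice flow, and both shapes are assumptions. Continuing to the next host after a failure is also an assumption, since the slide does not say whether the flow stops.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Sketch of the HostsSSH pattern. The runner and notifier are hypothetical
// stand-ins for the OO SSH Command operation and the Error Notice flow.
final class HostsSsh {

    // Runs `command` on every host in order and returns the hosts that failed.
    // runner returns the exit code for (host, command); 0 means success.
    // Assumption: a failure is reported and the loop continues with the next host.
    static List<String> run(List<String> hosts, String command,
                            BiFunction<String, String, Integer> runner,
                            Consumer<String> errorNotice) {
        List<String> failed = new ArrayList<>();
        for (String host : hosts) {
            int exitCode = runner.apply(host, command);
            if (exitCode != 0) {
                failed.add(host);
                errorNotice.accept(host + ": '" + command + "' exited with " + exitCode);
            }
        }
        return failed;
    }
}
```

Passing the command runner in as a function keeps the iteration logic testable without a real SSH connection.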
  19. 19. HP OO Preparation: HostsWinCommand: run Windows commands in a list of hosts. Flow diagram labels: given a list of hosts; Iterator (PAS out-of-box operation); SSH Command (PAS out-of-box operation); Call Error Notice flow
  20. 20. HP OO Preparation: IIS flows – HostIISSites: control multiple IIS sites on a single host – HostsIISSites: control multiple IIS sites on multiple hosts. Flow diagram labels: given a list of hosts; given a list of sites; multiple hosts, multiple sites; single host, multiple sites
  21. 21. HP OO Preparation: Windows services flows – HostWinSvcsCtrl: control multiple services on a single host – HostsWinSvcsCtrl: control multiple services on multiple hosts. Flow diagram labels: multiple hosts, multiple services; single host, multiple services
  22. 22. Application Administration Automation
  23. 23. Application Administration Automation: Goal: develop OO workflows to stop/start WebMD applications and sites. Workflow key features: – Identify target servers – Windows: stop/start Windows services and IIS sites – Linux: stop/start applications and run any script if needed – Send error/success email notices
  24. 24. Application Administration Automation: Users pick an available host type and environment based on the permissions given to their LDAP groups. Screenshot labels: logged in as a consumer QA user; consumer users are NOT allowed to pick professional hosts; QA users control QA environments only
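The permission check described here can be illustrated with a small group-to-grant lookup. The deck says the real system uses the JBoss Rules engine; the sketch below swaps in a plain Java map instead, and the group name, host types, and environment names are hypothetical examples.

```java
import java.util.Map;
import java.util.Set;

// Sketch of group-based access control in plain Java rather than JBoss Rules.
// Group names, host types, and environments are hypothetical examples.
final class HostAccess {

    // What one LDAP group may operate on.
    static final class Grant {
        final Set<String> hostTypes;
        final Set<String> environments;

        Grant(Set<String> hostTypes, Set<String> environments) {
            this.hostTypes = hostTypes;
            this.environments = environments;
        }
    }

    private final Map<String, Grant> grantsByGroup;

    HostAccess(Map<String, Grant> grantsByGroup) {
        this.grantsByGroup = grantsByGroup;
    }

    // True if any of the user's groups allows this host type in this environment.
    boolean allowed(Set<String> userGroups, String hostType, String environment) {
        for (String group : userGroups) {
            Grant grant = grantsByGroup.get(group);
            if (grant != null
                    && grant.hostTypes.contains(hostType)
                    && grant.environments.contains(environment)) {
                return true;
            }
        }
        return false;
    }
}
```

With a grant for a consumer QA group that lists only consumer hosts and QA environments, a request for a professional host or for production is rejected, which mirrors the two restrictions on the slide.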
  25. 25. Application Administration Automation: Users click one of the action buttons. Screenshot label: user clicks “Query Servers”
  26. 26. Application Administration Automation: – The web application then triggers the corresponding HP OO workflow – OO workflows connect to HP SA with OCLI – HP SA takes actions on the target hosts
  27. 27. Application Administration Automation: – The OO workflow sends the result back to the middleware in XML format – The middleware parses the XML and displays the result in the GUI. Screenshot label: dmas qa00 server
  28. 28. Application Administration Automation: Users receive email notices
  29. 29. Application Administration Automation: Application Administration workflows: – Documentum Content Servers – Documentum Application Servers – ATS: WebMD proprietary content transformer – PATS: WebMD proprietary content transformer – Page Builder: WebMD proprietary content editor
  30. 30. Application Administration Automation: WebMD Content Servers. Flow diagram labels: – Initiate variables based on portal – OCLI: query servers based on portal, product, host type, and environment – Stop when querying servers only – Decision: start or shutdown – Start/stop SCS (HostsSSH) – Start/stop JMS (HostsSSH) – Start/stop doc base (HostsSSH) – Clean up doc base (HostsSSH) – Send email notice when finished
  31. 31. Application Administration Automation: WebMD Application Servers. Flow diagram labels: – OCLI: query server list against SAS – HostsSSH: run commands in each host in the list – HostsSSH: run commands in each host in the list
      Filter String:
          {device_servergroup_name equal_to "${portal}"} &
          {device_servergroup_name equal_to "${product}"} &
          {device_servergroup_name equal_to "${hostType}"} &
          {device_servergroup_name equal_to "${environment}"}
      OCLI command:
          for i in `/opsw/api/com/opsware/server/ServerService/method/.findServerRefs:i filter=${filterString}`;
          do
            /opsw/api/com/opsware/server/ServerService/method/getServerVO self:i="$i";
          done
  32. 32. Build Deployment Automation
  33. 33. Build Deployment Automation: Goal: develop an OO workflow to build an RPM and deploy it to target servers. Workflow key features: – Identify target servers, software policy, and RPM in HP SA – Build the RPM and upload it to HP SA – Stop/start applications on target servers – Detach/attach software policies and remediate target servers – Update the RPM in software policies
  34. 34. Build Deployment Automation: Workflow inputs: – Portal – Product – Host Type – Application – Environment – Build Version
  35. 35. Build Deployment Automation: Identify target servers – Set up server groups in HP SA (portal groups, product groups, host type groups, and environment groups), then assign servers to the appropriate groups. Diagram labels: host type group; product group; portal group; environment group
  36. 36. Build Deployment Automation: Identify target servers (cont.) – Use the OO SSH operation to execute an OCLI command that gets the SAS server list • OCLI: findServerRefs and getServerVO in the server service • Filter: use the aforementioned server groups as the filter
      Filter String:
          {device_servergroup_name equal_to "${portal}"} &
          {device_servergroup_name equal_to "${product}"} &
          {device_servergroup_name equal_to "${hostType}"} &
          {device_servergroup_name equal_to "${environment}"}
      OCLI command:
          for i in `/opsw/api/com/opsware/server/ServerService/method/.findServerRefs:i filter=${filterString}`;
          do
            /opsw/api/com/opsware/server/ServerService/method/getServerVO self:i="$i";
          done
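To make the filter concrete, here is a minimal sketch of how middleware code might assemble that filter string from the four workflow inputs before handing it to the workflow. `ServerFilter` and its method name are illustrative, not taken from the deck, and the group values used in any call are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: assembles the SA server filter shown on the slide from the four
// workflow inputs. Class and method names are illustrative only.
final class ServerFilter {

    // Produces one {device_servergroup_name equal_to "..."} clause per group,
    // joined with " & " so that a server must belong to all four groups.
    static String build(String portal, String product, String hostType, String environment) {
        List<String> clauses = new ArrayList<>();
        for (String group : new String[] {portal, product, hostType, environment}) {
            clauses.add("{device_servergroup_name equal_to \"" + group + "\"}");
        }
        return String.join(" & ", clauses);
    }
}
```

Because every clause tests the same attribute, the server-group layout from the previous slide (one group per portal, product, host type, and environment) is what makes the intersection pick out the right hosts.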
  37. 37. Build Deployment Automation: Identify software policy & RPM – Software policy naming in HP SA: {Application} – {Environment} – Use the findSoftwarePolicyRefs OCLI command to identify the software policy – Use the findRPMRefs OCLI command to identify the RPM
  38. 38. Build Deployment Automation: Build RPM and upload it to HP SA – Required parameters: application and build version – A Perl application on Apache builds the RPM – The client sends an HTTP request with the parameters to trigger the Perl application – The RPM is uploaded to HP SA with OCLI 1.0 – The result is returned to the client
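The client side of that HTTP trigger might build its request URL as in the sketch below. The endpoint path and the query parameter names (`application`, `version`) are assumptions, since the deck does not show the Perl application's interface. The sketch only constructs the URI and leaves sending it to whatever HTTP client the middleware uses.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch: builds the request URI for the RPM build service. The endpoint and
// the parameter names "application" and "version" are hypothetical.
final class RpmBuildRequest {

    // URL-encodes both required parameters and appends them as a query string.
    static URI uri(String endpoint, String application, String buildVersion) {
        String query = "application=" + URLEncoder.encode(application, StandardCharsets.UTF_8)
                + "&version=" + URLEncoder.encode(buildVersion, StandardCharsets.UTF_8);
        return URI.create(endpoint + "?" + query);
    }
}
```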
  39. 39. Build Deployment Automation: Stop/start applications on target servers – Use the “HostsSSH: run Linux commands in a list of hosts” utility workflow to run the stop/start command on target hosts. Detach/attach software policies and remediate target servers – Use OO out-of-box operations. Update RPM in software policies – Use the OCLI update command in the software policy service to replace the RPM in the target software policy
  40. 40. Build Deployment Automation: Put it all together! Flow diagram labels: identify SP, RPM, and target servers; build and upload RPM; start/stop application in target servers; detach/attach SP, replace RPM in SP, and remediate
  41. 41. Unattended WebMD Content Backup
  42. 42. Unattended WebMD Content Backup: Goal: develop two OO workflows: 1. Shut down all components and back up WebMD contents 2. Bring all components up. Workflow key features: – Identify target servers – Windows: stop/start Windows services and IIS sites – Linux: stop/start applications and run any script if needed – Send error/success email notices – Use the OO scheduler to trigger the cold backup – The first workflow needs to set up another schedule that triggers the second flow to bring up all components
  43. 43. Unattended WebMD Content Backup: Workflows Overview. Flow 1: 1. Shut down all components 2. Run file backup 3. Run DB backup 4. Schedule another flow (flow 2) to start all components. Flow 2: 1. Check backup status 2. Start all components. Diagram label: 60 min
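The hand-off between the two flows can be sketched as follows: flow 1 runs its steps in order and then asks a scheduler to start flow 2 later. This is plain Java, not OO scheduler code. Reading the "60 min" diagram label as the delay before flow 2 is an assumption, as are the flow name `StartAllComponents` and the scheduler callback shape.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.function.BiConsumer;

// Sketch of the flow 1 to flow 2 hand-off. The scheduler callback is a
// hypothetical stand-in for the OO scheduler.
final class ColdBackup {

    // Assumption: the "60 min" label on the slide is the delay before flow 2.
    static final Duration RESTART_DELAY = Duration.ofMinutes(60);

    // Hypothetical name for the second flow.
    static final String FLOW_2 = "StartAllComponents";

    // Runs the flow 1 steps in order (shut down, file backup, DB backup),
    // then schedules flow 2 to run RESTART_DELAY after `now`.
    static void runFlowOne(List<Runnable> steps,
                           BiConsumer<String, Instant> scheduler,
                           Instant now) {
        for (Runnable step : steps) {
            step.run();
        }
        scheduler.accept(FLOW_2, now.plus(RESTART_DELAY));
    }
}
```

Scheduling the restart as a separate flow, rather than sleeping inside flow 1, lets flow 2 begin with its own backup status check as the slide describes.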
  44. 44. Maintenance Free System
  45. 45. Maintenance Free System: Goal: proactively maintain the health of our applications without shutting them down. Workflow key features: – Automatically clears cache and stale data without shutting down or restarting applications – Purges outdated publishing data and logs – Ensures that the most relevant information is retained – Improves both system-level and publishing performance – Minimizes the need for frivolous restarts – Keeps our applications online longer
  46. 46. Maintenance Free System: Workflow details – Single SSH node – Runs a script to purge data/log files older than 3 days – Runs on the OO scheduler once a day
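The deck does not show the purge script itself. As an illustration of the same rule (delete files whose last modification is more than 3 days old), here is a sketch in Java rather than the shell script the SSH node would actually run. The class name and the choice to recurse into subdirectories are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch of an age-based purge. Assumption: regular files anywhere under
// `root` are eligible; directories themselves are never removed.
final class OldFilePurger {

    // Deletes regular files last modified before (now - maxAge); returns what was deleted.
    static List<Path> purge(Path root, Duration maxAge, Instant now) {
        Instant cutoff = now.minus(maxAge);
        List<Path> candidates;
        try (Stream<Path> paths = Files.walk(root)) {
            candidates = paths.filter(Files::isRegularFile).collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<Path> deleted = new ArrayList<>();
        for (Path file : candidates) {
            try {
                if (Files.getLastModifiedTime(file).toInstant().isBefore(cutoff)) {
                    Files.delete(file);
                    deleted.add(file);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return deleted;
    }
}
```

A daily run would call `OldFilePurger.purge(dir, Duration.ofDays(3), Instant.now())` for each data or log directory.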
  47. 47. Results from HP SA/OO Implementation
  48. 48. Better Life in cmsops - 1
      Before: sampling duration Oct 11, 2007 – Jul 24, 2009 (653 days/426 working days); source: customized TeamTrack reports and emails. Summary: 1772 TeamTrack requests; 479 email requests*; 5.3 tickets/working day.
      After: sampling duration Jul 25, 2009 – Dec 10, 2009 (135 days/93 working days); source: customized TeamTrack reports and emails. Summary: 248 TeamTrack requests; 35 email requests (reduced by 35%); 3.1 tickets/working day; 285 cmsai requests (self-service).
  49. 49. Better Life in cmsops - 2: – Non-prod environments are self-serviceable – 15% of build deployment is automated – Automatic/scheduled data/log purging – Scheduled/unattended cold backup*
  50. 50. Q/A
  51. 51. To learn more on this topic, and to connect with your peers after the conference, visit the HP Software Solutions Community.