AppJam 2012: Zero to Production APM in 30 days
  • ExactTarget is a multichannel marketing platform, with extremely high demands from some of the most demanding brands in the world.
  • Other stats: 25 million API calls processed daily; 10 billion subscriber manipulations daily via segmentation/grouping activities; multiple databases > 10 terabytes; one customer with 32 billion rows under management.
  • 5 production stacks = separate environments with AppD SaaS environment, all architected in the same fashion.
  • 2009 = four monitoring engineers, only responsible for watching and triaging system alerts. No centralized management systems. **Illustrate via story about how many emails we were sending in a single day and the need to wait for an email every four hours.
  • The four tenets shared amongst all of the product teams == collective shared vision.
  • This was an unsolicited set of points shared by one of the ExactTarget product VPs. Specifically called out AppDynamics.
  • Discuss how we are leveraging multiple seemingly disparate information sources (internal systems, AppD) and establishing the connective thread vertically through an incident.
  • Use the Indiana Jones analogy from the grail scene….
  • Transcript

    • 1. Zero to Production APM in 30 days (while sending 500,000,000 messages per day) Kevin Siminski (@ksiminsk) Director of Infrastructure Operations ExactTarget
    • 2. Our Challenge
    • 3. <Slide Title>• <First Bullet> – <Second Bullet>
    • 4. Our Application• Highly Virtualized – 5,000+ Virtual Machines, 90% Windows• Extremely diverse codebase – Microsoft .NET framework• Rapid Growth – 500+ million messages per day – 100+ million interactions per day within IIS Pools
    • 5. 30,000 Foot view: Thousands of IIS servers
    • 6. The Team• Kevin Siminski, Director – 16 years in mission-critical operations – 4 years at ExactTarget (started with a team of four engineers)• Global Operations – Brandon Svec, principal system engineer responsible for implementation – Approximately 20 engineers – USA, UK, India and Australia
    • 7. The role of the operations team, four years ago: Support, Infrastructure Ops., Dev
    • 8. The role of the operations team, today: Global Operations, Development
    • 9. The Problem?
    • 10. Visibility EVERYWHERE
    • 11. The Usual Suspects• Enterprise Monitoring Systems – Nagios – Cacti – Microsoft System Center Operations Manager (SCOM) – Home grown applications• Synthetic Monitoring Systems – Neustar Webmetrics
    • 12. The Usual Alert Storm• < First Bullet> – <Second Bullet>
    • 13. Why AppDynamics?
    • 14. How we chose AppDynamics• Two Vendor Proof of Concept• Success Criteria – Ease of Implementation – Ongoing Support & Maintenance – Implementation timeline during Proof of Concept
    • 15. 30-Day Implementation plan• Instrument all external facing .NET IIS Instances – Approximately 600 node implementation• Single dedicated systems engineer• Production impact not an option• Multiple AppDynamics SaaS controllers• *During the POC, we actually deployed agents to production machines
    • 16. 30-Day Implementation plan• Week 1 (February 2012) – SaaS/Controller Setup with AppD team – Onboard IT-security team with security audit – Review session with configuration management team – Test Agent Deployment: validate network/security paths are open and any configuration changes are vetted by Change Mgmt. Team• Week 2 – Deployed two agents into each production IIS pool, limiting deployment to only 'target' servers – Validate all controllers functioning and accessible
    • 17. 30-Day Implementation plan• Week 3 – All agents deployed to production pools with collection mechanism disabled – Configuration management team “owns” the deployment process and base configuration for all application servers• Week 4 – All services enabled during production change window – Network, Host/guest baselines established for pre and post change windows – All metrics closely monitored throughout the week to ensure no impact
    • 18. Lessons Learned from 30-Day Deployment• Things that went Well – Initial configuration discussions ironed out all firewall rule sets, network paths, etc. – Change management team understood the goal – POC process established solid foundation• Things that could be Improved – Automated agent deployment – Anticipation of load with controllers
    • 19. AppDynamics First Mission• Upgraded mission-critical application database – Microsoft SQL Server from 2003 to 2008 – Goal: No system outage in production stack• High Risk Migration – Unable to assess full risk because of legacy application components• All hands on deck (Dev, Architects, Ops, DBAs) – How did we know if there is an issue post migration?
    • 20. AppDynamics showed us the Impact
    • 21. Accelerating AppDynamics Adoption• Evangelize the tool as often as possible – After every system event, the ops team sends out the AppDynamics view of the event to validate impact
    • 22. Information Sharing is Key• Create views of the data that anyone can leverage – Tier 1 support analyst has a much different view of the world compared to a core systems architect• Educate the users of the system – It's very easy to jump to conclusions in a few clicks
    • 23. Sample View for Support team
    • 24. View for Release “Nights”
    • 25. Dev Ops Collaboration & Feedback Loops• Production exception reporting to engineers• Post Production deployment tests by engineers• Apply AppDynamics performance info to plan• Analyze trends in customer adoption
    • 26. Using AppDynamics REST API in our Dashboards
    • 27. Top Tips for deploying AppDynamics• Do no harm• Align product deployment with Development architects- they are your allies• Data is king- the metrics will illuminate the “hotspots”, use this information wisely• Have the product easily accessible- freely share the dashboards
    • 28. Questions?
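
Slide 26 mentions pulling AppDynamics REST API data into dashboards but the transcript carries no detail. The following is a minimal sketch of what such a pull might look like, not the team's actual implementation. It assumes the controller's `metric-data` REST endpoint with JSON output; the controller host, application name, metric path, and sample payload are all hypothetical placeholders, and authentication and the HTTP call itself are left out.

```python
import json
from urllib.parse import quote, urlencode


def metric_data_url(controller, application, metric_path, minutes=60):
    """Build a metric-data URL for a controller (host and names are placeholders)."""
    query = urlencode({
        "metric-path": metric_path,
        "time-range-type": "BEFORE_NOW",
        "duration-in-mins": minutes,
        "output": "JSON",
    })
    return (f"{controller}/controller/rest/applications/"
            f"{quote(application)}/metric-data?{query}")


def summarize(payload):
    """Reduce a metric-data JSON response to {metric name: average value}."""
    summary = {}
    for metric in json.loads(payload):
        values = [point["value"] for point in metric.get("metricValues", [])]
        if values:  # skip metrics that returned no data points
            summary[metric["metricName"]] = sum(values) / len(values)
    return summary


# Hypothetical controller, application, and metric path, for illustration only.
url = metric_data_url(
    "https://example.saas.appdynamics.com",
    "Email Send",
    "Overall Application Performance|Average Response Time (ms)",
    minutes=30,
)

# Hand-written sample shaped like a metric-data response (assumed field names).
sample = json.dumps([{
    "metricName": "Average Response Time (ms)",
    "metricValues": [{"value": 100}, {"value": 200}],
}])

print(url)
print(summarize(sample))
```

A dashboard could call `summarize` on the response body for each production stack and render the averages side by side, which fits the slide's point about creating views that anyone can use.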