AppJam 2012: Zero to Production APM in 30 days
  • ExactTarget is a multichannel marketing platform that faces extremely high demands from some of the most demanding brands in the world.
  • Other stats:
    • 25 million API calls processed daily
    • 10 billion subscriber manipulations daily via segmentation/grouping activities
    • Multiple databases > 10 terabytes
    • One customer: 32 billion rows under management
  • 5 production stacks = separate environments with an AppD SaaS environment, all architected in the same fashion.
  • 2009 = four monitoring engineers, responsible only for watching and triaging system alerts. No centralized management systems. **Illustrate via story about how many emails we were sending in a single day and the need to wait for an email every four hours.
  • The four tenets shared among all product teams = collective shared vision.
  • This was an unsolicited set of points shared by one of the ExactTarget product VPs. It specifically called out AppDynamics.
  • Discuss how we are leveraging multiple, seemingly disparate information sources (internal systems, AppD) and establishing the connective thread vertically through an incident.
  • Use the Indiana Jones analogy from the grail scene.
  • Transcript

    • 1. Zero to Production APM in 30 days (while sending 500,000,000 messages per day) Kevin Siminski (@ksiminsk), Director of Infrastructure Operations, ExactTarget
    • 2. Our Challenge
    • 3. <Slide Title> • <First Bullet> – <Second Bullet>
    • 4. Our Application
      • Highly Virtualized
        – 5,000+ virtual machines, 90% Windows
      • Extremely diverse codebase
        – Microsoft .NET framework
      • Rapid Growth
        – 500+ million messages per day
        – 100+ million interactions per day within IIS pools
    • 5. 30,000-Foot View: thousands of IIS servers
    • 6. The Team
      • Kevin Siminski, Director
        – 16 years in mission-critical operations
        – 4 years at ExactTarget (started with a team of four engineers)
      • Global Operations
        – Brandon Svec, principal system engineer responsible for implementation
        – Approximately 20 engineers
        – USA, UK, India and Australia
    • 7. The role of the operations team, four years ago: Support, Infrastructure Ops., Dev
    • 8. The role of the operations team, today: Global Operations, Development
    • 9. The Problem?
    • 10. Visibility EVERYWHERE
    • 11. The Usual Suspects
      • Enterprise Monitoring Systems
        – Nagios
        – Cacti
        – Microsoft System Center Operations Manager (SCOM)
        – Home-grown applications
      • Synthetic Monitoring Systems
        – Neustar Webmetrics
    • 12. The Usual Alert Storm
    • 13. Why AppDynamics?
    • 14. How we chose AppDynamics
      • Two-vendor proof of concept
      • Success criteria
        – Ease of implementation
        – Ongoing support & maintenance
        – Implementation timeline during proof of concept
    • 15. 30-Day Implementation plan
      • Instrument all external-facing .NET IIS instances
        – Approximately 600-node implementation
      • Single dedicated systems engineer
      • Production impact not an option
      • Multiple AppDynamics SaaS controllers
      *During the POC, we actually deployed agents to production machines
    • 16. 30-Day Implementation plan
      • Week 1 (February 2012)
        – SaaS/controller setup with AppD team
        – Onboard IT security team with security audit
        – Review session with configuration management team
        – Test agent deployment: validate that network/security paths are open and that any configuration changes are vetted by the change management team
      • Week 2
        – Deployed two agents into each production IIS pool; deployment limited to 'target' servers only
        – Validate all controllers functioning and accessible
    • 17. 30-Day Implementation plan
      • Week 3
        – All agents deployed to production pools with collection mechanism disabled
        – Configuration management team “owns” the deployment process and base configuration for all application servers
      • Week 4
        – All services enabled during production change window
        – Network and host/guest baselines established for pre- and post-change windows
        – All metrics closely monitored throughout the week to ensure no impact
    • 18. Lessons Learned from 30-Day Deployment
      • Things that went well
        – Initial configuration discussions ironed out all firewall rule sets, network paths, etc.
        – Change management team understood the goal
        – POC process established a solid foundation
      • Things that could be improved
        – Automated agent deployment
        – Anticipation of load with controllers
    • 19. AppDynamics First Mission
      • Upgraded mission-critical application database
        – Microsoft SQL Server from 2003 to 2008
        – Goal: no system outage in production stack
      • High-risk migration
        – Unable to assess full risk because of legacy application components
      • All hands on deck (Dev, Architects, Ops, DBAs)
        – How would we know if there was an issue post-migration?
    • 20. AppDynamics showed us the Impact
    • 21. Accelerating AppDynamics Adoption
      • Evangelize the tool as often as possible
        – After every system event, the ops team sends out the AppDynamics view of the event to validate impact
    • 22. Information Sharing is Key
      • Create views of the data that anyone can leverage
        – A Tier 1 support analyst has a much different view of the world compared to a core systems architect
      • Educate the users of the system
        – It's very easy to jump to conclusions in a few clicks
    • 23. Sample View for Support team
    • 24. View for Release “Nights”
    • 25. Dev Ops Collaboration & Feedback Loops
      • Production exception reporting to engineers
      • Post-production deployment tests by engineers
      • Apply AppDynamics performance info to plan
      • Analyze trends in customer adoption
    • 26. Using AppDynamics REST API in our Dashboards
    • 27. Top Tips for deploying AppDynamics
      • Do no harm
      • Align product deployment with Development architects; they are your allies
      • Data is king: the metrics will illuminate the “hotspots”; use this information wisely
      • Have the product easily accessible; freely share the dashboards
    • 28. Questions?
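
Slide 26 mentions feeding dashboards from the AppDynamics REST API, but the deck does not show how. The following is a minimal sketch of one way such a query could look, not the presenter's actual code. It assumes the controller exposes the `metric-data` endpoint under `/controller/rest/applications/<application>/` with the query parameters shown, and that the JSON response is a list of objects carrying `metricPath` and `metricValues`; check both against the documentation for your controller version. The controller URL, application name, and sample response are hypothetical.

```python
import json
from urllib.parse import quote, urlencode


def metric_data_url(controller_url, application, metric_path, minutes=15):
    """Build a metric-data query URL covering the last `minutes` minutes."""
    query = urlencode({
        "metric-path": metric_path,
        "time-range-type": "BEFORE_NOW",
        "duration-in-mins": minutes,
        "output": "JSON",
    })
    app = quote(application, safe="")  # encode spaces and slashes in the name
    base = controller_url.rstrip("/")
    return f"{base}/controller/rest/applications/{app}/metric-data?{query}"


def latest_values(response_body):
    """Map each metric path to its most recent value.

    Assumes the body is a JSON list of objects with "metricPath" and a
    time-ordered "metricValues" list whose entries have a "value" field.
    """
    latest = {}
    for metric in json.loads(response_body):
        values = metric.get("metricValues", [])
        if values:
            latest[metric["metricPath"]] = values[-1]["value"]
    return latest


# Hypothetical names and a hand-written sample body, for illustration only.
url = metric_data_url(
    "https://example.saas.appdynamics.com",
    "Email Platform",
    "Overall Application Performance|Average Response Time (ms)",
)
sample_body = json.dumps([{
    "metricPath": "Overall Application Performance|Average Response Time (ms)",
    "metricValues": [{"value": 118}, {"value": 124}],
}])
print(url)
print(latest_values(sample_body))
```

The URL can be fetched with any HTTP client using controller credentials; keeping URL construction and response parsing as separate pure functions makes the dashboard logic testable without touching a live controller.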

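
Week 4 of the plan on slide 17 establishes baselines before and after the change window and watches the metrics for impact. The deck does not say how the comparison was done; the sketch below shows one simple approach, assuming each metric is a list of samples per window and that a relative shift in the mean beyond a tolerance counts as impact. The metric names, sample values, and the 10% tolerance are all hypothetical.

```python
from statistics import mean


def relative_shift(pre_samples, post_samples):
    """Relative change of the post-change mean against the pre-change mean."""
    pre_mean = mean(pre_samples)
    if pre_mean == 0:
        raise ValueError("pre-change mean is zero; relative shift is undefined")
    return (mean(post_samples) - pre_mean) / pre_mean


def flag_impact(pre, post, tolerance=0.10):
    """Return metrics whose mean moved by more than `tolerance` (as a fraction).

    Only metrics present in both windows are compared.
    """
    flagged = {}
    for name in pre.keys() & post.keys():
        shift = relative_shift(pre[name], post[name])
        if abs(shift) > tolerance:
            flagged[name] = shift
    return flagged


# Hypothetical samples for the pre- and post-change windows.
pre_window = {"cpu_percent": [40, 42, 38], "response_ms": [120, 118, 122]}
post_window = {"cpu_percent": [50, 52, 48], "response_ms": [121, 119, 123]}
print(flag_impact(pre_window, post_window))
```

Comparing means is deliberately crude; a real deployment would likely compare like-for-like time windows and use percentiles as well, since averages can hide tail latency.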