Building DevOps with Beer & Whiteboards

Velocity 2013 - How Edmunds learned from failure, began opening communication between silos, and built a DevOps culture over beer and whiteboards.

(HINT: Download to see the presenter's notes for what may not make sense without a speaker!)

Published in: Technology, News & Politics

  • - The automotive resource of the Internet - Originally in print, then Gopher in 1994, Web in 1996
  • - Our environment is highly distributed. When you visit Edmunds.com you’re interacting with one or more of our 30 web apps spread out across a couple hundred hosts. - The website itself is built on Apache Tomcat, Solr, MongoDB, and Oracle Coherence. - Internally, you’ll also find ActiveMQ, Oracle, and some lingering WebLogic services we’ll soon be doing away with. - We rely heavily on a mix of different tools to build and support all this: Chef, Jenkins, CloudStack, AppDynamics, Splunk, to name a few. - But I’m getting ahead of myself, because how we got to this architecture is part of the tale of how Edmunds came to embrace a DevOps mindset.
  • - So then where does our story start? - Let me be up front: WE STUMBLED. WE PERFECTED THE FACEPALM. - The specifics of our situations when the shit hit the fan may have felt unique, but they’re not. - We learned from our mistakes with the intent of getting better. - Let’s talk facepalms...
  • - This may be familiar... - In 2005, we had 30 servers. In 2006, we burst up to 300 and held steady for a few years with slow growth. - In 2009, we saw a radical jump in server deployment - We grew in number of servers, but not in the number of admins - We had Kickstart, but that’s only good at bootstrap time - BladeLogic + AnthillPro seemed a good solution, but there were major issues - Growth is painful
  • - One very specific breakdown in our history stands out to me. - 2007 - Edmunds 2.0: Introducing CMS for the business - All content was locked to a monthly release cycle - Six months of functional testing, without any performance validation. - Two months before launch, performance testing uncovered scalability issues. - Ops response: double application infrastructure and throw in a hardware cache appliance. - Breakdown in relationships between Dev/Ops led to major business costs. - Fast forward to 2009; remember that big jump in the number of servers we were deploying?
  • - 2010 Edmunds Redesign: Complete rewrite of all website code + modular breakout of applications. - Good collaboration between Dev/Ops to understand requirements on all sides. - But QA + BETA were built brick-by-brick, and not easily reproducible. - Armed with BladeLogic + AnthillPro, build and deploy were more automated but weren’t coupled together! - Production environment took 3 months to build while BETA served the new website. - We started to realize that the real challenge wasn’t technology but culture.
  • We wanted to stop working like this...
  • and start building like this.
  • We really wanted to get out of here.
  • - And go here - This is the Daily Pint! Let me buy you a beer! - This is where the wildest of ideas are born - Disagreements are worked through with positive jest and jeers - It is where we talked it over
  • - Then we’d take it here! - THE MOST UNDERRATED TOOL YOU ALREADY HAVE. - Floor-to-ceiling whiteboards where we worked out our ideas. - We talked gaps in handoffs, failure rates due to manual builds, linking tools together, “self-service”, automated testing, and much, much more. - What happened there was no “ops”, no “dev”. We were technologists working to solve problems with no boundaries of roles in the way. - Our proposal: tear down silos. - We did just that!
  • - So who made this happen, and how? - Tech leads who spent too much time in war rooms started chewing on the problem together. - Identified gaps in provisioning/config management and app deployment tools. - Scott McNealy was right about hardware/software dependencies. - Two teams, Production Engineering and Automation Engineering, set about providing tools which bridged the divide. - (ProdEng = Ops) + (AutoEng = Dev) == How we really started making inroads. (NOT IDEAL!) - Members of both these teams shed traditional views on what they were supposed to do and just did it. - The results were improved relationships, better tooling, and a clearer perspective on how future projects could work.
  • - So we started linking all our tools together! - “Your tools don’t make your culture, but they do have an impact on the people who do.”
  • - We now talk about the data our tools provide us - You can talk from your gut, but you better back it up with data - We pushed ownership and accountability by leveraging what we found in the data. - The metrics were clearly pointing out our failures, allowing us to learn how to prevent them in the future.
  • - Armed with a tighter toolchain and a new way of working together, we were once again about to be put to the test. - Edmunds began investing resources into “the cloud”. - Heavily virtualized since 2010, but no clear “cloud” offerings - Two teams, one objective: make edmunds.com work on $x cloud platform - Why two? DIVERSIFY.
  • - This was our first shot at a “new” project armed with our new practices + tooling - These were uncharted waters, even though we’d been virtualized for a few years. “Cloud” is a different beast. - But with familiar tooling + improved communications, these teams produced successful results that were easily measured. - Environment build time down to less than a week. - Done with 95% of the same toolset for both cloud platforms.
  • - We’ve all spent our careers as firefighters. - Street cred with co-workers, bosses, executives as cool headed during a mess - So what about when there are fewer, or different, kinds of fires? - More accountable individuals, more “self-service”, fewer fires == increased capacity for business acumen. - This is the business value that what we call DevOps is leading us to.
  • - To go from this to this... - By investing in addressing systemic issues around communication + partnerships, we increase our capacity to take on other challenges - No big secret, it’s been talked about by Damon Edwards, John Willis - Covered beautifully in “The Phoenix Project” - Technologists in the age of the Internet are no longer back-office workers keeping the lights on - We help shape the direction of our companies, with direct impact on revenue, in a field that now sees change yearly. - We needed to change the way we work together to free ourselves for “bigger things”. - An exciting time to be working in our field!
  • - Okay, back to our cloud initiatives... - With this additional capacity, here are a few things we learned that give value to our company - Cloud isn’t free; server sprawl can be expensive and lack of education with “self-service” becomes a major issue. - How much does it cost to operate your environment? It’s tough to calculate! - Licensing by host or CPUs is costly at scale, so look for alternatives to those things you pay a premium for. - Managing operating costs starts with understanding where the money is going!
  • - A great growing experience the last few years @ Edmunds. - No rose-tinted glasses to suggest we’ve solved all our problems! BUT WE GOT SOME BIG ONES! - And today we work a helluva lot more like this! - So, let’s take on the challenge of showing some metrics of success from adopting a DevOps culture...
  • - Application Availability has increased. Not the holy metric of “four 9’s”, but a bump all the same! - The number of high-severity INCs has dropped 50% year-over-year - The number of TKTs filed has dropped 50% year-over-year --- Self-service is slick! - The MTTR of pre-production issues has drastically reduced from 5 days to 2 days, and is even faster than that in most situations. - The time it takes us to build runways has gone down from 3 months to 1 week! - With deeper inspection of our cost per host, we’re expecting to begin shaving off overall operating costs drastically for next year’s budget. - Team morale? Well...
  • We got out of here.
  • And into here, so it’s pretty good.
  • - Always more to be done! You’re never “finished” growing. - Devs on-call! (You build it, you run it!) - Reducing infrastructure footprint == reducing operating costs - More RESTful applications - Other cloud offerings?
  • Building DevOps with Beer & Whiteboards

    1. BUILDING DEVOPS WITH BEER AND WHITEBOARDS. JOHN MARTIN @tekBuddha, STEVE BURTON @BurtonSays
    2. CALL OF DUTY: DEV OPS
    3. the challenge. GAME SELECT: DEVELOPER, OPERATIONS, DEVOPS, NOOPS. DEVOPS MISSION PARAMETERS: MISSION OBJECTIVES: KILL YOUR COMPETITORS - DEVELOP, TEST, DEPLOY, OPERATE - AUTOMATION & BUSINESS AGILITY. RECOMMENDED ESSENTIALS: BEER, WHITEBOARDS, COMMUNICATION
    4. but what is success?
    5. “success is going from failure to failure without losing enthusiasm” Winston Churchill
    6. failure
    7. mean time to innocence (MTTI)
    8. mean time to resolution (MTTR): Weeks, Days, Hours or Minutes?
    9. mean time between failure (MTBF): Weeks, Days, Hours or Minutes?
    10. availability? 99.9%. The most meaningless metric in IT today.
    11. business metrics: > revenue > throughput > performance > productivity
    12. Edmunds.com: EXPERT CAR ADVICE, FOUNDED IN 1966, 550 EMPLOYEES, 650K DAILY UNIQUES
    13. whoami: SR DIRECTOR PRODUCTION ENGINEERING. A DECADE SUPPORTING JAVA ARCHITECTURES. FUELED BY METRICS, WHITEBOARDS, LOGS, AND BEER
    14. Our environment.
    15. Compelling Events. Source: http://is.gd/iJU4et
    16. Growing Pains
    17. Communication
    18. 2010 Redesign. Source: http://is.gd/L77vl1. COMPLETE REWRITE OF PLATFORM. QA & BETA WORKED GREAT! BETA BECOMES PROD. 3 MONTHS IN A WAR ROOM
    19. NOT LIKE THIS. Source: http://is.gd/PFLRmW
    20. LIKE THIS. Source: http://is.gd/iJU4et
    21. OUT OF HERE. Source: http://is.gd/oFCXNH
    22. IN TO HERE. Source: http://is.gd/iJU4et
    23. ONE OF THE MOST UNDERRATED TOOLS YOU ALREADY HAVE. THE WHITEBOARD
    24. TEARING IT DOWN. Source: http://is.gd/Vrnwu4
    25. The Toolshed
    26. Communicating with Metrics. Source: http://is.gd/L77vl1. DATA DRIVEN CULTURE. CHECK THE GUT. DRIVE ACCOUNTABILITY. LEARN FROM FAILURE
    27. CLOUDY SKIES. Source: http://is.gd/arBZ4M
    28. Putting It All Together. Source: http://is.gd/L77vl1. UNCHARTED WATERS. FAMILIAR TOOLING. IMPROVED COMMUNICATIONS. MEASURABLE SUCCESS STORIES
    29. A Personal Note. Source: http://is.gd/L77vl1
    30. A Personal Note. Source: http://is.gd/L77vl1
    31. The Business Proposition. Source: http://is.gd/L77vl1. THE CLOUD ISN’T FREE. COST PER HOST CAN GET SCARY. LOOK FOR THE FREEBIES
    32. AWESOMENESS. Source: http://is.gd/iJU4et
    33. Measuring Success. Source: http://is.gd/L77vl1

        | Metric                      | Before   | After    | Benefit     | $ Savings                |
        | Application Availability %  | 99.91%   | 99.95%   | > 0.04%     | $167k revenue protection |
        | # of High Severity Incidents | 21      | 10       | < 50%       | $307k productivity       |
        | # of Help desk Tickets      | 196      | 99       | < 50%       |                          |
        | MTTR in Pre-Production      | 5 Days   | 2 Days   | < 45%       | $320k productivity       |
        | Time To Build Runways       | 3 Months | < 1 Week | Seriously?! |                          |
        | Operating Costs             | $$$$     | TBD      |             |                          |
        | Team Morale                 | Bummered | Beer     |             |                          |
    34. OUT OF HERE. Source: http://is.gd/oFCXNH
    35. IN TO HERE. Source: http://is.gd/iJU4et
    36. WHERE NEXT? Source: http://is.gd/xKdI6E
    37. JOHN MARTIN jmartin@edmunds.com @tekBuddha. STEVE BURTON sburton@appdynamics.com @BurtonSays. QUESTIONS? We’re hiring! Stop by our booth!
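
The availability, MTBF, and MTTR slides and the before/after figures on the Measuring Success slide lend themselves to a quick arithmetic check. The following is a minimal Python sketch, not anything shown in the talk: the function names are hypothetical, a year is assumed to be 365 days, and availability is assumed to follow the standard steady-state relation MTBF / (MTBF + MTTR).

```python
# Hypothetical helpers for sanity-checking the talk's metrics.
# Assumption: a year is 365 days (8760 hours); leap years are ignored.
HOURS_PER_YEAR = 365 * 24


def annual_downtime_hours(availability_pct: float) -> float:
    """Hours of downtime per year implied by an availability percentage."""
    return (100.0 - availability_pct) / 100.0 * HOURS_PER_YEAR


def percent_reduction(before: float, after: float) -> float:
    """Relative drop from `before` to `after`, as a percentage of `before`."""
    return (before - after) / before * 100.0


def availability_pct(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability (%) from MTBF and MTTR, both in hours."""
    return mtbf_hours / (mtbf_hours + mttr_hours) * 100.0


if __name__ == "__main__":
    # Before/after availability figures taken from the Measuring Success slide.
    for pct in (99.91, 99.95):
        print(f"{pct}% availability -> {annual_downtime_hours(pct):.1f} h/year down")

    # Before/after counts taken from the same slide.
    print(f"High-severity incidents: {percent_reduction(21, 10):.1f}% fewer")
    print(f"Help desk tickets: {percent_reduction(196, 99):.1f}% fewer")

    # Two made-up MTBF/MTTR pairs (illustrative only) with the same availability:
    # one long outage per ~1000 hours versus many short ones.
    print(f"{availability_pct(999.0, 1.0):.1f}%")
    print(f"{availability_pct(9.99, 0.01):.1f}%")
```

With the slide's figures, 99.91% works out to roughly 7.9 hours of downtime a year and 99.95% to roughly 4.4 hours. The last two calls show why a bare availability number says little on its own: very different failure patterns can produce the same percentage.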