Cloud Applications Management Nirvana


Hybrid cloud is becoming a necessity for many organizations. But building and managing an environment that effectively leverages the strengths of both public and private clouds can be a greater challenge than anticipated. One of the most critical elements of a hybrid cloud scenario is a management solution that handles the cloud application lifecycle effectively. This presentation focuses on how organizations can manage their hybrid environments to ensure they achieve cloud computing success.

Published in: Technology, Business
  • We live in a fast-paced world where new capabilities that satisfy a user need can be delivered quickly. Users have come to expect the continual delivery of functionality and operability. If we are not set up to be agile, somebody else is, and we can quickly fall behind.
  • Hence the application lifecycle has transitioned from serial big-bang releases to smaller releases with continuous delivery, to accommodate constantly changing business needs.
  • We are creating an environment of corporate anxiety and raised blood pressure. Developers start to take shortcuts with respect to testing and documentation, and when the code is used in production, it doesn’t work as expected.
  • We could end up dealing with stress by drinking heavily or, if you are like me, eating chocolates.
  • I am going to talk about some of the steps we can take to manage the cloud application lifecycle better. These are things we have tried ourselves or with our customers.
  • There are a lot of options available in the market today: public, private, hybrid, IaaS, PaaS, SaaS, and a variety of vendors satisfying each need.
  • The key foundation is choosing the right platform to solve the business need. Focusing our efforts on what matters helps us develop a competitive advantage.
  • Plug-in based architectures are excellent examples of the contextual abstraction. The plug-in API provides a plethora of data structures and other useful context that developers inherit from or summon via existing methods. But to use the API, a developer must understand what that context provides, and that understanding is sometimes expensive… Examples: Eclipse and IntelliJ. Composable systems tend to consist of finer-grained parts that are expected to be wired together in specific ways, e.g. parsing a file using a higher-generation language vs shell scripts. Composable build tools scale (in time, complexity, and usefulness) better than contextual ones. Contextual tools like Ant and Maven allow extension via a plug-in API, making the extensions the original authors envisioned easy. However, extending them in ways not designed into the API ranges in difficulty from hard to impossible. Community examples: Eclipse vs the command line; in the operations community, Chef and Puppet vs RBA (connection vs plug-in model).
  • This page lists some of the things to keep in mind when evaluating cloud platforms. The SLA in particular is a tricky one because it is the SLA of a single component. When you tie the various components together, the SLA of the overall system is lower than that of any one component, and you will have downtime even with five nines.
  • We have all been victims of Murphy’s law: anything that can go wrong will go wrong. So we have to be prepared and plan for failure to avoid major losses to our business.
  • The best defense against unexpected failures is to fail often. Game Day and the Monkeys from the Simian Army are about discovering and learning from failures more quickly and proactively. As we learn more about our applications, we can work resilience engineering practices into our application architectures. The Simian Army: Chaos Monkey, Latency Monkey, Conformity Monkey, Doctor Monkey, Janitor Monkey, Security Monkey, 10-18 Monkey, Chaos Gorilla, Chaos Kong.
  • Automated failover architectures, circuit-breaker patterns, and other resilience engineering practices make availability more continuous, both by preventing failures and by enabling faster healing. E.g. Netflix: “If our recommendations system is down, we degrade the quality of our responses to our customers, but we still respond. We’ll show popular titles instead of personalized picks. If our search system is intolerably slow, streaming should still work perfectly fine.” It starts with design and with things you can do from ops. Infrastructure: how the components communicate with each other about failure.
  • What are you automating? What things? How do you automate, and with which tools?
  • Git – code repository; Jenkins – kicks off tests; Vagrant – builds our dev environment; Maven – build; fpm – creates RPMs and .deb files; Grails – web framework based on the Groovy language; Artifactory – repository manager; Chef – installs; Logstash – collects logs from different servers; Graphite – visualization; Nagios – monitoring; Riemann – monitoring; New Relic – monitoring; Boundary – monitoring; Aria – subscription billing; Qualys – vulnerability scans; Twilio – SMS; OSSEC – IDS.
  • Making decisions based on data is the next important step in cloud applications management
  • We have used Logstash to consolidate logs in our environment.
  • And we visualize the data using Graphite. Other tools: Splunk, Cacti, RRDtool, Netflix dashboard, Etsy dashboard.
  • Outsourcing to the cloud to reduce CapEx and move to an OpEx model just meant that while availability, confidentiality and integrity of your service and assets are solid, your sustainability and survivability are threatened.
  • Chris Hoff in his blog talked about economic denial of sustainability (EDoS) -- where the dynamism of the infrastructure allows scaling of service beyond the economic means of the vendor to pay their cloud-based service bills.
  • There are several tools in the market that help track costs. Once you have visibility into costs, you can start building applications with costs in mind: What server size do you really need? How much data should you transfer? When should you transfer data? PA Consulting Group recently worked with a public-sector client to deliver a large-scale Google App Engine implementation which needed to query, at speed, large data sets calculated from source data. The client had to choose between executing complex run-time queries and paying for processing power, or pre-computing large data sets and so paying for storage. Either approach was valid, but which would be more cost-effective?
  • You can’t talk about cloud applications and not talk about security. While there are several aspects to security, I am going to talk about user management. The challenge that enterprises face today, and that web-scale companies do not, is the need to manage many applications created by many teams, as opposed to one or a few applications. Users belong to one or more groups or departments which may interact with one another, causing a human scale and coordination problem. Apps created by the teams can run in one or more clouds. Each cloud has its own authentication, keys, and certificates, causing operations sprawl. Problems: compliance, coordination.
  • Leverage on-premise directory services to manage users and authentication. Users can log into the cloud security broker (Enstratius) with their AD credentials and then, through that, use resources on the cloud without ever having to have credentials on the outside cloud directly. Then, once the VMs are up and running, you can create user access on those VMs that is unique to the user inside Enstratius, essentially brokering your local AD authentication into VMs on the cloud without ever exposing your authentication resource (AD in this case) to anything outside your firewall. Cloud security is a shared responsibility, and it is important for users to recognize what remains in their domain of responsibility. The CSA has an excellent set of assessment guidelines and controls for cloud security. Challenge: managing cloud as a one-off and forgetting to update correctly; users who change jobs or leave are not fully synced or removed. Solution: synchronize/delegate authentication with LDAP/AD. This retains a single point of control over users and authentication; guest VMs do not talk directly to your LDAP/AD infrastructure; users removed from LDAP are automatically removed from the appropriate VMs.
  • Enstratius’s mission is to address these types of problems.
  • Cloud Applications Management Nirvana
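
The SLA point in the notes above (a system built from several components has a lower SLA than any one of them, and even five nines means some downtime) can be illustrated with a little arithmetic. This is a minimal sketch, assuming the components sit in series and fail independently; the function names and the three-component example are hypothetical and not taken from the presentation.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignores leap years


def composite_availability(component_availabilities):
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the availabilities multiply.
    """
    return prod(component_availabilities)


def downtime_minutes_per_year(availability):
    """Expected minutes of downtime per year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Hypothetical system: three components, each promising "five nines".
    components = [0.99999, 0.99999, 0.99999]
    system = composite_availability(components)
    print(f"one component: {downtime_minutes_per_year(0.99999):.2f} min/year")
    print(f"whole system:  {downtime_minutes_per_year(system):.2f} min/year")
```

Under these assumptions a single five-nines component still allows a little over five minutes of downtime a year, and chaining three of them roughly triples that.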

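The directory-driven approach in the notes above (users removed from LDAP/AD are automatically removed from the appropriate VMs) comes down to a reconciliation step. Below is a minimal sketch, assuming directory membership and a VM's accounts are both available as plain sets of usernames; `plan_sync`, `SyncPlan`, and the sample names are hypothetical and do not represent the Enstratius API or any LDAP library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncPlan:
    to_add: frozenset     # in the directory but missing on the VM
    to_remove: frozenset  # on the VM but no longer in the directory


def plan_sync(directory_users, vm_users):
    """Compare the directory (the source of truth) with one VM's accounts."""
    directory_users = frozenset(directory_users)
    vm_users = frozenset(vm_users)
    return SyncPlan(
        to_add=directory_users - vm_users,
        to_remove=vm_users - directory_users,
    )


if __name__ == "__main__":
    directory = {"alice", "bob"}    # hypothetical directory group
    vm_accounts = {"bob", "carol"}  # carol is no longer in the directory
    plan = plan_sync(directory, vm_accounts)
    print("add:", sorted(plan.to_add))
    print("remove:", sorted(plan.to_remove))
```

Keeping the directory as the only source of truth is what lets the VMs stay in sync without ever talking to LDAP/AD themselves; a broker would compute a plan like this and apply it on their behalf.
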
    2. 2. Seema Jethani, Director Product Management, Enstratius, @seemaj
    3. 3. The era of delayed gratification is over. The Internet allows innovations to be delivered as a constant flow that incorporates user needs. We live in a fast-paced world.
    4. 4. The new application lifecycle: Plan, Develop, Deploy, Operate, Optimize
    5. 5. Dealing with constant change
    6. 6. Dealing with constant change
    9. 9. Too many choices?
    10. 10. Right tools for the right job. Focus on what matters. Outsource everything else.
    11. 11. Why is it important? Picking the right system – accumulates less technical debt. Every project has different needs – what matters is higher-level business goals. Vendor relationships may exist – it’s time to forget them.
    12. 12. Evaluating Cloud Platforms. Criteria: • Data Management • How and where will the data be stored? • Who can access the data and who owns it? • Security • Terms of Service • Support • Privacy Policy • Service Level Agreements (be careful about this one) • Ethics • Disclaimers • Breakup penalty • Price, Billing and Accounting • Technical Capabilities • Data and application architecture • APIs and data transformations • Performance • Geographies
    13. 13. Step 2: Plan for Failure. Complexity increases, defects accumulate. No single component can guarantee 100% uptime. Failure happens, and not JUST in the public cloud.
    14. 14. Test for Failure. The best defense against major unexpected failures is to fail often. Tools: • Simian Army (all those damn Monkeys) • Game Day. Increase resilience through large-scale fault injection across critical systems. How: start small; learn lessons; build confidence; full-scale live exercises; build resiliency into coding practices.
    15. 15. Design for Failure. Redundancy, Fault-Tolerance and Graceful Degradation: enables a system to continue operating properly in the event of the failure of some of its components. Circuit Breaker: protects clients from slow or broken services; protects services from demand in excess of capacity. Feature Flags: restrict features to certain environments, while still using the same code base on all servers.
    16. 16. Step 3
    17. 17. What to automate? Plan, Develop, Deploy, Operate, Optimize. Create and configure lightweight, reproducible, and portable development environments. Trigger builds, tests, manage features in real time. Monitor applications, track costs. Manage resources, scale up/down rapidly on-demand.
    18. 18. How to Automate? Market of Tools: fpm
    19. 19. Step 4
    20. 20. Let data drive your decisions. Gathering and analyzing logs using Logstash
    21. 21. Let data drive your decisions. Visualizing using Graphite
    22. 22. Step 5: Design and operate with costs in mind
    23. 23. There is a new attack in town … Bring the service down not by stopping the service but by making it extremely expensive to run. Botnets can make seemingly legitimate requests for service to generate an economic denial of sustainability (EDoS) -- where the dynamism of the infrastructure allows scaling of service beyond the economic means of the vendor to pay their cloud-based service bills.
    24. 24. Measuring costs. Categories: Subscription Billing (manage online subscription services); IT Accounting, Charge-back, Show-back (charging back variable IT costs; a foundation for providing basic IT cost transparency); IT Finance and Technology Business Management (a more strategic role to manage and forecast costs, evaluate overall value, and assist in IT/business decision-making). Vendors: Aria, Monexa, Zuora, Cloudability, CloudRows, Cloudyn, Costnomics, Newvem, Nicus Software, Pace Applied Technology, uptimeCloud, Apptio, BMC, Claritia, CloudCruiser, Comsci, Cube Billing.
    25. 25. Step 6
    26. 26. Challenges with User Management. Users belong to one or more groups or departments which may interact with one another, causing a human scale & coordination problem. Apps created by the teams can run in one or more clouds. Each cloud has its own authentication, keys, and certificates, causing operations sprawl.
    27. 27. Leverage cloud security brokers. Use cloud security broker solutions to manage access to clouds, cloud resources or keys without exposing internal services. Diagram labels: Enterprise Directory Deployment; New or existing user; Removed user; Cloud Management Soln (• Users • Groups • Access Rights • Keys); Add / Sync; Remove.
    28. 28. Step 7: Invest in your people and culture
    29. 29. If you do nothing else: Hire smart people to figure things out. You cannot automate everything – YMMV. Get them to talk to each other. Communication is key.
    30. 30. The 7 Steps: 1. Choose your path wisely; 2. Plan for failure; 3. Automate all the things; 4. Be data-driven; 5. Design and operate with costs in mind; 6. Security is not an after-thought; 7. Invest in your people and culture.
    31. 31. The Enstratius Cloud Management Platform
    32. 32. Enterprise Scenario – with Enstratius. Single point of control for implementation of governance policies. Directory drives access & authentication. Full self-service within approved governance framework. Complete, persistent audit trail. Budget controls. Security policy compliance.
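
The circuit breaker from the Design for Failure slide can be sketched as a small wrapper that stops calling a failing service for a while and serves a fallback instead, in the spirit of the Netflix example (popular titles in place of personalized picks). This is an illustrative sketch only: the class, its thresholds, and the stand-in functions are assumptions, not Netflix's or Enstratius's implementation.

```python
import time


class CircuitBreaker:
    """Open after `max_failures` consecutive errors; retry after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        # While open, skip the service until the reset window has elapsed.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result


if __name__ == "__main__":
    def personalized_picks():
        # Hypothetical stand-in for a recommendations service that is down.
        raise TimeoutError("recommendations service is down")

    def popular_titles():
        # Degraded but still useful response.
        return ["popular title A", "popular title B"]

    breaker = CircuitBreaker(max_failures=2, reset_after=60.0)
    for _ in range(3):
        print(breaker.call(personalized_picks, popular_titles))
```

The breaker protects the client from waiting on a broken service and protects the service from further demand while it recovers; the caller always gets a response, even if a degraded one.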