
Our Journey To Continuous Delivery

Virgil Chereches & Robert Mircea
May 28th, 2015
  1. Our Journey to Continuous Delivery – Virgil Chereches & Robert Mircea, May 28th, 2015
  2. “How long would it take your organization to deploy a change that involved just one single line of code? Do you do this on a repeatable, reliable basis?” – Mary and Tom Poppendieck, Implementing Lean Software Development
  3. “Failure is not an option”? Failure is inevitable. Treat it as:
     • a learning opportunity
     • something detectable quickly
     • something you restore service from quickly
  4. IT Performance and DevOps Practices – “Our data shows that IT performance and well-known DevOps practices, such as those that enable continuous delivery, are predictive of organizational performance. As IT performance increases, profitability, market share and productivity also increase. IT is a competitive advantage, not just a utility, and it’s more critical now than ever to invest in IT performance.” http://bit.ly/1LwY4oB
  5. The Need for Continuous Delivery – agility for our business; eliminate time wasted on handoffs between teams
  6. Enable Architectural Decomposition of Systems
  7. Do Continuous Delivery to be Ready for… a Microservices Swarm
  8. Share the Vision of CD – spread the “WHY” and the “WHAT”; ensure people “GET IT”. It started organically, inspired by a few enthusiasts. Understand that it is about changing the way we work.
  9. How We Joined the DevOps Army – 7 DevOps enthusiasts/wannabes, one project, one opportunity, R&D style. Managers jumped on board from the start.
  10. What We Want to Achieve – push-to-release at Orange scale
  11. [Diagram: a commit flows from Dev/UAT to Prod – standalone JBoss clusters going from v1.0 to v1.1 on VMware vCenter infrastructure, serving https://www.orange.ro/myaccount]
  12. Where We Start From:
     • business requests in the hundreds per year
     • 200+ apps in Java, C#, C, a few in PHP; lots of PL/SQL
     • ~150 people – Dev, QA, BA, Ops, PM
     • a mélange of proprietary apps and lots of custom development
  13. What is Continuous Delivery?
     • small, frequent changes delivered constantly
     • a method to reduce the risk of failure
     • a way to give our business a glimpse of the shape of the product being built
     • a set of practices enabled by a toolset
  14. Many deployments eventually lead to a Product Launch. What is the difference between a Deployment and a Product Launch?
  15. Changes flow through the deployment pipeline; “approval” is pushing the button.
  16. Orange Deployment Pipeline Architecture – Atlassian Stash (Git) / SVN, Artifactory, Jenkins/TeamCity/Sonar, Selenium, Enterprise Tester, JMeter, Rundeck/Puppet
  17. Continuous Integration Server – purpose: detect code problems or quality issues quickly
  18. Best Practices for CD:
     • build the application only once and use the same artifact in all environments
     • keep the application releasable all the time – master is releasable at all times
     • hide new functionality until it is finished – use feature flags
     • make small changes incrementally – commit and integrate often
     • use “branching by abstraction” – use a “proxy” component
     • use snapshots (in Maven) with care
  19. Techniques for achieving Continuous Deployment
  20. Blue-Green Deployments [Diagram: www.orange.ro/myaccount-corporate routed between a WebLogic app server (v1.0-WL) and a JBoss app server (v1.0-JBoss)]
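Conceptually, the blue-green switch works like the sketch below. This is a minimal illustration in Python with made-up backend URLs; in the real setup the flip happens at the load balancer, not in application code.

```python
# Blue-green switch sketch. ENVIRONMENTS and the URLs are illustrative
# assumptions; real deployments flip a load-balancer pool instead.
ENVIRONMENTS = {
    "blue": "http://blue.myaccount.internal:8080",
    "green": "http://green.myaccount.internal:8080",
}

class BlueGreenRouter:
    def __init__(self, live="blue"):
        self.live = live  # the environment currently receiving traffic

    def backend(self):
        """URL of the environment serving production traffic."""
        return ENVIRONMENTS[self.live]

    def switch(self):
        """Flip traffic to the other environment; the old one stays idle as rollback."""
        self.live = "green" if self.live == "blue" else "blue"
        return self.backend()
```

The key property is that switching back (rollback) is the same cheap operation as switching forward.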
  21. Feature Flags – integrating with production is a test
     • flag conditions: on/off, Orange IPs, list of users, time interval, release date, gradual rollout (%), blacklist/whitelist of IPs
     • enable/disable features quickly (no restart)
     • ORO testers check the feature in live
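A flag check along these lines can be sketched as below. The flag store, flag name and user names are illustrative assumptions; a real system reads flags from a database or config service so they can change without a restart.

```python
import hashlib

# Hypothetical in-memory flag store (illustration only).
FLAGS = {
    "new_recharge_flow": {
        "enabled": True,
        "whitelist": {"tester1", "tester2"},  # e.g. testers checking the feature live
        "rollout_percent": 10,                # gradual rollout
    }
}

def is_enabled(flag, user):
    cfg = FLAGS.get(flag)
    if not cfg or not cfg["enabled"]:
        return False
    if user in cfg["whitelist"]:
        return True
    # Hash the user so each one lands in a stable rollout bucket (0-99):
    # the same user always gets the same answer during a gradual rollout.
    bucket = int(hashlib.md5(f"{flag}:{user}".encode()).hexdigest(), 16) % 100
    return bucket < cfg["rollout_percent"]
```

The other conditions from the slide (IP lists, time intervals, release dates) would be further checks of the same shape.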
  22. Prefer Non-Breaking Expansion in the Database – database migration tools, e.g. FluentMigrator (.NET)
  23. Code-Based Migration with FluentMigrator
     Run migration:
       .\migrate --conn "server=oracle-server; database=SmsDb" -db OracleManaged -a "..\Migrations\Migrations.dll" --task migrate:up
     Rollback change:
       .\migrate --conn "server=oracle-server; database=SmsDb" -db OracleManaged -a "..\Migrations\Migrations.dll" --task migrate:down
  24. Automated Tests AFTER Deploy – run smoke tests automatically to validate deployment success. Examples:
     • make an HTTP request to the homepage and look for a string
     • log in with a test user
     • make a recharge with a test PrePay
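A minimal post-deploy smoke test of the first kind (fetch the homepage, look for a string) might look like the sketch below; the URL and marker string are placeholders.

```python
from urllib.request import urlopen

def body_has_marker(body, marker):
    """The actual check: is the expected string present in the page body?"""
    return marker in body

def smoke_test(url, marker, timeout=5):
    """Fetch a page and confirm the deployment produced a working response.

    Any network or HTTP failure counts as a failed deployment; URLError and
    HTTPError are both subclasses of OSError.
    """
    try:
        body = urlopen(url, timeout=timeout).read().decode("utf-8", "replace")
    except OSError:
        return False
    return body_has_marker(body, marker)
```

Wired into the pipeline, a False result would fail the deployment job and trigger a rollback.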
  25. Metrics
     • server metrics: CPU utilization, disk space
     • application metrics: JMS messages sent/received, memcached hits
     • business metrics: logins/registrations, orders, recharges, option activations
  26. Naming Metrics – establish consistent rules for naming metrics. Segments: env, app name, server name, module name, message type, target, action. Example: prod.apps.maco.platini.jms.vantive.customers.request.sent
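A small helper can enforce such a convention so every application builds names the same way. This is a sketch only; the slide's example path has more dotted segments than listed fields, so the exact mapping is a guess.

```python
def metric_name(*segments):
    """Join naming-convention segments into a Graphite-style dotted path.

    Dots separate path components in Graphite, so no segment may contain one.
    """
    if not segments or any(not s or "." in s for s in segments):
        raise ValueError("each segment must be non-empty and contain no dots")
    return ".".join(segments)
```

Centralizing this in one function also gives a single place to validate or rename segments later.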
  27. Graphite & Grafana – real-time metrics
  28. Instrumenting Code for Metrics Collection – trivial counters with StatsD (https://github.com/etsy/statsd/). Types of counters: counters, timers, gauges, histograms.
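StatsD's wire format is simple enough that a minimal client fits in a few lines: plain UDP text of the form `name:value|type` (c = counter, ms = timer, g = gauge). The host/port defaults below are assumptions.

```python
import socket

def statsd_packet(name, value, metric_type):
    """Format one metric in the StatsD plain-text protocol."""
    return f"{name}:{value}|{metric_type}"

def send_metric(name, value, metric_type="c", host="localhost", port=8125):
    """Fire-and-forget over UDP: instrumentation must never break the app."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(statsd_packet(name, value, metric_type).encode(), (host, port))
    finally:
        sock.close()
```

Because UDP is connectionless, a down metrics server costs the application nothing, which is exactly why StatsD suits "trivial counters" sprinkled through hot code paths.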
  29. Centralized Logging – Apache logs and app logs are shipped (via Lumberjack) to Logstash, which parses each log line into fields; results go to Elasticsearch (log indexing, search & analysis) and to StatsD + Graphite (metrics dashboards).
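The "parse log into fields" step can be illustrated with a small parser for the Apache common log format, roughly what a Logstash grok filter does; the pattern below is a simplified sketch, not the production grok.

```python
import re

# Simplified Apache common log format pattern:
# client ident user [timestamp] "method path protocol" status size
APACHE_COMMON = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_apache_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = APACHE_COMMON.match(line)
    return m.groupdict() if m else None
```

Once lines are fields, Elasticsearch can index them for search and StatsD/Graphite can count them (e.g. increment a counter per status code).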
  30. Infrastructure Automation
  31. Deployment Pipeline – “At an abstract level, a deployment pipeline is an automated manifestation of your process for getting software from version control into the hands of your users.” (Jez Humble & David Farley, Continuous Delivery)
  32. Provisioning managers, configuration managers, workflow managers
  33. Configuration Manager Choice
  34. Software components – Hiera
  35. Coding Best Practices
     • data/code separation: Hiera
     • keep code in version control: r10k, reaktor
     • unit & acceptance testing: puppet-rspec, serverspec
     • development workflow: feature branch workflow
  36. Node Breeds – several items define a node’s breed:
     • puppet role class / Foreman host group (one-to-one relationship)
     • stage (dev/prod/qa/test)
     • Foreman compute profile (compute provider, CPUs, RAM, disk)
  37. Tag-Based Everything – nodes, workflow steps, security
  38. Node Herds – pre-provisioned nodes shorten deployment time; breed matters, hostname does not
  39. Don’t Be Afraid to Get Dirty – Rundeck API, Foreman API, PuppetDB API, BigIP iControl, Atlassian Stash API
  40. Scaling Puppet & Foreman
  41. What’s Next
     • sponsor Rundeck development for SCM integration and sensitive-options encryption
     • evaluate network orchestration
     • containers
     • acceptance testing of puppet code with beaker (rspec + serverspec)
  42. Mulțumim (“Thank you”)

Editor's Notes

  • Value Stream Mapping – identify stages in the release cycle
    Measure elapsed time
    Cycle time
  • Faster time to market in Telecom is key.
    Manage an increasing number of (micro)services and applications
    A spaghetti of links between apps and services with different lifecycles
  • Industry momentum – 2014
    Internal presentation of automation goals to both Operations and Development teams
    Managers included from the start
  • The goal: Fully automatic cycle from checkin to production.
    Gated release => push to release
    End-to-end optimized approach to software delivery
    Release code frequently
    Automate the process of build, deploy, test and release
  • Deploy early into production-like environment during Build phase
    Eliminate wasted time at the handover of a new release from Dev to Ops
    Deploy the software often and incrementally to build confidence into deliverable
    Continuous Delivery requires a cultural shift
    End-to-end optimized approach to software delivery
    Address also the last mile – RELEASE (into production, QA, etc)
    Automate the process of build, deploy, test and release
    Introduce the concept of a Deployment Pipeline – automation of the process of putting software into production, starting from the developers’ code check-in into source control (it describes the model of delivery)
  • Every change creates a build that will pass through a sequence of tests of, and challenges to, its viability as a production release. This sequence of test stages, each evaluating the build from a different perspective, begins with every commit to the version control system, just like the initiation of a continuous integration process. Purpose: build confidence in the release.
  • Continuous Integration is cheap. Not continuously integrating is costly. If you don’t follow a continuous approach, you’ll have longer periods between integrations
  • One approach is to create branches in version control that are merged when work is complete, so that mainline is always releasable.
    This approach is suboptimal, since the application is not being continuously integrated if work happens on branches.

    1. Create an abstraction over the part of the system that you need to change.
    2. Refactor the rest of the system to use the abstraction layer.
    3. Create a new implementation, which is not part of the production code path until complete.
    4. Update your abstraction layer to delegate to your new implementation.
    5. Remove the old implementation.
    6. Remove the abstraction layer if it is no longer appropriate.
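The steps above can be sketched in code; the class names are purely illustrative.

```python
# Branch-by-abstraction sketch: the abstraction ("proxy" component) delegates
# to the old implementation until the new one is complete, then flips over.
class LegacyBilling:
    def charge(self, amount):
        return f"legacy charged {amount}"

class NewBilling:
    def charge(self, amount):
        return f"new charged {amount}"

class Billing:
    """The abstraction the rest of the system calls (steps 1-2)."""
    def __init__(self, use_new=False):  # step 4: flip the delegate when ready
        self._impl = NewBilling() if use_new else LegacyBilling()

    def charge(self, amount):
        return self._impl.charge(amount)
```

Because everything goes through `Billing`, mainline stays releasable the whole time: the new implementation ships dark until the flag flips, which is exactly why this technique pairs with feature flags.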

    Snapshots – make reproducing builds harder

    Run automated tests to increase confidence after build
    Automatically prepare environments (test database, configure app servers) – virtual machines (in the internal “cloud” infrastructure)
    Same deployment workflow & scripts both for Acceptance / UAT / Production – “rehearse” same steps to reduce risk
    Externalize and version each application’s settings depending on environment (test/UAT/performance/live)
    Perform automated smoke tests to validate success
    Automatically notify the interested people
    Collect metrics and populate dashboards to detect live failures
  • The blue-green deployment approach does this by ensuring you have two production environments, as identical as possible. At any time one of them, let's say blue for the example, is live. As you prepare a new release of your software you do your final stage of testing in the green environment. Once the software is working in the green environment, you switch the router so that all incoming requests go to the green environment - the blue one is now idle.

    Blue-green deployment also gives you a rapid way to rollback - if anything goes wrong you switch the router back to your blue environment. There's still the issue of dealing with missed transactions while the green environment was live, but depending on your design you may be able to feed transactions to both environments in such a way as to keep the blue environment as a backup when the green is live.
  • Write to both versions (prefer ADD to ALTER)
    Fill history if necessary (offline process)
    Read from new version
    Stop writing on old version
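These four steps are the classic expand/contract pattern. A sketch using sqlite3 for illustration (production used Oracle with FluentMigrator; the table and column names here are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ana Pop')")

# Expand: ADD a new column instead of ALTERing the existing one, so the old
# application version keeps working against the same schema.
conn.execute("ALTER TABLE customer ADD COLUMN display_name TEXT")

# Backfill history as an offline process; afterwards the new version reads
# only display_name, and writes to the old column can stop.
conn.execute("UPDATE customer SET display_name = name WHERE display_name IS NULL")
```

Only once no code path reads the old column is it safe to contract (drop it) in a later migration.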
  • Version migrations with the code.
  • Dashboards
    Separation of environments
  • Goal: build a deployment pipeline with opensource software components
      
    What is needed? A combination of those: a CM (configuration manager), a PM (provisioning manager), a SR (software/package/module repository) and a WM (workflow manager).
     
    Definitions:
    Node/resource: the basic unit of computing from the application deployment perspective; it may be a virtual machine or a container or even a physical host
    Configuration management: the task of standardizing resource (node) configurations and enforcing the standards in an automated way. Configuration management is essential for compliance, predictability and repeatability.
    Provisioning: the task of preparing a set of compute resources to fulfill the requirements of the services on top of them. This includes setting up network configuration, assigning IP addresses, setting up name services, installing the OS, creating virtual machines and so on.
    Workflow: an orchestrated and repeatable sequence of activities (task or jobs if you wish) across multiple resources (nodes)
    Software repository: a storage location from where the binary packages can be retrieved and installed on nodes
  • When we started we looked at several alternatives for software components: Razor from Puppet Labs & EMC, Cobbler from Fedora, Foreman from Red Hat.
    For CMs there were at least four good alternatives (we dismissed CFEngine since we had tried it a long time ago). The situation was similar for workflow managers.
    It is not easy to evaluate and compare so many applications. We kept in mind that the components should interoperate with each other, that we did not want vendor lock-in (that’s why we chose between open-source variants) and that we needed an extensible framework of tools (so tool flexibility matters).
  • So we started with CM Choice.
    We had the following constraints:
    the UNIX team is more prolific with shell scripts and DSLs than with general purpose programming languages
    the software components should be usable for legacy projects too.
     
    Specific requirements:
    freeze the code and software packages needed to build a system - here the needs for SR
    integrate with the existing environment (vSphere, BigIP F5, AD etc.)

    Choice: puppet is the best CM for ORO since it was developed by sysadmins and has a DSL
  • In the end this was the list of software components used to implement the deployment pipeline.

    Ecosystem around puppet: Foreman takes the place of Puppet Enterprise Console well, at least for several functionalities (dashboard, provisioning, reporting, auditing)

    Rundeck is a good workflow manager based on SSH and interpreted scripts and possessing a comprehensive REST API. This API was selected as the Infrastructure API.
  • Development flow: we chose the feature branch workflow model -> r10k came to help; reaktor is on the way
    Code/data separation -> Hiera was already there; only two backends needed to be developed
  • Dunbar’s number ~ the number of social relationships we are able to manage is limited to ~150 (this includes human-to-human as well as human-to-machine interaction)
    This means that we should concentrate more on managing roles than individual systems. In the DevOps world this is called the cattle/pets choice of treatment of servers. So we chose to treat the servers as cattle and defined roles for them.
  • Hostname does not matter (cattle vs. pets)
    It is easier to re-provision than to repair
    We tag the nodes and BigIP pool member definitions
    Rundeck tags == puppet external facts delivered via Foreman API
    Tag-based Workflow
    Tag-based permissions for Rundeck jobs and nodes

    Node state is defined by a specific combination of tag values.
    Transition from one state to another is done via Rundeck jobs or Foreman API calls.


    Node state machine:
    Designed a deployment workflow based on node state. The state of the node is derived from a combination of node tags, dynamically changed by workflow steps.
    Node tags are pushed to the nodes via Puppet external facts and retrieved in orchestrator (Rundeck) via Foreman-Rundeck integration (foreman as a resource model for rundeck).
    Node starts in build state and exit the scene in to be removed state.
     
    Node tags:
    - state tags: used to determine the state of the node
    - identification tags: used in Rundeck projects to limit user visibility of node resources and to impose ACLs
    Security is built on tags as well
  • The node sees the light of day in the build state, then enters installed and, after some smoke testing, the prepared state. When the node is needed for deployment, it is first selected for the application that’s going to be put on it, then some customizations are applied to it, and then it is deployed.
    If the node is no longer needed due to deployment of a new version of the application on other nodes, the node is put down (decommissioned from everywhere).
    We have defined node herds by preparing several nodes sharing the same breed. The deployment job selects nodes from the herds.
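The state progression described here can be sketched as a tiny transition table; the state names follow these notes, while everything else (function names, error handling) is illustrative.

```python
# Tag-driven node state machine sketch. In the real system the state tag is a
# Puppet external fact pushed via the Foreman API, and transitions are
# performed by Rundeck jobs.
TRANSITIONS = {
    "build": "installed",         # OS installed
    "installed": "prepared",      # smoke tests passed, node joins the herd
    "prepared": "selected",       # picked from the herd for an application
    "selected": "deployed",       # customized, application deployed
    "deployed": "to_be_removed",  # superseded by a newer version elsewhere
}

def advance(state_tag):
    """Return the next state tag a workflow step would set on the node."""
    if state_tag not in TRANSITIONS:
        raise ValueError(f"no transition from {state_tag!r}")
    return TRANSITIONS[state_tag]
```

Encoding the lifecycle as data like this is what makes tag-based workflows and tag-based permissions possible: every tool only ever needs to read and write one tag.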
  • Integration between components (glue code):
    - based on APIs (F5 iControl, Foreman API, Rundeck API, Stash API, PuppetDB API)
    - languages used: Ruby, Python and shell

    An external component was created to deal with host name generation; this software component implements a REST API, answering requests with a JSON document containing the generated host name(s). The only parameter so far is the host name prefix. The sequence is persisted on disk.

    Two Hiera backends were developed: one for extending Hiera hierarchies based on the role or profile class, another one to retrieve SSH keys from Active Directory based on group membership
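The core of the host-name generator described above (prefix in, next name out, sequence persisted on disk) can be sketched as follows; the state-file name and the zero-padding width are assumptions, and the REST layer is omitted.

```python
import json
import os

def next_hostname(prefix, state_file="hostname_sequence.json"):
    """Return the next host name for a prefix, persisting the counter on disk."""
    seq = {}
    if os.path.exists(state_file):
        with open(state_file) as f:
            seq = json.load(f)
    seq[prefix] = seq.get(prefix, 0) + 1
    with open(state_file, "w") as f:
        json.dump(seq, f)                 # persist the sequence across restarts
    return f"{prefix}{seq[prefix]:03d}"   # e.g. web001, web002, ...
```

A REST wrapper would simply call this and return `{"hostname": "..."}` as JSON.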
  • We started from the beginning with a scalable infrastructure. We split the Foreman and Puppet components across 7 nodes. The installation process (automation tools spawn) is completely automated with Kafo, the Puppet-based Foreman installer.
  • We have plans to sponsor the development of several Rundeck features (SCM integration, secure options safe storage), to investigate in the area of network automation, to include containers in the process and write acceptance and unit testing for our infrastructure code.
