Going Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulse


Dev@Pulse 2014 Lightning Talk.

The talk focuses on how to use the IBM Cloud and NetflixOSS for high availability and automatic recovery, elastic and web scale, and high velocity continuous delivery. It also includes a live demo of chaos testing (Chaos Gorilla specifically), in which the application proved available enough to survive the outage of an entire datacenter / availability zone.

Published in: Technology

  • http://www.geekabout.com/2008-02-19-479/40-most-disastrous-cable-messes.html
  • Icons from http://www.iconhot.com/icon/rrze/computer-database.html
  • http://nabeeloo.com/2013/02/throwback-thursday-pagers-and-beepers/
  • https://www.flickr.com/photos/justmartha/3998898107/sizes/n/
  • http://www.digitaltrends.com/social-media/wake-up-a-3-a-m-phone-call-is-your-ticket-into-this-nocturnal-social-network/

    1. Going cloud native for your applications and services
       Jerry Cuomo, Andrew Spyker
    2. Topics
       • Jerry (@JerryCuomo) is going to cover
         – Our Journey to Cloud Services
         – Stop along the way: winning the Netflix Cloud Prize
         – Our goals in 2014 in delivering Cloud Services
       • Andrew (@aspyker) is going to
         – Describe the “Zen, Methodology, Approach” to building world-class services
         – Highlight new capabilities to support this methodology, running on IBM Cloud
         – Prove this by example
    3. Our Journey to Cloud Services
       • From my blog – http://bit.ly/cuomoblog
       • In 2014, we will continue driving our software to the cloud. To complement our packaged software business, we are transforming our development operations to also deliver our wares as self-service cloud-native offerings within the IBM Cloud (SoftLayer, Bluemix, PureApp).
       • You know you have a cloud service if it is addressable via URL, has Ts&Cs, and has an operations team running it 24x7x365.
    4. Acme Air and winning the Netflix Cloud Prize
       • Acme Air – Cloud and Mobile Sample and Benchmark
       • Acme Air + NetflixOSS + IBM SoftLayer
         – IBM SoftLayer port to embrace NetflixOSS platform
         – Winner: Best Example Mash-Up Application category
    5. Cloud Services Goals
       • We will follow the “Zen” of operating cloud services
       • “We will rule the cloud, the cloud will not rule us”
         – Proactive on failure and security testing and auto recovery
       • Move from reactive model to predictive model
         – We are always watching and anticipating
       • Scalable service fabric services, ops excellence team
         – Tools, libraries, services, and practices and COE for cloud
       • Focus on key areas including
         – Elastic and Web Scale
         – High Availability and Automatic Recovery
         – High Velocity Continuous Delivery
    6. Elastic and Web Scale
       Doing This / Not Doing That
       Source: Programmableweb.com 2012
    7. Elastic and Web Scale
       [Diagram labels: Load Balancers; Front end API (browser and mobile); Booking Service; Authentication Service; Temporal caching; Durable Storage]
       Strategy | Benefit
       Make deployments automated | Without automation impossible
       Expose well designed API to users | Offloads presentation complexity to clients
       Remove state for mid tier services | Allows easy elastic scale out
       Push temporal state to client and caching tier | Leverage clients, avoids data tier overload
       Use partitioned data storage | Data design and storage scales with HA
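The "use partitioned data storage" strategy on slide 7 can be illustrated with a minimal Python sketch. It is not taken from the talk: `partition_for` and `PartitionedStore` are hypothetical names, and the in-memory dictionaries stand in for real storage nodes. The only point it makes is that a stable hash of the key decides which partition owns a record, so any stateless mid-tier instance can locate the data without coordination.

```python
import hashlib
from typing import Dict, List, Optional


def partition_for(key: str, partitions: int) -> int:
    """Map a key to a partition index using a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


class PartitionedStore:
    """In-memory stand-in for a partitioned data tier (assumption, not a real client)."""

    def __init__(self, partitions: int = 4) -> None:
        self._shards: List[Dict[str, str]] = [dict() for _ in range(partitions)]

    def _shard(self, key: str) -> Dict[str, str]:
        return self._shards[partition_for(key, len(self._shards))]

    def put(self, key: str, value: str) -> None:
        self._shard(key)[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._shard(key).get(key)

    def shard_sizes(self) -> List[int]:
        return [len(shard) for shard in self._shards]
```

Plain modulo partitioning remaps most keys when the partition count changes, so real partitioned stores commonly rely on consistent hashing or a similar scheme instead.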
    8. HA and Automatic Recovery
       Feeling This / Not Feeling That
    9. Highly Available Service Runtime Recipe
       [Diagram labels: Web App Front End; Hystrix; Execute auth-service call (REST services); Fallback Implementation; Ribbon REST client with Eureka; Call “Auth Service”; Eureka Server(s); App Service (auth-service); Karyon; Micro service Implementation]
       Implementation Detail | Benefits
       Decompose into micro services | Key user path always available; failure does not propagate across service boundaries
       Karyon w/ automatic Eureka registration | New instances are quickly found; failing individual instances disappear
       Ribbon client with Eureka awareness | Load balances & retries across instances with “smarts”; handles temporal instance failure
       Hystrix as dependency circuit breaker | Allows for fast failure; provides graceful cross service degradation/recovery
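The circuit-breaker role that slide 9 assigns to Hystrix can be sketched in plain Python. This is a simplified illustration of the pattern, not Hystrix's API: the `CircuitBreaker` class, its parameters, and the threshold values are assumptions made for the example. After a run of consecutive failures the breaker opens and returns the fallback immediately (fast failure); once the reset timeout has passed it lets one call through again to probe for recovery.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Minimal circuit breaker with a fallback (illustrative, not Hystrix)."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def is_open(self) -> bool:
        """True while calls should be short-circuited to the fallback."""
        if self._opened_at is None:
            return False
        return self._clock() - self._opened_at < self.reset_timeout

    def call(self, primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.is_open:
            return fallback()  # fail fast without touching the dependency
        try:
            result = primary()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            return fallback()
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A front end might wrap its auth-service call as `breaker.call(call_auth_service, use_cached_session)`, where both function names are hypothetical; the fallback keeps the key user path available while the dependency is down.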
    10. IaaS High Availability
        [Diagram labels: Region (Dallas); Datacenters DAL01, DAL05, DAL06; Global Load Balancers; Local LBs; Eureka; Web App; Auth Service; Booking Service; Cluster Auto Recovery and Scaling Services]
        Rule | Why?
        Always > 2 of everything | 1 is SPOF, 2 doesn’t web scale and slow DR recovery
        Including IaaS and cloud services | You’re only as strong as your weakest dependency
        Use auto scaler/recovery monitoring | Clusters guarantee availability and service latency
        Use application level health checks | Instance on the network != healthy
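The last rule on slide 10, "instance on the network != healthy", can be made concrete with a small sketch of an application-level health check. Nothing here comes from the talk's code: `health_status` and the probe names are hypothetical. The idea is that an instance reports healthy only if every dependency probe it runs actually succeeds, which is a stronger signal than the instance merely answering on the network.

```python
from typing import Callable, Dict


def health_status(checks: Dict[str, Callable[[], bool]]) -> Dict[str, object]:
    """Run each named probe; a probe that raises counts as failed."""
    results: Dict[str, bool] = {}
    for name, probe in checks.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False
    # The instance is healthy only when every probe passed.
    return {"healthy": all(results.values()), "checks": results}
```

An auto scaler or recovery monitor could poll an endpoint backed by a function like this and replace instances that report `healthy: False`, even though they are still reachable.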
    11. Let’s prove it
        • What if you lost a random instance?
          – Demonstrated as part of Netflix Cloud Prize: bit.ly/noss-sl-blog
        • What if you lost a whole datacenter?
          – DEMO TIME!
    12. DEMO Overview
        [Diagram labels: Region (Dallas); Datacenters DAL01, DAL05, DAL06; ✗; Global Load Balancers; Local LBs; Eureka; Web App; Auth Service; Booking Service; Cluster Auto Recovery and Scaling Services; Chaos Gorilla]
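The property the Chaos Gorilla demo exercises can be modeled in a few lines of Python. This is a toy model under stated assumptions, not how Chaos Gorilla is implemented: `Instance`, `surviving_instances`, and `survives_outage` are hypothetical names, and the instance placement is invented for illustration. It removes every instance in one datacenter and checks that each service still has at least one instance left elsewhere.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Instance:
    service: str
    datacenter: str


def surviving_instances(instances: List[Instance], failed_datacenter: str) -> Dict[str, int]:
    """Count, per service, the instances outside the failed datacenter."""
    counts = {inst.service: 0 for inst in instances}
    for inst in instances:
        if inst.datacenter != failed_datacenter:
            counts[inst.service] += 1
    return counts


def survives_outage(instances: List[Instance], failed_datacenter: str) -> bool:
    """True if every service keeps at least one instance after the outage."""
    return all(n > 0 for n in surviving_instances(instances, failed_datacenter).values())


# Hypothetical placement: each service runs once in each of the three datacenters.
fleet = [
    Instance(service, dc)
    for service in ("web-app", "auth-service", "booking-service")
    for dc in ("DAL01", "DAL05", "DAL06")
]
```

With this placement, `survives_outage(fleet, "DAL06")` holds for any single datacenter; a service deployed in only one datacenter would fail the check as soon as that datacenter is chosen.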
    13. DEMO Success!
        • Online video (shows recovery as well): http://bit.ly/sl-gorillavid
        [Diagram labels: Region (Dallas); Datacenters DAL01, DAL05, DAL06; ✗; Global Load Balancers; Local LBs; Eureka; Web App; Auth Service; Booking Service; Cluster Auto Recovery and Scaling Services; Chaos Gorilla]
    14. Continuous Delivery
        Not This / Reading This
    15. Continuous Delivery
        [Diagram labels: Continuous Build Server; Baked to SoftLayer Image Templates; Cluster v1; Canary v2; Cluster v2]
        Step | Technology
        Developers test locally | Unit test frameworks
        Continuous build | Continuous build server based on Gradle builds
        Build “bakes” full instance image | Imaginator (Aminator inspired) creates SoftLayer images
        Developers work across dev and test | Archaius allows for environment based context
        Developers do canary tests, red/black deployments in prod | Asgard console provides app cluster common devops approach, security patterns, and visibility
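The canary step on slide 15 (Cluster v1, Canary v2, then Cluster v2) implies a decision: promote the new version or roll it back. The sketch below shows one simple way such a decision could be made by comparing error rates. It is an assumption-laden illustration, not Asgard's behavior: `ClusterStats`, `canary_decision`, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClusterStats:
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_decision(
    baseline: ClusterStats,
    canary: ClusterStats,
    max_ratio: float = 1.5,    # hypothetical tolerance relative to the baseline
    min_requests: int = 100,   # hypothetical minimum sample before deciding
) -> str:
    """Return "wait", "promote", or "rollback" for the canary cluster."""
    if canary.requests < min_requests:
        return "wait"  # not enough traffic to judge the canary yet
    # Allow a small absolute floor so a zero-error baseline is not impossible to match.
    allowed = max(baseline.error_rate * max_ratio, 0.001)
    return "promote" if canary.error_rate <= allowed else "rollback"
```

In a red/black deployment, "promote" would correspond to shifting traffic to the full v2 cluster while v1 stays available for a quick rollback.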
    16. More details?
        • PAS-1418A - Porting the Netflix OSS Cloud Architecture to SoftLayer
          – Today - 5:00 – 6:00, Room 116
        • All code available on GitHub
          – netflix.github.io
          – github.com/EmergingTechnologyInstitute
          – Blog - iSpyker.blogspot.com
          – Twitter - @aspyker