DevOps goes Mobile (daho.am)

We've learned a lot through doing DevOps: Every commit is automatically integrated, tested, and deployed to a staging environment. And then it only takes one push of a button and the release goes live...

Unfortunately, it's not that simple anymore when operating mobile applications: How can you quickly update your mobile software when the app store provider wants to test your software first for a few days? How can you update your configuration when your app can run offline? And how do you track down errors when the data is distributed to millions of mobile clients? Those were just some of the challenges we encountered while operating mobile games with millions of daily users. In this talk we present the solutions we have found to address them.

Usage Rights

© All Rights Reserved

    DevOps goes Mobile (daho.am) Presentation Transcript

    • DevOps goes Mobile Jesper Richter-Reichhelm @jrirei
    • A story about a nice little game … that was very successful.
    • November 2013: But the new version had a serious bug. Apple just needed 4 days of testing - quite ok!
    • November 2013: Thanksgiving Day in the US while we were waiting…
    • “In recognition of your incredible efforts and achievements, I’m happy to announce that we’re extending the Thanksgiving holiday this year.” - Tim Cook Quite cool move by Apple’s CEO…
    • … but it did add to our tension as we were waiting.
    • November 2013: Apple took five days to approve our patch.
    • Crashing: 15%. Not affected: 50%. Old version: 35%. More than 200,000 users affected. A bug like that could kill a game - or a company. We were lucky that only ‘few’ users were affected.
    • 8000 crash reports within the first hour…
    • November 2013 (on web): 5 hours crashing, not 5 days! Reaction times on mobile are much worse than on web.
    • 270 employees, 100 engineers, with only 3 managers
    • We started in 2009 on Facebook with Flash games. Since 2013 we focus on mobile first.
    • Independent teams (FE Dev, BE Dev, Art, Product): one team per game with lots of autonomy
    • “You build it, you run it.” - Werner Vogels. We are not the only ones working like that.
    • Classic way (Dev and Ops apart): Devs want to change, Ops want to avoid risk. This is a classic conflict.
    • DevOps way (Dev and Ops side by side): DevOps is about bringing Dev and Ops together. Silos don’t help in making the product.
    • Wooga way (Dev and Ops in each team): In small teams dev and ops can even be in one person.
    • Some say DevOps is about agile admins, faster releases, virtualization, and automation tools. All true, all good, but it misses the point a bit.
    • I say it’s about feedback loops back from ops to dev.
    • I say it’s about cooperation, collaboration, and feedback loops.
    • Mobile is different especially for apps
    • DevOps is different for Mobile
    • Challenges
    • “There are three challenges: mobile, devices, and not having control.” - Me
    • Mobile network: network conditions change while the device moves.
    • Latency, bandwidth, offline mode
    • Runs on a Device
    • Operating system, form factor, device specs: so many devices to make it work on…
    • But even more important …
    • … the devices are not within our reach.
    • Local storage, logfile access, offline handling: You have a distributed network of millions of computers. Now keep up the data integrity! :-) And it’s hard to see what went wrong…
    • Deploy without control: so many practices like CD exist to keep us in control … and most do not work on mobile.
    • Even harder on iOS than on Android
    • Mandatory tests, strict rules, only full deploys
    • We WANT canary testing!
    • Staged rollout, reduce risk, fast feedback: Google provides staged rollouts. Great! :-)
    • Pull updates; the user can interfere. But we still have no full control over updates.
    • [Chart: version distribution of our game Diamond Dash (iOS) by app version, from 4.4.2 down to 1.x; axis 0% to 80%; labelled values 5% and 3%]
    • [Chart: version distribution by release year, from 2014 back to 2011; axis 0% to 80%; labelled values 5% and 3%] ~10% play on versions that are older than 18 months.
    • Jelly Splash iOS has a similar problem. Even forced updates can be partially avoided.
    • Wooga Solutions
    • Cross Platform
    • HTML5 (not viable), Unity, Apportable, others? We have some solutions but we are not happy yet. So sad that HTML5 is not yet good enough…
    • Devices
    • How to live with the pain: restrict devices, test pool, internal ‘app store’.
    • For Apple you add ‘fake’ requirements. Google provides a way to restrict the app to certain devices.
    • Still we need manual tests on physical devices…
    • … and lots of alcohol afterwards. Just kidding. :-)
    • Jenkins CI helps (of course)
    • Internal app store helps, too. Makes it easier to distribute and test internally.
    • Lessons learned from the Thanksgiving bug: no manual builds, keep the dSYM file, keep a jailbroken iPhone, copy live to test.
    • Simple Backend Services: Following the EC2 example … Wooga now provides internal services to mobile games.
    • SBS way (Game teams and SBS side by side): Game devs write 90% of the SDKs. SBS provides a simple REST API to its services.
    • SDK
    • Async calls, offline handling, local storage, conflict assist, encryption: most network problems are handled by the SDK.
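The offline handling and local storage named on this slide can be pictured as a persistent request queue. The following is only a minimal sketch under stated assumptions, not Wooga's SDK: the class `OfflineQueue`, the JSON file layout, and the `send` callback are all hypothetical names chosen for illustration.

```python
import json
import os
import tempfile
from typing import Callable, List


class OfflineQueue:
    """Hypothetical sketch: persist outgoing calls locally, replay them later."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.pending: List[dict] = []
        # Reload requests that were queued before the app was closed.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as fh:
                self.pending = json.load(fh)

    def _save(self) -> None:
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.pending, fh)

    def enqueue(self, request: dict) -> None:
        """Store the request on disk first, so it survives a restart."""
        self.pending.append(request)
        self._save()

    def flush(self, send: Callable[[dict], bool]) -> int:
        """Send queued requests in order; stop at the first failure."""
        sent = 0
        while self.pending:
            if not send(self.pending[0]):
                break
            self.pending.pop(0)
            sent += 1
        self._save()
        return sent


# Usage: the device is offline first, the app restarts, then the network returns.
queue_path = os.path.join(tempfile.mkdtemp(), "queue.json")
queue = OfflineQueue(queue_path)
queue.enqueue({"op": "set", "key": "level", "value": 3})
queue.enqueue({"op": "set", "key": "coins", "value": 10})
sent_offline = queue.flush(lambda request: False)   # network down
restored = OfflineQueue(queue_path)                 # simulated app restart
sent_online = restored.flush(lambda request: True)  # network back
```

Writing to disk before attempting any network call is the design choice that matters here: a request is only dropped from the queue after the `send` callback reports success.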
    • Key Value Store
    • REST API, authorisation, ETags: ETags are the key to conflict resolution.
    • Clients are responsible for solving conflicts. Updates without a valid ETag are invalid. On every update a new ETag is created.
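The ETag rules on these two slides amount to optimistic concurrency control, similar in spirit to HTTP conditional requests with `If-Match`. Here is a minimal in-memory sketch, assuming a hypothetical `KeyValueStore` class and `ConflictError` exception; the real service is a REST API whose details are not given in the talk.

```python
import uuid
from typing import Dict, Optional, Tuple


class ConflictError(Exception):
    """Raised when an update carries a missing or outdated ETag."""


class KeyValueStore:
    """Hypothetical in-memory stand-in for the key value service."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, str]] = {}  # key -> (value, etag)

    def get(self, key: str) -> Tuple[str, str]:
        """Return the stored value together with its current ETag."""
        return self._data[key]

    def put(self, key: str, value: str, etag: Optional[str] = None) -> str:
        """Write a value; an existing key requires the current ETag."""
        current = self._data.get(key)
        if current is not None and current[1] != etag:
            raise ConflictError(f"stale or missing ETag for {key!r}")
        new_etag = uuid.uuid4().hex  # every update creates a new ETag
        self._data[key] = (value, new_etag)
        return new_etag


# Usage: two devices start from the same state; the second write is stale.
store = KeyValueStore()
etag_v1 = store.put("player:42", "level=3")
etag_v2 = store.put("player:42", "level=4", etag=etag_v1)      # device A wins
try:
    store.put("player:42", "level=3;coins=10", etag=etag_v1)  # device B is stale
    conflict = False
except ConflictError:
    conflict = True
    # The client resolves the conflict: re-read, merge, retry with the fresh ETag.
    value, fresh_etag = store.get("player:42")
    store.put("player:42", value + ";coins=10", etag=fresh_etag)
```

The store never merges anything itself. It only refuses stale writes, which leaves the merge policy with the client, as the slide states.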
    • Config Service
    • Variable config, offline mode, environments, AB testing: online configuration allows AB testing. And that enables feature switching!
    • Simple configuration with basic version control … and easy support of environments.
    • AB tests can be tested and run individually.
    • Simple definition of how big the groups are. (Notes on slide: it’s hell - live with it; form factor, OS version; restrict devices (camera, min. OS); cross platform dev.; many local devices; Apple Enterprise acc.; complex tool chain.)
    • Groups define variations of the basic config.
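Group sizes plus per-group variations of a basic config can be sketched as follows. This is an illustration under assumptions, not the actual config service: the functions `assign_group` and `config_for`, the hashing scheme, and the example keys `lives` and `new_shop` are all hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def assign_group(user_id: str, test_name: str, groups: List[Tuple[str, int]]) -> str:
    """Map a user to a group deterministically; sizes are percentages summing to 100."""
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    upper = 0
    for name, percent in groups:
        upper += percent
        if bucket < upper:
            return name
    raise ValueError("group sizes must sum to 100")


def config_for(user_id: str, base: Dict[str, object], test_name: str,
               groups: List[Tuple[str, int]],
               overrides: Dict[str, Dict[str, object]]) -> Dict[str, object]:
    """Start from the basic config and apply the variation of the user's group."""
    group = assign_group(user_id, test_name, groups)
    merged = dict(base)
    merged.update(overrides.get(group, {}))
    return merged


# Usage: 10% of users see a feature that is switched off in the basic config.
base_config = {"lives": 5, "new_shop": False}
groups = [("control", 90), ("variant", 10)]
overrides = {"variant": {"new_shop": True}}
user_config = config_for("user-123", base_config, "new_shop", groups, overrides)
```

Because the group is derived from a hash of the user id, the same user keeps the same variation across sessions without any server-side state, and a feature switch is simply a test with one group at 100%.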
    • The BI team provides tools for analysis.
    • But analysts within game teams are responsible.
    • With all this, feature switching becomes easy.
    • Error Analytics
    • Real time, crash reports, segmentation, normalisation: staged rollouts are only useful … if you have the right tooling.
    • [Chart: users and errors for Version 1 and Version 2; axis 0 to 100.000] Which version is better?
    • [Chart: users and errors for Version 1 and Version 2; axis 0 to 5.000] Version 1 has more errors … but it also has more users.
    • [Chart: users, errors, and affected users for Version 1 and Version 2; axis 0% to 10%] Normalisation makes that easy. The tradeoff is having a lot more traffic.
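The normalisation step can be made concrete with a small calculation: instead of comparing raw error counts, compare the share of active users who hit at least one error in each version. The sketch below uses invented numbers and a hypothetical function `affected_user_rate`; it is not the talk's tooling.

```python
from typing import Dict, Iterable, Set, Tuple


def affected_user_rate(active_users: Dict[str, Set[str]],
                       error_events: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return, per version, the fraction of active users with at least one error."""
    affected: Dict[str, Set[str]] = {version: set() for version in active_users}
    for version, user_id in error_events:
        if version in affected and user_id in active_users[version]:
            affected[version].add(user_id)
    return {
        version: (len(affected[version]) / len(users) if users else 0.0)
        for version, users in active_users.items()
    }


# Invented example data: version 1 has ten times the users of version 2.
active = {
    "1": {f"a{i}" for i in range(1000)},
    "2": {f"b{i}" for i in range(100)},
}
events = []
for i in range(50):                 # 50 users on version 1, two errors each
    events += [("1", f"a{i}"), ("1", f"a{i}")]
for i in range(10):                 # 10 users on version 2, one error each
    events.append(("2", f"b{i}"))

raw_counts = {"1": 100, "2": 10}    # raw errors make version 1 look worse
rates = affected_user_rate(active, events)
```

In this invented data set version 1 produces ten times as many errors, yet only 5% of its users are affected against 10% on version 2. The cost noted on the slide follows from the input: the client has to report activity for every user, not just crashes, which means more traffic.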
    • Wrap up
    • Mobile is different especially for apps
    • DevOps is different for Mobile
    • Continuous Delivery helps
    • November 2013: 5 hours crashing, not 5 days! This can kill you.
    • “MTTR > MTBF” - John Allspaw. The Mean Time To React is more important … than the Mean Time Between Failures.
    • “MTTR is more important than MTBF (for most types of F)!” - John Allspaw If you are dead, you cannot recover (of course).
    • Dev + Ops is needed: you need both, Dev and Ops, to succeed.
    • Be able to react fast, even on mobile: the one thing that matters most in operations.
    • Questions? @jrirei http://wooga.com/jobs