SCALING MOBILE TESTING ON AWS: EMULATORS ALL THE WAY DOWN
Kim Moir, Mozilla, @kmoir
URES, November 13, 2015
Good morning. My name is Kim Moir and I’m a release engineer at Mozilla. Today I’m going to discuss how we scale our Android testing on AWS. Show of hands: how many of you test on Android? On a continuous integration farm?

References

Androids by etnyk, Attribution-NonCommercial-NoDerivs 2.0 Generic license

https://www.flickr.com/photos/etnyk/5588953445/sizes/l
A little about me. I live in Ottawa, Ontario, Canada. My hobbies include running and making ice cream, which complement each other well. This picture shows a release engineering ice cream flavour: coffee ice cream with chocolate chip cookies soaked in Kahlúa. Before I was a release engineer at Mozilla, I worked at IBM as a release engineer on Eclipse. So that’s 12 years working on open source release engineering. I’m really excited to be here today to share my stories and learn from all of you.
Here’s a picture of where the amazing Mozilla release engineering team works. As you can see, we are quite distributed across the world, and many of us work remotely from our homes.
Mozilla is a non-profit. Our mission is to promote openness, innovation & opportunity on the web.

You’re probably familiar with the products we build, such as Firefox for Desktop, Android, iOS and Firefox OS. Firefox for iOS was actually released yesterday, so go and try it out!

Note that we ship Firefox on four platforms and in ~97 locales on the same day as US English.
We have a continuous integration farm running 24x7 on commit. Our release cadence is every six weeks for Firefox for Android. We release betas every week.

https://wiki.mozilla.org/RapidRelease

I’ll talk a little bit about our environment in general before I delve into our Android test environment.
DAILY
• 350 pushes
• 4700 build jobs
• 150,000 test jobs
Here are some recent numbers on the aggregate jobs we run (all products, not just Firefox for Android). Today, about 66% of build jobs and 80% of test jobs are run on AWS. Only our performance tests still run on raw devices. They can’t run on emulators because performance there is not constant.

Each time a developer lands a change, it invokes a series of builds and associated tests on relevant platforms. Within each test job there are many actual test suites that run.

September:

8188 pushes

https://secure.pub.build.mozilla.org/buildapi/reports/pushes?starttime=1441090800&endtime=1443682800

September jobs

https://secure.pub.build.mozilla.org/buildapi/reports/waittimes?starttime=1441090800&endtime=1443682800

Builds Oct 4-Oct10

https://secure.pub.build.mozilla.org/buildapi/reports/waittimes?starttime=1443942000&endtime=1444460400

builds 15560

Builds Tuesday Oct 6

https://secure.pub.build.mozilla.org/buildapi/reports/waittimes?starttime=1444104000&endtime=1444190400

2814
15 MINUTE SERVICE
We have a commitment to developers that build and test jobs should start within 15 minutes of being requested. We don’t have a perfect record on this, but our numbers are good. We have metrics that measure this every day so we can see which platforms need additional capacity. We adjust capacity as needed and remove old platforms as they become less relevant in the marketplace.
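
A daily wait-time metric of the kind described above could be computed roughly as follows. This is a minimal sketch, not Mozilla's reporting code: the function name `share_started_on_time` and the `(requested, started)` pair layout are assumptions made for the example; only the 15-minute target comes from the talk.

```python
from datetime import datetime, timedelta

# The 15-minute target comes from the talk; everything else is illustrative.
WAIT_TARGET = timedelta(minutes=15)


def share_started_on_time(jobs, target=WAIT_TARGET):
    """Return the fraction of jobs that started within `target` of the request.

    `jobs` is an iterable of (requested, started) datetime pairs.
    """
    jobs = list(jobs)
    if not jobs:
        return 1.0  # no jobs means nothing missed the target
    on_time = sum(1 for requested, started in jobs if started - requested <= target)
    return on_time / len(jobs)


if __name__ == "__main__":
    t0 = datetime(2015, 10, 6, 9, 0)
    # Hypothetical sample: three jobs inside the target, one outside it.
    sample = [
        (t0, t0 + timedelta(minutes=2)),
        (t0, t0 + timedelta(minutes=14)),
        (t0, t0 + timedelta(minutes=15)),
        (t0, t0 + timedelta(minutes=40)),
    ]
    print(share_started_on_time(sample))
```

Grouping the pairs by platform before calling the function would give the per-platform view used to decide where capacity is short.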

---

Pizza picture by djwtwo

Attribution-NonCommercial-ShareAlike 2.0 Generic (CC BY-NC-SA 2.0)

https://www.flickr.com/photos/djwtwo/9864611814/sizes/l/
+ many Mozilla tools
Here are some of the projects that we use in our infrastructure.

Buildbot is our continuous integration engine. However, we are in the process of migrating to TaskCluster. TaskCluster is a set of components that manages task queuing, scheduling, execution and provisioning of resources. It was designed to run automated builds and tests at Mozilla.

We use Puppet for configuration management of all our Buildbot servers and of the Linux and Mac machines. So when we provision new hardware, we just boot the device and it puppetizes based on its role, which is defined by its hostname.

Our repository of record is hg.mozilla.org, but developers also commit to git repos and these commits are transferred to the hg repository. We also use a lot of Mozilla tools that allow us to scale. These tools are open source as well, and I have links to these repos at the end of the talk.

---

References

octokitty http://www.flickr.com/photos/tachikoma/2760470578/sizes/l/
DEVICES
• 6700+ in total
• 1900+ for builds
• 4700+ for tests
• 75% AWS
These numbers are for both Android and desktop devices. The pools overlap.

80% test AWS and 66% build AWS

---

References

https://secure.pub.build.mozilla.org/builddata/reports/slave_health/index.html

* https://secure.pub.build.mozilla.org/slavealloc/ui/#silos
HISTORY OF MOBILE TESTING AT MOZILLA
Before I talk about where we are today, I’d like to step back and talk about how our mobile testing evolved over the years.

Here’s a picture from 2009 of a mobile pedalboard. This was our first attempt at mobile test automation. It was used to report Fennec performance data on the Nokia N810.

Picture by Aki Sasaki

https://www.flickr.com/photos/drkscrtlv/3590117065/sizes/l
Picture by Aki Sasaki

https://www.flickr.com/photos/drkscrtlv/3590924524/sizes/l

http://escapewindow.dreamwidth.org/205930.html
In 2010, we moved on to testing Android 2.2 on Tegras. Tegras are bare reference boards.

We stored the Tegras in shoe racks from Bed Bath and Beyond.

These shoe racks were stored in a room that was shielded from wireless interference. The shoe racks allowed us to position the phones so they weren’t too close together, on a material that didn’t get too hot and did not conduct electricity. These racks also allowed us to easily take dead phones out, open them, remove batteries, reimage and replace them.

Picture from John O’Duinn’s blog

http://oduinn.com/blog/2010/02/11/unveiling-mozillas-faraday-cage/

http://oduinn.com/images/2013/blog_2013_RelEngAsForceMultiplier.pdf
In 2012, we started running continuous integration tests on Android reference cards in specially designed racks. We started with 800 of them, but only use about 200
today. The cards are called pandas. These were used to run Android 4.0 tests for correctness, debug and performance.

___

References

Pictures of Panda chassis from Dustin’s blog

https://blog.mozilla.org/it/2013/01/04/mozpool/2012-11-09-08-30-03/
They had a custom relay board to allow us to reboot them remotely.

Pictures of Panda chassis from Dustin’s blog

https://blog.mozilla.org/it/2013/01/04/mozpool/2012-11-09-08-30-03/
Many racks of pandas

These devices are not as stable as desktop devices and are prone to failure. Given their numbers, dealing with constantly failing machines would be very expensive if they were managed by humans. We wrote some software called mozpool to automatically reimage and reboot them.

Pictures of Panda chassis from Dustin’s blog

https://blog.mozilla.org/it/2013/01/04/mozpool/2012-11-09-08-30-03/
WHAT DID WE LEARN?
What did we learn over these iterations of our mobile testing infrastructure?

Each successive mobile testing solution became more reliable (fewer infra failures) and easier to manage via automated tools.

Manufacturers EOL reference cards, and old reference cards don’t support new Android versions.

Physical hardware does not scale for peak load.

It is time consuming and expensive to adjust automation infrastructure for every new hardware iteration.

Picture

https://www.flickr.com/photos/wocintechchat/21909333504/sizes/l from

http://www.wocintechchat.com/blog/wocintechphotos #WOCtechchat

Picture: computer history museum

https://www.flickr.com/photos/indigoprime/2239342335/sizes/o/
We have bursty traffic, which varies by time of day, time of year and so on.

Example of the number of jobs running per hour in a typical week

Bursty traffic: you can see that the number of jobs run each day varies as time zones wake up, and the large trough is the weekend.
BRANCHING
We have many different branches in Hg at Mozilla. Our Hg branches are all named after different tree species.

Developers push to different branches depending on their purpose. Different branches have different scheduling priorities within our continuous integration engine. So for instance, if a change is landed on the mozilla-beta branch, the builds and tests associated with that change will have machines allocated to them at a higher priority than if the change was landed on the cedar branch, which is just for testing purposes.

Picture by Aurelio Asiain

Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0)

https://flic.kr/p/v27AD
Source: http://opensignal.com/reports/2015/08/android-fragmentation/
What do we need to test? Here’s a picture of Android device fragmentation as of August 2015.

Source: http://opensignal.com/reports/2015/08/android-fragmentation/
And here is current Android adoption (October 2015).

Android “KitKat” 4.4 has about a 40% adoption rate.

Android “Jelly Bean” versions (4.1–4.3.1) have a combined share of 30.2%.

Sources

https://en.wikipedia.org/wiki/Android_version_history
ANDROID TEST PLATFORMS
• Android 2.3, 4.0, 4.2 (x86), 4.3
• Test types
• correctness
• debug
• performance
Obviously, we cannot test on all those platforms and devices; it’s not feasible. We limit our testing to the following platforms.
In 2012, we started moving our build and test infrastructure to Amazon. We first implemented this for desktop Firefox jobs on Linux. We then implemented it for Android.

Scalable infrastructure for bursty traffic with an API to manage it all.

Scalable

Deals with bursty load

APIs!

Picture by Tim Norris

Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0)

https://www.flickr.com/photos/tim_norris/2600844073/sizes/o/
AWS TERMINOLOGY
• EC2 - Elastic Compute Cloud - machines as VMs
• EBS - Elastic Block Store - network attached storage
• Region - separate geographical area
• Availability zone - multiple, isolated locations within a region
I’m going to talk a bit about some AWS terms for those of you who may not be familiar with them.

Notes: 

AWS instance types http://aws.amazon.com/ec2/instance-types/
MORE AWS TERMS
• AMI - Amazon machine image
• instance type - VM with defined specifications and cost per hour. For example:
- AMIs: Amazon has standard ones that you can modify, or you can create your own

- Pricing on instance types can depend on the region

- m3.medium currently costs around $0.07/hr in most regions (Nov 2015 costs)

- Some instance types may not be available in all availability zones
PUPPET VS AMIS
AMIs are Amazon machine images

Golden AMIs

We create golden image AMIs via cron each night. These images are generated from our Puppet configs. We have different images defined for each instance type and the role it performs. For example, test and build instances have different libraries and configuration in Puppet.

Originally we used Puppet to manage all of our build and test instances. It was too slow to puppetize the spot instances.

Solution: create golden AMIs from configs each night via cron. These are used to instantiate the new spot instances.

We also use the same pool AMI to run Android tests and Linux tests; they just run in different directories. Another reason for nightly regeneration is pre-populating VCS caches to reduce first-time startup load.

Picture by shaireproductions

Creative Commons Attribution 2.0 Generic (CC BY 2.0)

https://flic.kr/p/dTfsCs
USE SPOT INSTANCES
• Use spot instances vs on-demand instances
• much cheaper
• not instantiated as quickly
• terminated if outbid while running
Amazon has many different types of instances. Initially, we used on-demand instances. They instantiate quickly but cost more per hour than other options.

Spot instances are Amazon’s way of auctioning off excess capacity. You bid for an instance and, if nobody else bids at a price above your offer, the spot instance will be instantiated for you. However, if you’re running a spot instance and someone bids a higher price than you did, your instance can be killed. But that’s okay, because we have configured our build farm to retry jobs that fail, and only a very small percentage (< 1%) are killed this way.

Since spot instances aren’t available as quickly as on-demand instances, some tests don’t start within 15 minutes, but that’s okay. Spot instances are instantiated every time with the AMI you specify.
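
The retry behaviour that makes spot terminations tolerable can be sketched as below. This is an illustration under stated assumptions, not the build farm's actual logic (which lives in the CI system): `SpotTerminated`, `run_with_retries` and the attempt limit are hypothetical names and values chosen for the example.

```python
class SpotTerminated(Exception):
    """Raised by a job when its (hypothetical) spot instance is reclaimed."""


def run_with_retries(job, max_attempts=3):
    """Call `job()` until it succeeds or `max_attempts` is reached.

    Returns (result, attempts_used). Only SpotTerminated triggers a retry;
    any other exception propagates immediately.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return job(), attempt
        except SpotTerminated:
            if attempt == max_attempts:
                raise  # out of attempts, surface the failure


if __name__ == "__main__":
    # Simulate a job that is killed once and then completes.
    outcomes = iter([SpotTerminated(), "green"])

    def flaky_job():
        outcome = next(outcomes)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome

    print(run_with_retries(flaky_job))
```

Because fewer than 1% of jobs are killed this way, the cost of an occasional rerun is small compared with the saving over on-demand pricing.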

Other notes

Smart bidding spot bidding library https://bugzilla.mozilla.org/show_bug.cgi?id=972562
Minimum viable instance type

Run more tests in parallel on cheaper instance types rather than upgrading the instance type.

Most tests run on m3.medium, but some need more.

Limit the subset of tests run on more expensive instance types to those that actually need it.

Our tests have a timeout for a suite of tests. If they don’t complete within this timeout, they fail and retry.

At the scale of our operations, it’s much cheaper to run more tests in parallel on a cheaper instance type than to run them on a more expensive instance type. For example, our Android 4.3 reftests invoke 48 parallel jobs.

For instance, we have Android tests that run on emulators on AWS. Some of the reference tests required a c3.xlarge to run.

The correctness tests were fine to run on m3.medium.
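
Splitting a suite into many parallel jobs, as with the 48-way reftest split mentioned above, can be sketched as a simple round-robin chunking. This is a minimal sketch; the function name and the round-robin strategy are assumptions for illustration, and the real harness may balance chunks differently (for example by expected runtime).

```python
def split_into_chunks(tests, total_chunks):
    """Distribute `tests` across `total_chunks` lists, round-robin.

    Chunk sizes differ by at most one, and every test lands in exactly one chunk.
    """
    if total_chunks < 1:
        raise ValueError("total_chunks must be at least 1")
    chunks = [[] for _ in range(total_chunks)]
    for index, test in enumerate(tests):
        chunks[index % total_chunks].append(test)
    return chunks


if __name__ == "__main__":
    # Hypothetical test names; 48 matches the number of parallel jobs in the talk.
    tests = ["reftest-%04d" % n for n in range(1000)]
    chunks = split_into_chunks(tests, 48)
    print(len(chunks), min(map(len, chunks)), max(map(len, chunks)))
```

Each chunk can then be scheduled as its own job on the cheapest instance type that can run it within the suite timeout.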

Picture by kenny magic

Creative Commons Attribution 2.0 Generic (CC BY 2.0)

https://www.flickr.com/photos/kwl/4247555680/sizes/l
WHERE’S THE CODE?
• The tools we use are all open source
• https://github.com/mozilla/build-cloud-tools
• Which use boto libraries (Python interface to AWS) https://github.com/boto/boto
The code we use to interact with AWS APIs resides here.
SMARTER BIDDING ALGORITHMS
• Important scripts
• aws_stop_idle.py
• aws_watch_pending.py
- aws_stop_idle stops instances that are no longer needed given our current capacity (idle for a certain time period; the threshold depends on whether the instance is on-demand or spot)

- aws_watch_pending activates instances given the criteria on the next slide
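
The idle-threshold rule described for aws_stop_idle can be illustrated as follows. This is not the code from build-cloud-tools: the threshold values, the tuple layout and the function names are hypothetical, and the talk only states that the threshold differs between on-demand and spot instances.

```python
# Hypothetical thresholds in seconds; the talk only says they differ by kind.
IDLE_THRESHOLDS = {"spot": 10 * 60, "on-demand": 60 * 60}


def should_stop(kind, idle_seconds, thresholds=IDLE_THRESHOLDS):
    """Return True when an instance of `kind` has been idle long enough to stop."""
    return idle_seconds >= thresholds[kind]


def instances_to_stop(instances, thresholds=IDLE_THRESHOLDS):
    """Pick the instance ids to stop.

    `instances` is an iterable of (instance_id, kind, idle_seconds) tuples.
    """
    return [
        instance_id
        for instance_id, kind, idle_seconds in instances
        if should_stop(kind, idle_seconds, thresholds)
    ]


if __name__ == "__main__":
    fleet = [
        ("i-spot-a", "spot", 15 * 60),
        ("i-spot-b", "spot", 2 * 60),
        ("i-ondemand-a", "on-demand", 15 * 60),
    ]
    print(instances_to_stop(fleet))
```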
REGIONS AND INSTANCES
• Run instances in multiple regions
• Start instances in cheaper regions first
• Automatically shut down inactive instances
• Start instances that have been recently running
• Bid on similar instance types
If you look at aws_watch_pending.py, these are some of the rules that it implements.

We also use machines in multiple AWS regions, in case one region goes down, and also for cost savings (some regions are cheaper). Currently we only use us-east-1 and us-west-2. Since all of our CI infrastructure resides in California, we don’t use most other regions. Some companies, by contrast, need to have instances available instantly. For instance, I recently saw a talk by Bridget Kromhout (http://bridgetkromhout.com/speaking/2014/beyondthecode/), an operations engineer from DramaFever. This company provides international movie content on demand. They use every single AWS region because their customer base is so distributed.

You get better build times and lower costs if you start instances that have recently been running (they still retain artifact dirs, and there are billing advantages).
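
Two of the rules above (cheaper regions first, recently running instances first) can be expressed as a single sort key. The sketch below is an assumption-laden illustration rather than the logic in aws_watch_pending.py: the `Candidate` record, the price table and the use of Unix timestamps for `last_stopped` are all invented for the example.

```python
from collections import namedtuple

# Hypothetical record for a stopped instance that could be restarted.
# `last_stopped` is a Unix timestamp (larger means more recently running).
Candidate = namedtuple("Candidate", "instance_id region last_stopped")


def launch_order(candidates, region_price):
    """Order candidates: cheapest region first, then most recently stopped first."""
    return sorted(
        candidates,
        key=lambda c: (region_price[c.region], -c.last_stopped),
    )


if __name__ == "__main__":
    # Placeholder prices, not real AWS figures.
    prices = {"us-east-1": 0.010, "us-west-2": 0.012}
    pool = [
        Candidate("i-1", "us-west-2", 1_447_400_000),
        Candidate("i-2", "us-east-1", 1_447_300_000),
        Candidate("i-3", "us-east-1", 1_447_390_000),
    ]
    for candidate in launch_order(pool, prices):
        print(candidate.instance_id, candidate.region)
```

Preferring recently stopped instances means their local artifact directories are more likely to still be warm, which is where the build-time benefit comes from.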
LIMIT POOL SIZE
Limit pool size

The size of the AWS pools allocated to different instance types is limited, so if the number of requests spikes we have higher pending counts but not a huge spike in our AWS bill.

The bidding algorithm does not automatically bring up machines for all pending jobs. It adds some capacity, waits, re-evaluates the pending count, and adds some more if needed.

This is similar to a thermostat heating your house: it adds heat gradually.
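
The thermostat-style behaviour (add a little capacity, wait, re-evaluate) can be sketched with one small function. This is a simplified illustration; the parameter names and the fixed `step` size are assumptions, not values from Mozilla's tooling.

```python
def instances_to_start(pending_jobs, running, pool_limit, step):
    """Return how many instances to start in this evaluation round.

    Never starts more than `step` at once, never more than there are
    pending jobs, and never exceeds the pool limit.
    """
    headroom = max(pool_limit - running, 0)
    return max(min(pending_jobs, step, headroom), 0)


if __name__ == "__main__":
    # Hypothetical spike: 100 pending jobs, pool capped at 50 instances.
    pending, running = 100, 10
    for round_number in range(1, 8):
        started = instances_to_start(pending, running, pool_limit=50, step=8)
        running += started
        pending -= started  # assume each new instance picks up one job
        print(round_number, started, running, pending)
```

In the simulated spike the pool grows by at most eight instances per round and stops at the cap, so the pending count stays elevated for a while but the bill does not jump.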

Picture - Ottawa Arboretum - Creative Commons

Attribution-NonCommercial 2.0 Generic (CC BY-NC 2.0)

https://www.flickr.com/photos/rohit_saxena/4552766281/sizes/l
LIMIT EBS USE
• EBS is network attached storage for the EC2 VM
• Much cheaper to use the disk that comes with the instance type
SUMMARY: AWS
• Golden master of AMIs regenerated daily
• Use spot instances
• Smarter bidding algorithms
• Optimize use of regions, instance type and capacity
• Limit pool size and increase capacity gradually
• Use instance storage vs EBS to save $
With these changes, we reduced our initial AWS bill by 70% (as of last year). However, today we use AWS S3 for backend storage (we migrated all of our FTP data to S3), which has significantly increased our bill compared with our initial implementation.
EMULATOR ENVIRONMENT (1)
• Android 4.3 (AOSP 4.3.1_r1, JLS36I); standard 2.6.29 kernel
• 1 GB of memory
• 720×1280, 320 dpi screen
• 128 MB VM heap
• 600 MB /data and 600 MB /sdcard partitions
• front and back emulated cameras; all emulated sensors
• standard crashreporter, logcat, anr, and tombstone support
So now that we’ve talked about our AWS environment, let’s talk about our move to emulators.

From https://gbrownmozilla.wordpress.com/2015/04/23/android-4-3-opt-tests-running-on-trunk-trees/
EMULATOR ENVIRONMENT (2)
• Run the emulator that comes with the Android SDK, load the custom image, and install the Firefox apk
• We run tests on a variety of instance types (m3.medium, m3.xlarge, c3.xlarge)
http://developer.android.com/tools/devices/emulator.html
This is a screenshot of the emulator starting up. We have tooling in our test suites that creates a screenshot when the emulator starts or when a test fails. The screenshots, logs and other testing artifacts are uploaded to Amazon S3 storage and are available to developers when their tests fail.
This screenshot is of an Android test suite failure.

Most of the time the logs that are uploaded with the screenshot are more useful.

Example log

http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/61c91375333e3265c832cff6f1ff314fb9b70c6a2d15386f0a303c7226cfd1ed7209680d88ac032332907a43cfcf4f03c5f02e5531101ae3b855c699ce1e4e02
ACCESS TO DEVICES
• Access to processes via adb (Android debug bridge)
• Allows us to kill errant processes
• Some test types require root permissions to copy files to certain locations or for other privileged operations
http://developer.android.com/tools/help/adb.html
MIGRATION PROCESS
• Moved correctness tests, then debug
• Many intermittent issues
• Debug tests were problematic
• Take longer and consume more resources
MIGRATION LESSONS
• Use more powerful instance types
• Specify timeouts that are longer for individual tests
• Skip tests on certain (slow) platforms
• Split the tests into smaller tests
• Optimize or simplify the test
https://gbrownmozilla.wordpress.com/2015/05/26/handling-intermittent-test-timeouts-in-long-running-tests/
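
One of the lessons above, giving individual slow tests a longer timeout, can be sketched like this. It is a generic illustration using Python's standard `subprocess` module, not the Mozilla test harness: the default timeout, the override table and the test name `slow-reftest` are hypothetical.

```python
import subprocess
import sys

# Hypothetical values in seconds; real harness timeouts are configured elsewhere.
DEFAULT_TIMEOUT = 300
TIMEOUT_OVERRIDES = {"slow-reftest": 900}


def timeout_for(test_name, default=DEFAULT_TIMEOUT, overrides=TIMEOUT_OVERRIDES):
    """Return the timeout for a test, using a longer override when one exists."""
    return overrides.get(test_name, default)


def run_with_timeout(command, timeout):
    """Return True if `command` exits with status 0 within `timeout` seconds."""
    try:
        completed = subprocess.run(command, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False  # the child is killed by subprocess.run before this is raised
    return completed.returncode == 0


if __name__ == "__main__":
    # Stand-in for a test process: a Python one-liner that finishes immediately.
    command = [sys.executable, "-c", "pass"]
    print(run_with_timeout(command, timeout_for("slow-reftest")))
```

A job that returns False here would be a candidate for the retry, split or skip options in the list above.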
PERFORMANCE TESTS
• Autophone is a Mozilla project measuring page load performance and testing video playback on real Android devices
• Provision, verify, recover, run tests and identify status of a variety of phones
Retain a small pool of real devices for performance tests

From https://wiki.mozilla.org/Auto-tools/Projects/Autophone

Verify that a phone is working correctly: sd card is writable and not full, etc.

Attempt to recover a phone that reports errors, rerunning the current test/test framework.

Provide at least a high-level status for all phones: whether they are idle, running a test, or disabled/broken.

Support a large number of phones, potentially split amongst several host machines.
EMULATORS IN AWS: THE GOOD
Emulators: the good

When we want to test a new Android version, we just need a new emulator image, not a new hardware stack. There is no lead time associated with procuring and installing new hardware in the data centre.

Increased reliability due to fewer retries (2% vs 18% on Pandas)

Some of that reliability stems from the fact that with emulators, the tests run from the same, fresh Android image each time. When the tests ran on devices, the reimaging process took a long time and the devices had to be re-imaged every so often, which was a more manual process.

Scalable to deal with daily job spikes

We don’t have to write and maintain software to manage a pool of devices. We can just use the Amazon APIs to provision resources for our CI system.

Picture by SaturatedEyes - Creative Commons Attribution-NonCommercial 2.0 Generic (CC BY-NC 2.0)

https://www.flickr.com/photos/shuttershuk/7099823113/sizes/l
EMULATORS IN AWS: THE BAD
• More tests running in parallel (tests run slower, added more tests)
• No performance tests because we’re running emulators on emulators
Emulators: the bad

Tests run slower because we’re running tests on emulators on emulators.

More tests need to run in parallel because they take longer.

Example: Android 4.3 debug tests need to run about 2x as many jobs as they did when running on raw devices.

No performance tests (we have a separate pool of raw devices for this purpose).

As a side note: Amazon has a new offering from this summer called Device Farm, which allows you to run tests on multiple devices. We don’t use it because it is accessed through an API that doesn’t support the test harnesses that we use. Also, it doesn’t allow root access to the device. Finally, the pricing ($250 a month for a single dedicated device) is much more expensive than spot instances.

Picture by Tuncay - Creative Commons

Attribution 2.0 Generic (CC BY 2.0) https://www.flickr.com/photos/tuncaycoskun/15809887756/sizes/l
SUMMARY: EMULATORS ON AWS
• Determine what testing can be done on emulator vs real device
• Use minimum viable instance type
• Run more tests in parallel
You may need a larger instance type to speed up longer running tests.

Minimize the number of tests that need to run on real hardware. Running tests on real devices in continuous integration is much more complicated and painful than running them on emulators, and real hardware does not allow you to upgrade easily for the next Android version.
FUTURE WORK
• Android 5.0 on emulator
• Make it better
QUESTIONS?
WHERE’S THE CODE?
• Cloud tools: https://github.com/mozilla/build-cloud-tools
• buildbot configs https://github.com/mozilla/build-buildbot-configs
• buildbotcustom https://github.com/mozilla/build-buildbotcustom
• Mozharness https://github.com/mozilla/build-mozharness
• Mozpool https://github.com/mozilla/mozpool
• Puppet configs https://github.com/mozilla/build-puppet
LEARN MORE
• @MozRelEng
• http://planet.mozilla.org/releng/
• Mozilla Releng wiki https://wiki.mozilla.org/ReleaseEngineering
• IRC: channel #releng on moznet
MORE READING 1
• Laura's talks on monitoring complex systems http://vimeo.com/album/3108317/video/110088288
• Armen’s talk on our hybrid infrastructure https://air.mozilla.org/problems-and-cutting-costs-for-mozillas-hybrid-ec2-in-house-continuous-integration/
• Move to AWS starting in 2012
• http://atlee.ca/blog/posts/blog20121002firefox-builds-in-the-cloud.html
• http://johnnybuild.blogspot.ca/2012/08/migrating-linux32-and-linux64-builds-to.html
• http://atlee.ca/blog/posts/blog20121214behind-the-clouds.html
• http://rail.merail.ca/posts/firefox-unit-tests-on-ubuntu.html
MORE READING 2
• AWS spot instances vs reserved instances
• http://atlee.ca/blog/posts/now-using-aws-spot-instances.html
• http://rail.merail.ca/posts/firefox-builds-are-way-cheaper-now.html
• http://rail.merail.ca/posts/ec2-spot-instances-experiments.html
• http://taras.glek.net/blog/2014/05/09/how-amazon-ec2-got-15x-cheaper-in-6-months/
• http://taras.glek.net/blog/2014/03/05/more-and-faster-c-i-for-less-on-aws/
• AWS networking
• http://atlee.ca/blog/posts/aws-networks-and-burning-trees.html
• http://rail.merail.ca/posts/using-dns-to-query-aws.html
MORE READING 3
• Scaling
• http://atlee.ca/blog/posts/bursty-load.html
• jacuzzis
• http://atlee.ca/blog/posts/initial-jacuzzi-results.html
• http://hearsum.ca/blog/experiments-with-smaller-pools-of-build-machines/
• Caching
• http://atlee.ca/blog/posts/cache-em-all.html
• Geoffrey Brown’s blog on Android tests https://gbrownmozilla.wordpress.com/

DevOps On AWS - Deep Dive on Continuous Delivery
Ā 
Accelerate Spring Apps to Cloud at Scale
Accelerate Spring Apps to Cloud at ScaleAccelerate Spring Apps to Cloud at Scale
Accelerate Spring Apps to Cloud at Scale
Ā 

Scaling mobile testing on AWS: Emulators all the way down

  • 1. SCALING MOBILE TESTING ON AWS: EMULATORS ALL THE WAY DOWN Kim Moir, Mozilla, @kmoir URES, November 13, 2015 Good morning. My name is Kim Moir and I'm a release engineer at Mozilla. Today I'm going to discuss how we scale our Android testing on AWS. Show of hands - how many of you test on Android? On a continuous integration farm? References Androids by etnyk Attribution-NonCommercial-NoDerivs 2.0 Generic license https://www.flickr.com/photos/etnyk/5588953445/sizes/l
  • 2. A little about me. I live in Ottawa, Ontario, Canada. My hobbies include running and making ice cream, which complement each other well. This picture shows a release engineering ice cream flavour - coffee ice cream with chocolate chip cookies soaked in Kahlúa. Before I was a release engineer at Mozilla I worked at IBM as a release engineer on Eclipse. So 12 years working on open source release engineering. I'm really excited to be here today to share my stories, and learn from all of you.
  • 3. Here's a picture of where the amazing Mozilla release engineering team works. As you can see, we are quite distributed across the world, and many of us work remotely from our homes.
  • 4. Mozilla is a non-profit. Our mission is to promote openness, innovation & opportunity on the web. You're probably familiar with the products we build, such as Firefox for Desktop, Android, iOS and Firefox OS. Firefox for iOS was actually released yesterday - so go and try it out! Note that we ship Firefox on four platforms and with ~97 locales on the same day as US English.
  • 5. We have a continuous integration farm running 24x7 on commit. Our release cadence is every six weeks for Firefox for Android. We release betas every week. https://wiki.mozilla.org/RapidRelease I'll talk a little bit about our environment in general, before I delve into our Android test environment.
  • 6. DAILY • 350 pushes • 4700 build jobs • 150,000 test jobs Here are some recent numbers on the aggregate jobs we run (all products, not just Firefox for Android). Today, about 66% of build jobs and 80% of test jobs are run on AWS. We only have our performance tests left that run on raw devices. They can't run on emulators because performance is not constant. Each time a developer lands a change, it invokes a series of builds and associated tests on relevant platforms. Within each test job there are many actual test suites that run. September: 8188 pushes https://secure.pub.build.mozilla.org/buildapi/reports/pushes?starttime=1441090800&endtime=1443682800 September jobs https://secure.pub.build.mozilla.org/buildapi/reports/waittimes?starttime=1441090800&endtime=1443682800 Builds Oct 4-Oct 10 https://secure.pub.build.mozilla.org/buildapi/reports/waittimes?starttime=1443942000&endtime=1444460400 builds 15560 Builds Tuesday Oct 6 https://secure.pub.build.mozilla.org/buildapi/reports/waittimes?starttime=1444104000&endtime=1444190400 2814
  • 7. 15 MINUTE SERVICE We have a commitment to developers that build/test jobs should start within 15 minutes of being requested. We don't have a perfect record on this, but certainly our numbers are good. We have metrics that measure this every day so we can see what platforms need additional capacity. And we adjust capacity as needed, and remove old platforms as they become less relevant in the marketplace. --- Pizza picture by djwtwo Attribution-NonCommercial-ShareAlike 2.0 Generic (CC BY-NC-SA 2.0) https://www.flickr.com/photos/djwtwo/9864611814/sizes/l/
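The daily wait-time metric described on this slide can be sketched as follows. This is a minimal illustration, not the reporting code Mozilla runs: the function name and the shape of the input (pairs of request and start timestamps) are assumptions made for the example.

```python
from datetime import datetime, timedelta

# The 15 minute commitment from the slide.
WAIT_TARGET = timedelta(minutes=15)


def fraction_started_on_time(jobs, target=WAIT_TARGET):
    """Return the share of jobs that started within `target` of being requested.

    `jobs` is an iterable of (requested_at, started_at) datetime pairs.
    An empty day counts as fully on time.
    """
    jobs = list(jobs)
    if not jobs:
        return 1.0
    on_time = sum(1 for requested, started in jobs if started - requested <= target)
    return on_time / len(jobs)
```

Computing this per platform each day would show which pools are falling behind and may need more capacity.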
  • 8. + many Mozilla tools Here are some of the projects that we use in our infrastructure. Buildbot is our continuous integration engine. However, we are in the process of migrating to TaskCluster. TaskCluster is a set of components that manages task queuing, scheduling, execution and provisioning of resources. It was designed to run automated builds and tests at Mozilla. We use Puppet for configuration management of all our Buildbot servers and the Linux and Mac machines. So when we provision new hardware, we just boot the device and it puppetizes based on its role, which is defined by its hostname. Our repository of record is hg.mozilla.org but developers also commit to git repos and these commits are transferred to the hg repository. We also use a lot of Mozilla tools that allow us to scale. These tools are open source as well and I have links at the end of the talk to these repos. --- References octokitty http://www.flickr.com/photos/tachikoma/2760470578/sizes/l/
  • 9. DEVICES • 6700+ in total • 1900+ for builds • 4700+ for tests • 75% AWS These numbers are for both Android and desktop devices. The pools overlap. 80% test on AWS and 66% build on AWS --- References https://secure.pub.build.mozilla.org/builddata/reports/slave_health/index.html https://secure.pub.build.mozilla.org/slavealloc/ui/#silos
  • 10. HISTORY OF MOBILE TESTING AT MOZILLA Before I talk about where we are today, I'd like to step back and talk about how our mobile testing evolved over the years. Here's a picture from 2009 of a mobile pedalboard. This was our first attempt at mobile test automation. It was used to report Fennec performance data on the Nokia N810s. Picture by Aki Sasaki https://www.flickr.com/photos/drkscrtlv/3590117065/sizes/l
  • 11. Picture by Aki Sasaki https://www.flickr.com/photos/drkscrtlv/3590924524/sizes/l http://escapewindow.dreamwidth.org/205930.html
  • 12. In 2010, we then moved on to testing on Android 2.2 on Tegras. Tegras are bare reference boards. We stored the Tegras in shoe racks from Bed Bath and Beyond. These shoe racks were stored in a room that was shielded from wireless interference. The shoe racks allowed us to position the phones so they weren't too close together, on a material that didn't get too hot and did not conduct electricity. These racks also allowed us to easily take dead phones out, open, remove batteries, reimage and replace. Picture from John O'Duinn's blog http://oduinn.com/blog/2010/02/11/unveiling-mozillas-faraday-cage/ http://oduinn.com/images/2013/blog_2013_RelEngAsForceMultiplier.pdf
  • 13. In 2012, we started running continuous integration tests on Android reference cards in specially designed racks. We started with 800 of them, but only use about 200 today. The cards are called pandas. These were used to run Android 4.0 tests for correctness, debug and performance. --- References Pictures of Panda chassis from Dustin's blog https://blog.mozilla.org/it/2013/01/04/mozpool/2012-11-09-08-30-03/
  • 14. They had a custom relay board to allow us to reboot them remotely. Pictures of Panda chassis from Dustin's blog https://blog.mozilla.org/it/2013/01/04/mozpool/2012-11-09-08-30-03/
  • 15. Many racks of pandas These devices are not as stable as desktop devices, and are prone to failure. Given their numbers, dealing with machines that fail all the time would be very expensive if they were managed by humans. We wrote some software called mozpool to automatically reimage and reboot them. Pictures of Panda chassis from Dustin's blog https://blog.mozilla.org/it/2013/01/04/mozpool/2012-11-09-08-30-03/
  • 16. WHAT DID WE LEARN? What did we learn over these iterations of our mobile testing infrastructure? Each successive mobile testing solution became more reliable (fewer infra failures) and easier to manage via automated tools. Manufacturers EOL reference cards. Old reference cards don't support new Android versions. Does not scale for peak load. Time consuming and expensive to adjust automation infrastructure for every new hardware iteration. Picture https://www.flickr.com/photos/wocintechchat/21909333504/sizes/l from http://www.wocintechchat.com/blog/wocintechphotos #WOCtechchat Picture: computer history museum https://www.flickr.com/photos/indigoprime/2239342335/sizes/o/
  • 17. We have bursty traffic, both for time of day, time of year etc. Example of the number of jobs running per hour in a typical week. Bursty traffic - you can see that the number of jobs run each day is variable as time zones wake up, and the large trough is the weekend.
  • 18. BRANCHING We have many different branches in Hg at Mozilla. Our Hg branches are all named after different tree species. Developers push to different branches depending on their purpose. Different branches have different scheduling priorities within our continuous integration engine. So for instance, if a change is landed in a mozilla-beta branch, the builds and tests associated with that change will have machines allocated to them at a higher priority than if a change was landed on a cedar branch, which is just for testing purposes. Picture by Aurelio Asiain Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) https://flic.kr/p/v27AD
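Branch-based scheduling priority can be illustrated with a small priority queue. This is only a sketch of the idea: the priority numbers, the `PendingQueue` class and its methods are hypothetical and are not taken from Buildbot or from Mozilla's configs.

```python
import heapq
import itertools

# Hypothetical priorities for illustration: a lower number is scheduled sooner.
BRANCH_PRIORITY = {"mozilla-beta": 0, "cedar": 5}
DEFAULT_PRIORITY = 3


class PendingQueue:
    """Hands out pending jobs by branch priority, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so equal-priority jobs keep arrival order.
        self._counter = itertools.count()

    def add(self, branch, job):
        priority = BRANCH_PRIORITY.get(branch, DEFAULT_PRIORITY)
        heapq.heappush(self._heap, (priority, next(self._counter), branch, job))

    def next_job(self):
        """Pop the highest-priority pending job as a (branch, job) pair."""
        _priority, _order, branch, job = heapq.heappop(self._heap)
        return branch, job
```

With this ordering, a job from mozilla-beta that arrives after a cedar job is still allocated a machine first.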
  • 19. Source: http://opensignal.com/reports/2015/08/android-fragmentation/ What do we need to test? Here's a picture of Android device fragmentation as of August 2015.
  • 20. And here is current Android adoption (October 2015). Android "KitKat" 4.4 has about a 40% adoption rate. Android "Jelly Bean" versions (4.1–4.3.1) have a combined share of 30.2%. Sources https://en.wikipedia.org/wiki/Android_version_history
  • 21. ANDROID TEST PLATFORMS • Android 2.3, 4.0, 4.2 (x86), 4.3 • Test types • correctness • debug • performance Obviously, we cannot test on all those platforms and devices; it's not feasible. We limit our testing to the following platforms.
  • 22. In 2012, we started moving our build and test infrastructure to Amazon. We first implemented this for desktop Firefox jobs on Linux. We then implemented them for Android. Scalable infrastructure for bursty traffic with an API to manage it all. Scalable Deals with bursty load APIs! Picture by Tim Norris Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) https://www.flickr.com/photos/tim_norris/2600844073/sizes/o/
  • 23. AWS TERMINOLOGY • EC2 - Elastic Compute Cloud - machines as VMs • EBS - Elastic Block Store - network attached storage • Region - separate geographical area • Availability zone - Multiple, isolated locations within a region I'm going to talk a bit about some AWS terms for those of you that may not be familiar with them. Notes: AWS instance types http://aws.amazon.com/ec2/instance-types/
  • 24. MORE AWS TERMS • AMI - Amazon machine image • instance type - VM with defined specifications and cost per hour. For example: - AMIs - Amazon has standard ones that you can modify or create your own - pricing on instance types can depend on the region - m3.medium currently costs around $0.07/hr in most regions (Nov 2015 costs) - Some instance types may not be available in all availability zones
  • 25. PUPPET VS AMIS AMIs are Amazon machine images Golden AMIs We create golden image AMIs via cron each night. These images are generated from our puppet configs. We have different images defined for different instance types and the role that they perform. For example test and build instances have different libraries and configuration in puppet. Originally we used puppet to manage all of our build and test instances. It was too slow to puppetize the spot instances. Solution: Create golden AMIs from configs each night via cron. These are used to instantiate the new spot instances. We also use the same pool AMI to run Android tests and Linux tests, they just run in different directories. Another reason for nightly regeneration is pre-populating VCS caches to reduce first time startup load. Picture by shaireproductions Creative Commons Attribution 2.0 Generic (CC BY 2.0) https://flic.kr/p/dTfsCs
  • 26. USE SPOT INSTANCES • Use spot instances vs on demand instances • much cheaper • not instantiated as quickly • terminated if outbid while running Amazon has many different types of instances. Initially, we used on demand instances. They instantiate quickly but cost more per hour than other options. Spot instances are Amazon's way of auctioning off excess capacity. You can bid for the instance and if nobody else bids for it at a price above your offer, the spot instances will be instantiated for you. However, if you're running a spot instance and someone bids a price higher than you did, your instance can be killed. But that's okay because we have configured our build farm to retry jobs that failed and a very small percentage are killed this way (< 1%). Since the spot instances aren't available as quickly as the on-demand instances, some tests don't start within 15 minutes but that's okay. Spot instances are instantiated every time with the AMI you specify. Other notes Smart bidding spot bidding library https://bugzilla.mozilla.org/show_bug.cgi?id=972562
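Retrying a job whose spot instance was reclaimed can be sketched like this. The exception class, the function name and the attempt limit are all assumptions for illustration; the actual retry behaviour lives in the build farm's CI configuration, which is not shown in the talk.

```python
class InstanceTerminated(Exception):
    """Hypothetical signal that the spot instance was reclaimed mid-job."""


def run_with_retries(run_job, max_attempts=3):
    """Run `run_job()` and retry it if the instance underneath it was terminated.

    Returns (result, attempts_used). Re-raises InstanceTerminated once
    `max_attempts` attempts have all been interrupted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_job(), attempt
        except InstanceTerminated:
            if attempt == max_attempts:
                raise
```

Because fewer than 1% of jobs are killed this way, a small attempt limit is usually enough for the cheaper spot price to outweigh the cost of the occasional rerun.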
  • 27. Minimum viable instance type Run more tests in parallel on cheaper instance types rather than upgrading the instance type. Most tests run on m3.medium but some need more. Limit the subset of tests run on more expensive instance types to those that actually need it. Our tests have a timeout for a suite of tests. If they don't complete within this timeout, they fail and retry. It's much cheaper to run more tests in parallel on a cheaper instance type than to run on a more expensive instance type, due to the scale of our operations. For example our Android 4.3 reftests invoke 48 parallel jobs. For instance, we have Android tests that run on emulators on AWS. Some of the reference tests required a c3.xlarge to run. The correctness tests were fine to run on m3.medium. Picture by kenny magic Creative Commons Attribution 2.0 Generic (CC BY 2.0) https://www.flickr.com/photos/kwl/4247555680/sizes/l
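Splitting one suite into many parallel jobs, as with the 48 reftest jobs mentioned above, could look like the sketch below. The round-robin strategy and the function name are assumptions; the real test harnesses may divide tests differently (for example by directory or by runtime).

```python
def split_into_chunks(tests, total_chunks):
    """Distribute `tests` round-robin across `total_chunks` parallel jobs.

    Chunk sizes differ by at most one test, so every job stays comfortably
    inside the suite timeout on a small instance type.
    """
    if total_chunks < 1:
        raise ValueError("total_chunks must be at least 1")
    return [tests[i::total_chunks] for i in range(total_chunks)]
```

Each chunk can then run on its own cheap instance, for example `split_into_chunks(all_reftests, 48)` where `all_reftests` is a hypothetical list of test paths.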
  • 28. WHERE'S THE CODE? • The tools we use are all open source • https://github.com/mozilla/build-cloud-tools • Which use boto libraries (Python interface to AWS) https://github.com/boto/boto The code we use to interact with AWS APIs resides here
  • 29. SMARTER BIDDING ALGORITHMS • Important scripts • aws_stop_idle.py • aws_watch_pending.py - aws_stop_idle.py stops instances that are no longer needed given our current capacity (idle for a certain time period - threshold depends on if on-demand or spot) - aws_watch_pending.py activates instances given the criteria on the next slide
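The idle-stop rule can be sketched as a pure function. This is not the logic of aws_stop_idle.py itself: the threshold values, the dictionary keys and the function name are hypothetical, chosen only to show a threshold that depends on whether an instance is spot or on-demand.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds for illustration; the talk does not give the real values.
IDLE_THRESHOLD = {
    "spot": timedelta(minutes=10),
    "on-demand": timedelta(hours=1),
}


def instances_to_stop(instances, now):
    """Return the ids of instances that have been idle longer than their threshold.

    `instances` is a list of dicts with keys 'id', 'kind' ('spot' or
    'on-demand') and 'last_job_finished' (a datetime).
    """
    return [
        inst["id"]
        for inst in instances
        if now - inst["last_job_finished"] > IDLE_THRESHOLD[inst["kind"]]
    ]
```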
  • 30. REGIONS AND INSTANCES • Run instances in multiple regions • Start instances in cheaper regions first • Automatically shut down inactive instances • Start instances that have been recently running • Bid on similar instance types If you look at aws_watch_pending.py, these are some of the rules that it implements. We also use machines in multiple AWS regions, in case one region goes down, and also to gain cost savings (some regions are cheaper). Currently we only use us-east-1 and us-west-2. Since all of our CI infrastructure resides in California, we don't use most other regions. This is unlike some companies that need to have instances available instantly. For instance, I recently saw a talk by Bridget Kromhout (http://bridgetkromhout.com/speaking/2014/beyondthecode/), an operations engineer from DramaFever. This company provides international movie content on demand. They use every single AWS region because their customer base is so distributed. Better build times and lower costs if you start instances that have recently been running (still retain artifact dirs, billing advantages)
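Two of the rules listed above, "start instances that have been recently running" and "start instances in cheaper regions first", can be combined into a simple ordering. The precedence between the two rules, the dictionary keys and the prices in the usage note are assumptions for illustration and are not read from aws_watch_pending.py.

```python
def launch_order(candidates):
    """Order launch candidates: recently running first, then by ascending price.

    `candidates` is a list of dicts with keys 'name', 'region',
    'price' (dollars per hour) and 'recently_ran' (bool).
    """
    # False sorts before True, so `not recently_ran` puts warm instances first.
    return sorted(candidates, key=lambda c: (not c["recently_ran"], c["price"]))
```

For example, a warm instance in us-west-2 at a hypothetical $0.08/hr would be started before a cold one in us-east-1 at $0.07/hr, since it still has its artifact directories.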
  • 31. LIMIT POOL SIZE The size of the AWS pools allocated to different instance types is limited, so if the number of requests spikes we have higher pending counts, but not a huge spike in our AWS bill. The bidding algorithm does not automatically bring up machines for all pending jobs. It adds some more capacity, waits, re-evaluates the pending count, and adds some more if needed. This is similar to a thermostat system that heats your house by gradually adding more heat. Picture - Ottawa Arboretum - Creative Commons Attribution-NonCommercial 2.0 Generic (CC BY-NC 2.0) https://www.flickr.com/photos/rohit_saxena/4552766281/sizes/l
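The thermostat-style behaviour, adding a bit of capacity per cycle while never exceeding the pool limit, can be sketched as one step of a control loop. The function name and the 25% step fraction are assumptions; the talk does not say how large each increment is.

```python
def instances_to_start(pending_jobs, running, pool_limit, step_fraction=0.25):
    """Decide how many instances to start in one evaluation cycle.

    Only a fraction of the pending count is added per cycle, and the pool
    never grows past `pool_limit`. The caller is expected to wait, re-read
    the pending count and call this again.
    """
    headroom = max(pool_limit - running, 0)
    if pending_jobs <= 0 or headroom == 0:
        return 0
    step = max(1, int(pending_jobs * step_fraction))
    return min(step, headroom)
```

With 100 pending jobs, 50 running instances and a limit of 200, one cycle starts 25 instances; if the backlog persists, the next cycle adds more.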
  • 32. LIMIT EBS USE • EBS is network attached storage for the EC2 VM • Much cheaper to use the disk that comes with the instance type
  • 33. SUMMARY: AWS • Golden master of AMIs regenerated daily • Use spot instances • Smarter bidding algorithms • Optimize use of regions, instance type and capacity • Limit pool size and increase capacity gradually • Use instance storage vs EBS to save $ With these changes, we reduced our initial AWS bill by 70% (as of last year). However, today we use AWS S3 (backend storage) so this has really increased our bill from our initial implementation (we migrated all of our FTP data to S3).
  • 34. EMULATOR ENVIRONMENT (1) • Android 4.3 (AOSP 4.3.1_r1, JLS36I); standard 2.6.29 kernel • 1 GB of memory • 720×1280, 320 dpi screen • 128 MB VM heap • 600 MB /data and 600 MB /sdcard partitions • front and back emulated cameras; all emulated sensors • standard crashreporter, logcat, anr, and tombstone support So now that we've talked about our AWS environment, let's talk about our move to emulators From https://gbrownmozilla.wordpress.com/2015/04/23/android-4-3-opt-tests-running-on-trunk-trees/
  • 35. EMULATOR ENVIRONMENT (2) • Run emulator that comes with Android SDK and load the custom image, install Firefox apk • We run tests on a variety of instance types (m3.medium, m3.xlarge, c3.xlarge) http://developer.android.com/tools/devices/emulator.html
  • 36. This is a screenshot of the emulator starting up. We have tooling in our test suites that creates a screenshot when the emulator starts, or when a test fails. The screenshots, logs and other testing artifacts are uploaded to Amazon S3 storage and are available to developers when their tests fail.
  • 37. This screenshot is of an Android test suite test failure. Most of the time the logs that are uploaded with the screenshot are more useful. Example log http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/61c91375333e3265c832cff6f1ff314fb9b70c6a2d15386f0a303c7226cfd1ed7209680d88ac032332907a43cfcf4f03c5f02e5531101ae3b855c699ce1e4e02
  • 38. ACCESS TO DEVICES • Access to processes via adb (Android debug bridge) • Allows us to kill errant processes • Some test types require root permissions to copy files to certain locations or for other privileged operations http://developer.android.com/tools/help/adb.html
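The adb operations mentioned on this slide can be scripted from Python with `subprocess`. The sketch below only builds and runs standard adb commands (`adb -s <serial> shell kill`, `adb -s <serial> root`, `adb -s <serial> push`); the helper names, the emulator serial and the file paths in the usage note are hypothetical, and this is not the code Mozilla's harnesses use.

```python
import subprocess


def adb_command(serial, *args):
    """Build an adb argv list targeting one device or emulator by serial."""
    return ["adb", "-s", serial, *args]


def kill_process_cmd(serial, pid):
    """Command that kills an errant process on the device."""
    return adb_command(serial, "shell", "kill", str(pid))


def restart_as_root_cmd(serial):
    """Command that restarts adbd with root permissions (works on emulator images)."""
    return adb_command(serial, "root")


def push_file_cmd(serial, local_path, device_path):
    """Command that copies a local file to a location on the device."""
    return adb_command(serial, "push", local_path, device_path)


def run(cmd, timeout=60):
    """Execute a built command; requires adb on PATH and a connected device."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True, timeout=timeout)
```

A typical sequence would be `run(restart_as_root_cmd("emulator-5554"))` followed by `run(push_file_cmd("emulator-5554", "prefs.js", "/data/local/tmp/prefs.js"))`.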
  • 39. MIGRATION PROCESS • Moved correctness tests, then debug • Many intermittent issues • Debug tests were problematic • Take longer and consume more resources
  • 40. MIGRATION LESSONS • Use more powerful instance types • Specify timeouts that are longer for individual tests • Skip tests on certain (slow) platforms • Split the tests into smaller tests • Optimize or simplify the test https://gbrownmozilla.wordpress.com/2015/05/26/handling-intermittent-test-timeouts-in-long-running-tests/
  • 41. PERFORMANCE TESTS • Autophone is a Mozilla project measuring page load performance and testing video playback on real Android devices • Provision, verify, recover, run tests and identify status of a variety of phones Retain small pool of real devices for performance tests From https://wiki.mozilla.org/Auto-tools/Projects/Autophone Verify that a phone is working correctly: sd card is writable and not full, etc. Attempt to recover a phone that reports errors, rerunning the current test/test framework. Provide at least a high-level status for all phones: whether they are idle, running a test, or disabled/broken. Support a large number of phones, potentially split amongst several host machines.
  • 42. EMULATORS IN AWS: THE GOOD When we want to test a new Android version, we just need a new emulator image, not a new hardware stack. No lead time associated with procuring and installing new hardware in the data centre. Increased reliability due to fewer retries (2% vs 18% on Pandas). Some of that reliability stems from the fact that with the emulator, tests run from the same, fresh Android image each time. When the tests ran on devices, the reimaging process took a long time and the devices had to be re-imaged every so often, which was a more manual process. Scalable to deal with daily job spikes. We don't have to write and maintain software to manage a pool of devices. We can just use the Amazon APIs to provision resources for our CI system. Picture by SaturatedEyes - Creative Commons Attribution-NonCommercial 2.0 Generic (CC BY-NC 2.0) https://www.flickr.com/photos/shuttershuk/7099823113/sizes/l
  • 43. EMULATORS IN AWS: THE BAD • More tests running in parallel (tests run slower, added more tests) • No performance tests because we're running emulators on emulators Tests run slower because we're running tests on emulators on emulators. More tests need to run in parallel because they take longer. Example: Android 4.3 debug tests need to run about 2x as many jobs as they did when running on raw devices. No performance tests (we have a separate pool of raw devices for this purpose). As a side note: Amazon has a new offering from this summer called Device Farm which allows you to run tests on multiple devices. We don't use it because it is through an API that doesn't support the test harnesses that we use. Also, it doesn't allow root access to the device. Also, the pricing ($250 a month for a single dedicated device) is much more expensive than spot instances. Picture by Tuncay - Creative Commons Attribution 2.0 Generic (CC BY 2.0) https://www.flickr.com/photos/tuncaycoskun/15809887756/sizes/l
  • 44. SUMMARY: EMULATORS ON AWS • Determine what testing can be done on emulator vs real device • Use minimum viable instance type • Run more tests in parallel You may need a larger instance type to speed up longer running tests. Minimize the number of tests that need to run on real hardware. Running tests on real devices in continuous integration is much more complicated/painful than running them on emulators. Real hardware also does not allow you to upgrade easily for the next Android version.
  • 45. FUTURE WORK • Android 5.0 on emulator • Make it better
  • 47. WHERE'S THE CODE? • Cloud tools: https://github.com/mozilla/build-cloud-tools • buildbot configs https://github.com/mozilla/build-buildbot-configs • buildbotcustom https://github.com/mozilla/build-buildbotcustom • Mozharness https://github.com/mozilla/build-mozharness • Mozpool https://github.com/mozilla/mozpool • Puppet configs https://github.com/mozilla/build-puppet
  • 48. LEARN MORE • @MozRelEng • http://planet.mozilla.org/releng/ • Mozilla Releng wiki https://wiki.mozilla.org/ReleaseEngineering • IRC: channel #releng on moznet
  • 49. MORE READING 1 • Laura's talks on monitoring complex systems http://vimeo.com/album/3108317/video/110088288 • Armen's talk on our hybrid infrastructure https://air.mozilla.org/problems-and-cutting-costs-for-mozillas-hybrid-ec2-in-house-continuous-integration/ • Move to AWS starting in 2012 • http://atlee.ca/blog/posts/blog20121002firefox-builds-in-the-cloud.html • http://johnnybuild.blogspot.ca/2012/08/migrating-linux32-and-linux64-builds-to.html • http://atlee.ca/blog/posts/blog20121214behind-the-clouds.html • http://rail.merail.ca/posts/firefox-unit-tests-on-ubuntu.html
  • 50. MORE READING 2 • AWS spot instances vs reserved instances • http://atlee.ca/blog/posts/now-using-aws-spot-instances.html • http://rail.merail.ca/posts/firefox-builds-are-way-cheaper-now.html • http://rail.merail.ca/posts/ec2-spot-instances-experiments.html • http://taras.glek.net/blog/2014/05/09/how-amazon-ec2-got-15x-cheaper-in-6-months/ • http://taras.glek.net/blog/2014/03/05/more-and-faster-c-i-for-less-on-aws/ • AWS networking • http://atlee.ca/blog/posts/aws-networks-and-burning-trees.html • http://rail.merail.ca/posts/using-dns-to-query-aws.html
  • 51. MORE READING 3 • Scaling • http://atlee.ca/blog/posts/bursty-load.html • jacuzzis • http://atlee.ca/blog/posts/initial-jacuzzi-results.html • http://hearsum.ca/blog/experiments-with-smaller-pools-of-build-machines/ • Caching • http://atlee.ca/blog/posts/cache-em-all.html • Geoffrey Brown's blog on Android tests https://gbrownmozilla.wordpress.com/