Testing is simple: you understand what is important, you test it, then, if possible, you automate it.
We want to deliver a great and stable experience to all of our users, every time!
Why?
We affect a lot of users and we care about them.
Agenda
‣ Organization
The company - fast facts
TPD - Tech Product Design
‣ Release process
Mobile development
Mobile testing
‣ Exercise
Scripted Testing vs Model Based Testing
“The right music for every moment”
Started in 2006 in Sweden
Available in 60 countries
Over 30M songs
Over 2 billion playlists
100M users
40M paying subscribers
$5 billion revenue paid to stakeholders
MBT Interfaces
public interface StartRadio {
    // GraphWalker naming convention: e_ methods are model edges
    // (actions that change client state), v_ methods are model
    // vertices (verifications of that state).
    void e_Init();
    void v_InitialView();
    void e_GoToArtist();
    void v_Artist();
    void e_StartArtistRadio();
    void v_ArtistRadio();
    void e_GoToPlaylist();
    void v_Playlist();
    void e_StartPlaylistRadio();
    void v_PlaylistRadio();
    void e_GoToAlbum();
    void v_Album();
    void e_StartAlbumRadio();
    void v_AlbumRadio();
    void e_Pause();
    void v_TrackNotPlaying();
}
Explains the rules of the game.
There is time at the end to ask questions, but I don’t want this to be a one-way conference talk, so please don’t hesitate to interrupt me. Put your hand up and ask your question.
I am a Software Engineer in Test at Spotify; I joined the band in 2014. Before that I was a developer in Spain, where I come from. In my previous company, a startup of 20 people, at some point all our QA people left. That is when I realised how important quality and testing were, and at the same time how boring it was to do the same things over and over. So I developed a quality mindset, and discovered testing is simple: you understand what is important, you test it, then, if possible, you automate it.
Here the question is: how can we test it?
This simple question asked before any development has started can prevent a lot of bugs and issues.
This is about bug prevention and not about bug detection.
And that’s what this talk is about.
Explain the union of the teams and the company.
Fast facts about Spotify
https://press.spotify.com/se/about/
Being present everywhere.
iOS, Android, Windows, OS X.
We want to deliver a fabulous and stable Spotify experience to all of our users, every time!
Why? We affect a lot of users and we care about them.
Tech
Product
Design
Every tribe has a business mission, such as “Provide fast and reliable access to all the world’s music”, and makes sure it has the right mix of domain-area knowledge.
Alliance – a collection of tribes.
Three tribe leads: one Product lead, one Design lead and one Tech lead, all together.
The Product Platform alliance’s mission: ensure a unified Spotify experience wherever it matters, and that Spotify can scale product innovation with quality.
The Partners tribe’s mission: work with partners to make the Spotify experience available where it matters, be it cars, TVs, game consoles, speakers, 3rd-party apps or emerging platforms.
Everyone in the same chapter has the same domain area knowledge (web developers, mobile developers, QA etc.). The chapter sees to personal development within your area of expertise.
Squads have no manager; instead, the entire squad owns their work.
Feels like a startup
Self-Organized
Cross-functional
5-7 engineers, fewer than 10 people
Client devs, backend, QA, TA, Product Owner, Designer.
Same here, TPD in the same team.
Squad priorities
Deal with live incidents
Deal with blocking bugs
Normal feature work
Loosely coupled, tightly aligned squads
Alignment enables autonomy.
Consumer squads have a mission. Explore: “Make it easy to discover and browse music”
Car integration is Spotify in the car’s radio: BMW, Pioneer, etc.
Platform squads support the rest of the organization, like “iOS Infra”
~15 squads deliver into the clients.
When all the teams have said “we are OK”, they decide to release.
Release criteria
Reach 50 000 streams on a release candidate
Monitor all quality metrics before releasing
Quality metrics?
Share of users without a crash
Startup time
View loading times
View loading errors
Playback latency
Playback errors
What is happening in those feature squads to bring cool new ideas into your hands?
To understand the release process, first we need to explain the way of working.
Waterfall
Discover one problem
Solve it and deliver the solution
Long term projects
Plan driven
Agile
Delivery is out of the loop because there is a dependency on the mobile release cycle.
Think of a simple solution, build it, and verify it fixes the problem. Then repeat this in short cycles (sprints) to improve the solution.
Value driven
By “we”, I mean my squad at least. Remember, each squad can define its own way of working.
We use Agile, but we made it more dynamic; it is a mix of methodologies.
Here, when we say testing, we mean any kind: manual or automated.
A short and constant feedback loop.
Now we are going to talk about this stage of the release process.
An automated software delivery process.
Purpose: getting software from version control into the hands of the users.
Every change:
Build software
Run a sequence of test stages
Release
In theory
Every commit is built and ready to be delivered to the users
In practice
Our master branch should always be in a shippable state
Employees and beta testers get new builds automatically
Pace of delivery lower than pace of change
One repository per Spotify mobile client, automated tests included.
Create pull request
Trigger Continuous Integration
Code review
+1
Merge it
Example Spotify iOS client
60 regular contributors
80-120 merged pull requests per day
Now we are going to talk about this stage of the release process.
It is a team effort.
Squad owns quality.
The team is responsible for what they are doing.
Then, there is a quality role in the team; what does the QA do?
First, what it does not do.
QA, Quality !Assurance -> Assistance
Doesn’t own quality; part of a team that helps out with quality.
And why am I not referring to quality assurance, you might ask? That’s the established abbreviation. Well, that’s not what QA at Spotify is about: QA doesn’t own quality, and doesn’t jump in at the end of the process to put a stamp saying “it is OK”.
QA helps out with quality. We investigate. We learn things. We report lessons learned. We make sure people know what we’ve found and what it means. We provide data that is important for understanding and improving the quality of the product. We analyze, we assist, we advise – that is definitely not the same as assuring quality. Who can do that?
I like to call it Question Asker. It questions every uncertainty, unclarity, risk, etc. That is when the question “How do we test it?” gains power. And the earlier in the process it is asked, the better. Communication is key to success.
We have different roles related to testing:
QA – Quality Assistance
Puts testing first. Quality assistance and advising. Product experts. Be the user. Risk analysts. Manual testing with an exploratory mindset.
TA – Test Automator
Focus on test automation. Ambassadors of test infrastructure. Implement and maintain test frameworks. The main purpose of TA is to enable manual testers to focus on what manual testing should be. Helps teams automate tests for their products.
QE – Quality Engineer
Tool smiths. Develop developer-productivity infrastructure. Focus on testability. Push testing beyond the conventional. Developers that help teams build tools for monitoring, testing, code evaluation, etc.
Then the question is: who tests?
Our Co-founder, Martin, helping us in a test session.
Martin is not our tester. He is still good though!
Yes, we have test sessions to verify new features; the QA facilitates these sessions.
Because, as we said, the squad owns quality, so the decision about whether something is shippable lies with the squad.
We do a quick round-the-table thumbs up/down to know if we are ready to go, or if anyone found blockers.
Why do we perform testing?
To verify the product meets user expectations.
A mind map of risk areas for a test session; the QA facilitates the test session, as we said.
This includes analysis of which test scenarios we want to cover.
That drives us to create manual scripted tests for some test sessions,
which are candidates to turn into automated tests for regression testing.
Now this feature is out and we will need to verify it every 2 weeks.
Why regression testing?
We are delivering a fresh experience every 2 weeks, and different teams are adding a lot of changes every day that can break your stuff.
Imagine this every 2 weeks!
This represents a big challenge:
Complex device matrix
Different devices, hardware, operating systems, screen sizes, localizations etc.
Many user conditions to test on
Memory, CPU, network conditions etc.
Many test scenarios
Same functionality in many different contexts and conditions. Example: Connect
Testing slows down release cycle
Crashes and issues reported post-launch
Poor ratings and loss in users
We try to cover everything every 2 weeks, but
We cannot test everything!
How do we deal with these challenges?
Test automation.
Automate what can be automated
Define a strategy that maximizes our test coverage
Based on our user data
Prioritize test cases with high value and low cost
Test on both simulators/emulators and real devices
Monitoring and crash reporting tools
Measure performance
Only Test automation? No, we also rely on:
Manual testing help from:
Employees
Beta users
Improve product quality by bug prevention
Automated regression testing
Increase test coverage
Enable developer productivity
Shorten feedback loops
Free up manual testers
More time for exploratory testing, user experience, UI interactions, look and feel, etc.
Does the product actually make sense for the end user?
Point out that when we talk about doing manual testing here we refer to exploratory testing, because scripted testing can be automated.
Scripted testing, ideally, should be done only once (during the test session).
E2E (end-to-end) tests are a way out of boring, repetitive scripted testing.
Testers know the app and the feature, and they can invest the time in exploratory testing.
They will find corner-case bugs.
Remember, avoid an ice-cream-cone shape.
A clear visualisation of why we don’t have only unit tests.
Simple TA tests running on pre-merge -> Bug prevention
Manual testing = Manual exploratory testing
Every bug fixed = a regression test (unit, integration or system test). Remember the pyramid.
- Manual scripted testing fully automated.
- Create high confidence in TA for developers on each pull request.
Fearless development: my change, if green, will not break stuff.
1000 pre-merge /day - only scripted
15000 BVT /day - only scripted
40000 SUPA /day - scripted and MBT
For iOS, where the hardware and software have been built under the same roof, we are OK using simulators running in the cloud.
We still have real-device testing for performance tests.
For Android, every piece of hardware is different, and the OS behaves in different ways, so testing on real devices is needed; we still have tests on emulators though. We have a QA lab in the Stockholm office with a few devices.
What happens when the team says, “yes, we are ready to ship”?
We are doing a release every 2 weeks; that is why it is out of the loop.
And what happens when the feature you have been working on is not ready on that date?
As we saw, development is in progress while the testing is performed, so maybe not all of your feature is there on time.
Every 2 weeks we create a new branch called Release, and this action, or day, we call Feature Complete.
If you are not ready, it is OK: your feature is disabled by a toggle, so you are not blocking the release. It will go out to users once it is shippable.
Release cycle
Every 2 weeks:
Feature complete -> release branch
Only fix blocking bugs on release branch
Release candidate
No blocking bugs and all tests green
Sent to our beta users
Release criteria
Monitor all quality metrics before releasing
50 000 streams on a release candidate
Release to users
App Store/Google Play Store
Quality metrics?
Share of users without a crash
Startup time
View loading times
View loading errors
Playback latency
Playback errors
We have a way to deliver the latest version to employees.
Incremental rollouts of a new feature.
Who tests?
Employee releases
Beta programs
Incremental roll-outs
A/B tests
Build and test at every software change
Imagine you are building a bridge while you are on it, and you are about to step onto the next rock. You want to be sure it is not going to fall.
- Automate what can be automated
We are still on the bridge, and for whatever reason we forgot to check the next rock. We have a safety net.
Because this is about failing and learning from it.
Put the results of your tests where everybody can see them.
Sharing the results with the rest of the people will engage them; getting feedback from others is always a good way to improve. And when everything is good, it is nice to remember the hard work behind it.
Put up a big screen with the dashboard; it is deeply satisfying to see it green. People need to see the result of writing all those tests.
We are creating our own test interface for our mobile client.
They are acceptance, or end-to-end, tests.
Why?
The idea of having the Test API inside the app is that we have access to the client state.
We want to have our own asserts; it enables us to write much better assertions, and this can’t be done with other mobile frameworks.
For example, there are other options like Appium.
In that case the mobile automation framework sits outside the app.
It has some good points, like not needing to recompile the client for a TA change.
However, we are not reinventing the wheel; we are giving the TAs the option to use another language for the tests.
This API is not included in the app when we build it for production; it is only included when we build the app in testing/debug mode.
Who writes what?
We need our developers to do this: the Test API.
The TAs write the acceptance tests.
And the implementation of the acceptance tests is born from a collaboration between the TA and the client developer.
This is done before saying the Feature is complete.
It is manually tested, and if possible this manual test is automated.
When?
So the moment to write this code is while we are creating the feature.
We create the test interface at the same time the TA is writing the Java test, so we know what we need to implement.
Having the test interface inside the app gives us more control over the assertions we can do in the tests.
Some people write the acceptance tests once the feature is built. That can be a mistake.
An Acceptance test is also code. It’s another task of the development process.
We are applying Agile methodologies.
We have a story sliced into tasks, and one of them is the TA API; another is to write the acceptance test.
You have the logic fresh in your mind, the discussions about the feature and what to test.
The sooner you can find a bug the better, and here it is about bug prevention.
Maslow’s hierarchy is the idea that you need to satisfy your most basic needs before the higher levels.
We have unit tests; they are the base of our pyramid.
We are trying to move up, towards the top of it.
Report step
Perform an action
Verify the result
Repeat
Pros:
Uses JUnit type of testing
Easy to get started
Deterministic
Cons:
Lack of edge case coverage
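As a sketch of that act-then-verify rhythm, here is what a scripted test can look like in plain Java. The client class and its methods are hypothetical stand-ins for the real in-app test API, stubbed here so the example is self-contained:

```java
// Hypothetical sketch of a scripted test following the
// perform-an-action / verify-the-result / repeat pattern.
// SpotifyTestClient and its methods are invented stand-ins,
// not the real Spotify test API.
public class ScriptedSearchTest {

    // Minimal stub standing in for the app's in-app test interface.
    static class SpotifyTestClient {
        private String currentView = "Home";
        private boolean playing = false;

        void goToSearch()        { currentView = "Search"; }
        // The query is ignored by this stub; a real client would search.
        void searchFor(String q) { currentView = "SearchResults"; }
        void playFirstResult()   { playing = true; }
        String currentView()     { return currentView; }
        boolean isPlaying()      { return playing; }
    }

    public static void main(String[] args) {
        SpotifyTestClient client = new SpotifyTestClient();

        client.goToSearch();                                  // perform an action
        check(client.currentView().equals("Search"));         // verify the result

        client.searchFor("Robyn");                            // repeat: act...
        check(client.currentView().equals("SearchResults"));  // ...and verify

        client.playFirstResult();
        check(client.isPlaying());

        System.out.println("scripted test passed");
    }

    static void check(boolean condition) {
        if (!condition) throw new AssertionError("verification failed");
    }
}
```

In a real suite these checks would be JUnit assertions; the deterministic, fixed sequence is exactly why scripted tests are easy to start with but miss edge cases.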
Ask how many are using MBT in the model slide. Then tell.
We are using Model-Based Testing in combination with GraphWalker (an open-source tool)
to cover corner cases and bigger areas.
GraphWalker generates test sequences with various strategies and stop conditions.
Pros:
Easier to design and follow flow of test
Faster to modify test behaviour with little to no code changes
Test edge cases
Cons:
Tests are non-deterministic (multiple edges/vertices)
Graphwalker dependency and overhead
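To make that non-determinism concrete, here is a toy walker in plain Java over a slice of the StartRadio model. It only illustrates the idea of generating a test sequence from a graph of e_ actions and v_ verifications; real runs use GraphWalker's generators and stop conditions, and the seed here is arbitrary:

```java
import java.util.*;

// Toy illustration of what an MBT tool like GraphWalker does:
// walk a graph of edges (e_ = actions) and vertices (v_ = verifications)
// until a stop condition is met. The graph below is a hand-built slice
// of the StartRadio model, not a real GraphWalker model file.
public class ModelWalkSketch {
    public static void main(String[] args) {
        // vertex -> outgoing (edge, target vertex) pairs
        Map<String, List<String[]>> graph = new HashMap<>();
        graph.put("v_InitialView", List.of(
            new String[]{"e_GoToArtist", "v_Artist"},
            new String[]{"e_GoToPlaylist", "v_Playlist"}));
        graph.put("v_Artist", List.of(
            new String[]{"e_StartArtistRadio", "v_ArtistRadio"}));
        graph.put("v_Playlist", List.of(
            new String[]{"e_StartPlaylistRadio", "v_PlaylistRadio"}));
        graph.put("v_ArtistRadio", List.of(
            new String[]{"e_Pause", "v_TrackNotPlaying"}));
        graph.put("v_PlaylistRadio", List.of(
            new String[]{"e_Pause", "v_TrackNotPlaying"}));
        graph.put("v_TrackNotPlaying", List.of());

        Random random = new Random(42);   // seeded so this sketch is repeatable
        String vertex = "v_InitialView";
        List<String> path = new ArrayList<>(List.of(vertex));

        // stop condition: walk until a vertex has no outgoing edges
        while (!graph.get(vertex).isEmpty()) {
            List<String[]> out = graph.get(vertex);
            String[] step = out.get(random.nextInt(out.size()));
            path.add(step[0]);   // run the action (e_ method)
            vertex = step[1];
            path.add(vertex);    // run the verification (v_ method)
        }
        System.out.println(String.join(" -> ", path));
    }
}
```

Two runs with different seeds can take different branches (artist radio vs playlist radio), which is both the strength (edge-case coverage) and the weakness (non-determinism) listed above.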
Zoom in on the “click media object” part of the model.
This model is really dependent on the client.
I want you to focus on the zoomed area.
First we created the model for one client, and after that we wanted to implement it for another client. We needed to repeat the same model AGAIN!
Here you have an example of Search.
This model was one of the first models created.
We have two mobile clients, Android and iOS, plus desktop.
We needed to create and maintain 3 models for the same functionality.
This is how it looks now, and the kind of model that we are creating.
In this model the actions are client-independent.
We can share the same model between different clients.
The idea is to treat an action as a state change, not as a click, tap or other client-specific action,
leaving the implementation details to the code.
Be the user when you create a model.
How a model translates into Java code.
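As an illustration of that translation, a couple of the StartRadio methods might map to Java like this. FakeClient and its calls are invented stand-ins for the in-app test API, so the sketch is self-contained; the real implementation drives the actual client:

```java
// Hedged sketch of how StartRadio model methods could translate to
// Java. The e_ methods perform the state-changing action behind a
// model edge; the v_ methods assert the client state a vertex expects.
public class StartRadioSketch {

    // Stand-in for the in-app test interface (hypothetical).
    static class FakeClient {
        String view = "InitialView";
        boolean playing = false;
        void tapArtist()     { view = "Artist"; }
        void tapStartRadio() { view = "ArtistRadio"; playing = true; }
    }

    final FakeClient client = new FakeClient();

    // Edges: perform actions.
    public void e_GoToArtist()       { client.tapArtist(); }
    public void e_StartArtistRadio() { client.tapStartRadio(); }

    // Vertices: verify the resulting client state.
    public void v_Artist()      { check(client.view.equals("Artist")); }
    public void v_ArtistRadio() { check(client.view.equals("ArtistRadio") && client.playing); }

    static void check(boolean ok) {
        if (!ok) throw new AssertionError("model verification failed");
    }

    public static void main(String[] args) {
        // Hand-run one path through the model; GraphWalker would
        // generate such sequences automatically from the graph.
        StartRadioSketch t = new StartRadioSketch();
        t.e_GoToArtist();
        t.v_Artist();
        t.e_StartArtitRadioSafe(t);
    }

    private void e_StartArtitRadioSafe(StartRadioSketch t) {
        t.e_StartArtistRadio();
        t.v_ArtistRadio();
        System.out.println("model path passed");
    }
}
```

Because the e_/v_ methods contain no click or tap semantics in their names, the same model can back an Android, iOS or desktop implementation.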
Execution of a test comparison.
Exercise.
Who has Spotify on their mobile? Pair up or work on your own.
Play around with the app, only with search, and only with the tracks section. You need to have a Premium account.
I do it on the simulator so they can see what I am talking about.
Go to Search
Search for a track
Play a track from Search results
Verify it is playing
2 weeks later…
You have a new release, and you didn’t write any test.
Test it again
2 weeks later…
You have a new release, and you didn’t write any test
2 weeks later…
You have a new release, and you didn’t write any test
Write together a scripted test.
Create together a model and implement the test.
We introduce a new section to go and see more track results.
Go to Search
Search for a track
Click on “See all tracks”
Verify a new view with more than 5 tracks is loaded.
Add this scenario to the scripted test.
Add it to the model and implement the test.
We discovered a bug: after going into this “See all tracks” section,
then going back to the search results,
and tapping on a track row…
What happened:
It was not possible to start playing a track.
What I expected:
The track starts to play.
Which method would have found the bug?
Would it always be like that?