Initial thoughts on live user testing for games
What is the purpose of live user testing (closed alpha, closed beta, open beta, and
similar) for a game, from a test perspective? What types of tests can you
effectively move to live user testing, and what types of tests are better suited for
professional game testers? What type of feedback can you expect from live user
testing compared to professional game testing? Which bugs are found more easily
with live user testing than with normal game testing? These are questions I will
explore in this article, adding my personal view on the subject.
From a marketing and business perspective there can certainly be many reasons
to run different types of live user tests, but this is not something I will cover here.
We will only discuss the purpose from a testing/QA perspective.
So what should be the purpose of live user testing from a test/QA perspective?
The way I see it, live user testing has two main purposes:
Feedback from real users
Finding specific types of bugs that are difficult to find for a small group of
professional game testers
Based on these two purposes we can pinpoint a number of test activities that could
primarily be done through live user tests instead of by game testers.
Types of tests
I have identified four categories of tests, which I think could be done at large
scale through live user tests instead of by game testers.
Fun Factor Testing
Whether something is fun or not is often highly subjective. If you assess this
through live user tests instead of with game testers, and you manage to collect
the feedback properly, you get a large and diverse body of feedback, which will be
more valuable and more accurate.
Learnability & Attractiveness
Can users navigate the UI properly? Do they understand how to play the game?
What does the learning curve look like for players? Is the game attractive to
them? Does the tutorial do a good job of teaching the core mechanics of the game?
Is the game immersive? This is something that is very hard for game testers to
give universal and unbiased input on.
Game Balancing
Monitoring weapon usage in an FPS beta test to see which weapons give a higher
K/D ratio could be one example. Monitoring level difficulty based on how many
users pass a specific level during a beta test could be another. Game testers are
often much better than normal users, so difficulty can be hard to assess based
on their performance. It can also be hard for game testers to find the more
nuanced game balancing problems, which only show up in a large statistical
sample. Obviously game testers should find large imbalances long before a live
user test starts.
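As a minimal sketch of the weapon monitoring described above, assume the beta client logs one record per kill, containing the killer's weapon and the weapon the victim was holding. The `KillEvent` record and its field names are hypothetical and not taken from any particular engine or telemetry system.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Optional


class KillEvent(NamedTuple):
    """Hypothetical telemetry record: one row per kill in a match."""
    killer_weapon: str  # weapon the killer was using
    victim_weapon: str  # weapon the victim was holding when killed


def kd_ratio_per_weapon(events: List[KillEvent]) -> Dict[str, Optional[float]]:
    """Return kills / deaths for every weapon seen in the events.

    A weapon with no recorded deaths maps to None, because the ratio is
    undefined rather than "infinitely good".
    """
    kills = Counter(e.killer_weapon for e in events)
    deaths = Counter(e.victim_weapon for e in events)
    ratios: Dict[str, Optional[float]] = {}
    for weapon in sorted(set(kills) | set(deaths)):
        ratios[weapon] = kills[weapon] / deaths[weapon] if deaths[weapon] else None
    return ratios


# Tiny made-up sample; a real beta would feed in many thousands of events.
events = [
    KillEvent("shotgun", "pistol"),
    KillEvent("shotgun", "rifle"),
    KillEvent("shotgun", "rifle"),
    KillEvent("rifle", "shotgun"),
    KillEvent("pistol", "rifle"),
]
for weapon, ratio in kd_ratio_per_weapon(events).items():
    print(weapon, ratio)
```

With only five events the numbers mean nothing; the point of a live user test is that the same aggregation over a large player base can reveal the nuanced imbalances a small test team would miss.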
Reliability Testing
Stress testing with a large number of users in a live environment at the same
time is an obvious example. Login servers for MMOs come to mind. Long-term,
high-load testing is also difficult to run in a test environment. What happens if
100,000 players are active in the game for an extended period of time? Will we
see performance degradation, or even crashes?
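For the long-running case, one simple check is to compare response times early in the run with those late in the run. The sketch below assumes latency samples have already been collected in chronological order. The function name, the 10% window and the 1.5x threshold are illustrative assumptions, not recommended values.

```python
from statistics import median
from typing import Sequence


def has_degraded(latencies_ms: Sequence[float],
                 window_fraction: float = 0.1,
                 threshold: float = 1.5) -> bool:
    """Flag a run whose late median latency exceeds the early median by `threshold` times.

    `latencies_ms` must be in chronological order. The first and last
    `window_fraction` of the samples are compared.
    """
    window = max(1, int(len(latencies_ms) * window_fraction))
    if len(latencies_ms) < 2 * window:
        raise ValueError("not enough samples to compare two windows")
    early = median(latencies_ms[:window])
    late = median(latencies_ms[-window:])
    return late > early * threshold


# Made-up data: a flat run and a run that gets steadily slower.
stable = [50.0] * 100
leaking = [50.0 + i for i in range(100)]
print(has_degraded(stable))   # False
print(has_degraded(leaking))  # True
```

Medians are used instead of means so that a handful of extreme spikes does not decide the outcome on its own.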
So what kind of feedback can we expect from the users during a live user test?
On a high level there are two kinds of feedback. You have the feedback forms and
surveys that users fill in and send to you, and you have the data you can
collect from the live user test.
If you are going to run a valuable live user test, it is my opinion that you have
to have the data framework and diagnostic tools in place to handle this. If you
can see what the users are doing through data, then suddenly you can make
decisions based on that data. If 99% of players cannot pass a specific level in 25
attempts, then perhaps it is too hard. If everyone in an FPS match is using the
shotgun and no other weapon, then that gives you some information. If a player's
K/D ratio is continuously 5 with a pistol, but only 0.5 with an assault rifle, it is at
least worth investigating. If players quickly skip the tutorial, then perhaps it
should be redesigned. Things like play session lengths, how often players return
to the game, and when they decide to stop playing are other examples of valuable
data that can be collected.
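The level difficulty check mentioned above can be sketched as follows, assuming the game logs every attempt at a level as a (player, passed) pair in chronological order. The log shape and the function name are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def pass_rate_within(attempts: Iterable[Tuple[str, bool]], max_attempts: int) -> float:
    """Fraction of players who cleared the level within `max_attempts` tries.

    `attempts` is a chronological stream of (player_id, passed) pairs for
    a single level.
    """
    per_player: Dict[str, List[bool]] = defaultdict(list)
    for player_id, passed in attempts:
        per_player[player_id].append(passed)
    if not per_player:
        raise ValueError("no attempts recorded")
    # A player counts as having cleared the level if any of their first
    # `max_attempts` tries succeeded.
    cleared = sum(any(results[:max_attempts]) for results in per_player.values())
    return cleared / len(per_player)


# Made-up log for one level.
log = [
    ("ann", False), ("ann", True),
    ("bob", False), ("bob", False), ("bob", False),
    ("cat", True),
]
print(round(pass_rate_within(log, 2), 2))  # 0.67
print(round(pass_rate_within(log, 1), 2))  # 0.33
```

A pass rate of around 1% at 25 attempts would correspond to the "99% of players cannot pass" situation described above and would be a strong hint that the level needs another look.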
The more data you can get the better. This is something you have to think about
early in the design and development process, as it is often very hard to solve
There is of course also value in surveys and written feedback; however, it is
often much more difficult and expensive to gather and manage. Also, people often
cannot properly understand and assess the situation. The sniper rifle may
feel overpowered, but the statistical data says otherwise. A new interface may be
scary at first, but once the user is further along the learning curve it may be a
significant improvement. How to create the optimal feedback form to get the
right information from the users I leave to experts in that field.
So what kind of input can the game designers get from a live user test? I see the
input as two separate categories.
Game balancing
o Level difficulty
o Weapon/skill/class balance
o Game mechanics fine tuning
User experience / User interaction
o Fun factor
o Physics believability
o Ease of use
o Game mechanics
I will not go further into these categories, since I have basically covered them
already. This is not a comprehensive list and you can certainly go into more
detail here, but I will leave that to someone else.
But apart from this feedback we can also expect to find some bugs that can be
hard for game testers to find.
Here I will list a number of categories of bugs that I think can be found during
live user test, which are hard to find for game testers even if they perform
Bugs triggered by a large number of simultaneous users over different
o As I explained in the reliability test section above
Intermittent bugs
o Bugs that only show a certain % of the time and cannot be easily
Combinatorial bugs
o Bugs that only appear for specific combinations of factors
o If you have completed a number of quests in a certain order, and
you take a specific action, then a bug appears
Interoperability bugs
o Users have different hardware and software platforms, and
o All combinations of platforms and accessories are impossible to
cover for game testers, but with thousands of users, you will get a
o Connectivity and network configurations could also play a role
As you may notice, I did not list interoperability testing as something that should
be planned for a live user test, but I still list interoperability bugs as something
you can expect from one. My reason for this is that game testers should secure
interoperability before the live user test, but as I wrote above it will still be
difficult for them to cover all the different combinations. So I see interoperability
bugs as a “fortunate” side effect of a live user test, but not as something that you
can just ignore before this phase. Basic interoperability problems should be
found long before the live user test starts. The same goes for intermittent bugs
and combinatorial bugs. You should plan for and try to find these types of bugs
earlier, but a live user test helps you uncover those that could be impossible
for a small group of game testers to find.
You should not use live user tests instead of professional game testers; they
should complement each other. Of course there could be, for example, financial
constraints that force you to move into live user testing too early, but from a pure
testing perspective, this is my view.
I believe that there are many hidden costs in starting a live user test too early.
Not only will the users not enjoy a bug-riddled game, which may result in
negative feelings about the game, but they will also flood the developers with
feedback, which needs to be analyzed. Imagine thousands of bug reports that
all have the same simple core problem. Analyzing all of them will consume a lot of
time and resources for something that could have been reported by a single
game tester. Bug reports from live user tests are seldom as accurate and easy to
analyze as similar reports from game testers.
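One way to reduce that analysis cost is to group incoming reports by a normalized signature before anyone reads them. The sketch below assumes each report is a short free-text line from a player. The normalization rule (lowercase letters only, digits and punctuation dropped) is a deliberately crude illustration; a real triage pipeline would more likely key on crash stack traces or build identifiers.

```python
import re
from collections import defaultdict
from typing import Dict, List


def signature(report: str) -> str:
    """Reduce a free-text report to a crude key: lowercase words, no digits or punctuation."""
    words = re.findall(r"[a-z]+", report.lower())
    return " ".join(words)


def group_reports(reports: List[str]) -> Dict[str, List[str]]:
    """Bucket reports that share the same signature."""
    buckets: Dict[str, List[str]] = defaultdict(list)
    for report in reports:
        buckets[signature(report)].append(report)
    return dict(buckets)


# Made-up reports. Dropping digits merges "level 3" and "level 12" into
# one bucket, which may or may not be the same underlying bug.
reports = [
    "Crash on level 3!",
    "crash on level 12",
    "CRASH on level 3",
    "Shop button does nothing",
]
for key, bucket in group_reports(reports).items():
    print(len(bucket), key)
```

Even a rough grouping like this lets a tester read one representative report per bucket instead of every duplicate.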
I think live user testing is a very valuable tool if wielded correctly. But it is
not a silver bullet for all your testing problems.
Obviously you can dive much deeper into most of what I have discussed. There are
data analysts, game designers and game developers who I am sure have much to say
about different aspects of live user testing, but in this article I have tried to
focus on the testing perspective, and these are my views at the moment.