Making a game "Just Right" through testing and play balancing


James C. Smith from Reflexive explains how testing and play balancing help perfect a game.

  1. Making a Game “Just Right” Through Testing and Play Balancing
     James C. Smith, Co-founder / Producer, Reflexive Entertainment
     CGA Seattle 2008, July 23-25, 2008
  2. About James C. Smith (in 60 seconds or less)
     • Co-founded Reflexive Entertainment in 1997
     • Producer (vision holder) & lead programmer
         • Ricochet Xtreme
         • Ricochet Lost Worlds
         • Big Kahuna Reef
         • Big Kahuna Words
         • Big Kahuna Reef 2
         • Ricochet Infinity
         • Build in Time
     • Editor in Chief:
  3. World Map (agenda)
     • List types of testing, their goals, and methods
     • Usability Test
     • Play Balancing
     • Q&A / Resources
  4. Types of Testing
     • Focus Testing
     • Usability Testing
     • Play Balancing
     • Bug Testing
     • Compatibility Testing
  5. Types of Testing
     • Types of Testing
         • Focus Testing
         • Usability Testing
         • Play Balancing
         • Bug Testing
         • Compatibility Testing
     • Differences
         • Different goals
         • Done at different times
         • May need different types of testers or different interaction with testers (in person vs. remote)
     • May end up combining them
         • You should still think of their goals independently
  6. BETA testing
     • Ambiguous term
     • Means many different things to different people
     • I never use this term in formal discussions
     • Informally, it often refers to any or all of the kinds of testing we will discuss today
  7. Focus Testing
     • Goal: See if people like the game mechanic, theme, and style
     • When: As early as possible. Before the game is finished, or even when it has hardly been started
     • Can accomplish a lot with no game implementation by using mocked-up screenshots, storyboards, and paper
     • Can accomplish more with a prototype, even if there is no tutorial and hardly any levels are finished
  8. Usability Testing
     • Goal: See if the players understand all the features of the game
     • When: After the tutorial and the early levels are finished
     • How: Watch people play for 1 hour
     • More on this later
  9. Play Balancing
     • Goal: Figure out which levels are too hard or too easy, and which items are too expensive or too powerful
     • When: “Gameplay feature complete”
         • After all the levels and items are made
         • Every part of the play mechanic works
         • Meta game structure and shell features (trophies, story screens, configuration menus, and maybe even tutorials) may be incomplete
     • How: Have off-site people play an instrumented build all the way through the whole game. Collect data, analyze, and adjust game settings. Repeat.
     • More on this later
  10. Bug Testing
      • Goal: Find features of the game that don’t work as intended
      • When: After the game is feature complete
          • (everything is implemented, including meta structure and shell)
      • How: Internal QA staff, contract testing company, or volunteer community of game players
      • Tools Reflexive uses:
          • Bug tracking: Bugzilla
          • Forums (phpBB, vBulletin, …)
          • Video recording software: CamStudio
          • Volunteers from customer base (same people who did the play balancing)
  11. Compatibility Testing
      • Goal: Make sure the game works with every kind of hardware and software imaginable
      • When: Could be as early as when the engine is stable and 90% of content is in. May wait until the game is feature complete.
      • How: In-house lab, contract testing company, or feedback from beta testers
      • What Reflexive does:
          • Reuse a mature framework, battle tested for years
          • Rely on reports from play balancing testers and bug testers
  12. Types of Testing - Review
      • Focus Testing
      • Usability Testing
      • Play Balancing
      • Bug Testing
      • Compatibility Testing
  13. Schedule: Naive Schedule
  14. Schedule: Better Schedule
  15. World Map (agenda)
  16. Usability Testing
      • Goal: See if the players understand all the features of the game
      • When: After the tutorial and the early levels are finished
          • Don’t wait for the game to be finished
      • How:
          • Tester plays the game for about 45 minutes
          • Moderator watches and takes notes
          • Tester answers a survey
  17. Usability Testing – The Tester (player)
      • Requirements
          • Must be players who haven't played this game before
          • Should be a “casual” player
          • Really needs to be done in person
      • Can’t be
          • Remote customers
          • Development team members
      • Candidates (from easy to hard, worst to best)
          • Employees not on the team
          • Family and friends
          • Random people off the street
          • Customers who happen to be local
  18. Usability Testing – Neutral
      • Don’t tell the tester that you made the game
          • People tend to be polite to the creator
      • Be careful not to ask leading questions
          • Wrong: Isn’t this level fun?
          • Wrong: How do you like this level I made?
          • Correct: Would you say this level is boring or fun?
      • Tell the player that you are not testing them
          • If the player can’t figure out what to do, then the game designer failed, not the player
      • The player is never wrong or stupid
          • It doesn’t matter that the answer is flashing in their face. If they don’t see it, then you need to change something
  19. Usability Testing – The Test
      • Watch the tester play the game and don’t help her in any way
      • Encourage the tester to think out loud and even ask questions, with the understanding that they will not be answered
      • Take notes
      • Video recording is also preferable
          • Record the screen and the player if you can
  20. Usability Testing – Exit Survey
      • Exit survey questions
          • What did you like or dislike?
          • What part was too hard or too easy?
          • Do you know what feature X does?
          • Explain how to use feature Y.
          • What part was most confusing?
      • Often the survey won’t reveal what the player really got stuck on. That is why you watch and take notes
      • Other times the survey will reveal that your notes were wrong
          • Wik Story
  21. Lessons Learned About Game Design - Usability
      Tutorials are hard and need lots of testing
      • What you think is perfectly clear… never really is
          • Test it on real players, adjust, and then test again
      • Optional moves (like combos) are hardest to teach
      • Forcing a move doesn’t teach it
          • Players often read it, and then do it, and still don’t get it (Big Kahuna Net)
      • Adding more text is usually the wrong solution
      • Visual effects can help show cause and effect
      • Don’t add a feature you can’t teach
  22. GDC Session by User Research Engineers from Microsoft
      • Do-It-Yourself Usability: How to Use User Research to Improve your Game
  23. World Map (agenda)
  24. Playbalancing Overview
      • Adjust difficulty of every level & item in the game
      • Recruit 50+ testers
      • Give them a full version of the game with special instrumentation to collect a playlog
      • Collect and analyze playlogs
  25. When to start Playbalancing
      • When the game is “gameplay feature complete”
      • Needed
          • Every gameplay feature done
          • All upgrades and items implemented
          • Every level complete
      • Not needed
          • Meta game structure (stories, trophies)
          • Final art
          • Fixing all bugs, performance, or compatibility issues
  26. Recruit Testers for Playbalancing
      • Can be located anywhere in the world
      • Only the first play-through is accurate. A replay by the same person is invalid.
      • Cannot be QA staff or a contracted testing company
      • Should be a sampling of the target audience of your game
      • Contact your customers (or players of other casual games)
          • Post in forums
          • Mention in e-mail newsletter
      • Give them the full version of the game (not a 60-minute trial)
      • First build should be given to a limited subset of your testers so the rest are “fresh”
  27. Creating a PlayLog File
      • Special version of the game logs player actions
      • Usually a simple text file with one line per level played
      • Comma-separated values (CSV) is easy to write and easy to import into Excel
  28. Sample Playlog
  29. Playlog Fields (player behavior on level)
      • Level number
      • Status (completed, failed, aborted)
      • Seconds played
      • Score / revenue
      • Optional goals achieved
      • Upgrades purchased
      • Items used
      • Mouse clicks
      • Invalid moves
      • Hints used
  30. Playlog Fields (universal)
      • Player name
      • If first time
      • Minutes running game
      • Build number
      • Player created with build number
      • Avg. FPS
      • IP address
      • Date & time
      • Hardware info (RAM, CPU, video resolution)
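
The one-line-per-level CSV idea from slides 27 to 30 can be sketched roughly as follows. This is a minimal illustration, not Reflexive's code: the column names in `FIELDS` and the function `append_level_record` are hypothetical, and only a subset of the fields listed above is shown.

```python
import csv
import io

# Hypothetical column names, adapted from a subset of the fields on the slides.
FIELDS = [
    "player_name", "first_time", "build_number", "level", "status",
    "seconds_played", "score", "hints_used",
]

def append_level_record(log_file, record):
    """Append one CSV row for a played level; write the header if the log is empty."""
    writer = csv.DictWriter(log_file, fieldnames=FIELDS)
    if log_file.tell() == 0:
        writer.writeheader()
    writer.writerow(record)

# Demo with an in-memory buffer standing in for the playlog file.
log = io.StringIO()
append_level_record(log, {
    "player_name": "tester42", "first_time": "yes", "build_number": 7,
    "level": 1, "status": "completed", "seconds_played": 95,
    "score": 1200, "hints_used": 0,
})
print(log.getvalue())
```

Keeping every row self-describing (player, build, first-time flag) means individual log files can later be merged into one master sheet without losing track of where a row came from.
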
  31. Collect Playlog File From Testers – E-Mail (Used for Big Kahuna Reef & Ricochet Games)
      • Ask the player to e-mail the file as an attachment
          • Pro
              • No implementation on your part
              • Players know what is being tracked
          • Con
              • Most players won’t do it
              • Lots of files for you to manage
      • Usually the testing coordinator will import each log file into one master Excel document or database
  32. Collect Playlog File From Testers – FTP (Used for Wik and Mosaic)
      • Game uses a simple FTP batch file to upload the playlog to a server
          • Pro
              • Simple implementation on your part
              • Players can find out what is being tracked
          • Con
              • Lots of files for you to manage
      • Usually the testing coordinator will import each log file into one master Excel document or database
  33. Online Playlog using HTTP (Used for Build in Time)
      • After each level, send one record to the server using an HTTP POST
      • Also save a copy in a local PlayLog.CSV file for curious testers
          • Pro
              • Real-time data as the game is being played
              • All data in one database rather than a separate file from each user
              • Data from EVERY player
              • Can also handle disabling an old build
          • Con
              • Requires more work to set up, but not much
  34. Online Playlog using HTTP - Details
      • Game engine collects all playlog fields during level play
      • At the end of the level, all fields are sent to a web server using an HTTP POST
          • Just like a form post in a browser
          • libcurl can help with this
      • The server can be programmed using simple web frameworks and languages like PHP
          • A trivial PHP script can capture all the posted form fields and save them in a MySQL database
      • Retrieving the data from MySQL can be made simpler with another trivial PHP program that outputs the database as CSV
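
As a rough sketch of the client side described on slides 33 and 34, the snippet below encodes one level record as a browser-style form post. It uses Python's standard library rather than libcurl, and the endpoint URL and function names are assumptions made for illustration.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint; substitute the URL of your own playlog script.
PLAYLOG_URL = "https://example.com/playlog.php"

def build_playlog_request(record, url=PLAYLOG_URL):
    """Encode one level record as a form post, like a browser form submission."""
    body = urllib.parse.urlencode(record).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

def send_playlog_record(record, url=PLAYLOG_URL, timeout=5):
    """Send the record; return True on HTTP 200 and False on any network error."""
    try:
        request = build_playlog_request(record, url)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # urllib's URLError and HTTPError are OSError subclasses
        return False

# Inspect the encoded body without touching the network.
demo = build_playlog_request({"level": 3, "status": "failed", "seconds_played": 140})
print(demo.data)
```

Returning `False` instead of raising lets the game carry on when a tester is offline; the local CSV copy mentioned on slide 33 would still hold the record.
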
  35. Reflexive’s Playlog Tools
      • Source code available /playlog
          • Not intended to be a turnkey solution
          • It’s a huge head start when rolling your own
          • Requires a web host with PHP and MySQL
      • Makes no assumptions about what fields are in your playlog
          • Automatically adds new columns to the SQL table
      • Contents (about 300 lines of PHP code)
          • playlog.php collects posted fields and stores them in the DB
          • download.php dumps the DB as a CSV file for Excel
          • gateway.php handles beta authorization
          • dbconnect.php holds the MySQL password
      • Or roll your own in Ruby or whatever
          • Should be trivial for any experienced web developer
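
The "makes no assumptions about fields" idea can be illustrated with a short sketch. This is not Reflexive's PHP code: it swaps in Python and SQLite (instead of PHP and MySQL) so it runs anywhere, and the table name `playlog` and function `store_record` are hypothetical.

```python
import sqlite3

def store_record(conn, record):
    """Insert one posted record, adding any column the table has not seen yet."""
    if not record:
        raise ValueError("empty record")
    conn.execute("CREATE TABLE IF NOT EXISTS playlog (id INTEGER PRIMARY KEY)")
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(playlog)")}
    for name in record:
        # Field names come from the client, so reject anything that is not a plain identifier.
        if not name.isidentifier():
            raise ValueError(f"unsafe field name: {name!r}")
        if name not in existing:
            conn.execute(f'ALTER TABLE playlog ADD COLUMN "{name}" TEXT')
    columns = ", ".join(f'"{name}"' for name in record)
    placeholders = ", ".join("?" for _ in record)
    conn.execute(
        f"INSERT INTO playlog ({columns}) VALUES ({placeholders})",
        [str(value) for value in record.values()],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
store_record(conn, {"level": 1, "status": "completed"})
store_record(conn, {"level": 2, "status": "failed", "hints_used": 3})  # new column appears
print([row[1] for row in conn.execute("PRAGMA table_info(playlog)")])
```

Storing everything as text keeps the sketch simple; a spreadsheet import will convert numeric columns on its own.
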
  36. Analyzing Playlog
      • Excel, Access, or any spreadsheet or database
      • Averages per level
      • Averages per player
      • PivotTable (or Crosstab) is your best friend
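
For readers without a spreadsheet at hand, the PivotTable idea can be approximated in a few lines. The sketch below counts attempts per level and status; the column names `level` and `status` and the sample rows are made up for illustration, not taken from a real playlog.

```python
import csv
import io
from collections import Counter, defaultdict

def crosstab_status_by_level(csv_text):
    """Count rows per (level, status): levels as rows, statuses as columns."""
    table = defaultdict(Counter)
    for row in csv.DictReader(io.StringIO(csv_text)):
        table[int(row["level"])][row["status"]] += 1
    return table

def failure_rate(counts):
    """Share of attempts on a level that did not end in completion."""
    total = sum(counts.values())
    return (total - counts["completed"]) / total if total else 0.0

# Made-up sample data in the one-line-per-level format.
SAMPLE = """level,status,seconds_played
1,completed,60
1,completed,75
2,failed,120
2,aborted,30
2,completed,200
"""

table = crosstab_status_by_level(SAMPLE)
for level in sorted(table):
    print(level, dict(table[level]), round(failure_rate(table[level]), 2))
```

A level whose failure rate stands far above its neighbors is a candidate for being too hard; one that nobody ever fails may be too easy.
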
  37. Make adjustments and test again
      • Use a new build number that is tracked in the playlog file
      • Disable the old build if you have on-line validation
      • Get “fresh” testers for the new build, but also let the original testers replay on the new build
      • Don’t balance for the experienced players. Look at averages for first-time players
      • This is where it is important to know which logs are from replays
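
Comparing builds while ignoring replays could look something like this. The record keys (`build`, `level`, `first_time`, `seconds`) and the sample numbers are hypothetical; the point is only that replays are filtered out before averaging.

```python
from collections import defaultdict
from statistics import mean

def average_seconds_first_time(records):
    """Average seconds per (build, level), using first-time play-throughs only."""
    buckets = defaultdict(list)
    for rec in records:
        if rec["first_time"]:  # replays by experienced testers are skipped
            buckets[(rec["build"], rec["level"])].append(rec["seconds"])
    return {key: mean(values) for key, values in buckets.items()}

# Made-up records: level 5 on two builds, with one replay mixed in.
RECORDS = [
    {"build": 1, "level": 5, "first_time": True, "seconds": 300},
    {"build": 1, "level": 5, "first_time": True, "seconds": 500},
    {"build": 1, "level": 5, "first_time": False, "seconds": 90},
    {"build": 2, "level": 5, "first_time": True, "seconds": 240},
]

for (build, level), seconds in sorted(average_seconds_first_time(RECORDS).items()):
    print(f"build {build}, level {level}: {seconds} s on average")
```
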
  38. Using External Testers
      • Clearly label the game as beta and give it a kill date
          • All Reflexive beta builds “expire” 21 days after they were compiled
          • Each time you launch the game, it pops up a message box with the number of days until expiration
      • On-line validation is even better (makes sense when using on-line playlog)
          • Beta game pings the playlog server before allowing the player to start a game
          • Verifies that the player’s firewall is configured to allow the playlog to work
          • Allows the server to deny the game from running if it is an out-of-date beta
      • Wrapping the game in DRM may help
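
The kill-date check is simple date arithmetic. A minimal sketch, assuming the build date is baked into the executable at compile time (the function names are hypothetical; the 21-day lifetime is the figure from the slide):

```python
from datetime import date, timedelta

BETA_LIFETIME_DAYS = 21  # per the slide: builds expire 21 days after compilation

def days_until_expiration(build_date, today):
    """Days left before the beta expires; zero or negative means it has expired."""
    return (build_date + timedelta(days=BETA_LIFETIME_DAYS) - today).days

def may_launch(build_date, today):
    """Allow the game to start only while the beta is still valid."""
    return days_until_expiration(build_date, today) > 0

# Example launch check for a build compiled on 1 July 2008.
build = date(2008, 7, 1)
today = date(2008, 7, 15)
if may_launch(build, today):
    print(f"Beta build: {days_until_expiration(build, today)} days until expiration")
else:
    print("This beta build has expired.")
```

A local clock check like this can be defeated by changing the system date, which is one reason the slide calls on-line validation even better.
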
  39. Lessons Learned About Testing Process
      • Automate playlog collection as much as possible
      • Make all test builds expire
          • Final test build of BKR2 did not
      • On-line validation is even better than expiration
      • Game should ask the player…
          • to enter their FORUM user name
          • if they have played before
      • Log file should include answers to the above, build number, and FIRST build number used
      • Logging failed levels is not enough; be sure to log aborts
      • Reserve some testers for the 2nd or 3rd play balancing build
  40. Playbalancing - Data
  41. “Speed” Levels (Build in Time Play Log Analysis)
  42. Too Easy (Build in Time Play Log Analysis)
  43. Too Difficult (Build in Time Play Log Analysis; chart series: New Version, Old Version)
  44. World Map (agenda)
  45. Resources
      • James C. Smith: [email_address]
      • Reflexive’s Playlog Tools: /playlog
      • libcurl (HTTP library)
      • Surveys: www.SurveyMonkey.com
      • Forums: www.vBulletin.com, www.phpbb.com
      • Bug Tracking: www.Bugzilla.org
      • Video Recording: www.CamStudio.org
      • Market Research: www.CasualCharts.com
      • GDC Session: Do-It-Yourself Usability