Report on prototype construction and playtesting of an educational card game, "Versus". Done by Danny Fain in 2013 as part of a project in MSU online course "Foundations of Serious Games".
Versus playtest slides
Slide 1 of 15
Danny – Epic Quest 3
Versus prototype and playtest
“Big Picture” Serious Goals:
Practice and develop/improve some Critical Thinking (CT) skills:
For “contestant” players:
• providing explanations and arguments to support or oppose a claim
• understanding multiple viewpoints on an issue
For “judge” players:
• analyzing information, especially source bias and credibility
• evaluating the logic of claims, especially correlation vs. causation
Target Audience and Context:
• Adolescents, ages 10 – 18
• Informal educational setting, e.g. afterschool club or community center
Professor Carrie Heeter, Michigan State University
Slide 2 of 15
Prototype Serious Goals:
For “contestant” players:
• select and provide evidence to support or oppose a claim, consistent with assigned role
For “judge” players:
• analyze claims and evidence, especially source bias and credibility
• evaluate consistency of role – claim – evidence presented
Playtest Questions:
Player characteristics:
1. Does the game work differently for different genders, or between homogeneous and heterogeneous
gender groups?
2. Does the game work for younger players? (ages 9 – 11)
3. Does the game work for players with language disabilities? (reading, receptive, expressive)
4. Effect of cultural IQ and current CT competence?
Slide 3 of 15
Playtest Questions (continued):
Game mechanics:
5. Adequate usability with minimum of instruction?
6. Sufficiently clear and simple to avoid confusion?
7. Can judge retain evidence info? Is note-taking necessary and practical?
Content:
8. Can players make sense of the presented info (claims & evidence)?
9. Accessible to players with moderate language disabilities?
10. How does visual presentation of info in Claim and Evidence cards influence player
selection? (text length, bolding/highlighting, source description, quality rating)
Dynamics and Parameters:
11. Effect of player relationships? (know each other well or not)
12. Is play balanced? Could any of the 3 sides win?
13. Effect of chance vs. strategy?
14. Reasonable playing duration for each round?
15. Is it fun and engaging?
Slide 4 of 15
Photo captions: Judge’s cards (Topic/Roles, Quality ratings, Sources chart);
Contestant’s sample Claim cards (from deck of 10);
sample Evidence cards (from deck of 23), with placeholders.
Slide 5 of 15
Materials and Game Rules
Materials:
Coin or 6-sided die
Contestant: Claim cards; Evidence cards; placeholders for each contestant’s Evidence cards.
Judge: Topic/Role cards; Quality rating cards; chart showing Quality ratings of sources; notepaper & pencil.
Participants:
By mutual consent, one player is the judge. Other players divide into 2 teams of contestants.
Rules:
In the first round, each contestant team is dealt, face down, 3 Claim cards and 5 Evidence cards (facilitator
ensures at least 1 Evidence card for each of 4 possible Topics). The judge randomly draws a Topic card,
reads it aloud, and chooses roles to assign to each contestant team. Each team may choose to offer to
trade one or more Claim or Evidence cards to the opposing team (for an equal number), without showing
them to the judge. The judge rolls the die or flips the coin to determine which team first plays a Claim; that
team deliberates, then plays a Claim card face-up and 1 – 3 Evidence cards face-down, declaring side of
Claim (pro/con). The other team may choose to either play the opposite side of this Claim or play an
alternate Claim card; then must play face-down 1 – 3 Evidence cards (but not more than first team put
down). The first team reads aloud, from each played Evidence card, whichever pieces of info it chooses
(possibly including the source); the judge may take notes. If the team has played more than 1 Evidence card,
the judge may ask the team to read aloud the remaining info (including source) from one particular Evidence
card. The second team then follows the same process, reading aloud selected info from its played Evidence cards.
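The setup rule above (each team gets 5 Evidence cards, with at least 1 for each of the 4 possible Topics) can be sketched in Python. This is a minimal illustration only, not part of the original prototype materials; the topic names and card representation are hypothetical:

```python
import random

# Hypothetical topic names; the prototype has 4 possible Topics.
TOPICS = ["Topic A", "Topic B", "Topic C", "Topic D"]

def deal_evidence(deck, hand_size=5):
    """Deal one team's Evidence hand so it covers all 4 Topics.

    deck: list of (topic, card_text) tuples; assumes at least one
    card per topic remains. Mutates deck; returns the dealt hand.
    """
    random.shuffle(deck)
    hand = []
    # First guarantee at least 1 Evidence card for each possible Topic...
    for topic in TOPICS:
        card = next(c for c in deck if c[0] == topic)
        deck.remove(card)
        hand.append(card)
    # ...then fill the remainder of the hand at random.
    while len(hand) < hand_size:
        hand.append(deck.pop())
    return hand
```

In play this dealing is done face-down by the facilitator; the sketch only shows the topic-coverage constraint.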
Slide 6 of 15
Playtest Instructions and Game Rules (continued)
After listening to both teams’ evidence, the judge may place up to 3 Quality rating cards on any Evidence
cards that have been played. Then the judge may award 1 point for “Role Authenticity” to either/both teams,
and 2 points for “Most Convincing” to the more persuasive team (but if the second team chose to play an
alternate Claim, it can only earn 1 point for “Most Convincing”). All the played Evidence cards are revealed.
For each one on which the judge played a Quality rating card: if its quality value (0 – 3) matches the one
shown on the judge’s Quality rating card, the judge gets a point. If the total of the “Most Convincing” team’s
Evidence quality points is lower than the total for the other team, the “Most Convincing” team gets a bonus
point; if the “Most Convincing” team’s Evidence quality points total is higher, the judge gets a bonus point.
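The scoring procedure above can be summarized as a short Python sketch. This is an illustration of the rules as described, not part of the original materials; the team labels and data shapes are hypothetical:

```python
def score_round(played_evidence, judge_ratings, most_convincing, other_team,
                authenticity_awards, mc_played_alternate_claim):
    """Sketch of one round's scoring, per the rules above.

    played_evidence: {team: [true quality values (0-3) of its played cards]}
    judge_ratings: {(team, card_index): quality guessed by the judge}
                   (the judge places at most 3 Quality rating cards)
    mc_played_alternate_claim: True if the Most Convincing team was the
                   second team and chose to play an alternate Claim.
    """
    points = {"judge": 0, most_convincing: 0, other_team: 0}

    # "Role Authenticity": 1 point to either/both teams, at judge's discretion.
    for team in authenticity_awards:
        points[team] += 1

    # "Most Convincing": 2 points, reduced to 1 for an alternate-Claim play.
    points[most_convincing] += 1 if mc_played_alternate_claim else 2

    # The judge scores a point for each correctly guessed Quality rating.
    for (team, idx), guess in judge_ratings.items():
        if played_evidence[team][idx] == guess:
            points["judge"] += 1

    # Bonus point: compare the two teams' evidence-quality totals.
    mc_total = sum(played_evidence[most_convincing])
    other_total = sum(played_evidence[other_team])
    if mc_total < other_total:
        points[most_convincing] += 1   # persuaded despite weaker evidence
    elif mc_total > other_total:
        points["judge"] += 1
    return points
```

The two branches at the end encode the bonus rule: the persuasive team is rewarded for winning with lower-quality evidence, and the judge is rewarded otherwise.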
Each player is verbally surveyed with the following questions:
“Are the rules clear or confusing?”, “Is playing the game hard, medium, or easy?”, “Is the info on the cards
easy or hard to understand?”, “How fun is this game, on a scale of 1 – 5 (1 = torture, 5 = extremely fun)?”,
“Do you feel like playing another round?”
Second round:
The already-played Claim card(s) and Evidence cards are discarded; each team is dealt replacements. Play
continues as in the first round. If possible, a third round is played. After the final round, the winning player is
the one with the highest number of points (the judge or one of the contestant teams).
After surveying each player again with the same questions as above, additional comments and suggestions
are solicited from each player.
Slide 7 of 15
Playtester Recruitment and Description
Recruitment and Logistics:
Due to the ambitious set of playtest variables and questions, I decided I needed 3 playtest sessions: one at
school with my students, and two with family/friends/neighbors. The rules of Versus require at least 3
players in each session; I was ultimately able to recruit a total of 10 playtesters, which entailed considerable
effort. For the student group, logistics proved a bit tricky: short class periods and some minor disruptions
necessitated playtesting over 2 days (one round of play each day). Scheduling the other playtest groups
was difficult, mainly in reconciling my schedule with those of busy adolescents. As an incentive for the time
commitment, I offered each family/friend/neighbor participant a $5 gift card for completing the playtest.
Playtester Descriptions:
Each playtester was invited to choose their own pseudonym: 9 complied, and I assigned one to the 10th.
Student group (at school):
Three boys with language-based learning disabilities: “Jacob” (age 16), “Junior” (age 18), “Fred” (age 17).
Neighbor/family group:
Two boys and two girls: “Max” (age 10), “Thomas” (age 10), “Olivia” (age 15), “Lily” (age 13).
Friends group:
Two boys and one girl: “Krabs” (age 12), “Percy” (age 13), “Hermione” (age 9).
Slide 8 of 15
Neighbor/family group, middle of round 2.
The judge is seated in center (facing camera).
Student group, during round 2. Judge is
at extreme right (in red shirt).
Slide 9 of 15
Playtesting Methodology Variations:
• In the student group playtest, between the first day (round 1) and the second day (round 2): I added some
visual aids (evidence place-markers, assigned-role place-cards), reduced the amount of text in some of
the Evidence cards, and bolded some keywords in Claim and Evidence cards (to clarify topic association).
The visual aids were also used in the other group sessions.
• In the student group and neighbor/family group playtest sessions, I recorded observations in a narrative
style. For the friends group playtest session, I created a recording chart (based on the salient points of the
earlier sessions), which I filled in during the session; this made it easier to record more details efficiently.
• In the friends group playtest session round 1, I increased the number of cards dealt to 4 Claims and 8
Evidence (for each contestant), in order to increase relevant choices and opportunities for strategizing.
General Playtesting Observations and Results:
• Working by myself, it was hard to orchestrate the playtest & observe/record at the same time; it would
have been much easier with a dedicated observer to assist me.
• It took a long time (15 – 20 minutes) to explain the rules to each group at the start of the playtesting
session. Round 1 of play was slowed by players’ processing of rules and my reminders. Round 2 play
progressed faster & more smoothly.
• The contestants needed to simultaneously consider several factors, which made round 1 hard for most.
• The judge’s actions were brief; there was too much “down time” waiting for the contestants to act.
Slide 10 of 15
Results relating to Serious Goals:
Contestants:
• Demonstrated significant interpretation and communication of evidence in support of or against a claim,
and attempted to choose expressions consistent with the assigned role.
• It sometimes appeared that a contestant did not completely understand the claim or evidence being
provided, or its match with the perspective of the assigned role.
Judges:
• Some practice analyzing evidence for bias, and evaluating role + claim + evidence for consistency; easier
to assess when judge was asked to explain scoring decisions. Some attention to apparent bias, but little
interest in examining quality/credibility ratings of specific evidence sources.
Results relating to Key Questions: Summary of Findings
• Substantial instruction required at the start of each session, and coaching during round 1, made round 1
harder to play and less “fun”.
• After the first round, play duration was reasonable. Point balancing was pretty good, with no clearly unfair
advantages.
• Chance (in the dealt Claim & Evidence cards relevant to the assigned topic & role) was dominant,
restricting the contestants’ opportunities to strategize.
• Players at the low end of age, language ability, cultural IQ & CT competence had more difficulty in
understanding the rules and card info, as expected; however, they were still able to complete 2 rounds of
play and report a favorable “fun” rating.
Slide 11 of 15
Detailed Results for each Key Question:
Player characteristics:
1. Gender variation: 2 heterogeneous groups (2 Boys + 1 Girl; 2 B + 2 G), 1 homogeneous group (3 B).
Average survey ratings after 2 rounds:
Fun (scale 1 – 5): Boys = 4, Girls = 4; Rules: B = clear, G = clear; Info understandable: B = med/easy, G
= easy/med; Play mechanics: B = easy, G = easy/med
Conclusion: no significant gender differences in survey results.
2. Age variation: 3 players at low end (ages 9 – 10), 4 players in mid-range (ages 12 – 15), 3 players at
high end (ages 16 – 18).
Low end had slightly more difficulty understanding card info (ave. survey rating “medium”), and more
variation in fun rating (from 3 to 5). No consistent differences between age groups in other assessments.
3. Language disabilities: 3 players at low end (student group) required more coaching in rules & flow.
Survey results & comments mixed, but consistent with expectations: rule clarity & card-info understanding
varied, as did fun rating (3 – 5, ave. 4). All three were willing to play another round (in classroom setting).
4. Cultural IQ & CT competence: informally assessed: 4 players at low end, 6 players at high end.
Low-end players had more difficulty with understanding & following rules & card info, and slightly lower fun
ratings (average 3.5). Surprisingly, all the low-end players expressed willingness to play another round,
while the high-end players’ responses were mixed.
Slide 12 of 15
Detailed Results for each Key Question: (continued)
Game mechanics:
5. Instruction requirement: 10 – 15 minutes of rules instruction was necessary before starting round 1, and
some reminders & clarifications had to be provided during round 1 (especially with the student group).
This elicited lower “fun” ratings and more “hard to play” survey responses after round 1.
6. Clarity: All the players with language disabilities and most of those with low-end CT competence reported
some confusion in round 1; improvement shown for most in round 2 (except some of the student group).
7. Judge retention of info: Despite declining to take notes, all the judge players appeared to retain info
sufficiently to make confident judgments on authenticity, persuasion, and quality of evidence.
Content:
8. Player interpretation of card info: Language-impaired players, and those with lower cultural IQ, had
some difficulty interpreting the info. Most other players seemed able to understand adequately, with
survey responses in “medium” to “easy” range.
9. Accessibility for language-impaired players: Contestants needed help with some vocabulary; one
contestant needed most of the cards read aloud to him before selecting.
10. Card visual presentation: Didn’t record enough data to analyze this in a meaningful way.
Slide 13 of 15
Detailed Results for each Key Question: (continued)
Dynamics & Parameters:
11. Player relationships: In each group, the players knew each other fairly to very well. Closeness of relationship appeared
to make the judge’s assignment of roles more intentional, but didn’t clearly affect verdicts (authenticity, persuasiveness).
12. Balance: Reasonably good: after 2 rounds, point totals ranged from 1 to 7, and point spreads ranged from 2 to 6; the
judge tied for the lead in one group. No players appeared discouraged by their point standing.
13. Chance vs. strategy: Chance seemed dominant in the Claim & Evidence cards played: the contestants’ choices were
narrowly constrained by the topic draw, role assignment, and the Claim & Evidence cards dealt. Some contestant strategy
was shown in choosing whether to trade cards (most players didn’t) and which of the Evidence info to read aloud. When
only 1 Evidence card was played by each side (majority of rounds), judge had no opportunity to make a disclosure request.
14. Round duration: The first round in each session took 15 – 25 minutes, mainly due to the need for explanation &
clarification of rules, plus contestants needing to read all dealt cards before playing them & before deciding what to read
aloud. The second round was quicker, 10 – 15 minutes: contestants only needed to read newly-drawn cards (replacing
already-played cards) and review previously-drawn cards. I would expect subsequent rounds to take about the same time.
15. Fun/engagement: No players opted to drop out, or appeared to “tune out”, during the playtest. The average surveyed
“fun” rating was 4 (though this is in comparison to other educational games). Most players indicated willingness to play
another round in the same setting, given more time.
Slide 14 of 15
Final Comments & Conclusions
Playtester Feedback & Suggestions:
Junior (student, age 18): “too much info, hard to follow”.
Jacob (student, age 16): “too many parts to the game, made it confusing”; “I don't know if I learned anything,
but I had fun”.
Olivia (neighbor/family, age 15): “give the judge more opportunities to get points”; “the game might be more
fun on the computer”.
Max (neighbor/family, age 10): “play longer; add more complexity; make the play more continuous”.
Some conclusions toward improving the prototype:
• Reduce the contestant mechanics and information-processing for round 1, to provide an easier and faster
“on-ramp” to the game; perhaps start with fewer dealt cards and fewer possible topics, then increase in
round 2.
• Increase contestant choices and opportunities for strategizing in round 2 (and later), by ensuring that
each contestant team is dealt multiple viable Claim and Evidence cards (for assigned topic & role).
• Give the judge more mechanics during each round, especially actions that involve analysis/evaluation;
require judge to explain scoring decisions, perhaps with use of guiding questions.
• Give the judge an incentive to consider (and perhaps challenge) the relative quality ratings for Evidence
sources.
• Add more Evidence cards from low quality-rating sources (0, 1).
Slide 15 of 15
Final Comments & Conclusions (continued)
Ideas to improve further playtesting:
• Use observation recording form more consistently, and employ an assistant to record observations.
• Formally assess the playtesters’ initial cultural IQ and CT competence with a pre-test. Use a post-test to
try to detect any improvements.
• Because some judge players are better able to retain and process information visually, figure out a way
for contestants to show the judge selected text on Evidence cards (and still conceal other text).
• Record and analyze how the visual presentation of information in Evidence cards affected player use (for
both contestants and judges); playtest variations in visual presentation.
• In each post-round survey, ask the judge to explain how he/she attempted to identify the quality rating of
the played Evidence cards; this might become a game mechanic, as it potentially contributes to a serious
goal of the game.
• Playtest with multi-player contestant teams, to see effects on game dynamics and serious outcomes.
• Playtest with more rounds of play, to see changes in game dynamics. This would require adding more
Evidence cards to the deck.
• Encourage & record guided discussion among the players at the end of the game.