A child's job is to play, we should let them...

Pamela Wong
Research Manager, Direction First
Introduction

There appears to be very little consensus and a shortage of research investigating effective research approaches and question types with children. Direction First has put the standard approaches to the test, along with using the latest technology from GMI to evaluate more modern approaches. We wanted to know which questionnaire scales gave better discrimination, and to determine whether the use of interactive and gaming scales would improve data quality by improving engagement.

Direction First has undertaken original research challenging the traditional approach of questioning children by creating audio and visually interactive, game-based techniques designed to answer 'traditional' objectives. Today's children live in a digital world, and we wanted to test whether online gaming methodologies maintained attention better and led to better quality data. There have been few studies that compare and measure the discrimination and engagement of different question types and methods. In this research we explored different question types and scales to understand which types enabled better discrimination and, ultimately, which question types were more engaging and provided better quality data.

We compared different question scales on over 500 Australian children between 7 and 10 years old in an online survey. The research was conducted in three stages, each containing an independent sample of participants. Children in each stage rated their liking of the same fifteen items on one scale before moving onto the next scale, until all four scales had been used.

Sensory Food Research on Children

Globally, the children's market is estimated to be valued at USD 1.3 trillion (Nairn, 2010). Children have much more autonomy and influence over household purchases than previous generations, to such an extent that today's youth are more likely to be described as consumers than as children (Geraci, 2004). The growth in the consumption power of children as consumers and influencers of family purchases, including household groceries, has been recognised as substantial business, and this has similarly led to growth in spending to find out what children want, why, and how best to market to them.

The children's market is notoriously challenging to research. Whilst children are being exposed to significantly more information and technology at a younger age, they still tend to have limited linguistic and numeracy skills, limited cognitive abilities and short attention spans. Because of this, they may only be able to participate and respond to research in limited ways unless techniques are adapted. For this reason, there are specialty companies and departments dedicated to conducting research with children.
In the food sensory research literature, it has been found that children have difficulty understanding and remembering instructions, interpreting abstract symbols or pictures, and completing tasks such as seriation (ranking in order of magnitude) and attending to multiple aspects, for example the texture and flavour of a food (Popper and Kroll, 2005, 2003). Younger children tend to focus on a single aspect of a product, without attending to other aspects (Fliegelman et al, 2004).

Children develop linguistic, literacy and numeracy skills at different rates, and there is such tremendous variation in these skills among children of the same age (variations of up to 4 years) that some researchers believe school grade may be a better determinant of skills and abilities among children than age alone (C&R Research, 2009). The changing vernacular of each generation of children is of particular importance to researchers, as it affects the language with which we communicate with children. Language needs to be familiar, child-friendly and suitable to the age group, but children often aspire to be older and look up to children who are older than them. So while it is important to keep things simple and familiar, children must not feel that everything has been dumbed down for them. This also applies to themes and imagery.

When asked whether they like something, children tend to respond positively for various reasons; that is, they are more likely to respond with positive descriptors than negative ones (Geraci, 2004). Children tend to rate new products and ideas positively because they are excited about novelty, and not necessarily because they really like the products. C&R Research addressed this issue by designing an unbalanced scale that made most responses sound positive, such as a five point scale labelled "love it", "like a lot", "like a little", "it's ok", and "don't like at all". This aimed to enable children to distinguish products that they really loved from those that were merely interesting because they were new. Winning concepts were believed to have clearly surfaced (Fliegelman et al, 2004).

Just as novelty can inflate children's liking of an idea or product, familiarity may also be a factor that falsely drives liking. Introducing unfamiliar foods to kids several times has been found to enhance liking of the product due to the "mere exposure effect" (Birch and Marlin, 1982). This has implications for researchers and companies introducing new products to market. Most sensory protocols expose a child only once to a novel food in small portions; however, Ubrick (2002) proposes that new foods may require repeated testing to assess the true potential of a product.

Popper and Kroll (2005) have emphasised the importance of considering cognitive and social factors that affect sensory food testing with children. Food preferences are influenced by the interplay of nature (e.g. innate preference for sweet tastes, aversion to bitter tastes) and nurture (e.g. parents, peers, and the environment). Peer influences can also have long-lasting effects on children's food preferences. Children's food choices may be affected by their desire to exercise control of themselves and to be viewed as older and more mature.
Changing societal influences have led to children maturing earlier, which has resulted in increases in the cognitive demands placed on them and in the processing skills needed to meet those demands (Chambers, 2005). Today, technology continues to create generations of child consumers that are exposed to more products, ideas and technology than previous generations. Not only are children growing up with more media and entertainment options to choose from, but more media is being targeted directly to them than in previous generations. Multi-tasking while using various forms of technology (e.g. surfing the internet while watching TV) is enjoyed by most children. This lends support to our belief that children may be more capable of completing sophisticated questionnaires than we originally thought.

Questionnaire scales

Researching children requires procedures different from those routinely applied to adults, including psychological factors such as gaining confidence and trust and providing motivation, communicating in child-appropriate language, and using appropriate questionnaire scales (Schraidt, 2009). Specialised research methods, adaptations and techniques have been developed by various firms conducting research on children. One such firm is the Peyram & Kroll Research Corporation, which has published the bulk of the sensory food research on children and conducts a specialty practice in this field. P&K believe there is a consensus among the research community that children (as young as 5 years old) can discriminate, particularly in regard to expressing their degree of liking, which means they are able to indicate a degree of preference if the correct measuring techniques are used (Schraidt, 2009). There is little consensus in the literature, however, on the most effective techniques, question types and scales when conducting research on children.

Hedonic scales for food acceptance have been used widely in consumer testing. In Australia, different agencies use very different question types and scales for children, recognising the fact that children require special questioning techniques. Questionnaire scales used on children include face scales, star scales, line scales, and standard descriptor-type scales, amongst others (Figures 1-4).

Figure 1. Standard 9-point hedonic scale for adults (1 = Like extremely, 5 = Neither like nor dislike, 9 = Dislike extremely)
Figure 2. Facial scale for children

Figure 3. P&K scale for children (1 = Super good, 2 = Really good, 3 = Good, 4 = Just a little good, 5 = Maybe good or maybe bad, 6 = Just a little bad, 7 = Bad, 8 = Really bad, 9 = Super bad)

Figure 4. Star scale for children (one to nine stars, anchored 'Dislike a lot' to 'Like a lot')

Facial scales (Figure 2), which were designed to inspire closer attention to the scaling task, have continued to be popular, based on the rationale that children have limited reading and linguistic skills and cannot understand complex words or phrases. Whilst this scale continues to be used by some for conducting sensory research, it has been found to be less discriminating than other verbal scales and may introduce unintended bias. Children tend to respond to pictures based on the emotion that they show (a smiley face shows a happy person) rather than what they are supposed to represent (how the food makes you feel). Pictorial facial hedonic scales have been said to be ambiguous, as the face intended to show a degree of dislike can be interpreted by children as feeling angry, an emotion not usually experienced when thinking about food (Popper and Kroll, 2003; Cooper, 2002).
The P&K scale (Figure 3) is a child-oriented scale developed specially by Peyram and Kroll for use with children who are semi-literate (Popper and Kroll, 2005). This scale was reported to perform better than the standard hedonic scales and the smiley face scale. Whilst there are many merits to the application of the face scale, Kroll (1990) found that the face scale was less effective and less discriminating than hedonic ratings on the P&K scale.

No references were found in the literature on the star scale (Figure 4), but several specialists in food sensory research on children have recommended this scale above others, and it has been used by sensory research firms in Australia for many years. It has been said that children understand the star scale easily, as the stars resemble the grades or rewards they are given for good work at school. However, when using any scale it is important to emphasise that there are no right or wrong answers, to help children answer truthfully (Fliegelman et al, 2004).

Other researchers believe that because children cannot distinguish shades of meaning, asking any type of rating question on a scale is not useful, as children do not understand it. Simplified, finite scales such as "like it", "it's ok" or "don't like it" have been recommended for younger children (Fliegelman et al, 2004). Pair-wise approaches, where children choose their favourite option between two choices, have been reported as effective among very young children (Fliegelman et al, 2004). On a similar basis, a bifurcated approach, where children are first asked whether a food is "good" or "bad" before being asked whether it is "really good" or "really bad", was found to be effective for children under 7 years old (Kroll, 1990).

Kroll (1990) conducted a comprehensive study on children to compare various sensory questionnaire scales, scale lengths and the effectiveness of self-administered versus one-on-one interviews. In this study, the relative merits of the different rating scales that can be used in testing children were assessed. A standard hedonic scale, a face scale, a child-oriented scale (P&K) and paired comparison were used with children between 5 and 10 years. Findings showed that the P&K scale performed better than the standard hedonic or face scale in terms of discrimination. The use of a shorter scale (7 points as opposed to 9), under the hypothesis that it would offer simplicity, was not found to offer any advantages among children; the 9-point scale resulted in better discrimination and produced more reliable results than the 7-point scale. In one-on-one interviews, it has been hypothesised that children may respond positively in order to acquiesce, which provides a plausible reason for using self-administered questionnaires when possible. Children over 8 years old performed as well in self-administered questionnaires as in one-on-one interviews.

Sensory researchers agree that children are different to adults and require tailored research approaches. Guinard (2001) reported differences in sensory intensity (strength) thresholds between adults and children; however, these differences in perception may be more reflective of
differences in how children interpret questions and how they use intensity scales than of true physiological differences. This provides further support for the need to conduct more research in this area.

Respondent engagement

Respondent engagement in online research has been discussed extensively throughout the research industry. Common metrics of engagement include completion rates, survey time spent, verbosity of open-ended responses, consistency checks, fatigue and satisficing (doing just enough to complete a task) measures, and the ability of participants to follow instructions accurately. These measures have been said to be indicators of engagement, which ultimately determine completion rates, enjoyment and data quality.

SSI research revealed that, on average, survey response rates in the UK, France and the Netherlands collapsed dramatically from 30% in 2004 to 10% in 2009. Research was conducted to understand the effects of survey length and fatigue, and the subsequent effects on response quality (Cape, 2009). Fatigue or satisficing behaviour was hypothesised as an indicator of participants' lack of engagement, so researchers used various measures to investigate reasons for changes in survey behaviour since 2004. By positioning non-mandatory question scales, SSI measured rates of non-response. Data on drop-out rates, survey time spent, rates of satisficing, numbers of words typed in open-ended questions, and rates of answering falsely (in order to skip a section) were used as metrics to explain survey behaviour and as measures of data quality. The research indicated a critical limit of 20 minutes for surveys, after which engagement and data quality dropped.

Sleep and Puleston of Engage Research and GMI (2009) examined causes of boredom in online surveys. Various techniques were tested with the aim of improving data quality, including the use of visuals and animations, alternatives to grid questions, role playing, survey energisers and improved language, amongst others. Data quality measures were examined, including straight-lining, responses to open-ended questions and the ability to follow instructions accurately. The techniques applied resulted in a successful reduction in drop-out rates, increased time spent, higher volumes of data (open-ended responses, follow-on questions) and better quality data.

A substantial volume of research on improving engagement has been conducted on panels of adult respondents, who it seems are becoming bored with online surveys, a trend seen globally. So it seems reasonable to believe that, for children who have much shorter attention spans and more limited cognitive abilities, traditional "black and white" form surveys and research question scales designed for adults are not likely to be highly engaging.
Today's youth are becoming technologically savvy at a much younger age. In countries where all choices of media are available, children use between 4 and 6 media a day (e.g. TV, radio, internet and books), and often simultaneously (Solomon and Peters, 2005). It is believed that the ability to follow several topics more or less simultaneously, with attention switching from one medium to another, demands quite an advanced level of cognitive and memory coordination.

While there is a consensus that children can provide valuable information for marketers, there is little consensus on the extent to which survey design needs to be simplified to minimise confusion and capture accurate information. The hypothesis is that children need simplicity; however, many researchers have found evidence contrary to this belief.

Connecting with the most inter-connected generation of youth is not an easy task. In Australia, access to media is ubiquitous, and over 90% of children aged between 7 and 10 years spend between 30 and 60 minutes a day surfing the internet and using various types of media, often simultaneously (Direction First online survey, June 2010). This level of multi-tasking by children means that marketing messages need to be interesting and compelling, and the same applies to market research on children.

Australia has been described as a "Game Nation", and playing video and computer games (e.g. Figures 5-6) has become as popular as the internet and television. Whilst playing video games does not compete for time spent in non-media activities, it competes with the use of older media, and is increasingly becoming a more social activity (Brand et al, 2009). The enormous popularity of games and the high proportion of young gamers under 10 years old give us reason to believe that we need to find more ways of conducting research on young digital natives that capture their attention and are more enjoyable, interactive, immersive and engaging.
Figure 5. A single-player computer game of the past: Nintendo Tetris

Figure 6. A current massively multi-player online role playing game (MMORPG): Nintendo Wii, The Legend of Zelda
One of the most successful television-based educational-entertainment programs was Sesame Street, which first aired in 1969 after substantial academic scrutiny. Its creators turned what was considered a low-involvement, non-educational and non-interactive medium into an enormously successful teaching tool (Gladwell, 2001). Inspiration was drawn from educational psychology, television commercials and comedy sketches to improve numeracy and literacy skills among preschoolers, and the program was proven to improve viewers' reading and learning skills. Much in the same way that "edutainment" derived its parentage from educational psychology, advertising and entertainment to capture children's attention and teach during play time, researchers can draw from such techniques to make research more appropriate, fun and engaging for children and adults, whilst collecting better quality data.

Background

In June 2010, Direction First conducted an online study to investigate which question scales work best on children, and to determine whether interactive and gaming elements improved engagement. The main objectives of the research were to:

• Test a standard hedonic questionnaire scale against scales designed for children, to see which gave better discrimination power.
• Determine whether the use of interactive elements, or a combination of interactive and gaming scales, would improve data quality by improving engagement.
• Determine which of the scales and questionnaire formats was the most engaging, enjoyable and fun.

Over 500 Australian children aged between 7 and 10 years were invited to participate in the online study conducted in June 2010. The research was conducted in three stages, with each stage comprising an independent sample of participants. Children in each stage rated their liking of the same fifteen items on one scale before moving onto the next scale, until all four scales had been used. The order of the scales was randomised in a balanced block design to avoid positional bias (an illustrative rotation scheme is sketched below). Parents first completed a screening exercise, with children taking over once the screener was complete to undertake the survey.

'Warm-up' questions were asked at the beginning of each new scale to ensure respondents were aware that they had progressed onto a new scale. Scale experience questions were presented at the end of each scale to find out how much children enjoyed the experience and how easy it was for them. Consistency check questions were used to determine whether respondents were engaged and attentive at the beginning and end of the survey.
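The paper does not spell out the exact rotation used for the balanced block design. As a minimal sketch only, the Python below builds a cyclic Latin square over the four scales so that, across respondents, each scale appears equally often in each serial position; the scale labels and the assign_order helper are illustrative and not taken from the study.

```python
# Illustrative labels for the four scales tested (not the study's own identifiers).
SCALES = ["standard_9pt", "smiley_5pt", "pk_9pt", "star_9pt"]

def latin_square_orders(items):
    """Cyclic Latin square: across the full set of orders, every scale appears
    exactly once in every serial position, which balances positional bias."""
    n = len(items)
    return [[items[(start + offset) % n] for offset in range(n)] for start in range(n)]

def assign_order(child_index, orders):
    """Cycle through the orders so each one is used for a roughly equal number of children."""
    return orders[child_index % len(orders)]

orders = latin_square_orders(SCALES)
for i in range(8):  # e.g. the first eight hypothetical respondents
    print(i, assign_order(i, orders))
```

A Williams design, which additionally balances first-order carryover between scales, would be a natural refinement of this simple rotation.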
Fifteen concepts were selected for the research, including conceptual text descriptions and images of unbranded common food consumption items, flavours, and unbranded commercial-like products. Common food consumption images included milk, honey, ice cream, bread and water. Flavours presented as words included mint, chocolate, cinnamon, peanut butter, and lemon. The unbranded commercial-like concepts included images of sweet biscuit and savoury snack products that were relatively similar to existing market products. Concepts were selected so that the range contained a mix of liked, neutral, and disliked flavours and products, representing a wide hedonic range. The concepts researched create a context for concept testing as well as for aspects more likely to be presented in food sensory testing applications.

The three stages were as follows:

• Stage 1: Traditional. N=96.
• Stage 2: Interactive. N=167.
• Stage 3: Interactive and gaming. N=248.

The four question scales tested in each of the three stages were:

• 9-point standard hedonic scale
• 5-point smiley face scale
• 9-point P&K scale
• 9-point star scale

Scales read left to right from negative to positive in all surveys. Whilst some researchers use some of the scales the other way around, we decided to keep the direction consistent with our current questionnaire scales to avoid confusion.

Traditional (Stage 1)

The first stage of the research was designed to compare and put to the test the four different scales in their traditional, 'black and white' format. A sample of 100 children evaluated concepts and flavours by answering questions that appeared as they usually would on paper questionnaires. Essentially, this was placing a paper questionnaire in an online survey (Figures 7-10).
Figure 7. Stage 1 - Standard 9-point hedonic scale

Figure 8. Stage 1 - 5-point smiley face scale

Figure 9. Stage 1 - 9-point P&K scale

Figure 10. Stage 1 - 9-point star scale (anchored 'Dislike a lot' to 'Like a lot')

Interactive (Stage 2)

The second stage presented the four scales in a graphically enhanced, interactive format, with sliders and audio-visual scales. The interactive scales were designed by Direction First using Flash technology on GMI's platform (Figures 11-14).
Figure 11. Stage 2 - Standard 9-point hedonic scale

Figure 12. Stage 2 - 5-point smiley face scale

Figure 13. Stage 2 - 9-point P&K scale

Figure 14. Stage 2 - 9-point star scale
Gaming and interactive (Stage 3)

The third stage repeated the four interactive scales used in Stage 2. Drawing inspiration from the latest online video games, Direction First designed an avatar-like character that participants were asked to choose and dress at the beginning of the survey (Figure 15). The character continued through the survey journey with the participant, in the same way that popular role playing video games are played today. This third stage also introduced a series of popular video game inspired backgrounds (Figure 16).

Figure 15. Dressing your character
Figure 16. Character in survey

1. Comparison of scales

To ensure that the scales were comparable, we converted the 5-point smiley face scale to a 9-point scale. A 5-point rather than a 9-point facial scale was used because 9-point facial scales have not been commonly used and, after reviewing one, we found the subtle differences in expressions too minute and somewhat confusing. All mean scores were reported on a 9-point scale (Table 1).

Table 1. Comparison of scales

Scale              Score points
9-point Standard   1 2 3 4 5 6 7 8 9
9-point P&K        1 2 3 4 5 6 7 8 9
9-point Star       1 2 3 4 5 6 7 8 9
5-point Smiley     1 2 3 4 5 (converted to 1, 3, 5, 7, 9)

2. Comparison of stages

To ensure that the samples from each of the three stages were homogeneous and comparable, interlocking quotas were used at each stage of the research to obtain an even gender and age balance. Because there were significant differences in the age and gender proportions achieved in each of the stages, the dataset was weighted, with each individual stage being balanced towards the target quotas of 25% in each cell (Table 2).
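As a rough sketch of these two preparation steps, the pandas code below maps 5-point smiley ratings onto the 9-point range using the 1=1, 2=3, 3=5 mapping described in the appendix, and computes simple cell weights that bring each stage back to the 25% age-by-gender targets. The column names ('stage', 'age_band', 'gender') are assumed for illustration and are not taken from the study's data file.

```python
import pandas as pd

# Appendix mapping: spread the 5-point smiley scale over the 9-point range.
SMILEY_TO_9PT = {1: 1, 2: 3, 3: 5, 4: 7, 5: 9}

def convert_smiley(series: pd.Series) -> pd.Series:
    """Return smiley ratings re-expressed on the 9-point scale."""
    return series.map(SMILEY_TO_9PT)

# Interlocking age x gender targets: 25% per cell within each stage.
TARGETS = {("7-8", "male"): 0.25, ("7-8", "female"): 0.25,
           ("9-10", "male"): 0.25, ("9-10", "female"): 0.25}

def add_cell_weights(df: pd.DataFrame) -> pd.DataFrame:
    """Add a 'weight' column balancing each stage towards the target cells.
    Assumes hypothetical columns 'stage', 'age_band' and 'gender'."""
    df = df.copy()
    df["weight"] = 1.0
    for stage, stage_df in df.groupby("stage"):
        observed = stage_df.groupby(["age_band", "gender"]).size() / len(stage_df)
        for cell, target in TARGETS.items():
            if cell in observed.index and observed[cell] > 0:
                mask = ((df["stage"] == stage)
                        & (df["age_band"] == cell[0])
                        & (df["gender"] == cell[1]))
                df.loc[mask, "weight"] = target / observed[cell]
    return df
```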
Table 2. Weighted proportions in each sample

          7 to 8 yrs   9 to 10 yrs
Male      25%          25%
Female    25%          25%

Which scale performed best?

The performance of the scales was compared using several different approaches to cater for the different hypotheses surrounding inter-scale and inter-stage differences.

We hypothesised that the widely used star scale would be the most discriminating scale, followed by the child-oriented P&K scale. We thought that the smiley face and standard hedonic scales would perform equally in terms of discriminating power. We also believed that the interactive scales would improve engagement and therefore lead to better quality data and consistency.

The main measures of scale effectiveness and inter-scale performance were scale discrimination power and the range (proportion) of the scale used. We investigated a number of statistical measures to compare the scales (see Appendix). Prior to scale comparisons, respondents who had failed any of the consistency checks were removed from the data file.

In Stage 1, where the traditional "black and white" survey format was used, there was an opportunity to compare the effectiveness of the scales without the influence of interactive audio-visual elements or avatars. We examined the results from this survey to determine which of the four scales provided the best discriminating power.

Repeated measures analysis of variance with Duncan's tests was used to compare the scales on all possible pairs of means. In Stage 1, the overall hedonic ratings showed a very similar pattern across scales (Figure 17).
Figure 17. Stage 1 means of the fifteen items tested across each scale

In the traditional, 'black and white' survey (Stage 1), the four question scales (Standard, Star, Smiley Face and P&K) performed similarly, providing similar patterns in overall hedonic ratings for the fifteen concepts.
When the scales were interactive (Stage 2), the overall hedonic results again showed a very similar pattern across the scales. An interesting pattern emerged whereby the P&K scale tended to record slightly higher scores than the other scales. The same pattern was observed when interactive gaming elements (Stage 3) were used. Furthermore, the Standard 9-point scale also showed a tendency toward higher ratings than the Star and Smiley Face scales.

Comparing the discriminating power of the scales in the traditional survey (Stage 1), a very slight advantage went to the Standard 9-point hedonic scale. Despite having fewer scale points, the Smiley Face scale showed a similar level of performance to the other scales. When interactive elements were used in Stage 2, scale discrimination dropped overall and no single scale performed better. The P&K scale performed marginally better than the other scales in the interactive gaming survey (Stage 3).

In terms of scale range, or proportion of the scale used, a large proportion of each scale was used and there were no significant differences between the scales in the traditional questionnaire (Stage 1). Results were similar when the scales were interactive (Stage 2). When interactive gaming elements were used (Stage 3), a significantly larger proportion of the Star and Smiley Face scales was used compared to the P&K scale (Table 3).

Table 3. Proportion of scale used across the stages

Scale              Stage 1 (N=96)   Stage 2 (N=167)   Stage 3 (N=248)
Star 9pt           74%              77%               74%
Smiley Face 5pt    77%              78%               76%
Standard 9pt       76%              76%               71%
P&K 9pt            72%              73%               67%

Further analysis revealed that when the 15 hedonic scores were averaged and the scales compared on average performance, no significant differences were observed.

The inter-stage comparison of each individual scale revealed that there were also no significant differences in the performance of the individual scales between the stages. This suggests that the interactive and gaming elements did not affect the research outcome significantly.
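The appendix defines the proportion of scale used as the highest minus the lowest rating given across the fifteen items, divided by the total scale range. A minimal sketch of that calculation, with a made-up set of ratings, is shown below.

```python
def proportion_of_scale_used(ratings, scale_min=1, scale_max=9):
    """(highest rating - lowest rating) across the fifteen items,
    divided by the total scale range; returns a value between 0 and 1."""
    return (max(ratings) - min(ratings)) / (scale_max - scale_min)

# Hypothetical example: one child's fifteen ratings on a 9-point scale.
example = [9, 8, 7, 9, 5, 6, 3, 8, 7, 4, 9, 6, 5, 7, 8]
print(proportion_of_scale_used(example))  # (9 - 3) / 8 = 0.75
```

For the 5-point smiley scale the denominator would be 4 before conversion, or 8 after conversion to the 9-point range; the paper does not say which was used, so that choice is an assumption.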
Which questionnaire was more engaging?

To measure respondent engagement, 5-point Likert scales were used to obtain feedback from participants on how easy each scale was to use and how much fun they had on each of the scales in the study. The items on the ease-of-use scale ranged from 1 = 'Hard' to 5 = 'Easy'. The items on the fun scale ranged from 1 = 'No fun at all' to 5 = 'Lots of fun'. Both scales had numerical values assigned to all points, and so were treated as scale variables for the purpose of analysis.

To compare the performance of the various stages, the ability to follow instructions and response consistency were measured through a series of question checks repeated at the beginning and end of the survey. These involved clicking at selected points on scales and indicating the number of brothers and sisters the participants had.

Time to complete the surveys was also recorded and compared at each stage.

Which scale was easiest to use?

All of the questionnaire scales used at each stage were seen as easy to use, with mean scores of over 4 out of 5 (Table 4).

In the traditional survey (Stage 1), the Smiley Face scale was considered significantly easier to use than the Standard and Star scales, but not significantly easier than the P&K scale. The P&K scale was significantly easier to use than the Standard 9-point scale.

When participants used interactive scales (Stage 2), the Smiley Face and P&K scales were considered slightly (directionally) easier to use than the Standard scale.

With gaming and interactive elements activated (Stage 3), all scales were considered similarly easy to use and there were no significant differences.
Table 4. Mean scores on ease of use ("How EASY was it to answer the questions about the flavours and foods on this scale?", mean out of 5)

Scale              Stage 1 (N=96)   Stage 2 (N=167)   Stage 3 (N=248)
Standard 9pt       4.5              4.5               4.5
Star 9pt           4.6              4.6               4.6
P&K 9pt            4.7              4.7               4.6
Smiley Face 5pt    4.9              4.8               4.7

No significant differences were observed by age and gender.

Which scale was fun to use?

All of the scales across all stages were seen as fun to use, each obtaining a mean score of over 4 out of 5 (Table 5).

In the traditional survey (Stage 1), the Smiley Face scale was directionally more fun than the Standard scale (i.e. approaching a significant level).

When interactive survey elements were used (Stage 2), the Smiley Face and Star scales were both seen as significantly more fun to use than the Standard scale.

With interactive gaming elements (Stage 3), the Smiley Face scale was viewed as significantly more fun to use than the Standard and P&K scales. The Star scale was considered significantly more fun than the Standard scale.
Table 5. Mean scores for rating of fun ("How much FUN did you have answering the questions about the flavours and foods on this scale?", mean out of 5)

Scale              Stage 1 (N=96)   Stage 2 (N=167)   Stage 3 (N=248)
Standard 9pt       4.2              4.1               4.1
Star 9pt           4.4              4.5               4.6
P&K 9pt            4.4              4.3               4.3
Smiley Face 5pt    4.6              4.5               4.7

No significant differences were observed across age and gender in Stages 2 and 3. In Stage 1, younger males did not have as much fun on the 9-point hedonic and P&K scales as their older counterparts.

Response consistency and following instructions

The ability to answer consistently and follow instructions is a measure of respondent engagement, as it indicates whether a participant is paying attention and engaged in the task.

Participants were asked to indicate how many brothers and how many sisters they had at two different points in each survey stage (Figure 18). This question was used because it was relatively easy for most children to answer and did not require an opinion, so the answer should have remained constant. The questions are shown below:
Figure 18. Consistency question on number of siblings

The results below (Table 6) indicate that there were very high levels of consistency on the sibling question, and chi-squared testing indicated that there were no significant differences across the stages on this measure.

Table 6. Proportion of respondents making consistency errors when asked about number of siblings

                  Stage 1 (N=96)   Stage 2 (N=167)   Stage 3 (N=248)
No mismatch       97%              93%               95%
1 mismatch        2%               7%                5%
Both mismatch     1%               0%                0%
Total             100%             100%              100%

Participants were also asked to select a specific point on a scale at two different points in each survey stage. This second consistency check was used to determine whether participants were paying attention and able to follow simple instructions at each stage. The question is shown below (Figure 19):
Figure 19. Consistency question on following instructions

Chi-squared testing revealed significant differences in the proportions of consistency errors made by participants across the three stages (Table 7).

Table 7. Proportion of respondents making consistency errors when following simple instructions

                                         Stage 1 (N=96)   Stage 2 (N=167)   Stage 3 (N=248)
Neither wrong                            94%              79%               81%
Both wrong                               3%               5%                10%
First check wrong, second check right    3%               14%               9%
First check right, second check wrong    0%               2%                0%
Total                                    100%             100%              100%

Pairwise comparisons (p=0.05)

Further analysis revealed that one in ten participants in the gaming stage (Stage 3) got both consistency checks incorrect, a significantly higher proportion than in either the first or second stages. 14% of those in Stage 2 got the first check wrong.

It is possible that interactive and gaming elements distracted participants from completing simple tasks. Whilst a higher proportion of participants failed to follow the simple instructions properly in Stages 2 and 3, they still managed to answer questions about themselves consistently.
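The transcript reports chi-squared tests on these error proportions but the test statistics themselves have been lost. As a minimal sketch of that comparison, the snippet below runs a chi-squared test of independence over a stage-by-error-pattern contingency table; the counts shown are purely illustrative placeholders, since the published tables give percentages rather than the underlying cell counts.

```python
from scipy.stats import chi2_contingency

# Rows: error pattern on the instruction-following check; columns: Stages 1-3.
# Placeholder counts for illustration only (not the study's data).
observed = [
    [85, 130, 200],  # neither check wrong
    [ 3,   8,  25],  # both checks wrong
    [ 5,  25,  20],  # first wrong, second right
    [ 2,   4,   3],  # first right, second wrong
]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```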
Time taken to complete the survey

One common metric used to assess data quality and respondent engagement is time spent in the survey. It has been said that spending too short a time and spending too long are both indicators of inattentiveness, resulting from speeding or distraction respectively.

The time taken to complete the survey was similar across the stages and no significant differences were observed (Table 8). Note that this time was calculated from frame to frame, so it excluded the building of the character in Stage 3 (for a fairer comparison).

Table 8. Time taken to complete survey by stage

Stage      Time taken (HH:MM:SS)
Stage 1    00:16:52
Stage 2    00:15:24
Stage 3    00:15:18

Data more than 2 standard deviations from the mean were removed before analysis.

We thought that our participants would spend more time on surveys where interactive elements were present, and even more time when gaming elements were activated. However, the results show no differences, and even very slightly (not significantly) less time spent where interactive and gaming elements were present.

Conclusions

In terms of inter-scale comparison, all four questionnaire scales (Standard, Star, Smiley Face and P&K) presented in the traditional, 'black and white' survey format (Stage 1) performed similarly, providing similar patterns in overall hedonic ratings for the fifteen concepts. The Standard scale offered a slight advantage, with marginally more discriminating power. However, when the hedonic scores were averaged across all products and the individual scales compared, there were no differences. This suggests that all scales performed equally and no scale performed better in terms of discriminating power.

In the interactive survey (Stage 2), the discriminating power of all scales appeared lower overall, suggesting some level of interference, and no single scale stood out from the rest. In the interactive-gaming survey (Stage 3), discriminating power was on average on par with the traditional "black and white" survey, with the P&K scale performing marginally better than the others.

The inter-stage comparison of each individual scale revealed that there were also no significant differences in the performance of the individual scales between the stages. This suggests that the
interactive and gaming elements did not affect the scales. However, inter-stage effects were observed in the consistency checks. A significantly higher proportion of respondents in the traditional survey (Stage 1) followed instructions correctly than in the interactive (Stage 2) and interactive-gaming (Stage 3) surveys. When asked questions about themselves (i.e. number of siblings), most children answered consistently at all stages. Perhaps children don't like being told what to do when they are playing?

Overall, all four question scales at all stages were seen as easy and fun to use. In terms of ease of use, the Standard scale was considered less easy to use when presented in the traditional and interactive surveys. With interactive-gaming elements, all the scales were seen as similarly easy to use, suggesting that gaming elements made them somewhat easier. There was consensus that the Standard scale was less fun to use, and this was observed across all stages.

The addition of interactive and gaming elements neither enhanced nor reduced the level of enjoyment. Because enjoyment and ease-of-use scores were positive and high across all stages, a few questions arose:

• Did the children not want to admit that the task was difficult because they aspire to do things that older children can do easily?
• Were these results affected by the tendency of children to acquiesce when asked if they had fun, even when they had not?
• Did we think some of the scales were more fun than others when, from a child's perspective, they were not as fun as we expected?

The Standard scale offered slightly better discriminating power, but was not as easy or fun to use as the other scales. We would suggest that this scale could lead to boredom and be less engaging when conducting research with children.

The P&K and Star scales, designed originally for children, both performed well and similarly in discriminating power, and both were considered easy to use and fun. The P&K scale rated slightly easier to use in the traditional format (Stage 1), whilst the Star scale was considered slightly more fun when used in the interactive and gaming formats (Stages 2 and 3). Because each scale performs equally well on one aspect or another, it will be important, moving forward, to test their performance on other aspects, such as predicting real behaviour. Both of these scales remain more or less suitable for research with children, as we have not yet found a better scale. With the Star scale, we would recommend reminding children that the stars do not reflect right or wrong answers, so giving something a lower score does not mean they will not be rewarded, and vice versa. The P&K scale is not widely used in Australia and may be more suitable because its language is more child-friendly; however, it should be noted that language differences between
American and Australian children could mean that this scale needs to be adapted to the vernacular of Australian kids. Language may be another area to investigate in future.

The Smiley Face scale performed on par with the other scales in terms of discrimination power, and was seen as significantly more fun and easier to use than the Standard scale. This scale, however, has been criticised in the sensory food literature as ambiguous and prone to misinterpretation due to its emotional element. Whilst some would say the advantages of this scale are unlikely to outweigh its uncertainties, there may be more merit to it than is currently recognised, and it is possible that what some believe to be the shortcomings of the scale are in fact its strengths. There is currently a convergence of thought surrounding the role of emotions in decision making. Damasio (1994), in his book "Emotion, Reason and the Human Brain", suggests that rationality stems from emotion, and that emotion stems from bodily senses. This theory is now informing developments in biometric and neuroscience research.

Next Steps

This research prompts Direction First to consider the potential role of biometrics and emotions in sensory food research in future. Traditional sensory research relies heavily on self-reported data for measuring hedonics. However, because self-reported data is often obscured by experience and conscious thought, it may not provide enough insight into true responses and behaviour. Biometrics measures involuntary physiological responses such as heart rate, respiration patterns, perspiration and body movements. Biometric approaches such as those used by Bryant (2009) and Zeinstra (2009) utilise cutting-edge technology to interpret what are considered involuntary and therefore unobscured, "real" and "true" measures of appeal, enjoyment, engagement and attention. Currently, these technologies are not widely available; however, common measures may emerge in emotions and sensory food research in the future. The ethics of using biometrics with children will require industry discussion.

In light of this research, the general performance of the different scales, the response consistency results, and the suggestion that gaming elements did not significantly contribute to scale discrimination, we suggest caution in moving scales strongly in this direction without considering the whole research approach taken with children. Children played with us in their answers when we established a more playful environment, and perhaps this is not what we want in research.
References

Balogh, M., 2002. Cracking the kids marketing code. B&T, 2002. [http://www.bandt.com.au/articles/03/0C00FC03.asp, accessed 22.01.10]

Brand, J., Borchard, J. and Holmes, K., 2009. Interactive Australia 2009. National research prepared by Bond University for the Interactive Entertainment Association of Australia.

Bryant, J. A., Weinberg, L., Levine, B., Jacobs, D. and Massoudian, M., 2009. Inspiring Change: Innovative Methods and Integrated Advertising. Online Research, Part 1, ESOMAR 2009.

Cape, P., 2009. Questionnaire Length, Fatigue Effects and Response Quality Revisited. Survey Sampling International.

Chambers, E. IV, 2005. Conducting Sensory Research with Children: A Commentary. Journal of Sensory Studies, 20: 90-92.

Cooper, H., 2002. Designing successful diagnostic scales for children. Presented at the Annual Meeting of the Institute of Food Technologists, Anaheim, CA, June 15-19.

Covey, N., 2007. Connected Kids: Trends in Youth Gaming. ARF Youth Council, 21 August 2007. The Nielsen Company.

Cranmer, S. and Ulicsak, M., 2010. Gaming in Families, Final Report. Futurelab, United Kingdom.

C&R Research, 2009. YouthBeat, KidzBeat Magazine, Winter.

Damasio, A., 1994. Emotion, Reason and the Human Brain.

Fliegelman, A., Metx, P. and McIlrath, M., 2004. The ABC's of Conducting Effective Market Research with Kids. C&R Research. Published in Media Research Club of Chicago (MRCC), June 2004.

Franco, C., 2010. Popular Online Games: New Insight from European Research. WARC.

Geraci, J. C., 2004. What Do Youth Marketers Think About Selling to Kids? Harris Interactive. Published in Media Research Club of Chicago (MRCC), June 2004.

Gladwell, M., 2001. The Tipping Point. Abacus, London, UK.
Guinard, J. X., 2001. Sensory and consumer testing with children. Trends in Food Science and Technology, 11(8), 273-283.

Kroll, B. J., 1990. Evaluating rating scales for sensory testing with young children. Food Technology, 44, 78-86.

Lawless, H. T., Popper, R. and Kroll, B. J., 2010. A comparison of the labeled magnitude (LAM) scale, an 11-point category scale and the traditional 9-point hedonic scale. Food Quality and Preference, 21: 4-12.

Nairn, A., 2009. Protection or Participation? Getting research ethics right for children in the digital age. ESOMAR Congress.

Popper, R. and Kroll, J. J., 2005. Issues and viewpoints: conducting sensory research with children. Journal of Sensory Studies, 20(1), 75-87. Also published in Food Technology, May 2003, Vol. 57:5, 60-65.

Popper, R. and Kroll, J. J., 2003. Conducting Sensory Research with Children. Food Technology, Vol. 57:5, 60-65.

Schraidt, M. F., 2009. Testing with Children: Getting Reliable Information from Kids. Peyram & Kroll Research Corporation. (http://www.pk-research.com/paper_15.html, accessed April 2010)

Sleep, D. and Puleston, J., 2009. Leveraging interactive techniques to engage online respondents. Engage Research and GMI Interactive.

Solomon, D. and Peters, J., 2005. Resolving issues in children's research. Young Consumers, Quarter 4, World Advertising Research Center, 68-73.

Ubrick, B., 2002. Kids have great taste: An update to sensory work with children. Presented at the Annual Meeting of the Institute of Food Technologists, Anaheim, CA, June 15-19.

Zeinstra, G. G., Koelen, M. A., Colindres, D., Kok, F. J. and de Graaf, C., 2009. Facial expressions in school-aged children are a good indicator of 'dislikes', but not of 'likes'. Food Quality and Preference, 20: 620-624.
Appendix

1.1 Materials and Methods

The study was conducted in three separate stages. Each stage contained an independent sample of respondents. The core materials and methods were the same in each of the three stages. The main contrasts across the stages were as follows:

• The first experiment was designed to compare the four scales as they exist in their standard 'black and white' form.
• The second experiment introduced the four scales in a graphically enhanced 'interactive' format, with the scales providing light and sound feedback to respondents.
• The third experiment repeated the four interactive scales with the introduction of an avatar-like character that respondents designed at the beginning of the survey and which then operated as a guide taking them through the survey. This stage also introduced a series of background images that the guide was placed within.

1.1.1 Samples

The samples were presented as conceptual text descriptions and images of common food consumption items, flavours, and commercial-like products.

The common food consumption items included milk, honey, ice cream, bread and water. The flavours included the taste of mint, chocolate, cinnamon, peanut butter, and lemon. The commercial-like products were made-up concepts, a mix of sweet biscuit and savoury snack products that were relatively similar to some existing market products.

This mix of different foods, flavours and products was used to ensure that the scales were tested across different levels of food consumption, from basic flavours, to common foodstuffs, to commercial products. This was to test the scales in contexts relating not only to concept testing, but also to aspects more likely to be presented in sensory testing applications. The products were also selected to contain a mix of liked, neutral, and disliked flavours and products, and so represent a wider hedonic range.
1.1.2 Measurement Instruments

Four scales were tested in each of the stages:

1. 9-point star scale
2. 5-point smiley face scale
3. 9-point hedonic scale
4. 9-point P&K scale

1.1.3 Procedure

Each participant rated their liking of the fifteen items on one scale, before moving onto the next scale, until all four scales had been used to rate the items. The order of the scales was randomised in a balanced block design across participants. Scale experience questions were presented at the end of each scale, and 'warm-up' questions were used at the beginning of each new scale to ensure respondents were aware that they had transitioned onto a new scale.

1.2 Statistical Analysis

1.2.1 Making the samples from each stage comparable

To ensure samples from each stage of the research were homogeneous in terms of age and gender, an interlocking quota was used in each stage with an even balance of age and gender, as follows:

Table 1.

          7 to 8    9 to 10
Male      25%       25%
Female    25%       25%

At the completion of surveying, chi-squared testing indicated that there were significant differences in the proportions across the stages. The dataset was therefore weighted, with each individual stage being balanced towards the target quotas, with 25% obtained in each cell.
1.2.2 Making the scales comparable

As three of the four scales had nine points, the five-point smiley face scale was converted to a 9-point scale with 1=1, 2=3, 3=5 and so on, as shown in the following table.

Table 2.

Scale              Score points
9-point hedonic    1 2 3 4 5 6 7 8 9
9-point P&K        1 2 3 4 5 6 7 8 9
Star (9 point)     1 2 3 4 5 6 7 8 9
Smiley (5 point)   1 2 3 4 5

All reported mean scores on the items tested are therefore on a 9-point scale.

1.2.3 How the scales were compared

The performance of the scales was compared using several different approaches to cater for the different hypotheses surrounding inter-scale and inter-stage differences.

The main areas of measurement were response consistency, respondent engagement, scale discrimination power, and range of scale used.

To compare the performance of the various stages, response consistency was measured through a series of question checks that were repeated at the beginning and end of the survey. These involved indicating the number of brothers and sisters participants had, and clicking at selected points on scales.

To measure respondent engagement, 5-point Likert scales were used to obtain feedback from respondents on how easy each scale was to use and how much fun they had on each of the scales in the study. The items on the ease-of-use scale ranged from 1 'Hard' to 5 'Easy'. The items on the fun scale ranged from 1 'No fun at all' to 5 'Lots of fun'. Both scales had numerical values assigned to all points, and so were treated as scale variables for the purpose of analysis.

The F-ratio from ANOVA represents the ratio of systematic to unsystematic variance, or signal to noise. Consequently, it has been used as a measure of a scale's ability to differentiate products (Lawless, Popper and Kroll, 2010). The number of differences between means in post hoc comparisons is also a common measure of product differentiation. Consequently, the product F-ratio from ANOVA and the number of significantly different means by Duncan's multiple range test were used as measures of product discrimination. Where variances were uneven, robust alternatives were used.
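As a rough sketch of this discrimination measure, the Python below fits a repeated-measures ANOVA per scale and returns the ANOVA table containing the product F-ratio; the long-format column names ('child_id', 'product', 'rating') are assumed for illustration and are not the study's own variable names. Duncan's multiple range test has no standard scipy/statsmodels implementation, so pairwise counts would need a substitute such as Tukey's HSD (statsmodels' pairwise_tukeyhsd), which is more conservative than Duncan's test.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

def product_f_ratio(df: pd.DataFrame) -> pd.DataFrame:
    """Repeated-measures ANOVA of rating by product, with child as the repeated factor.
    A larger product F-ratio is read as better discrimination between the fifteen items.
    Expects one rating per child per product in columns 'child_id', 'product', 'rating'."""
    result = AnovaRM(data=df, depvar="rating", subject="child_id",
                     within=["product"]).fit()
    return result.anova_table  # contains the F value, degrees of freedom and p-value

# One call per scale within a stage, e.g.:
# f_star = product_f_ratio(stage1[stage1["scale"] == "star"])
```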
Prior to scale comparisons based on F-ratios, respondents who had failed any of the consistency checks were removed from the data file. Because there were different age and gender sample sizes in each of the stages, weighting was applied to make these consistent across stages.

The range of the scale used was calculated as the highest minus the lowest rating given across all fifteen attributes, divided by the total scale range.

1.2.4 Participants

In the first experiment, one hundred participants with children aged from seven to ten years were recruited to a web survey. Parents of children completed a screening exercise, with children taking over once the screener was complete to undertake the survey.

1.3 Results – Stage 1

1.3.1 Respondent Engagement

In some cases Levene's test indicated that the variances associated with the scales were not even, so a robust F-test, Brown-Forsythe, was used. For paired comparison post hoc analysis in such cases the Games-Howell test was used.

1.3.1.1 How easy were the scales to use?

The Brown-Forsythe test revealed a significant difference among the scales, F(3,378)=6.4, p<.01. Games-Howell paired comparison tests indicated that the Smiley scale was significantly easier to use than the Standard 9pt and Star scales, but not significantly better than the P&K scale, while the P&K scale was significantly easier to use than the Standard 9pt scale (p<.05 in all cases).

Table 3. Mean ease-of-use scores, Stage 1 (out of 5)

Scale      Mean
Standard   4.5
Star       4.6
P&K        4.7
Smiley     4.9
No significant differences were observed by age and gender.

Table 4. Mean ease-of-use scores by age and gender, Stage 1 ("How EASY was it to answer the questions about the flavours and foods on this scale?")

Scale          7-8 male   7-8 female   9-10 male   9-10 female
Star           4.5        4.6          4.7         4.5
Smiley         5.0        4.9          4.9         4.8
Standard 9pt   4.2        4.7          4.7         4.4
P&K            4.5        4.7          4.7         4.9

1.3.1.2 How much fun were the scales to use?

The Brown-Forsythe test did not reveal a significant difference among the scales, F(3,455)=2.2, p=.082. The Welch F-ratio almost reached a significant level, F(3,262)=2.4, p=.067. Games-Howell paired comparison tests revealed a 'trend' for the Smiley scale to be more fun than the Standard 9-point scale (p=.06).

Table 5. Mean fun scores, Stage 1 (out of 5)

Scale      Mean
Standard   4.2
Star       4.4
P&K        4.4
Smiley     4.6

While no significant differences were observed in terms of ease of use, there was a directional indication that younger males did not have as much fun on the 9-point hedonic and P&K scales as their older counterparts, as indicated by the Welch test, F(3,61)=2.6, p=.06.

Table 6. Mean fun scores by age and gender, Stage 1 ("How much FUN did you have answering the questions about the flavours and foods on this scale?")

Scale          7-8 male   7-8 female   9-10 male   9-10 female
Star           4.1        4.4          4.7         4.3
Smiley         4.5        4.6          4.6         4.6
Standard 9pt   3.9        4.3          4.6         4.1
P&K            4.0        4.4          4.7         4.4
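The robust tests reported in this subsection (Levene's variance check, a Brown-Forsythe style F-test, the Welch F-ratio and Games-Howell post hocs) can be approximated with standard Python packages. The sketch below is one possible approximation rather than the analysis actually run: it uses scipy's median-centred Levene test for the variance check, and pingouin's Welch ANOVA and Games-Howell comparisons, which treat scale as a between-groups factor even though each child rated on all four scales; the column names ('scale', 'ease') are assumed.

```python
import pingouin as pg
from scipy.stats import levene

def engagement_tests(df):
    """df: long format with one ease-of-use rating per child per scale,
    in hypothetical columns 'scale' and 'ease'."""
    groups = [g["ease"].values for _, g in df.groupby("scale")]

    # Median-centred Levene test (the Brown-Forsythe variant) checks whether
    # the scales have unequal variances, as the appendix reports.
    print("Variance check:", levene(*groups, center="median"))

    # Robust alternatives when variances are unequal: Welch's ANOVA for the
    # overall effect and Games-Howell pairwise comparisons.
    print(pg.welch_anova(data=df, dv="ease", between="scale"))
    print(pg.pairwise_gameshowell(data=df, dv="ease", between="scale"))
```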
1.3.2 Scale Discrimination Power

Repeated measures analysis of variance was used, with Duncan's tests to compare all possible pairs of item means on each scale. The overall hedonic ratings showed a very similar pattern across scales.

Figure 1. Stage 1: means of the fifteen items tested across each scale
*Sorted in descending order of product liking

A slight advantage went to the Standard 9pt Hedonic scale in terms of identifying a greater number of significant differences. Despite having fewer scale points, the Smiley scale showed a similar level of performance to the other scales.

Table 7. F-ratios and significant Duncan tests, Stage 1

  Scale          F-ratio   No. of sig. Duncan tests
  Star           37.4      85 / 105
  Smiley         41.6      85 / 105
  Standard       41.0      87 / 105
  Super (P&K)    40.3      85 / 105
Df for Stage 1 were (14, 1784) for all F-ratios. All Fs were significant at p<.001. Duncan's tests were performed on 105 possible pairs of means.

1.3.3 Proportion of Scale Used

The proportion of the scale used was treated as a mean score (from 0 to 1). No significant differences were observed.

Table 8. Proportion of scale used, Stage 1

  Scale          Proportion of scale used
  Stage1Star     74%
  Stage1Smiley   77%
  Stage19pt      76%
  Stage1P&K      72%

1.4 Results – Stage 2

1.4.1 Respondent Engagement

1.4.1.1 How easy were the scales to use?

Levene's test indicated that the variances associated with the scales were not homogeneous, so a robust F-test, Brown-Forsythe, was used. For paired-comparison post hoc analysis the Games-Howell test was used.

All of the scales were felt to be easy to use, each obtaining a mean score of over 4 out of 5. The Brown-Forsythe test revealed a just-significant effect, F(3,429)=2.6, p=.05, and post hoc tests revealed a directional difference suggesting the Smiley scale was easier to use than the 9-point Hedonic scale (p=.07).

Table 9. Ease of use by scale, Stage 2 (mean out of 5)

  Scale          Avg.
  Standard       4.5
  Star           4.6
  Super (P&K)    4.7
  Smiley         4.8
No significant differences were observed across age and gender.

Table 10. Ease of use by age and gender, Stage 2 (mean out of 5)
'How EASY was it to answer the questions about the flavours and foods on this scale?'

                 7-8 male   7-8 female   9-10 male   9-10 female
  Stage2Star        4.5        4.5          4.9          4.5
  Stage2Smiley      4.9        4.7          4.7          4.8
  Stage29pt         4.4        4.5          4.7          4.5
  Stage2P&K         4.7        4.9          4.7          4.6

1.4.1.2 How much fun were the scales to use?

All of the scales were fun for the respondents to use, each obtaining a mean score of over 4 out of 5. A significant difference between the scales was found with the Brown-Forsythe test, F(3,464)=4.5, p<.01. Post hoc tests revealed the Star and Smiley scales to be significantly more fun to use than the 9-point Hedonic scale (p<.05).

Table 11. Fun to use by scale, Stage 2 (mean out of 5)

  Scale          Avg.
  Standard       4.1
  Super (P&K)    4.3
  Star           4.5
  Smiley         4.5

No significant differences were observed across age and gender.

Table 12. Fun by age and gender, Stage 2 (mean out of 5)
'How much FUN did you have answering the questions about the flavours and foods on this scale?'

                 7-8 male   7-8 female   9-10 male   9-10 female
  Stage2Star        4.7        4.3          4.6          4.3
  Stage2Smiley      4.4        4.5          4.6          4.6
  Stage29pt         4.1        4.0          4.3          4.1
  Stage2P&K         4.2        4.1          4.5          4.4
1.4.2 Scale Discrimination Power

Repeated measures analysis of variance was used, with Duncan's tests to compare all possible pairs of item means on each scale. The overall hedonic ratings showed a very similar pattern across scales; however, in Stage 2 a pattern emerged whereby the P&K scale tended to record higher scores than the other scales.

Figure 2. Stage 2: means of the fifteen items tested across each scale
*Sorted in descending order of product liking

Table 13. F-ratios and significant Duncan tests, Stage 2

  Scale          F-ratio   No. of sig. Duncan tests
  Star           29.4      79 / 105
  Smiley         30.6      79 / 105
  Standard       30.4      79 / 105
  Super (P&K)    30.0      80 / 105

Df were (14, 1784) for all F-ratios. All Fs were significant at p<.001.
Duncan's tests were performed on 105 possible pairs of means.

1.4.3 Proportion of Scale Used

The proportions of the scales used in Stage 2 were not significantly different, ranging between 73 and 78 percent.

Table 14. Proportion of scale used, Stage 2

  Scale          Proportion of scale used
  Stage2Star     77%
  Stage2Smiley   78%
  Stage29pt      76%
  Stage2P&K      73%

1.5 Results – Stage 3

1.5.1 Respondent Engagement

1.5.1.1 How easy were the scales to use?

Levene's test indicated that the variances associated with the scales were not homogeneous, so a robust F-test, Brown-Forsythe, was used. For paired-comparison post hoc analysis the Games-Howell test was used.

All of the scales were felt to be very easy to use, each obtaining a mean score of 4.5 or more out of 5. No significant differences between the scales were recorded, nor were any differences observed across age and gender.

Table 15. Ease of use by scale, Stage 3 (mean out of 5)

  Scale          Avg.
  Standard       4.5
  Super (P&K)    4.6
  Star           4.6
  Smiley         4.7
Table 16. Ease of use by age and gender, Stage 3 (mean out of 5)
'How EASY was it to answer the questions about the flavours and foods on this scale?'

                 7-8 male   7-8 female   9-10 male   9-10 female
  Stage3Star        4.7        4.6          4.7          4.6
  Stage3Smiley      4.6        4.7          4.8          4.7
  Stage39pt         4.5        4.6          4.3          4.5
  Stage3P&K         4.6        4.7          4.4          4.7

1.5.1.2 How much fun were the scales to use?

All of the scales were fun for the respondents to use, each obtaining a mean score of over 4 out of 5. Levene's test indicated that the variances associated with the scales were not homogeneous, so a robust F-test, Brown-Forsythe, was used; for paired-comparison post hoc analysis the Games-Howell test was used.

Significant differences across the scales were identified, F(3,416)=6.6, p<.01. Post hoc tests revealed that the Smiley scale was significantly more fun to use than the 9-point Hedonic and P&K scales, and that the Star scale was significantly more fun than the 9-point Hedonic (p<.05 in all cases).

Table 17. Fun to use by scale, Stage 3 (mean out of 5)

  Scale          Avg.
  Standard       4.1
  Super (P&K)    4.3
  Star           4.6
  Smiley         4.7
No significant differences were observed across age and gender.

Table 18. Fun by age and gender, Stage 3 (mean out of 5)
'How much FUN did you have answering the questions about the flavours and foods on this scale?'

                 7-8 male   7-8 female   9-10 male   9-10 female
  Stage3Star        4.7        4.5          4.5          4.5
  Stage3Smiley      4.7        4.7          4.6          4.6
  Stage39pt         4.2        4.3          4.0          4.0
  Stage3P&K         4.4        4.4          4.1          4.2

1.5.2 Scale Discrimination Power

Repeated measures analysis of variance was used, with Duncan's tests to compare all possible pairs of item means on each scale. The overall hedonic ratings showed a very similar pattern across scales. The pattern observed in Stage 2 was repeated in Stage 3, with the P&K scale tending to record higher scores than the other scales. In this stage, the 9-point Hedonic scale also showed a tendency toward higher ratings than the Star and Smiley scales.

Figure 3. Stage 3: means of the fifteen items tested across each scale
*Sorted in descending order of product liking
Table 19. F-ratios and significant Duncan tests, Stage 3

  Scale          F-ratio   No. of sig. Duncan tests
  Star           44.0      82 / 105
  Smiley         43.8      83 / 105
  Standard       39.7      83 / 105
  Super (P&K)    37.3      85 / 105

Df were (14, 1784) for all F-ratios. All Fs were significant at p<.001. Duncan's tests were performed on 105 possible pairs of means.

1.5.3 Proportion of Scale Used

A significantly larger proportion of the Star and Smiley scales was used than of the P&K scale, F(3,475)=4.3, p<.01.

Table 20. Proportion of scale used, Stage 3

  Scale          Proportion of scale used
  Stage3Star     74%
  Stage3Smiley   76%
  Stage39pt      71%
  Stage3P&K      67%

1.6 Results – Comparisons of Stages

1.6.1 Response Consistency Measures – Clicking specified numbers on a scale

Chi-squared testing indicated that there were significant differences in the proportions of consistency errors across the stages. Follow-up pairwise comparisons (p<.05) revealed that the first stage (containing the standard scales) had a significantly higher proportion of respondents who made no errors than either the second or third stage.
Table 21. Consistency check I: proportion of respondents making consistency errors

  Consistency checks                       Stage 1   Stage 2   Stage 3
                                           (N=96)    (N=167)   (N=248)
  Neither wrong                              94%       79%       81%
  Both wrong                                  3%        5%       10%
  First check wrong, second check right       3%       14%        9%
  First check right, second check wrong       0%        2%        0%
  Total                                     100%      100%      100%

One in ten respondents got both of the consistency checks wrong in the third stage, a significantly higher proportion than in either the first or second stage (p<.05).

1.6.2 Consistency Measures – Number of brothers and sisters

The consistency measure that asked for the number of brothers and sisters a respondent had, at the beginning and then again at the end of the survey, revealed very high levels of consistency. Chi-squared testing indicated no significant differences across the stages on this measure.

Table 22. Consistency check II

                  Stage 1   Stage 2   Stage 3
  No mismatch       97%       93%       95%
  1 mismatch         2%        7%        5%
  Both mismatch      1%        0%        0%

1.6.3 Time Taken

Time taken to complete the survey was similar across the stages, with no significant differences observed. Cases more than two standard deviations from the mean were removed before analysis. Note that this time is calculated from frame to frame and so excludes the building of the character in Stage 3, for a fairer comparison. (A brief code sketch of the chi-square comparison and the trimming rule follows Table 23.)
Table 23. Time taken to complete survey by stage

  Stage     Time taken (HH:MM:SS)
  Stage 1   00:16:52
  Stage 2   00:15:24
  Stage 3   00:15:18
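The stage comparisons above involve two simple computations: a chi-square test on the consistency-check outcomes and a two-standard-deviation trim on completion times. A hedged sketch, using placeholder numbers rather than the study's actual counts and timings, is shown below.

```python
# Placeholder numbers only; the study's counts and timings are reported in Tables 21-23.
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Consistency-check outcomes (rows) by stage (columns): hypothetical counts.
counts = pd.DataFrame(
    {"stage1": [90, 3, 4, 2], "stage2": [130, 8, 23, 4], "stage3": [200, 25, 22, 3]},
    index=["neither wrong", "both wrong", "first wrong only", "second wrong only"],
)
chi2, p, dof, _ = chi2_contingency(counts)
print(f"chi2 = {chi2:.1f}, df = {dof}, p = {p:.3f}")

# Completion times in seconds; drop cases more than two SDs from the mean.
times = pd.Series(np.random.default_rng(1).normal(loc=16 * 60, scale=180, size=250))
trimmed = times[(times - times.mean()).abs() <= 2 * times.std()]
print(f"mean after trimming: {trimmed.mean() / 60:.1f} minutes")
```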