Code, J., Forde, K., Ralph, R., & Zap, N. (2021). Assessment for learning in immersive and virtual environments – Evidence-centred game design in STEM. In D. Anderson, M. Milner-Bolotin, R. Santos, & S. Petrina (Eds.), Proceedings of the 6th International STEM in Education Conference (STEM 2021) (pp. 263-269). University of British Columbia.
ASSESSMENT FOR LEARNING IN IMMERSIVE AND VIRTUAL
ENVIRONMENTS – EVIDENCE-CENTRED GAME DESIGN IN STEM
Jillianne Code¹, Kieran Forde¹, Rachel Ralph² and Nick Zap¹
1. University of British Columbia
2. Centre for Digital Media
ABSTRACT
Creative thinking, problem-solving and inquiry skills are primary goals of teaching and
learning. This paper reports on the development of an authentic performance assessment in
science, technology, engineering and mathematics (STEM), Falling Skies!, built around an
ecological, inquiry-based problem – where students are presented with the issue of a mass
mortality event and are challenged to investigate why this happened. Assessment for Learning
in Immersive Virtual Environments (ALIVE; alivelab.ca) is a research program that examines
how 3D immersive virtual environments (3DIVEs), as assessments for learning, are designed to
enable students to regulate their science inquiry abilities in real-time. Specifically, this project
explores the use of 3DIVEs to provide feedback through the formative assessment of inquiry
reasoning in the context of middle school life science. Ultimately, the ALIVE project aims to
contribute empirical evidence of how students conduct complex reasoning, assisting them to
become better self-regulated learners, thus providing a sense of personal agency, efficacy, and
opportunity necessary to participate in STEM careers.
Keywords: Evidence-Centred Game Design (ECgD), immersive learning, assessment, feedback,
science inquiry, learner agency
INTRODUCTION
Creative thinking, problem-solving and inquiry skills are primary goals of teaching and learning
(Jonassen, 1997; Shute & Wang, 2016). Current assessment approaches are inadequate at identifying
how students develop creative thinking, problem-solving, and scientific reasoning – essential 21st-
century skills (Shute & Emihovich, 2018). Instruction designed using learning principles related to
solving authentic well-structured and ill-structured problems is critical for lifelong learning and transfer
to novel contexts (Van Eck et al., 2017). Feedback from formative assessments or assessments for
learning carried out during instruction can help educators tailor teaching and deepen students’
understanding, enabling them to self-regulate (Jaehnig & Miller, 2007; Van der Kleij et al., 2011).
Research clearly illustrates that the shorter the interval between teachers eliciting feedback, using it to
improve instruction, and students using it to enhance their learning, the greater the impact on learning
(Wiliam & Leahy, 2015). Without the aid of technology, however, it is challenging for teachers to provide
this type of feedback on a regular, timely basis. Computer-based
assessments (CBAs) have various advantages, such as the possibility of providing more timely feedback,
automated scoring, and higher efficiency (Van der Kleij et al., 2011). Facilitating authentic problem-solving
and scientific inquiry through 3-dimensional immersive virtual environments (3DIVEs), similar
to video games, has shown considerable promise in the assessment literature, particularly for summative
assessment or assessment of learning (e.g., Baker et al., 2016).
Immersive game-based environments can be designed to assess science inquiry, problem-solving, and
critical thinking skills (Mislevy et al., 2014; Shute & Emihovich, 2018). As an instrument of assessment,
3DIVEs can be designed to simulate authentic tasks where students apply knowledge and reasoning to
situations similar to those they encounter in the real world – such as conditions that approximate how
scientists and engineers work through problems (Baker et al., 2016). Summative assessment using
3DIVEs is well researched and supported in the literature (Baker et al., 2016; Shute & Emihovich, 2018).
Using assessment frameworks such as evidence-centred design (ECD; Mislevy et al., 2003) that focus
specifically on the psychometric properties of assessments appropriately
aligned with learning outcomes is critical. Further, aligning ECD with game-based
assessments of learning using learning analytics and educational data mining (Baker & Siemens,
2014) results in reliable student models of inquiry task performance (Baker et al., 2016). While the
Baker et al. models of inquiry performance provide some utility towards assessment for learning,
the 3DIVE that they were modelled upon was not explicitly designed for this purpose. As a result,
there remain questions of validity.
Research Questions
Assessment for Learning in Immersive Virtual Environments (ALIVE; alivelab.ca) is a research
program that examines how 3DIVEs, as assessments for learning, are designed to enable students to
regulate their science inquiry abilities in real-time (See Figures 1 & 2). Specifically, this project
explores the use of 3DIVEs to provide feedback through the formative assessment of inquiry
reasoning in the context of middle school life science. Research questions that guide this project
include the following.
RQ 1. To what extent do various methods for providing formative feedback in a 3DIVE affect
students’ academic achievement on a science inquiry-based task?
RQ 2. To what extent do various methods for providing formative feedback in a 3DIVE affect
students’ agency as measured by goal setting, motivation, self-regulation, and self-efficacy?
RQ 3. To what extent can models of student interaction within a 3DIVE predict whether a
student will successfully conduct scientific inquiry and how this is related to their agency
for learning?
RQ 4. How can 3DIVEs, in the context of a real-world science inquiry problem, be developed to
provide formative feedback within the middle school classroom?
LITERATURE REVIEW
The key to educational reform lies in exploring alternative forms of assessment (Code & Zap,
2017). Feedback is conceptualized as information provided by an agent (e.g., teacher, peer, parent,
book, internet) regarding aspects of one’s performance or understanding (Hattie, 2011). For
example, a teacher can provide corrective information, a peer can provide an alternative strategy, a
book can provide information to clarify ideas, a parent can give encouragement, and a learner can
look up the information to evaluate the correctness of a response. Feedback is thus a “consequence”
of performance (Hattie, 2011). The literature on the effectiveness of feedback suggests that complex
relationships between the feedback intervention, task, learning context, and individual differences
(Shute & Emihovich, 2018) affect the magnitude of the feedback effects (Hattie, 2011). However,
primary studies published to date have reported insufficient data to meaningfully examine this
complex relationship (Van der Kleij et al., 2015).
In a recent meta-analysis, Van der Kleij and colleagues (2015) examined the extent to which various
methods for providing item-based feedback in a computer-based learning environment (CBLE) affect
students’ learning outcomes. Shute (2008) distinguished different types of feedback, which Van der
Kleij et al. further classified as knowledge of results (KR), knowledge of correct response (KCR), and
elaborated feedback (EF). Their meta-analysis aimed to provide multiple effect sizes, one for each type
of feedback (KR, KCR, EF) at four different feedback levels (task, process, self-regulation, self; see
Theoretical Framework). Also taken into account is the level of learning outcomes (lower-order vs.
higher-order), which is a relevant variable when examining feedback effects in a CBLE (Van der Kleij
et al., 2011). In the results of their meta-analysis, Van der Kleij and colleagues found that EF was more
effective than KR and KCR. Still, this hypothesis could not be meaningfully tested due to the small
number of observations and insufficient power. Results consistently showed that EF was more effective
for higher-order learning outcomes than for lower-order outcomes. However, most EF was aimed at the
task and process levels, making it difficult to draw any generalizable conclusions. Further, since the
results also show an uneven distribution of EF across the feedback levels, this
indicates a lack of overall research on specific groups in which EF may be practical (e.g., feedback
at the self-regulation level).
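To make the distinction between feedback types concrete, the sketch below (a minimal Python illustration; the item content, messages, and function name are hypothetical and not drawn from the ALIVE project) shows how a CBLE might generate a KR, KCR, or EF message for the same item response.

```python
# Minimal sketch of the three feedback types (KR, KCR, EF) for one item.
# Item content, wording, and function names are illustrative only.

def feedback(feedback_type: str, response: str, correct: str, explanation: str) -> str:
    """Return a feedback message of the requested type for a student response."""
    is_correct = response.strip().lower() == correct.strip().lower()
    if feedback_type == "KR":    # knowledge of results
        return "Correct." if is_correct else "Incorrect."
    if feedback_type == "KCR":   # knowledge of correct response
        return f"The correct answer is: {correct}."
    if feedback_type == "EF":    # elaborated feedback
        verdict = "Correct." if is_correct else f"Not quite. The correct answer is: {correct}."
        return f"{verdict} {explanation}"
    raise ValueError(f"Unknown feedback type: {feedback_type}")


if __name__ == "__main__":
    item_explanation = ("Decomposers such as fungi break down dead organisms, "
                        "returning nutrients to the soil.")
    for ftype in ("KR", "KCR", "EF"):
        print(ftype, "->", feedback(ftype, "producers", "decomposers", item_explanation))
```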
Given the high levels of variability in the effectiveness of feedback in CBLEs, more research is
needed examining how feedback is appropriately received and how to design CBLEs to enable an
increase in the frequency, types, and impact of feedback in the classroom. Given the low number of
studies in secondary education settings reported in Van der Kleij et al. (2015), the degree to which
the conclusions of this meta-analysis apply to younger learners is questionable. Finally, for the
limited studies available, the results show somewhat smaller effect sizes in school settings than in
higher education, suggesting that feedback might function differently within this context. The ALIVE
project aims to bridge these gaps.
THEORETICAL FRAMEWORK
Models of feedback in CBLEs need to consider their multidimensional nature. This
multidimensionality of feedback forms the framework on which this research is based. It has been
established that one dimension of feedback involves the type of feedback (KR, KCR, EF), which we
will include as the foundation of our framework. However, to provide meaningful results, we also
need to take into consideration the assessment context (e.g., science inquiry), level of feedback
(Hattie, 2011), as well as CBLE design impacts on learner agency (Code, 2020), and individual
differences (Hattie, 2012; Stevenson et al., 2013).
Science Inquiry
Authentic performance assessment in science, technology, engineering and mathematics (STEM)
requires students to apply scientific reasoning and knowledge in a way that resembles real-world
inquiry contexts and is central to the modern curriculum (Code et al., 2012; BCMOE, 2018).
Existing assessment frameworks built around knowledge acquisition are limited in their ability to
evaluate how inquiry processes develop. However, cognitive models of inquiry enable researchers
to examine these processes in situ. Models of STEM inquiry are structured around theorizing,
identifying questions and hypothesizing, accessing data and investigating, and analyzing and
synthesizing (White, Collins & Frederiksen, 2011). Building upon this model of inquiry,
interactions in a CBLE, specifically those enabled by 3DIVEs, must keep three aspects of
assessment in mind (Pellegrino et al., 2001): (1) the model of student cognition in the domain being
assessed (e.g., life science); (2) the set of beliefs about the kinds of observations that will provide
evidence of students’ competencies (e.g., 3DIVE trace data); and (3) the interpretation process for
making sense of the evidence (e.g., design framework). As 3DIVEs can feasibly be designed for
summative assessment (Baker et al., 2016), we can leverage these findings to consider how
formative assessment and interaction design using this technology can potentially get us closer to
evaluating how students engage in inquiry processes.
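To illustrate the second aspect, how observations from trace data might provide evidence of inquiry competencies, the sketch below classifies hypothetical 3DIVE log events into the inquiry stages of White, Collins and Frederiksen's model. The event names, mapping, and example trace are assumptions for illustration only, not the ALIVE project's actual evidence rules.

```python
# Illustrative mapping from hypothetical 3DIVE trace events to inquiry stages
# (theorizing; questioning/hypothesizing; investigating; analyzing/synthesizing).
from collections import Counter

STAGE_BY_EVENT = {
    "read_reference":    "theorizing",
    "talk_to_villager":  "questioning_hypothesizing",
    "record_hypothesis": "questioning_hypothesizing",
    "collect_sample":    "investigating",
    "run_lab_test":      "investigating",
    "review_results":    "analyzing_synthesizing",
    "take_note":         "analyzing_synthesizing",
}

def summarize_inquiry(trace: list) -> Counter:
    """Count how often a student's logged actions touch each inquiry stage."""
    stages = Counter()
    for event in trace:
        stage = STAGE_BY_EVENT.get(event["action"])
        if stage:
            stages[stage] += 1
    return stages

# Example trace for one student session (fabricated for illustration).
trace = [
    {"action": "talk_to_villager", "target": "farmer"},
    {"action": "collect_sample",   "target": "water"},
    {"action": "run_lab_test",     "target": "water"},
    {"action": "take_note",        "target": "test_result"},
]
print(summarize_inquiry(trace))
# Counter({'investigating': 2, 'questioning_hypothesizing': 1, 'analyzing_synthesizing': 1})
```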
Level of Feedback
At the task or product level, feedback is about how well a task is being accomplished, such as
distinguishing correct from incorrect answers, acquiring more information, and building more
surface knowledge. This type of feedback is most common, is often called corrective feedback, and
encompasses 90% of teachers’ questions (Hattie, 2012). The second level of feedback is aimed at the
process used to create the product or complete the task (e.g., inquiry). Such feedback can lead to alternative processing, cognitive
load reduction, strategies for error detection, and cueing to seek more helpful information (Hattie, 2012).
Third, feedback at the self-regulation level (students’ monitoring of their learning processes) can
enhance students’ skills in self-evaluation and provide greater confidence to engage further in a task.
When students can monitor and self-regulate their learning, they can more effectively use feedback
(Zimmerman, 2008). Finally, the fourth level of feedback is directed toward
praising the self (e.g., "Well done!"). Praise can comfort and support students and is welcomed by them;
however, research at this level is mixed at best (Skipper & Douglas, 2015).
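To show how the four levels differ for the same inquiry action, the minimal sketch below pairs each level with an example message that a 3DIVE might surface after a student runs a lab test. The scenario and wording are hypothetical and are not taken from Falling Skies!.

```python
# Illustrative feedback at the four levels (task, process, self-regulation, self)
# for one hypothetical inquiry action: a water-toxicology test that comes back negative.
from dataclasses import dataclass

@dataclass
class Feedback:
    level: str     # "task", "process", "self_regulation", or "self"
    message: str

FEEDBACK_FOR_NEGATIVE_WATER_TEST = [
    Feedback("task", "The water sample tested negative for toxins."),
    Feedback("process", "A negative result rules out one cause. Which other samples "
                        "could help you narrow down the remaining explanations?"),
    Feedback("self_regulation", "Check your notes: does this result support or "
                                "contradict your current hypothesis?"),
    Feedback("self", "Well done!"),
]

for fb in FEEDBACK_FOR_NEGATIVE_WATER_TEST:
    print(f"[{fb.level}] {fb.message}")
```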
Learner Agency
On the axiom that ‘learners are agents’, it follows that an understanding of human agency is
necessary to appreciate learning (Code, 2020). Agency is an emergent capability manifested in a
student’s ability to interact with personal, behavioural, environmental, and social factors in the learning
context (Martin, 2004; Bandura, 2006). Agency enables students’ influence on decision-making around
what and how something is learned. In other words, learner agency is the capacity of students to act and
engage with factors in the learning environment, ultimately enabling student voice and choice in the
learning process. Agency is inherent in students’ ability to regulate, control, and monitor their learning.
Research further suggests that agency mediates goal orientations, student perceptions of the learning
environment, social identification, the learning strategies they use, and overall academic performance
(Code, 2020). Providing students greater choice and voice in the curriculum through technologies
designed to enable inquiry learning improves engagement in the learning experience, empowering
students to become agents in their education.
EVIDENCE-CENTRED GAME DESIGN
Assessing complex interactions requires a comprehensive framework for making valid
inferences about learning. One such framework is Evidence Centered Design (ECD; Mislevy et al.,
2003), which provides a formal, multilayered approach to designing assessments as arguments
(Mislevy et al., 2014). The ECD framework helps to make explicit how high-fidelity, rich
assessment data in 3DIVEs are established through iterative cycles of analysis, design, development,
implementation, and evaluation of instructional design decisions (Figure 1).
Figure 1. Game development and assessment design
frameworks (Mislevy et al., 2014)
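As a minimal sketch of how ECD's layers might be represented in software, the example below holds competency estimates in a student model, maps scored observables to those competencies through simple evidence rules, and describes the task. The structure, names, and naive update rule are illustrative assumptions, not the psychometric models used by Mislevy et al. (2014) or in ALIVE.

```python
# Simplified sketch of ECD's three central models as plain data structures.
# Competency names, observables, and the naive update rule are illustrative only.
from dataclasses import dataclass, field

@dataclass
class StudentModel:
    # Proficiency estimates in [0, 1] for each claimed competency.
    competencies: dict = field(default_factory=lambda: {"inquiry_reasoning": 0.5})

@dataclass
class EvidenceRule:
    observable: str   # scored feature extracted from trace data
    competency: str   # student-model variable the observable informs
    weight: float     # naive stand-in for a psychometric link function

@dataclass
class TaskModel:
    description: str
    observables: list

def update(student: StudentModel, rules: list, scored: dict) -> None:
    """Nudge competency estimates from scored observables (toy update, not IRT or Bayes nets)."""
    for rule in rules:
        if rule.observable in scored:
            current = student.competencies.get(rule.competency, 0.5)
            evidence = scored[rule.observable]   # a 0.0-1.0 score
            student.competencies[rule.competency] = current + rule.weight * (evidence - current)

task = TaskModel("Investigate the woodpecker mortality event",
                 ["tests_relevant_to_hypothesis", "hypothesis_revised_after_evidence"])
rules = [EvidenceRule("tests_relevant_to_hypothesis", "inquiry_reasoning", 0.3),
         EvidenceRule("hypothesis_revised_after_evidence", "inquiry_reasoning", 0.3)]
student = StudentModel()
update(student, rules, {"tests_relevant_to_hypothesis": 1.0})
print(student.competencies)   # {'inquiry_reasoning': 0.65}
```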
Falling Skies! The prototype
A prototype 3DIVE, Falling Skies! (Zap & Code, 2015; Figures 2 & 3), was developed using the
Unity3d Game Engine and was readily accessible using a web browser with the Unity Web Player
installed. Falling Skies! is built around an ecological, inquiry-based problem – where students are
presented with the issue of a mass mortality event of blackbirds in a village and are challenged to
investigate why this happened (Code & Zap, 2017). Students are presented with several probable causes
of this die-off – an ecological problem mirrored from a real-life case study (Robertson, 2011).
Students can freely traverse through the 3DIVE using an avatar (Figure 2), speak to villagers,
access reference resources, collect samples (Figure 3), and perform tests on these samples in a
simulated laboratory setting. The students can review the results of their tests and take notes to help
them later hypothesize what they think caused the die-off of this bird species. This 3DIVE is
aligned with the BC curriculum and was designed as a summative assessment.
Figure 2. Choosing a character in Falling Skies! Figure 3. First-person view in Falling Skies!
Falling Skies! The remix.
A remix 3DIVE, Falling Skies! 2.0 (FSV2; Code et al., 2021), is currently in development using
the Unity3d Game Engine and will be available as an iOS app optimized for the iPad. Similar to the
prototype, FSV2 is conceptualized around an ecological, inquiry-based problem – however, in this
update, students are presented with the issue of a mass mortality event involving red-headed
woodpeckers. Students are given several probable causes of this mortality event – an ecological
problem mirrored from a real-life case study (Government of Canada, 2021). This 3DIVE is aligned
with the BC curriculum and is being designed as a formative assessment using the ECD and the
formative assessment framework previously illustrated.
INFLUENCE & IMPACT
The potential of this research to make a considerable impact in education is multifold. This project
will specifically deepen our understanding of the design and implementation of 3D immersive
technologies in the classroom and provide evidence for these technologies' role in providing formative
feedback towards making teaching and learning more effective and efficient. This understanding
encourages ways of teaching and learning necessary in the knowledge-based economy of the 21st
century. Ultimately, the ALIVE project will contribute empirical evidence of how students conduct
complex reasoning, assisting them to become better self-regulated learners, thus providing a sense of
personal agency, efficacy, and opportunity necessary to participate in STEM careers.
ACKNOWLEDGEMENTS
Jillianne Code and Nick Zap are supported in part by the Social Sciences and Humanities
Research Council of Canada (430-2016-00480).
REFERENCES
Baker, R. S. J. d., Clarke-Midura, J., & Ocumpaugh, J. (2016). Towards general models of effective science
inquiry in virtual performance assessments. Journal of Computer Assisted Learning, 32(3), 267-280.
Baker, R. S. J. d., & Siemens, G. (2014). Educational data mining and learning analytics. In K. Sawyer (Ed.),
Cambridge Handbook of the Learning Sciences (2nd ed., pp. 253-274). Cambridge University Press.
Bandura, A. (2006). Toward a psychology of human agency. Perspectives on Psychological Science, 1(2),
164-180.
British Columbia Ministry of Education (BCMOE). (2018). Applied design, skills and technology.
https://curriculum.gov.bc.ca/curriculum/adst
Clarke-Midura, J., Code, J., Zap, N. & Dede, C. (2012). Assessing science inquiry in the classroom: A case
study of the virtual assessment project. In L. Lennex & K. Nettleton (Eds.), Cases on inquiry through
instructional technology in math and science: Systemic approaches (pp. 138-164). IGI Publishing.
Code, J. (2020). Agency for learning: Intention, motivation, self-efficacy and self-regulation. Frontiers
in Education, 5(19), 1-15.
Code, J., Clarke-Midura, J., Zap, N., & Dede, C. (2012). Virtual performance assessment in immersive
virtual environments. In H. Wang (Ed.), Interactivity in e-learning: Case studies and frameworks (pp.
230-252). IGI Publishing.
Code, J. & Zap, N. (2017). Assessment in immersive virtual environments: Cases for learning, of learning,
and as learning. Journal of Interactive Learning Research, 28(3), 235-248.
Code, J. & Zap, N. (2015). Assessment for Learning in Immersive Virtual Environments (ALIVE):
Falling skies (Version 1.0) [3D virtual environment]. University of Victoria.
Skipper, Y., & Douglas, K. (2015). The influence of teacher feedback on children’s perceptions of
student-teacher relationships. British Journal of Educational Psychology, 85(3), 276-288.
Government of Canada. (2021). Recovery strategy for the Red-headed Woodpecker (Melanerpes
erythrocephalus) in Canada 2021. Retrieved from https://www.canada.ca/en/environment-
climate-change/services/species-risk-public-registry/recovery-strategies/red-headed-
woodpecker-2021.html
Hattie, J., & Gan, M. (2011). Instruction based on feedback. In P. Alexander & R. E. Mayer (Eds.),
Handbook of research on learning and instruction (pp. 249–271). New York, NY: Routledge.
Jonassen, D. H. (1997). Instructional design models for well-structured and ill-structured problem-solving
learning outcomes. Educational Technology Research and Development, 45(1), 65–94.
Martin, J. (2004). Self-regulated learning, social cognitive theory, and agency. Educational Psychologist,
39(2), 135-145.
Mislevy, R. J., Almond, R. G., & Lukas, J. F. (2003). A brief introduction to evidence‐centered design.
ETS Research Report Series, 2003(1), i-29.
Mislevy, R. J., Oranje, A., Bauer, M., von Davier, A., Hao, J., Corrigan, S., Hoffman, E., DiCerbo, K.,
& John, E. (2014). Psychometric considerations in game-based assessment. GlassLab Research.
Rosa, E. M., & Leow, R. P. (2004). Computerized task-based exposure, explicitness, type of feedback, and
Spanish L2 development. Modern Language Journal, 88, 192-216.
Shute, V. (2008). Focus on formative feedback. Review of Educational Research, 78(1), 153-189.
Shute, V., & Emihovich, B. (2018). Assessing problem-solving skills in game-based immersive
environments. In J. Voogt, G. Knezek, R. Christensen, & K. W. Lai (Eds.), Second Handbook of
Information Technology in Primary and Secondary Education (pp. 635-648). Springer.
Shute, V. J., & Wang, L. (2016). Assessing and supporting hard-to-measure constructs. In A. A. Rupp & J.
P. Leighton (Eds.), The handbook of cognition and assessment: Frameworks, methodologies, and
application (pp. 535–562). Hoboken: Wiley.
Van der Kleij, F. M., Feskens, R. C. W., & Eggen, T. J. H. M. (2015). Effects of feedback in a
computer-based learning environment on students' learning outcomes: A meta-analysis. Review
of Educational Research, 85(4), 475-511.
Van der Kleij, F. M., Timmers, C. F., & Eggen, T. J. H. M. (2011). The effectiveness of methods for
providing written feedback through a computer-based assessment for learning: A systematic review.
CADMO, 19, 21-39.
Van Eck, R. N., Shute, V. J., & Rieber, L. P. (2017). Levelling up: Game design research and practice for
instructional designers. In R. Reiser & J. Dempsey (Eds.), Trends and issues in instructional design
and technology (4th ed., pp. 227–285). Upper Saddle River: Pearson Education.
White, B., Collins, A., & Frederiksen, J. (2011). The nature of scientific meta-knowledge. In Khine, M. S.,
& Saleh, I. (Eds.), Models and modelling in science education: Cognitive tools for scientific enquiry (pp.
41-76). London, UK: Springer.
Wiliam, D., & Leahy, S. (2015). Embedding formative assessment: Practical techniques for K-12
classrooms. West Palm Beach, FL: Learning Sciences International.
Winters, F., & Azevedo, R. (2005). High-school students' regulation of learning during computer-
based science inquiry. Journal of Educational Computing Research, 33(2), 189-217.
Zimmerman, B. (2008). Investigating self-regulation and motivation: Historical background, methodological
developments, and future prospects. American Educational Research Journal, 45(1), 166-183.