Paper ID #36855
Assessing authentic problem-solving in heat transfer
Jiamin Zhang
Jiamin Zhang, PhD, is a postdoctoral scholar and lecturer in physics at Auburn University. Her research focuses on
studying authentic problem-solving in undergraduate engineering programs and what factors impact student persistence in
STEM. She earned her PhD in chemical engineering from the University of California, Santa Barbara.
Soheil Fatehiboroujeni (Assistant Professor)
Soheil Fatehiboroujeni received his Ph.D. in mechanical engineering from the University of California, Merced in 2018, with a focus on the nonlinear dynamics of biological filaments. As an engineering educator and postdoctoral researcher at
Cornell University, Sibley School of Mechanical and Aerospace Engineering, Soheil worked in the Active Learning
Initiative (ALI) to promote student-centered learning and the use of computational tools such as MATLAB and ANSYS in
engineering classrooms. In Spring 2022, Soheil joined Colorado State University as an assistant professor of practice in
the Department of Mechanical Engineering. His research is currently focused on the long-term retention of knowledge and
skills in engineering education, design theory and philosophy, and computational mechanics.
Matthew Ford
Matthew J. Ford (he/him) received his B.S. in Mechanical Engineering and Materials Science from the University of
California, Berkeley, and went on to complete his Ph.D. in Mechanical Engineering at Northwestern University. After
completing a postdoc with the Cornell Active Learning Initiative, he joined the School of Engineering and Technology at
UW Tacoma to help establish its new mechanical engineering program. His teaching and research interests include solid
mechanics, engineering design, and inquiry-guided learning. He has supervised undergraduate and master's student
research projects and capstone design teams.
Eric Burkholder (Postdoctoral Scholar)
Eric Burkholder is an assistant professor of physics and of chemical engineering at Auburn University. He received his
PhD in chemical engineering from Caltech and spent three years as a postdoc in Carl Wieman's group at Stanford
University. His research focuses broadly on problem-solving in physics and engineering courses, as well as issues related to
retention and equity in STEM.
Š American Society for Engineering Education, 2022
Assessing authentic problem-solving in heat transfer
Introduction
Engineers are known as problem-solvers. In their work, they encounter ill-structured problems that require them to collect additional information, consider external constraints, and reflect on their solution process [1-6]. Recent graduates cite these skills as the most important technical skills required of them in their everyday work [7]. However, employers and researchers report that undergraduate students are not prepared to solve these kinds of problems when they graduate [8,9].
One of the reasons for this skills gap is that the majority of problem-solving in traditional
undergraduate engineering programs consists of solving textbook problems. Textbook problems
are designed to exercise a limited set of knowledge and skills, and thus may not reflect the
problem-solving practices that are used in real-world settings. Textbook problems do not allow
room for students to make their own assumptions, decide what information is needed, decide how
to present their findings, etc. Project-based learning, such as in capstone design courses and senior labs [10,11], is one alternative that allows room for more student decision-making, but these opportunities are often limited in undergraduate curricula.
A principal challenge in teaching problem-solving is that one cannot teach what one cannot
measure, and real-world problem-solving skills are difficult to measure. As discussed above,
textbook problems may not measure real-world problem-solving skills. Design projects and other
long-term projects may be more realistic measures of problem-solving practice, but they are
impractical for large-scale assessment. Indeed, the National Academies are calling for the
development and widespread use of research-based assessments to better measure student
understanding and skills in undergraduate programs [12].
There are some existing instruments that aim to measure problem-solving, but they each have
limitations. Some attempt to measure problem-solving using puzzle-like challenges that do not rely on any scientific or engineering content knowledge [13]. These assessments thus do not measure how one solves engineering problems, and it is not clear to what extent problem-solving on these assessments correlates with problem-solving in real-world practice [14,15,16]. Other “critical-thinking” or aptitude assessments are commercially available, but due to their proprietary nature, the evidence for their reliability and validity is not widely available [17,18,19]. To date, there are almost no assessments of problem-solving in engineering.
Our aim in this paper is to describe the development and pilot-testing of an assessment of
problem-solving in engineering. We chose heat transfer as the context for our assessment so that
it would be useful to educators in a wide variety of engineering disciplines.
Theory
Problem-solving has long been studied in engineering education research [12]. Early studies are based on information-processing models, which posit a step-by-step approach to problem-solving [20]. These models consider how knowledge is represented, the role of background knowledge, and limits on working memory. Other early studies are grounded in constructivism [21] or socioconstructivism [22,23]. One limitation of these frameworks is that they are often not grounded in direct empirical evidence.
Cognitive systems engineering [24], grounded in naturalistic decision-making theory [25], incorporates both cognitive and ecological (contextual) elements of problem-solving and is grounded in empirical evidence. This work often studies how skilled practitioners make critical decisions when solving specific problems in real-world settings. Based on this work, researchers have developed a framework for characterizing expert problem-solving in science and engineering that describes problem-solving as a set of 29 decisions-to-be-made [26]. How these decisions are made is argued to be highly context dependent and to draw upon deep disciplinary knowledge.
In addition, Price et al. [27] have developed a general template for assessing these decisions.
The researcher chooses an authentic context and structures the problem so that the solver is asked
to make a relevant subset of the 29 decisions to solve the problem. The basic components of the
framework are 1) provide an authentic problem context, 2) ask a series of questions that require
test-takers to make decisions about problem definition and planning how to solve, 3) provide
more information, 4) ask a series of questions that require decisions about interpreting
information and drawing conclusions, and 5) have test-takers choose and reflect on their solution.
One of the authors has previously developed such an assessment in chemical process design [28]. One important feature of these assessments is that students are not graded on a scale that compares them to one another; rather, their responses are compared to a consensus of experts. This represents a philosophical shift in educational assessment. As described in Adams and Wieman [29], the development phase of the assessment includes student and expert pilot-testing as an integral part of rubric development. We describe this process in greater detail below.
Assessment
When choosing the context for the heat transfer assessment, we wanted to ensure (1) the relevant
physics includes the key concepts in heat transfer (e.g., convection and conduction), (2) any
additional physics should be simple enough to explain, and (3) a professor who does research in
heat transfer, or someone from industry who regularly uses heat transfer to design systems, should be considered “expert enough” to solve the problem. The physical context we chose is
the countercurrent heat exchange between arteries and veins in the human finger. This
countercurrent heat exchange mechanism reduces the degree to which the returning venous blood
must be warmed by the core, at the expense of a lower average temperature in the
extremities.
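To give a rough sense of the physics involved, the sketch below is a back-of-the-envelope illustration (it is not part of the assessment and is not one of the two models given to participants) of how countercurrent exchange pre-warms returning venous blood. It uses the standard counterflow heat-exchanger effectiveness relation for balanced capacity rates, eps = NTU/(1 + NTU); all parameter values are assumed for illustration only.

```python
# Back-of-the-envelope illustration of countercurrent heat exchange in a finger.
# Not part of the assessment; all parameter values are rough assumptions.

m_dot = 1.0e-6      # blood mass flow rate through the finger, kg/s (assumed)
c_p = 3600.0        # specific heat of blood, J/(kg*K) (approximate)
UA = 2.0e-3         # artery-vein thermal conductance, W/K (assumed)
T_artery_in = 36.0  # arterial blood entering the finger, deg C
T_vein_in = 25.0    # venous blood leaving the fingertip, deg C

# Counterflow heat-exchanger effectiveness for balanced capacity rates.
NTU = UA / (m_dot * c_p)
eps = NTU / (1.0 + NTU)

# Venous blood is pre-warmed by the artery before returning to the core,
# reducing how much the core must warm it.
T_vein_out = T_vein_in + eps * (T_artery_in - T_vein_in)

print(f"NTU = {NTU:.2f}, effectiveness = {eps:.2f}")
print(f"venous return temperature = {T_vein_out:.1f} deg C (vs. {T_vein_in:.1f} without exchange)")
```

With these assumed values the venous blood returns a few degrees warmer than it leaves the fingertip, which is the effect the assessment asks participants to model.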
We situate the participant of the assessment as a member of an engineering team responsible for
the design and development of a thermal regulation system for the next generation of space suits.
The preliminary step for the engineering team is to develop a quantitative model of the body’s
natural thermal regulation mechanisms, particularly in extremities. The model should capture the
effect of countercurrent heat exchange between arteries (warmer blood) and veins (cooler blood)
in a finger. The assessment begins broadly by asking the participant how they would model the
heat transfer in the finger. Then, it asks what assumptions the participant would make to simplify
the problem, what information they need to solve the problem, and what variables they think the
findings will be most sensitive to. These questions prompt the participant to identify key features
of the problem and think about possible models. We then provide the participant with a model
proposed by their colleague for the heat exchange in the finger and ask more detailed questions
about the feasibility of the model to draw the participant’s attention to some important features of
the model, such as the geometry. We then ask the participant what information they would need to
further evaluate the model. Next, we provide a revised, more detailed model and ask the
participant to give further feedback. We then ask some more detailed questions about the second
model, again to draw the participant’s attention to important features of the model, such as the
boundary conditions and governing equations. The schematics provided at the beginning of the
assessment and for the two models are included in Figure 1. After asking questions about the two
proposed models, we provide a comparison of experimental measurements and simulation results
from the second model and ask the participant whether they think the model matches the
experimental data. Finally, we ask the participant to compare the two proposed models and
discuss how they would model the phenomenon given the two models.
Figure 1: Schematics in the heat transfer assessment. (a) Blood vessels in a human hand (from the Complete Anatomy App [30]), (b) resistor network model (Model 1), (c) finite element model (Model 2).
Figure 2 provides a summary of the sequence of provided information and questions in the heat
transfer assessment. The color coding reflects the decisions relevant to the assessment that are
part of the 29 decisions identified by Price et al. [26]. Each item in the table in the middle of the
figure represents either provided information or a question to be answered, color coded according
to the template item type. Items with multiple color codes indicate that multiple decisions were
probed in one question. For items that are half yellow, information was provided in the context of a question being asked.
Figure 2: Assessment sequence with color coding for decision probed. Each item represents either
provided context/information or a question to be answered, color coded according to the template item
type. Items with multiple color codes indicate that multiple decisions were probed in one question (or
where half-yellow, information was provided in the context of a question being asked).
After developing the assessment, we first asked former teaching assistants in heat transfer courses
to take the assessment to ensure that the questions were being interpreted the way we intended
and that the prompts were adequately capturing the thought process of the solver [29]. The
assessment was then pilot tested with 12 experts (faculty members who do research in heat
transfer, as well as professional engineers who design heat exchangers) and 12 undergraduate
students. The experts were either acquaintances of members of the research team or volunteers
recruited through email lists of The American Physical Society (APS) and American Society for
Engineering Education (ASEE). The students were undergraduate students who were taking or
had recently taken a heat transfer class in mechanical engineering at the time of the assessment.
We used think-aloud interviews [31] with both the experts and students during pilot-testing. The
participants were given a link to the Qualtrics survey for the assessment and were asked to
complete the assessment and talk about their thought process during the interview. The responses
in Qualtrics, interview transcripts, and audio files were used during the data analysis.
Analysis
We used data from the expert responses to create a scoring rubric for the assessment. Students
were graded based on how well their responses agreed with the experts. We started by analyzing
the three closed-response, multiple-choice questions: assumptions, information needed, and
sensitivity (numbers 3, 4, and 5 in Fig. 2). As an example, for the sensitivity question (item 5 in
the assessment), the participants were asked to choose five out of nine variables that they thought
the findings would be the most sensitive to. Choices with stronger expert consensus (either for or
against) were given more weight in the total score than choices on which experts disagreed or
highlighted different key features. The score for each question was calculated using the following
formula:
score =
X
items
(2 ∗ response − 1) ∗ (%consensus − 50%). (1)
For Item 3 (assumptions) and Item 5 (sensitivity), the possible answers to each item are TRUE and FALSE, which translate to values of 1 and 0, respectively. For Item 4 (information needed), the possible answers to each option are Definitely Want, Might Want, and Don’t Need, with values of 1, 0.5, and 0, respectively. The expert consensus for an item is calculated by summing the expert response values and dividing by the number of experts. According to Equation 1, if the %consensus for an item in the assumptions question is 75% and a student selected TRUE for that assumption, 0.25 is added to their score; if the student selected FALSE, 0.25 is subtracted. For items on which experts disagree (i.e., the %consensus is close to 50%), this weighting scheme ensures that the item makes only a very small contribution to the total score.
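As a concrete illustration, the short Python sketch below implements the scoring rule in Equation 1. It is a minimal sketch rather than the actual analysis code, and the item name and expert responses in the example are hypothetical.

```python
# Minimal sketch of the consensus-weighted scoring in Equation 1 (not the
# actual analysis code). Responses are on a 0-1 scale: TRUE = 1, FALSE = 0;
# Definitely Want = 1, Might Want = 0.5, Don't Need = 0.

def question_score(student_responses, expert_responses):
    """student_responses: dict mapping item -> the student's value.
    expert_responses: dict mapping item -> list of expert values for that item."""
    score = 0.0
    for item, response in student_responses.items():
        # Expert consensus = mean of the expert response values for this item.
        consensus = sum(expert_responses[item]) / len(expert_responses[item])
        # Items with consensus near 50% contribute almost nothing to the score.
        score += (2 * response - 1) * (consensus - 0.5)
    return score

# Hypothetical item: 9 of 12 experts selected TRUE (75% consensus), so a student
# selecting TRUE gains 0.25 and a student selecting FALSE loses 0.25.
experts = {"neglect radiation": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]}
print(question_score({"neglect radiation": 1}, experts))   # +0.25
print(question_score({"neglect radiation": 0}, experts))   # -0.25
```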
To develop the rubric for the open-response questions, we needed to code the expert responses to
identify themes of answers that experts agree on. We used an emergent scheme and did three
rounds of coding, with 3 coders coding separately then discussing the codes for the first two
rounds and 2 coders coding separately then discussing the codes for the third round. The first
round of coding focused on counting codes separately for each question. The second round of
coding focused on the decisions experts and students made when responding to each question.
From the first two rounds of coding, we found that many codes applied to multiple questions, so we grouped questions that received similar answers when counting the codes: Items 7-11 for
feedback on Model 1, Items 13-17 for feedback on Model 2, Items 19 & 22 for feedback and
questions about the experimental data, and Items 20 & 21 for feedback and questions about the
simulation results. We then used the expert responses to compile a list of key themes mentioned by multiple experts. For example, the experts overwhelmingly mentioned that
Model 1 (resistor network model) was missing the temperature dependence in the axial direction.
The lists of codes for these groups of questions are included in Table 1.
Items 8, 14, and 21 explicitly asked the participants a yes/no question about the feasibility or
validity of the model. Item 8 asked whether the colleague will successfully investigate the role of
countercurrent heat exchange in the finger using the resistor network model. Out of the 12 experts
interviewed, 7 experts explicitly said the colleague won’t be successful because the first model
has limitations. One expert said the model is a good start, two experts said the model is OK, and
two experts didn’t explicitly answer the question. A similar level of consensus was achieved for
Items 14 and 21. For Item 14, the majority of experts explicitly stated that the colleague will be
successful using the finite element model; for Item 21, the majority of experts explicitly stated
that the simulation results don’t agree with the experimental data.
The other codes listed in Table 1 are derived either from general questions asking the participant for feedback or from questions about specific aspects of the model, such as the geometry and boundary conditions. The open-response nature of these questions results in high variability: experts can choose to mention many different features, so even a few experts mentioning the same code indicates that it is important. Thus, we considered codes that were mentioned by at least 3 experts to be expert consensus items. The other items mentioned by experts were labeled as expert non-consensus items.
Students were then graded based on how well their responses agreed with the experts. Students
were rewarded for highlighting expert consensus items, and penalized for highlighting extraneous
features that experts did not consider to be important. Not all non-consensus items mentioned by students were considered extraneous. In the feedback on Model 1, one student stated that the resistors should be in parallel; this is incorrect, so the response was labeled extraneous. In contrast, when responding to Item 9, “are there important features missing from the model”, one student asked for more material properties. Although this was not mentioned by any of the experts, we did not label the response as extraneous because it is a reasonable comment consistent with a correct understanding of the model. For each group of questions, students were awarded one point for
each expert consensus item and penalized 0.5 points for each extraneous item:
\text{score} = N_{\text{expert consensus items}} - 0.5\,N_{\text{extraneous items}}. \quad (2)
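The rubric scoring of the coded open responses can be sketched as follows. This is a minimal illustration of Equation 2, not the actual analysis code; the code strings are abbreviated versions of the rubric items in Table 1, and the example student is hypothetical.

```python
# Minimal sketch of the rubric scoring in Equation 2 (not the actual analysis
# code). Rubric codes below are abbreviated versions of the Table 1 items.

MODEL1_CONSENSUS = {
    "feasibility: not successful",
    "missing axial dependence",
    "lumped flows / missing capillaries",
    "missing internal convective resistance",
    "unclear boundary conditions",
}

def group_score(student_codes, extraneous_codes):
    """student_codes: set of rubric codes assigned to one student's responses
    for a question group. One point per expert-consensus code mentioned,
    minus 0.5 per extraneous code mentioned."""
    n_consensus = len(student_codes & MODEL1_CONSENSUS)
    n_extraneous = len(student_codes & extraneous_codes)
    return n_consensus - 0.5 * n_extraneous

# Hypothetical student who identified the missing axial dependence but also
# made the incorrect "resistors should be in parallel" comment:
student = {"missing axial dependence", "resistors should be in parallel"}
extraneous = {"resistors should be in parallel"}
print(group_score(student, extraneous))  # 1 - 0.5 = 0.5
```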
When coding the questions to identify expert-consensus feedback items, we coded Questions 19
& 22 and Questions 20 & 21 separately, because 19 & 22 were about the experiment and data
collection while 20 & 21 were about the model prediction and validation. However, when
calculating the scores, we grouped questions 19, 20, 21 and 22 together because the questions
asked participants to compare the experimental results with the simulation results and thus belong
to the same overall category. Items 2, 23, and 24 were not coded because there was no clear
expert or student consensus, and interview audio files suggested that experts did not interpret
those questions in the manner intended. These questions will be eliminated or significantly
revised in future iterations.
There were a few experts and students who did not complete all of the questions in the Qualtrics
survey. In those cases, we used the recorded interview to fill in the missing responses. After using
this approach, we had only one missing response, from one student, for one of the options in Item 4 (information needed). We left that response blank when calculating the score for that student so the
student did not gain or lose points for the missing response. One expert misinterpreted the
questions about the two models and the experimental data and two experts had incomplete
responses to questions about Model 1 and questions about experiments and simulation. Their
responses were excluded when calculating the statistics and plotting the scores in the Results
section.
Table 1: Rubric for scoring the assessment based on responses from eleven experts. The feasibility and
validity items were mentioned by at least seven experts. The other rubric items were mentioned by at
least three experts. The contributing questions refer to the items in Figure 2.
Topic Area: Feedback on Model 1 (Resistor Network Model); Contributing Questions: 7-11
• Feasibility: No, colleague won’t be successful using this model (a)
• Missing axial dependence
• Problems with lumped flows or missing capillaries
• Missing internal convective resistance
• Unclear boundary conditions (T_a, T_v not specified)

Topic Area: Feedback on Model 2 (Finite Element Model); Contributing Questions: 13-17
• Feasibility: Yes, colleague will be successful using this model (b)
• Remove transient term in equation
• Neglect circumferential conduction (in the theta direction)
• Neglect axial conduction
• Difficulty in estimating external convection coefficient
• Problems with lumped flows or missing capillaries

Topic Area: Feedback on Experiment and Simulation
Questions about experiment and data collection; Contributing Questions: 19 & 22
• Asking about error bars
• Asking about measurement probe
• Asking about experimental control: environment
• Asking about subject differences
Questions about model prediction and validation; Contributing Questions: 20 & 21
• Validity: No, model doesn’t match experimental data (c)
• Large temperature mismatch between model and experiment
• Asking about error bars in simulation result
• Asking about model parameter: perfusion rate

(a) Number of experts who explicitly stated the colleague won’t be successful : number who explicitly stated the colleague will be successful = 7 : 2
(b) Number of experts who explicitly stated the colleague will be successful : number who explicitly stated the colleague won’t be successful = 7 : 1
(c) Number of experts who explicitly stated the model doesn’t match the experimental data : number who explicitly stated the model matches the experimental data = 8 : 0
Results
For the three closed-response questions, the average scores of the students are all lower than those of the experts, and the standard deviations of the student scores are larger than those of the experts.
The sensitivity question is the best at distinguishing between expert and student responses, with
the average score for students two standard deviations below the average score for experts. The
assumptions question has the smallest separation between the expert and student performances,
with the average score for students only half of one standard deviation below the average score
for experts. The descriptive statistics for the closed-response questions are included in Table 2.
The maximum possible score was calculated by selecting the expert consensus choice for all of
the options. The number of items indicates the number of sub-questions (or items) in a question.
For example, in the information question, 32 different pieces of information were listed (such as
outside air temperature and average blood temperature in the vein) and the participant was asked
to select from Definitely Want, Might Want, and Don’t Need for each piece of information.
Table 2: Descriptive statistics for expert and student scores on the closed-response questions (Items 3-5).

|                              | Assumptions (Item 3) | Information (Item 4) | Sensitivity (Item 5) |
| Max Possible Score           | 1.58                 | 7.75                 | 2.00                 |
| Min Possible Score           | -1.58                | -7.75                | -2.00                |
| Number of Items              | 10                   | 32                   | 9                    |
| Experts: Number of Subjects  | 12                   | 12                   | 12                   |
| Experts: Average             | 0.99                 | 5.07                 | 1.25                 |
| Experts: Standard Dev.       | 0.60                 | 0.88                 | 0.38                 |
| Students: Number of Subjects | 12                   | 12                   | 12                   |
| Students: Average            | 0.75                 | 4.00                 | 0.49                 |
| Students: Standard Dev.      | 0.75                 | 1.73                 | 0.63                 |
To compare the performance of experts and students across these three questions that were on
different scales, we rescaled the expert and student scores using the expert average and expert
standard deviation:
\text{scaled score} = \frac{\text{score} - \text{mean}(\text{expert scores})}{\text{stdev}(\text{expert scores})}. \quad (3)
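The rescaling in Equation 3 simply expresses every score in units of expert standard deviations above or below the expert mean. A minimal sketch, with hypothetical numbers:

```python
import statistics

# Minimal sketch of the rescaling in Equation 3 (not the actual analysis code):
# scores are expressed in units of expert standard deviations from the expert mean.

def scale_scores(scores, expert_scores):
    mu = statistics.mean(expert_scores)
    sigma = statistics.stdev(expert_scores)   # sample standard deviation
    return [(s - mu) / sigma for s in scores]

expert_scores = [1.2, 0.9, 1.5, 1.1, 1.3]     # hypothetical expert scores
student_scores = [0.4, 0.8, 1.0]              # hypothetical student scores
print(scale_scores(student_scores, expert_scores))  # negative values fall below the expert mean
```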
The scaled scores for Assumptions, Information, and Sensitivity are plotted in Figure 3. The box
encompasses the 25th to the 75th percentiles of each score. Note that the thick center line in the
box plot indicates the median, and thus may differ from zero.
One of the reasons why the assumptions question and information needed question have low
separation between experts and students is that different experts had different interpretations of
the assumptions and information. For example, “ignore axial heat transfer” is ambiguous. Several
experts didn’t make this assumption because axial advection (heat transported by the flow of
warm blood) shouldn’t be ignored. However, in the feedback to Model 2, several experts
recommended ignoring axial conduction. If we changed the wording of the assumptions question to “ignore axial conduction” or “ignore axial advection”, we would likely get better consensus among experts. Furthermore, experts won’t necessarily make a simplifying assumption if they think the data can be easily obtained. For example, for “assume specific heat and density of blood and flesh are the same” and “assume thermal conductivity of blood and flesh are the same”, some experts didn’t make these assumptions because they thought the data would be easy to find.
Figure 3: Comparison of scaled expert and student scores of closed-response questions. The scores
are scaled by subtracting the average expert score then dividing by the expert standard deviation. The
median student score is lower than the median expert score on all items. The sensitivity question shows
the clearest differentiation between students and experts.
For the open-response questions, we calculated the scores for three groups of questions: Model 1,
Model 2, and experiment & simulation. The statistics of the expert and student scores are
included in Table 3. The maximum possible score is the total number of expert consensus items
for each group of questions. The number of subjects differs because some expert responses were
missing or discarded, as described in the Analysis section. For all of the open-response question groups, the average scores for the students are lower than those of the experts. Model 2 questions
have the largest separation between expert and student performances, with the average student
score more than one standard deviation below that of the experts. Model 1 questions and
experiment & simulation questions have smaller separation between expert and student
performances, with the average student score less than one standard deviation below that of the
experts. Importantly, although the average scores for Model 1 and Model 2 questions are very
similar for experts, the average Model 2 student score is only one third of the average Model 1
student score. Model 2 is the finite element model, which is more complicated than the resistor network model, and students do not perform as well on it.
We plotted the scaled scores in Figure 4. The scores are rescaled according to Eqn. 3. One
striking feature of the box plot is that for Model 2 questions, the 75th percentile of the student
score is lower than the 25th percentile of the expert score, again highlighting the large separation
between experts and students for questions regarding the finite element model.
Out of the 12 students in the pilot study, only two students mentioned “remove transient term”, and only one student mentioned “neglect axial conduction” for Model 2. This implies that students are not very good at making simplifying assumptions for the finite element model. For Model 1, however, students are much better at identifying the model flaw: “missing axial dependence”, with 6 students mentioning this in their responses.

Table 3: Descriptive statistics for expert and student scores on open-response questions.

|                              | Model 1 Questions (Items 7-11) | Model 2 Questions (Items 13-17) | Experiment and Simulation (Items 19-22) |
| Max Possible Score           | 5    | 6    | 8    |
| Experts: Number of Subjects  | 9    | 11   | 9    |
| Experts: Average             | 2.39 | 2.41 | 3.89 |
| Experts: Standard Dev.       | 1.27 | 1.41 | 1.27 |
| Students: Number of Subjects | 12   | 12   | 12   |
| Students: Average            | 1.54 | 0.50 | 2.83 |
| Students: Standard Dev.      | 1.22 | 0.71 | 1.47 |

Figure 4: Comparison of scaled expert and student scores of open-response questions. The scores are scaled by subtracting the average expert score and then dividing by the expert standard deviation. The median student score is lower than the median expert score for all question groups. The Model 2 questions show the clearest differentiation between students and experts.
Discussion
Across the three closed-response questions and the three groups of open-response questions,
students on average score lower than experts. In particular, the sensitivity question and Model 2
questions have the largest separation between expert performance and student performance.
Although the questions for Model 1 and Model 2 are nearly identical, with the only difference
being the specific feature highlighted in Items 10 and 17, the average Model 2 student score is
only one third of the average Model 1 student score. For Model 1, all of the expert consensus
codes point out flaws in the model (e.g., “missing axial dependence”), whereas only one of the
five expert consensus codes for Model 2 is about model flaws (“problems with lumped flows or
missing capillaries”). The consensus codes for model flaws correspond to the decision “what are
the important underlying features?” in the list of decisions described by Price et al. [26]. The
majority of the expert consensus codes for Model 2 suggest simplifications for the model (e.g.,
“remove transient term in equation”), which corresponds to the decision “what approximations
and simplifications to make?” in the list of decisions described by Price et al. [26]. The differences in
student performance for Model 1 and Model 2 questions suggest that students are good at
identifying important missing features, but need more practice making appropriate
simplifications. We hypothesize this is related to what decisions students had opportunities
practicing in the heat transfer course. Specifically, most textbook problems require students to use
equations to model a heat transfer problem and students are given feedback from the professor on
whether the equations and calculations are correct (i.e., if they missed any key features). However,
students are rarely given models that are correct but too complicated and are asked to simplify the
model. Results from the pilot study suggests that in future courses in heat transfer, students need
to be given more opportunities to practice a variety of the problem-solving decisions, such as how
to make assumptions, how to make simplifications and approximations.
The analysis of responses to the closed-response and open-response questions reveals differences
in the development of the predictive framework26
by the experts and students. In particular, for
the sensitivity question, the average score for experts is higher than that for the students and the
responses across the experts are more consistent. This is because experts have more experience
solving authentic problems which build better intuition about what features one should pay close
attention to.
Limitations and Future Work
Although the current heat transfer assessment was able to distinguish between problem-solving
skills of experts and students, several aspects of the assessment need to be changed to improve the
level of consensus among the experts and make the questions less ambiguous. First, as mentioned
in the Results section, some of the closed-response items are ambiguous and result in different
interpretations by different experts. Second, the items in the assumptions question are all
reasonable assumptions and don’t include any options that are obviously wrong. In the next
version of the assessment, we will revise the items in the closed-response questions to remove
ambiguity and include a few assumptions, sensitivity variables, and information needed
mentioned by students that are obviously wrong. These items either come from the interview
transcripts corresponding to the closed-response questions or the extraneous items we coded for
the open-response questions. The advantage of including wrong options mentioned by students is that experts are very unlikely to choose these options while students are likely to, resulting in a larger separation between expert and student performance on the closed-response questions. Third, for the open-response questions, we plan to remove repetitive
questions that generated similar responses from the participants to make the assessment
shorter.
In addition to revising the assessment and doing pilot studies with experts and students using the
revised assessment, we also plan to conduct pre- and post-tests in undergraduate heat transfer courses to study how much students’ problem-solving skills improve after taking the course. The
assessment will be included as one of the questions on the first homework and last homework in
the course. Furthermore, we plan to develop a closed-response version of all of the questions in
the assessment to automate the scoring of the assessment. This will significantly reduce the time
needed to analyze the student responses and enable a wider distribution of the assessment.
Responses from initial pilot testing, think-aloud interviews with students, and expert responses
will be used to replace the open-response questions with “choose-many” multiple-choice
questions. This will ensure that the questions include reasonable distractors generated by students
as well as answers chosen by a consensus of experts.
Further refinement of this assessment and wide distribution in classrooms will be an important
advance in engineering education research, as it will provide a reliable way to measure
problem-solving—an important, but not guaranteed outcome of an engineering education. It can
be used to answer many different research questions, e.g., are there differences in outcomes between traditional heat transfer courses and heat transfer courses that focus on developing problem-solving skills? It can also be used as an assessment tool for various engineering
departments to decide whether their undergraduate programs are adequately preparing students
for the workplace. Improving our ability to measure problem-solving is an important step in being
able to improve the way we teach problem-solving to undergraduate students and prepare them
for engineering careers. We hope to encourage other educators to use this assessment in their
courses to measure how well they are preparing their students to solve real-world engineering
problems.
References
[1] ABET, “Criteria for accrediting engineering programs, 2019 – 2020,” 2019.
[2] C. L. Dym, “Design, systems, and engineering education,” International Journal of Engineering Education,
vol. 20, no. 3, pp. 305–312, 2004.
[3] C. L. Dym, A. M. Agogino, O. Eris, D. D. Frey, and L. J. Leifer, “Engineering design thinking, teaching, and
learning,” Journal of Engineering Education, vol. 94, no. 1, pp. 103–120, 2005.
[4] D. Jonassen, J. Strobel, and C. B. Lee, “Everyday problem solving in engineering: Lessons for engineering
educators,” Journal of Engineering Education, vol. 95, no. 2, pp. 139–151, 2006.
[5] D. H. Jonassen, “Designing for decision making,” Educational Technology Research and Development, vol. 60,
no. 2, pp. 341–359, 2012.
[6] N. Shin, D. H. Jonassen, and S. McGee, “Predictors of well-structured and ill-structured problem solving in an
astronomy simulation,” Journal of Research in Science Teaching, vol. 40, no. 1, pp. 6–33, 2003.
[7] H. J. Passow, “Which ABET competencies do engineering graduates find most important in their work?”
Journal of Engineering Education, vol. 101, no. 1, pp. 95–118, 2012.
[8] QS Quacquarelli Symonds, “The global skills gap in the 21st century,” 2018. [Online]. Available:
https://www.qs.com/portfolio-items/the-global-skills-gap-in-the-21st-century/
[9] C. Grant and B. Dickson, “Personal skills in chemical engineering graduates: the development of skills within
degree programmes to meet the needs of employers,” Education for Chemical Engineers, vol. 1, no. 1, pp.
23–29, 2006.
[10] A. J. Dutson, R. H. Todd, S. P. Magleby, and C. D. Sorensen, “A review of literature on teaching engineering
design through project-oriented capstone courses,” Journal of Engineering Education, vol. 86, no. 1, pp. 17–28,
1997.
[11] C. L. Dym, “Learning engineering: Design, languages, and experiences,” Journal of Engineering Education,
vol. 88, no. 2, pp. 145–148, 1999.
[12] National Research Council, “Discipline-based education research: Understanding and improving learning in undergraduate
science and engineering,” Washington, DC: The National Academies, 2012.
[13] W. K. Adams, “Development of a problem solving evaluation instrument; untangling of specific problem
solving skills,” Unpublished Doctoral Dissertation, University of Colorado, 2007.
[14] C. J. Harris, J. S. Krajcik, J. W. Pellegrino, and A. H. DeBarger, “Designing knowledge-in-use assessments to
promote deeper learning,” Educational Measurement: Issues and Practice, vol. 38, no. 2, pp. 53–67, 2019.
[15] F. Fischer, C. A. Chinn, K. Engelmann, and J. Osborne, Scientific Reasoning and Argumentation: The Roles of
Domain-specific and Domain-general Knowledge. Routledge, 2018.
[16] P. Kind and J. Osborne, “Styles of scientific reasoning: A cultural rationale for science education?” Science
Education, vol. 101, no. 1, pp. 8–31, 2017.
[17] ACT CAAP Technical Handbook. Iowa City, IA: ACT, 2007.
[18] S. Klein, R. Benjamin, R. Shavelson, and R. Bolus, “The collegiate learning assessment: Facts and fantasies,”
Evaluation Review, vol. 31, no. 5, pp. 415–439, 2007.
[19] Educational Testing Service (ETS), MAPP User’s Guide. Princeton, NJ: Educational Testing Service, 2007.
[20] D. P. Simon and H. A. Simon, “Individual differences in solving physics problems,” in Children’s Thinking:
What Develops?, R. S. Siegler, Ed. Lawrence Erlbaum Associates, Inc, 1978, pp. 325–348.
[21] J. Piaget, Success and Understanding. Cambridge, MA: Harvard University Press, 1978.
[22] J. Lave and E. Wenger, Situated Learning: Legitimate Peripheral Participation. Cambridge university press,
1991.
[23] L. B. Resnick, “Shared cognition: Thinking as social practice,” in Perspectives on Socially Shared Cognition,
L. B. Resnick, J. M. Levine, and S. D. Teasley, Eds. American Psychological Association, 1991, pp. 1–20.
[24] G. Lintern, B. Moon, G. Klein, and R. R. Hoffman, “Eliciting and representing the knowledge of experts,” in
The Cambridge Handbook of Expertise and Expert Performance, 2nd ed., K. A. Ericsson, R. R. Hoffman,
A. Kozbelt, and A. M. Williams, Eds. Cambridge, UK: Cambridge University Press, 2018, pp. 165–191.
[25] K. Mosier, U. Fischer, R. R. Hoffman, and G. Klein, “Expert professional judgments and ‘naturalistic decision making’,” in The Cambridge Handbook of Expertise and Expert Performance, 2nd ed., K. A. Ericsson, R. R.
Hoffman, A. Kozbelt, and A. M. Williams, Eds. Cambridge, UK: Cambridge University Press, 2018, pp.
453–475.
[26] A. M. Price, C. J. Kim, E. W. Burkholder, A. V. Fritz, and C. E. Wieman, “A detailed characterization of the
expert problem-solving process in science and engineering: Guidance for teaching and assessment,” CBE—Life
Sciences Education, vol. 20, no. 3, p. ar43, 2021.
[27] A. M. Price, E. W. Burkholder, S. Salehi, C. J. Kim, V. Isava, M. P. Flynn, and C. E. Wieman, “An accurate and
practical method for measuring science and engineering problem-solving expertise,” Submitted.
[28] E. Burkholder, A. Price, M. Flynn, and C. Wieman, “Assessing problem solving in science and engineering
programs,” in Proceedings of the Physics Education Research Conference, 2019.
[29] W. K. Adams and C. E. Wieman, “Development and validation of instruments to measure learning of
expert-like thinking,” International Journal of Science Education, vol. 33, no. 9, pp. 1289–1312, 2011.
[30] Complete Anatomy App. [Online]. Available: https://3d4medical.com/press-category/complete-anatomy
[31] K. A. Ericsson and H. A. Simon, “Verbal reports as data,” Psychological Review, vol. 87, no. 3, p. 215, 1980.

More Related Content

Similar to Assessing Authentic Problem-Solving In Heat Transfer

1936 teaching material_and_energy_balances_to
1936 teaching material_and_energy_balances_to1936 teaching material_and_energy_balances_to
1936 teaching material_and_energy_balances_to
chandro57
 
Ibdp physics exetended essay and army ppt.pptx
Ibdp physics exetended essay and army ppt.pptxIbdp physics exetended essay and army ppt.pptx
Ibdp physics exetended essay and army ppt.pptx
Aarti Akela
 
RMIP-21RMI56-Module 1.docx for electronics
RMIP-21RMI56-Module 1.docx for electronicsRMIP-21RMI56-Module 1.docx for electronics
RMIP-21RMI56-Module 1.docx for electronics
shazmeentuba11
 
Conceptual problem solving in high school physicsJennifer .docx
Conceptual problem solving in high school physicsJennifer .docxConceptual problem solving in high school physicsJennifer .docx
Conceptual problem solving in high school physicsJennifer .docx
donnajames55
 
Lucio_3300_L8-RP copy
Lucio_3300_L8-RP copyLucio_3300_L8-RP copy
Lucio_3300_L8-RP copy
Isaac Lucio
 

Similar to Assessing Authentic Problem-Solving In Heat Transfer (20)

NGSS Active Physics Alignment by chapter updated 6/1/15
NGSS Active Physics Alignment by chapter updated 6/1/15NGSS Active Physics Alignment by chapter updated 6/1/15
NGSS Active Physics Alignment by chapter updated 6/1/15
 
NGSS Active Physics Alignment by Chapter updated 6/3/15
NGSS Active Physics Alignment by Chapter updated 6/3/15NGSS Active Physics Alignment by Chapter updated 6/3/15
NGSS Active Physics Alignment by Chapter updated 6/3/15
 
NGSS Active Physics Alignment by chapter updated 6/1
NGSS Active Physics Alignment by chapter updated 6/1NGSS Active Physics Alignment by chapter updated 6/1
NGSS Active Physics Alignment by chapter updated 6/1
 
A New Educational Thermodynamic Software to Promote Critical Thinking in Yout...
A New Educational Thermodynamic Software to Promote Critical Thinking in Yout...A New Educational Thermodynamic Software to Promote Critical Thinking in Yout...
A New Educational Thermodynamic Software to Promote Critical Thinking in Yout...
 
Quartet2
Quartet2Quartet2
Quartet2
 
1936 teaching material_and_energy_balances_to
1936 teaching material_and_energy_balances_to1936 teaching material_and_energy_balances_to
1936 teaching material_and_energy_balances_to
 
Ibdp physics exetended essay and army ppt.pptx
Ibdp physics exetended essay and army ppt.pptxIbdp physics exetended essay and army ppt.pptx
Ibdp physics exetended essay and army ppt.pptx
 
14. Students Difficulties In Practicing Computer-Supported Data Analysis So...
14. Students  Difficulties In Practicing Computer-Supported Data Analysis  So...14. Students  Difficulties In Practicing Computer-Supported Data Analysis  So...
14. Students Difficulties In Practicing Computer-Supported Data Analysis So...
 
Fluid-Machinery-1.pdf this document is used for cput lecture in mechanical e...
Fluid-Machinery-1.pdf this document is used for cput  lecture in mechanical e...Fluid-Machinery-1.pdf this document is used for cput  lecture in mechanical e...
Fluid-Machinery-1.pdf this document is used for cput lecture in mechanical e...
 
MarchMtg17.pptx
MarchMtg17.pptxMarchMtg17.pptx
MarchMtg17.pptx
 
Big picture of electronics and instrumentation engineering
Big picture of electronics and instrumentation engineeringBig picture of electronics and instrumentation engineering
Big picture of electronics and instrumentation engineering
 
EDUC 4762 Assignment 4.3
EDUC 4762 Assignment 4.3EDUC 4762 Assignment 4.3
EDUC 4762 Assignment 4.3
 
An Exercise to Promote and Assess Critical Thinking in Sociotechnical Context...
An Exercise to Promote and Assess Critical Thinking in Sociotechnical Context...An Exercise to Promote and Assess Critical Thinking in Sociotechnical Context...
An Exercise to Promote and Assess Critical Thinking in Sociotechnical Context...
 
RMIP-21RMI56-Module 1.docx for electronics
RMIP-21RMI56-Module 1.docx for electronicsRMIP-21RMI56-Module 1.docx for electronics
RMIP-21RMI56-Module 1.docx for electronics
 
NGSS Active Physics Alignment by Performance Expectation updated 6/1
NGSS Active Physics Alignment by Performance Expectation updated 6/1NGSS Active Physics Alignment by Performance Expectation updated 6/1
NGSS Active Physics Alignment by Performance Expectation updated 6/1
 
NGSS Active Physics Alignment by Performance Expectation updated 6/1/15
NGSS Active Physics Alignment by Performance Expectation updated 6/1/15NGSS Active Physics Alignment by Performance Expectation updated 6/1/15
NGSS Active Physics Alignment by Performance Expectation updated 6/1/15
 
NGSS Active Physics Alignment by Performance Expectation Updated 6-3
NGSS Active Physics Alignment by Performance Expectation Updated 6-3NGSS Active Physics Alignment by Performance Expectation Updated 6-3
NGSS Active Physics Alignment by Performance Expectation Updated 6-3
 
Conceptual problem solving in high school physicsJennifer .docx
Conceptual problem solving in high school physicsJennifer .docxConceptual problem solving in high school physicsJennifer .docx
Conceptual problem solving in high school physicsJennifer .docx
 
Lucio_3300_L8-RP copy
Lucio_3300_L8-RP copyLucio_3300_L8-RP copy
Lucio_3300_L8-RP copy
 
Grad Sem PPT.pptx
Grad Sem PPT.pptxGrad Sem PPT.pptx
Grad Sem PPT.pptx
 

More from Nathan Mathis

More from Nathan Mathis (20)

Page Borders Design, Border Design, Baby Clip Art, Fre
Page Borders Design, Border Design, Baby Clip Art, FrePage Borders Design, Border Design, Baby Clip Art, Fre
Page Borders Design, Border Design, Baby Clip Art, Fre
 
How To Write Your Essays In Less Minutes Using This Website Doy News
How To Write Your Essays In Less Minutes Using This Website Doy NewsHow To Write Your Essays In Less Minutes Using This Website Doy News
How To Write Your Essays In Less Minutes Using This Website Doy News
 
Lined Paper For Beginning Writers Writing Paper Prin
Lined Paper For Beginning Writers Writing Paper PrinLined Paper For Beginning Writers Writing Paper Prin
Lined Paper For Beginning Writers Writing Paper Prin
 
Term Paper Example Telegraph
Term Paper Example TelegraphTerm Paper Example Telegraph
Term Paper Example Telegraph
 
Unusual How To Start Off A Compare And Contrast Essay
Unusual How To Start Off A Compare And Contrast EssayUnusual How To Start Off A Compare And Contrast Essay
Unusual How To Start Off A Compare And Contrast Essay
 
How To Write A Methodology Essay, Essay Writer, Essa
How To Write A Methodology Essay, Essay Writer, EssaHow To Write A Methodology Essay, Essay Writer, Essa
How To Write A Methodology Essay, Essay Writer, Essa
 
Recolectar 144 Imagem Educational Background Ex
Recolectar 144 Imagem Educational Background ExRecolectar 144 Imagem Educational Background Ex
Recolectar 144 Imagem Educational Background Ex
 
Microsoft Word Lined Paper Template
Microsoft Word Lined Paper TemplateMicrosoft Word Lined Paper Template
Microsoft Word Lined Paper Template
 
Owl Writing Paper
Owl Writing PaperOwl Writing Paper
Owl Writing Paper
 
The Essay Writing Process Essays
The Essay Writing Process EssaysThe Essay Writing Process Essays
The Essay Writing Process Essays
 
How To Make A Cover Page For Assignment Guide - As
How To Make A Cover Page For Assignment Guide - AsHow To Make A Cover Page For Assignment Guide - As
How To Make A Cover Page For Assignment Guide - As
 
Awesome Creative Writing Essays Thatsnotus
Awesome Creative Writing Essays ThatsnotusAwesome Creative Writing Essays Thatsnotus
Awesome Creative Writing Essays Thatsnotus
 
Sites That Write Papers For You. Websites That Write Essays For You
Sites That Write Papers For You. Websites That Write Essays For YouSites That Write Papers For You. Websites That Write Essays For You
Sites That Write Papers For You. Websites That Write Essays For You
 
4.4 How To Organize And Arrange - Hu
4.4 How To Organize And Arrange - Hu4.4 How To Organize And Arrange - Hu
4.4 How To Organize And Arrange - Hu
 
Essay Written In First Person
Essay Written In First PersonEssay Written In First Person
Essay Written In First Person
 
My Purpose In Life Free Essay Example
My Purpose In Life Free Essay ExampleMy Purpose In Life Free Essay Example
My Purpose In Life Free Essay Example
 
The Structure Of An Outline For A Research Paper, Including Text
The Structure Of An Outline For A Research Paper, Including TextThe Structure Of An Outline For A Research Paper, Including Text
The Structure Of An Outline For A Research Paper, Including Text
 
What Are Some Topics For Exemplification Essays - Quora
What Are Some Topics For Exemplification Essays - QuoraWhat Are Some Topics For Exemplification Essays - Quora
What Are Some Topics For Exemplification Essays - Quora
 
Please Comment, Like, Or Re-Pin For Later Bibliogra
Please Comment, Like, Or Re-Pin For Later BibliograPlease Comment, Like, Or Re-Pin For Later Bibliogra
Please Comment, Like, Or Re-Pin For Later Bibliogra
 
Ide Populer Word In English, Top
Ide Populer Word In English, TopIde Populer Word In English, Top
Ide Populer Word In English, Top
 

Recently uploaded

Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
negromaestrong
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
heathfieldcps1
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 

Recently uploaded (20)

Asian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptxAsian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptx
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural ResourcesEnergy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 

Assessing Authentic Problem-Solving In Heat Transfer

  • 1. Paper ID #36855 Assessing authentic problem-solving in heat transfer Jiamin Zhang Jiamin Zhang, PhD, is a postdoctoral scholar and lecturer in physics at Auburn University. Her research focuses on studying authentic problem-solving in undergraduate engineering programs and what factors impact student persistence in STEM. She earned her PhD in chemical engineering from the University of California, Santa Barbara. Soheil Fatehiboroujeni (Assistant Professor ) Soheil Fatehiboroujeni received his Ph.D. in mechanical engineering from the University of California, Merced in 2018 focused on the nonlinear dynamics of biological filaments. As an engineering educator and postdoctoral researcher at Cornell University, Sibley School of Mechanical and Aerospace Engineering, Soheil worked in the Active Learning Initiative (ALI) to promote student-centered learning and the use of computational tools such as MATLAB and ANSYS in engineering classrooms. In Spring 2022, Soheil joined Colorado State University as an assistant professor of practice in the Department of Mechanical Engineering. His research is currently focused on the long-term retention of knowledge and skills in engineering education, design theory and philosophy, and computational mechanics. Matthew Ford Matthew J. Ford (he/him) received his B.S. in Mechanical Engineering and Materials Science from the University of California, Berkeley, and went on to complete his Ph.D. in Mechanical Engineering at Northwestern University. After completing a postdoc with the Cornell Active Learning Initiative, he joined the School of Engineering and Technology at UW Tacoma to help establish its new mechanical engineering program. His teaching and research interests include solid mechanics, engineering design, and inquiry-guided learning. He has supervised undergraduate and master's student research projects and capstone design teams. Eric Burkholder (Postdoctoral Scholar) Eric Burkholder is an assistant professor of physics and of chemical engineering at Auburn Univeristy. He received his PhD in chemical engineering from Caltech and spent three years as a postdoc in Carl Wieman's group at Stanford University. His research focuses broadly on problem-solving in physics and engineerin courses, as well as issues related to retention and equity in STEM. Š American Society for Engineering Education, 2022 Powered by www.slayte.com
  • 2. Assessing authentic problem-solving in heat transfer Introduction Engineers are known as problem-solvers. In their work, they encounter ill-structured problems that require them to collect additional information, consider external constraints, and reflect on their solution process1,2,3,4,5,6 . Recent graduates cite these skills as the most important technical skills required of them in their everyday work7 . However, there are reports from employers and researchers that undergraduate students are not prepared for solving these kinds of problems when they graduate8,9 . One of the reasons for this skills gap is that the majority of problem-solving in traditional undergraduate engineering programs consists of solving textbook problems. Textbook problems are designed to exercise a limited set of knowledge and skills, and thus may not reflect the problem-solving practices that are used in real-world settings. Textbook problems do not allow room for students to make their own assumptions, decide what information is needed, decide how to present their findings, etc. Project-based learning, such as in capstone design courses and senior labs10,11 , is one alternative that allows room for more student decision-making, but these opportunities are often limited in undergraduate curricula. A principal challenge in teaching problem-solving is that one cannot teach what one cannot measure, and real-world problem-solving skills are difficult to measure. As discussed above, textbook problems may not measure real-world problem-solving skills. Design projects and other long-term projects may be more realistic measures of problem-solving practice, but they are impractical for large-scale assessment. Indeed, the National Academies are calling for the development and widespread use of research-based assessments to better measure student understanding and skills in undergraduate programs12 . There are some existing instruments that aim to measure problem-solving, but they each have limitations. Some attempt to measure problem-solving using puzzle-like challenges that do not rely on any scientific or engineering content knowledge13 . These assessments thus do not measure how one solves engineering problems, and it is not clear to what extent problem-solving on these assessments correlates with problem-solving in real-world practice14,15,16 . Other “critical-thinking” or aptitude assessments are commercially available, but due to their proprietary nature, the evidence for reliability and validity of these assessments is not widely available17,18,19 . To-date, there are almost no assessments of problem-solving in engineering. Our aim in this paper is to describe the development and pilot-testing of an assessment of problem-solving in engineering. We chose heat transfer as the context for our assessment so that
it would be useful to educators in a wide variety of engineering disciplines.

Theory

Problem-solving has long been studied in engineering education research [12]. Early studies are based on information-processing models, which posit a step-by-step approach to problem-solving [20]. These models consider how knowledge is represented, the role of background knowledge, and limits on working memory. Other early studies are grounded in constructivism [21] or socioconstructivism [22,23]. One limit to these frameworks is that they are often not grounded in direct empirical evidence. Cognitive systems engineering [24], grounded in naturalistic decision-making theory [25], incorporates both cognitive and ecological (contextual) elements of problem-solving and is grounded in empirical evidence. This work often studies how skilled practitioners make critical decisions when solving specific problems in a real-world setting. Based on this work, researchers have developed a framework for characterizing expert problem-solving in science and engineering that describes problem-solving as a set of 29 decisions to be made [26]. How these decisions are made is argued to be highly context dependent and to draw upon deep disciplinary knowledge.

In addition, Price et al. [27] have developed a general template for assessing these decisions. The researcher chooses an authentic context and structures the problem so that the solver is asked to make a relevant subset of the 29 decisions to solve it. The basic components of the template are (1) provide an authentic problem context, (2) ask a series of questions that require test-takers to make decisions about problem definition and planning how to solve, (3) provide more information, (4) ask a series of questions that require decisions about interpreting information and drawing conclusions, and (5) have test-takers choose and reflect on their solution. One of the authors has previously developed such an assessment in chemical process design [28].

One important feature of these assessments is that students are not graded on a scale that compares them to one another; rather, their responses are compared to a consensus of experts. This represents a philosophical shift in educational assessment. As described in Adams and Wieman [29], the development of the assessment includes student and expert pilot-testing as an integral part of rubric development. We describe this process in greater detail below.

Assessment

When choosing the context for the heat transfer assessment, we wanted to ensure that (1) the relevant physics includes the key concepts in heat transfer (e.g., convection and conduction), (2) any additional physics is simple enough to explain, and (3) a professor who does research in heat transfer, or someone from industry who regularly uses heat transfer to design systems, can reasonably be considered "expert enough" to solve the problem. The physical context we chose is the countercurrent heat exchange between arteries and veins in the human finger. This countercurrent heat exchange mechanism reduces the degree to which the returning venous blood must be warmed by the core, at the expense of a lower average temperature in the extremities.

We situate the participant of the assessment as a member of an engineering team responsible for
the design and development of a thermal regulation system for the next generation of space suits. The preliminary step for the engineering team is to develop a quantitative model of the body's natural thermal regulation mechanisms, particularly in the extremities. The model should capture the effect of countercurrent heat exchange between arteries (warmer blood) and veins (cooler blood) in a finger.

The assessment begins broadly by asking the participant how they would model the heat transfer in the finger. Then, it asks what assumptions the participant would make to simplify the problem, what information they need to solve the problem, and what variables they think the findings will be most sensitive to. These questions prompt the participant to identify key features of the problem and think about possible models. We then provide the participant with a model proposed by their colleague for the heat exchange in the finger and ask more detailed questions about the feasibility of the model, to draw the participant's attention to some important features of the model, such as the geometry. We then ask the participant what information they would need to further evaluate the model. Next, we provide a revised, more detailed model and ask the participant to give further feedback. We then ask more detailed questions about the second model, again to draw the participant's attention to important features of the model, such as the boundary conditions and governing equations. The schematics provided at the beginning of the assessment and for the two models are included in Figure 1.

After asking questions about the two proposed models, we provide a comparison of experimental measurements and simulation results from the second model and ask the participant whether they think the model matches the experimental data. Finally, we ask the participant to compare the two proposed models and discuss how they would model the phenomenon given the two models.

Figure 1: Schematics in the heat transfer assessment. (a) Blood vessels in a human hand (from Complete Anatomy App [30]), (b) resistor network model (Model 1), (c) finite element model (Model 2).
Figure 2 provides a summary of the sequence of provided information and questions in the heat transfer assessment. The color coding reflects the decisions relevant to the assessment that are part of the 29 decisions identified by Price et al. [26]. Each item in the table in the middle of the figure represents either provided information or a question to be answered, color coded according to the template item type. Items with multiple color codes indicate that multiple decisions were probed in one question. For the items that are half-yellow, information was provided in the context of a question being asked.

Figure 2: Assessment sequence with color coding for the decisions probed. Each item represents either provided context/information or a question to be answered, color coded according to the template item type. Items with multiple color codes indicate that multiple decisions were probed in one question (or, where half-yellow, that information was provided in the context of a question being asked).

After developing the assessment, we first asked former teaching assistants in heat transfer courses to take the assessment to ensure that the questions were being interpreted the way we intended and that the prompts were adequately capturing the thought process of the solver [29]. The assessment was then pilot-tested with 12 experts (faculty members who do research in heat transfer, as well as professional engineers who design heat exchangers) and 12 undergraduate students. The experts were either acquaintances of members of the research team or volunteers recruited through email lists of the American Physical Society (APS) and the American Society for Engineering Education (ASEE). The students were undergraduates who were taking, or had recently taken, a heat transfer class in mechanical engineering at the time of the assessment.

We used think-aloud interviews [31] with both the experts and the students during pilot-testing. Participants were given a link to the Qualtrics survey for the assessment and were asked to complete it while talking through their thought process during the interview. The responses in Qualtrics, interview transcripts, and audio files were used during the data analysis.
Analysis

We used data from the expert responses to create a scoring rubric for the assessment. Students were graded based on how well their responses agreed with the experts. We started by analyzing the three closed-response, multiple-choice questions: assumptions, information needed, and sensitivity (numbers 3, 4, and 5 in Figure 2). As an example, for the sensitivity question (Item 5 in the assessment), participants were asked to choose five out of nine variables that they thought the findings would be most sensitive to. Choices with stronger expert consensus (either for or against) were given more weight in the total score than choices on which experts disagreed or highlighted different key features. The score for each question was calculated using the following formula:

    score = Σ_items (2 × response − 1) × (% consensus − 50%).    (1)

For Item 3 (assumptions) and Item 5 (sensitivity), the possible answers to each item in the question are TRUE and FALSE, which translate to values of 1 and 0, respectively. For Item 4 (information needed), the possible answers to each option are Definitely Want, Might Want, and Don't Need, with values of 1, 0.5, and 0, respectively. The expert consensus for an item is calculated by summing the expert response values and dividing by the total number of experts. According to Equation 1, if the % consensus for an item in the assumptions question is 75% and a student selected TRUE for that assumption, 0.25 is added to their score; if the student selected FALSE, 0.25 is subtracted. For items on which experts disagree (i.e., the % consensus is close to 50%), this weighting scheme ensures that the item makes only a very small contribution to the total score.
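To make the weighting concrete, here is a minimal sketch (our illustration, not the authors' scoring code; the expert and student responses are made up) of Equation 1 applied to one closed-response question, with TRUE/Definitely Want mapped to 1 and FALSE/Don't Need mapped to 0 as described above.

```python
from statistics import mean

# Response values: TRUE / "Definitely Want" = 1, "Might Want" = 0.5, FALSE / "Don't Need" = 0.
def consensus_weights(expert_responses):
    """Per-item % consensus: the mean expert response value (between 0 and 1)."""
    n_items = len(expert_responses[0])
    return [mean(expert[i] for expert in expert_responses) for i in range(n_items)]

def closed_response_score(student_response, expert_responses):
    """Equation 1: items with strong expert consensus (far from 50%) carry more weight."""
    weights = consensus_weights(expert_responses)
    return sum((2 * r - 1) * (c - 0.5) for r, c in zip(student_response, weights))

# Hypothetical data: four experts answering a three-item assumptions question.
experts = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 1, 0]]
student = [1, 1, 0]
# The first item has 75% consensus, so the student's TRUE adds +0.25; the third item
# has 50% consensus and contributes nothing, mirroring the worked example in the text.
print(closed_response_score(student, experts))  # 0.5
```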
To develop the rubric for the open-response questions, we needed to code the expert responses to identify themes that the experts agreed on. We used an emergent coding scheme and did three rounds of coding: in the first two rounds, three coders coded separately and then discussed the codes; in the third round, two coders coded separately and then discussed the codes. The first round of coding focused on counting codes separately for each question, and the second round focused on the decisions experts and students made when responding to each question. From the first two rounds we found many codes that overlapped across questions, so we grouped questions that received similar answers when counting the codes: Items 7-11 for feedback on Model 1, Items 13-17 for feedback on Model 2, Items 19 and 22 for feedback and questions about the experimental data, and Items 20 and 21 for feedback and questions about the simulation results. We then used the expert responses to compile a list of key themes that were mentioned by multiple experts. For example, the experts overwhelmingly noted that Model 1 (the resistor network model) was missing the temperature dependence in the axial direction. The lists of codes for these groups of questions are included in Table 1.

Items 8, 14, and 21 explicitly asked participants a yes/no question about the feasibility or validity of the model. Item 8 asked whether the colleague would successfully investigate the role of countercurrent heat exchange in the finger using the resistor network model. Of the 12 experts interviewed, 7 explicitly said the colleague would not be successful because the first model has limitations; one said the model is a good start, two said the model is OK, and two did not explicitly answer the question. A similar level of consensus was reached for Items 14 and 21: for Item 14, the majority of experts explicitly stated that the colleague would be successful using the finite element model, and for Item 21, the majority of experts explicitly stated that the simulation results do not agree with the experimental data.

The other codes listed in Table 1 are derived from questions that either asked the participant for general feedback or asked about specific aspects of the model, such as the geometry and boundary conditions. The open-response nature of the questions results in high variability: experts can choose to mention a wide variety of things, so even a few experts mentioning the same code indicates high importance. We therefore considered codes that were mentioned by at least three experts to be expert consensus items; the other items mentioned by experts were labeled as expert non-consensus items.

Students were then graded based on how well their responses agreed with the experts: they were rewarded for highlighting expert consensus items and penalized for highlighting extraneous features that experts did not consider important. Not all non-consensus items mentioned by students were considered extraneous. In the feedback on Model 1, one student stated that the resistors should be in parallel, which is incorrect; this response was considered extraneous. When responding to Item 9 ("are there important features missing from the model?"), one student asked for more material properties; although this was not mentioned by any of the experts, we did not label it as extraneous because it is a reasonable comment consistent with a correct understanding of the model. For each group of questions, students were awarded one point for each expert consensus item and penalized 0.5 points for each extraneous item:

    score = N_expert consensus items − 0.5 × N_extraneous items.    (2)
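As a rough sketch of the open-response rubric mechanics (ours, with hypothetical code labels that only paraphrase Table 1), the snippet below derives a consensus set using the at-least-three-experts threshold and then applies Equation 2 to one student's coded response.

```python
from collections import Counter

def consensus_codes(expert_code_sets, threshold=3):
    """Codes mentioned by at least `threshold` experts count as expert consensus items."""
    counts = Counter(code for codes in expert_code_sets for code in codes)
    return {code for code, n in counts.items() if n >= threshold}

def open_response_score(student_codes, consensus, extraneous):
    """Equation 2: +1 per expert-consensus item mentioned, -0.5 per extraneous item."""
    return len(student_codes & consensus) - 0.5 * len(student_codes & extraneous)

# Hypothetical coded feedback on Model 1 from four experts (labels are illustrative only).
experts = [
    {"missing_axial_dependence", "lumped_flows"},
    {"missing_axial_dependence", "unclear_boundary_conditions"},
    {"missing_axial_dependence", "lumped_flows", "internal_convective_resistance"},
    {"lumped_flows"},
]
consensus = consensus_codes(experts, threshold=3)  # {"missing_axial_dependence", "lumped_flows"}

# A student who notes the missing axial dependence but also (incorrectly) says the
# resistors should be in parallel scores 1 - 0.5 = 0.5 for this question group.
student = {"missing_axial_dependence", "resistors_in_parallel"}
print(open_response_score(student, consensus, extraneous={"resistors_in_parallel"}))  # 0.5
```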
When coding the questions to identify expert-consensus feedback items, we coded Questions 19 & 22 and Questions 20 & 21 separately, because 19 & 22 were about the experiment and data collection while 20 & 21 were about the model prediction and validation. However, when calculating the scores, we grouped Questions 19, 20, 21, and 22 together, because these questions asked participants to compare the experimental results with the simulation results and thus belong to the same overall category. Items 2, 23, and 24 were not coded because there was no clear expert or student consensus, and the interview audio files suggested that experts did not interpret those questions in the manner intended; these questions will be eliminated or significantly revised in future iterations.

A few experts and students did not complete all of the questions in the Qualtrics survey. In those cases, we used the recorded interview to fill in the missing responses. After using this approach, we had only one missing response, from one student, for one of the options in Item 4 (information needed). We left that response blank when calculating the score, so the student neither gained nor lost points for it. One expert misinterpreted the questions about the two models and the experimental data, and two experts had incomplete responses to the questions about Model 1 and the questions about the experiment and simulation; their responses were excluded when calculating the statistics and plotting the scores in the Results section.

Table 1: Rubric for scoring the assessment based on responses from eleven experts. The feasibility and validity items were mentioned by at least seven experts; the other rubric items were mentioned by at least three experts. The contributing questions refer to the items in Figure 2.

Feedback on Model 1 (Resistor Network Model) — Contributing Questions 7-11:
• Feasibility: No, the colleague won't be successful using this model (a)
• Missing axial dependence
• Problems with lumped flows or missing capillaries
• Missing internal convective resistance
• Unclear boundary conditions (Ta, Tv not specified)

Feedback on Model 2 (Finite Element Model) — Contributing Questions 13-17:
• Feasibility: Yes, the colleague will be successful using this model (b)
• Remove transient term in equation
• Neglect circumferential conduction (in the theta direction)
• Neglect axial conduction
• Difficulty in estimating the external convection coefficient
• Problems with lumped flows or missing capillaries

Feedback on Experiment and Simulation, questions about the experiment and data collection — Contributing Questions 19 & 22:
• Asking about error bars
• Asking about the measurement probe
• Asking about experimental control: environment
• Asking about subject differences

Feedback on Experiment and Simulation, questions about model prediction and validation — Contributing Questions 20 & 21:
• Validity: No, the model doesn't match the experimental data (c)
• Large temperature mismatch between model and experiment
• Asking about error bars in the simulation result
• Asking about model parameter: perfusion rate

(a) Experts who explicitly stated the colleague won't be successful vs. will be successful: 7 : 2
(b) Experts who explicitly stated the colleague will be successful vs. won't be successful: 7 : 1
(c) Experts who explicitly stated the model doesn't match the experimental data vs. matches the data: 8 : 0

Results

For the three closed-response questions, the average scores of the students are all lower than those of the experts, and the standard deviations of the student scores are larger than those of the experts.
The sensitivity question is the best at distinguishing between expert and student responses, with the average score for students two standard deviations below the average score for experts. The assumptions question has the smallest separation between expert and student performance, with the average score for students only half a standard deviation below the average score for experts. The descriptive statistics for the closed-response questions are included in Table 2. The maximum possible score was calculated by selecting the expert consensus choice for every option. The number of items indicates the number of sub-questions (or items) in a question. For example, in the information question, 32 different pieces of information were listed (such as outside air temperature and average blood temperature in the vein), and the participant was asked to select Definitely Want, Might Want, or Don't Need for each piece of information.

Table 2: Descriptive statistics for expert and student scores on the closed-response questions (Items 3-5).

                        Assumptions    Information    Sensitivity
                        (Item 3)       (Item 4)       (Item 5)
Max Possible Score        1.58           7.75           2.00
Min Possible Score       -1.58          -7.75          -2.00
Number of Items             10             32              9
Experts
  Number of Subjects        12             12             12
  Average                 0.99           5.07           1.25
  Standard Dev.           0.60           0.88           0.38
Students
  Number of Subjects        12             12             12
  Average                 0.75           4.00           0.49
  Standard Dev.           0.75           1.73           0.63

To compare the performance of experts and students across these three questions, which were on different scales, we rescaled the expert and student scores using the expert average and expert standard deviation:

    scaled score = (score − mean(expert scores)) / stdv(expert scores).    (3)

The scaled scores for Assumptions, Information, and Sensitivity are plotted in Figure 3. The box encompasses the 25th to the 75th percentiles of each score. Note that the thick center line in the box plot indicates the median, which may therefore differ from zero.
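The rescaling in Equation 3 is a simple standardization against the expert distribution; the sketch below (our illustration, with made-up numbers rather than the study data) shows how it could be applied.

```python
from statistics import mean, stdev

def scaled_scores(student_scores, expert_scores):
    """Equation 3: express each score in units of expert standard deviations
    relative to the expert average, so questions on different scales are comparable."""
    mu, sigma = mean(expert_scores), stdev(expert_scores)
    return [(s - mu) / sigma for s in student_scores]

# Illustrative numbers only (not the study data).
experts = [1.6, 1.2, 0.9, 1.3]
students = [0.5, 1.1, 0.2]
print([round(z, 2) for z in scaled_scores(students, experts)])  # [-2.6, -0.52, -3.64]
```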
One of the reasons why the assumptions question and the information-needed question show low separation between experts and students is that different experts interpreted the assumptions and information differently. For example, "ignore axial heat transfer" is ambiguous: several experts did not make this assumption because axial advection (heat transported by the flow of warm blood) should not be ignored, yet in the feedback on Model 2, several experts recommended ignoring axial conduction. If we were to change the wording of the assumptions question to "ignore axial conduction" or "ignore axial advection", we would likely get better consensus among experts. Furthermore, experts will not necessarily make a simplifying assumption if they think the data can be easily obtained. For example, for "assume specific heat and density of blood and flesh are the same" and "assume thermal conductivity of blood and flesh are the same", some experts did not make these assumptions because they thought the data would be easy to find.

Figure 3: Comparison of scaled expert and student scores on the closed-response questions. The scores are scaled by subtracting the average expert score and then dividing by the expert standard deviation. The median student score is lower than the median expert score on all items. The sensitivity question shows the clearest differentiation between students and experts.

For the open-response questions, we calculated scores for three groups of questions: Model 1, Model 2, and experiment & simulation. The statistics of the expert and student scores are included in Table 3. The maximum possible score is the total number of expert consensus items for each group of questions. The number of subjects differs because some expert responses were missing or discarded, as described in the Analysis section. For all of the open-response question groups, the average scores for the students are lower than those of the experts. The Model 2 questions have the largest separation between expert and student performance, with the average student score more than one standard deviation below that of the experts. The Model 1 questions and the experiment & simulation questions show smaller separation, with the average student score less than one standard deviation below that of the experts. Importantly, although the average expert scores for the Model 1 and Model 2 questions are very similar, the average student score for Model 2 is only one third of the average student score for Model 1. Model 2 is the finite element model, which is more complicated than the resistor network model, and students do not perform as well on it.

We plotted the scaled scores in Figure 4; the scores are rescaled according to Equation 3. One striking feature of the box plot is that for the Model 2 questions, the 75th percentile of the student scores is lower than the 25th percentile of the expert scores, again highlighting the large separation between experts and students for questions regarding the finite element model. Out of the 12 students in the pilot study, only two mentioned "remove transient term" and only one mentioned "neglect axial conduction" for Model 2, which suggests that students are not very good at making simplifying assumptions for the finite element model. For Model 1, however, students were much better at identifying the model flaw "missing axial dependence", with 6 students mentioning it in their responses.
Table 3: Descriptive statistics for expert and student scores on the open-response questions.

                        Model 1 Questions    Model 2 Questions    Experiment and Simulation
                        (Items 7-11)         (Items 13-17)        (Items 19-22)
Max Possible Score             5                    6                     8
Experts
  Number of Subjects           9                   11                     9
  Average                   2.39                 2.41                  3.89
  Standard Dev.             1.27                 1.41                  1.27
Students
  Number of Subjects          12                   12                    12
  Average                   1.54                 0.50                  2.83
  Standard Dev.             1.22                 0.71                  1.47

Figure 4: Comparison of scaled expert and student scores on the open-response questions. The scores are scaled by subtracting the average expert score and then dividing by the expert standard deviation. The median student score is lower than the median expert score for all question groups. The Model 2 questions show the clearest differentiation between students and experts.

Discussion

Across the three closed-response questions and the three groups of open-response questions, students on average score lower than experts. In particular, the sensitivity question and the Model 2 questions show the largest separation between expert and student performance. Although the questions for Model 1 and Model 2 are nearly identical, with the only difference being the specific feature highlighted in Items 10 and 17, the average Model 2 student score is only one third of the average Model 1 student score. For Model 1, all of the expert consensus codes point out flaws in the model (e.g., "missing axial dependence"), whereas only one of the
five expert consensus codes for Model 2 is about a model flaw ("problems with lumped flows or missing capillaries"). The consensus codes for model flaws correspond to the decision "what are the important underlying features?" in the list of decisions described by Price et al. [26]. The majority of the expert consensus codes for Model 2 instead suggest simplifications to the model (e.g., "remove transient term in equation"), which corresponds to the decision "what approximations and simplifications to make?" in the same list [26]. The differences in student performance on the Model 1 and Model 2 questions suggest that students are good at identifying important missing features but need more practice making appropriate simplifications. We hypothesize that this is related to which decisions students had opportunities to practice in the heat transfer course. Specifically, most textbook problems require students to use equations to model a heat transfer problem, and students are given feedback from the professor on whether the equations and calculations are correct (i.e., whether they missed any key features). However, students are rarely given models that are correct but too complicated and asked to simplify them. Results from the pilot study suggest that in future heat transfer courses, students need more opportunities to practice a variety of the problem-solving decisions, such as making assumptions and making simplifications and approximations.

The analysis of responses to the closed-response and open-response questions also reveals differences in the development of the predictive framework [26] between experts and students. In particular, for the sensitivity question, the average score for experts is higher than that for students, and the responses across experts are more consistent. This is because experts have more experience solving authentic problems, which builds better intuition about which features one should pay close attention to.

Limitations and Future Work

Although the current heat transfer assessment was able to distinguish between the problem-solving skills of experts and students, several aspects of the assessment need to be changed to improve the level of consensus among the experts and make the questions less ambiguous. First, as mentioned in the Results section, some of the closed-response items are ambiguous and are interpreted differently by different experts. Second, the items in the assumptions question are all reasonable assumptions and do not include any options that are obviously wrong. In the next version of the assessment, we will revise the items in the closed-response questions to remove ambiguity and will include a few assumptions, sensitivity variables, and pieces of needed information mentioned by students that are obviously wrong. These items will come either from the interview transcripts corresponding to the closed-response questions or from the extraneous items we coded for the open-response questions. The advantage of including wrong options mentioned by students is that experts are very unlikely to choose them while students are likely to, resulting in a larger separation between expert and student performance on the closed-response questions. Third, for the open-response questions, we plan to remove repetitive questions that generated similar responses from participants, to make the assessment shorter.
In addition to revising the assessment and pilot-testing the revised version with experts and students, we plan to conduct pre- and post-tests in undergraduate heat transfer courses to study how much students' problem-solving skills improve after taking the course. The
assessment will be included as one of the questions on the first and last homework assignments in the course. Furthermore, we plan to develop a closed-response version of all of the questions in the assessment to automate its scoring. This will significantly reduce the time needed to analyze student responses and enable wider distribution of the assessment. Responses from the initial pilot testing, think-aloud interviews with students, and expert responses will be used to replace the open-response questions with "choose-many" multiple-choice questions. This will ensure that the questions include reasonable distractors generated by students as well as answers chosen by a consensus of experts.

Further refinement of this assessment and wide distribution in classrooms will be an important advance in engineering education research, as it will provide a reliable way to measure problem-solving, an important but not guaranteed outcome of an engineering education. It can be used to answer many different research questions, for example: are there differences in outcomes between traditional heat transfer courses and courses that focus explicitly on developing problem-solving skills? It can also be used as an assessment tool for engineering departments to decide whether their undergraduate programs are adequately preparing students for the workplace. Improving our ability to measure problem-solving is an important step toward improving the way we teach problem-solving to undergraduate students and prepare them for engineering careers. We hope to encourage other educators to use this assessment in their courses to measure how well they are preparing their students to solve real-world engineering problems.

References

[1] ABET, "Criteria for accrediting engineering programs, 2019–2020," 2019.
[2] C. L. Dym, "Design, systems, and engineering education," International Journal of Engineering Education, vol. 20, no. 3, pp. 305–312, 2004.
[3] C. L. Dym, A. M. Agogino, O. Eris, D. D. Frey, and L. J. Leifer, "Engineering design thinking, teaching, and learning," Journal of Engineering Education, vol. 94, no. 1, pp. 103–120, 2005.
[4] D. Jonassen, J. Strobel, and C. B. Lee, "Everyday problem solving in engineering: Lessons for engineering educators," Journal of Engineering Education, vol. 95, no. 2, pp. 139–151, 2006.
[5] D. H. Jonassen, "Designing for decision making," Educational Technology Research and Development, vol. 60, no. 2, pp. 341–359, 2012.
[6] N. Shin, D. H. Jonassen, and S. McGee, "Predictors of well-structured and ill-structured problem solving in an astronomy simulation," Journal of Research in Science Teaching, vol. 40, no. 1, pp. 6–33, 2003.
[7] H. J. Passow, "Which ABET competencies do engineering graduates find most important in their work?" Journal of Engineering Education, vol. 101, no. 1, pp. 95–118, 2012.
[8] Q. Symonds, "The global skills gap in the 21st century," 2018. [Online]. Available: https://www.qs.com/portfolio-items/the-global-skills-gap-in-the-21st-century/
[9] C. Grant and B. Dickson, "Personal skills in chemical engineering graduates: the development of skills within degree programmes to meet the needs of employers," Education for Chemical Engineers, vol. 1, no. 1, pp. 23–29, 2006.
[10] A. J. Dutson, R. H. Todd, S. P. Magleby, and C. D. Sorensen, "A review of literature on teaching engineering design through project-oriented capstone courses," Journal of Engineering Education, vol. 86, no. 1, pp. 17–28, 1997.
[11] C. L. Dym, "Learning engineering: Design, languages, and experiences," Journal of Engineering Education, vol. 88, no. 2, pp. 145–148, 1999.
[12] National Research Council, "Discipline-based education research: Understanding and improving learning in undergraduate science and engineering," Washington, DC: The National Academies, 2012.
[13] W. K. Adams, "Development of a problem solving evaluation instrument; untangling of specific problem solving skills," unpublished doctoral dissertation, University of Colorado, 2007.
[14] C. J. Harris, J. S. Krajcik, J. W. Pellegrino, and A. H. DeBarger, "Designing knowledge-in-use assessments to promote deeper learning," Educational Measurement: Issues and Practice, vol. 38, no. 2, pp. 53–67, 2019.
[15] F. Fischer, C. A. Chinn, K. Engelmann, and J. Osborne, Scientific Reasoning and Argumentation: The Roles of Domain-specific and Domain-general Knowledge. Routledge, 2018.
[16] P. Kind and J. Osborne, "Styles of scientific reasoning: A cultural rationale for science education?" Science Education, vol. 101, no. 1, pp. 8–31, 2017.
[17] ACT CAAP Technical Handbook. Iowa City, IA: ACT, 2007.
[18] S. Klein, R. Benjamin, R. Shavelson, and R. Bolus, "The collegiate learning assessment: Facts and fantasies," Evaluation Review, vol. 31, no. 5, pp. 415–439, 2007.
[19] Educational Testing Service (ETS), MAPP User's Guide. Princeton, NJ: Educational Testing Service, 2007.
[20] D. P. Simon and H. A. Simon, "Individual differences in solving physics problems," in Children's Thinking: What Develops?, R. S. Siegler, Ed. Lawrence Erlbaum Associates, 1978, pp. 325–348.
[21] J. Piaget, Success and Understanding. Cambridge, MA: Harvard University Press, 1978.
[22] J. Lave and E. Wenger, Situated Learning: Legitimate Peripheral Participation. Cambridge University Press, 1991.
[23] L. B. Resnick, "Shared cognition: Thinking as social practice," in Perspectives on Socially Shared Cognition, L. B. Resnick, J. M. Levine, and S. D. Teasley, Eds. American Psychological Association, 1991, pp. 1–20.
[24] G. Lintern, B. Moon, G. Klein, and R. R. Hoffman, "Eliciting and representing the knowledge of experts," in The Cambridge Handbook of Expertise and Expert Performance, 2nd ed., K. A. Ericsson, R. R. Hoffman, A. Kozbelt, and A. M. Williams, Eds. Cambridge, UK: Cambridge University Press, 2018, pp. 165–191.
[25] K. Mosier, U. Fischer, R. R. Hoffman, and G. Klein, "Expert professional judgments and 'naturalistic decision making'," in The Cambridge Handbook of Expertise and Expert Performance, 2nd ed., K. A. Ericsson, R. R. Hoffman, A. Kozbelt, and A. M. Williams, Eds. Cambridge, UK: Cambridge University Press, 2018, pp. 453–475.
[26] A. M. Price, C. J. Kim, E. W. Burkholder, A. V. Fritz, and C. E. Wieman, "A detailed characterization of the expert problem-solving process in science and engineering: Guidance for teaching and assessment," CBE—Life Sciences Education, vol. 20, no. 3, p. ar43, 2021.
[27] A. M. Price, E. W. Burkholder, S. Salehi, C. J. Kim, V. Isava, M. P. Flynn, and C. E. Wieman, "An accurate and practical method for measuring science and engineering problem-solving expertise," submitted.
[28] E. Burkholder, A. Price, M. Flynn, and C. Wieman, "Assessing problem solving in science and engineering programs," in Proceedings of the Physics Education Research Conference, 2019.
[29] W. K. Adams and C. E. Wieman, "Development and validation of instruments to measure learning of expert-like thinking," International Journal of Science Education, vol. 33, no. 9, pp. 1289–1312, 2011.
[30] Complete Anatomy App. [Online]. Available: https://3d4medical.com/press-category/complete-anatomy
[31] K. A. Ericsson and H. A. Simon, "Verbal reports as data," Psychological Review, vol. 87, no. 3, p. 215, 1980.