Using computational methods to discover
student science conceptions in interview
data
Bruce Sherin
School of Education and Social Policy
Northwestern University

LAK 2012
Goals of this work
• Use computational analytic methods with traditional data
  • Videos of interviews intended to study kids’ “prior conceptions” in
    science.
• Automate the traditional analysis
  • The traditional analysis:
     1. Identify a set of “conceptions”
     2. Code the data in terms of these conceptions.

• Go as far as possible with simple analytic techniques.
Some specifics
• The data: 54 interviews with middle school students.
• The subject matter: The earth’s seasons
• The approach: Simple vector space models, clustering

There are reasons to think automating the analysis of this
data should be difficult
  • The amount of data is small
  • Student speech is halting and ambiguous
  • Gestures and diagrams are important
Prior science conceptions
• Prior science conceptions: The prior understandings that students
  bring to science learning
• Bibliography by Duit (2009) lists over 8000 papers

     Health and disease               Nature of matter
     Genetics                         Ecosystems
     Evolution                        Water cycle
     Geologic Time                    Weather


• Two theoretical poles:
  • Theory-Theory: Prior science knowledge consists of relatively
    well-elaborated theories.
  • Knowledge-in-Pieces: Prior science knowledge consists of a moderately
    large number of not-well-organized conceptions.
    • In an interview, students may construct explanations in-the-moment, drawing on
      some of these conceptions.
The seasons corpus
• 54 interviews with middle school students
• Our interview protocol, in brief:
  1. “Why is it warmer in the summer and colder in the winter?”
  2. Follow up questions for clarification.
  3. Asked to make a drawing.
  4. Follow up questions for clarification.
  5. Challenges for certain answers.
Prototypical explanations
Example Interview: Edgar
Starts with side-based, emphasizing the sun’s rays:
E: Here’s the earth slanted. Here’s the axis. Here’s the
   North Pole, South Pole, and here’s our country. And
   the sun’s right, and the rays hitting like directly right
   here. So everything’s getting hotter over the summer
   and once this thing turns, the country will be here
   and the sun can't reach as much. It's not as hot as
   the winter.


Shifts to typical closer-farther
E: Actually, I don't think this moves it turns and it moves like that and it
   turns and that thing like is um further away once it orbit around the s-
   Earth- I mean the sun.
Example Interview: Zelda
Tilt-based explanation, with the tilt causing light to be more or less direct
Z: Because, I think because the earth is on a tilt, and then, like that side of
   the Earth is tilting toward the sun, or it’s facing the sun or something so
   the sun shines more directly on that area, so it’s warmer.
Example Interview: Caden
Tilt-based explanation, with the tilt causing closer-farther
I: So the first question is why is it warmer in the summer and colder in the
   winter?
C: Because at certain points of the earth’s rotation, orbit around the sun,
   the axis is pointing at an angle, so that sometimes, most times,
   sometimes on the northern half of the hemisphere is closer to the sun
   than the southern hemisphere, which, change changes the
   temperatures. And then, as, as it’s pointing here, the northern
   hemisphere it goes away, is further away from the sun and gets colder.
I: Okay, so how does it, sometimes the northern hemisphere is, is toward
   the sun and sometimes it’s away?
C: Yes because the at—I’m sorry, the earth is tilted on its axis. And it’s
   always pointed towards one position.
Analysis Procedure
1. Clean transcripts, removing everything except words
  spoken by students
2. Break each transcript into 100-word segments, with a
  moving window that steps forward 25 words
  • Results in 794 segments

3. Map each segment to a vector
4. Deviationalize™ the vectors
5. Cluster the vectors
6. Interpret the clusters
7. Apply clusters to analyze transcripts
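Steps 1 and 2 of the procedure can be sketched roughly as follows. This is a minimal illustration, not the code used in the study: the function names, the one-turn-per-line format with an `I:` prefix for the interviewer, the regular-expression tokenizer, and the decision to drop leftover words at the end of a transcript are all assumptions made for the example.

```python
import re

def student_words(lines, interviewer="I"):
    """Keep only words spoken by students (assumes 'SPEAKER: speech' lines)."""
    words = []
    for line in lines:
        speaker, _, speech = line.partition(":")
        if speaker.strip() == interviewer:
            continue  # drop interviewer turns
        # Assumed tokenizer: lowercase runs of letters and apostrophes.
        words.extend(re.findall(r"[a-z']+", speech.lower()))
    return words

def sliding_segments(words, size=100, step=25):
    """Overlapping windows of `size` words, each starting `step` words later."""
    if len(words) <= size:
        return [words] if words else []
    # Assumption: words left over after the last full window are dropped.
    return [words[start:start + size]
            for start in range(0, len(words) - size + 1, step)]

# Hypothetical input: placeholder tokens standing in for a cleaned transcript.
transcript = [f"w{i}" for i in range(200)]
segments = sliding_segments(transcript)
print(len(segments), segments[1][0])
```

Because the window advances by 25 words, each 100-word segment shares 75 words with its neighbor, which is how a small corpus can still yield several hundred segments.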
Mapping segments to vectors
• Compile the vocabulary
  • Stop list consisting of 782 words
  • Results in vocabulary with 647 words
• For each segment count number of
  occurrences of each of these words.
• Weight as 1 + log(count)
• Normalize
• Result: 794 vectors, each with 647
  dimensions.

     Word      Count   Weight
     sun       4       2.1
     earth     2       1.7
     side      0       0
     away      2       1.7
     tilted    1       1
     closer    1       1
     axis      2       1.7
     day       0       0
     farther   1       1
     time      3       2.1
     …         …       …
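The mapping from a segment to a vector might look something like the sketch below. The helper names and the toy segment are invented for illustration. Two details are assumptions: the logarithm is taken to be the natural log (which gives about 1.7 for a count of 2), and "Normalize" is read as scaling each vector to unit length.

```python
import math
from collections import Counter

def build_vocabulary(segments, stop_words):
    """All distinct words in the segments, minus the stop list, in sorted order."""
    return sorted({w for seg in segments for w in seg} - set(stop_words))

def weight(count):
    """1 + log(count) for positive counts, 0 otherwise (natural log assumed)."""
    return 1.0 + math.log(count) if count > 0 else 0.0

def segment_vector(segment, vocabulary):
    """Weighted word counts over the vocabulary, scaled to unit length (assumed)."""
    counts = Counter(segment)
    weights = [weight(counts[word]) for word in vocabulary]
    norm = math.sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights] if norm > 0 else weights

# Toy segment and stop list, made up for this example.
segment = "the sun and the earth the earth tilted away away sun sun".split()
vocabulary = build_vocabulary([segment], stop_words={"the", "and"})
vector = segment_vector(segment, vocabulary)
print(vocabulary)
print([round(x, 3) for x in vector])
```

The log weighting dampens the effect of a word being repeated many times within one segment, so a segment that says "sun" three times is not treated as three times more "about the sun" than one that says it once.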
Deviationalize™
• Average all of the segment vectors and replace each by
 their difference from this average.
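As a rough sketch, this step amounts to subtracting the mean vector from every segment vector. The function name and the toy vectors below are placeholders; whether the resulting vectors are rescaled afterwards is not stated on the slide, so the sketch leaves them as they are.

```python
def deviationalize(vectors):
    """Replace each vector by its difference from the average of all vectors."""
    n = len(vectors)
    dims = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dims)]
    return [[v[d] - mean[d] for d in range(dims)] for v in vectors]

# Toy vectors, made up for this example.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
deviations = deviationalize(vectors)
print(deviations)
```

After this step each dimension sums to zero across segments, so words that nearly every segment uses contribute little, and what remains is how each segment differs from the typical one.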
Cluster
• Use hierarchical agglomerative clustering

       # of clusters   Sizes of the clusters
             10        19 72 9 68 140 62 44 122 136 122
              9        19 72 68 62 44 122 136 122 149
              8        19 72 68 44 122 136 122 211
              7        72 68 44 122 122 211 155
              6        68 44 122 122 211 227
              5        68 122 122 211 271
              4        122 122 271 279
              3        271 279 244
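Hierarchical agglomerative clustering starts with every vector in its own cluster and repeatedly merges the two closest clusters, which is why each row of the table differs from the one above it by a single merge (for example, the clusters of 9 and 140 segments in the 10-cluster row combine into the cluster of 149 in the 9-cluster row). The sketch below is a deliberately naive version for small inputs. The merge criterion (squared Euclidean distance between cluster centroids) and all names are assumptions; the slides do not say which linkage or distance was used.

```python
def centroid(cluster):
    """Component-wise mean of the vectors in a cluster."""
    dims = len(cluster[0])
    return [sum(v[d] for v in cluster) / len(cluster) for d in range(dims)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def agglomerate(vectors, n_clusters):
    """Merge the two clusters with the closest centroids until n_clusters remain."""
    clusters = [[v] for v in vectors]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sq_dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Toy 2-D points, made up for this example.
points = [[0, 0], [0, 1], [10, 10], [10, 11], [20, 0]]
clusters = agglomerate(points, n_clusters=3)
print(sorted(len(c) for c in clusters))
```

Stopping the merging at different values of `n_clusters` gives the kind of cluster-size listing shown in the table, with the sizes always summing to the total number of segments.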


 What do the clusters mean?
Apply clusters to transcripts
For each transcript:
• Segment into 100-word chunks
• Find the vector for each segment
• For each segment, find the dot product between the
  segment vector and each of the cluster centroids
• Plot the results
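The scoring step can be sketched as below: each segment vector is compared with each cluster centroid by a dot product, giving one score per cluster per segment. The names, the two-dimensional toy vectors, and the centroids are invented for illustration; the sketch prints the scores rather than plotting them, and it assumes the segment vectors have already been prepared in the same way as the vectors that were clustered.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def score_segments(segment_vectors, centroids):
    """For each segment, the dot product with every cluster centroid."""
    return [[dot(v, c) for c in centroids] for v in segment_vectors]

# Hypothetical centroids and segment vectors (two dimensions for readability).
centroids = [[1.0, 0.0], [0.0, 1.0]]
segment_vectors = [[0.8, 0.6], [0.6, 0.8]]

scores = score_segments(segment_vectors, centroids)
best = [row.index(max(row)) for row in scores]
for position, row in enumerate(scores):
    print(position, row)
```

Read in order, the rows trace how strongly each cluster is expressed as the interview proceeds, which is what makes a shift from one explanation to another visible within a single transcript.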
Edgar
Starts with side-based, emphasizing the sun’s rays:
E: … and the rays hitting like directly right here. … once this thing turns, the
    country will be here and the sun can't reach as much
Shifts to typical closer-farther
E: that thing like is um further away once it orbit around the s- Earth- I
   mean the sun.
Zelda
Tilt-based explanation, with the tilt causing light to be more or less direct
Z: … that side of the Earth is tilting toward the sun, or it’s facing the sun or
   something so the sun shines more directly on that area, so it’s warmer.
Caden
Tilt-based explanation, with the tilt causing closer-farther
C: … the axis is pointing at an angle, so that sometimes … the northern
   half of the hemisphere is closer to the sun… .
Summary
• Used traditional data set:
  • Videos of interviews intended to study kids’ “prior conceptions” in
    science.
• Set out to produce a “knowledge-in-pieces” analysis
• Notable difficulties:
  • Small amount of data
  • Halting and ambiguous speech.
  • Gestures, diagrams are referenced
• Keep the methods as simple as possible
  • Deviationalizing is an exception
• Results are “suggestive”
  • (That we can capture features of student knowledge that are widely
    recognized to be important)
What does this buy me?
What role might these computational techniques play in the
toolkit of researchers who study the prior conceptions of
science students?
• Can we replace human coders?
  • Actually, a human played an important role here.
• They can play a role as a kind of independent support for the
 work of human analysts.
Open issues and next steps
1. Apply to subject matter other than the seasons
2. Systematic comparison to human analysis
3. Apply to answer some new research questions
4. Systematic investigation of alternative analysis methods
  • In the paper: (1) Different segment size, (2) Without deviationalizing

5. Why does this work?

BSherin LAK presentation

  • 1. Using computational methods to discover student science conceptions in interview data Bruce Sherin School of Education and Social Policy Northwestern University LAK 2012
  • 2. Goals of this work • Use computational analytic methods with traditional data • Videos of interviews intended to study kids’ “prior conceptions” in science. • Automate the traditional analysis • The traditional analysis: 1. Identify a set of “conceptions” 2. Code the data in terms of these conceptions. • Go as far as possible with simple analytic techniques.
  • 3. Some specifics • The data: 54 interviews with middle school students. • The subject matter: The earth’s seasons • The approach: Simple vector space models, clustering There are reasons to think automating the analysis of this data should be difficult • The amount of data is small • Student speech is halting and ambiguous • Gestures and diagrams are important
  • 4. Prior science conceptions • Prior science conceptions: The prior understandings that students bring to science learning • Bibliography by Duit (2009) lists over 8000 papers Health and disease Nature of matter Genetics Ecosystems Evolution Water cycle Geologic Time Weather • Two theoretical poles: • Theory-Theory: Prior science knowledge consists of relatively well- elaborated theories. • Knowledge-in-Pieces. Prior science knowledge consists of a moderately large number of not-well-organized conceptions. • In an interview, students may construct explanations in-the-moment, drawing on some of these conceptions.
  • 5. The seasons corpus • 54 interviews with middle school students • Our interview protocol, in brief: 1. “Why is it warmer in the summer and colder in the winter?” 2. Follow up questions for clarification. 3. Asked to make a drawing. 4. Follow up questions for clarification. 5. Challenges for certain answers.
  • 7. Example Interview: Edgar Starts with side-based, emphasizing the sun’s rays: E: Here’s the earth slanted. Here’s the axis. Here’s the North Pole, South Pole, and here’s our country. And the sun’s right, and the rays hitting like directly right here. So everything’s getting hotter over the summer and once this thing turns, the country will be here and the sun can't reach as much. It's not as hot as the winter. Shifts to typical closer-farther E: Actually, I don't think this moves it turns and it moves like that and it turns and that thing like is um further away once it orbit around the s- Earth- I mean the sun.
  • 8. Example Interview: Zelda Tilt-based explanation, with the tilt causing light to be more or less direct Z: Because, I think because the earth is on a tilt, and then, like that side of the Earth is tilting toward the sun, or it’s facing the sun or something so the sun shines more directly on that area, so its warmer.
  • 9. Example Interview: Caden Tilt-based explanation, with the tilt causing closer-farther I: So the first question is why is it warmer in the summer and colder in the winter? C: Because at certain points of the earth’s rotation, orbit around the sun, the axis is pointing at an angle, so that sometimes, most times, sometimes on the northern half of the hemisphere is closer to the sun than the southern hemisphere, which, change changes the temperatures. And then, as, as it’s pointing here, the northern hemisphere it goes away, is further away from the sun and get’s colder. I: Okay, so how does it, sometimes the northern hemisphere is, is toward the sun and sometimes it’s away? C: Yes because the at—I’m sorry, the earth is tilted on its axis. And it’s always pointed towards one position.
  • 10. Analysis Procedure 1. Clean transcripts, removing everything except words spoken by students 2. Break each transcript into 100-word segments, with a moving window that steps forward 25 words • Results in 794 segments 3. Map each segment to a vector 4. DeviationalizeTM the vectors 5. Cluster the vectors 6. Interpret the clusters 7. Apply clusters to analyze transcripts
  • 11. Mapping segments to vectors • Compile the vocabulary sun 4 2.1 • Stop list consisting of 782 words earth 2 1.7 • Results in vocabulary with 647 words side 0 0 away 2 1.7 • For each segment count number of tilted 1 1 occurrences of each of these words. closer 1 1 • Weight as 1 + log(count) axis 2 1.7 • Normalize day 0 0 • Result: 794 vectors, each with 647 farther 1 1 dimensions. time 3 2.1 … … …
  • 12. DeviationalizeTM • Average all of the segment vectors and replace each by their difference from this average.
  • 13. Cluster • Use hierarchical agglomerative clustering # of clusters Sizes of the clusters 10 19 72 9 68 140 62 44 122 136 122 9 19 72 68 62 44 122 136 122 149 8 19 72 68 44 122 136 122 211 7 72 68 44 122 122 211 155 6 68 44 122 122 211 227 5 68 122 122 211 271 4 122 122 271 279 3 271 279 244 What do the clusters mean?
  • 14.
  • 15. Apply clusters to transcripts For each transcript: • Segment into 100-word chunks • Find the vector for each segment • For each segment, find the dot product between the segment vector and each of the cluster centroids • Plot the results
  • 16. Edgar Starts with side-based, emphasizing the sun’s rays: E: … and the rays hitting like directly right here. … once this thing turns, the country will be here and the sun can't reach as much Shifts to typical closer-farther E: that thing like is um further away once it orbit around the s- Earth- I mean the sun.
  • 17. Zelda Tilt-based explanation, with the tilt causing light to be more or less direct Z: … that side of the Earth is tilting toward the sun, or it’s facing the sun or something so the sun shines more directly on that area, so its warmer.
  • 18. Caden Tilt-based explanation, with the tilt causing closer-farther C: … the axis is pointing at an angle, so that sometimes … the northern half of the hemisphere is closer to the sun… .
  • 19. Summary • Used traditional data set: • Videos of interviews intended to study kids’ “prior conceptions” in science. • Set out to produce a “knowledge-in-pieces” analysis • Notable difficulties: • Small amount of data • Halting and ambiguous speech. • Gestures, diagrams are referenced • Keep the methods as simple as possible • Deviationalizing is an exception • Results are “suggestive” • (That we can capture features of student knowledge that are widely recognized to be important)
  • 20. What does this buy me? What role might these computational techniques play in the toolkit of researchers who study prior conceptions of science students? • Can we replace human coders? • Actually, a human played an important role here. • Can play a role as kind of independent support for the work of human analysts!
  • 21. Open issues and next steps 1. Apply to subject matter other than the seasons 2. Systematic comparison to human analysis 3. Apply to answer some new research questions 4. Systematic investigation of alternative analysis methods • In the paper: (1) Different segment size, (2) Without deviationalizing 5. Why does this work?

Editor's Notes

  1. So, many of the presentations we’ve seen in this conference apply computational analytic methods to data that was itself collected by a computer. In contrast, in the work I’m going to describe here, I apply computational analytic methods to a very traditional kind of data. It’s a type of data that has been very important in the learning sciences: videos of clinical interviews used to study kids’ prior conceptions in science. I think this is a sensible thing to do, for a few reasons. Most importantly, because this is well-traveled territory, I know a lot about what constitutes a deep, learning-related analysis of the data. So that puts us in a position to see what computational techniques can do. And that’s what I am trying to do. I think I know what a deep analysis looks like. I want to see how much of the analysis that I usually do by hand can be automated. Very crudely, we can think of the analysis of this kind of prior conception data as involving two steps. First, we look across all of the data to identify a common set of conceptions. Then we code the data in terms of these conceptions. I’d like to automate both of these steps. Finally, given these objectives, I think it makes sense to start with simple methods, and ride them as far as possible, so we really understand exactly what each piece of the analysis is doing. ----- Meeting Notes (4/27/12 14:45) ----- 01:15
  2. Before diving in, I just want to do a little more to set the stage. The data I’ll be using is a corpus of 54 interviews with middle school students. I’m going to be looking at a portion of these interviews in which the students were asked to explain the Earth’s seasons. This makes sense because it has been a very heavily studied bit of subject matter. The analysis approach I’m using involves simple vector space models combined with hierarchical clustering. Finally, before starting, I wanted to point out that there are reasons to think automating the analysis of this data should be difficult. The amount of data is small, especially in comparison to the usual type of data we work with in learning analytics. As you’ll see, the data is very messy. Student speech is halting and ambiguous. It’s often hard even for human analysts to understand what kids are saying. And a lot of the communication is done with gestures and diagrams that the automated analysis won’t have access to. ----- Meeting Notes (4/27/12 14:56) ----- :50
  3. I want to tell you just a bit about research on kids’ prior conceptions in science. When I talk about prior conceptions, I’m talking about the prior understandings about the natural world that students bring with them to science learning. It’s pretty widely accepted in the science education literature that many of the key issues in science instruction revolve around the prior conceptions of students. And there’s been just a huge range of research on prior conceptions. For example, a prominent bibliography of student conceptions research, which was last updated in 2009, lists over 8000 papers on prior conceptions across many domains. It’s common in research on prior conceptions to contrast two theoretical poles. The first is the theory-theory perspective. According to this perspective, prior science knowledge consists of relatively well-elaborated theories. At the other extreme is the knowledge-in-pieces (KiP) perspective. According to this perspective, prior science knowledge consists of a moderately large number of not-well-organized conceptions. My own point of view lies closer to the KiP perspective. This has important implications for the type of analysis that I do. The way I understand it, when you ask a student a question in one of these interviews, they may construct explanations in the moment, drawing on the variety of conceptions they have available. For my analysis, then, I’m looking to identify these conceptions, and to understand how a student draws on these conceptions to construct their explanation in the moment. As far as I’m concerned, for a computational analysis to be worth its salt, it needs to be able to produce this kind of analysis. ----- Meeting Notes (4/27/12 15:01) ----- 1:30
  4. Okay, now I’ll give you more of a sense for what the data corpus is like. As I said, I have 54 interviews with middle school students. And the interview protocol went something like this. First, we started by asking why… Then the interviewer had the option to follow up with questions for clarification. Then we asked the student to make a drawing to illustrate their explanation. Then the interviewer again had the freedom to ask follow-up questions for clarification. Finally, the interviewer was prepared with a number of challenges designed for certain answers. ----- Meeting Notes (4/27/12 15:05) ----- :30
  5. We really see quite a diversity of explanations in these interviews. They are kind of all over the place. And it really looks like kids putting together pieces in various ways. But I can give you a sense for some of the most typical families of answers. For example, there are … ----- Meeting Notes (4/27/12 15:09) ----- :50
  6. Now I want to highlight three specific interviews that I’ll refer to later. The first interview is with a student named Edgar. Edgar started off by giving a version of a side-based explanation, with an emphasis on the Sun’s rays. He drew this picture and he said: … Then, after the interviewer asked him to elaborate, he spontaneously shifted to a closer-farther explanation. He said… As he said this, he was gesturing and pointing to his drawing. Hopefully it’s clear from this that the speech isn’t that easy to understand, and that a pure text transcript loses a lot of information. ----- Meeting Notes (4/27/12 15:18) ----- :50
  7. So that’s an example in which a student shifts from one explanation to another. Now, I’ll quickly tell you about two other interviews in which students pretty much stuck to a single explanation. Both of these students gave varieties of tilt-based explanations. Zelda gave an explanation that is very close to the correct explanation. It’s a tilt-based explanation, in which the hemisphere tilted toward the sun gets more direct sunlight. She said … . This is pretty much correct, although there’s a little whiff of side-based there. ----- Meeting Notes (4/27/12 15:20) ----- 0:40
  8. Finally, Caden gave a tilt-based explanation, but one that was less correct. In his explanation, the tilt of the Earth makes one hemisphere or the other closer to the sun (and thus warmer). He says: ----- Meeting Notes (4/27/12 15:21) ----- 0:20
  9. So that’s the data. Now I’ll tell you about the automated analysis. First I’ll give you the procedure in overview, then I’ll drill down in a few places and give more detail. First, I clean all of the student transcripts, removing everything… Second, I break each transcript into 100-word-long segments, with a moving window that steps forward 25 words at a time. Those choices were made heuristically, and I say a bit more about that in the paper. This method of segmenting results in 794 segments of text from the interviews. Then I map each of these 100-word segments to a vector in a high-dimensional space. Fourth, I deviationalize the vectors. Obviously I’ll have to tell you what I mean. Then I cluster the vectors. Sixth, we interpret the clusters that are produced. Finally, we use the clusters to analyze individual student transcripts. Now some detail. ----- Meeting Notes (4/27/12 15:28) ----- 01:00
  10. 01:10 To start, I’ll tell you more about how each segment is mapped onto a vector. There are many fancy vector space methods. But here I have tried to just use the most vanilla methods possible. First, I compile the vocabulary of all of the words that appear in the corpus. Then I remove all of the words that appear on a stop list. I used a standard stop list I work with that has 782 words. This results in a vocabulary with 647 words. So that’s a list of words like this. Then for each of the segments, I count how many times each of these words appears. That gives me a list of numbers that might look something like this. Then I reweight these counts by 1 + log(count). This is a relatively standard weighting function, and it has the effect of somewhat moderating the effects of large values. That might look like this. Finally, I normalize each of the vectors. So the result is a set of 794 vectors, one for each of the segments, each of which has 647 dimensions.
  11. So, here’s the one place where things get a little funky. Next I would like to cluster the vectors. But it turns out that if I just go ahead and try to cluster the vectors, I don’t ever get anything sensible. The problem, I think, is that the content of these segments, across the board, is just too similar. Every little part of every one of these interviews is focused on quite narrow content. It’s all about explaining the seasons, and just about everyone talks about the earth, the sun, how they move, etc. So we need a way to pop out the differences that do exist. I tried quite a range of standard approaches, but none of them worked. But this approach did work: I summed all of the segment vectors and computed their average. Then I replaced each of the vectors with its difference from that average. Here I show what it’s like if you have just two vectors. I’ve got v1 and v2, I find their average, and then I replace them with v1’ and v2’. That’s just for two vectors. But I can do the exact same procedure with all 794 vectors.
  12. After deviationalizing, we’re ready to cluster the vectors. I used hierarchical agglomerative clustering. In this audience, you might all know what that means. I start with each of the segment vectors in its own cluster. Then I iterate. At each iteration, I merge the two clusters that are most similar, which reduces the number of clusters by one. The result is that I start with 794 clusters, with each segment in its own cluster. Then I merge pairs of clusters, one step at a time. To decide which clusters to merge at each step, I computed the centroid of each cluster, and then merged the two clusters with the most similar centroids. So that process generates a series of candidate clusterings of the data. You can imagine it as a list of possibilities, starting with the possibility in which every segment is in its own cluster, and ending with them all in one big cluster. What I have here in this table are the sizes of the clusters that are formed at various stages in the process. The bottom row, for example, shows the results when the segments are grouped into three clusters. They contain 271 segments, 279 segments, and 244 segments respectively. As you move up the table, the number of clusters grows, and the size of each cluster shrinks. Across multiple analyses, I have found that working with a set of about 7 clusters strikes a workable balance. With 7 clusters, it is possible to resolve interesting features of the data, while producing results (in the form of graphs) that are not overly difficult to interpret. So we’ll work with this set of 7 clusters, which contain between 44 and 211 segments. The next question we have to deal with is what these clusters that we’ve identified mean. ----- Meeting Notes (4/27/12 15:49) ----- 01:41
  13. Each of the 7 clusters can be thought of as defined by its centroid vector, the average of all of the vectors that comprise the cluster. These centroids are each, in turn, described by a list of 647 entries, each of which corresponds to one of the words in the vocabulary. One way to attempt to understand the meaning of the clusters, then, is to look at the words that have the largest value in each centroid vector. So that’s what I’ve done here. For each of the 7 clusters, I’ve listed the 10 words that are most characteristic of that cluster, ignoring rare words. These tables have the words along with the corresponding values in the normalized cluster centroid. Remember that these are supposed to in some way be our conceptions. Some of these seem very suggestive. For example, cluster 1, which starts out with “tilted,” “toward,” and “ways,” seems to be an important piece of a tilt-based explanation. Cluster 4 seems to be an important piece of a side-based explanation. And cluster 7 seems to be a core bit of a closer-farther explanation. But these clusters are not supposed to necessarily align with full-fledged explanations of the seasons. They are clusters of segments, which it is hoped can align with smaller conceptual units that, when combined, form the basis of a constructed explanation. And some of the other clusters do seem to offer the possibility of an analysis of that sort. For example, we should expect tilt-based explanations to often be seen in concert with talk about the Earth’s hemispheres (Cluster 3). And recall that tilt-based explanations invoke different mechanisms by which the changing tilt of the earth impacts temperature. For example, Caden argued that the tilting of the Earth causes parts of the earth to be alternately closer to or farther from the sun. In contrast, Zelda’s explanation focused on how the Earth’s tilt impacts the angle and directness of the sun’s rays.
We should thus be able to see these ideas in combination when we look at individual interviews. Similarly, we should expect to see side-based explanations (Cluster 4) in tandem with clusters having to do with the rotation of the Earth. Ideas about the rotation of the Earth seem to appear in Cluster 2 and Cluster 6. Cluster 2 seems to truly be focused on the spinning of the Earth. Cluster 6, in contrast, seems to be more about day and night. Finally, cluster 5 seems to have something to do with rays of light striking the earth at an angle.
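Reading off the most characteristic words of a cluster, as described in this note, might look like the sketch below. The function name `top_words`, the `corpus_counts` argument, and the `min_count` cutoff used to ignore rare words are all hypothetical; the notes do not say how "rare" was defined.

```python
def top_words(centroid, vocabulary, corpus_counts, k=10, min_count=5):
    """Return the k (word, value) pairs with the largest centroid values.

    Words whose total corpus count is below `min_count` are skipped.
    Assumption: `min_count` is an illustrative threshold, not the study's.
    """
    candidates = [(value, word)
                  for value, word in zip(centroid, vocabulary)
                  if corpus_counts.get(word, 0) >= min_count]
    candidates.sort(reverse=True)  # largest centroid value first
    return [(word, value) for value, word in candidates[:k]]
```

The human step of deciding what each word list means still has to follow; the code only produces the lists.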
  14. The next step is to apply this set of conceptions back to the transcripts. To analyze one transcript I go through these steps: I break the transcript into 100-word segments, just as I did before. Then I find the vector for each segment. Then, for each segment, I find the dot product between the segment vector and each of the cluster centroids. The assumption is that vectors with a higher dot product are more similar. Finally, I plot the results in a relatively easy-to-interpret form.
  15. This is what that looks like with Edgar, the first student I talked about. Remember that Edgar started with… And then he shifted to… Here’s what the analysis of Edgar looks like. What this is showing is that Edgar’s transcript has been broken into 10 segments. For each of those segments, there’s a grouping of 7 bars, one corresponding to each of the 7 conceptions. The story here is pretty clear. In the first four segments, this blue bar dominates. The blue bar corresponds to cluster 5. In the latter 6 segments, the black bar, cluster 7, dominates. Cluster 5 was this cluster about light rays hitting. Cluster 7 is this farther-closer conception. And this transition happens right where it should happen in the transcript. So this seems to be capturing something very much like the analysis in terms of conceptions that is produced by human analysts.
  16. Now let’s look at Zelda. Remember, Zelda gave a consistent tilt-based explanation, in which the hemisphere tilted toward the sun receives more direct sunlight. Clearly these red bars dominate. They go with Cluster 1, which was this cluster that’s about tilting towards and away. You also see little bits of cluster 5 and cluster 4. Cluster 5 is about the rays hitting at an angle. Cluster 4 is about the side. Cluster 5 is really something you’d expect to see in the sort of correct explanation given by Zelda. When you’re talking about the directness of the light hitting the earth, you’re talking about the angle at which the sun’s rays hit the surface. So that’s what we’re seeing here with cluster 5. Cluster 4 you wouldn’t expect to see so much in a tilt-based explanation. But remember, in Zelda’s transcript, there really were some hints that she was slipping into side-based talk.
  17. Finally, let’s look at Caden. Remember, Caden also gave a tilt-based explanation. But his explanation was less correct than Zelda’s. In his explanation, the tilt of the earth affects the temperature because the hemisphere tilted toward the sun is closer to the sun. His graph does look a bit different. We’ve got red, yellow, and black bars that are positive. The red bar is the tilt cluster. It’s there but relatively small. The yellow bar is about hemispheres. Talk about hemispheres was always a big part of tilt-based explanations. One of the key features of tilt-based explanations is that they are able to explain why the different hemispheres experience different seasons at a given time. And Caden did talk quite a bit about the Earth’s hemispheres. It’s very nice to see the black bar there too. That’s the closer-farther conception. So that captures this important feature of Caden’s explanation, that the tilt works by making parts of the earth closer to or farther from the sun.
  18. Now I’m ready to sum up: I attempted to apply learning analytics to a type of data that has a long history in the learning sciences: videos of interviews intended to study kids’ prior conceptions in science. And I set out to use a learning-analytics-type approach to produce a knowledge-in-pieces-style analysis of that data. This meant that I was trying to produce an analysis of a certain sort. I see the student as possessing a number of relevant conceptions. And I want to understand how they construct an explanation, in the moment, out of those conceptions. This project faces some notable difficulties in comparison to other learning analytics efforts. The amount of data is small, student speech can be halting and ambiguous, and students are often referencing gestures and diagrams. Even with these difficulties, I set out to keep the methods as simple as possible. One exception was my use of deviationalizing. I am willing to confidently declare that the results are “suggestive.” What they suggest to me is that it might well be possible to use computational methods to capture features of student knowledge that are widely recognized to be important.
  19. One question that I want to ask is what role these computational techniques can play in the toolkit of researchers like me, who are interested in the study of prior conceptions of science students. An obvious question to ask in that regard is whether this type of analysis could replace human coders. Whether or not this may ultimately be possible, I should be clear that the analysis presented here was in many ways still highly dependent on human interpretation. For example, I had to make judgments about the appropriate number of clusters to work with. Even more importantly, I had to make sense of the lists of words that were associated with each of the clusters. Actually, replacing human coders is not where I think the big win will be, at least not early on. Instead, I believe that the biggest and most immediate contribution can be in the support these techniques can provide for traditional kinds of analysis. They can provide a kind of independent test of validity.
  20. Finally, I want to briefly mention some open issues and next steps. I’m going to just list these quickly. There’s a bit more about each of them in the paper. One obvious task would be to try to apply similar methods to subject matter other than the seasons. It’s possible that the relative success I’ve had has something to do with particular features of this subject matter. Second, my presentation here has been somewhat selective and anecdotal. Ultimately, I’d like to come up with a way to make a comparison to human analysts in a more systematic way. Third, it would be nice to apply this type of method to answer new research questions. I didn’t do this here. This was more like a test of concept, to see if I could produce something like analyses I had already done. Fourth, there’s a lot of work to be done to systematically investigate a range of alternative parameters and analysis methods. There’s a little bit more in the paper: I try different segment sizes and I show what happens if I run the analysis without deviationalizing. But the space of reasonable possibilities is quite large. Finally, I’d like to get a good handle on why this gives results that are at all reasonable. As a human analyst, for example, I feel like all parts of the data are crucial. I need to pay attention to what students draw, their gestures, their facial expressions. But the automated analyses have none of that. I think it would actually be quite interesting for my field to understand why this can work at all, without that information.