Learnersourcing: Improving Learning with Collective Learner Activity

Slides from my thesis defense: "Learnersourcing: Improving Learning with Collective Learner Activity"

Millions of learners today are watching videos on online platforms such as Khan Academy, YouTube, Coursera, and edX to take courses and master new skills. But existing video interfaces are not designed to support learning: they offer limited interactivity and little information about learners’ engagement or about the content itself. Addressing these limitations requires deep semantic information about video that even state-of-the-art AI techniques cannot fully extract. I take a data-driven approach to this challenge, using large-scale learning interaction data to dynamically improve video content and interfaces. Specifically, this thesis introduces learnersourcing, a form of crowdsourcing in which learners collectively contribute novel content for future learners while engaging in a meaningful learning experience themselves. I present learnersourcing applications designed for massive open online course videos and how-to tutorial videos, in which learners' collective activities 1) highlight points of confusion or importance in a video, 2) extract a solution structure from a tutorial, and 3) improve the navigation experience for future learners. This thesis demonstrates how learnersourcing can enable more interactive, collaborative, and data-driven learning.

  • Video has emerged as a primary medium for online learning, and millions of learners today are watching videos on web platforms such as Khan Academy, Coursera, edX, and YouTube.

  • Videos are great in that they can be accessed any time from anywhere, and learners can watch at their own pace.
    Also, once a video is published, millions of learners can learn from the same material.
    It’s an efficient and scalable way to deliver education.
  • [1 min]

    But does it necessarily mean that video is a better medium for learning?
    Unfortunately, I would argue that the answer is no, at least not at the moment.
    Let’s compare video learning against some of its competitors.
  • In in-person learning, like 1:1 private tutoring or classroom lectures, the learner and the instructor interact directly.
    This direct link enables many effective instructional strategies:
    learners can stay engaged, while instructors can provide immediate feedback and adapt their instruction.
  • In video learning, however, there is no direct channel between the learner and the instructor.
    The video interface is in the middle. While this asynchronous interaction makes scalable delivery possible,
    there are trade-offs involved: many of the good ingredients of 1:1 learning are missing.
    // most video interfaces are not really designed to support learning.

    As a result, many video learners resort to passive and isolated viewing.
    The video interface doesn’t really adapt to the learner.
    Finding information and navigating to specific points inside a video is also difficult.

    “video search has been a notoriously difficult problem.”

    I think these limitations come from two major problems that most video interfaces have.
    First, video interfaces don’t know who the learners are and how they are watching the video.
    Second, video interfaces don’t know what the content is and how to make it more helpful for learners.

    ===

    Human computation
         - Replaces computers with humans in a computational process
         - Problems that are hard for computers but easy for humans, or that humans and computers solve together

    Crowdsourcing
         - Replaces human workers with members of the public
         - “Outsourcing it to an undefined, generally large group of people in the form of an open call” - Jeff Howe

    Human computation is a different perspective on how humans and computers interact.
    It is different in that the human participation is determined by the computational framework.
    Humans are helping computers in the computation, and crowd intelligence is augmenting the computational power that the computer has.

  • For video learning to truly scale, I think it’s essential that the video interfaces understand the learners’ engagement and understand the content better.

    Understanding learner engagement is hard because the interfaces cannot reach beyond the screen to really know what learners are doing.
    “out in the wild”

    Understanding content is hard because it requires understanding the domain, the learning objective, and the presentational approach,
    each of which is a challenging problem on its own.
    “out in the wild”

    ===
    Yet we hope to be able to analyze learners’ video usage patterns and watching behaviors.
    (Interaction pattern & behavior analysis)

    And we hope to be able to mine, index, and summarize the video.
    (Video mining, indexing, summarization)
  • To address these challenges in video learning, I take a data-driven approach in my research.
    I use data generated from learners’ interaction with video content
    to understand and improve learning.

    The fact that we have a large group of learners watching the same video provides some unique opportunities.
    We now have a way to track the learning process at a fine-grained level, second by second and click by click,
    and to use this understanding to improve the content and video interfaces.

    What becomes important here are the tools to collect, process, and present this large-scale learning interaction data.
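    To make “collect and process” concrete: the raw interaction data look like the clickstream tuples shown later in the deck ([Learner3879, Video327, “play”, 35.6]). Below is a minimal Python sketch, not the actual pipeline, of how such events could be binned into per-second interaction counts for one video; the tuple layout is taken from the slide example and everything else is illustrative.

```python
from collections import Counter

def interaction_counts(events, video_id):
    """Bin raw clickstream events into per-second interaction counts for one video.

    `events` is assumed to be an iterable of (learner_id, video_id, action, time_sec)
    tuples, e.g. ("Learner3879", "Video327", "play", 35.6), mirroring the slide example.
    """
    counts = Counter()
    for learner_id, vid, action, time_sec in events:
        if vid == video_id and action in ("play", "pause", "jump"):
            counts[int(time_sec)] += 1   # 1-second buckets along the video timeline
    return counts                        # {second: number of interaction events}

# Usage (hypothetical log): interaction_counts(log_events, "Video327")
```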
  • too much focus on what learners DO.
    I’m telling my trick already. There’s no “wow, this is a neat solution to the problem”, because I’m giving away the suspense too easily.
    When I present my solution, people should be reacting, “oh, wow, that’s a neat solution"

    ===
    And my secret recipe is learnersourcing.

    As the name implies, learnersourcing is crowdsourcing with learners as a crowd.

    As we know, crowdsourcing has offered a new solution to many computationally difficult problems by introducing human intelligence as a building block.

    While crowdsourcing often issues an open call to an undefined crowd,
    learnersourcing uses a specialized crowd, learners who are inherently motivated and naturally engaged in their learning.
    This difference enables learnersourcing to tackle unique problems that neither computers nor general crowdsourcing can easily solve.

    And the idea is that the byproducts of learners’ natural learning activities can be used to
    dynamically improve future learners’ experience.


    ===

    Natural interaction with content, or specially designed interactions to collect useful input from learners
  • [5 min]

    Let me give you some concrete examples of what I mean by learnersourcing. A 30-second tour.
    What if the video understood how learners watch it, and adapted?

    Here’s a video player that adapts to learners’ watching behavior.
    As learners watch a video, the system analyzes the clickstream log for meaningful patterns.
    The system finds hot spots in the video where many learners felt confused.
    Based on this information, the video interface dynamically improves various features for future learners.
  • I implemented this idea in a video player called LectureScape.
    I’ll talk about it in more detail later in the talk.
  • Let’s look at another example.
    This time, the video player adds a new learning activity for video viewers.
    Learners are asked to summarize the section that they just watched.
    The system collects summary labels from multiple learners and finalizes them into a video outline.
    The UI then adds this outline next to the video player for future learners.
  • Some colleagues and I implemented this idea in a video player called Crowdy.
    I’ll also talk about it in more detail later in the talk.
  • I categorize learnersourcing into two models.

    In passive learnersourcing, I take the existing data streams coming from learners’ video watching activity
    and process and analyze them into meaningful information for future learners.

    In active learnersourcing, on the other hand,
    I design new learning activities that benefit learners AND contribute data that can be used to improve the learning experience for future learners.
  • In my research, I’ve designed, built, and studied various active and passive learnersourcing applications to support video learning at scale.
    I’ll cover most of them at least briefly later.
  • [7 min]

    Essentially, in learnersourcing, I strive to establish a feedback loop between the learner and the system.
    As the learner watches the video naturally or answers a pedagogically meaningful prompt,
    the byproducts of these activities are processed by the system to produce a meaningful outcome.
    The system then uses this outcome to dynamically improve content and UI for future learners.

    Hopefully by the end of the talk, you’ll be convinced that learnersourcing can…

    In large-scale video learning environments, interfaces powered by learnersourcing can enhance content navigation, create a sense of learning with others, and
    ultimately improve learning.


  • There are more than 6000 MOOCs now, some of which are from this very school and some from faculty in this room.

    A MOOC often includes tens to hundreds of video clips, and research shows that students taking a MOOC spend a majority of their time watching videos.

    But MOOC instructors often don’t have a good sense of how learners are using the videos.
  • Traditional classrooms, on the other hand, provide natural interaction data.
    Instructors in a classroom can visually check students’ engagement.
    Students might be engaged and paying attention, confused with a question, or bored and falling asleep.
    Or it might be the entire class that falls asleep.

    === relate to the current room and how I can adapt dynamically

    While online videos provide access any time from anywhere, they disconnect the interaction channel between instructor and students.
    Interaction between student and instructor, and among students themselves, is largely lost.

    ===
    http://www.flickr.com/photos/10816734@N03/8249942979/in/photolist-dz27Cc-7FGT5v-fvhqZF-auPupW-8c6a4M-9uEhUU-dZUzSs-i54S8p-i55y7F-i54SgR-i54ZH3-i55xKi-i55179-i55yck-i54S3V-i55y6i-i54ZLE-i55xY4-i553f9-cRANHJ-98HXSU-apaQNA-aGNKak-cJ2nes-cJ2niN-cJ2nuh-cJ2nbq-cJ2npU-8aJrP5-dgF74D-bBLYjg-ddYkR6-9qqiYV-98HTZN-bEhdMJ-7WnzsU-88biBa-8qU4Cy-eEknDg-7Cpa5o-eEoW74-dmhvoT-9f7aPF-9irnJd-brfBxH-bpxmdx-87vrbw-8tXMNr-am1uZL-cXJdEo-bDpXMx
    http://www.flickr.com/photos/armgov/4991079510/sizes/z/
    https://www.pc4all.co.kr/new_co_upload_view.asp?num=448&category=comm
  • For instructors recording videos for MOOCs, it’s like they are talking to a wall.
    Instructors have no way to see if students are engaged, confused, or bored.
    Students also have no way to express their reaction, or see how other students are watching the video.
    This seriously restricts our understanding of how students learn with the videos, and limits the video learning experience itself.

    ===
    // But a MOOC video experience is not like a classroom experience.
    // There’s a disconnect between instructors and students, in both time and location.

    http://www.flickr.com/photos/liquidnight/2214264870/sizes/l/
  • We first looked at a few major factors that might affect learners’ engagement with video.
    By performing a post-hoc analysis with average session length as the metric,
    we found that …

    These are useful guidelines for instructors and video editors, and when we interviewed a video editor at edX,
    he appreciated that much of what he knew from his experience and intuition had been confirmed.

    We then wanted to look at how learners navigate inside a video.
  • Watch sequentially: common when you watch for the first time

    Pause: you might be confused, pace is too fast

    Return: revisit important concept

    Skip: search for a specific part, or quickly review

    We’ve seen many such examples in the data.
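    As a rough illustration (not the study’s actual coding scheme) of how these behaviors can be read off the clickstream, a jump event can be labeled by its direction and distance:

```python
def classify_jump(from_sec, to_sec, threshold=3):
    """Label a single 'jump' event from the clickstream.

    A backward jump suggests re-watching earlier content; a forward jump
    suggests skipping or searching. `threshold` (seconds) ignores tiny scrubs.
    This labeling rule is illustrative, not the exact one used in the analysis.
    """
    if to_sec < from_sec - threshold:
        return "re-watch"    # returning to revisit an earlier concept
    if to_sec > from_sec + threshold:
        return "skip"        # searching for a specific part or skimming ahead
    return "sequential"      # effectively continuing in place
```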

  • It might be far-fetched to draw any conclusion based on one student’s data,
    but what if you had thousands of students’ data?

    ===

    If every student watches it differently, why would combining them all be helpful?
    Clarify why having the pattern is useful.
  • It might be far-fetched to draw any conclusion based on one student’s data,
    but what if you had thousands of students’ data?
    ===
    Now that we looked at why data matters, and how data can be used,
    let me tell you about the specific dataset we used for our analysis.
  • These hot spots in the video may indicate points of confusion or importance.
    "give me more time" / "I'm not tracking/following" and "this is important"

    ===
    Re-watching peak: focusing on nonsequential sessions
    Play peak: points of interest
    They often correlate, but not always.
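    As an illustration of the idea (the paper’s actual peak-detection procedure may differ), a peak can be flagged wherever the smoothed per-second event count, such as the counts produced by the earlier binning sketch, rises well above the video’s average activity:

```python
def find_interaction_peaks(counts, video_length, window=5, z=2.0):
    """Return seconds whose smoothed interaction count is unusually high.

    `counts` maps second -> number of events (e.g., play clicks or re-watch
    visits); `video_length` is the video duration in seconds. A second is a
    peak candidate when its moving average exceeds the overall mean by `z`
    standard deviations. Simplified stand-in for the published analysis.
    """
    series = [counts.get(s, 0) for s in range(video_length)]
    smoothed = []
    for s in range(video_length):
        w = series[max(0, s - window):s + window + 1]
        smoothed.append(sum(w) / len(w))
    mean = sum(smoothed) / len(smoothed)
    std = (sum((x - mean) ** 2 for x in smoothed) / len(smoothed)) ** 0.5 or 1.0
    return [s for s, x in enumerate(smoothed) if (x - mean) / std > z]
```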
  • Qualitative coding

    ----- Meeting Notes (2/24/14 16:16) -----
    Transitions from one presentation style to another (e.g., a code example, a cut, a talking head) often coincide with a peak.
  • Remember that our motivation looking at this data was to find ways to use this data to improve students’ learning experience.
    Let’s see how the data analysis I presented so far can be used to achieve this goal.
    ----- Meeting Notes (10/7/14 17:52) -----
    analysis is useful for instructors and video editors,
    but can we more directly impact students' learning experience?
    what if we use this data to dynamically change the way the video player works?
  • I’d like to focus on just a few of them.
  • It is an example of control-display ratio adaptation
    [5, 16, 37], dynamically changing the ratio between
    physical cursor movement and on-screen cursor movement.

    The faster the dragging, the weaker the friction. We achieve this effect by temporarily
    hiding the real cursor and replacing it with a phantom cursor
    that moves slower than the real cursor within peak ranges
    (Figure 3). The idea of enlarging the motor space around points of interest is borrowed from semantic pointing.
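    A minimal sketch of that control-display mapping, with an assumed constant friction factor (the real implementation also weakens the friction as drag speed increases):

```python
def phantom_cursor_x(real_x, prev_real_x, prev_phantom_x, peak_ranges, friction=0.35):
    """Map physical cursor movement to a slower on-screen phantom cursor near peaks.

    `peak_ranges` is a list of (start_px, end_px) timeline regions around
    interaction peaks. Inside a peak range only a `friction` fraction of the
    physical movement is applied, so the timeline feels 'sticky' there.
    """
    delta = real_x - prev_real_x                     # physical movement this frame
    in_peak = any(s <= prev_phantom_x <= e for s, e in peak_ranges)
    gain = friction if in_peak else 1.0              # control-display ratio adaptation
    return prev_phantom_x + gain * delta             # phantom cursor lags inside peaks
```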

  • ----- Meeting Notes (1/28/15 18:46) -----
    - simulated navigation tasks? simulated tasks that people are likely to encounter.
    - separate navigation vs learning.
    - what is within subjects?
    - say what UI evaluation is normally like: are you good at teaching UI concepts to the audience?
  • Removed “socially-ranked search”
  • Extending this preliminary lab evaluation, I’m currently collaborating with a team at edX to integrate the LectureScape player into the platform.
    edX instructors will have an option to opt in to use LectureScape as the default video player in our live deployment.
    Real class
    Streaming data
    Cold start problem, adaptive interfaces…
  • “we talk about millions of learners on MOOCs but it’s common for a single video to have millions of viewers.”
  • You can learn how to do almost anything online nowadays by watching how-to videos,
    including cooking, applying Photoshop filters, assembling furniture, and applying makeup.
  • How do you go back to a step that you missed for the first time?
    You have to use the timeline slider to make imprecise estimates.

    ===
    Q: Why videos in the first place?
    A: 1) Formative study at Adobe last year showed that it is a preference issue. There are certain types of people who learn better with videos, and they would turn to videos as much as possible, while others use static, step-by-step HTML tutorials.

    2) Video captures physical procedures. Step-by-step sometimes skips important steps.

    Limits in navigation affect the learning experience and turn people away from watching videos.
  • They have a specific structure, which is that they contain step-by-step instructions.

    What are some of the properties of how-to videos we can leverage in improving the learning experience?
  • Combining the lessons from the literature, we can conclude that seeing and interacting with the solution structure helps.
  • Ask a question: what are some steps you can think about?
    May be hard to remember and understand what each step does.
  • ToolScape adds the interactive timeline to let learners click each tool or work-in-progress image to repeat, jump, or skip steps in the workflow.

    Wadsworth constant

  • The before-task ratings were similar across conditions.
    The self-efficacy measure didn’t change much after the task in the baseline condition,
    whereas the ToolScape condition showed a significant increase.

  • In addition to feeling more confident, learners also believed that they performed better with ToolScape.

  • Finally, external raters ranked designs produced with ToolScape as better.
    (The metric is a ranking, so lower numbers are better.)

    Considering that these are the same videos and we didn’t improve the content,
    it is quite surprising that the way learners interacted with the material affected the outcome.

    Summary: "navigational, self-efficacy, and performance benefits of having step-by-step information about the solution"
  • Mention automatic method
  • ----- Meeting Notes (4/29/14 09:20) -----
    So our solution to the annotation challenge is the multi-stage crowdsourcing workflow.
  • Mention DB-Scan
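    To illustrate why a density-based clustering step (DBSCAN) fits here: different workers mark the same step at slightly different times, so nearby candidate timestamps are merged and only clusters with enough agreement survive. A simplified 1-D sketch with made-up parameters:

```python
def cluster_step_times(times, eps=5.0, min_support=2):
    """Group nearby candidate step timestamps (seconds), DBSCAN-style in one dimension.

    Sorted timestamps within `eps` seconds of the previous one join the same
    cluster; clusters backed by at least `min_support` submissions are kept and
    represented by their mean time. Parameters here are illustrative.
    """
    clusters, current = [], []
    for t in sorted(times):
        if current and t - current[-1] > eps:
            clusters.append(current)
            current = []
        current.append(t)
    if current:
        clusters.append(current)
    return [sum(c) / len(c) for c in clusters if len(c) >= min_support]

# cluster_step_times([12.1, 13.0, 14.2, 47.8, 48.5, 90.0]) -> [13.1, 48.15]
# (the lone 90.0 candidate has no support and is dropped)
```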

  • ----- Meeting Notes (4/29/14 09:20) -----
    We automatically create a multiple choice question
  • How do we pick the label after we break ties?
  • Make it clear why we name it “Expand”

    Avoid saying “Label”. It’s choosing or identifying images to accompany a step.
    ----- Meeting Notes (4/29/14 09:20) -----
    animate for after?
  • How do we pick the frame after we break ties?
    Retake the screenshot?
    ----- Meeting Notes (4/29/14 09:20) -----
    why not apply pixel diff and string diff in advance?
    these are often not identical, so we first wanted to leave it to Turkers, and then apply automatic methods. we might unnecessarily filter things.

  • ----- Meeting Notes (5/1/14 08:11) -----
    1. domain-independent
    2. 80% good? Obviously it’s not perfect, but looking into the remaining 20% revealed interesting things.
  • Crowdy: video learning website with learnersourcing workflow
  • These reflective exercises can be beneficial to learning. Self-explanation, meta-cognition, paying attention.
  • Also useful activity where you get a chance to “compare” multiple alternatives.
  • “Evaluate” the scope of the description.
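    A minimal sketch of how the three stages’ output could be combined, using made-up field names rather than Crowdy’s actual schema: Stage 1 proposes candidate subgoal labels per section, Stages 2-3 add votes, and the top-voted label per section becomes the subgoal shown next to the video.

```python
def finalize_subgoals(proposals, votes):
    """Pick one subgoal label per video section from learnersourced input.

    `proposals` maps section_id -> list of candidate labels (Stage 1 output).
    `votes` maps (section_id, label) -> upvote count (Stages 2 and 3).
    Ties keep the earliest proposal, since max() returns the first maximum.
    """
    outline = {}
    for section, labels in proposals.items():
        if labels:
            outline[section] = max(labels, key=lambda label: votes.get((section, label), 0))
    return outline

# Illustrative example:
# proposals = {1: ["Load the dataset", "Import data"], 2: ["Fit the model"]}
# votes = {(1, "Load the dataset"): 4, (1, "Import data"): 2}
# finalize_subgoals(proposals, votes) -> {1: "Load the dataset", 2: "Fit the model"}
```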
  • 922 subgoals created (Stage 1), 966 upvotes (Stage 2), 527 upvotes (Stage 3)
  • 14 out of 17
    learner labels matching or better than expert labels
  • the choices “...made me feel as though I was on the same wavelength still.”
  • Crowdy: learning activity design
    this shows a pattern in my learnersourcing research
    design a meaningful activity, find a way to create something useful
  • Now I’ll briefly mention some of my future research ideas, and wrap up.
  • Many learnersourcing applications that I presented today deal with clickstream data or simple prompts.
    I think there are exciting opportunities in broadening the scope of learnersourcing to other learning contexts beyond videos.
    - Programming IDE that learnersources multiple implementations of the same function?
    - Graphical design tool that learnersources multiple visual assets?
    - Writing tool that learnersources multiple expressions and phrases?

    With these ideas, we can make the existing creativity support tools more social and interactive.



    ===
  • I’ve shown how we can add annotations to existing videos with learnersourcing workflows.
    Now that we have the enabling technology, what if we had thousands of videos fully annotated and summarized?
    What interesting applications can we build?

    Now that videos are indexed at the step level,
    we can look at 100 different ways to perform a step and see which approaches are more and less common.
    Based on this data, we can also make search, browsing, and recommendation work at the step level.

    ===
    Similar for more conceptual lecture videos where students’ alternative explanations can be indexed, browsed, etc.
  • We can also push the boundary of learnersourcing further by expanding the role of learners.
    What if entire course materials were created, taught, and improved by learners?
    With large-scale explanations, feedback, and improvements that are all learnersourced,
    learners take a more active role in learning, and future learners can choose the best set of resources that work for them.
  • Many social domains suffer from the same limitations that video learning does:
    members are passive and isolated, and there is no channel for expressing individual input and contributing in a meaningful way.

    Learnersourcing presents a model where micro-contributions from members of a community can make a difference.
    This conceptual idea can generalize to various social domains such as open government, nutrition, healthcare, and accessibility, to name a few.

    And the goal is to support more community-driven planning, discussion, decision making, and creative processes.

    ===
    A lot of community practices and civic issues matter to members, but the decision processes and complicated structures are often not accessible to them. The conceptual idea behind learnersourcing offers a technique for engaging community members in the process while they take part in natural, voluntary, intrinsically motivated activities.
  • I started this talk by asking the question of “how can we make video learning really scale?”
  • I think unfortunately this is where we are. The learning experience becomes worse while delivery of content scales up.

    ===
    As the number of learners increases, fully supporting the good ingredients of 1:1 tutoring becomes harder and harder.
  • Many researchers interested in learning at scale are working hard to enable the good components of in-person learning in online settings.
    And learnersourcing is one such attempt at creating online learning environments that truly scale.
  • But with the unique opportunities we have because of the scale, because of the data, and because of the video,
    I believe we can be more ambitious and provide an even better learning experience than in-person learning.
    Learnersourcing is a step toward this vision, by supporting more interactive, collaborative, and data-driven learning.

    ===
    "beyond being there": technology for distance work, take advantage of new medium. this kind of logic can be applied in my intro.
  • Methods for collecting and processing large-scale data from learners
  • Learnersourcing: Improving Learning with Collective Learner Activity

    1. 1. Juho Kim (MIT CSAIL) Learnersourcing: Improving Learning with Collective Learner Activity
    2. 2. Video learning at scale
    3. 3. Video enables learning at scale
    4. 4. # of learners: one, hundreds, millions. Scalable delivery ≠ Scalable learning
    5. 5. In-person learning: Direct learner-instructor interaction Effective pedagogy
    6. 6. Video learning: Mediated learner-instructor interaction Video interfaces are limiting.
    7. 7. Challenges in video learning at scale: no information about learners, no information about content, lack of interactivity
    8. 8. Challenges in video learning at scale Understand learners’ engagement Understand video content Support interactive learning
    9. 9. Data-Driven Approach use data from learner interaction to understand and improve learning second-by-second process tracking data-driven content & UI updates
    10. 10. Learnersourcing crowdsourcing with learners as a crowd
    11. 11. Learnersourcing crowdsourcing with learners as a crowd inherently motivated naturally engaged
    12. 12. Learnersourcing crowdsourcing with learners as a crowd Learners’ collective learning activities dynamically improve content & UI for future learners. inherently motivated naturally engaged
    13. 13. Learners watch videos. UI provides social navigation & recommendation. System analyzes interaction traces for hot spots. [Learner3879, Video327, “play”, 35.6] [Learner3879, Video327, “pause”, 47.2] … Video player adapts to collective learner engagement
    14. 14. Learners are prompted to summarize video sections. UI presents a video outline. System coordinates learner tasks for a final summary. What’s the overall goal of the section you just watched? X ……………… V ……………… X ……………… X ……………… Video player coordinates learners to generate a video outline
    15. 15. Two types of learnersourcing Passive track what learners are doing Active ask learners to engage in activities
    16. 16. Learnersourcing applications for educational videos: ToolScape [CHI 2014], Interaction Peaks [L@S 2014], LectureScape [UIST 2014], RIMES [CHI 2015], Mudslide [CHI 2015], Crowdy [CSCW 2015]
    17. 17. Learnersourcing requires a multi-disciplinary approach Crowdsourcing – Quality control, Task design, Large-scale input mgmt Social computing – Incentive design, Sense of community among learners UI design – Data-driven & dynamic interaction techniques Video content analysis – Computer vision, Natural language processing Learning science – Pedagogically useful activity, Theoretical background
    18. 18. Thesis statement “In large-scale video learning environments, interfaces powered by learnersourcing can enhance content navigation, create a sense of learning with others, and improve engagement and learning.”
    19. 19. I. Passive learnersourcing (MOOC videos) – Video player clickstream analysis [L@S 2014a, L@S 2014b] – Data-driven content navigation [UIST 2014a, UIST 2014b] II. Active learnersourcing (how-to videos) – Step-by-step information [ CHI 2014] – Summary of steps [CSCW 2015]
    20. 20. I. Passive learnersourcing (MOOC videos) – Video player clickstream analysis [L@S 2014a, L@S 2014b] – Data-driven content navigation [UIST 2014a, UIST 2014b] II. Active learnersourcing (how-to videos) – Step-by-step information [ CHI 2014] – Summary of steps [CSCW 2015]
    21. 21. Video lectures in MOOCs
    22. 22. Classrooms: rich, natural interaction data armgov on Flickr | CC by-nc-saMaria Fleischmann / Worldbank on Flickr | CC by-nc-nd Love Krittaya | public domain unknown author | from pc4all.co.kr
    23. 23. liquidnight on Flickr | CC by-nc-sa
    24. 24. First MOOC-scale video interaction analysis Data Source: 4 edX courses (fall 2012) Domains: computer science, statistics, chemistry Video Events: start, end, play, pause, jump Learners Videos Mean Video Length Processed Video Events 127,839 862 7:46 39.3M
    25. 25. Factors affecting video engagement Shorter videos - significant drop after 6 mins Informal shots over studio production - more personal feel helps Tablet drawing tutorials over slides - continuous visual flow helps How Video Production Affects Student Engagement: An Empirical Study of MOOC Videos. Philip J. Guo, Juho Kim, Rob Rubin. Learning at Scale 2014. Metric: session length
    26. 26. How do learners navigate videos? • Watch sequentially • Pause • Re-watch • Skip / Skim • Search
    27. 27. Collective interaction traces (rows: Learner #1 through Learner #7888; x-axis: video time)
    28. 28. Collective interaction traces into interaction patterns (x-axis: video time, y-axis: interaction events)
    29. 29. Interaction peaks: temporal peaks in the number of interaction events, where a significant number of learners show similar interaction patterns. Understanding In-Video Dropouts and Interaction Peaks in Online Lecture Videos. Juho Kim, Philip J. Guo, Daniel T. Seaton, Piotr Mitros, Krzysztof Z. Gajos, Robert C. Miller. Learning at Scale 2014.
    30. 30. What causes an interaction peak? Video interaction log data Video content analysis – Visual content (video frames) – Verbal content (transcript)
    31. 31. Observation: Visual / Topical transitions in the video often coincide with a peak.
    32. 32. Returning to content (x-axis: video time, y-axis: # play button clicks)
    33. 33. Beginning of new material (x-axis: video time, y-axis: # play button clicks)
    34. 34. Data-driven video interaction techniques Use interaction peaks to • draw learners’ attention • support diverse navigational needs • create a sense of learning with others
    35. 35. LectureScape: Lecture video player powered by collective watching data Data-driven interaction techniques for improving navigation of educational videos. Juho Kim, Philip J. Guo, Carrie J. Cai, Shang-Wen (Daniel) Li, Krzysztof Z. Gajos, Robert C. Miller. UIST 2014.
    36. 36. “Where did other learners find it confusing / important?” “I want a quick overview of this clip.” “I want to see that previous slide.”
    37. 37. Roller coaster
    38. 38. Phantom cursor • Visual & physical emphasis on interaction peaks • Read wear [Hill et al., 1992], Semantic pointing [Blanch et al., 2004], Pseudo-haptic feedback [Lécuyer et al., 2004]
    39. 39. Visual clip highlights • Interaction data + frame processing
    40. 40. pinning
    41. 41. Pinning: Automatic side-by-side view Pinned slide Video stream
    42. 42. Lab study: 12 edX & on-campus students • LectureScape vs baseline interface • Navigation & learning tasks Visual search “Find a slide where the instructor displays on screen examples of the singleton operation.” Problem search “If the step size in an approximation method decreases, does the code run faster or slower?” Summarization “write down the main points of a video in three minutes.”
    43. 43. Diverse navigation patterns With LectureScape: • more non-linear jumps in navigation • more navigation options - rollercoaster timeline - phantom cursor - highlight summary - pinning “[LectureScape] gives you more options. It personalizes the strategy I can use in the task.”
    44. 44. Interaction data give a sense of “learning together” Interaction peaks matched with participants’ points of “confusion” (8/12) and “importance” (6/12) “It’s not like cold-watching. It feels like watching with other students.” “[interaction data] makes it seem more classroom-y, as in you can compare yourself to how other students are learning and what they need to repeat.”
    45. 45. Summary of passive learnersourcing • Unobtrusive, adaptive use of interaction data • Analysis of MOOC-scale video clickstream data • LectureScape: video player powered by learners’ collective watching behavior • Data-driven interaction techniques for social navigation
    46. 46. I. Passive learnersourcing (MOOC videos) – Video player clickstream analysis [L@S 2014a, L@S 2014b] – Data-driven content navigation [UIST 2014a, UIST 2014b] II. Active learnersourcing (how-to videos) – Step-by-step information [ CHI 2014] – Summary of steps [CSCW 2015]
    47. 47. how-to videos online
    48. 48. Navigating how-to videos is hard find repeat skip
    49. 49. How-to videos contain a step-by-step solution structure Apply gradient map
    50. 50. Completeness & detail of instructions [Eiriksdottir and Catrambone, 2011] Proactive & random access in instructional videos [Zhang et al., 2006] Interactivity: stopping, starting and replaying [Tversky et al., 2002] Subgoals: a group of steps representing task structures [Catrambone, 1994, 1998] Seeing and interacting with solution structure helps learning
    51. 51. Learning with solution structure helps
    52. 52. Learning with solution structure helps
    53. 53. Learning with solution structure helps
    54. 54. Improving how-to video learning Interacting with the solution • UI for solution structure navigation Seeing the solution • Extract steps + subgoals at scale
    55. 55. Improving how-to video learning Interacting with the solution • UI for solution structure navigation Seeing the solution • Extract steps + subgoals at scale
    56. 56. ToolScape: Step-aware video player Crowdsourcing Step-by-Step Information Extraction to Enhance Existing How-to Videos. Juho Kim, Phu Nguyen, Sarah Weir, Philip J. Guo, Robert C. Miller, & Krzysztof Z. Gajos. CHI 2014. Best of CHI Honorable Mention.
    57. 57. work in progress images parts with no visual progress step labels & links
    58. 58. Study: Photoshop design tasks 12 novice Photoshop users manually annotated videos
    59. 59. Baseline ToolScape
    60. 60. Participants felt more confident about their design skills with ToolScape. – Self-efficacy gain – Four 7-point Likert-scale questions – Mann-Whitney’s U test (Z=2.06, p<0.05), error bar: standard error. Before-task rating: 3.8 (both conditions); self-efficacy gain after task: 1.4 (ToolScape) vs 0.1 (Baseline)
    61. 61. Participants believed they produced better designs with ToolScape. – Self-rating on designs produced – One 7-point Likert-scale question – Mann-Whitney’s U test (Z=2.70, p<0.01), error bar: standard error. Self-rating: 5.3 (ToolScape) vs 3.5 (Baseline)
    62. 62. Participants actually produced better designs with ToolScape. – External rating on designs – Krippendorff’s alpha = 0.753 – Wilcoxon signed-rank test (W=317, Z=-2.79, p<0.01, r=0.29) – Error bar: standard error. Mean ranking (lower is better): 5.7 (ToolScape) vs 7.3 (Baseline)
    63. 63. Improving how-to video learning Interacting with the solution • UI for solution structure navigation Seeing the solution • Extract steps + subgoals at scale
    64. 64. Extracting solution structure • Step-by-step information extraction • Subgoal label generation
    65. 65. Goals for annotation method • domain-independent • existing videos • non-expert annotators Learners Crowd workers
    66. 66. Crowd-powered algorithms improvement $0.05 3 votes @ $0.01 … Crowd workflow for complex tasks • Soylent [UIST 2010], CrowdForge [UIST 2011], PlateMate [UIST 2011], Turkomatic [CSCW 2012]
    67. 67. Extracting solution structure • Step-by-step information extraction • Subgoal label generation
    68. 68. 3. Before/after results per each step 1. Step time 2. Step label Desired annotations
    69. 69. Multi-stage annotation workflow When & What are the steps? Vote & Improve Before/After the steps? FIND VERIFY EXPAND Crowdsourcing Step-by-Step Information Extraction to Enhance Existing How-to Videos. Juho Kim, Phu Nguyen, Sarah Weir, Philip J. Guo, Robert C. Miller, & Krzysztof Z. Gajos. CHI 2014. Best of CHI Honorable Mention.
    70. 70. When & What are the steps? Vote & Improve Before/After the steps? FIND VERIFY EXPAND Input video
    71. 71. When & What are the steps? Vote & Improve Before/After the steps? FIND VERIFY EXPAND Input video
    72. 72. When & What are the steps? Vote & Improve Before/After the steps? FIND VERIFY EXPAND Input video
    73. 73. When & What are the steps? Vote & Improve Before/After the steps? FIND VERIFY EXPAND Input video
    74. 74. When & What are the steps? Vote & Improve Before/After the steps? FIND VERIFY EXPAND Input video Output timeline
    75. 75. Evaluation • Generalizable? 75 Photoshop / Cooking / Makeup videos • Accurate? precision and recall against trained annotators’ labels • Non-expert annotators?
    76. 76. Across all domains, ~80% precision and recall Domain Precision Recall Cooking 0.77 0.84 Makeup 0.74 0.77 Photoshop 0.79 0.79 All 0.77 0.81 Precision: % correct labels extracted by crowd Recall: % ground truth labels extracted by crowd
    77. 77. Timing is 2.7 seconds off on average Ground truth: one step every 17.3 seconds 2.7 seconds User 1 User 2 User 3
    78. 78. Extracting solution structure • Step-by-step information extraction • Subgoal label generation
    79. 79. • Requires domain experts and knowledge extraction experts to work together. [Catrambone, 2011] • Insight: the subgoal labeling process is a good exercise for learning! – Reflect on – Explain – Summarize Generating subgoal labels is difficult
    80. 80. Multi-stage learnersourcing workflow Learnersourcing Subgoal Labels for How-to Videos. Sarah Weir, Juho Kim, Krzysztof Z. Gajos, & Robert C. Miller. CSCW 2015.
    81. 81. Stage 1. Generate subgoal labels • Learner: summarize
    82. 82. Stage 2. Evaluate candidate labels • Learner: compare
    83. 83. Stage 3. Proofread subgoal labels • Learner: inspect
    84. 84. Sidebar w/ interactive subgoals & steps
    85. 85. Crowdy evaluation • Does participating in learnersourcing improve learning? • Does the learnersourcing workflow produce good subgoal labels?
    86. 86. Study 1: Pedagogical benefits of learnersourcing • 300 Turkers • Intro stats video • IV: 3 video interfaces (Between-subjects) • DV: – Learning • Pretest + Posttest • Retention test (3-5 days after video watching) – Workload (NASA TLX Test)
    87. 87. Baseline - No prompting - No subgoal shown Expert - No prompting - Subgoal shown Crowdy - Prompting - Subgoal shown
    88. 88. Retention test: Crowdy = Expert > Baseline. 1-way ANOVA: F(2, 226)=3.6, p<0.05, partial η²=0.03; Crowdy vs Baseline: p<0.05, Cohen’s d=0.38; Expert vs Baseline: p<0.05, Cohen’s d=0.35. 1-way ANOVA: F(2, 226)=4.8, p<0.01, partial η²=0.04; Crowdy vs Baseline: p<0.05, Cohen’s d=0.38; Expert vs Baseline: p<0.01, Cohen’s d=0.45. Error bar: standard error.
    89. 89. Pretest + Posttest scores were not different across conditions. One-way ANOVA: p > 0.05 Error bar: Standard error
    90. 90. Crowdy didn’t add additional workload. Questions on mental demand, physical demand, temporal demand, performance, effort, and frustration 7-point Likert scale (1: low workload, 7: high workload) One-way ANOVA: p > 0.05 Error bar: Standard error
    91. 91. Study 2: Subgoal labeling quality • ~50 web programming + statistics videos • Classroom + live website deployment • ~1,000 participating users (out of ~2,500 visitors) • 922 Stage 1 subgoals created, 966 Stage 2 upvotes, 527 Stage 3 upvotes
    92. 92. Analyzed 4 most popular videos 4 external raters compared expert vs learner subgoals Subgoal quality evaluation
    93. 93. Majority of learner-generated subgoals were rated as matching or better than expert-generated ones. Analyzed 4 most popular videos 4 external raters compared expert vs learner subgoals
    94. 94. Interview with learners & creator • Learners • Creator “I was more... attentive to watching, to trying to understand what exactly am I watching.” “Having pop up questions means the viewer has to be paying attention.” the choices “...made me feel as though I was on the same wavelength still.”
    95. 95. Learnersourcing design principles Crowdsourcing simple and concrete task quality control data collection microscopic, focused task cost: money Learnersourcing pedagogically meaningful task incentive design learning + data collection overall contribution visible cost: learners’ time & effort
    96. 96. Summary of active learnersourcing • Techniques for extracting solution structure from existing videos • Video UIs for learning with steps & subgoals • Studies on learning benefits + label quality • Learnersourcing activity design: Engaging & pedagogically meaningful tasks, while byproducts make useful information
    97. 97. Future research agenda
    98. 98. • Richer learner responses Learnersourcing research agenda
    99. 99. • Richer learner responses • Large-scale corpus of annotated videos - Multiple learning paths - Deep search, browsing, recommendation Learnersourcing research agenda
    100. 100. • Richer learner responses • Large-scale corpus of annotated videos - Multiple learning paths - Deep search, browsing, recommendation • Completely learnersourced course - Course created, taught, improved entirely by learners Learnersourcing research agenda
    101. 101. Generalizing learnersourcing community-guided planning, discussion, decision making, collaborative work - Conference planning [UIST 2013, CHI 2014, HCOMP 2013, HCOMP 2014] - Civic engagement [ CHI 2015, CHI 2015 EA]
    102. 102. Learning at scale: Does learning scale? learning benefit per learner # of learners
    103. 103. Learning at scale: Does learning scale? learning benefit per learner # of learners
    104. 104. Learning at scale research: Enable the good parts of in-person learning, at scale learning benefit per learner # of learners
    105. 105. Vision for learnersourcing learning benefit per learner Interactive, collaborative, data-driven online education # of learners
    106. 106. Contributions Learnersourcing: support video learning at scale • UIs – Novel video interfaces & data-driven interaction techniques powered by large-scale learning interaction data • Workflows – Techniques for inferring learner engagement from clickstream data, and extracting semantic information from educational videos • Evaluation studies – Studies measuring pedagogical benefits, resulting data quality, and learners’ qualitative experiences
    107. 107. Learnersourcing: Improving Learning with Collective Learner Activity Juho Kim | MIT CSAIL | juhokim@mit.edu | juhokim.com ToolScape Interaction Peaks LectureScape Crowdy
