Writing and Speech Recognition


    Writing and Speech Recognition - Presentation Transcript

    1. Speech, Ink, and Slides: The Interaction of Content Channels
      • Richard Anderson, Crystal Hoyer, Craig Prince, Jonathan Su, Fred Videon, Steve Wolfman
      Repeat Intro of Self
      Mention: Richard and Jonathan In Audience
    2. Background
      • “Content channels” refers to the various sources of information in a given context (e.g. audio, slides, digital ink, video)
      • Our focus is on the use of digital ink in the classroom setting
      • We want to capture/playback/analyze these channels intelligently
    3. Why do we want to analyze content channels?
      • We want to make it easier to interact with electronic materials
        • Better search and navigation of presentations
        • Accessibility for the hearing/learning/visually impaired
        • Generating text transcripts
        • Recognizing high level behaviors
      Conversion to: Braille/Screen Reader
    4. Distance Learning Classes
    5. Classroom Presenter
      • General tool for giving presentations on the Tablet PC
      • Many similar systems – our findings applicable to all such systems
      • Enables writing directly on the slides
      • Tablet PC enables high-quality digital ink
      • Used in over 100 courses so far
      • Allows us to collect real usage data
    6. Questions We Wanted to Explore
      • High Level Question: What is the potential for automatic analysis of archived content?
      • Other Questions:
        • How well can digital ink be recognized by itself?
        • How closely are different content channels tied together?
          • Speech and Ink?
          • Ink and Slide Content?
        • Can we identify high level behaviors by analyzing the content channels?
    7. Research Methodology
      • We wanted to understand what real presentation data is like
      • We collected several hundred hours of recorded lectures from distance learning classes
      • Analyzed the data in various ways to help answer our guiding questions.
        • Note: All examples given here are from real presentations!
    8. Outline
      • Motivation
      • Handwriting Recognition
      • Joint Writing and Speech Recognition
      • Attentional Mark Identification
      • Activity Inference: Recognizing Corrections
    9. Handwriting Recognition
      • Classroom lectures on Tablet PC offer interesting challenges for handwriting recognition
        • Somewhat Awkward
          • Small Surface to Write On
          • Bad Angle to the Tablet PC
        • Hastily Written
          • Concentrating on Speaking
          • Excited / Nervous
    10. Recognition Examples
      • The Good:
      • The Bad:
      • The Ugly:
      Mark: Success/Failure
    11. Recognition Procedure
      • Studied isolated words/phrases written on slides
      • Removed all non-textual ink
      • Fed through the Microsoft Handwriting Recognizer
      • No training done!
    12. Handwriting Recog. Results
                   Exact        Alternate    Close      None
      Prof. A      16 (88%)     1 (6%)       0 (0%)     1 (6%)
      Prof. B      146 (59%)    26 (10%)     6 (2%)     71 (29%)
      Prof. C      18 (42%)     5 (11%)      1 (3%)     19 (44%)
      Prof. D      262 (61%)    45 (11%)     9 (2%)     111 (26%)
      Prof. E      408 (79%)    46 (9%)      2 (<1%)    58 (11%)
      Total        850 (68%)    123 (10%)    18 (1%)    260 (21%)
      Mention That These Results Are Surprisingly Good!
      Each Row Represents a Different Lecturer
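The Total row of the handwriting results can be recomputed from the per-lecturer counts. The sketch below is only a consistency check on the numbers shown on the slide; the helper names (`COUNTS`, `column_totals`, `percentages`) are illustrative and not part of the original study.

```python
# Per-lecturer counts copied from the slide, in the order
# (exact, alternate, close, none).
COUNTS = {
    "Prof. A": (16, 1, 0, 1),
    "Prof. B": (146, 26, 6, 71),
    "Prof. C": (18, 5, 1, 19),
    "Prof. D": (262, 45, 9, 111),
    "Prof. E": (408, 46, 2, 58),
}


def column_totals(counts):
    """Sum each outcome column across all lecturers."""
    return tuple(sum(column) for column in zip(*counts.values()))


def percentages(row):
    """Express each count in a row as a whole-number percentage of the row sum."""
    total = sum(row)
    return tuple(round(100 * n / total) for n in row)


totals = column_totals(COUNTS)
print(totals)               # (850, 123, 18, 260)
print(percentages(totals))  # (68, 10, 1, 21)
```

The recomputed totals agree with the slide: 850 of 1251 words (68%) were recognized exactly with no recognizer training.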
    13. Outline
      • Motivation
      • Handwriting Recognition
      • Joint Writing and Speech Recognition
      • Attentional Mark Identification
      • Activity Inference: Recognizing Corrections
      Look at Potential
    14. Joint Writing and Speech Recognition
      • Co-expression of ink and speech
        • Is digital ink spoken as it is written?
          • Yes, but how often? How “closely” to the written text?
        • Can speech be used to disambiguate handwriting?
        • Can handwriting be used to disambiguate speech? (incl. deictic references)
      In Time/Accuracy, Wanted Empirical Evidence
    15. Examples
      • Difficult for Speech and Ink Recognition
      • Difficult Written Abbreviations
      • Speech/Ink Used to Disambiguate Ink/Speech
      DigiMon Java 2 Enterprise Edition Eswaran, Gray, Loric, Traiger corn flakes
    16. Experiment
      • Examined instances of isolated word writing
      • Selected word writing episodes at random but uniformly from the various instructors
      • Generated transcripts manually from the audio
      • Checked whether the instructor spoke the exact word written
      • Measured the time between the written and spoken word
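The last two steps of the experiment (checking whether the exact written word was spoken, and measuring the time between writing and speaking) could be sketched as follows. This is a minimal illustration, assuming a manually produced transcript with a timestamp per spoken word; the `SpokenWord` type and `nearest_spoken_gap` function are hypothetical names, and the study itself does not describe its tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpokenWord:
    text: str
    time: float  # seconds from the start of the lecture (assumed available)


def normalize(word: str) -> str:
    """Lower-case a word and drop non-alphanumeric characters."""
    return "".join(ch for ch in word.lower() if ch.isalnum())


def nearest_spoken_gap(written: str, written_time: float,
                       transcript: List[SpokenWord]) -> Optional[float]:
    """Return the signed gap in seconds (spoken minus written) to the nearest
    utterance of the written word, or None if the word was never spoken."""
    target = normalize(written)
    gaps = [w.time - written_time
            for w in transcript if normalize(w.text) == target]
    if not gaps:
        return None
    return min(gaps, key=abs)


transcript = [SpokenWord("the", 10.0), SpokenWord("Java", 12.5),
              SpokenWord("java", 40.0)]
print(nearest_spoken_gap("Java", 14.0, transcript))     # -1.5
print(nearest_spoken_gap("DigiMon", 14.0, transcript))  # None
```

A negative gap means the word was spoken before it was written; `None` means no exact spoken match was found.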
    17. Speech/Text Co-occurrence Results
      Each Row Represents a Different Lecturer
    18. Outline
      • Motivation
      • Handwriting Recognition
      • Joint Writing and Speech Recognition
      • Attentional Mark Identification
      • Activity Inference: Recognizing Corrections
    19. Attentional Mark Identification
      • Attentional Marks are…
      • The first step is to identify a stroke as a mark
      • Tying Attentional Marks to slide content is important
      • Attentional Ink provides a concrete link between speech and slide content!
    20. Example
    21. Method
      • Segmentation
        • Few strokes
        • Close spatial and temporal proximity
      • Mark Recognition
        • Created hand tuned classifiers for: Circles, Lines, Bullets/Ticks
      • Matched with slide content
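The segmentation and mark-recognition steps above could look roughly like the sketch below. The slides do not give the actual hand-tuned rules, so everything here is an assumption made for illustration: the `Stroke` type, the thresholds (`max_pause`, `max_gap`, `max_strokes`, `closed_ratio`, `straight_ratio`), and the use of the endpoint-distance to path-length ratio as the distinguishing feature. Bullets/ticks are not covered.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Stroke:
    points: List[Point]  # pen positions in slide coordinates
    start: float         # pen-down time in seconds
    end: float           # pen-up time in seconds


def bbox(stroke: Stroke):
    xs = [p[0] for p in stroke.points]
    ys = [p[1] for p in stroke.points]
    return min(xs), min(ys), max(xs), max(ys)


def bbox_gap(a: Stroke, b: Stroke) -> float:
    """Distance between two bounding boxes (0 if they overlap)."""
    ax0, ay0, ax1, ay1 = bbox(a)
    bx0, by0, bx1, by1 = bbox(b)
    dx = max(bx0 - ax1, ax0 - bx1, 0.0)
    dy = max(by0 - ay1, ay0 - by1, 0.0)
    return math.hypot(dx, dy)


def segment(strokes, max_pause=1.0, max_gap=30.0):
    """Group strokes that are close in both time and space."""
    groups = []
    for s in sorted(strokes, key=lambda s: s.start):
        if groups:
            prev = groups[-1][-1]
            if s.start - prev.end <= max_pause and bbox_gap(prev, s) <= max_gap:
                groups[-1].append(s)
                continue
        groups.append([s])
    return groups


def is_mark_candidate(group, max_strokes=3) -> bool:
    """Attentional marks use few strokes; longer groups are likely writing."""
    return len(group) <= max_strokes


def classify(stroke: Stroke, closed_ratio=0.2, straight_ratio=0.9) -> str:
    """Label one stroke as 'circle', 'line', or 'other'."""
    pts = stroke.points
    length = sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))
    if length == 0:
        return "other"
    ratio = math.dist(pts[0], pts[-1]) / length  # ~0 if closed, ~1 if straight
    if ratio <= closed_ratio:
        return "circle"
    if ratio >= straight_ratio:
        return "line"
    return "other"


circle = Stroke([(math.cos(2 * math.pi * k / 32), math.sin(2 * math.pi * k / 32))
                 for k in range(33)], 0.0, 0.5)
line = Stroke([(0, 0), (5, 0), (10, 0)], 0.8, 1.2)
far = Stroke([(500, 500), (510, 500)], 10.0, 11.0)
print([len(g) for g in segment([circle, line, far])])  # [2, 1]
print(classify(circle), classify(line))                # circle line
```

Real ink is noisier than these synthetic strokes, so any such thresholds would need tuning against recorded lectures.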
    22. Experiment
      • Identified and Classified Attention Marks by Hand
        • Two different people per slide
        • Identified type of mark as well as slide content mark referred to
      • Identified Attention Marks Automatically
      • Compared Resulting Identification
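One way the comparison between hand-labeled and automatically identified content might be scored is sketched below, using the category names from the accuracy results (Exact, Exact to Punctuation, Close, Non-Match). The slides do not define these categories precisely, so the rules here are assumptions: in particular, "close" is treated as a word-overlap (Jaccard) score of at least `close_threshold`, which is a hypothetical choice.

```python
import string

_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def strip_punct(text: str) -> str:
    """Remove ASCII punctuation and collapse whitespace."""
    return " ".join(text.translate(_PUNCT_TABLE).split())


def compare(hand: str, auto: str, close_threshold: float = 0.5) -> str:
    """Score automatically matched slide text against the hand label."""
    if hand == auto:
        return "exact"
    if strip_punct(hand) == strip_punct(auto):
        return "exact to punctuation"
    h = set(strip_punct(hand).lower().split())
    a = set(strip_punct(auto).lower().split())
    if h and a and len(h & a) / len(h | a) >= close_threshold:
        return "close"
    return "non-match"


print(compare("corn flakes,", "corn flakes"))
# exact to punctuation
print(compare("Java 2 Enterprise Edition", "Java 2 Enterprise"))
# close
```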
    23. Content Matching Issues
      • Hard to determine exactly what content a mark refers to
      Not just a recognition Issue, but also related to HOW people draw
    24. Content Matching Cont.
      • Granularity of content parsing can be an issue
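The granularity issue can be illustrated with a small sketch that matches a mark's bounding box against slide content parsed at two different levels. This is an assumed matching rule (an element counts as matched when at least `min_fraction` of its area is covered by the mark), not the method used in the study, and the box coordinates are made up.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def overlap_area(a: Box, b: Box) -> float:
    """Area of the intersection of two axis-aligned boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


def covered(mark: Box, elements: List[Tuple[str, Box]],
            min_fraction: float = 0.5) -> List[str]:
    """Return the text of every element mostly covered by the mark."""
    hits = []
    for text, box in elements:
        area = (box[2] - box[0]) * (box[3] - box[1])
        if area > 0 and overlap_area(mark, box) / area >= min_fraction:
            hits.append(text)
    return hits


# The same slide line parsed at word level and at line level.
words = [("corn", (0, 0, 40, 10)), ("flakes", (45, 0, 100, 10))]
line = [("corn flakes", (0, 0, 100, 10))]
circle_box = (-2, -2, 42, 12)  # a circle drawn around the first word

print(covered(circle_box, words))  # ['corn']
print(covered(circle_box, line))   # []
```

With word-level parsing the circle resolves to a single word; with line-level parsing the same mark covers too little of the line to match anything under this rule.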
    25. Attentional Ink Recognition Accuracy
                     Exact        Exact to Punctuation    Close       Non-Match    Total
      Circles        70 (66%)     13 (12%)                6 (6%)      17 (16%)     106
      Underlines     207 (61%)    22 (6%)                 44 (13%)    66 (20%)     339
      Bullets        52 (60%)     0 (0%)                  0 (0%)      35 (40%)     87
      Total          329 (62%)    35 (7%)                 50 (9%)     118 (22%)    532
    26. Outline
      • Motivation
      • Handwriting Recognition
      • Joint Writing and Speech Recognition
      • Attentional Mark Identification
      • Activity Inference: Recognizing Corrections
    27. Recognizing Corrections
      • Why?
        • Want to answer the broad question:
          • “Can we recognize patterns of activity by analyzing the ink and speech channels?”
        • Useful for Presenters
          • Occurs frequently (about 1-3 per lecture)
        • But Non-trivial
      Our vision allows false positives
    28. Recognizing Corrections
      • Identified Six Types of Corrections
      Looked through large # of lectures, wide range of marks
    29. Example Results
      No Table Because:
        1. Not a robust experiment
        2. Proof of Concept
    30. Wrap-up
      • We wanted to understand the nature of real data to direct our focus when building tools for automatic analysis
      • Our studies provided the necessary understanding to accomplish this
    31. Wrap-up (Cont.)
      • Specific Results:
        • Basic handwriting recognition is surprisingly good
        • Very strong co-occurrence of written and spoken words
        • We were able to identify attentional marks and the content associated with them
        • Activity Recognition: There are certain high-level activities that we can identify
      ALL OPEN for Refinement
    32. Questions?
      • E-mail
        • [email_address]
        • [email_address]
      • Classroom Presenter Website
        • http://www.cs.washington.edu/education/dl/presenter/
