Intern presentation

Published in: Technology, Education
Transcript of "Intern presentation"

  1. Presentation on Internship Work: Speech and Eye Tracking Enabled Computer Assisted Translation (SEECAT). Copenhagen Business School. By: Himanshu Bansal
  2. BACKGROUND: Michael Carl, Associate Professor, CBS; Srinivas Bangalore, Principal Member, AT&T Research Labs
  3. BACKGROUND: Why SEECAT?
  4. BACKGROUND: We need translation
     • To convey our thoughts to foreign-language speakers
     • To understand foreign-language text and speech
     • As training data for automated systems
     • To prepare high-quality manuscripts of the same text in different languages
  5. BACKGROUND: ProZ, Tomedes, Verbalizeit, gengo, Straker Translations
  6. BACKGROUND: Translog – Manual Translation
  7. BACKGROUND: CASMACAT – Computer Assisted Translation
  8. BACKGROUND: SEECAT as an extension of CASMACAT. The translator reads a source text on a computer screen and speaks the translation aloud in the target language, a process called sight translation. Sight translation is supported by an Automatic Speech Recognition (ASR) system, which transcribes the speech signal into target text, and a Machine Translation (MT) system, which assists the translator with partial translation proposals, predictions and completions on the computer monitor. An eye-tracking device follows the translator's gaze path on the screen, detects where he or she faces translation problems and triggers reactive assistance.
  9. STRUCTURE of INTERNSHIP
     • 21 May - 7 Jun: Lectures and hands-on sessions (CBS)
     • 8 Jun - 28 Jun: Divided into teams, worked at a summer house (Nykobing, Falster)
     • 29 Jun - 21 July: Integration (CBS)
     • Excursions planned for every weekend
  10. GAZE TEAM: Himanshu, Kritika and Rucha
     • Part 1: Word-Fixation Remapping
     • Part 2: Mutual disambiguation between gaze and speech
  11. Word-Fixation Remapping: Word-fixation mapping is useful for cognitive/linguistic research, for usability studies and, most importantly, for making the system interactive.
  12. Word-Fixation Remapping: Issues
     • Identifying fixations in a stream of gaze samples
     • Mapping fixations to words/characters (dealing with variable error)
     • Defining an evaluation scheme for the fixation mapping
  13. Word-Fixation Remapping: Our Approach
  14. Word-Fixation Remapping: Our Approach
  15. Word-Fixation Remapping: Our Approach
  16. Word-Fixation Remapping: Our Approach
  17. Word-Fixation Remapping: Our Approach -> Evaluation
     ● Input:
       – Manually annotated fixation-to-word mapping (Gold Standard)
       – Machine-computed fixation-to-word mapping
     ● Output:
       – The average character/word error
     ● Method:
       – Compute the overlaps in the gaze fixation durations in the manual and machine annotations.
       – For the overlapping fixations, compute the absolute differences in the cursor positions of the two mappings.
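The evaluation method above can be sketched roughly as follows. This is a minimal illustration, not the team's implementation: the record layout `(start_ms, end_ms, cursor_pos)`, the function names, and the choice of a plain unweighted mean over all time-overlapping pairs are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical record layout: (start_ms, end_ms, cursor_pos), where cursor_pos
# is the character (or word) index the fixation was mapped to.
Fixation = Tuple[int, int, int]


def overlap_ms(a: Fixation, b: Fixation) -> int:
    """Length of the time interval shared by two fixations (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def mapping_error(gold: List[Fixation], machine: List[Fixation]) -> float:
    """Mean absolute cursor-position difference over time-overlapping pairs."""
    diffs = [
        abs(g[2] - m[2])
        for g in gold
        for m in machine
        if overlap_ms(g, m) > 0
    ]
    if not diffs:
        raise ValueError("no overlapping fixations to compare")
    return sum(diffs) / len(diffs)


# Made-up annotations: the second machine fixation is mapped 3 positions off.
gold = [(0, 200, 5), (250, 400, 12)]
machine = [(10, 190, 5), (260, 420, 15)]
print(mapping_error(gold, machine))
```

Whether the error is counted in characters or in words depends only on what `cursor_pos` indexes.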
  18. Mutual disambiguation between gaze and speech: Motivation
  19. Mutual disambiguation between gaze and speech: Motivation
  20. Mutual disambiguation between gaze and speech: Motivation
     Ambiguity in gaze
     • Variable error: Midas touch
     • System errors: eye tracker, algorithm, calibration
     Ambiguity in ASR
     • Domain of training data
     • Accent of speaker
     • Morphology of language
     • Speaking style: co-articulation
  21. Mutual disambiguation between gaze and speech: Motivation
     Consider a simple example:
     ● User reads the text “where is the bat”
     ● ASR can help map gaze points to
     ● Gaze can help disambiguate ASR output
     Text on screen: “Where is the bat. There it is, behind the door I can't find it! Where is it? Look properly! It's right there.”
     ASR hypotheses: “Here is the mat”, “Here is the bat”, “where is the bat”, “where is a pat”
     Possible words being gazed: where, it, there, the, is
     Intersection
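The intersection idea in this example can be illustrated with plain set operations. This is only a sketch: the gazed-word set is read off the slide (taking the fused "theis" as "the" and "is", which is an assumption), and the scoring rule of counting shared words is a simplification chosen for the illustration, not the lattice composition used in the actual system.

```python
# ASR hypotheses from the slide, lower-cased for comparison.
asr_hypotheses = [
    "here is the mat",
    "here is the bat",
    "where is the bat",
    "where is a pat",
]

# Words the reader may have been looking at (assumed reading of the slide).
gazed_words = {"where", "it", "there", "the", "is"}


def overlap(hypothesis: str, gazed: set) -> int:
    """Number of distinct hypothesis words that also appear among the gazed words."""
    return len(set(hypothesis.split()) & gazed)


# Pick the hypothesis that shares the most words with the gaze evidence.
best = max(asr_hypotheses, key=lambda h: overlap(h, gazed_words))
print(best)
```

In the other direction, intersecting the gazed words with the words of the chosen hypothesis narrows down which of the gaze candidates were actually read.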
  22. Mutual disambiguation between gaze and speech: Inspiration from literature research. Meyer et al. studied eye movements in an object-naming task and showed that people consistently fixated objects before naming them. Griffin showed that when multiple objects were named in a single utterance, speech about one object was produced while the next object was fixated and lexically processed.
  23. Mutual disambiguation between gaze and speech: Experiments -> Reading Task
     • 5 participants read English text
     • Eye gaze and speech recorded
     • 6 sets of 11 sentences
     • 5 sets in domain and 1 out of domain
  24. Mutual disambiguation between gaze and speech: Experiments -> Translation Task
     • 4 participants translated English text
     • 4 sets of 10 very simple sentences each
     • Target languages: Hindi (Hi), Spanish (Sp), Danish (Da), Italian (It)
     • Eye gaze on source-language words and speech in the target languages recorded
  25. Mutual disambiguation between gaze and speech: Approach. ASR word lattice. Reference sentence: Leaving next day in the morning
  26. Mutual disambiguation between gaze and speech: Approach. Gaze word lattice
  27. Mutual disambiguation between gaze and speech: Approach. Gaze bag of words
  28. Mutual disambiguation between gaze and speech: Approach. Composed word lattice. Reference sentence: Leaving next day in the morning
  29. Mutual disambiguation between gaze and speech: System
     • Performed experiments on Translog
     • Speech hypotheses are provided by the AT&T Watson server
     • Converted these hypotheses to word-lattice format using Python
     • Generated a bag of words from x, y coordinates using our Part 1 algorithm, implemented in C# and Python
     • For translation tasks, the gaze bag of words is transformed into a target-language bag of words using lexicons (one more level of ambiguity)
     • Composed these lattices using OpenFST
  30. Mutual disambiguation between gaze and speech: System -> experiment with algo
     • Weights of gaze words: should they be considered or not?
     • Weights of ASR words: should they be considered or not?
     Then used a Latin square ->
     • Unweighted ASR with Weighted Gaze Bag-of-words (WGUA)
     • Unweighted ASR with Unweighted Gaze Bag-of-words (UGUA)
     • Weighted ASR with Weighted Gaze Bag-of-words (WGWA)
     • Weighted ASR with Unweighted Gaze Bag-of-words (UGWA)
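The four configurations can be illustrated with a simplified stand-in. The sketch below rescores an n-best list instead of composing lattices with OpenFST, so it is not the system described on the slides; the function `rescore`, the additive scoring rule, and every score and weight in the example are made up for illustration.

```python
from typing import Dict, List, Tuple

# Configuration names follow the slide; flags are (use_asr_weights, use_gaze_weights).
CONFIGS = {
    "UGUA": (False, False),
    "WGUA": (False, True),
    "WGWA": (True, True),
    "UGWA": (True, False),
}


def rescore(
    nbest: List[Tuple[str, float]],
    gaze: Dict[str, float],
    use_asr_weights: bool,
    use_gaze_weights: bool,
) -> str:
    """Return the hypothesis with the highest combined score (higher is better)."""

    def score(item: Tuple[str, float]) -> float:
        text, asr_score = item
        # Each hypothesis word found in the gaze bag adds its weight, or 1.0
        # when gaze weights are ignored.
        gaze_score = sum(
            (gaze[w] if use_gaze_weights else 1.0)
            for w in text.split()
            if w in gaze
        )
        return (asr_score if use_asr_weights else 0.0) + gaze_score

    return max(nbest, key=score)[0]


# Hypothetical n-best list (text, ASR score) and gaze bag of words with weights.
nbest = [
    ("living next day in the mourning", 0.5),
    ("leaving next bay in the morning", 0.3),
    ("leaving next day in the morning", 0.2),
]
gaze = {"leaving": 0.9, "next": 0.4, "day": 0.8, "in": 0.1, "the": 0.1, "morning": 0.7}

results = {name: rescore(nbest, gaze, *flags) for name, flags in CONFIGS.items()}
print(results)
```

With these made-up numbers the gaze evidence outweighs the ASR ranking in every configuration; with real scores the two sources would need to be put on a comparable scale first.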
  31. Mutual disambiguation between gaze and speech: Evaluation
     ASR
     • SCLITE was used to get the word accuracies of the n-best hypotheses with respect to the reference sentence.
     Eye Gaze
     • Precision = |Wg ∩ Wr| / |Wg|
     • Recall = |Wg ∩ Wr| / |Wr|
     • F-Measure (harmonic mean of precision and recall)
     • Sentence Recognition Error (SRI; 1 or 0)
     where Wg = unique words in the gazed words and Wr = unique words in the reference sentence
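The gaze metrics defined above translate directly into set operations. A minimal sketch follows; the function name `gaze_scores`, the whitespace tokenisation, and the example words are assumptions for illustration.

```python
from typing import Iterable, Tuple


def gaze_scores(gazed_words: Iterable[str], reference: str) -> Tuple[float, float, float]:
    """Precision, recall and F-measure of the gazed words against a reference sentence."""
    wg = set(gazed_words)          # Wg: unique gazed words
    wr = set(reference.split())    # Wr: unique reference words
    common = wg & wr
    precision = len(common) / len(wg) if wg else 0.0
    recall = len(common) / len(wr) if wr else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Made-up gaze output: three correct words and one spurious word.
p, r, f = gaze_scores(["leaving", "next", "day", "door"], "leaving next day in the morning")
print(p, r, f)
```

Here 3 of the 4 gazed words are in the reference (precision 0.75) and 3 of the 6 reference words were gazed (recall 0.5).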
  32. Mutual disambiguation between gaze and speech: Research Design
     Reading Task
     • Independent variables: domain of test data, weights of ASR, weights of eye gaze
     • Dependent variables: gaze F-measure, gaze SRI, ASR word accuracy
     Translation Task
     • Independent variables: target language, weights of ASR, weights of eye gaze
     • Dependent variables: gaze F-measure, gaze SRI, ASR word accuracy
  33. Mutual disambiguation between gaze and speech: Reading – Paired T-test
     Panel labels on the slide: WGUA, UGWA, UGUA, WGWA

     Gaze metrics (the same values appear in all four panels):
                        In domain      Out of domain     All
       Gaze_Precision   8.63075E-07    no improvement    9.2475E-07
       Gaze_Recall      0.048194288    no improvement    0.048220416
       Gaze_F-Measure   1.68557E-06    no improvement    1.7924E-06

     ASR_WrdAccr, one row per panel in the order the panels appear:
                        In domain      Out of domain     All
       Panel 1          0.040033133    0.86206786        0.067110316
       Panel 2          0.007594247    0.86206786        0.017268861
       Panel 3          0.040033133    0.86206786        0.067110316
       Panel 4          0.040033133    0.363217468       0.099456245
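A paired t-test compares the same items measured under two conditions, here scores before and after integrating gaze and speech. The sketch below computes only the t statistic and degrees of freedom with the standard library; the function name `paired_t` and all numbers are made up for illustration and are not the experiment's data. Converting t to a p-value requires the t distribution, which a statistics package such as SciPy (`scipy.stats.ttest_rel`) provides.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t(before: Sequence[float], after: Sequence[float]) -> Tuple[float, int]:
    """Paired t statistic and degrees of freedom for two matched samples."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two samples of equal length with at least 2 items")
    diffs = [a - b for a, b in zip(after, before)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical per-sentence F-measures before and after integration.
before = [0.50, 0.55, 0.60, 0.52, 0.58]
after = [0.60, 0.61, 0.70, 0.60, 0.66]
t, df = paired_t(before, after)
print(round(t, 2), df)
```

A large positive t with few degrees of freedom, as in this toy data, corresponds to a small p-value, meaning the improvement is unlikely to be due to chance.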
  34. Mutual disambiguation between gaze and speech: Reading – Absolute % improvements
  35. Mutual disambiguation between gaze and speech: Translation – Paired T-test
     Panel labels on the slide: UGUA, WGUA, WGWA, UGWA

     Gaze metrics (the same values appear in all four panels):
                        En-Hi          En-Sp          En-It          En-Da
       Gaze_Precision   9.22715E-10    9.69933E-10    8.11911E-10    0.000916591
       Gaze_Recall      3.15662E-06    0.000781894    0.016622281    1.32874E-10
       Gaze_F-Measure   1.19047E-09    1.42E-10       9.78254E-11    0.000278612

     ASR_Word_Accuracy (three values per panel, in the order the panels appear):
       Panel 1          0.001722134    0.002676057    0.108137263
       Panel 2          0.702333466    0.323474945    0.108137263
       Panel 3          0.003101235    0.011298332    0.209938743
       Panel 4          0.045589916    0.181222117    0.108137263
  36. Mutual disambiguation between gaze and speech: Translation – Absolute % improvements
  37. Mutual disambiguation between gaze and speech: Conclusions, Reading Task
     • Significant improvement in both gaze F-measure and ASR accuracy after integration
     • Gaze recall fell significantly
     • SRI also improved
     • Improvement on the in-domain task was much larger than on the out-of-domain task
     • Of the four experiments, UGWA was observed to be the best
  38. Mutual disambiguation between gaze and speech: Conclusions, Translation Task
     • Significant improvement only in gaze F-measure, for all languages
     • ASR accuracy improved, but not significantly
     • For Hindi and Danish, SRI decreased a lot
     • Again, UGWA was observed to be the best (for 3 languages)
  39. Mutual disambiguation between gaze and speech: Overview flowchart
     • Input from gaze
     • Fixation-word remapping algo
     • Got x, y from already logged files
     • EVALUATION: fixation duration intersection between machine and manual (3 manual and 1 machine)
     • Static text reading experiment
     • Eye gaze data captured (Translog)
     • Audio recorded at sentence level
     • Word lattices
     • Bag of Words
     • EVALUATION: comparison with BoW of reference sentence (precision and recall)
     • 10 best hypotheses
     • Watson server
     • EVALUATION: compared 1st best with reference text (edit distance) - SCLITE
     • Word lattices
     • Eye gaze disambiguation
     • ASR disambiguation
     • With weighted & unweighted ASR lattices
     • Improved BoW
     • Improved hypothesis
     • EVAL
     • EVAL
     • Majority based also
     • Sentence Identfi.
  40. LEARNING
     Academic
     • Worked with Tobii T-60
     • Experiment design
     • Python, LaTeX, Moses, Translog, PuTTY, Cygwin, Audacity, OpenFST
     • Got an idea of MT and ASR
     Personal
     • Communication skills
     • Project management: morning reporting, presentations, weekly targets and checks
     • Kayaking
     • Two-string kite
     • A bit of cooking
  41. Some photos
  42. Thanks. Hoping the monkeys will be friends forever
