eMargin at #tagginganna workshop, Leicester

Presentation of the eMargin collaborative annotation tool given at the Higher Education Academy #tagginganna workshop at the University of Leicester, 5 July 2012

Published in: Education, Technology

  1. A collaborative textual annotation tool
     Andrew Kehoe & Matt Gee
     Research & Development Unit for English Studies
     emargin.bcu.ac.uk
  2. Background
     • Corpus Linguistics: developing software to build and analyse large text collections: crawling, indexing, annotation, search.
     • Our own large-scale search engine for linguistic study.
     • 10bn words of web text (part-of-speech tagged).
     • Includes collections of news and blogs.
     • Lets users extract examples of words/phrases in context, monitor change across time, etc.
     www.webcorp.org.uk
  3. New Audiences
     • Bringing Corpus Linguistic techniques to new audiences:
       i. School (A-Level) English students
       ii. Literary colleagues (teachers/researchers/critics)
     • A move toward literary texts and Corpus Stylistic approaches
  4. New Corpora
     • Literary collections, including:
       – Novels of Charles Dickens
       – Works of Thomas Carlyle
       – Works of James Joyce
       – Works of Samuel Beckett
       – Poems of Percy Bysshe Shelley
       – Restoration Drama
       – Science Fiction
     • Downloaded and processed whole of Project Gutenberg (23,484 texts; 1.6 billion words)
  5. Colleagues’ Own Examples
     The doctor seemed especially troubled by the fact of the robbery having been unexpected, and attempted in the night-time; as if it were the established custom of gentlemen in the housebreaking way to transact business at noon, and to make an appointment, by post, a day or two previous. (Oliver Twist)
     But there was no hitch in the conversation nevertheless; for one gentleman, who travelled in the perfumery line, exhibited an interesting nick-nack, in the way of a remarkable cake of shaving soap which he had lately met with in Germany; (Martin Chuzzlewit)
  6. 130 instances of ‘in the * [way|line]’
  7. Testing Intuitions
     “Dickens is known for a rich range of writing styles - indignant, ironical, melodramatic, and sentimental, all of which appear in David Copperfield. To set the nostalgic tone for this novel, he also uses certain words like "little" and "old" more than usual, so his language seems especially sentimental.” (Barron’s Book Notes: David Copperfield, 1985, p.32)
  8. Limitations
     • Literary scholars saw benefits of corpus linguistic techniques but concerned about straying too far from the text.
     • Literary language is highly creative/variable.
     • Corpus Linguistic techniques work best with exact repetitions, not so good at finding paraphrases in fully automated way.
     • Difficult to pick up themes/motifs without human input.
  9. “corpus stylistics can make an important contribution to the investigation of the interplay between conventional, idiosyncratic and creative patterns of language use. Corpus stylistics also highlights that intuition and automatic processes should work together” (Mahlberg, 2007:224)
  10. A collaborative textual annotation tool
  11. Literary Study
      How do you study a literary text?
      ‘Close Reading’: detailed study of short text extracts down to individual word level.
  12. An Established Tradition
      • Can be traced back to 11th Century.
      Martin Luther: Lectures on Romans (1515)
      Glossae: student’s notes in the margins
      Image from: Cummings, B. (2002) The Literary Culture of the Reformation (Oxford: OUP).
  13. • (re-)read the text
      • underline important words
      • make notes in margin
      • colour-code
      • draw out themes/motifs

      • Text quickly becomes cluttered with underlining/notes on each re-reading
      • Annotations tied to printed copy of text
      • Difficult to share / combine in class
      • Annotations not archivable / searchable
  14. Increasing emphasis on e-texts but surprising lack of software to support close reading.
      Difficult to annotate
      Difficult to share annotations
      Not fine-grained enough for academic study
  15. Limitations of Traditional Model
      • ‘Book Lovers Fear Dim Future for Notes in the Margins’, New York Times, Feb 20 2011:
        – writing comments alongside passages…is a rich literary pastime, sometimes regarded as a tool of literary archaeology, …but it has an uncertain fate in a digitalized world
  16. Our Solution
      • Web-based collaborative annotation system operating down to word level.
      • Initial prototype late-2007 allowing basic highlighting/commenting.
      • Classroom trials at BCU and Leicester.
  17. Pilot Study
      • Structured feedback collected from 25 Leicester students across 3 modules (2 BA, 1 MA).
        – 96% found word-level commenting useful.
        – 88% found highlighting useful.
        – 92% agreed that “reading others’ comments helped me formulate my own ideas”.
        – 96% found prototype ‘easy’ to use.
      • Pilot study suggested which features were of most use.
      • JISC Learning & Teaching Innovation grant (June 2011–May 2012) to build fully-functioning, open-source system.
  18. Demonstration of Features
      Try it yourself for free at: http://emargin.bcu.ac.uk/
  19. HIGHLIGHT CLICKED
  20. MULTIPLE ANNOTATIONS
  21. MOUSE DRAGGED
  22. COMMENT ENTERED, ANNOTATION SAVED
  23. ANNOTATION ADDED TO TEXT, AVAILABLE TO OTHER USERS
  24. ANNOTATION OPENED
  25. REPLY ENTERED
  26. REPLY SAVED, AVAILABLE TO OTHER USERS
  27. NEW HIGHLIGHT COLOUR CHOSEN
  28. TAG ENTERED, ANNOTATION SAVED
  29. TAG SAVED IN ANNOTATION
  30. TAG CLOUD
  31. LOOK UP HIGHLIGHTED TEXT IN THE OED (FOR EXAMPLE)
  32. Case Study: Student Projects
      • Individual research projects on 3rd year BA Narrative Analysis module at BCU.
      • Making connections between literary and linguistic study by examining narrative theories.
      • Example: April’s study of newspaper narratives
        – 10 articles each from The Sun and The Guardian
        – Analysed using 3 narrative models: Labov (1972), White (1997), Hoey (2001).
        – In eMargin: used a different colour for each model and tags to indicate the different stages of the model.
        – Shows that eMargin can be used individually as well as collaboratively.
  33. Future Plans
      • Separate layers of annotation
      • Retain text layout and formatting
      • Import and Export
  34. Future Plans
      • Integrate linguistic analysis features
        – Corpora
        – Tools
          • Concordancing
          • Wordlists
          • Keywords
          • Collocation
  35. Phase 2: 2012-13
      • JISC Embedding Benefits funds for integration with Virtual Learning Environments (VLEs) using IMS Learning Tools Interoperability specification:
        – Single sign-in for seamless transition from VLE to eMargin
        – Easier group management: import class lists from VLE
        – Compatible with all major VLEs (Moodle, Blackboard Learn, WebCT, etc.)
        – Explore potential of eMargin as an e-assessment tool
  36. Beyond English
      • English Literature in first instance but transferable to any text-based discipline: Law, Social Sciences, Theology, Languages (and potential beyond text…)
      • Trialled at Birmingham School of Acting
      • Collaborative research/editing tool
      • Beyond HE: United World College of SE Asia
      • Working to increase uptake across disciplines
      emargin.bcu.ac.uk
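
The wildcard query on slide 6, ‘in the * [way|line]’, and the concordancing feature listed on slide 34 can be illustrated with a small sketch. This is not the WebCorp or eMargin implementation, which the slides do not describe. It assumes a toy query syntax in which `*` stands for exactly one word and `[a|b]` for a choice between alternatives, and the names `query_to_regex`, `concordance` and `SAMPLE` are hypothetical. The sample text is taken from the two Dickens extracts on slide 5.

```python
import re

# Short extracts from slide 5 (Oliver Twist and Martin Chuzzlewit).
SAMPLE = (
    "The doctor seemed especially troubled by the fact of the robbery having "
    "been unexpected, and attempted in the night-time; as if it were the "
    "established custom of gentlemen in the housebreaking way to transact "
    "business at noon. But there was no hitch in the conversation nevertheless; "
    "for one gentleman, who travelled in the perfumery line, exhibited an "
    "interesting nick-nack, in the way of a remarkable cake of shaving soap."
)


def query_to_regex(query: str) -> re.Pattern:
    """Translate the toy query syntax into a compiled regular expression.

    Assumed syntax (an illustration, not a documented WebCorp feature):
    `*` matches exactly one word, `[a|b]` matches either alternative,
    and anything else is matched literally.
    """
    parts = []
    for token in query.split():
        if token == "*":
            # One word, allowing internal hyphens or apostrophes.
            parts.append(r"\w+(?:[-']\w+)*")
        elif token.startswith("[") and token.endswith("]"):
            options = token[1:-1].split("|")
            parts.append("(?:" + "|".join(re.escape(o) for o in options) + ")")
        else:
            parts.append(re.escape(token))
    # Words must be separated by whitespace only, so a match cannot
    # run across punctuation such as a comma or semicolon.
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


def concordance(text: str, query: str, width: int = 30):
    """Return (left context, match, right context) tuples in KWIC style."""
    pattern = query_to_regex(query)
    flat = " ".join(text.split())  # normalise whitespace and line breaks
    hits = []
    for m in pattern.finditer(flat):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits


if __name__ == "__main__":
    for left, hit, right in concordance(SAMPLE, "in the * [way|line]"):
        print(f"{left:>30} [{hit}] {right}")
```

On this sample the query picks out ‘in the housebreaking way’ and ‘in the perfumery line’ but not ‘in the way of’, because `*` must consume one word of its own. A real corpus tool would work over an index rather than scanning raw text, so this only shows the shape of the query.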
