Online corpus: Literacy teachers' best friend

Presentation delivered at Dyslexia Guild Summer Conference 2011 in Oxford. (Slideshow updated based on feedback from the session).

Published in: Education, Business, Technology
  • An important dichotomy (one of many) in the study of language
  • Confirms the hypothesis that we speak of dyslexic children more often than adults, and of boys more often than girls; how about dyslexia v. dystopia?
  • Transcript

    • 1. Online Corpus: Literacy Teachers’ Best Friend. Dominik Lukeš, http://dominiklukes.net. Dyslexia Guild Summer Conference 2011. training.dyslexiaaction.org.uk
    • 2. Outline: What is a corpus; Answering questions with a corpus; The language of corpus searches; The corpus and the classroom; Practice. Image: http://www.flickr.com/photos/adactio/3563832656
    • 3. Corpus / Corpora ????
    • 4. Knowledge of language v knowledge about language. Image: http://www.flickr.com/photos/missturner/3029700617/
    • 5. Prescriptivism (… how language should be used) v Descriptivism (… how language is used)
    • 6. “Most of the prescriptive rules of the language mavens make no sense on any level. They are bits of folklore that originated for screwball reasons several hundred years ago… For as long as they have existed, speakers have flouted them…”
    • 7. Grammarian Geoffrey K Pullum on …: “intellectual abdication”; “should be ashamed”; “current around 1900”; “a perversion of grammatical education”; “blind to textual evidence even when he himself exhibits it”; “dishonest and stupid”; “vile little compendium of tripe about style”; “More passives in Orwell's pompous essay with the warning about how you mustn't use them than in any periodical you can lay your hands on!”
    • 8. This usage stuff is not straightforward and easy. If ever someone tells you that the rules of English grammar are simple and logical and you should just learn them and obey them, walk away, because you're getting advice from a fool. http://languagelog.ldc.upenn.edu/nll/?p=2790
    • 9. Corpus: Key modern tool for finding out about how language works…
    • 10. Corpus: … is a large database of representative language samples …
    • 11. Corpus: … 100s of millions of words from (mostly) written language in different genres in small samples (~2000 words) …
    • 12. Corpus: … used for linguistic research, making dictionaries, writing grammars, …
    • 14. Corpora available for teachers: http://corpus.byu.edu
    • 15. Access to COCA and related BYU corpora is free… but free registration is required for more than ~10 queries a day
    • 17. Brown (the grandfather), COCA, BNC, WebCorp, Google
    • 20. Searching a corpus early on in the process of making a generalization can save you a lot of unpleasant surprises later. Image: http://www.flickr.com/photos/atoach/3900591006/
    • 21. How do we use the word dyslexia? We speak more often of dyslexic children than adults. We speak more often of dyslexia than any other dys- word.
    • 22. Concordance. BNC: dyslexic [n*] (http://corpus.byu.edu/bnc); COCA: dyslexic [n*] (http://www.americancorpus.org/)
    • 23. COCA: dys*
    • 24. Suffixing rules: *yed, *ied
    • 25. Suffixing rules. *yed: played, stayed, portrayed, enjoyed, unemployed, surveyed. *ied: died, tried, married, worried, identified, applied.
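The pattern in the *yed and *ied results can be written down as a small rule. The sketch below is only an illustration of the generalization those examples suggest (keep the y after a vowel, change it to i after a consonant); the function name `past_tense_y` is made up for this example, and the rule has well-known exceptions such as pay/paid and say/said, which is exactly what a corpus search helps to uncover.

```python
VOWELS = set("aeiou")


def past_tense_y(verb: str) -> str:
    """Add -ed to a verb ending in -y (hypothetical helper, regular cases only).

    Vowel + y keeps the y (play -> played);
    consonant + y changes y to i (try -> tried).
    """
    if len(verb) < 2 or not verb.endswith("y"):
        raise ValueError("expected a verb ending in -y")
    if verb[-2] in VOWELS:
        return verb + "ed"
    return verb[:-1] + "ied"


if __name__ == "__main__":
    for verb in ["play", "stay", "enjoy", "survey", "try", "marry", "apply"]:
        print(verb, "->", past_tense_y(verb))
```

Irregular forms are deliberately not handled here; checking the output against *yed and *ied searches shows where the rule stops working.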
    • 26. The Corpus Magic: * ? [ ] [n*]. Different corpora use slightly different codes. Read the manual.
    • 27. The Corpus Magic. ?: any one character. *: any number of characters (incl. 0). [ ]: lemma (all inflectional forms of a word). [n*]: part of speech tags (e.g. nouns). Different corpora use slightly different codes. Read the manual.
    • 28. * (any number of characters). *each: each, reach, beach, teach, outreach, …, impeach, …; teach*: teachers, teaching, …, teachable, teacher-librarians, …; t*ch: touch, teach, tech, torch, trench, twitch, …, three-inch, …; teach *: teach the, teach us, teach students, …
    • 29. ? (any one character). ?each: reach, beach, teach, peach, leach, keach, …; each?: each- (1), each# (1) [i.e. nothing]; ?each?: peachy, bleachy, teacha, reachs (2) [i.e. spelling error], …; t?ch: tech, tach, toch, tuch, tsch, tich; t??ch: touch, teach, torch, tisch, …
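To see how the two wildcards behave, here is a minimal sketch that translates a single-word pattern into a Python regular expression and runs it over a small word list. It is not how the BYU interface is implemented; the helper names and the word list are invented for illustration, and multi-word patterns such as `teach *`, lemmas and part of speech tags are not modelled.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a single-word corpus wildcard pattern into a regex.

    '*' stands for any number of characters (including none),
    '?' stands for exactly one character.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\S*")
        elif ch == "?":
            parts.append(r"\S")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def search(pattern: str, words: list[str]) -> list[str]:
    """Return the words that match the whole pattern, in list order."""
    regex = wildcard_to_regex(pattern)
    return [word for word in words if regex.fullmatch(word)]


# Hypothetical mini word list standing in for a real corpus.
WORDS = ["each", "reach", "beach", "teach", "teacher", "teaching",
         "touch", "tech", "torch", "peachy"]

if __name__ == "__main__":
    for pattern in ["*each", "?each", "teach*", "t*ch", "t?ch", "t??ch", "?each?"]:
        print(pattern, "->", search(pattern, WORDS))
```

Note the difference between `*each` and `?each`: the first also matches `each` itself, because `*` may stand for no characters at all, while `?` always needs exactly one.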
    • 30. [Lemma]
    • 31. Part of speech tags: [run].[n*] v [run] [n*]
    • 32. Common tags: [n*] noun; [NN2] plural nouns; [v*] verb; [VVD] verb past tense; [aj*] (BNC) / [j*] (COCA) adjective; [av*] (BNC) / [r*] (COCA) adverb
    • 33. Help
    • 36. You can also: search for idioms (cats and dogs); combine wildcards (?each*s); search for synonyms ([=pretty]); search for alternatives (car|bike|horse); exclude searches (used -car). For more details see:
    • 37. Concordance + KWIC: *ies.[N*]
    • 38. KWIC – Key-Word In Context: *ies.[N*]
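A KWIC display simply lines up every hit with a few words of context on either side. The sketch below shows the idea on a plain string; the function name `kwic`, the tokenising rule and the sample sentence are assumptions made for this example, and real concordancers work on tagged corpora rather than raw text.

```python
import re


def kwic(text: str, keyword: str, width: int = 3) -> list[tuple[str, str, str]]:
    """Return (left context, keyword, right context) for each hit.

    `width` is the number of words of context kept on each side.
    Matching ignores case; punctuation is dropped by the tokeniser.
    """
    tokens = re.findall(r"\w+(?:[-']\w+)*", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows


if __name__ == "__main__":
    sample = "The dyslexic child read the story. A dyslexic adult wrote it."
    for left, key, right in kwic(sample, "dyslexic", width=2):
        # Right-align the left context so the keywords form a column.
        print(f"{left:>15}  {key}  {right}")
```

Reading down the keyword column is what makes patterns visible, for example which nouns tend to follow the search word.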
    • 39. Limit searches by genre
    • 40. Other questions a corpus can answer: Are there more nouns or verbs ending in -ies? *ies.[V*] vs. *ies.[N*]. Are there four-letter verbs ending in -ed in the present tense? ??ed.[VVB]. What are the most common adjectives describing students vs. pupils? [j*] [student] vs. [j*] [pupil]. What do we say teachers do most often? [teacher] [vvb]
    • 41. Corpus, rules, and regularity: pre*, *ed, *ies.[V*]. Image: http://www.flickr.com/photos/51505078@N00/352492687
    • 42. Collocations: Limits on variability. See also Kennedy, p. 80-23
    • 43. Collocations (cont): Limits on variability. See also Kennedy, p. 80-23
    • 44. Collocations (cont): [teacher] must [v*]
    • 45. Idioms and set phrases: 275 results; 359 results
    • 46. Google as a Corpus: "put the search text in quotes"; use * for the search item
    • 48. Google as a Corpus: Pros & Cons. PRO: rare, low-frequency usage; up-to-date usage. CON: no sampling, no frequency sort, no genre limit, no part of speech tags
    • 49. Google results counts are only rough estimates… Different people searching in different geographic locations can get different numbers. Sometimes searching for A gives fewer results than searching for A without B. http://searchengineland.com/why-google-cant-count-results-properly-53559
    • 50. … but Google fights can be fun
    • 51. WebCorp makes Google search results linguist-friendly
    • 52. Avoid Common Corpus Errors. Be aware of limitations: sampling, coverage, size, presence of typos and errors, bad part of speech tagging. Beware of low-frequency results. Beware of homographs. Check that results come from multiple sources. Check KWIC to confirm relevance. Limit search by genre. Image: http://www.flickr.com/photos/andreassolberg/433734311
    • 53. Check examples and sources
    • 54. Always check low-frequency results: must [v*] [n*] … sometimes they come from the same source
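The check on low-frequency results can be made routine. The sketch below assumes the hits have been copied out of the corpus as (phrase, source) pairs, which is a made-up format for this example, and flags any phrase that is either rare or attested in too few distinct sources; the thresholds and the sample data are illustrative, not recommendations.

```python
from collections import defaultdict


def flag_unreliable(hits, min_count=3, min_sources=2):
    """Return phrases with too few hits or too few distinct sources.

    `hits` is an iterable of (phrase, source) pairs, a hypothetical
    export format; both thresholds are arbitrary example values.
    """
    counts = defaultdict(int)
    sources = defaultdict(set)
    for phrase, source in hits:
        counts[phrase] += 1
        sources[phrase].add(source)
    return sorted(
        phrase for phrase in counts
        if counts[phrase] < min_count or len(sources[phrase]) < min_sources
    )


if __name__ == "__main__":
    # Invented sample data: the second phrase occurs three times,
    # but always in the same text.
    sample_hits = [
        ("must have time", "news_A"),
        ("must have time", "fiction_B"),
        ("must have time", "magazine_C"),
        ("must needs go", "novel_X"),
        ("must needs go", "novel_X"),
        ("must needs go", "novel_X"),
    ]
    print(flag_unreliable(sample_hits))
```

A phrase repeated by a single author tells you about that author, not about the language as a whole, which is why the source count matters as much as the raw frequency.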
    • 55. False roots: corner, silly, preface, cockroach, protest, stable … http://etymonline.com
    • 56. Make your own corpus with TextSTAT: http://neon.niederlandistik.fu-berlin.de/en/textstat
    • 57. Make your own corpus with AntConc: http://www.antlab.sci.waseda.ac.jp/software.html
    • 58. Corpus in the classroom: teacher preparation; student discovery
    • 59. Teacher preparation: find relevant, common examples; prepare worksheets; check for exceptions; find out answers to student questions about rules and usage
    • 60. Student discovery: show search results to students to work out rules or word meanings; teach students how to search for questions; ask students to give each other puzzles for searching
    • 61. For heavy classroom use… register for group access to prevent spam lock-out
    • 62. Corpus v dictionary
    • 63. Non-classroom corpus use: supplement dictionary; crossword puzzles; check typical usage when writing
    • 64. Where to go next? http://www.corpora4learning.net
    • 65. Thank you. Contact: http://dominiklukes.net
