
Bridging Informal MOOCs & Formal English for Academic Purposes Programmes with Language Corpora


Presented at the Teaching and Language Corpora (TaLC) Conference in Lancaster on July 23, 2014. Based on collaborative work with the FLAX Language Project (Shaoqun Wu and Ian Witten) and the Language Centre at Queen Mary University of London (Martin Barge, William Tweddle, Saima Sherazi).


  1. Bridging informal MOOCs & formal EAP programmes with language corpora. Alannah Fitzgerald, Shaoqun Wu, Ian Witten, Martin Barge, William Tweddle, Saima Sherazi. https://www.flickr.com/photos/library_of_congress/8725417555/
  2. Today’s TaLC Session...
     • Development of Tools and Language Corpora
       – Design-Based Research with the FLAX Project
     • Openness in Corpus-Based Tools, Resources & Practices
     • New & Old Contexts of Learning, Teaching & Research with Corpus-Based Approaches
       – Bridging Formal & Informal Higher Education with Open Do-It-Yourself ESAP Language Collections
  3. Who are we in this FLAX Research & Development collaboration?
  4. FLAX Language at Waikato University http://flax.nzdl.org (FLAX image reused non-commercially by permission of Jane Galloway)
  5. FLAX Language Project at the Greenstone Digital Library Lab, Waikato University NZ
     • Professor Ian Witten, FLAX Project Lead
     • Dr Shaoqun Wu, FLAX Project Lead Researcher & Developer
  6. Data Mining with Weka MOOC https://www.youtube.com/user/WekaMOOC/videos?sort=p&flow=grid&view=0
  7. Research on Open Corpora with FLAX http://oerresearchhub.org/
  8. OER Research Hypotheses http://oerresearchhub.org/collaborative-research/hypotheses/
  9. Research with Queen Mary U. of London http://language-centre.sllf.qmul.ac.uk/home
  10. Openness across formal & informal higher education
  11. MOOCs You May Know
  12. Openness in Mainstream MOOCs? http://www.michaelbransonsmith.net/blog/2012/12/19/day-of-the-mooc-now-animated/
  13. The End of the University As We Know It
      “The future looks like this: Access to college-level education will be free for everyone; the residential college campus will become largely obsolete; tens of thousands of professors will lose their jobs; the bachelor’s degree will become increasingly irrelevant; and ten years from now Harvard will enroll ten million students.” (Harden, 2013) http://www.the-american-interest.com/article.cfm?piece=1352
  14. The Education Apocalypse: #opened13 Keynote
      “Where in the stories we’re telling about the future of education are we seeing salvation? Why would we locate that in technology and not in humans, for example? Why would we locate that in markets and not in communities? What happens when we embrace a narrative about the end-times — about education crisis and education apocalypse? Who’s poised to take advantage of this crisis narrative? Why would we believe a gospel according to artificial intelligence, or according to Harvard Business School [Christensen’s Disruptive Innovation theory], or according to Techcrunch...?” (Watters, 2013) http://hackeducation.com/2013/11/07/the-education-apocalypse/
  15. Current MOOC Language Issues
      • Mainstream MOOCs (Coursera, edX, Udacity) are predominantly in the English language
        – MOOC learners are not registered as language learners
      • Impact on retention and course completion
      • Crowdsourcing and funding for commercial translations of MOOCs are currently limited
        – Translating lectures alone does not help with assessment requirements in, e.g., English-medium MOOCs
      • Receptive versus productive language needs
      • Mainstream MOOCs do not (in most cases) license content openly as Open Educational Resources (OER)
        – Open licensing with Creative Commons is vital for developing derivative resources to support language learning
        – Building linguistic support into MOOC learning platforms? E.g. a combination of translation and corpus-based tools?
      • Online learning offers a compelling case for corpus-based approaches
  16. Openness in Mainstream EAP??
  17. Be Free to Do Whatever You Want!
      • Open Resources for ESAP Soup Dragons:
        – Building & Sharing Open ESAP Corpora to Promote DIY Corpus-Based Approaches
        – Developing Automated Interactivity into ESAP Corpora
        – Developing ESAP Course Book and Lesson Plan Derivatives
        – Researching and Developing ESAP Corpora & Derivatives
        – Researching and Developing Corpus Tools e.g. Interfaces
      http://en.wikipedia.org/wiki/The_Soup_Dragons
  18. Open-Source Language Tools Development
  19. Google-esque Interface Designs. Designed for the non-expert corpus user, namely: learners, teachers, subject academics, instructional designers and language resource developers.
  20. Introducing the Wikipedia Miner Toolkit (Milne & Witten, 2013)
  21. Building Interactivity into FLAX Language Collections
  22. FLAX Activities Continued
  23. FLAX Across Platforms
      • FLAX Website flax.nzdl.org for hosting open online language collections
      • Building directly onto the Web with OER
      • FLAX multilingual open-source software for downloading onto your PC
      • For offline use
      • Building collections out of sight using All Rights Reserved content
      • FLAX for MOODLE plug-in
      • FLAX for MOOC Platforms?
      • FLAX in conjunction with translation technologies?
  24. Training Videos for FLAX on YouTube http://www.youtube.com/watch?v=fysDzYjbhh0
  25. Domain-specific open language collections building
  26. Collaboration with Subject Specialists
      “In the emerging academic literacies approach involving cooperation between subject specialists and writing teachers, the aim is to help the students develop metacognitive awareness of the roles and functions of writing in that discipline, to enable them to stand back from it and observe how it functions, and then to help them gradually participate in the genres, where genre is understood as a constellation of actions rather than a list of formal features.” (Breeze, 2012)
  27. Earth’s Virology Professor with Coursera MOOCs
      “Natural science might be characterized as a discipline of discovery, identifying and describing entities that had not been previously considered. As a result, natural science employs a large set of highly technical words, like dextrinoid, electrophoresis, and phallotoxins. Most of these words do not have commonplace synonyms, because they refer to entities, characteristics, or concepts that are not normally discussed in everyday conversation.” (Biber, 2006)
  28. Virology Language Collection in FLAX (type of media: number of items in the collection)
      • Podcast audio transcripts (This Week in Virology): 130
      • YouTube video transcripts (2013 virology course at Columbia, also in Coursera): 110
      • Academic blog posts (Virology Blog): 540
      • Open Access research articles (relevant to virology course and divided into paper sections): 40
  29. Streaming Open Lectures/Podcasts
  30. Virology Collocations
  31. Virology Terms and Concept Support
  32. Domain-specific Collocations
      We focus on lexical collocations with noun-based structures because they are the most salient and important patterns in topic-specific text:
      • verb + noun, e.g. detect virus particles
      • noun + noun, e.g. tobacco mosaic virus
      • adjective + noun, e.g. negative strand virus
      • noun + of + noun, e.g. genome of the virus
  33. Lexical Bundles
      “Lexical bundles” are multi-word sequences with distinctive syntactic patterns and discourse functions that are commonly used in academic prose (Biber & Barbieri, 2007; Biber et al., 2003, 2004). Typical patterns in the virology MOOC lectures include:
      • noun phrase + of, e.g. a DNA copy of
      • prepositional phrase + of, e.g. at the end of
      • it + verb/adjective phrase, e.g. it turns out that
      • be + noun/adjective phrase, e.g. is an example of
      • verb phrase + that, e.g. you can see that
  34. ESAP Law Collections in FLAX at QMUL (type of media: number and source of items in the collections)
      • Podcast audio files & transcripts (OpenSpires): 10-15 Lectures (Oxford Law Faculty & the Centre for Socio-Legal Studies)
      • MOOC lecture transcripts & videos (streamed via YouTube & Vimeo): 4 MOOC Collections: Copyright Law (Harvard/edX), English Common Law (Uni. of London/Coursera), Age of Globalization (Texas at Austin/edX), Environmental Law & Politics (OpenYale)
      • Student PhD thesis writing & Pre-sessional for Law ESAP essays: 10-20 EThoS Theses at the British Library; 20+ Essays from QMUL Law Pre-sessional
      • British Law Report Corpus (BLaRC) (Marin, 2012): 8-million word corpus derived from freely available content on the BAILII website
      • Open Access research articles (relevant to QMUL Law Pre- and In-Sessional language courses): 40 Articles (DOAJ - Directory of Open Access Journals)
  35. Working with Full Texts
  36. Collocations Within ESAP Collections
  37. Linking to the FLAX Learning Collocations Collection (BNC, BAWE, Wikipedia)
  38. Good Ol’ Part-Of-Speech Tagging
  39. Wikify Your Collections
  40. Lexical Bundles
  41. FLAX HTML Formatting Tool
  42. Researching resources at the interface of openness for academic English
  43. Key Data Sets Will Consist Of:
      • Online survey data
        – MOOC learners for evaluation of collections
        – Language Teaching professionals on perceptions of OER
      • Offline data for evaluation of collections and course book derivatives of the collections for ESAP
        – Survey and Think-Aloud Protocols to evaluate the FLAX Language System
        – Student texts from Law students (Queen Mary University of London)
      • Interview and focus-group data (f2f and online via Skype)
        – With stakeholders (language teachers, academics, MOOC providers) involved in the development of the academic language collections used in this research
  44. Interfacing Communities http://videolectures.net/ocwc2014_fitzgerald_resources/
  45. FLAX Multilingual Open-Source Software http://videolectures.net/ocwc2014_fitzgerald_multilingual_world/
  46. References
      • Biber, D., Conrad, S., & Cortes, V. (2003). Lexical bundles in speech and writing: An initial taxonomy. In A. Wilson, P. Rayson, & T. McEnery (Eds.), Corpus linguistics by the Lune: A festschrift for Geoffrey Leech (pp. 71–92). Frankfurt/Main: Peter Lang.
      • Biber, D., Conrad, S., & Cortes, V. (2004). If you look at . . .: Lexical bundles in university teaching and textbooks. Applied Linguistics, 25, 371–405.
      • Biber, D. (2006). University language: A corpus-based study of spoken and written registers. Amsterdam: John Benjamins.
      • Biber, D., & Barbieri, F. (2007). Lexical bundles in university spoken and written registers. English for Specific Purposes, 26, 263–286.
      • Breeze, R. (2012). Rethinking academic writing pedagogy for the European university. Amsterdam: Rodopi.
      • Harden, N. (2013). The end of the university as we know it. The American Interest. Retrieved from http://www.the-american-interest.com/article.cfm?piece=1352
      • Milne, D., & Witten, I. H. (2013). An open-source toolkit for mining Wikipedia. Artificial Intelligence, 194, 222–239.
      • Watters, A. (2013). The Education Apocalypse #opened13. Retrieved from http://www.hackeducation.com/2013/11/07/the-education-apocalypse/
  47. Thank You
      • FLAX Language Project http://flax.nzdl.org/
        – Shaoqun Wu: shaoqun@waikato.ac.nz
        – Ian Witten: ihw@cs.waikato.ac.nz
      • OER Research Hub http://oerresearchhub.org/
        – Alannah Fitzgerald: a_fitzg@education.concordia.ca; @AlannahFitz; www.alannahfitzgerald.org; TOETOE Blog; Slideshare: http://www.slideshare.net/AlannahOpenEd/
      • The Language Centre, Queen Mary University of London http://language-centre.sllf.qmul.ac.uk/
        – Martin Barge: m.i.barge@qmul.ac.uk
        – William Tweddle: w.tweddle@qmul.ac.uk
        – Saima Sherazi: s.n.sherazi@qmul.ac.uk
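
Slide 33 describes lexical bundles as multi-word sequences that recur in academic prose, such as "it turns out that" in the virology MOOC lectures. The sketch below shows one simple way such sequences might be pulled from a set of transcripts: count every n-word sequence and keep those that recur across more than one text. It is not the FLAX implementation. The function names, the tokenizer, and the thresholds (`n`, `min_freq`, `min_texts`) are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Hypothetical tokenizer for illustration: keeps internal hyphens and
    apostrophes, drops all other punctuation.
    """
    return re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", text.lower())


def lexical_bundles(texts, n=4, min_freq=2, min_texts=2):
    """Return (bundle, frequency) pairs for recurrent n-word sequences.

    A sequence is kept if it occurs at least `min_freq` times overall and
    appears in at least `min_texts` different texts. Both thresholds are
    illustrative assumptions, not values taken from the slides.
    """
    freq = Counter()    # total occurrences of each n-gram
    spread = Counter()  # number of texts each n-gram appears in
    for text in texts:
        tokens = tokenize(text)
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        freq.update(grams)
        spread.update(set(grams))
    bundles = [
        (" ".join(gram), count)
        for gram, count in freq.items()
        if count >= min_freq and spread[gram] >= min_texts
    ]
    # Most frequent first; ties broken alphabetically for stable output.
    return sorted(bundles, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Invented sample sentences, not drawn from the FLAX collections.
    samples = [
        "It turns out that the genome is small. "
        "You can see that it turns out that way.",
        "At the end of the day it turns out that the virus wins.",
    ]
    for bundle, count in lexical_bundles(samples):
        print(count, bundle)
```

Requiring a sequence to appear in several texts, and not only to be frequent overall, keeps one speaker's repeated phrase from being counted as a bundle. Real corpus work would normally scale the frequency threshold to corpus size.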
