
Radio Ga Ga: corpus-based resources, you’ve yet to have your finest hour



From the blog TOETOE (ˈtɔɪtɔɪ): Technology for Open English - Toying with Open E-resources

Published in: Education

Articles from TOETOE: Technology for Open English – Toying with Open E-resources (ˈtɔɪtɔɪ)

Radio Ga Ga: corpus-based resources, you’ve yet to have your finest hour

2012-09-30 04:09:14 admin

Radio Ga Ga album cover by Queen via Wikipedia

These past few months I’ve been tuning into a lot of different practitioner events and discussions across a range of educational communities which I feel are of relevance to English language education where uses for corpus-based resources are concerned. There’s something very distinct about the way these different communities are coming together and in the way they are sharing their ideas and outputs. In this post, I will liken their behaviour to different types of radio station broadcast, highlighting differences in communication style and the types of audience (and audience participation) they tend to attract.

I’ve also been re-setting my residential as well as my work stations. No longer at Durham University’s English Language Centre, I’m now London-based and have just set off on a whirlwind adventure for further open educational resources (OER) development and dissemination work with collaborators and stakeholders in a variety of locations around the world. TOETOE is going international and is now being hosted by Oxford University Computing Services (OUCS) in conjunction with the Higher Education Academy (HEA) as part of the UK government-funded OER International programme.

I will also be spreading the word about the newly formed Open Education Special Interest Group (OESIG), the Flexible Language Acquisition (FLAX) open corpus-based language resources project at the University of Waikato, and select research corpora, including the British National Corpus (BNC) and the British Academic Written English (BAWE) corpus, both managed by OUCS, which have been prised open by FLAX and TOETOE for uses in English as a Foreign Language (EFL) – also referred to as English as a Second Language (ESL) in North America – and English for Academic Purposes (EAP).
Stay tuned to this blog in the coming months for more insights into open corpus-based English language resources and their uses in different teaching and learning contexts.

This post is what those in the blogging business refer to as a ‘cornerstone’ post as it includes many insights into the past few months of my teaching fellowship in OER with the Support Centre in Open Educational Resources (SCORE) at the Open University in the UK. Many posts within one, as it were. This post also provides a road map for taking my project work forward while identifying shorter blogging themes for posts that will follow this one. This particular post will also act as the mother-ship TOETOE post from which subsequent satellite posts will be linked.

Please use the menu hyperlinks in the section below to dip in and out of sections of this blog post. I have chosen this more reflective style of writing through blogging so that my growing understandings in this area are more accessible to unanticipated readers who may stumble upon this blog and hopefully make comments to help me refine my work. Two more formal case studies on my TOETOE project to date will be coming out soon via the HEA and the JISC.

What station(s) are you listening to?

BBC Radio has been going since 1927. With audiences in the UK, four stations in particular are firm favourites: youth oriented BBC Radio 1 featuring new and contemporary music; BBC Radio 2 with middle of the road music for the more mature audience; high culture and arts oriented BBC Radio 3; and news and current affairs oriented BBC Radio 4. Of course there are many more stations but these four are very typical of those found around the world. What is more, I’ve selected these four very distinct stations as the basis to build a metaphor around the way four very distinct educational practitioner communities are intersecting with corpus-based language teaching resources.
This metaphor will draw on thought waves from the following:

[1] what’s new and hip in open corpus-based resources and practices;
[2] the greatest hits in ELT materials development and publishing;
[3] research from teaching and language corpora; and
[4] the current talk in EAP: open platforms for defining practice.

RADIO 1 – WHAT’S NEW AND HIP IN OPEN CORPUS-BASED RESOURCES AND PRACTICES

  • Flipped conferencing
  • Focusing on linked resources: which academic vocabulary list?
  • Open eBooks for language learning and teaching
  • MOOC on Open Translation tools and practices
  • Bringing open corpus-based projects to the Open Education community
  • A world declaration for OER
  • Wikimedia – why not?
  • The open approach to corpus resources development

Original, in-house and live, this station brings us what’s new in the world of OER for corpus-based language resources.

Flipped conferencing

Kicking things off in late March with Clare Carr from Durham, we co-presented an OER for EAP corpus-based teacher and learner training cascade project at the
EuroCALL CMC & Teacher Education Annual Workshop in Bologna, Italy. This was very much a flipped conference whereby draft presentation papers were sent to be read in advance by participants and where the focus was on discussion rather than presentation at the physical event. Russell Stannard of Teacher Training Videos (TTV) was the keynote speaker at this conference and I have been developing some training resources for the FLAX open-source corpus collections which will be ready to go live on TTV soon. New collections in FLAX have opened up the BAWE corpus and have linked this to the BNC, a Google-derived n-gram corpus as well as Wikimedia resources, namely Wikipedia and Wiktionary. These collections in FLAX show what’s cutting edge in the developer world of open corpus-based resources for language learning and teaching.

Focusing on linked resources: which academic vocabulary list?

In a later post, I will be looking at Mark Davies’ new work with Academic Vocabulary Lists based on a 110 million-word academic sub-corpus in the Corpus of Contemporary American English (COCA) – moving away from the Academic Word List (AWL) by Coxhead (2000) based on a 3.5 million-word corpus – and his innovative web tools and collections based on the COCA. Once again, Davies’ Word and Phrase project website at Brigham Young University contains a bundle of powerfully linked resources, including a collocational thesaurus which links to other leading research resources such as WordNet, the on-going lexical database project at Princeton.

The open approach to developing non-commercial learning and teaching corpus-based resources in FLAX also shows the commitment to OER at OUCS (including the Oxford Text Archive), where the BAWE and the BNC research corpora are both managed.
Click on the image below to visit the BAWE collections in FLAX.

BAWE case study text from the Life Sciences collection in FLAX with Wikipedia resources

Open eBooks for language learning and teaching

Learning Through Sharing: Open Resources, Open Practices, Open Communication was the theme of the EuroCALL conference, and to follow things up the organisers have released a call for OER in languages for the creation of an open eBook on the same theme. The book will be “a collection of case studies providing practical suggestions for the incorporation of Open Educational Resources (OER) and Practices (OEP), and Open Communication principles to the language classroom and to the initial and continuing development of language teachers.” This open-access eBook, aimed at practitioners in secondary and tertiary education, will be freely available for download. If you’re interested in contributing to this electronic volume, please send in a case study proposal (maximum 500 words) by 15 October 2012 to the co-editors of the publication, Ana Beaven (University of Bologna, Italy), Anna Comas-Quinn (Open University, UK) and Barbara Sawhill (Oberlin College, USA).

MOOC on Open Translation tools and practices

Another learning event which I’ve just picked up from EuroCALL is a pilot Massive Open Online Course in open translation practices being run from the British Open University from 15 October to 7 December 2012 (8 weeks), with the accompanying course website opening on 10 October 2012. Visit the “Get involved” tab on the following site: “Open translation practices rely on crowdsourcing, and are used for translating open resources such as TED talks and Wikipedia articles, and also in global blogging and citizen media projects such as Global Voices.
There are many tools to support Open Translation practices, from Google translation tools to online dictionaries like Wordreference, or translation workflow tools like Transifex.” Some of these tools and practices will be explored in the OT12 MOOC.

Bringing open corpus-based projects to the Open Education community

On the back of the Cambridge 2012 conference, Innovation and Impact – Openly Collaborating to Enhance Education, held in April, I’ve been working on another eBook chapter on open corpus-based resources which will be launched very soon at the Open Education conference in Vancouver. The Cambridge 2012 event was jointly hosted in Cambridge, England by the OpenCourseWare Consortium (OCWC) and SCORE. Terri Edwards from Durham and I presented on EAP student and teacher perceptions of training with open corpus-based resources from three projects: FLAX, the Lextutor and AntConc. These three projects vary in terms of openness and the type of resources they are offering. In future posts I will be looking at their work and the communities that form around their resources in more depth. The following video from the conference has captured our presentation
and the ensuing discussion at this event to a non-specialist audience who are curious to know how open corpus-based resources can help with the open education vision. Embedding these tools and resources into online and distance education to support the growing number of learners worldwide who wish to access higher education, where the OER and most published research are in English, opens a whole new world of possibilities for open corpus-based resources and EAP practitioners working in this area.

A further video from a panel discussion which I contributed to – an OER kaleidoscope for languages – looks at three further open language resources projects that are currently underway and building momentum here in the UK: OpenLives, LORO and the CommunityCafe. Reference to other established OER projects for languages and the humanities, including LanguageBox and the HumBox, is also made in this talk.

A world declaration for OER

The World OER congress in June at the UNESCO headquarters in Paris marked ten years since the coining of the term OER in 2002 along with the formal adoption of an OER declaration (click on the image to see the declaration). I’ve included the following quotation from the OER declaration to provide a backdrop to this growing open education movement as it applies to language teaching and learning, highlighting that attribution for original work is commonplace with Creative Commons licensing.

Emphasizing that the term Open Educational Resources (OER) was coined at UNESCO’s 2002 Forum on Open Course Ware and designates “teaching, learning and research materials in any medium, digital or otherwise, that reside in the public domain or have been released under an open license that permits no-cost access, use, adaptation and redistribution by others with no or limited restrictions.
Open licensing is built within the existing framework of intellectual property rights as defined by relevant international conventions and respects the authorship of the work”.

Wikimedia – why not?

Earlier in September, I volunteered to present at the EduWiki conference in Leicester which was hosted by the Wikimedia UK chapter. Most people are familiar with Wikipedia, which is the sixth most visited website in the world. It is but one of many sister projects managed by the Wikimedia Foundation, however, along with others such as Wikiversity and Wiktionary.

I will also be blogging soon about widely held misconceptions about uses of Wikipedia in EAP and EFL / ESL while exploring its potential in writing instruction with reference to some very exciting education projects using Wikipedia around the world. The types of texts that make up Wikipedia, alongside many academics’ realisations that they need to be reaching wider audiences with their work through more accessible modes of writing, are all issues I will be commenting on in this blog in the very near future.

In presenting the work the FLAX team have done with text mining, incorporating David Milne’s Wikipedia mining tool, I showed the potential of Wikipedia as an open corpus resource in language learning and teaching. I demonstrated how this Wikipedia corpus has been linked to other research corpora in FLAX, namely the BNC and the BAWE, for the development of corpus-based OER for EFL / ESL and EAP. And, let’s not forget that it’s all for free!

The open approach to corpus resources development

There is no reason why the open approach taken by FLAX cannot be extended to build open corpus-based collections for learning and teaching other modern languages, linking different language versions of Wikipedia to relevant research corpora and resources in the target language.
In particular, the functionality in the FLAX collections that enables you to compare how language is used differently across a range of corpora, further supported by additional resources such as Wiktionary and Roget’s Thesaurus, makes for a very powerful language resource.

Crowd-sourcing corpus resources through open research and education practices and through the development of open infrastructure for managing and making these resources available is not as far off in the future as we might think. The Common Language Resources and Technology Infrastructure (CLARIN) mission in Europe is a leading success story in the direction currently being taken with corpus-based resources (read more about the recent workshop for CLARIN-D held in Leipzig, Germany).
RADIO 2 – THE GREATEST HITS IN ELT MATERIALS DEVELOPMENT AND PUBLISHING

  • Crosstalk in ELT materials development and publishing
  • The broken record in ELT publishing
  • Open Textbooks
  • A deficit in corpus-based resources training
  • Gangnam style corpus-based resources development
  • PublishOER
  • A matter of scale in open and distance education
  • Thinking beyond classroom-based practice

In a previous post, I left off with reflections from the 2012 IATEFL conference and exhibition in Glasgow. As I wandered through the exhibition hall crammed with vendor-driven English language resources for sale from the usual suspects (big brand publishers), the analogy of the greatest hits came to mind with respect to EFL / ESL and EAP materials development and publishing. But at this same IATEFL event there was also a lot of co-channel interference feeding in from the world of self-publishing, reflecting how open digital scholarship has become mainstream practice in Teaching English as a Foreign Language (TEFL), also known as Teaching English as a Second Language (TESL) in North America. The launch of the round initiative at IATEFL, bridging the gap between ELT blogging and book-making, where the emphasis is on teachers as publishers, is but one example.

Crosstalk in ELT materials development and publishing

Let’s take a closer look at the crosstalk happening within the world of ELT materials development and publishing, where messages are being transmitted simultaneously from Radio 1 and Radio 2 type stations. Across the wider ELT world, TEFL / TESL has embraced Web 2.0 far more readily than EAP (but there are interesting signs of open online life emerging from some EAP practitioners, which I will highlight in the last section of this blog).

Within TEFL, we can observe more in the way of collaboration between open and proprietary publishing practices.
English360, also present at IATEFL 2012, combines proprietary content from Cambridge University Press with teachers’ lesson plans, along with tools for creating custom-made pay-for online English language courses. Across the ELT resources landscape open resources and practices proliferate, including: free ELT magazines and journals; blogs and commentary-led discussions; micro-blogging via Twitter feeds and tweetchat sessions; instructional and training videos via YouTube and iTunesU (both proprietary channels that hold a lot of OER); and online communities with lesson plan resource banks. These and many more open educational practices (OEP) are the norm in TEFL / TESL. And, let’s not forget Russell Stannard’s Teacher Training Videos website of free resources for navigating web-based language tools and projects, drawing on his service as the Web Watcher at English Teaching Professional for well over a decade now.

The broken record in ELT publishing

Broken record of “I believe in miracles” by Ian Crowther via Flickr

Yet, both the TEFL / TESL and EAP markets are still well and truly saturated with the glossy print-based textbook format, stretching to the CD-ROM and mostly password-protected online resource formats. The greatest hits get played over and over again and the needle continues to get stuck in many places.

Exactly why does the closed textbook format concern me so much? It’s an issue of granularity or size really, which leads to further issues with flexibility, specificity and currency. As we all know, there are only so many target language samples and task types that you can pack into a print-based textbook. Beyond the trendy conversation-based topics, what are sometimes useful and transferable are the approaches that make up the pedagogy contained therein. Unlocking these approaches and linking to wider and more relevant and authentic language resources is key. We can see this approach to linked resources development taken by the web-based FLAX and WordandPhrase corpus-based projects.
Publishers are aware of the limitations of the textbook format but they’re also trying to reach a large consumer base to boost their sales so it remains in their best interests to keep resources generic. Think of all the academic English writing books out there, many of which claim to be based on the current research for meeting your teaching and learning needs for academic English writing across the disciplines, but turn out to be more of the same topic-based how-to skills books working within the same essayist writing tradition.

Open textbooks

The open textbook movement brings a new type of textbook to the world of education. One that can be produced at a fraction of the cost and one that can be tailored, linked to external resources, changed and updated whenever the pedagogical needs arise.

The argument in favour of textbooks in ELT has always been one for providing structure to the teaching and learning sequence of a particular syllabus or course. Locked-down proprietary textbook, CD-ROM and online resource formats are not only expensive but they are inflexible. And, these force teachers into problematic practices. Despite trying to point out the perils of plagiarism to our students, as language teachers we are supplementing textbooks with texts, images and audio-visual material from wherever we can beg, borrow and steal them. Of course we do this for principled pedagogical reasons and if we don’t plan on sharing these teaching materials beyond classroom and password-protected VLE walls we’re probably OK, right?

I’ve seen many a lesson handout or in-house course pack for language teaching that includes many third party texts and images which are duly referenced. Whether the teacher/materials developer puts the small ‘c’ in the circle or not, marking this handout or course pack as copyrighted, the default license is one of copyright to the institution where that practitioner works. And, this is where the problem lies.
The handout or course pack is potentially in breach of the copyright of any third party materials used therein, unless the teacher/materials developer has gained clearance from the copyright holders or unless those third party materials are openly licensed as OER for re-mixing. Good practice with materials development and licensing will ensure that valuable resources created by teachers can be legitimately shared across learning and teaching communities. You can do this through open publishing technologies and/or in collaboration with publishers.
A deficit in corpus-based resources training

Good corpus-derived textbooks from leading publishing houses do exist. Finally, the teaching of spoken grammar gets the nod with The Handbook of Spoken Grammar textbook by Delta Publishing. But, and this is a big but, do these textbooks go far enough to address the current deficit in teacher and learner training with corpus-based tools and resources? I expect the publishers would direct this question to the academic monographs, of which there are a fair few, on Data Driven Learning (DDL) and corpus linguistics. I have some on my bookshelf and there are many more in the library where I am a student/fellow, all cross-referenced to academic journal articles from research into corpus linguistics and DDL which I will be talking about more in the third section of this blog. But exactly how accessible are these resources – in terms of their cost, the academic language they are packaged in, the closed proprietary formats they are published in, and in relation to much of the subscription-only corpora and concordancing software their research is based on? It’s no wonder that training in corpus tools and resources is not part of mainstream English language teacher training. Of course, there are open exceptions that provide new models in corpus-based resources development and publishing practices and this is very much what the TOETOE project is trying to share with language education communities.

Corpus linguists are well aware that corpus-based resources and tools in language teaching and materials development haven’t taken off as a popular sport in mainstream language teaching and teacher training. This does run counter to the findings from the research, however, where the argument is that DDL has reached a level of maturity (Nesi & Gardner, 2011; Reppen, 2010; O’Keeffe, 2007; Biber, 2006). Similarly, many leading researchers (too many to cite!)
in language and teaching corpora have been baffled by the chasm between the research into DDL and the majority of mainstream ELT materials that appear on the market that continue to ignore the evidence about actual language usage from corpus-based research studies. Once again, this comes back to the issue of specific versus generic language materials and the issues raised around limitations with developing restricted resource formats.

Gangnam style corpus-based resources development

Gangnam Style by PSY 싸이 강남스타일 via Flickr

So what’s it going to take for corpus-based resources to take off Gangnam style in mainstream language teaching and teacher training? And, how are we going to make these resources cooler and more accessible so as to stop language teaching practitioners from giving them a bad rap? More and more corpus-based tools and resources are being built with or re-purposed with open source technologies and platforms. We are now presented with more and more web-based channels for the dissemination of educational resources, offering the potential for massification and exciting new possibilities for achieving what has always eluded the language education and language corpora research community, namely the wide-scale adoption of corpus-based resources in language education.

I’ve actually been asked to take the word ‘corpus’ out of a workshop title by a conference organiser so as to attract more participants. If you’re interested in expressing your own experiences with using corpora in language teaching and would like to make suggestions for where you think data-driven learning should be heading you can complete Chris Tribble’s on-going online survey on DDL here.

Radio, what’s new? Someone still loves you (corpus-based resources)…

PublishOER

Publishers constantly need ideas for and examples of good educational resources. No great surprises there. I would like to propose that OER and OEP are a great way to get noticed by publishers and to start working with them.
At the steering committee meeting with the JISC-funded PublishOER project members at Newcastle University in the UK in early September, we also had representatives from Elsevier, RightsCom, the Royal Veterinary College (check out their exciting WikiVet OER project) and JISC Collections at the table. Elsevier, who have borne the brunt of a lot of the backlash in academic publishing from the Open Access movement, are trying to open up to the fast changing landscape of open practices in publishing. PublishOER are creating a new mechanism, a permissions request system, for allowing teachers and academics to use copyrighted resources in OER. These OER will include links and recommendations leading back to the publishers’ copyrighted resources as a mechanism for promoting them. Publishers are also interested in using OER developed by teachers and academics that are well designed and well received by students. Re-mixable OER offer great business opportunities for publishers as well as great dissemination opportunities for DDL researchers and practitioners, enabling effective corpus-based ELT resources to reach broader audiences.

Sustainability is an important issue with any project, resource, event or community. How many times have we seen school textbook sets stay unused on shelves, or heard of government-funded project resources that go unused perhaps due to a lack of discoverability? To build new and useful resources online does not necessarily mean that teachers and learners will come in droves to find and use these resources even if they are for free. David Deubelbeiss of EFL Classroom 2.0 is currently exploring new business models for sharing and selling ELT resources. One example is the sale of lesson plans in a can which were once free and now sell for $19.95, a “once and forever payment”.
Some teachers can even make it rich as is reported in this Businessweek article about a kindergarten teacher who sold her popular lesson plans through the TeachersPayTeachers initiative.

Transaction costs in materials development don’t only include the cost of the tools and resources that enable materials development, they also include the cost in terms of time spent on developing resources and marketing them. Open education also points to the unnecessary cost in duplicating the same educational resources over and over again because they haven’t been designed and licensed openly for sharing and re-mixing. Putting your resources in the right places, in more than one, and working with those that understand new markets, new technologies and new business models, including open education practitioners and publishers, are all ways forward to ensure a return on investment with materials development. Hopefully, by providing new frequencies for practitioners to tune into for how to create resources from both open and proprietary resources, a new mixed economy (as the PublishOER crowd like to refer to it) will be realised.

A matter of scale in open and distance education

Let’s not forget those working in ELT around the world, many of whom are volunteers, who along with their students simply cannot afford the cost of proprietary and subscription-only educational resources, let alone the investment and
infrastructure for physical classrooms and schools. Issues around technology and ELT resources and practices in developing countries did surface at IATEFL 2012 but awareness around the more pressing issues may not be finding ways to effectively filter through to well-resourced ELT practitioners and the institutions that employ them. ELT is still fixated on classroom-based teaching resources and practices.

The Hornby Educational Trust, in collaboration with the British Council, which is a registered charity, has been offering scholarships to English language teachers working in under-resourced communities since 1970. I attended a session given by the Hornby scholars at IATEFL 2012 and although I was impressed by the enthusiasm and range of expertise of those who had been selected for scholarships, reporting on ELT interventions they had devised in their local contexts, I couldn’t help but wonder about the scale of the challenges we currently face in education globally. How are we going to provide education opportunities for the additional 100 million learners currently seeking access to the formal post-secondary sector (UNESCO, 2008)? In Sub-Saharan Africa, more than half of all children will not have the privilege of a senior high school education (Ibid). What open and distance education teaches us is that there are just not enough teachers/educators out there. Nor will the conventional industrial model of educational delivery be able to meet this demand.

As DDL researchers and resource developers who are looking for ways to make our research and practice more widely adopted in language teaching and learning globally, wouldn’t we also want to be thinking about where the real educational needs are and how we might be reaching under-resourced communities with open corpus-based educational resources for uses in EFL / ESL and EAP among other target languages?
First of all, we would need to devote more attention to unpacking corpus-based resources so that they are more accessible to the non-expert user, and we would need to find more ways of making these resources more discoverable.

In interviews released as OER on YouTube by DigitaLang with leading TEFLers at IATEFL 2012, I was able to catch up on opinions around the use of technology in ELT. Nik Peachey corrected the widely held misconception about the digital divide for uses of technology in developing countries, pointing to the adoption of mobile and distance education rather than the importation of costly print-based published materials with first-world content and concerns that are often inappropriate for developing world contexts. You can view his interview here:

Thinking beyond classroom-based practice

Scott Thornbury, writer of the A-Z of ELT blog – another influential and popular discussion site for the classic hits in ELT for those who are both new and old to the field – also praised the Hornby scholars and gave his views on technology in ELT in a further IATEFL 2012 DigitaLang interview. He talks about the ‘human factor’ as something that occurs in classroom-based language teaching. In order to nurture this human factor, he recommends that technology be kept for uses outside the classroom or at best for uses in online teacher education. Open and distance education practitioners and researchers would also agree that well-resourced face-to-face instruction yields high educational returns as in the case of the Hornby scholarships, but they would also argue that this is not a scalable business model for meeting the needs of the many who still lack access to formal post-secondary education.
What is more, the human factor as evidenced in online collaborative learning is well documented in the research from open and distance education as it is from traditional technology-enhanced classroom-based teaching.

For a view into how open and distance education practitioners and researchers are trying to scale these learning and accreditation opportunities for the developing world, the following open discussion thread from Wayne Mackintosh on MOOCs for developing countries – discussion from the OERuniversity Google Groups – provides an entry point:

“Access to reliable and affordable internet connectivity poses unique challenges in the developing world. That said, I believe it possible to design open courses which use a mix of conventional print-based materials for “high-bandwidth” data and mobile telephony for “low-bandwidth” peer-to-peer interactions. So for example, the OERu delivery model will be able to produce print-based study materials and it would be possible to automatically generate CD-ROM images of the rich media (videos / audio) contained in the course for offline viewing. We already have the capability to generate collections of OERu course materials authored in WikiEducator to produce print-based equivalents which could be reproduced and distributed locally. The printed document provides footnotes for all the web-links in the materials which OERu learners could investigate when visiting an Internet access point. OERu courses integrate microblogging for peer-to-peer interactions and we produce a timeline of all contributions via discussion forums, blogs etc.
The bandwidth requirements for these kind of interactions are relatively low which address to some extent the cost of connectivity.”

RADIO 3 – RESEARCH IN TEACHING AND LANGUAGE CORPORA
Bridging Teaching and Language Corpora (TaLC)
Prising open corpus linguistics research in Data Driven Learning (DDL)
DIY corpora with AntConc in English for Specific Academic Purposes (ESAP)
Beyond books and podcasts through linking and crowd-sourcing

I confess that I spend most of my time listening to BBC Radio 3. The parallel I will draw here is that I was never formally educated in classical music, just as I have never worked toward formal qualifications in corpus linguistics during any of my studies. Because I am working broadly across the areas of language resources development and enhancing teaching and learning practices through technology, it was only a matter of time, however, before I started exploring and toying with corpus-based resources. I met Dr. Shaoqun Wu of the FLAX project while at a conference in Villach, Austria in 2006 and by 2007 I had begun to delve into the world of open-source digital library collections development with the University of Waikato’s Greenstone software, developed and distributed in cooperation with UNESCO, for realising the much broader vision of reaching under-resourced communities around the world with these open technologies and collections.

Bridging Teaching and Language Corpora (TaLC)

Let’s fast forward to the 2012 Teaching and Language Corpora Conference in Warsaw, Poland. Although I have participated in corpus linguistics conferences before, this was my first time attending the biennial TaLC conference. TaLCers are very much researchers working in the area of corpus linguistics and DDL, and this conference was themed around bridging the gap between DDL research and uses for corpus-based resources and practices in language teaching and learning.

One of the keynote addresses from Mike Thomas, Let’s Marry, called for greater connectedness in pursuing relationships between those working in DDL research and those working in pedagogy and language acquisition. At one point he asked for a show of hands from those in the audience who knew of big names in the ELT world, including Scrivener, Harmer and Thornbury. Only a few raised their hands. He also made the point that these same ELT names don’t make their way into citations for research on DDL. Interestingly, I was tweeting points made in the sessions I attended to relevant EAP and ELT / EFL / ESL communities online without a TaLC conference hashtag. It would’ve been great to have the other TaLCers tweeting along with me, raising questions and noting key take-away points from the conference to engage interested parties who could not make the conference in person, and to catalogue a Twitter feed for TaLC that could be searched by anyone via the Internet at a later point in time. It would’ve also been great to record keynote and presentation speakers as webcasts for later viewing.
When approached about these issues later, however, the conference organisers did express interest in ways of amplifying their events by building such mechanisms for openness into their next conference.

Prising open corpus linguistics research in Data Driven Learning (DDL)

Problems with accessing and successfully implementing corpus-based resources into language teaching and learning scenarios have been numerous. As I discussed in section 2 of this blog, many of the concordancing tools referred to in the research have been subscription-based proprietary resources (for example, the Wordsmith Tools), most of which have been designed with at least the intermediate-level concordance user in mind. These tools can easily overwhelm language teaching practitioners and their students with the complex processing of raw corpus data that are presented via complex interfaces with too many options for refinement. Mike Scott, the main developer of the Wordsmith Tools, has also released a free version of his concordancing suite with less functionality, and this would suffice for many language teaching and learning purposes. He attended my presentation on opening up research corpora with open-source text analysis tools and OER and was very open-minded, as were the other TaLCers whom I met at the conference, regarding new and open approaches for engaging teachers and learners with corpus-based resources.

There are many freely available annotated bibliographies compiled by corpus linguists which you can access on the web for guidance on published research into corpus linguistics. Many researchers working in this area are also putting pre-print versions of their research publications on the web for greater access and dissemination of their work; see Alex Boulton’s online presence for an example of this.
As also hinted at earlier in part 2 of this blog, however, much of this published research takes closed formats, in the form of articles, chapters and the few teaching resources available that are often restricted to and embedded within subscription-only journals or pricey academic monographs. For example, Berglund-Prytz’s ‘Text Analysis by Computer: Using Free Online Resources to Explore Academic Writing’ in 2009 is a great written resource for where to get started with OER for EAP, but ironically the journal it is published in, Writing and Pedagogy, is not free. Lancaster University is home to the openly available BNCweb concordancing software, which you need only register for in order to install a free standard copy on your personal computer. A valuable companion resource on BNCweb was published by Peter Lang in 2008, but once again this is not openly accessible to interested readers who cannot afford to buy the book. The great news is that the main TaLC10 organiser, Agnieszka Lenko, has spearheaded openness with this most recent event by trying to secure an Open Access publication for the TaLC10 proceedings papers with Versita publishers in London.

DIY corpora with AntConc in English for Specific Academic Purposes (ESAP)

At TaLC10 I discovered a lot of overlap with Maggie Charles’ work on building DIY corpora with EAP postgraduate students using the AntConc freeware by Laurence Anthony. We had also included workshops on AntConc for students in our OER for EAP cascade at Durham, so it was great to see another EAP practitioner working in this way who had gathered data from her on-going work in this area for presentation and discussion at the conference. Many of her students at the University of Oxford Language Centre are working toward dissertation or thesis writing, which raises interesting questions around enabling EAP students to become proficient in developing self-study resources for English for Specific Academic Purposes (ESAP).
Her recent paper in the English for Specific Purposes Journal (2012) points to AntConc’s flexibility for student use due to it being freeware that can be installed on any personal computer or flash-drive key for portable use. Laurence Anthony’s website also offers a lot of great video training resources for how to use AntConc. The potential that AntConc offers for building select corpora to those students currently pursuing inter-disciplinary studies in higher education is also noted by Charles. Having said this, certain more obscure subject disciplines, for example Egyptology (Ibid.), had not yet embraced digital research cultures and were still publishing research in predominantly print-based volumes or image-based .pdf files, which put the development of DIY corpora beyond the reach of those few students.

Beyond books and podcasts through linking and crowd-sourcing

TOETOE: English for Academic Purposes (EAP) with OER from Alannah Fitzgerald
While presenting on the power of linked resources within the FLAX collections and pushing these outward to wider stakeholder communities through TOETOE, I came across another rapid innovation JISC-funded OER project at the Beyond Books conference at Oxford. The Spindle project, also based at OUCS, has been exploring linguistic uses for Oxford podcasts with work based on open-source automatic transcription tools. Automatic transcription is often accompanied by a high rate of inaccuracy. Spindle has been looking at ways of developing crowd-sourcing web interfaces that would enable English language learners to listen to the podcasts and correct the automatic transcription errors as part of a language learning crowd-sourcing task.

Automatic keyword generation was also carried out in the SPINDLE project on OpenSpires project podcasts, yielding far more accurate results. These keyword lists, which can be assigned as metadata tags in digital repositories and channels like iTunesU, offer further resource enhancement for making the podcasts more discoverable. Automatically generated keyword lists such as these can also be used for pedagogical purposes with the pre-teaching of vocabulary, for example.
The TED500 corpus by Guy Aston, which I also came across at TaLC10, is based on the TED talks (ideas worth spreading) which have also been released under creative commons licences and transcribed through crowd-sourcing.

The potential for open linguistic content to be reused, re-purposed and redistributed by third parties globally, provided that they are used in non-commercial ways and are attributed to their creators, offers new and exciting opportunities for corpus developers as well as educational practitioners interested in OER for language learning and teaching.

RADIO 4 – THE CURRENT TALK IN EAP: OPEN PLATFORMS FOR DEFINING PRACTICE
Toward open practices in EAP
English for Specific Academic Purposes with data driven learning resources
A parallel universe in EAP materials development
In-house EAP materials development

A lot of talk around defining current and trending practices in EAP can be tuned into via open as well as proprietary channels. In this section, I will refer to new-found open practices in EAP which are embracing Web 2.0 technologies amidst a backdrop of closed practices in EAP academic publishing and within subscription-only EAP memberships. I will open up discussion around these different practices within EAP to sketch out common ground for where EAP could be heading with respect to global outreach.

Toward open practices in EAP

Recent months have evidenced a steady opening up of practices for sharing expertise and resources in EAP. The new EAP teaching blog based at Nottingham University, a discussion-based side-shoot to their new Masters programme in EAP teaching, makes use of the most widely used open-source blogging software, WordPress. Thanks to our friends in Canada, EAP tweetchat sessions are run on Twitter with the hashtag #EAPchat every first and third Monday of the month, bringing together EAP practitioners who wish to participate in global EAP discussions as well as suggest topics for upcoming tweetchat sessions.
An archived transcript page is available at the end of each EAPchat Twitter session.

Free webinars from Oxford University Press (OUP), the largest academic publishing house in the world, are also broadcasting talk on EAP to the world. Julie Moore, who has collaborated on the new Oxford EAP book series, has also contributed free webinars with OUP attended by EAP practitioners from around the world. A review of one of Julie’s webinars on academic grammar can be found on the OUP-sponsored ELT global blog. Wouldn’t it be great if more EAP practitioners opened up their practice in this way to suggest areas of expertise in EAP that they would like to contribute and broadcast via webinars with OUP’s considerable market outreach?

The EAP community in the UK mainly gathers around BALEAP with their Professional Issues Meetings, accreditation scheme, biennial conference and lively email discussion list. There is a noticeable push-pull between open and closed EAP practices within BALEAP which I would like to bring into the open for discussion. Openness was built into the Durham PIM on the EAP Practitioner in June of this year to make this the first BALEAP event to have a Twitter hashtag, thanks to forward thinking from Steve Kirk. Since this PIM he has also been curating a useful EAP practitioner resources site with!

There does seem to be a willingness on the part of BALEAP members to experiment with new technologies so that their discussions around issues on EAP are openly available. However, the BALEAP email discussion list which I mentioned above is the only one of half a dozen similarly JISC-hosted email discussion lists that I belong to which is closed off by the BALEAP membership subscription pay-wall. The others which I subscribe to for free are all open, and discussion transcripts from their contributing members can be searched via the web through the JISC email archives.
Keeping the email discussion list closed has been a BALEAP executive committee decision, and I question whether this decision best reflects the current drive toward openness among BALEAP members who are interested in sharing their insights and expertise with those around the world for whom BALEAP membership is not an affordable option.

BALEAP recently added the strap-line the global forum for EAP practitioners to its website. Formerly the British Association of Lecturers in EAP (hence the continuity from the acronym to the name BALEAP), some of their event and research outputs can be found on their website but others can only be accessed via the subscription-only Journal of English for Academic Purposes (JEAP). And you can probably guess where I’m going here with concerns around openness or lack thereof with respect to being the global EAP practitioner forum…

Nonetheless, an invaluable EAP resource that BALEAP have put out onto the wild web is the EAP teacher competency framework. An EAP practitioner portfolio mentoring programme is currently in the pilot stages and there is talk of matching EAP teaching competencies in BALEAP with the UK Professional Standards Framework (UKPSF) at the HEA, but once again, for those non-UK and freelance EAP practitioners who do not work for UK higher education institutions that subscribe to the HEA, such an alignment of frameworks may not be suitable or relevant. That said, the essence of the UKPSF is useful and perhaps with the current OER International programme at the HEA we can see ownership of the UKPSF go international?
HEA accreditation as a UK body will remain a reality, however, so it will be interesting to see what the HEAL working party at BALEAP who are collaborating with the HEA will come up with in response to shaping the identity of BALEAP who aspire to be known as the global forum for EAP practitioners.

Having recently formed a Web Resources Sub Committee (WRSC) with other technologically and OER-oriented EAPers at BALEAP, we may yet see things open up. Below is the presentation Ylva Berglund Prytz and I (both on the WRSC at BALEAP) gave on Openness in English for Specific Academic Purposes (ESAP) at the PIM in Sheffield in November, 2011.
Openness in English for Specific Academic Purposes from Alannah Fitzgerald

Elsevier are the publishers of JEAP, and from experience open access in academic publishing has come about through the pressure tactics of certain academic communities of practice lobbying for green and gold standard open access publications in their representative fields. Open Access week – set the default to open is coming up again on October 22nd.

Moving to open access research publications all depends on the culture of the academic research community. It will take those EAP practitioners and researchers working in privileged and well-resourced institutions that can easily afford institutional subscriptions to memberships like BALEAP to seriously consider open access and the potential for global reach of research into EAP. It will also take those EAP practitioners who are working off their institutional radars, so to speak, and who are experimenting with Web 2.0 technologies to get their message and expertise out there for global interaction around issues in EAP practice and research. Something I picked up from Steve Kirk’s! account is a recent book setting an open trend in EAP publishing, Writing Programs Worldwide: Profiles of Academic Writing in Many Places, which is published in a free digital online format as well as a pay-for print version. This echoes what publishers are doing with big names in more open fields such as the Bloomsbury Academic publication of The Digital Scholar by Martin Weller. Exciting times and opportunities lie ahead for EAP publishing.

English for Specific Academic Purposes with data driven learning resources

It seems to be no great coincidence that Tim Johns, who coined the term Data Driven Learning (DDL) in 1994, had also come up with the term English for Academic Purposes (EAP) in 1974 (Hyland, 2006).
According to Chris Tribble’s preliminary results from his latest survey intake on DDL (announced at the TaLC closing keynote address), EAP practitioners still make up a high percentage of those who took the survey, indicating greater uptake of corpus-based resources and practices in EAP than in EFL / ESL, for example.

Open corpus-based tools and resources have the potential to equip and enable EAP practitioners to develop relevant ESAP materials. Awareness of and training in these open corpus-based resources will need to be shared across the EAP community, however, to ensure that we are crowd-sourcing our expertise and our resources in this area. If you click on the image below this will take you to a talk I gave at the Open University in the UK on addressing academic literacies with corpus-based OER. This was inspired by the Tribble DDL survey and the lead up to the TaLC10 conference. It was an added bonus to have one of the BAWE corpus developer team members in the audience that day and to receive positive feedback on how FLAX have opened up the BAWE in collaboration with TOETOE and OUCS.

OU video presentation on Addressing Academic Literacies with open corpus-based resources

Over the course of this academic year FLAX and TOETOE will continue to build onto work around opening up research corpora like the BAWE and the BNC managed by OUCS for developing resources for ESAP. We will also be engaging with various stakeholder groups through f2f workshops, online surveys and interviews for open corpus-based resources evaluation, and I will be sharing insights from these on this blog.

One final word on OER and where corpus-based resources might play a significant role in making higher education more accessible to the estimated 100 million learners worldwide who currently qualify to study at university level but do not have the means to do so (UNESCO, 2008).
Because English is the educational lingua franca, open educationalists are going to source support resources for academic English from the approaches and materials that are currently popular and openly available to re-use under creative commons licences. This throws up interesting issues around specificity in EAP for supporting learners with discipline-specific English.
A parallel universe in EAP materials development / resources

Cartoon image referred to by Niko Pfund, USA president of OUP, in podcast on Ebooks, Reading and Scholarship in a Digital Age

It would be an understatement to say that the academic publishing world is undergoing a radical transformation with the arrival of digital and open publishing formats which are democratising publishing as we know it. Niko Pfund, President of Oxford University Press (USA), discusses the ways in which technology affects reading, scholarship, publishing and even thinking in a presentation he gave at Oxford recently which you can access by clicking on the cartoon image above.

I learned a lot from this podcast, including OUP’s commitment since 2003 to publishing all research monographs in both digital and print formats. I also learned of their admiration for what Wikipedians have done for opening up knowledge and publishing through human crowd-sourcing that utilises open technologies and platforms. A parallel drawn here to something that was brought up repeatedly at the Wikimedia conference is how academic publishing houses like OUP are well placed to open up the disciplines in the same way as Wikipedia by bringing the voices of the academy into the public sphere through more accessible means of communication than research, and by effectively linking this research to current world events to gain wider relevance and readership.

Pfund refers to messy experimental times in academic publishing with lots of new business models currently being explored for spear-heading changes in publishing. OUP heavily subsidise and give away a lot of published resources including ELT textbooks to the developing world, but not yet under open licences (someone please correct me if I’m wrong here) for those practitioners working in under-resourced communities so that they can re-mix and re-distribute these same resources.

OUCS and OUP are literally down the road from one another, a parallel universe as it were.
The former is research, learning and teaching focused with a strong commitment to public scholarship, and the latter is focused on exploring new practices and business models for delivering the best in academic publishing. Arguably, there is a lot of overlap that can be tapped into here for the collaborative development of open corpus-based resources and practices for the global ELT market.

In-house EAP materials development

EAP teachers have been developing in-house EAP materials in response to the generic EAP teaching resources available on the mainstream market as a means of meeting the real needs of their students going on to any number of degree programmes. However, as I mentioned in section 2 of this blog post, many of these in-house EAP materials make use of third party copyrighted texts and therefore cannot be shared beyond the secret garden of the classroom or the institutional password-protected VLE. An enormous opportunity presents itself here to EAP practitioners and corpus linguists alike to push out resources in English for Specific Academic Purposes (ESAP) using open Data-Driven Learning (DDL) methods, texts, tools and platforms for sharing OER for ESAP. A significant cultural shift in practice will be required, however, to realise this vision for developing flexible and open ESAP resources that can be adapted for use in multiple educational contexts both off- and on-line. Once again, in subsequent blog posts, I will be presenting open educational practices and open research methods to open up discussion for ways forward with this particular global EAP vision.

References:

Anthony, L. (n.d.). Laurence Anthony’s Website: AntConc.
Alexander, O., Bell, D., Cardew, S., King, J., Pallant, A., Scott, M., Thomas, D., & Ward Goodbody, M. (2008). Competency framework for teachers of English for Academic Purposes. BALEAP.
Altbach, P. G., Reisberg, L., & Rumbley, L. E. (2009). Trends in Global Higher Education: Tracking an Academic Revolution. A Report Prepared for the UNESCO 2009 World Conference on Higher Education. Retrieved from
Berglund-Prytz, Y. (2009). Text Analysis by Computer: Using Free Online Resources to Explore Academic Writing. Writing and Pedagogy 1(2): 279–302.
Biber, D. (2006). University language: a corpus-based study of spoken and written registers. Amsterdam: John Benjamins.
British National Corpus, version 3 (BNC XML Edition). (2007). Distributed by Oxford University Computing Services on behalf of the BNC Consortium.
Coxhead, A. (2000). The Academic Word List.
Lexical Analysis Software & Oxford University Press (1996-2012). Wordsmith Tools.
Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. (2008). Corpus Linguistics with BNCweb – a Practical Guide. Frankfurt am Main: Peter Lang.
Hyland, K. (2006). English for Academic Purposes: An Advanced Handbook. London: Routledge.
Johns, T. (1994). From Printout to Handout: Grammar and Vocabulary Teaching in the Context of Data-driven Learning. In Odlin, T. (ed.), Perspectives on Pedagogical Grammar: 27-45. Cambridge: Cambridge University Press.
Nesi, H., Gardner, S., Thompson, P. & Wickens, P. (2007). The British Academic Written English (BAWE) corpus, developed at the Universities of Warwick, Reading and Oxford Brookes under the directorship of Hilary Nesi and Sheena Gardner (formerly of the Centre for Applied Linguistics [previously called CELTE], Warwick), Paul Thompson (Department of Applied Linguistics, Reading) and Paul Wickens (Westminster Institute of Education, Oxford Brookes), with funding from the ESRC (RES-000-23-0800).
Nesi, H. & Gardner, S. (2012). Genres across the Disciplines: Student writing in higher education. Cambridge: Cambridge University Press.
O’Keeffe, A., McCarthy, M., & Carter, R. (2007). From Corpus to Classroom: language use and language teaching. Cambridge: Cambridge University Press.
Reppen, R. (2010). Using Corpora in the Language Classroom. Cambridge: Cambridge University Press.