
Proposing Corpus NLP approach to MOOC Discussion



Poster Presentation in UCREL Corpus NLP Summer School 2018

Published in: Data & Analytics


Shi Min Chua

Background

Most Massive Open Online Courses (MOOCs) have a dedicated discussion space for learners to interact with each other. Several automatic categorizations of the discussion postings have been implemented in learning analytics research:

  • Supervised Machine Learning (O’Riordan et al, 2016) to categorize postings based on frameworks such as
      o Community of Inquiry (Garrison, Anderson, & Archer, 2001): social, cognitive, teaching
      o Bloom’s Taxonomy (Bloom et al, 1956): remember, understand, apply, analyse, evaluate, create
  • Content related vs. non-content related (Cui & Wise, 2015)
  • Sentiment Analysis (Wen, Yang, Rose, 2014)

These categorizations are useful for evaluation, yet they disregard the dialogic nature of the discussion forums and the linguistic resources underpinning it.

Work-in-Progress

Keyword Analysis & Lexical Bundles

  • Facilitators (816058) vs. Learners (11206220)
  • Learners’ lone posts (2401795) vs. initiating posts (6162230)
  • Learners’ replies (6162230) vs. initiating and lone posts (8564025)

Linguistic Resources for Dialogic Learning

  • Pronouns (Oliveira et al, 2007)
  • Personalized framing (Csomay, 2017)
  • Hedging (Brennan & Ohaeri, 1999; Concannon, Healey & Purver, 2003)
  • Stance expression (Hyland, 2011)
  • Interactivity (Kleinke, 2017)
  • Experience talk (Kaanta & Lehtinen, 2016)
  • Discourse organizers (Conrad & Biber, 2004)
  • Referential (Conrad & Biber, 2004)
  • Story-like (Alsop & Nesi, 2018)
  • Questions (Tracy & Robles, 2009)
  • Conditionals (O’Keeffe & Walsh, 2016)
  • Agreement and disagreement (Baym, 1996)
  • ?

Research Questions

What linguistic resources are used in MOOC discussions?

  • What linguistic resources are used by facilitators to create dialogic learning in MOOC discussions?
  • Why do some posts receive replies but some don’t? (learners’ posts)
  • What happens within a conversation thread? (learners’ replies)

Findings: Facilitators’ Linguistic Resources

  • Interactivity
      o Names
      o Pronouns: you, your, yourself, we, us
      o Discourse Particles: hi, yes, thanks, please, sorry
  • Meta-language
      o Discussion-related: point, points, pointing, comment, comments, question, discussion, post, feedback, answer, questions, pointing, reply, discussed, posted
      o Logistics and Learning Materials: click, check, button, materials, download, page, link, videos, section, text, pdf, fixed, sections, website
      o Course and MOOCs: mooc, futurelearn
      o Conceptual Objects: issue, issues, topic, case, research, researchers
      o Referential: week, weeks, later, next, coming
  • Stance Expression
      o Modality: might, 'll, can, will, want, 'd
      o Booster: indeed, just, exactly, directly
      o Positive Evaluation: right, fine, good, great, interesting
      o Emotions: glad, worry, afraid
      o Hedging Expression: sounds, sure, thoughts, find
      o Speech act: suggest, mean, ask, suggestion, refering, asking
  • Uncategorized
      o Connectors: if, e.g., example, terms, i.e., meantime, then, examples, depends
      o Punctuation: ),'(-:!?"
      o Grammatical particles: here, this, that, there, the, these, are, is, be, 's, do, on, for
      o Uncategorized: option, b, different, what, two, available, free
      o Uncategorized Verbs: let, hear, hope, note, see, look, using, try, collect, uses

Example exchange:

Facilitator: A disappointing news item on the ALT mailing list today, ….<url>… Sadly this will also set a US legal precedent, so we'll probably see a great deal more free and open content disappearing. So now its all completely inaccessible - to everyone :-( So a question <...> Do you think such freely-provided 'open content' should be taken down if it isn't captioned <…>?<...>

Learner A: I think this is rather unfair. I can understand that the UoC wants to make sure all their content is accessible but surely there could be a statement to say <…>

Learner B: Hi <…>

Possible for Corpus NLP

To examine characteristics
  - lone posts vs. initiating posts
  - popular posts (most liked)

POS & Semantic Tagging or Biber’s Tagger for multidimensional analysis. Use the dimensions to categorize the postings, similar to previous learning analytic research yet from the linguistic point of view.

To examine turn-taking
  - Collocation of linguistic features between posts and replies, rather than by N-positions?
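
The keyword comparisons listed under Work-in-Progress (for example, facilitators vs. learners) can be illustrated with a small keyness calculation. The poster does not say which keyness statistic or tool was used, so the sketch below assumes the log-likelihood measure that is common in corpus linguistics. The function names (`log_likelihood`, `keywords`) and the toy token lists are hypothetical and are not taken from the poster's data; the input is assumed to be already tokenized.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness for one word.

    a: frequency of the word in the target corpus
    b: frequency of the word in the reference corpus
    c: total tokens in the target corpus
    d: total tokens in the reference corpus
    """
    # Expected frequencies if the word were equally likely in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(target_tokens, reference_tokens, top_n=10):
    """Return the top_n words over-represented in the target corpus."""
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    c = sum(target.values())
    d = sum(reference.values())
    if c == 0 or d == 0:
        raise ValueError("both corpora must contain at least one token")
    scored = []
    for word, a in target.items():
        b = reference.get(word, 0)
        # Keep only words that are relatively more frequent in the target.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy, invented token lists standing in for facilitator and learner posts.
facilitator_tokens = "you you you thanks the the".split()
learner_tokens = "i i think the the the the".split()

for word, score in keywords(facilitator_tokens, learner_tokens):
    print(word, round(score, 2))
```

In a real analysis the two token lists would come from the tokenized sub-corpora being compared, and a significance threshold or minimum frequency would usually be applied before interpreting the resulting keyword list.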