eMadrid 2014-01-17 uned Salvador Ros (UNED) "Big Data in Education"
 

    Presentation Transcript

    • Analyzing the students' behavior and relevant topics in virtual learning communities • Llanos Tobarra, Antonio Robles-Gómez, Salvador Ros, Roberto Hernández, Agustín C. Caminero • Computers in Human Behavior 31 (2014) 659-669, online December 2013, JCR Q1 • Departamento de Sistemas de Comunicación y Control, Universidad Nacional de Educación a Distancia (UNED) • {llanos,arobles,sros,roberto,accaminero}@scc.uned.es
    • Outline • Introduction • Outcomes
    • Introduction • UNED is a distance-learning university. • Specific techniques are needed for monitoring and analysing the information gathered by the LMS. • Learning Analytics is defined as: the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimising learning and …
    • Different Approaches • Types of analytics: Educational data mining, Learning Analytics, Academic Analytics • Level or object of analysis, and who benefits: – Course-level (social networks, conceptual development, discourse analysis, "intelligent curriculum"): learners, faculty – Departmental (predictive modeling, patterns of success/failure): learners, faculty – Institutional (learner profiles, performance of academics, knowledge flow): administrators, funders, marketing – Regional (state/provincial; comparisons between systems): funders, administrators – National and international: national governments, education authorities
    • Learning Analytics Process
    • Outline • Introduction • Outcomes
    • Where do we focus? • Forums – Essential for negotiation and exchange of ideas – Collaborative learning – High correlation between students' participation levels and positive learning outcomes and knowledge construction
    • Outcomes • Provide an extensive analysis of the students' behaviour in an on-line learning community • Propose a set of algorithms to automatically characterize the most relevant topics of the community • How? – Student and faculty interaction through forum messages has been analyzed. • Results
    • Questions • What are the students' behavior patterns during their interaction and participation in the asynchronous virtual discussion forums of the virtual learning community? • What are the most relevant topics and subtopics in the asynchronous on-line discussion forums of the online learning community?
    • Input data • Data from two academic years, 2010-2011 and 2011-2012 • Student forum • Activities 1-6 forum • Activities 7-11 forum • Faculty forum • About 2000 messages
    • Procedure • Data collection and statistical analysis • Semantic analysis • Calculation of stem networks
    • Procedure • Statistical indicators computed for each participant: – Number of published messages – Number of replies – Number of initiated conversations – Number of initiated conversations without replies – Number of conversations where the participant has posted a message – Number of forums where the …
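The per-participant indicators listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Message` record and its fields (`msg_id`, `author`, `thread_id`, `parent_id`) are hypothetical, and a message with no parent is assumed to open a conversation. The truncated last indicator is not covered.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    # Hypothetical record layout; the slides do not describe the LMS export.
    msg_id: int
    author: str
    thread_id: int
    parent_id: Optional[int]  # None means the message opens a conversation


def participant_indicators(messages):
    """Compute the statistical indicators for every author."""
    # First pass: how many replies each conversation received.
    replies_per_thread = defaultdict(int)
    for m in messages:
        if m.parent_id is not None:
            replies_per_thread[m.thread_id] += 1

    # Second pass: accumulate the indicators author by author.
    stats = defaultdict(lambda: {
        "published": 0,
        "replies": 0,
        "initiated": 0,
        "initiated_without_replies": 0,
        "threads": set(),
    })
    for m in messages:
        s = stats[m.author]
        s["published"] += 1
        s["threads"].add(m.thread_id)
        if m.parent_id is None:
            s["initiated"] += 1
            if replies_per_thread[m.thread_id] == 0:
                s["initiated_without_replies"] += 1
        else:
            s["replies"] += 1

    # Replace the helper set by the number of distinct conversations.
    result = {}
    for author, s in stats.items():
        row = {k: v for k, v in s.items() if k != "threads"}
        row["conversations"] = len(s["threads"])
        result[author] = row
    return result


# Made-up sample data for illustration only.
sample = [
    Message(1, "ana", 1, None),
    Message(2, "luis", 1, 1),
    Message(3, "ana", 2, None),
    Message(4, "ana", 1, 2),
]
indicators = participant_indicators(sample)
```

Keeping the reply count per conversation in a separate first pass lets the "initiated conversations without replies" indicator be decided without ordering assumptions on the input.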
    • Procedure • Semantics – Split each message into basic tokens – Remove stop-words – Obtain the token stem (Porter algorithm) – Calculate daily and global frequencies • Tools: Apache Lucene library, Snowball
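The semantic steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the Lucene/Snowball pipeline the slides mention: `crude_stem` is a deliberately naive suffix stripper standing in for the Porter algorithm, the stop-word list is a tiny made-up sample, and messages are assumed to arrive as `(day, text)` pairs.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (assumption; a real list is much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "in", "on", "for", "and", "to"}


def tokenize(text):
    """Split a message into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Naive stand-in for the Porter stemmer: strip one common suffix."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def term_frequencies(messages):
    """Return (daily, overall) stem frequencies for (day, text) pairs."""
    daily = defaultdict(Counter)
    overall = Counter()
    for day, text in messages:
        stems = [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]
        daily[day].update(stems)
        overall.update(stems)
    return daily, overall


# Made-up forum messages for illustration only.
messages = [
    ("2011-01-10", "The exams are on Monday"),
    ("2011-01-10", "Is the exam difficult?"),
    ("2011-01-11", "Scores for the exam"),
]
daily, overall = term_frequencies(messages)
```

In practice a tested stemmer implementation would replace `crude_stem`; only the shape of the pipeline (tokens, stop-word filter, stems, daily and global counts) is meant to carry over.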
    • First Question • What are the students' behavior patterns during their interaction and participation in the asynchronous virtual discussion forums of the virtual learning community?
    • Student behaviour modelling (SIIE'12) • Students can be classified according to their pattern of behaviour as: – Producers • Proactive • Reactive – Consumers
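One possible reading of this classification, expressed as code. The decision rules are assumptions made for illustration, since the slide names the categories but not the criteria: a participant who has opened at least one conversation is treated as a proactive producer, one who only replies as a reactive producer, and one who never posts as a consumer.

```python
def classify_participant(initiated, replies):
    """Map two posting indicators to a behaviour category.

    The thresholds are illustrative assumptions, not the authors' criteria.
    """
    if initiated > 0:
        return "producer (proactive)"
    if replies > 0:
        return "producer (reactive)"
    return "consumer"


# Hypothetical participants: (initiated conversations, replies).
examples = {
    "ana": (2, 1),
    "luis": (0, 4),
    "marta": (0, 0),
}
labels = {name: classify_participant(*counts) for name, counts in examples.items()}
```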
    • Second Question • What are the most relevant topics and subtopics in the asynchronous on-line discussion forums of the online learning community?
    • Topic Modelling Process • The topic modelling process detects the most relevant topics discussed in the asynchronous discussion forums of on-line educational environments.
    • Topic Dynamics • First decomposition: – Chatter topics, which are internally driven, can be seen as sustained discussion topics. New thoughts on chatter topics are published every day in an educational community, and some members react to previously posted ideas. – Spike topics, which are externally induced, produce sharp rises in postings.
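The difference between a sustained chatter topic and a spike topic can be made concrete with a simple detector over daily posting counts. This is a sketch under assumptions: the rule "a day is a spike when its count exceeds the mean by `k` standard deviations" and the default `k=2.0` are illustrative choices, not taken from the slides, and the sample series are made up.

```python
from statistics import mean, pstdev


def spike_days(daily_counts, k=2.0):
    """Return the indices of days whose count exceeds mean + k * std."""
    mu = mean(daily_counts)
    sigma = pstdev(daily_counts)
    if sigma == 0:
        # A perfectly flat series has no spikes.
        return []
    return [i for i, c in enumerate(daily_counts) if c > mu + k * sigma]


# Made-up daily term counts for illustration only.
chatter_like = [4, 5, 3, 4, 5, 4, 3, 5]   # steady, internally driven
spike_like = [0, 0, 1, 0, 12, 1, 0, 0]    # one sharp, externally induced rise

chatter_spikes = spike_days(chatter_like)
topic_spikes = spike_days(spike_like)
```

With these sample series the steady topic yields no spike days, while the second series flags only the day with 12 postings.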
    • Topic Dynamics (II) • Second decomposition: – Just spike. These topics have a very low correlation with any chatter topic, but they are strongly correlated with an external event, such as New Year greetings or the participants' initial introductions. They are initially inactive, become very active within a particular time sub-window, and then return to inactivity. – Spiky chatter. These topics have a high correlation with a chatter level and are also very sensitive to external events. For example, "scores" could be classified as a spiky chatter subtopic because of its strong correlation with the exam topic and its sensitivity to an external event (the publication of the participants' scores). – Mostly chatter. These topics are continuously …
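The correlation that separates "just spike" from "spiky chatter" topics can be illustrated with a Pearson coefficient between two daily frequency series. This is a sketch under assumptions: the slides do not say which correlation measure or thresholds were used, and the three series below are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        # A constant series carries no co-variation information.
        return 0.0
    return cov / sqrt(vx * vy)


# Made-up daily frequencies for illustration only.
exam = [2, 3, 2, 4, 9, 3, 2]        # chatter topic
scores = [0, 1, 0, 1, 8, 1, 0]      # rises together with "exam"
greetings = [6, 0, 0, 0, 0, 0, 0]   # active only around an external event

r_scores = pearson(exam, scores)
r_greetings = pearson(exam, greetings)
```

Here `scores` tracks `exam` closely (a spiky chatter candidate), while `greetings` does not follow it at all (a just-spike candidate).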
    • Selecting forum topics • Three weight functions • Best fit: weighted frequency
    • Third Question • Can the topics be characterized automatically? • Two algorithms: – One for mostly chatter – One for spiky chatter • Results – Topics and subtopics
    • Example: topic modelling result
    • What else? • Create topic networks per forum
    • Thanks for your attention! Any questions?
    • Topic Modelling: Chatter • The DumpTerms set contains all terms already detected as irrelevant topics, such as names or surnames. • Plural detection. • The accumulated frequency f(ti) is computed for each term. • The terms are then ranked. • As a result, we obtain a set called Chatter.
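The chatter steps above can be sketched as follows. This is an illustration under assumptions, not the authors' algorithm: plural detection is reduced to stripping a final "s", the cut-off `top_k` is a hypothetical parameter (the slide does not say how many ranked terms enter the Chatter set), and the counts are made up.

```python
from collections import Counter


def singular(term):
    """Naive plural detection: drop a trailing 's' from longer terms."""
    if term.endswith("s") and len(term) > 3:
        return term[:-1]
    return term


def chatter_terms(term_counts, dump_terms, top_k=3):
    """Filter DumpTerms, merge plurals, accumulate f(ti), rank, keep top_k."""
    accumulated = Counter()
    for term, count in term_counts.items():
        base = singular(term)
        if term in dump_terms or base in dump_terms:
            continue  # irrelevant terms such as names or surnames
        accumulated[base] += count
    return [term for term, _ in accumulated.most_common(top_k)]


# Made-up global term frequencies for illustration only.
term_counts = {"exam": 10, "exams": 5, "llanos": 8, "practice": 7, "forum": 2, "score": 6}
dump_terms = {"llanos"}
chatter = chatter_terms(term_counts, dump_terms)
```

Merging singular and plural forms before ranking keeps one topic from being split across two lower-ranked entries.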
    • Topic Modelling: Spikes • For each pair of terms, ti from the set T and tj from the Chatter set, the number of messages mk in which both terms appear together (si) is counted. • The probability of occurrence of tj given ti (cri) is also calculated. • In case these values are …
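The two quantities described above can be computed as sketched below. The estimate of the conditional probability as "messages containing both terms divided by messages containing ti" is an assumption, since the slide does not give the formula; each message is assumed to be already reduced to a set of stems, and the sample data is made up.

```python
def cooccurrence_stats(messages, terms, chatter):
    """For each pair (ti, tj) return (si, cri).

    si  : number of messages containing both ti and tj
    cri : estimated probability of tj given ti (assumed estimator)
    """
    stats = {}
    for ti in terms:
        with_ti = [m for m in messages if ti in m]
        for tj in chatter:
            if ti == tj:
                continue
            si = sum(1 for m in with_ti if tj in m)
            cri = si / len(with_ti) if with_ti else 0.0
            stats[(ti, tj)] = (si, cri)
    return stats


# Made-up messages, each already reduced to a set of stems.
stem_sets = [
    {"score", "exam"},
    {"score", "exam", "date"},
    {"score"},
    {"exam"},
]
pair_stats = cooccurrence_stats(stem_sets, terms=["score", "date"], chatter=["exam"])
```

In this sample, "score" co-occurs with "exam" in two of the three messages that mention it, and "date" only ever appears together with "exam".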