VSS 2011 Data Mining (Thursday, 10:45)

Towards the Development of a Real-Time Decision Support System for Online Learning, Teaching and Administration

Published in: Education, Technology
  • Tell them how you’re going to bore them. Bore them. Tell them how you bored them.
  • In 2007 we conducted the first phase, which looked at the status of PD for K-12 online teachers. In 2008 we conducted phase II, looking at the unique needs of K-12 online teachers. In 2009 we began two evaluations as pilot investigations into the evaluative phase of the research series, primarily to help us understand more clearly the factors in evaluating teacher effectiveness as well as how best to gather data on a national level. Discuss the complexity of measuring the effectiveness of teacher training on student outcomes.
  • EDM is relatively new to education
  • Combining survey data with data mining of learning management system (LMS) server logs to investigate learner behaviors via cluster analysis, sequential association analysis, and decision tree analysis. Data mining is commonly used in business and ecommerce but can also be applied in educational settings as a tool for pattern discovery and predictive modeling. Practical applications include tracking learner behaviors, identifying struggling students, depicting learning preferences, improving course design, personalizing instruction, predicting student performance, and data visualization of learner behaviors.
  • Two major applications, relationship mining (reveal relationship between learning interactions) and prediction (identify key predictors of learning behaviors or performances), are the most common approaches of EDM. Techniques such as data visualization, clustering, classification, association rule, and decision tree (Romero & Ventura, 2007) are the most popular EDM techniques. These EDM techniques empower educational researchers to present data visually, classify students by certain criteria, identify and monitor patterns of learning behaviors, and predict learning outcomes and task performance accordingly. This evaluation used participant self-report data from surveys, analyses of participants’ completed work samples, and LMS data mining to evaluate satisfaction with the training, the level of engagement participants experienced with the training, and perceived impact on teaching practice as a result of the training.
  • Instead of focusing on content or discussion only, the results indicate participants tended to switch between content and discussion within one session because the first two rules have higher support and confidence rates than rules three and four. The results also show that different types of interactions (content-participant, participant-instructor, and participant-participant) were well facilitated in the workshops overall.
  • Classification of survey questions and online engagement behaviors based on similarity of participants’ responses.
  • Association rule analysis revealed higher support and confidence ratings for participant learning paths that included both discussion forums and content access. These workshops included different types of interactions (content-participant, participant-instructor, and participant-participant).
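The support and confidence figures mentioned in these notes can be illustrated with a minimal sketch. It assumes each session is reduced to the set of activity types it contains; the session data and the function name `rule_metrics` are hypothetical and are not taken from the study. Support is the share of all sessions containing both activities, and confidence is the share of sessions containing the first activity that also contain the second.

```python
from itertools import permutations


def rule_metrics(sessions):
    """Return {(antecedent, consequent): (support, confidence)} for activity pairs.

    support    = sessions containing both activities / all sessions
    confidence = sessions containing both activities / sessions containing the antecedent
    """
    item_sets = [set(session) for session in sessions]
    total = len(item_sets)
    items = sorted(set().union(*item_sets))
    metrics = {}
    for antecedent, consequent in permutations(items, 2):
        with_antecedent = sum(1 for s in item_sets if antecedent in s)
        with_both = sum(1 for s in item_sets if antecedent in s and consequent in s)
        metrics[(antecedent, consequent)] = (with_both / total, with_both / with_antecedent)
    return metrics


# Hypothetical sessions: each list holds the activity types seen in one login session.
sessions = [
    ["content", "discussion"],
    ["content", "discussion"],
    ["content"],
    ["discussion", "content"],
    ["gradebook"],
]

metrics = rule_metrics(sessions)
support, confidence = metrics[("content", "discussion")]
print(f"content -> discussion: support={support:.2f}, confidence={confidence:.2f}")
```

A rule such as content -> discussion with high support and high confidence would correspond to participants switching between content and discussion within one session.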
  • This study explored the potential applications of data mining in support of online PD. Two important outcomes were expected as a result. First, it allowed us to begin the process of developing a model for utilizing data as a predictor of learner performance. This outcome is valuable in the creation of warning or recommendation systems that can be used to notify both instructors and students of behaviors that may result in unsatisfactory course outcomes. Second, statistical data demonstrating how learners engage with course materials can lead to improved course design, adjustment of learning strategies, and improvement in learner performance through individualized support mechanisms. It is expected that a greater understanding of the potential to leverage data collected every day through LMSs will be of great benefit in evaluating the effectiveness of instruction at all levels of education.
  • Figure 3 (Lv2) includes four students (S1, S2, S3, and S4) randomly selected from X1. The results show that S1 and S3 shared similar activity patterns. S2 is significantly more active than the other three students and preferred to work ahead. The activity frequency of S4 is similar to that of S1 and S3. However, S4 showed different learning preferences from the other two students. Figure 2 (Lv1) shows daily patterns of activity frequency by week for all four target courses. In the case study, courses X1 and X2 are the two sections of course X, and courses Y1 and Y2 are the two sections of course Y. Figure 2 reveals the following results: First, X1 students were more active than students in X2. Second, assignments for all courses were due on Tuesdays, and courses X2, Y1, and Y2 show higher activity frequencies on Tuesdays than on other days. However, students in X1 preferred to work one day before the assignment was due.
  • Figure 4 (Lv2) illustrates the activity patterns of course X1. The following behaviors—frequency of course pages accessed, number of discussions read, number of discussions posted, number of discussions answered, and frequency of tools accessed—were accumulated on different time sections and days of the week. The results revealed the following behavioral characteristics: First, reading is the major activity because reading posts and materials are the top two most frequent behaviors. In addition, these two behaviors showed similar patterns, which indicates that when students read course materials, they will read discussions too. Second, Sunday is the most popular day for replying to discussions. Third, most learning behaviors occurred on Monday and Tuesday, and between 13:00 and 00:59.
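Accumulating behaviors by day of the week and time of day, as described for Figure 4, can be sketched as follows. This is an illustration only: the timestamp format, the sample log times, and the function names are assumptions, not details from the study.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def activity_by_weekday_and_hour(timestamps):
    """Count log entries per (weekday name, hour of day) slot."""
    counts = Counter()
    for stamp in timestamps:
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        counts[(WEEKDAYS[moment.weekday()], moment.hour)] += 1
    return counts


def activity_by_weekday(slot_counts):
    """Collapse the (weekday, hour) counts into totals per weekday."""
    totals = Counter()
    for (day, _hour), count in slot_counts.items():
        totals[day] += count
    return totals


# Hypothetical log timestamps (assumed format: YYYY-MM-DD HH:MM:SS).
log_times = [
    "2010-03-01 14:05:00",  # Monday afternoon
    "2010-03-01 14:40:00",  # Monday afternoon
    "2010-03-02 21:15:00",  # Tuesday evening
    "2010-03-07 09:30:00",  # Sunday morning
]

by_slot = activity_by_weekday_and_hour(log_times)
by_day = activity_by_weekday(by_slot)
print(by_day.most_common())
```

Running the same counts separately for each behavior type (pages accessed, discussions read, and so on) would give the per-behavior patterns the note describes.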
  • Clustering algorithms were used to categorize students into homogeneous groups. K-means clustering was applied to group students based on their shared characteristics in terms of frequency of course materials accessed, frequency of “tools” link accessed, number of discussions posted, number of discussions read, number of discussions replied to, and final grade. This method was intended to gather individuals who were “close” into the same group for further analysis. In order to compare results, the number of clusters was limited to four. Because highly skewed data will influence the results of clustering analysis, normalization methods were applied to the highly skewed fields. Shared Characteristics of Course X: Cluster 1 (3 students) indicates a relatively low level of engagement (frequency of course materials accessed: 0.25; frequency of tool links accessed: 0.26; number of discussions posted: 0.17; number of discussions read: 0.09; number of discussions replied to: 0.1), which resulted in lower performance (final grade: 0.35). Cluster 2 (3 students) indicates a relatively higher level of engagement (0.95, 0.82, 0.88, 0.82, and 0.96 respectively), which resulted in higher performance (0.78). Cluster 3 (17 students) represents students who are around average on all indicators (0.38, 0.41, 0.34, 0.27, 0.28, and 0.77 respectively). Cluster 4 (14 students) consists of efficient students who have a lower engagement level (0.18, 0.14, 0.23, 0.11, and 0.14) with higher learning outcomes (0.76).
  • Shared Characteristics of Course Y: Cluster 1 (2 students) consists of relatively low-engaged students (0.04, 0.01, 0, 0, and 0.02 respectively), which resulted in lower performance (0.2). Cluster 2 (23 students) consists of relatively high-engaged students (0.93, 0.75, 0.17, 0.6, and 0.49), which resulted in higher performance (0.93). The other two groups (Cluster 3, 13 students, and Cluster 4, 2 students) are of particular interest for doing research and adjusting teaching strategies. Cluster 3 represents students who need further facilitation. They are relatively high-engaged (0.41, 0.25, 0.38, 0.44, and 0.64 respectively), but their performance is the lowest in the course (0.13). Cluster 4 students are high performers (0.79) with low discussion participation (noPost: 0.3, noRead: 0.18, and noReply: 0.29). Based on the results, these students are more efficient than other students in the class. Further investigations on critical thinking and learning strategy might help to improve the data interpretation. Based on the results of Figures 5 and 6, clustering analysis provides an overview of students’ learning profiles, identifies interesting groups for further analysis, and suggests possible teaching strategy adjustments.
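The clustering procedure described for Courses X and Y (normalize the fields, then group students with k-means) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the tool the study used: min-max scaling stands in for the unspecified normalization method, the first k rows serve as starting centroids so the run is deterministic, k is 2 rather than the four clusters used in the study, and the student rows are hypothetical.

```python
import math


def min_max_normalize(rows):
    """Rescale every column to the 0..1 range (assumed normalization method)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def k_means(points, k, max_iterations=100):
    """Plain k-means; the first k points are the initial centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clustering has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical students: (materials accessed, discussions posted, final grade).
students = [[10, 1, 55], [12, 2, 60], [95, 20, 90], [100, 18, 95]]

normalized = min_max_normalize(students)
labels, centroids = k_means(normalized, k=2)
print(labels)
```

The centroid of each cluster is what the notes report as that cluster's shared characteristics on the 0 to 1 scale.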
  • Path analysis is one of the association rule techniques for analyzing data to determine the most frequent sequential paths taken by users within one session. The link graphs (Figures 7 and 8) display association results by using nodes and links. The default size of a node indicates the behavior counts in the association rules (support). Larger nodes have greater counts than smaller nodes. The thickness of links between nodes indicates the confidence level of a rule. Thicker links indicate higher confidence. In order to show frequent learning paths, rules below a 10% support rate were discarded in the results. Figure 7 shows results of path analysis for course X. The results reveal that the course homepage is the center of course activities. The most frequent learning paths involved reading. Reading discussions and course materials are highly associated with the homepage. Figure 8 includes results of path analysis for course Y. The results revealed that students were involved in more types of interactions, including reading course materials and discussions (student-content) and posting discussions (student-student or student-teacher). The following two factors might influence how students acted in course X and course Y. 1. Course structure design: the instructor of course X adopted Moodle’s topic design, and students can access course components through direct links on the course home page. Conversely, the instructor of course Y adopted Moodle’s page design and organized course components hierarchically by using drop-down menus. 2. Teaching strategy: discussion grades for course X were based on discussion participation. On the other hand, students in course Y needed to work as discussion facilitators in turn. In addition, discussion grades were based on quality of discussion (via peer evaluation) and discussion participation.
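The path analysis described here can be approximated with a short sketch that counts page-to-page transitions inside each session and discards rules below the 10% support rate. The definitions are assumptions made for illustration: support is the share of sessions containing the transition, and confidence is the share of sessions visiting the source page that also contain the transition. The session data and the name `frequent_paths` are hypothetical.

```python
def frequent_paths(sessions, min_support=0.10):
    """Return {(source, target): (support, confidence)} for frequent transitions."""
    total = len(sessions)
    sessions_with_page = {}
    sessions_with_step = {}
    for session in sessions:
        # Count each page and each consecutive step at most once per session.
        for page in set(session):
            sessions_with_page[page] = sessions_with_page.get(page, 0) + 1
        for step in set(zip(session, session[1:])):
            sessions_with_step[step] = sessions_with_step.get(step, 0) + 1
    rules = {}
    for (source, target), count in sessions_with_step.items():
        support = count / total
        if support >= min_support:  # drop infrequent paths
            rules[(source, target)] = (support, count / sessions_with_page[source])
    return rules


# Hypothetical sessions: ordered lists of pages visited in one login session.
sessions = (
    [["home", "materials"]] * 6
    + [["home", "discussion_read"]] * 4
    + [["home", "tools", "gradebook"]]
)

rules = frequent_paths(sessions)
for (source, target), (support, confidence) in sorted(rules.items()):
    print(f"{source} -> {target}: support={support:.2f}, confidence={confidence:.2f}")
```

In a link graph, the support values would set node sizes and the confidence values would set link thickness.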
  • Both courses showed discussion participation (replies and posts) as the most important behavior for predicting students’ overall performance. Courses X and Y allocated a similar share of the grade to discussion participation (20% and 24% respectively). However, the discussion grade for course X was based on participation only, while the design of course Y required small groups of students to work in turn as discussion board facilitators to encourage more meaningful discussions. The design used in course Y improved the quality of discussion and influenced students’ behaviors (Figure 8). As a result, course Y students obtained benefits from reading discussions (Figure 10).
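Decision tree analysis finds predictors like these by searching for the split on a behavior count that best separates higher and lower performers. A minimal sketch of one such split (a single-level tree, or decision stump) follows. It assumes Gini impurity as the splitting criterion, which the source does not specify, and the reply counts and outcome labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    positive_share = sum(labels) / len(labels)
    return 2 * positive_share * (1 - positive_share)


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single split on one feature."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best_threshold, best_score = None, float("inf")
    # Candidate thresholds are midpoints between neighboring distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in pairs if x < threshold]
        right = [y for x, y in pairs if x >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical data: discussion replies per student and a 0/1 high-performance flag.
replies = [2, 4, 6, 8, 10, 12]
high_performance = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(replies, high_performance)
print(f"split at replies >= {threshold} (weighted Gini {impurity:.2f})")
```

A full decision tree repeats this search over every behavior variable and then recursively within each resulting group.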
  • Using multiple forms of data allows for a more meaningful analysis about actual student behaviors, and the identification of potential relationships with demographic data, satisfaction data and student outcomes. The result is a much richer and deeper analysis of student performance and teaching as well as course design effectiveness than could ever be accomplished with survey data or mining behaviors alone.
  • Cluster 1 (11 students, pass rate 84.61%, 9 females and 2 males): Cluster 1 consists of the youngest students among the 6 clusters. They were the highest-engaged students compared with the other clusters. On average, they took 1.18 courses and might fail in some.
    Cluster 2 (104 students, pass rate 54.09%, 56 females and 48 males): Cluster 2 consists of older students. They were slightly lower-engaged than Clusters 5 and 6. On average, they took 2.47 courses in Spring 2010 and failed about half of them.
    Cluster 3 (295 students, pass rate 0%, all males): Cluster 3 consists of low-engaged male students. On average, they took 1.23 courses and failed all of them.
    Cluster 4 (241 students, pass rate 0%, all females): Similar to Cluster 3, Cluster 4 consists of low-engaged female students. On average, they took 1.23 courses and failed all of them.
    Cluster 5 (1,374 students, pass rate 100%, all males): Cluster 5 represents male students who were highly engaged and passed all courses. On average, they took 1.22 courses.
    Cluster 6 (1,899 students, pass rate 100%, all females): Similar to Cluster 5, Cluster 6 represents female students who were highly engaged and passed all courses. On average, they took 1.24 courses.
  • Cluster 1 (316 students, pass rate 55.07%, all males): Cluster 1 consists of students who are older than those in Clusters 3 to 6. They were lower-engaged than Clusters 5 and 6 but higher than Clusters 3 and 4. On average, each student took 2.76 courses and failed about half of them.
    Cluster 2 (320 students, pass rate 56.11%, all females): Similar to Cluster 1, Cluster 2 consists of students who are older than those in Clusters 3 to 6. They were lower-engaged than Clusters 5 and 6 but higher than Clusters 3 and 4. On average, each student took 3.03 courses and failed about half of them.
    Cluster 3 (594 students, pass rate 0%, all males): Clusters 3 and 4 include the lowest-engaged students. Cluster 3 students are all male. On average, each student took 1.43 courses and failed all of them.
    Cluster 4 (601 students, pass rate 0%, all females): Cluster 4 includes the lowest-engaged female students. On average, each student took 1.39 courses and failed all of them.
    Cluster 5 (2,311 students, pass rate 100%, all males): Clusters 5 and 6 represent the highest-engaged students. Cluster 5 students are all male. On average, each student took 1.59 courses and passed all of them.
    Cluster 6 (3,397 students, pass rate 100%, all females): Cluster 6 represents the highest-engaged female students. On average, each student took 1.64 courses and passed all of them.
  • Because the previous results indicated that students in Math, Science, and English had lower performance than those in other subject areas, researchers were interested in identifying potential anomalies within this group that might help explain the results. Further analysis was applied to identify which Math, Science, and English courses resulted in the highest performance and which resulted in the lowest performance. Researchers divided courses into three conditions based on student behaviors within the course: (a) high-engaged, high-performance; (b) high-engaged, low-performance; and (c) low-engaged, low-performance. Courses categorized as high-engaged and high-performance might represent courses with both effective design and effective implementation because students were highly engaged and achieved expected outcomes. Those categorized as high-engaged and low-performance might represent courses with less effective course design because students were unable to achieve expected outcomes despite what appears to be effective implementation. Finally, courses categorized as low-engaged and low-performance might represent courses with less effective course design and less effective course implementation.
    Math Courses: High engaged & high performance => “The course was not available at my school.” High engaged & low performance => Various reasons. Low engaged & low performance => “I was making up a class I had failed.”
    Science Courses: High engaged & high performance => “The course was not available at my school” & other. High engaged & low performance => Various reasons. Low engaged & low performance => “I want room in my schedule for another elective” & other.
    English Courses: High engaged & high performance => “The course was not available at my school” & other. High engaged & low performance => Other. Low engaged & low performance => “I was making up a class I had failed” & other.
  • Researchers divided courses into three conditions: (a) high-engaged, high-performance, (b) high-engaged, low performance, and (c) low-engaged, low-performance based on student behaviors within the course. Courses categorized as high-engaged and high-performance might represent courses with both effective design and effective implementation because students were highly engaged and achieved expected outcomes. Those categorized as high-engaged and low-performance might represent courses with less effective course design because students were unable to achieve expected outcomes despite what appears to be effective implementation. Finally, courses categorized as low-engaged and low performance might represent courses with less effective course design and less effective course implementation. Our analysis revealed that regardless of the content area, most high-engaged, low performance, or low-engaged, low performance courses were entry-level courses. Most high-engaged, high performance courses were advanced level courses.
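Sorting courses into these engagement and performance conditions can be expressed as a small rule. The sketch below is illustrative only: the source does not state how "high" and "low" were determined, so the cutoff values, the 0 to 1 scales, and the course records are all hypothetical.

```python
def categorize_course(engagement, performance, engagement_cut=0.5, performance_cut=0.6):
    """Label a course by engagement and performance (cutoffs are assumed values)."""
    engaged = "high-engaged" if engagement >= engagement_cut else "low-engaged"
    performing = "high-performance" if performance >= performance_cut else "low-performance"
    return f"{engaged}, {performing}"


# Hypothetical courses: (average engagement, average pass rate), both on a 0..1 scale.
courses = {
    "Course A": (0.90, 0.95),
    "Course B": (0.80, 0.40),
    "Course C": (0.20, 0.30),
}

conditions = {name: categorize_course(e, p) for name, (e, p) in courses.items()}
for name, condition in conditions.items():
    print(f"{name}: {condition}")
```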
  • Variables: average course grade (dependent), average course satisfaction (independent), average instructor satisfaction (independent).
  • However, six students indicated low instructor satisfaction, despite extremely high frequency of course access and high final grades. Variables: average course grade (dependent), average course satisfaction (independent), average instructor satisfaction (independent).
  • The following are the significant contributing variables, listed by descending level of importance in positive and negative categories respectively:
  • Also, the findings illustrate the flaw in sole reliance on self-report and perception data in program evaluation to inform strategic decisions.  Although students obtained the lowest average grades in Math and English courses, they did not show significantly lower satisfaction levels in these two subject areas. Since perception data only reflected positive experiences, the picture of students’ experiences in courses could be misrepresented and partial if students’ learning behaviors were not analyzed.
  • 1) demonstrating how data mining can be incorporated into course evaluation in order to support decision making at the course level and at the institutional level; 2) exploring potential applications at the K-12 level for educational data mining that has already been broadly adopted in higher education institutions; 3) providing a framework of data triangulation that generates high-quality and non-partial results by combining student learning logs with demographic data and course evaluation survey; 4) depicting profiles of successful and at-risk students and identifying important predictors of student performance, course satisfaction, and instructor satisfaction for K-12 online education.
  • VSS 2011 Data Mining (Thursday, 10:45)

    1. Towards the Development of a Real-Time Decision Support System for Online Learning, Teaching and Administration. Kerry Rice, Ed.D., Associate Professor and Chair; Andy Hung, Ed.D., Assistant Professor; Yu-Chang Hsu, Ph.D., Assistant Professor
    2. M.S. in Educational Technology; Masters in Educational Technology; Ed.D. in Educational Technology; K-12 Online Teaching Endorsement; Graduate Certificates: Online Teaching - K12 & Adult Learner, Technology Integration Specialist, School Technology Coordinator; Online Teacher PD Portal; Game Studio: Mobile Game Design; Learning Technology Design Lab
    3. EDTECH Fast Facts • Largest graduate program at BSU • Fully online, self-support program • Served over 1,200 unique students last year • Interdisciplinary partnerships with Math, Engineering, Geoscience, Nursing, Psychology, Literacy, Athletics • Partnerships with iNACOL, AECT, ISTE, Google, Stanford, IDLA, Connections Academy, K12, Inc., ID State Department of Education, Discovery Education, Nicolaus Copernicus University, Poland • First dual degree program – National University of Tainan • Save 200+ tons of CO2 emissions annually
    4. Image created using wordle: http://www.wordle.net/
    5. Going Virtual! Research Series
    6. Going Virtual! Research Series. 2007: The Status of Professional Development • Who delivered/received PD? • When and how PD was delivered? • Content and sequence of PD? 2008: Unique Needs and Challenges • Amount of PD? • Preferred delivery format? • Most important topics for PD? 2009: Effective Professional Development of K-12 Online Teachers • Program evaluations • Complexities of measuring “effectiveness” 2010: The Status of PD and Unique Needs of K-12 Online Teachers • Revisit questions from 2007 & 2008 • What PD have you had? What do you need? 2011: Development of an Educational Data Mining model • Pass Rate Predictive Model • Engagement • Association Rules
    7. Going Virtual! Research Series (diagram). Descriptive phase: Going Virtual! 2007, 2008, and 2010. Respondent figures as shown on the slide: 258 respondents; 884 K-12 online teachers; 830 K-12 online teachers; 167 K-12 online teachers; 61 administrators; 14 trainers; 727 virtual schools; 417 virtual school; 99 supplemental; 318 supplemental; 81 blended; 54 brick and mortar; 12 brick and mortar; over 40, over 50, and over 60 virtual schools and online programs; over 30 states (two surveys); over 40 states & 24 countries. Evaluative phase: Going Virtual 2011, with data mining • Online Teacher PD Workshops • Online Graduate Courses • End of Year Program Evaluation. Program types: Traditional • Virtual Charter • Supplemental. Goals: • Program evaluation • Develop cloud-based, real-time Decision Support System (DSS) • Link PD effectiveness to student outcomes
    8. Traditional Evaluation Systems (diagram). Columns: Teacher Effectiveness, Student Outcomes, Program Performance. Items shown: Highly qualified?; AYP?; Parent Satisfaction; Improved Test Scores; Participation; Annual Performance; Attendance; Range of implementation; ISAT/DWA; Student Satisfaction; Self-Efficacy; Satisfaction; Knowledge of STS
    9. Leveraging Data Systems (diagram). Columns: PD Effectiveness, Teacher Effectiveness, Student Outcomes. Items shown: Change in teaching practice; Quality; Satisfaction; Self report; Quantity AND Quality of Interaction; Usefulness; Engagement; Course Design; Dropout Rate; Performance; Learning Patterns; Low-level data
    10. Data Mining: Data mining techniques can be applied in online environments to understand hidden relationships between logged activities, learner experiences, and performance. It can be used in education to track learner behaviors, identify struggling students, depict learning preferences, improve course design, personalize instruction, and predict student performance.
    11. Educational Data Mining: Special Challenges • Learning behaviors are complex • Target variables (learning outcomes/performance) require a wide range of assessments and indicators • Goal of improving online teaching and learning is hard to quantify • Limited number of DM techniques suitable to meet educational goals • Only interactions that occur in the LMS can be tracked through data mining. What if learning occurs outside the LMS? • Still a very intensive process to identify rules and patterns
    12. DM Applications in Education • Pattern discovery (data visualization, clustering, sequential path analysis) – Track students’ learning progress – Identify outliers (outstanding or at-risk students) – Depict students’ learning preferences (learner profiling) – Identify relationships of course components (web mining) • Predictive Modeling (decision tree analysis) – Suggest personalized activities (classification prediction) – Foresee student performance (numeric prediction) – Adaptive evaluation system development • Algorithm generation: analysis methods can be integrated into platforms.
    13. Data Preprocessing • Data Collection • Data Cleaning • Session Identification • Behavior Identification
    14. Data Transformation
    15. 3 Data Mining Studies • Study #1: Teacher Training Workshops 2010 – Survey Data + Data Mining + Student Outcomes • Study #2: Graduate Courses 2010 – Data Mining + Student Outcomes (no demographic data) • Study #3: End of Year K-12 Program Evaluation (2009 – 2010) – Data Mining + Student Outcomes + Demographic Data + Survey Data
    16. Study #1: Teacher Training Workshops 2010 • Survey Data + Data Mining + Student Outcomes • Research Goal: To demonstrate the potential applications of data mining with a case study – Program evaluation of workshop quality for continuous improvement of design and delivery. – Evaluation of PD impact on both teachers (and students).
    17. Study #1: Teacher Training Workshops 2010 • Blackboard • 103 participants • 31,417 learning logs • Clustering analysis, sequential association analysis, and decision tree analysis • Engagement variables – Frequency of logins – Length of time online (survey and DM) – Frequency of content access – Number of discussion posts
    18. Learning Paths • Association Rule Analysis – Participants tended to switch between content and discussion within one session. – Different types of interactions (content-participant, participant-instructor, and participant-participant) were well facilitated in the workshops overall.
    19. Performance: Pass Rate Predictive Model • Decision Tree Analysis – Improved grades and pass rate (from 88% to 92% and 89% to 94% respectively) when participants logged into the LMS more than 10 times over six weeks. The average for both is further improved to 98% when frequency of logins increased to 17 times. Increased logins = Increased performance
    20. Quality of Experience: Engagement • Clustering + Survey Questions – More time spent online = more time spent offline. – Previous online teaching experience = more hours spent both online and offline.
    21. DM Conclusions • Interaction and engagement were important factors in learning outcomes. • The results indicate that the workshops were well facilitated, in terms of interaction. • Participants who had online teaching experience could be expected to have a higher engagement level but prior online learning experience did NOT show a similar relationship. • There is a direct relationship between the amount of time learners spent online and their average course logins to engagement and performance. Specifically, more time spent online and a higher frequency of logins equates to increased engagement and improved performance.
    22. Overall Conclusions • Two factors influenced expectation ratings: – Practical new knowledge – Ease of locating information • Three factors influenced satisfaction ratings: – Usefulness of subject-matter – Well-structured website – Sufficient technical supports • Instructor quality was related to: – Stimulated interest – Preparation for class – Respectful treatment of students – Peer collaboration – Assessments aligned to course objectives – Support services for technical problems
    23. Study #2: Graduate Courses 2010 • Data Mining + Student Outcomes (no demographic data) • Research Goal: To demonstrate the potential applications of data mining with a case study – Generate personalized advice – Identify struggling students – Adjust teaching strategies – Improve course design – Data Visualization • Study Design – Comparative (between and within courses) – Random course selection
    24. Study #2: Graduate Course 2010 • Moodle • Two graduate courses (X and Y) • Each with two sections – X1 (18 students) – X2 (19 students) – Y1 (18 students) – Y2 (22 students) • 2,744,433 server logs
    25. Study #2: Graduate Course 2010 • Variables – IDs (user and session) – Learning Behaviors (reading materials, posting disc.) – Time/duration – Grades or pass/fail (dependent variables)
    26. 26. Learner BehaviorsWeekday Student Patterns Weekday Course Patterns
    27. 27. Weekday and Time Patterns of Learning Behaviors• Reading is the major activity; Similar patterns• Sunday => reply discussions• Monday & Tuesday, between 1pm and midnight
    28. 28. Shared Student Characteristics Course X
    29. 29. Shared Student Characteristics Course Y
    30. 30. Learner Behaviors
    31. 31. Predictive Analysis – Course XDiscussion board posts andreplies were the mostimportant variable forpredicting performance(27+ replies = betterperformance)Some lower performershad high reply numbers (>43)Cluster analysis revealedthat students tended toonly read discussions.
    32. 32. Predictive Analysis – Course YNumber of discussionboard posts read was themost important predictor ofperformance (378+ =better performance)Fewer discussions read +more replies (54+ = betterperformance)The design of course Yimproved the quality ofdiscussions and influencedstudent behaviors.
    33. 33. Study #3: End of Year K-12 Program Evaluation• Demographics + Survey Data + Data Mining + Student Outcomes• Research Goal: Large scale program evaluation – How can the proposed program evaluation framework support decision making at the course and institutional level? – Identify key variables and examine potential relationships between teacher and course satisfaction, student behaviors, and student performance outcomes
    34. 34. Study #3: End of Year K-12 Program Evaluation (2009 – 2010) • Blackboard LMS • 7500 students • 883 courses • 23,854,527 learning logs (over 1 billion records)
    35. 35. Total Variables = 22: stuID, Age, City, District, Grade_Avg, Click_Avg, Content_Access_Avg, Course_Access_Avg, Page_Access_Avg, DB_Entry_Avg, Tab_Access_Avg, Login_Avg, Module_Avg, Gender, HSGradYear, School, No_Course, No_Fail, No_Pass, Pass rate, cSatisfaction_Avg, iSatisfaction_Avg
    36. 36. Engagement• Average frequency of logins per course• Average frequency of tabs accessed per course• Average frequency of modules accessed per course• Average frequency of clicks per course• Average frequency of courses accessed (from the Blackboard portal)• Average frequency of pages accessed per course (page tool)• Average frequency of course content accessed per course (content tool)• Average number of discussion board entries per course
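Each engagement measure listed on this slide is a per-course average for a student. A minimal sketch of that calculation, assuming a hypothetical `(student_id, course_id, event_type)` row layout rather than the actual Blackboard log schema:

```python
from collections import defaultdict

# Hypothetical event rows: (student_id, course_id, event_type).
events = [
    ("s1", "math", "login"), ("s1", "math", "login"), ("s1", "math", "click"),
    ("s1", "bio", "login"),
    ("s2", "math", "login"),
]

def per_course_average(rows, event_type):
    """Average count of one event type per course, for each student."""
    counts = defaultdict(lambda: defaultdict(int))
    courses = defaultdict(set)
    for student, course, kind in rows:
        courses[student].add(course)  # every course the student touched
        if kind == event_type:
            counts[student][course] += 1
    return {
        s: sum(counts[s].values()) / len(courses[s])
        for s in courses
    }

login_avg = per_course_average(events, "login")
```

Here the denominator is every course in which the student has any logged event. Whether the study divided by enrolled courses or by active courses is not stated, so that choice is an assumption.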
    37. 37. Cluster Analysis - by Student Spring 2010
    38. 38. Cluster Analysis - by Student• High engagement = high performance• The optimal number of courses = 1 to 2 per semester• Older students (age > 16.91) tended to take more than two courses with pass rates ranging from 54.09-56.11%• High-engaged students demonstrated engagement levels twice that of low-engaged students• Female students were more active than male students in online discussions (with higher DB_Entry avg frequency)• Female students had higher pass rates than male students
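The slides do not name the clustering algorithm, so the following uses k-means purely as an illustrative stand-in. It is a minimal sketch that splits students into low-engaged and high-engaged groups on a single hypothetical measure (average clicks per course); the values are invented.

```python
def two_means(values, iterations=20):
    """1-D k-means with k=2, seeded at the minimum and maximum values."""
    low, high = min(values), max(values)
    low_group, high_group = list(values), []
    for _ in range(iterations):
        # Assign each value to the nearer centroid (ties go to the low group).
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        if not high_group:
            break  # all values identical: only one cluster exists
        new_low = sum(low_group) / len(low_group)
        new_high = sum(high_group) / len(high_group)
        if (new_low, new_high) == (low, high):
            break  # centroids stopped moving
        low, high = new_low, new_high
    return low_group, high_group

# Hypothetical average clicks per course for six students.
clicks = [10, 12, 14, 30, 34, 38]
low_engaged, high_engaged = two_means(clicks)
ratio = (sum(high_engaged) / len(high_engaged)) / (sum(low_engaged) / len(low_engaged))
```

The ratio of the two group means is the quantity behind a statement such as "engagement levels twice that of low-engaged students". The study clustered on many variables at once, which this one-dimensional sketch does not attempt.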
    39. 39. Cluster Analysis – by Course• The lowest performing courses identified (Math, Science, and English) were analyzed with cluster analysis.• High-engaged + high performance = good design and good implementation?• High-engaged + low performance = bad design and good implementation?• Low-engaged + low performance = bad design and bad implementation?
    40. 40. Cluster Analysis – by Course• Subject areas in which the level of activity was consistent with student outcomes: – High Performance and High Engagement = Driver Education, Electives, Foreign Language, Health, and Social Studies – Low Engagement and Low Performance = English• Subject areas in which the level of activity was inconsistent with student outcomes: – High Engagement and Low Performance = Math and Science. Why?
    41. 41. Cluster Analysis – by Course• Regardless of the content area or level of engagement, low performance courses were entry-level.• Most high-engaged, high performance courses were advanced level courses.• Regardless of Math, Science, or English subject matter, entry-level courses tended to have lower performance whether students were categorized as low-engaged or high-engaged.• The reasons students enrolled in a course may influence their engagement level and performance. Student survey responses indicated that students who retook courses they had previously failed tended to demonstrate lower engagement and lower performance.
    42. 42. Predictive Analysis – Pass Rate• Positive correlation between engagement level and performance (higher engaged => higher performance)• Engagement level and gender have stronger effects on student final grades than age, school district, school, and city. For most students, high engaged => high performance• Overall, female students performed better than male students• Students who were around 16 years old or younger performed better than those who were 18 years or older.• Compared with other Blackboard components such as discussion board entries and content access, tab access had negative effects on student performance (higher tab access => lower performance)
    43. 43. Predictive Analysis – Course Satisfaction• Students with higher average final grades (> 73.25) had higher course satisfaction.• Students who passed all courses or passed some of their courses had higher course satisfaction than all-failed students.• Students who took two or more courses in Spring 2010, whether they passed those courses or not, had higher course satisfaction.• Female students had higher course satisfaction than male students.• Online behaviors (i.e., frequency of page accessed and number of discussion board entries) had minor effects on course satisfaction (higher frequency/number => higher course satisfaction).
    44. 44. Predictive Analysis – Instructor Satisfaction• Students with higher average final grades (> 73.25%) indicated higher instructor satisfaction.• Students who took two or more courses in Spring 2010, whether they passed those courses or not, showed higher instructor satisfaction.• Female students indicated higher instructor satisfaction than male students.• Online behaviors (frequency of module accessed) had minor effects on instructor satisfaction (higher frequency => higher instructor satisfaction).• Older students (> 17.5 years old) had higher instructor satisfaction.
    45. 45. Regression Analysis• Spring 2010 – Survey data + Data Mining• Purpose: To identify which variables contributed significantly toward students’ average final grade.• Positive (higher values, higher average final grade) – Self-reported GPA (Likert-scale type of response) – Satisfaction toward positive experience (Likert-scale type of response) – Satisfaction toward course content (Likert-scale type of response) – Time on coursework (Likert-scale type of response) – Course access (based on LMS server log data)• Negative (higher values, lower average final grade) – Effort and challenge (based on Likert-scale type of response on the survey) – Tab access (based on LMS server log data)
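The positive and negative groupings on this slide refer to the sign of each variable's coefficient. As a simplified illustration only, the sketch below fits a one-predictor least-squares slope for two variables separately; the study itself would have used a multiple regression, and all numbers here are invented.

```python
def ols_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

# Hypothetical data: grades observed at four levels of each predictor.
access_level = [1, 2, 3, 4]
grades_by_course_access = [60, 70, 75, 90]  # grade rises with course access
grades_by_tab_access = [90, 85, 80, 70]     # grade falls with tab access

course_slope = ols_slope(access_level, grades_by_course_access)
tab_slope = ols_slope(access_level, grades_by_tab_access)
```

A positive slope corresponds to the "higher values, higher average final grade" group and a negative slope to the "higher values, lower average final grade" group.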
    46. 46. Conclusions• Higher-engaged students usually had higher performance, but only in courses that were well designed and implemented. In this study, entry-level courses tended to have lower performance whether students were categorized as low-engaged or high-engaged.• Satisfaction and engagement levels could not guarantee high performance.
    47. 47. Characteristics of successful students• Female• 16.5 years or younger• Took one or two courses per semester• Took Foreign Language or Health course• Lived in larger cities
    48. 48. Characteristics of at-risk students• Male• 18 years or older• Took more than two courses per semester• Took entry-level courses in Math, Science, or English• Lived in smaller cities
    49. 49. We are looking for partners
