Education's Clarion Call: Strata, Santa Clara, 2013

  1. 1. EduDataScience: Teaming to Improve US Education with Big Data Science. Marie Bienkowski, SRI International. February 27, 2013, O’Reilly Strata, Santa Clara, CA
  2. 2. • Hour-long classes, “seat time” requirements
        • Students grouped by age
        • Lecture-based teaching
        • Paper textbook as primary learning resource; no cell phones in class
        • Small, delayed, and disconnected data: some testing feedback, reports (midterm, final), attendance, free-lunch eligibility
  3. 3. The US Dropout Factory
  4. 4. Deeply Digital Learning
     • Flipped classroom with online practice and homework via adaptive tutors
     • More engaging and inspiring 24/7 learning: games, projects, badges for competencies
     • Learners collaborate by ability, interest
     • Digital media/platforms for open or personalized learning
     • Data ecosystems including the Internet of Learning Things
  5. 5. K-12 In-School Time is 1 Million Minutes (http://life-slc.org)
     • Many orders of magnitude more learning data with digital learning: big data will be available
     • 55M K-12 students; 77M total in K-college
  6. 6. Analytics and Data Mining
     • Continuously improving courses, curricula, and apps
     • Continuous and stealth testing
     • Personalized, adaptive learning pathways, including recommended online learning resources
     • Support students to succeed with right challenge, right encouragement, and right engagement
     • Interactive data visualization systems (aka “dashboards”) for learners, teachers, leaders
  7. 7. Meeting the Data Science Challenge
  8. 8. Paradigms of Scientific Discovery
     • Empirical: started thousands of years ago
     • Theoretical: last few hundred years
     • Computational: last 30 to 40 years
     • Data Exploration (eScience)
     John Stamper, DataShop
  9. 9. EduDataScience is about Discovery
     • Automated assessment of student skill, mastery learning, efficient and effective learning (Corbett, 2001)
       – You can test students or watch them as they learn to see what they know
     • By discovering knowledge models automatically using data mining, student time can be used more effectively (Cen et al. 2008, Stamper et al. AIED 2011)
       – If you know what students need, you can give it to them and they will learn better
  10. 10. EduDataScience is about Discovery
      • Conducting research on disengaged behaviors (McQuiggan, Rowe, Lee, & Lester 2008; Rowe, McQuiggan, Robison, & Lester 2009) led to tightened and improved narrative, leading to positive learning outcomes (Rowe, Shores, Mott, & Lester 2011)
        – Students can learn from games that have stories
  11. 11. EduDataScience is about Discovery
      • By automatically detecting when students “game the system” (cf. Baker et al., 2004; Walonoski & Heffernan, 2006a; Johns & Woolf, 2006), it was possible to build automated interventions that reduce gaming and improve learning (Baker et al., 2006; Walonoski & Heffernan, 2006b; Arroyo et al., 2007)
        – You can tell when students cheat and make them stop
      Examples courtesy Ryan S.J.d. Baker
  12. 12. Panelists
      • Zachary Pardos, MIT
      • Jace Kohlmeier, Khan Academy
      • Sharren Bates, inBloom
      Learn more! Zach and I will hold Office Hours tomorrow at 10am
  13. 13. The Data Zeitgeist in Education and the Discoveries We Need to Succeed. Zachary A. Pardos, Ph.D.
  14. 14. The Data Zeitgeist in Education
      • Impetus to use technology and data to reform education
      • Growth of computer tutoring systems
      • The same classroom paradigm has existed for centuries
      • Data has been used in almost all other industries to optimize outcomes
        – Bioinformatics
        – Financial analysis
        – Statistical methods in particle physics
      • Why not education?
      UMAP 2011. Zach Pardos, Strata 2013, Santa Clara, CA, February 27th, 2013
  15. 15. The Data Zeitgeist in Education
      • Impetus to use technology and data to reform education
      • Growth of educational-technology systems
      • Major increases in funding
  16. 16. The Data Zeitgeist in Education
      • Using technology and data to reform education
      • Growth of educational-technology systems
      • Major increases in funding
      • Produces the Cognitive Tutor
        – Used by over 600,000 students per year
        – Recently acquired by the Apollo Group for $75m
        – Apollo Group owns University of Phoenix
          • Largest online university (500k students)
  17. 17. The Data Zeitgeist in Education
      • Using technology and data to reform education
      • Growth of educational-technology systems
      • Major increases in funding
      • Has tripled its daily student usage every year
      • Was in the running for part of a $4.35b federal initiative to reform education in MA
  18. 18. The Data Zeitgeist in Education
      • Using technology and data to reform education
      • Growth of educational-technology systems
      • Major increases in funding
      • National standardized test being deployed in the 2014-2015 school year
        – Two versions of the test
        – One will be computer adaptive
        – Tens of millions of students’ data per year
        – Districts, States will be seeking big data solutions
  19. 19. The Data Zeitgeist in Education
      • Using technology and data to reform education
      • Growth of educational-technology systems
      • Major increases in funding
      • Started with Stanford AI course
      • Nearly 3m registrants since 2011
      • 100s of college courses (growing)
  20. 20. • Joint venture between MIT and Harvard to build a platform to host massive open-access online college courses (MOOCs)
          • Additional universities joining steadily
          • High enrollments (30k-154k)
  21. 21. The data
      Student participation
      • 154,000 enrolled
      • 108,000 entered class
      • 7,000 received certificate
      Course components
      • 434 lecture videos
      • 37 homework problems
      • 105 lecture problems
      • 1009 book pages
      • 14 labs
      • 145 tutorial videos
      • 2 exams
      [Figure: course interface]
  22. 22. The data
      The Approach
      - adapt a Bayesian model of learning
      - hypothesize that resources influence learning
      - see if hypothesis generalizes to new students
      What resources are working?
      - post-tests are too far apart
      - prediction of performance alone not adequate
      - in need of a model of learning
      Knowledge Tracing: Model Parameters
      • P(L0) = Probability of initial knowledge
      • P(T) = Probability of learning
      • P(G) = Probability of guess
      • P(S) = Probability of slip
      Nodes representation
      • K = knowledge node
      • Q = question node
      Node states
      • K = two state (0 or 1)
      • Q = two state (0 or 1)
      [Figure: Knowledge Tracing diagram with resource labels {video}, {book}, {answer}; parameters P(L0), P(T), P(T) along a chain of knowledge nodes K K K; question nodes Q Q Q with P(G) and P(S); question responses 0 1 1]
      (Pardos et al., Educational Data Mining, 2013 (under review))
  23. 23. Other factors in learning
      • Summarizing student affect over two school years by analyzing tutor log data
      • Correlated to State Test Outcome
      • Positive correlation: Frustration, Concentration, Confusion (while receiving tutor help)
      Pardos, Baker et al. (Learning Analytics & Knowledge, 2013)
  24. 24. Exploring interaction of other factors
      • Can non-cognitive contextual information about the student help explain efficacy?
      • In order to investigate many factors, we need to be looking beyond a single course of data.
      • Live analysis of efficacy trends
      Knowledge Tracing: Model Parameters
      • P(L0) = Probability of initial knowledge
      • P(T) = Probability of learning
      • P(G) = Probability of guess
      • P(S) = Probability of slip
      Nodes representation
      • K = knowledge node
      • Q = question node
      Node states
      • K = two state (0 or 1)
      • Q = two state (0 or 1)
      [Figure: Knowledge Tracing diagram with context labels {confused}, {confused} and resource labels {video}, {book}; parameters P(L0), P(T), P(T) along a chain of knowledge nodes K K K; question nodes Q Q Q with P(G) and P(S); question responses 0 1 1]
  25. 25. What We Need
      • Increased capability in analyzing continuous streams of big data
      • Operationalizing learner analytics
      • Problem solvers who want to make an impact
      Join us! Zach Pardos
  26. 26. Jace @derandomized
  27. 27. Big problems…
      • >1,000,000,000 school-aged children around the world
      • 142,800,000 school-aged children not in school
      • 25% of US college freshmen need remedial classes, costing $3 billion annually
      • Only 85% of primary school students worldwide graduate from primary school
      Statistics from UNESCO Institute for Statistics (UIS); National Center for Education Statistics; Complete College America
  28. 28. … big data
      • >400 million lessons delivered
      • 60 million users to date
      • >1 billion problems answered
      • >5 million unique users / month
      • 216 countries
      • 15,000 classrooms around the world
      [Figure: Cumulative visits to Khan Academy (Millions)]
  29. 29. Hard problems…
      • Thoughtful measurement of learning
      • Multidimensional objectives (time, breadth, depth)
      • Engagement-productivity tension
      • Sophisticated modeling needs
  30. 30. … hard work.
      • Thoughtful measurement of learning => great assessments
      • Multidimensional objectives (time, breadth, depth) => user goals
      • Engagement-productivity tension => game mechanics
      • Sophisticated modeling needs => AI/machine learning + phenomenal researchers + brute force
  31. 31. Analytics stack
      • Google App Engine
      • Amazon Web Services (S3, EC2, EMR, Hive)
      • Python (NumPy, SciPy, scikit-learn)
      Open source at:
  32. 32. Small team. Huge scale. In the last year, 24 employees reached 43 million unique students. [Images: “Our team”; “Our users in 216 countries”]
  33. 33. Jace Kohlmeier @derandomized. A free, world-class education for anyone, anywhere
  34. 34. Making Big Data Work in K-12
  35. 35. What Data? For what purpose?
      • Big Data not yet working in K-12
      • At the policy level, collecting information about student background and achievement has become a once-a-year practice among schools, districts, and state education organizations
      • This has given great insight into achievement gaps and underserved communities and schools
      • However, it’s difficult to connect those problems to data-driven solutions
      • Individual companies and research institutions have advanced the field of learning analytics but only with intense, expensive research efforts
  36. 36. Enabling Great Teaching and Learning with Data
      • For teachers, differentiated or customized instruction is a common goal
      • Teachers are expected to understand exactly what each of their students needs, and to discover and successfully deliver those educational experiences across a student population of up to 200 kids a day
      • As tech professionals, we all can think of zillions of opportunities for data-driven tools to support these instructional processes
        – Dashboards and data analysis tools
        – Recommendation engines
        – Early warning systems
        – Communication tools
        – Dynamic scheduling
        – Teacher development
        – x 1 zillion
      • Big Data should be powering personalized learning at scale, helping teachers, students, and families to pursue the best possible learning opportunities for the best possible education and life outcomes
  37. 37. Current State and Complicating Factors
      • While there are innovative products available, it is incredibly difficult for education agencies to successfully implement them with a product portfolio approach
      • State and school district customers don’t always know how to successfully map instructional processes to requirements, set expectations for continuous improvement, select tools to successfully support process, and insist on future-friendly data and network infrastructure
      • Why?
        – $
        – Capacity
        – High-risk regulatory framework
        – Highly structured budgets and contract requirements
        – Expense of one-off data integrations
        – Existing large-footprint software bundles that address multiple processes
        – Legal requirements around evaluating teachers
        – Complicated relationships between school districts and states
  38. 38. Meanwhile in Classrooms
      • This leaves teachers in one of two bad scenarios:
        – Limited set of tools where the district has not made an investment
        – Large set of high-quality tools that do not interoperate, making it nearly impossible to use the tools successfully
      • It’s even worse for students:
        – Students with access to tech at home experience a huge difference in how they use tools and strategies in and outside of the classroom
        – Students without access to tech at home miss out on whole new ways to experience the world
      • Personalized Learning remains a theoretically good idea that can’t get to scale
        – Missed opportunity of months of classroom time spent reviewing last year’s subject matter to figure out where kids are
        – Thriving kids unable to push farther than their classroom curriculum
        – Struggling kids not making the progress they need in order to succeed
  39. 39. inBloom and Big Data
      • inBloom supports the K-12 community’s move towards great data-driven tools for classroom use built on an interoperable data and content architecture
      • We support states and districts who are taking a more process- and quality-based approach to launching tech initiatives
      • Our success is determined by the success of partners: software providers who launch great data-driven tools
      • If the learning applications and tools our students, teachers, and families use together can get all the data they need to be successful and report back their outcomes, K-12 can finally join the big data movement
      • Big Data = personalization of education opportunities, continuous improvement of tools and strategies, improved student outcomes
  40. 40. Find Out More
      • inBloom Strata Booth
      • @sharrensharren
      • SXSW EDU NEXT WEEK
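
Note on slide 22: the Knowledge Tracing model there lists four parameters, P(L0), P(T), P(G), and P(S), over two-state knowledge (K) and question (Q) nodes. A minimal Python sketch of the standard Bayesian Knowledge Tracing update follows. It assumes, and this is an assumption the slides do not confirm, that each resource label ({video}, {book}, {answer}) gets its own learning probability P(T). The names `BKTParams`, `predict_correct`, `update_on_answer`, `apply_learning`, and `trace`, and every numeric value, are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class BKTParams:
    p_l0: float                 # P(L0): probability of initial knowledge
    p_guess: float              # P(G): probability of a correct guess when not known
    p_slip: float               # P(S): probability of a slip when known
    p_learn: Dict[str, float]   # P(T) per resource type (assumed, one value per label)


def predict_correct(p_known: float, params: BKTParams) -> float:
    """Probability that the next question is answered correctly."""
    return p_known * (1 - params.p_slip) + (1 - p_known) * params.p_guess


def update_on_answer(p_known: float, correct: bool, params: BKTParams) -> float:
    """Bayes update of P(K = 1) after observing a response (Q = 1 or Q = 0)."""
    if correct:
        known_and_obs = p_known * (1 - params.p_slip)
        unknown_and_obs = (1 - p_known) * params.p_guess
    else:
        known_and_obs = p_known * params.p_slip
        unknown_and_obs = (1 - p_known) * (1 - params.p_guess)
    return known_and_obs / (known_and_obs + unknown_and_obs)


def apply_learning(p_known: float, resource: str, params: BKTParams) -> float:
    """Transition step: an unknown skill becomes known with probability P(T)."""
    return p_known + (1 - p_known) * params.p_learn[resource]


def trace(events: List[Tuple[str, Optional[bool]]], params: BKTParams) -> List[float]:
    """Run the model over (resource, response) events.

    The response is None for resources with no question attached (video, book).
    Returns the estimated P(K = 1) after each event.
    """
    p_known = params.p_l0
    history = []
    for resource, correct in events:
        if correct is not None:
            p_known = update_on_answer(p_known, correct, params)
        p_known = apply_learning(p_known, resource, params)
        history.append(p_known)
    return history


if __name__ == "__main__":
    # Illustrative values only; real parameters would be fit to student data.
    params = BKTParams(
        p_l0=0.3, p_guess=0.2, p_slip=0.1,
        p_learn={"video": 0.10, "book": 0.05, "answer": 0.15},
    )
    events = [("video", None), ("answer", True), ("book", None), ("answer", False)]
    for (resource, correct), p in zip(events, trace(events, params)):
        print(resource, correct, round(p, 3))
```

Comparing fitted per-resource P(T) values is one way to ask the slide's question, "What resources are working?", though fitting the parameters (for example with expectation maximization) is outside this sketch.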