
Data Science Reinvents Learning?

Presented 2015-08-24 at SF Bay ACM, held at the eBay south campus in San Jose.

http://meetup.com/SF-Bay-ACM/events/221693508/

Project Jupyter https://jupyter.org/ evolved from IPython notebooks and now supports a wide variety of programming language back-ends. Notebooks have proven to be effective tools in Data Science, providing a convenient package for what Don Knuth called "literate programming" in the 1980s: code plus exposition in markdown. Results of running the code appear in-line as interactive graphics, all packaged as collaborative, web-based documents. Some have said that the introduction of cloud-based notebooks is nearly as fundamental a change in software practice as the introduction of spreadsheets.

O'Reilly Media has been considering the question, "What comes after books and video?" Or, as one might imagine more pointedly, what comes after Kindle? To that end we have collaborated with Project Jupyter to integrate notebooks into our content management process, allowing authors to generate articles, tutorials, reports, and other media products as notebooks that also incorporate video segments. Code dependencies are containerized using Docker, and all of the content gets managed in Git repositories. We have added another layer, an open source project called Thebe that provides a kind of "media player" for embedding the containerized notebooks into web pages.

Published in: Education

Data Science Reinvents Learning?

  1. 1. 2015-08-24 • San Jose Paco Nathan, @pacoid
 Director, O’Reilly Learning Data Science Reinvents Learning? Beyond Gutenberg and Erasmus meetup.com/SF-Bay-ACM/events/221693508/
  2. 2. 2 Some Background… • O’Reilly Learning: you may only hear about us in 
 a few instances, if we do our job well; ACM is a great forum for this discussion • prior: built out the community evangelism and training program for Apache Spark at Databricks • prior: led Data teams for several years, working on 
 large-scale ML apps in industry, including: one of the largest Hadoop instances running in AWS (2008); 
 one of the first 100% AWS system architectures (2006) • … • ancient prior: Stanford CSD teaching fellowship (1984-86, Alice Supton, Stuart Reges) peer-teaching CS course which later became Residential Computing
  3. 3. WWSVD?
  4. 4. 4 Intro Quite candidly, the one common catch phrase 
 in Silicon Valley that I find most terrifying: “It’s like Uber, for ___”
  5. 5. 5 Intro Ostensibly that leads to a question, how might 
 an “Uber for Education” look?
  6. 6. 6 Intro Ostensibly that leads to a question, how might 
 an “Uber for Education” look? a) Similar to Cthulhu, we might regret actually seeing that
  8. 8. 8 Intro Ostensibly that leads to a question, how might 
 an “Uber for Education” look? a) Similar to Cthulhu, we might regret actually seeing that b) Would we really need that anywho?
  9. 9. 9 Intro Ostensibly that leads to a question, how might 
 an “Uber for Education” look? a) Similar to Cthulhu, we might regret actually seeing that b) Would we really need that anywho? c) Uber itself might not take that approach …
  10. 10. 10 Intro Ostensibly that leads to a question, how might 
 an “Uber for Education” look? a) Similar to Cthulhu, we might regret actually seeing that b) Would we really need that anywho? c) Uber itself might not take that approach … Perhaps “Uber for Learning” might be somewhat
 more apt? In any case, what comes after Books, Kindle, MOOCs?
  11. 11. 11 Some Definitions… “Learning” ergo… “Education” ergo… “School”
  12. 12. “Learning” ergo… “Education” ergo… “School” X 12 Some Definitions… Schools are great to have… If you need a school, pick a 
 good one and go To be clear, we’re not a school
  13. 13. 13 Some Definitions… Even the best schools these days question
 what they will become in 5-10 years Not-so-best schools are perhaps questioning 
 much more than that
  14. 14. 14 Some Definitions… Oh BTW, too many (funded) teams seem to 
 have this mediocre idea for “education”: 1. assessment: collect test scores ➜ 2. define “quantified student” ➜ 3. reuse online marketing funnel ad-tech ➜ 4. invoke agile coding teams ➜ 5. ship mobile/cloud-based SaaS platform ➜ 6. ... 7. profit
  15. 15. Oh BTW, too many (funded) teams seem to have this mediocre idea for “education” 1. assessment: collect test scores 2. define “quantified student” 3. reuse online marketing funnel ad-tech 4. invoke agile coding 5. ship a mobile/cloud-based SaaS platform 6. ... 7. profit 15 Some Definitions… LMS
  16. 16. K-12 not so much, except perhaps in the case of Safari for Schools undergrad textbooks? graduate textbooks, conferences? professional focus of our audience 16 Some Definitions…
  17. 17. 17 • vocational: 
 making a career move • aspirational: 
 improvement within a career path • proficiency: 
 has a specific pain-point, needs to resolve it • familiarity: 
 wants to join in a team dialog about a topic, 
 e.g., conversational programmer Learner Personas for professional category
  18. 18. What about MOOCs?
  19. 19. 19 What about MOOCs? Massive Open Online Courses – 
 seven year trend, beginning with: Connectivism and Connective Knowledge
 George Siemens, Stephen Downes
 University of PEI (2008)
 http://cck11.mooc.ca/
  20. 20. 20 What about MOOCs?
  21. 21. 21 What about MOOCs? Anthony Joseph
 UC Berkeley early Jun 2015 edx.org/course/uc-berkeleyx/uc-berkeleyx-cs100-1x-introduction-big-6181 Ameet Talwalkar
 UCLA late Jun 2015 edx.org/course/uc-berkeleyx/uc-berkeleyx-cs190-1x-scalable-machine-6066
  22. 22. 22 What about MOOCs? Pros: • cost-effective to reach a large audience • popular with students • ¿ addresses “train the trainers” bottleneck ? Cons: • expensive to produce and curate • most students are sampling • low completion rates • somewhat chaotic • lecture fatigue • ¿ reinforces advantage of the elites ?
  23. 23. 23 What about MOOCs? Online education: MOOCs taken by educated few
 Ezekiel Emanuel, Nature 503, 342 (2013-11-21) • 80% students already have an advanced degree • 80% come from the richest 6% of the population Michael Shanks @Stanford: retrenchment around traditional disciplines will make disparities even more pronounced An Early Report Card on Massive Open Online Courses
 Geoffrey Fowler, WSJ (2013-10-08) Amherst, Duke, etc., have rejected edX see: Open edX Universities Symposium @GWU, 2015-11-11
  24. 24. 24 • search engines surface too many choices 
 among the available learning content • we must get people wanting to interact with the material – generally due to social context • academe strives to decontextualize, which 
 is the opposite of learning in context • how do we recognize that learning has occurred? • what is the learning promise? What about MOOCs?
  25. 25. Examples for Consideration
  26. 26. 26 Introduction to Robotics Peter Corke @QUT https://moocs.qut.edu.au/learn/introduction-to-robotics-august-2015 • effective use of peer review for scaling • worked well reaching into Africa, India Peer Review
  27. 27. 27 Effective Thinking Through Mathematics Michael Starbird @UT/Austin https://www.edx.org/course/effective-thinking-through-mathematics-utaustinx-ut-9-01x • getting students to articulate their epiphany moments is more interesting 
 than other results – Donna Kidwell Epiphany Moments
  28. 28. 28 Caltech Offers Online Course with 
 Live Lectures in Machine Learning Yaser Abu-Mostafa (2012-03-30) http://www.caltech.edu/news/caltech-offers-online-course-live-lectures-machine-learning-4248 • significant improvement through the use of “flipped” a.k.a. inverted classrooms Inverted Classrooms
  29. 29. 29 Scalable Learning
 David Black-Schaffer @Uppsala
 Sverker Janson @KTH SICS https://www.scalable-learning.com/ • active learning: Flipped Classroom and Just-in-Time Teaching • exams built directly into specific diagrams within videos • metrics for where in video+code students get stuck • instructor can customize subsequent classroom discussions 
 (active teaching phase) based on stuck/unstuck metrics Inverted Classrooms
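A minimal sketch of how "stuck" metrics of the kind described on this slide might be computed. It is not Scalable Learning's implementation: the event format (learner id, video second, event kind), the use of rewinds as a proxy for being stuck, and the `stuck_points` helper are all assumptions made for illustration.

```python
from collections import defaultdict


def stuck_points(events, bucket_seconds=30, min_learners=2):
    """Return start times (in seconds) of video segments where at least
    `min_learners` distinct learners rewound (hypothetical stuck signal)."""
    learners_by_bucket = defaultdict(set)
    for learner_id, second, kind in events:
        if kind == "rewind":
            # Snap the timestamp to the start of its fixed-width segment.
            bucket = (second // bucket_seconds) * bucket_seconds
            learners_by_bucket[bucket].add(learner_id)
    return sorted(
        bucket
        for bucket, learners in learners_by_bucket.items()
        if len(learners) >= min_learners
    )


# Hypothetical playback events: (learner id, video second, event kind)
events = [
    ("ana", 95, "rewind"),
    ("ben", 100, "rewind"),
    ("ana", 110, "rewind"),
    ("cy", 400, "rewind"),
    ("ben", 20, "pause"),
]
print(stuck_points(events))
```

Counting distinct learners rather than raw events keeps one learner who rewinds repeatedly from dominating the signal; an instructor could then plan classroom discussion around the flagged segments.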
  30. 30. 30 How to Flip a Class 
 CTL @UT/Austin
 http://ctl.utexas.edu/teaching/flipping-a-class/how 1. identify where the flipped classroom model makes 
 the most sense for your course 2. spend class time engaging students in application activities with feedback 3. clarify connections between inside and outside 
 of class learning 4. adapt your materials for students to acquire course content in preparation of class 5. extend learning beyond class through individual 
 and collaborative practice Inverted Classrooms
  31. 31. 31 Learning programming at scale Philip Guo 
 O’Reilly Radar (2015-08-13) http://radar.oreilly.com/2015/08/learning-programming-at-scale.html • Python Tutor • Codechella Tutors could keep an eye on around 
 50 learners during a 30-minute session, 
 start 12 chat conversations, and 
 concurrently help 3 learners at once Collaborative Learning
  32. 32. 32 Data-driven Education and the Quantified Student Lorena Barba @GWU PyData Seattle 2015 https://youtu.be/2YIZ2SY9mW4 • keynote talk: abstract, slides • homepage If you study just one link in this entire talk…
  33. 33. Project Jupyter
  34. 34. 34 If by some bizarre chance you haven’t used 
 it already, go to https://jupyter.org/ • 50+ different language kernels • new funding 2015-07 • UC Berkeley, Cal Poly • nbgrader autograder by Jess Hamrick • jupyterhub multi-user server • curating a list of examples • repeatable science! see also:
 Teaching with Jupyter Notebooks
 http://tinyurl.com/scipy2015-education Project Jupyter
  35. 35. 35 Deploying JupyterHub for Education
 Jessica Hamrick
 Rackspace blog (2015-03-24)
 https://developer.rackspace.com/blog/deploying-jupyterhub-for-education/ Project Jupyter
  36. 36. 36 Literate Programming
 Don Knuth
 Univ of Chicago Press (1992)
 literateprogramming.com/ Instead of imagining that our main task is 
 to instruct a computer what to do, let us
 concentrate rather on explaining to human
 beings what we want a computer to do Evoking some earlier works…
  37. 37. 37 Most definitely check out CodeNeuro, both online and the conf/hackathon… Some great examples: Jeremy Freeman, HHMI Janelia Farm
 http://notebooks.codeneuro.org/ Matthew Conlen, NY Data Company
 http://lightning-viz.org/ Olga Botvinnik, UCSD
 http://yeolab.github.io/flotilla/docs/gallery/ Great Examples
  38. 38. 38 http://mybinder.org/ turn a GitHub repo into a collection 
 of interactive notebooks powered by Jupyter and Kubernetes Launch Vehicles
  39. 39. Jupyter, Thebe, Atlas, Docker
  40. 40. 40 Embracing Jupyter Notebooks at O'Reilly
 Andrew Odewahn
 O’Reilly Media (2015-05-07) https://beta.oreilly.com/ideas/jupyter-at-oreilly O’Reilly Media is using our Atlas platform 
 to make Jupyter Notebooks a first class authoring environment for our publishing program Jupyter, Thebe, Atlas, Docker, etc. Content Toolchain
  41. 41. 41 Embracing Jupyter Notebooks at O'Reilly Andrew Odewahn O’Reilly Media (2015-05-07) https://beta.oreilly.com/ideas/jupyter-at-oreilly O’Reilly Media is using our Atlas platform to make Jupyter Notebooks a first class authoring environment for our publishing program Jupyter Content Toolchain
  42. 42. 42 On Demand Analytic and Learning Environments with Jupyter
 Kyle Kelley, Andrew Odewahn
 lambdaops.com/jupyter-environments-odsc2015/ Exploring a couple themes, in particular: • computational narratives - exploratory data analysis - software development/collaboration - API exploration - technical papers - reports, exec dashboards • code-as-media - Thebe project, etc. Content Toolchain
  43. 43. 43 Personal experiences during 2012-2015 
 as an author and instructor… Just Enough Math
 Paco Nathan
 O’Reilly Media (2014)
 http://justenoughmath.com Content Toolchain
  44. 44. 44 Learnings based on working on this project with Kyle and Andrew… How to transition from roles of data scientist, software developer, engineering director – 
 into roles of author, teacher – and vice versa Content Toolchain
  45. 45. 45 Interactive notebooks: 
 Sharing the code Helen Shen Nature (2014-11-05) nature.com/news/interactive-notebooks-sharing-the-code-1.16261 Content Toolchain
  46. 46. 46 Content Toolchain Atlas is our content platform backed by Git, for project collaboration among authors, editors, et al. https://atlas.oreilly.com/
  47. 47. 47 Content Toolchain Thebe (a moon of Jupiter) provides a layer atop Jupyter that is needed for publishing, white-labeled content, etc. https://github.com/oreillymedia/thebe
  48. 48. 48 Content Toolchain Beta is our new site design: https://beta.oreilly.com/learning
  49. 49. 49 Content Toolchain Contrast our current talent workflow and this 
 new world of Jupyter+Docker+Thebe+cloud … How would it work with known successes such 
 as Head First? production presentation Thebe: player Jupyter: notebook Docker: container web page: interaction Git: versioning Atlas: publications various formats authoring cloud infra
  50. 50. Does Science begin with Phenomenology?
  51. 51. 51 Audience Patterns for Learning: ad-hoc
  52. 52. 52 Audience Patterns for Learning: architecture events inverted on-demand Mostly Synchronous Mostly Asynch Inverted Classroom Paywall Subscription Free Content
  53. 53. 53 The Learning Architecture: Defining Development and Enabling Continuous Learning David Mallon, Dani Johnson Bersin (2014-05-06) http://www.bersin.com/Practice/Detail.aspx?docid=17435&mode=search&p=Learning-@-Development This report is designed to help leaders 
 and talent development and learning 
 professionals to take positive steps 
 toward understanding and implementing 
 learning architectures Sidebar: Learning Architecture
  54. 54. Think of a favorite open source framework … who (or where) are the experts in this graph? Sidebar: Innovators vs. Experts Diffusion of Innovation
 Everett Rogers (1962)
 http://sphweb.bumc.bu.edu/otlt/MPH-Modules/SB/SB721-Models/SB721-Models4.html 54
  55. 55. 55 Building Blocks In software engineering, we rarely hand a 
 developer the spec for some app and say 
 “Start from scratch, then come back when you’re done.” Instead: • focus on MVP • leverage APIs, libraries, microservices, etc. • iterate on small, incremental changes • this allows for TDD, CI, etc. • plus, customer experiments ➜ data science Compare/contrast that with how publishers approach authors, speakers, instructors?
  56. 56. 56 Building Blocks Proposing a new format spec to replace 
 EPUB, MOBI, etc.: • video segments + transcripts • notebooks in Jupyter+Thebe+Docker • metadata (persona, topics, cues, etc.) • links to Git repos, Dat data • annotations atop existing content • webcast/livestream • social interaction (TA/mentoring) • evaluation modules • discourse analytics most reused across a spectrum of synchronous to async instrumented for experiments, analytics, iteration
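A rough sketch of what a manifest for such building blocks could look like. The slide proposes a format but gives no schema, so every field name here (`kind`, `personas`, `topics`, `repo_url`, `synchronous`), the `BuildingBlock` class, and the example URL are hypothetical and only illustrate the idea of metadata-tagged, reusable units.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class BuildingBlock:
    """One reusable unit of learning content (hypothetical schema)."""
    kind: str                                      # e.g. "video", "notebook", "evaluation"
    title: str
    personas: list = field(default_factory=list)   # e.g. ["proficiency"]
    topics: list = field(default_factory=list)
    repo_url: str = ""                             # link to a Git repo, if any
    synchronous: bool = False                      # live event vs. on-demand


def to_manifest(blocks):
    """Serialize a sequence of blocks to a JSON manifest string."""
    return json.dumps([asdict(b) for b in blocks], indent=2, sort_keys=True)


blocks = [
    BuildingBlock(kind="video", title="Intro segment", personas=["familiarity"]),
    BuildingBlock(
        kind="notebook",
        title="Worked example",
        personas=["proficiency"],
        topics=["linear algebra"],
        repo_url="https://example.com/hypothetical/repo.git",  # placeholder URL
    ),
]
manifest = to_manifest(blocks)
print(manifest)
```

Keeping the manifest as plain JSON means it could live in the same Git repository as the content it describes, which fits the versioning approach described earlier in the talk.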
  57. 57. 57 learner / topic / experience • total newbie ↔ good overview: Do you have sufficient familiarity with the topic? • utterly confused ↔ familiar territory: Can you build on familiarity with a related topic? • must get unstuck ↔ send pull request: Do you have necessary proficiency in the topic? • concise topic ↔ interdisciplinary: How many boundaries must you span to achieve structural literacy for this topic? • want to for myself ↔ have to for my job: What is your primary motivation to learn this topic? • bleeding edge ↔ COBOL 2020: Where are you on the "diffusion of innovation" curve w.r.t. the topic? • on-demand ↔ major event: How high is the transaction cost for the experience delivered to you? • "go read the code" ↔ full-team participation: Does the learning experience immerse you within a diverse, supportive social context? Dimensional Reduction Did we mention intense needs 
 for data analytics at scale?
  58. 58. 58 Is it possible to measure “distance” between 
 a learner and a subject community? From Amateurs to Connoisseurs:
 Modeling the Evolution of User 
 Expertise through Online Reviews
 Julian McAuley, Jure Leskovec
 http://i.stanford.edu/~julian/pdfs/www13.pdf Recommender Systems
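One naive way to make the "distance" question concrete is sketched below. This is not the McAuley and Leskovec model, which learns latent experience levels from review histories; it simply assumes each person can be scored on a few hand-picked scales in [0, 1] (the names in `DIMENSIONS` are hypothetical) and measures Euclidean distance from a learner to the community's centroid.

```python
import math

# Hypothetical scales in [0, 1], loosely following the learner dimensions above.
DIMENSIONS = ["familiarity", "proficiency", "motivation", "innovation_curve"]


def centroid(members):
    """Average position of the community members along each dimension."""
    n = len(members)
    return [sum(m[i] for m in members) / n for i in range(len(DIMENSIONS))]


def distance(learner, community):
    """Euclidean distance from a learner to the community centroid."""
    center = centroid(community)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(learner, center)))


# Made-up scores for illustration only.
community = [[0.9, 0.8, 0.7, 0.9], [0.7, 0.6, 0.9, 0.7]]
newbie = [0.1, 0.0, 0.8, 0.2]
regular = [0.8, 0.7, 0.8, 0.8]

print(distance(newbie, community) > distance(regular, community))
```

A centroid hides the spread within a community, so a real system would likely need a learned representation rather than hand-assigned scores.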
  59. 59. 59 Back to “Uber for Learning” – approaching from a learner (audience) perspective, generally within a social context Given that: • books aren’t used by learners as much anymore • experts don’t have time to write books anymore If we can: • fit learners’ needs to topics w.r.t. subject communities, 
 based on their S-curve positions • personalize lectures for learners’ pain-points • reuse containerized building blocks Imagine the extent to which our current data science 
 tooling and techniques can be leveraged? Summary
  60. 60. 60 PS: If you are interested in opportunities 
 to write, speak, teach, mentor, code, etc., 
 based on these approaches, let us know Get Involved!
  61. 61. Thank You! and Stay Tuned…
  62. 62. presenter: Just Enough Math O’Reilly (2014) justenoughmath.com monthly newsletter for updates, 
 events, conf summaries, etc.: liber118.com/pxn/
