
Computable Content: Lessons Learned



Strata UK 2017. Computable content leverages Jupyter notebooks to make learning materials more powerful by integrating compute engines, data sources, etc. O’Reilly Media extended this approach to create the new Oriole Online Tutorial medium, publishing notebooks from authors along with video timelines. (A free public tutorial, Regex Golf, by Peter Norvig demonstrates what’s possible with this technology integration.) Each user session launches a Docker container on a Mesos cluster for fully personalized compute environments. The UX is entirely browser based.

Published in: Technology


  1. 1. Computable Content: Lessons Learned
     Paco Nathan @pacoid
     Director, Learning Group @ O’Reilly Media
     2017-05-25
  2. 2. An observation
  3. 3. Oriole
  4. 4. One approach
  5. 5. Jupyter use @ O’Reilly Media
     ▪ Embracing Jupyter notebooks at O’Reilly
     ▪ Learning alongside innovators, thought-by-thought, in context
     ▪ Oriole online tutorials
     ▪ How do you learn?
  6. 6. For example
     ▪ A unique new medium blends code, data, text, and video into a narrated learning experience with computable content
     ▪ Purely browser-based UX; zero installation required
     ▪ Substantially higher engagement metrics
     ▪ Opens the door for live coding in assessments
  7. 7. Motivations
     O’Reilly needed a way for authors to use Jupyter notebooks to create professional publications. We also wanted to integrate video narration into the UX. The result is a unique new medium called Oriole:
     ▪ Context as a “unit of thought”
     ▪ Code and video synced together
     ▪ Each web session gets its own Docker container in the cloud
     ▪ 100% HTML experience, no download/install/config needed
     ▪ Jupyter notebooks used in the middleware
     ▪ Leverage interactive, data-driven graphics
  8. 8. Outcomes
     ▪ Tutorials are now much quicker to publish than “traditional” books and videos
     ▪ Less time required for innovators in programming, data science, devops, design, etc., who tend to be really busy people
     ▪ Audience gets direct, hands-on, contextualized experience across a wide variety of programming environments
  9. 9. Limitations
     ▪ Notebook kernels run REPLs, so older languages were not feasible
     ▪ Brief code blocks with tangible outcomes: precludes business topics, systems engineering, etc.
     ▪ What materials will fit within a Docker container?
  10. 10. Third iteration of Jupyter @ O’Reilly
      1. notebooks as supplemental material to other published work
      2. notebooks published as HTML, as articles
      3. computable content, containerized notebooks + video narratives
      4. hosted notebooks
  11. 11. Long-term goal: make learning materials more powerful by integrating compute engines + data services
  12. 12. Project Jupyter
  13. 13. Project Jupyter
      ▪ The evolution of IPython notebooks, applied to a range of different programming languages and environments
      ▪ IPython-kernels-for-other-languages
  14. 14. Projects
      ▪ JupyterHub
      ▪ Jupyter in Education !forum/jupyter-education
      ▪ JupyterLab
      ▪ Jupyter Kernels IPython-kernels-for-other-languages
  15. 15. A suite of network protocols
      Think of Jupyter, at its core, as a suite of network protocols:
      Jupyter is to the remote semantics of a REPL as…
      HTTP is to the remote semantics of file share
  16. 16. A suite of network protocols
      [Diagram: Kernel (code runs in a REPL), Notebook (editing + results), Network Protocol]
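
The protocol view in slides 15 and 16 can be made a little more concrete with a small sketch. The Python below builds dictionaries shaped like the `execute_request` and `execute_reply` messages of the Jupyter messaging protocol, following the message layout in the public specification. It is only an illustration: real notebooks and kernels exchange signed messages over ZeroMQ sockets, which is left out here, and the helper names `make_execute_request` and `make_execute_reply`, the `reader` username, and the version string are assumptions made for this example, not anything taken from the talk.

```python
import json
import uuid
from datetime import datetime, timezone

PROTOCOL_VERSION = "5.3"  # assumed protocol version, for illustration only


def _header(msg_type, session_id, username):
    """Build the header block that every Jupyter message carries."""
    return {
        "msg_id": uuid.uuid4().hex,
        "session": session_id,
        "username": username,
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": msg_type,
        "version": PROTOCOL_VERSION,
    }


def make_execute_request(code, session_id, username="reader"):
    """Notebook side: ask the kernel's REPL to run a piece of code."""
    return {
        "header": _header("execute_request", session_id, username),
        "parent_header": {},
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
    }


def make_execute_reply(request, execution_count, status="ok"):
    """Kernel side: a reply that points back at the request it answers."""
    req_header = request["header"]
    return {
        "header": _header("execute_reply", req_header["session"], req_header["username"]),
        "parent_header": req_header,
        "metadata": {},
        "content": {"status": status, "execution_count": execution_count},
    }


if __name__ == "__main__":
    request = make_execute_request("1 + 1", session_id=uuid.uuid4().hex)
    reply = make_execute_reply(request, execution_count=1)
    print(json.dumps(reply["content"]))
```

The point of the sketch is the pairing: the notebook never runs code itself, it only sends a request and matches the reply through `parent_header`, which is what lets the kernel live in a container somewhere else on the network.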
  17. 17. History, Context
  18. 18. Notebook metaphor
      Wolfram Research introduced notebooks in 1988 for working with Mathematica
  19. 19. Related work
  20. 20. Literate programming
      Don Knuth
      Paraphrased: Instead of telling computers what to do, tell other people what you want the computers to do
  21. 21. Speech acts
      PyCon 2016 Keynote, Lorena Barba
      (video)
      PyCon2016_Keynote/3407779 (slides)
      Highly recommended: speech acts (based on Winograd and Flores) as theory here
  22. 22. Best Practices
  23. 23. The following lessons learned in using Jupyter notebooks + video for learning materials apply well in many situations for data science teams working across an organization
  24. 24. Teaching with Jupyter – 1 of 2
      ▪ focus on a concise “unit of thought”
      ▪ invest the time and editorial effort to create a good intro
      ▪ keep your narrative simple and reasonably linear
      ▪ “chunk” the text and code into understandable parts
      ▪ alternate between text, code, output, further links, etc.
      ▪ code cells should be brief (< 10 lines), must show output
  25. 25. Teaching with Jupyter – 2 of 2
      ▪ load data+libraries from the container, not the network
      ▪ clear all output then “Run All” – or it didn’t happen
      ▪ video narratives: there’s text, and there’s subtext...
      ▪ pause after each “beat”: smile, breathe, let people follow you
      For JVM people: stop thinking only about IDEs, Ivy, Maven, etc. (ibid., Knuth 1984)
      (apologies for shouting)
  26. 26. Sharing is caring
      In data science, we see the benefits to teams for shared insights, storytelling, etc.
      Meanwhile domain expertise is generally more important than knowledge about tools
      There’s a value for developers to use notebooks in lieu of IDEs in some cases – what are those cases?
      GitHub now renders notebooks, so they can be used for documentation, reporting, etc.
      Digital Object Identifiers (DOI) can be assigned through Zenodo, making notebooks citable for academic publication
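
Two of the "Teaching with Jupyter" guidelines above (brief code cells, and clearing all output before a fresh "Run All") lend themselves to an automated check. The sketch below works directly on the JSON structure of a version 4 `.ipynb` file using only the standard library. The function names `clear_outputs` and `long_code_cells` and the `MAX_CODE_LINES` constant are hypothetical, chosen for this example, and the sketch only covers the "clear" half: re-running the cells would still need a live kernel.

```python
import copy

MAX_CODE_LINES = 10  # from the "< 10 lines" guideline; the name is an assumption


def _source_lines(cell):
    """Return a cell's source as a list of lines (source may be str or list)."""
    source = cell.get("source", "")
    text = "".join(source) if isinstance(source, list) else source
    return text.splitlines()


def clear_outputs(notebook):
    """Return a copy of a notebook dict with all code-cell outputs removed."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def long_code_cells(notebook, limit=MAX_CODE_LINES):
    """Return indices of code cells that have `limit` or more source lines."""
    return [
        index
        for index, cell in enumerate(notebook.get("cells", []))
        if cell.get("cell_type") == "code" and len(_source_lines(cell)) >= limit
    ]


if __name__ == "__main__":
    demo = {
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": ["# Intro"]},
            {
                "cell_type": "code",
                "metadata": {},
                "source": ["x = 1\n", "print(x)"],
                "outputs": [{"output_type": "stream", "name": "stdout", "text": ["1\n"]}],
                "execution_count": 3,
            },
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }
    print(long_code_cells(demo))
    print(clear_outputs(demo)["cells"][1]["outputs"])
```

A check like this could run before a notebook is published, so that stale outputs and overly long cells are caught by a script instead of by the audience.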
  27. 27. Authoring and Scale-Out
  28. 28.
  29. 29. Achieving scale
      ▪ allows a notebook author to build a container that includes the required Jupyter kernel, installed libraries, datasets, etc.
      ▪ Install Docker on your laptop
      ▪ Backend uses Git and DockerHub to manage containers
      ▪ For scale, deploy to DC/OS or a cloud
  30. 30. “A notebook, a container, and ~20 minutes of informal video walk into a bar…”
  31. 31. System architecture
      [Diagram: Tutorial, Middleware, Cluster]
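
The "one container per web session" idea from this part of the deck can be sketched on a laptop with plain Docker. The Python below only assembles a `docker run` command line for a session and does not execute it. The image name `example/tutorial-notebook:latest`, the `tutorial-` container name prefix, the helper name `docker_run_command`, and the choice of port 8888 are all assumptions made for illustration. The production system described in the talk schedules containers on a cluster (Mesos, DC/OS), which this sketch does not attempt to reproduce.

```python
import uuid


def docker_run_command(image, host_port, session_id=None, container_port=8888):
    """Assemble (but do not run) a `docker run` command for one user session.

    Each session gets its own uniquely named, disposable container, with the
    notebook port inside the container published on a per-session host port.
    """
    session_id = session_id or uuid.uuid4().hex[:12]
    return [
        "docker", "run",
        "--detach",                      # run in the background
        "--rm",                          # remove the container when it stops
        "--name", f"tutorial-{session_id}",
        "--publish", f"{host_port}:{container_port}",
        image,
    ]


if __name__ == "__main__":
    # Hypothetical image name; an author would build and push their own.
    command = docker_run_command("example/tutorial-notebook:latest", host_port=9001)
    print(" ".join(command))
```

The returned list could be handed to `subprocess.run` on a machine with Docker installed. Keeping command construction separate from execution makes the per-session naming and port mapping easy to test without a Docker daemon.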
  32. 32. O’Reilly Strata: NY, Sep 25-28; SG, Dec 4-7
      O’Reilly Artificial Intelligence: NY, Jun 26-29; SF, Sep 17-20
      JupyterCon: NY, Aug 22-25
  33. 33. Learn Alongside Innovators
      Just Enough Math
      Building Data Science Teams
      Hylbert-Speys
      How Do You Learn?
      periodic newsletter with updates, events, conference summaries…