
Want to write a book in Jupyter - here's how

Describes how to create books and other interactive long form texts using Jupyter notebooks. Provides simple tools and techniques for extracting reusable code and creating tables of contents.

Published in: Software

  1. Want to write a book in Jupyter - here’s how • Dr Jim Arlow, Clear View Training Limited • www.clearviewtraining.com
  2. Introduction • In this presentation, we will demonstrate how to use Jupyter notebooks to deliver interactive long-form text such as books • We will take a quick look at some common options for creating interactive text, and then go on to develop a set of requirements and tools for using Jupyter as an authoring platform for books
  3. About the author • Jim Arlow is an independent consultant, trainer and author working in OO analysis and design, UML modelling, BPMN, Metadata Management, Model Driven Architecture (MDA), requirements engineering, and software engineering process design and implementation • He is the author of the best-selling “UML 2 and the Unified Process”, and you can find a list of all of his books here: www.clearviewtraining.com • Contact: • www.clearviewtraining.com • https://www.linkedin.com/in/jimarlow • Check out my latest book! http://bit.ly/SecretsOfAnalysis
  4. Options for writing long-form interactive texts • It is quite surprising how few platforms there are that are easy to use for creating interactive long-form texts! • We have direct experience with: • Apple iBooks Author • Mathematica • Jupyter • Of course, you can always use HTML, but that is a complex option… • Frameworks and libraries for interactivity • Hosting mechanisms and distribution • Etc.
  5. Apple iBooks Author • Creates truly beautiful interactive books! • Simple and fun to use! • Built-in interactive widgets are limited (e.g. quizzes, interactive graphics etc.) • Hard to create custom interactive widgets using HTML5 • Can’t include live code • Limited distribution - does anyone actually use the Apple Books Store? • Check it out! http://bit.ly/IntroductionToBPMN2
  6. Mathematica • Insanely powerful and elegant! • Easy-to-use authoring environment with full mathematical notation and live code • Very rich interactivity via the Manipulate command and more • Licensing to sell books is prohibitively costly for small authors/publishers • Distribution via Mathematica viewer (free) is to be phased out and replaced by deployment on a website (costly) - no sensible option left for free books • http://bit.ly/2DBoBnC
  7. Jupyter • Many of the advantages of Mathematica but with open source licensing • Live Python code (the default) has more traction than live Mathematica code • Good support for interactive widgets via interact functions • Many options for deployment • Many options for live code • More popular than Mathematica? • How do you package and sell Jupyter Notebooks?
  8. What is Jupyter? • From the Jupyter website http://jupyter.org/index.html: • “The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain: • Live code • Equations • Visualisations • Narrative text • Interactive widgets • Uses include: data cleaning and transformation, numerical simulation, statistical modelling, data visualisation, machine learning, and much more”.
  9. How Jupyter works • Server runs notebook code (usually Python) and serves up static HTML and dynamic interactive widgets to the client web browser • [Diagram labels: Browser; Interactive Computing Protocol; Jupyter Notebook Server; notebooks NB1, NB2 and NB3, each with Code, Text, Data and Widgets]
  10. Jupyter for books • Books shall be delivered as a sequence of Jupyter notebooks, one per chapter • Each chapter shall build on the previous one (so we need to reuse code between chapters) • A reusable code base shall be extracted from the book • Each chapter shall have a table of contents (TOC) • Each section shall have a link back to the TOC • The whole book shall have a table of contents
  11. Jupyter book structure • A book comprises several chapters with one Jupyter notebook per chapter, e.g. C1.ipynb • Notebooks allow headings and subheadings • Each chapter builds on the previous one and needs to reuse its code • Export of code is done using the nbconvert utility (see later) • [Diagram: nbconvert converts each Jupyter notebook (C1.ipynb to C4.ipynb) into a Python file of reusable code (C1.py to C4.py); each notebook imports the Python file converted from the previous one]
  12. Two types of code… 1. Reusable code that must be factored out for use in subsequent chapters and elsewhere: • Python classes • Functions • Global variables etc. 2. Non-reusable (demo) code that needs to be constrained to execution within a notebook: • Interactive demonstration code • Static output such as tables
  13. Reusable and non-reusable code • [Screenshot callouts: reusable function; non-reusable (demo) code; interactive widgets] • Need to remove all non-reusable code
  14. Identifying demo code • This is code that should not be reused outside of the notebook! • By convention: begin each demo code cell with #Demo and end it with #Demo • By metadata: add the DEMO tag to each demo code cell so that it can be identified and removed later • Open the View > Cell Toolbar > Tags menu to see the tags associated with each cell • [Screenshot callout: we will use DEMO to filter out this cell using nbconvert]
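The two ways of marking demo code can be checked programmatically. The following is a minimal sketch, assuming the standard notebook JSON layout in which a cell's tags are stored under `metadata.tags`; the function name `is_demo_cell` and the sample cells are hypothetical and are not taken from the slides.

```python
def is_demo_cell(cell, tag="DEMO", marker="#Demo"):
    """Return True if a notebook cell (a dict) is marked as demo code."""
    if cell.get("cell_type") != "code":
        return False
    # Metadata approach: the DEMO tag is stored in the cell metadata
    if tag in cell.get("metadata", {}).get("tags", []):
        return True
    # Comment convention: the cell begins and ends with a #Demo line
    source = cell.get("source", "")
    if isinstance(source, list):  # source may be stored as a list of lines
        source = "".join(source)
    lines = [line.strip() for line in source.strip().splitlines()]
    return len(lines) >= 2 and lines[0] == marker and lines[-1] == marker


# Hypothetical cells used only to exercise the function
tagged_cell = {"cell_type": "code", "metadata": {"tags": ["DEMO"]},
               "source": "print(transpose(60, 2))"}
commented_cell = {"cell_type": "code", "metadata": {},
                  "source": ["#Demo\n", "print(transpose(60, 2))\n", "#Demo"]}
reusable_cell = {"cell_type": "code", "metadata": {},
                 "source": "def transpose(pitch, interval):\n    return pitch + interval"}
```

Only the tag is visible to the nbconvert filter used later in the deck; the comment check here is an illustration of the first convention.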
  15. Extracting the code… • Filter the notebooks using the nbconvert utility to discard: • Markup, graphics, cell inputs, outputs and prompts • Demo code - remove all cells with the tag DEMO with the option: --TagRemovePreprocessor.remove_cell_tags={"DEMO"} • Command (executed in a Jupyter cell): !jupyter nbconvert --to python --TemplateExporter.exclude_input_prompt=True --TemplateExporter.exclude_output_prompt=True --TemplateExporter.exclude_output=True --TemplateExporter.exclude_markdown=True --TagRemovePreprocessor.remove_cell_tags={"DEMO"} • [Diagram: the notebooks Introduction, Pitch, MusicalSetTheory, MusicalSetTheory2 and Scales pass through nbconvert to give reusable Python code, e.g. Scales.py]
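To make the effect of this filtering concrete, here is a minimal sketch that does a similar job with only the Python standard library. It is an illustration, not the author's method and not how nbconvert is implemented: it assumes the standard notebook JSON layout (a top-level `cells` list, tags under `metadata.tags`), and the name `extract_reusable_code` and the sample notebook are hypothetical.

```python
import json


def extract_reusable_code(notebook, demo_tag="DEMO"):
    """Return the source of every code cell that does not carry demo_tag."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # discard markdown and raw cells
        if demo_tag in cell.get("metadata", {}).get("tags", []):
            continue  # discard demo code
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be stored as a list of lines
            source = "".join(source)
        if source.strip():
            chunks.append(source.rstrip())
    return "\n\n".join(chunks) + "\n"


# A tiny in-memory notebook standing in for a chapter file (hypothetical content)
example_notebook = json.loads("""
{
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Pitch #"]},
    {"cell_type": "code", "metadata": {},
     "source": ["def transpose(pitch, interval):\\n", "    return pitch + interval"]},
    {"cell_type": "code", "metadata": {"tags": ["DEMO"]},
     "source": ["print(transpose(60, 2))"]}
  ]
}
""")

reusable = extract_reusable_code(example_notebook)
```

For a real chapter, the dictionary would come from `json.load` on the `.ipynb` file, and the returned text could be written to a `.py` file of the same name.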
  16. What we have achieved… • Books shall be delivered as a sequence of Jupyter notebooks, one per chapter • Each chapter shall build on the previous one (so we need to reuse code between chapters) • A reusable code base shall be extracted from the book • Each chapter needs a table of contents (TOC) and navigation from each section back to the TOC • The book as a whole needs a table of contents
  17. Chapter TOC • Jupyter notebooks are missing an important feature - the ability to automatically generate a table of contents, so we will handcraft a solution! • Mark up TOC entries by convention: • The first line in a markdown cell shall be a candidate TOC entry • A TOC entry line shall begin with one or more # • A TOC entry line shall end with a single # • Note: We could have used tags again, but this markup convention is much easier! • Extract marked-up TOC entries and generate TOC in notebook
  18. Example TOC entry • [Screenshot callouts: first line; begins with #; ends with #]
  19. Extracting the TOC entries • Jupyter notebooks are JSON documents that have a hierarchical structure • This means that it is relatively easy to parse them to extract information • [Diagram: JSON notebook data > cell > source; if the first line of source looks like “# …#”, “## …#” etc., add it to the TOC, giving a notebook with TOC]
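The extraction step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's TocTools.py: it assumes the standard notebook JSON layout, and the regular expression `TOC_ENTRY`, the function `find_toc_entries` and the sample notebook are hypothetical.

```python
import re

# One or more leading #, the heading text, and a closing # at the end of the line
TOC_ENTRY = re.compile(r"^(#+)\s*(.*?)\s*#$")


def find_toc_entries(notebook):
    """Return (level, title) pairs for markdown cells whose first line is a TOC entry."""
    entries = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be stored as a list of lines
            source = "".join(source)
        lines = source.splitlines()
        if not lines:
            continue
        match = TOC_ENTRY.match(lines[0].strip())
        if match and match.group(2):
            # The number of leading # characters gives the heading level
            entries.append((len(match.group(1)), match.group(2)))
    return entries


# Hypothetical notebook content used only to exercise the function
example_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Pitch #\n", "Some introductory text."]},
        {"cell_type": "code", "source": ["x = 1"]},
        {"cell_type": "markdown", "source": "## Pitch classes #"},
        {"cell_type": "markdown", "source": "## Heading without the closing marker"},
        {"cell_type": "markdown", "source": "Plain text cell"},
    ]
}

entries = find_toc_entries(example_notebook)
```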
  20. TocTools.py (include this module in your notebooks) • [Code callouts: load notebook as JSON; find TOC entries by convention; generate markup hyperlinks for an indented list of headings; embed TOC in current notebook; HTML anchor TOC]
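The TocTools.py source is not reproduced in this transcript, so the following is only a sketch of the "generate hyperlinks for an indented list of headings" step. The function name `build_toc_markdown` is hypothetical, and the link targets assume that the notebook front end derives a heading's anchor by replacing spaces with hyphens, which should be checked against the front end in use.

```python
def build_toc_markdown(entries, anchor="TOC"):
    """Build markdown for an indented, hyperlinked TOC from (level, title) pairs."""
    # The HTML anchor lets each section link back to the TOC with [Back to TOC](#TOC)
    lines = ['<a id="{}"></a>'.format(anchor), "**Contents**", ""]
    for level, title in entries:
        indent = "  " * (level - 1)  # two spaces of indentation per heading level
        target = title.replace(" ", "-")  # assumed anchor rule, see lead-in
        lines.append("{}- [{}](#{})".format(indent, title, target))
    return "\n".join(lines)


# Hypothetical entries, as (heading level, heading text) pairs
toc = build_toc_markdown([(1, "Pitch"), (2, "Pitch classes")])
print(toc)
```

The returned string can be pasted into, or written programmatically to, a markdown cell at the top of the chapter.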
  21. Call function in Jupyter notebook to embed TOC • Call the function on the current notebook file to generate the TOC • The generated TOC has an anchor so you can link back to it • This file is Pitch.ipynb - make sure you call the function on the right file!
  22. Book TOC • Ideally, the book TOC should contain a concatenation of all of the chapter TOCs with active hyperlinks • For now, we will compromise by limiting the book TOC to a list of chapters because it is easy to hyperlink to a chapter to see its contents - we may revisit this decision later! • The problem is that a normal hyperlink opens a new instance of the target notebook • [Diagram labels: Pitch.ipynb; Transposition; open a new instance]
  23. Generating the book TOC • Just use standard Jupyter hyperlinks that you can generate from a list of chapters
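The code from this slide did not survive in the transcript. As a minimal sketch of the idea, assuming each chapter is a notebook named after the chapter and stored in the same directory as the book TOC, markdown hyperlinks can be generated from a list of chapter names. The function name `build_book_toc` is hypothetical; the chapter names are the notebook names shown on the code extraction slide.

```python
from urllib.parse import quote


def build_book_toc(chapters):
    """Return a numbered markdown list linking to one notebook per chapter."""
    lines = []
    for number, name in enumerate(chapters, start=1):
        # quote() percent-encodes spaces and other unsafe characters in file names
        lines.append("{}. [{}]({}.ipynb)".format(number, name, quote(name)))
    return "\n".join(lines)


chapters = ["Introduction", "Pitch", "MusicalSetTheory", "MusicalSetTheory2", "Scales"]
book_toc = build_book_toc(chapters)
print(book_toc)
```

In a notebook, the resulting string can be rendered with `IPython.display.Markdown(book_toc)` or pasted into a markdown cell.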
  24. Summary • Books shall be delivered as a sequence of Jupyter notebooks, one per chapter • Each chapter shall build on the previous one (so we need to reuse code between chapters) • A reusable code base shall be extracted from the book • Each chapter shall have a table of contents (TOC) • Each section shall have a link back to the TOC • The whole book shall have a table of contents • Conclusion: Jupyter is an excellent platform for the delivery of interactive long-form content!
