
Blastn plus jupyter on Docker



Examples from bioinformatics - using containers for bioinformatics tools (such as blastn), plus example Jupyter notebooks

Published in: Data & Analytics

  1. 1. Blastn + Jupyter on Docker Examples from Bioinformatics Samantha & Lynn Langit
  2. 2. Jupyter - Inspired by Mathematica. Thanks, Stephen Wolfram. If you can SEE it (your data and code), you can work with it better @lynnlangit
  3. 3. Next terminal <- a better Python REPL • Fernando Perez in 2001 • IPython (interactive) • Modeled - Mathematica Notebooks • IP(y): Notebook -> in a browser • 2012 IPython -> Jupyter Notebook @lynnlangit
  4. 4. Enter Jupyter Notebooks @lynnlangit
  5. 5. Jupyter Notebooks support the ML Lifecycle
     1. Collect Data: Retrieve Files, Query SQL Databases, Call Web Services, “Scrape” Web Pages
     2. Prepare Data: Explore Data, Validate Data, Clean Data, Features / Data
     3. Train Model: Prepare Training Set, Experiment, Test Model, Visualize
     4. Evaluate Model: Test Performance, Compare Models, Validate Model, Visualize
     5. Deploy Model: Export Model File, Prepare Job, Deploy Container, Re-package Model
     • Execute code blocks: Python, R… code; SQL queries; Shell commands
     • Write Documentation: Markdown language
     • Visualize Data: Viz tools…
  6. 6. Jupyter Visualizations – so many possibilities
  7. 7. Notebook Customizations Multiple Runtimes Languages Share output Code or Equations LaTeX Math Comments Markdown Wiki-like Graphics Visualizations Charting Results LIVE DOCUMENTATION Reproducible Research @lynnlangit
  8. 8. Example Jupyter locally @lynnlangit
  9. 9. Mathematica evolved… • Jupyter Notebook: Market leader; Started for single use; Academic community; GitHub integration; Added JupyterHub for collaboration • Zeppelin Notebook: Started for collaboration; Enterprise Security • Vendor Notebook: Databricks for Apache Spark; Jupyter-like, but proprietary format @lynnlangit
  10. 10. Running Notebooks • Desktop: Install and run • Local Server: Can use JupyterHub for groups • Cloud: Large number of options • Docker: Start a container @lynnlangit
  11. 11. Extending, Refactoring Open Notebooks • Write functions in one notebook • Link to another notebook • Write extensions (
  12. 12. Up the bar Personalized medicine via genomic analysis @lynnlangit
  13. 13. Reproducible Research – Experiments as Code @lynnlangit
  14. 14. What is Blastn? Basic Local Alignment Search Tool - BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches. blastn is the BLAST program for nucleotide-to-nucleotide searches.
  15. 15. Cloud-based Jupyter PaaS • AWS SageMaker • Azure Notebooks • Google Colab @lynnlangit
  16. 16. Tools for Jupyter • Binder for GitHub • Point to your GitHub repo • Jupyter Notebooks • requirements.txt • It builds a Docker image • You can run your Notebooks @lynnlangit
  17. 17. Example Binder @lynnlangit
  18. 18. Example - GT-Scan2 Jupyter for Genomics Research @lynnlangit
  19. 19. Future of Jupyter for Research • Academic Institutions and Research Labs: UC Berkeley, UC Davis, UC San Diego; Cal Poly San Luis Obispo; Clemson University; CU Boulder; U of Illinois, Minnesota, Missouri, Rochester, Texas; MIT; Michigan State U; Texas A&M @lynnlangit
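
The deck's central idea, running blastn inside a Docker container and inspecting the results from a Jupyter notebook, can be sketched roughly as follows. This is a minimal illustration, not the authors' setup: the image name `ncbi/blast`, the `/data` mount point, and the helper names `blastn_docker_command` and `parse_outfmt6` are assumptions made for this example. The blastn flags `-query`, `-db`, and `-outfmt 6` (tab-separated output) are standard BLAST+ options.

```python
import csv
import io
import os

# Default column order of blastn's tabular output (-outfmt 6).
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_COLUMNS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_COLUMNS = ("pident", "evalue", "bitscore")


def blastn_docker_command(query, db, image="ncbi/blast", host_dir=".", workdir="/data"):
    """Build a `docker run` argument list that runs blastn in a container.

    The image name and mount point are assumptions for illustration;
    host_dir is mounted at workdir so the container can read the query
    file and the database files.
    """
    mount = f"{os.path.abspath(host_dir)}:{workdir}"
    return [
        "docker", "run", "--rm",
        "-v", mount,
        "-w", workdir,
        image,
        "blastn", "-query", query, "-db", db, "-outfmt", "6",
    ]


def parse_outfmt6(text):
    """Parse blastn tabular output into a list of dicts, one per hit."""
    hits = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(OUTFMT6_COLUMNS, row))
        for key in INT_COLUMNS:
            hit[key] = int(hit[key])
        for key in FLOAT_COLUMNS:
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits
```

In a notebook cell on a machine where Docker is available, the list returned by `blastn_docker_command("query.fasta", "mydb")` could be passed to `subprocess.run(..., capture_output=True, text=True)`, and the captured `stdout` handed to `parse_outfmt6` for filtering or plotting. The file and database names here are placeholders.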