A Short Introduction to Software Engineering for Bioinformatics

These are the slides I used for an in-house presentation.

  1. 1. A Short Introduction to Software Engineering for Bioinformatics. Joe Miyamoto
  2. 2. Progress of software development practice: Waterfall, Agile, DevOps. There is no silver bullet in software engineering, and each approach has its own advantages. In bioinformatics, however, there is very little chance of adopting Waterfall.
  3. 3. Evolution rather than progress? Waterfall Agile DevOps
  4. 4. More precisely… Waterfall (non-spiral, spiral); Agile (Scrum, eXtreme Programming (XP)); DevOps
  5. 5. Waterfall Agile DevOps
  6. 6. Waterfall-style development • A very old and popular style of product development • Never go back to a previous phase • What we really want to build has to be clear from the start • Suited to large-scale development: “a few A-class architects and a mass of C-class programmers” Ref: http://fireside.gamejolt.com/post/the-game-creation-process-part-2-designing-the-idea-viq5rk2t
  7. 7. Waterfall-style development. Advantage: • Easy to manage progress (if nothing unexpected happens). Disadvantage: • Hard to manage progress (if something unexpected does happen)
  8. 8. Waterfall (spiral) • Iterations of Waterfall. Advantage: • The task becomes clearer with each iteration. Disadvantages: • Time-consuming • Hard to decide how much detail to work out in the first iteration. Ref: http://www.qmetry.com/spiral.html
  9. 9. Waterfall Agile DevOps
  10. 10. Agile • The antithesis of Waterfall • Not a technique but a philosophy • One iteration lasts 1 to 4 weeks and delivers one feature. Ref: https://www.linkedin.com/pulse/essential-resources-services-technologies-your-startup-jason-oh
  11. 11. Agile. Advantages: • Easy to adapt to changes • Makes clear where we are and where we want to go. Disadvantages: • Refactoring becomes necessary -> CI (covered later) • Communication cost -> works for no more than about 20 people
  12. 12. Difference between Agile and spiral • Spiral: works on every feature in each iteration • Agile: implements only one feature per iteration.
  13. 13. Waterfall (non-spiral, spiral); Agile (Scrum, eXtreme Programming (XP)); DevOps
  14. 14. Scrum: one incarnation of Agile, focused on communication among developers • Keep a list of the features we want to implement and update it constantly • Each iteration is 30 days, and the software has to be deployable at the end • A 15-minute stand-up meeting every day • No partitioning
  15. 15. Waterfall (non-spiral, spiral); Agile (Scrum, eXtreme Programming (XP)); DevOps
  16. 16. eXtreme Programming (XP): one incarnation of Agile, focused on maintainability of code • Test-Driven Development (TDD) • Pair programming • Collective ownership of code • Continuous Integration (CI) • Issue tracking
  17. 17. eXtreme Programming (XP): one incarnation of Agile, focused on maintainability of code • Test-Driven Development (TDD) • Pair programming • Collective ownership of code • Continuous Integration (CI) • Issue tracking
  18. 18. Two purposes of software testing: tests for users and tests for developers. Focus in Agile: run the tests every time we make a change to the source code.
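A minimal sketch of a developer-facing test that can be re-run after every change, using Python's standard `unittest` module. The function `reverse_complement` and the helper `run_tests` are hypothetical examples chosen for illustration; they are not from the slides.

```python
import unittest

# Translation table for DNA bases (hypothetical helper for this sketch).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    if set(seq) - set("ACGTacgt"):
        raise ValueError(f"unexpected base in sequence: {seq!r}")
    return seq.translate(_COMPLEMENT)[::-1]


class ReverseComplementTest(unittest.TestCase):
    """In TDD these tests are written first, then the function is made to pass."""

    def test_simple_sequence(self):
        self.assertEqual(reverse_complement("ATGC"), "GCAT")

    def test_empty_sequence(self):
        self.assertEqual(reverse_complement(""), "")

    def test_rejects_unknown_base(self):
        with self.assertRaises(ValueError):
            reverse_complement("ATXG")


def run_tests() -> unittest.TestResult:
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReverseComplementTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    result = run_tests()
    print("all tests passed:", result.wasSuccessful())
```

Running this file after each edit gives quick feedback; a CI service can run the same tests on every push.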
  19. 19. eXtreme Programming (XP): one incarnation of Agile, focused on maintainability of code • Test-Driven Development (TDD) • Pair programming • Collective ownership of code • Continuous Integration (CI) • Issue tracking
  20. 20. Git • A Distributed Version Control System (DVCS) • Lets us share the history of changes • Cut a branch for every single feature or subproject. Ref: http://gotgroove.com/ecommerce-blog/guide-to-version-control-for-magento-using-git-and-beanstalk/ Mercurial (a simpler DVCS for Pythonistas) could be enough for some bioinformaticians, though…
  21. 21. Workflow using Git (≒ how to branch). There are several branching practices, but the following are the basic rules: • One feature, one branch • Master always has to be deployable. Ref: https://www.atlassian.com/ja/git/workflows#!workflow-gitflow
  22. 22. GitHub • A hosting service for Git • Filing an issue for every topic makes the project trackable • Coding -> Pull Request -> Review -> Merge. Following this flow makes the source code less dependent on any particular person.
  23. 23. Workflow using Git & GitHub: Fork & clone -> Work in local repository -> Push -> Pull Request -> Code Review -> Merge. Ref: http://acrl.ala.org/techconnect/post/coding-collaboration-on-github
  24. 24. Workflow using Git & GitHub: Fork & clone -> Work in local repository -> Push -> Pull Request -> Code Review -> Merge. Ref: http://acrl.ala.org/techconnect/post/coding-collaboration-on-github Ticketing ↓ Issue tracking. Build and test ↓ CI
  25. 25. eXtreme Programming (XP): one incarnation of Agile, focused on maintainability of code • Test-Driven Development (TDD) • Pair programming • Collective ownership of code • Continuous Integration (CI) • Issue tracking
  26. 26. Continuous Integration (CI) • Run automated tests constantly • Makes problems easy to track down • Jenkins: the CI tool. Ref: http://www.slideshare.net/whyme/jenkins-reviewbot
  27. 27. GitHub and CI tools: run the tests on every push to the remote. A common combination is GitHub + [Travis CI or Jenkins]. Ref: https://github.com/hltfbk/Excitement-Open-Platform/wiki/Developers
  28. 28. eXtreme Programming (XP): one incarnation of Agile, focused on maintainability of code • Test-Driven Development (TDD) • Pair programming • Collective ownership of code • Continuous Integration (CI) • Issue tracking
  29. 29. Practices for issue tracking • The rough schedule is tracked with a Gantt chart or a burn-down chart. Ref: https://en.wikipedia.org/wiki/Gantt_chart Ref: http://chandoo.org/wp/2009/07/21/burn-down-charts/ • The more detailed schedule is managed with tickets or issues (Redmine, GitHub + ZenHub). Figures: burn-down chart, Gantt chart
  30. 30. Ticket-Driven Development • Manage tasks centrally as tickets • Make small tasks clear and trackable. Ref: http://itpro.nikkeibp.co.jp/article/COLUMN/20130927/507265/?SS=imgview&FD=55983188&ST=devops The tool pictured on the slide is commonly used.
  31. 31. Waterfall Agile DevOps
  32. 32. DevOps • Extends “Agile” from development to operations. That is: • Changes are reflected in the running system as soon as we update the code • We develop not just a piece of software but the whole system.
  33. 33. Technologies for DevOps • Virtualization using containers • Configuration management tools (e.g. Fabric). Ref: http://blog.xebialabs.com/2014/12/05/rocket-vs-docker-myth-simple-lightweight-enterprise-platform/
  34. 34. Technologies for DevOps • Virtualization using containers • Configuration management tools (e.g. Fabric). Ref: http://blog.xebialabs.com/2014/12/05/rocket-vs-docker-myth-simple-lightweight-enterprise-platform/
  35. 35. Typical situation in bioinformatics • Small daily analyses on a laptop • Realize that more computation power is needed • Move the pipeline to a high-performance server • Able to use the cloud? Use CloudBioLinux or another VM image from bioimg.org -> Dependency hell! • Software (or package) version differences -> No reproducibility!
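One way to make version differences visible is to record the installed package versions on each machine and compare them. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+). The function names and the package versions in the usage example are hypothetical, chosen for illustration.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple


def snapshot_versions() -> Dict[str, str]:
    """Map each installed distribution name (lower-cased) to its version."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            versions[name.lower()] = dist.version
    return versions


def diff_versions(
    a: Dict[str, str], b: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {package: (version_in_a, version_in_b)} for every mismatch.

    None means the package is missing on that side.
    """
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # Hypothetical snapshots taken on two machines.
    laptop = {"numpy": "1.9.2", "biopython": "1.65"}
    server = {"numpy": "1.8.0", "biopython": "1.65", "pysam": "0.8.3"}
    print(diff_versions(laptop, server))
    # prints {'numpy': ('1.9.2', '1.8.0'), 'pysam': (None, '0.8.3')}
```

In practice, the output of `snapshot_versions()` could be saved next to the analysis results so that a later run can be checked against it.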
  36. 36. Container virtualization (Docker) • Bundle all third-party software into one container • Build once, run anywhere • Version-controllable, with a GitHub-like hosting service • Easy to move between servers • Develop the whole container as “software”
  37. 37. Progress of virtualization • chroot, cgroups: isolation of file and process space • KVM, VirtualBox: OS virtualization. Drawbacks of these approaches: • Heavy • Not easy to provision • Hard to use base images • (chroot) risk that a single user exhausts the computational resources. Containers try to take the advantages of both.
  38. 38. Problems of Docker • Security problem: becoming root is a must • Dockerfile problem: some bugs around caching? Peculiar syntax -> better to use Packer • Portability problem: better run on Linux kernel >= 3.8. Emergence of counterforce (e.g. Cloudius OSv): • Not user-friendly enough so far • Not enough community resources such as base images • Not mature enough to use
  39. 39. Technologies for DevOps • Virtualization using containers • Configuration management tools (e.g. Fabric). Ref: http://blog.xebialabs.com/2014/12/05/rocket-vs-docker-myth-simple-lightweight-enterprise-platform/
  40. 40. Infrastructure as code • Maintain server configuration as code • Ensure that operations are idempotent • Easily move pipelines between servers. Tools, from simple to complex: Fabric (Python-based, simple), Chef Zero (Ruby-based, complex) • Chef requires users to remember fancy jargon • CloudBioLinux supports Fabric. Better to start with Fabric.
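Idempotent here means that applying the same configuration step twice leaves the server in the same state as applying it once. A minimal sketch in plain Python follows. It is not the Fabric or Chef API; the function `ensure_line` and the file name `pipeline.conf` are hypothetical.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file; return True if the file changed."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # desired state already holds, so do nothing
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conf = Path(tmp) / "pipeline.conf"
        print(ensure_line(conf, "threads=8"))  # first run creates and changes the file
        print(ensure_line(conf, "threads=8"))  # second run is a no-op
        print(conf.read_text(), end="")
```

The step describes the desired state ("this line exists") rather than an action ("append this line"), which is what makes it safe to re-run on any server.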