Your SlideShare is downloading. ×
An Open Source Framework for Teaching BIoinformatics
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

An Open Source Framework for Teaching BIoinformatics

3,803

Published on

Title: An Open Source Framework for Teaching Bioinformatics …

Title: An Open Source Framework for Teaching Bioinformatics
Author: Kam Dalquist

Published in: Technology
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
3,803
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
120
Comments
0
Likes
2
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide
  • Both relatively new to LMU Dondi’s background in medical informatics, data visualization, person-computer interactions During my postdoc I had served as project manager for GenMAPP, want to extend features of GenMAPP, especially for other species I am not a software developer (last time I took a computer science class was AP Pascal in high school), but I’ve had a lot of experience interacting with developers I’m proud of GenMAPP, especially that it is user-friendly for biologists, and is relatively bug free (result of my extensive testing) However, I never would have been standing up in this community to talk about it because although we believed strongly that GenMAPP should be free-of-charge, we were slow to make the source code available (it is now available on SourceForge) It has only been my collaboration with Dondi that I have been educated as to what Open Source software development truly means (Cathedral and Bazaar) This is the perfect forum for talking about our work because, while I am using the fruits of XMLPipeDB for GenMAPP as first imagined, we designed this project to have components that are resusable for other purposes and that the bioinformatics developer community is our target audience
  • Transcript

    • 1. BOSC Vienna, Austria July 20, 2007 Kam D. Dahlquist Department of Biology John David N. Dionisio Department of Electrical Engineering & Computer Science Loyola Marymount University An Open Source Framework for Teaching Bioinformatics
    • 2. Outline
      • Motivation
      • Open source culture
      • Implementation
      • --Computer science curriculum
      • --Bioinformatics course
    • 3. Scientific Computing and the Digital Divide Wilson GV (2006) Where’s the real bottleneck in scientific computing? American Scientist 94:5–6. Scientists who come to computer science after being trained in a different primary discipline often have to rediscover, relearn, or keep up with work in the computer science and software development realms in order to get the most out of their work. This causses unecessary and unknowing repetitions of past discoveries and errors. Tools or paradigms that are out-of-date in computer science and software engineering remain in place. At worst, software flaws slow or impede research. Baxter SM, Day SW, Fetrow JS, and Reisinger SJ (2006) Scientific software development is not an oxymoron. PLoS Computational Biology 2:e87.
    • 4. The Disconnect Between Undergraduate Computer Science Training and Expectations and Skill Sets Required for Industry and Research Undergraduate Training Industry Expectation Work alone Work in a team “ Toy” programs and algorithms Large, modular project Throwaway code Code longevity (for better or worse)
    • 5. inroads – The SIGCSE Bulletin, Volume 39, Number 2, 2007 June, pp. 70-74 http://recourse.cs.lmu.edu/
    • 6. Official Open Source Definition (version 1.9) Free redistribution Source code Derived works Integrity of the author’s source code No discrimination against persons or groups No discrimination against fields of endeavor Distribution of license License must not be specific to a product License must not restrict other software License must be technology-neutral
    • 7. Open Source Values
      • Source code is available, modifiable, and long-lived
      • Accountability implies community
      • --explicit “paper trail” for how a particular program
      • changes over time
      • --communication and answerability
      • Responsibilities accompany rights
      • --explicit consideration of the license
      • --giving credit where it is due
      • --appropriate access to software, documentation, and
      • communities
      • --acknowledging security, privacy, legal issues
    • 8. Open Source Teaching Framework
      • Source Code:
      • All code resides in a centralized, public repository
      • As much as possible, everyone’s code is visible to
      • everyone else for code review or team fixing
      • No code is thrown away, it remains available to future
      • “ generations”
    • 9. Open Source Teaching Framework
      • Source Code:
      • All code resides in a centralized, public repository
      • As much as possible, everyone’s code is visible to
      • everyone else for code review or team fixing
      • No code is thrown away, it remains available to future
      • “ generations”
      • Quality & Community:
      • Documentation, inline and online
      • Automated tests
      • Constructive code review, beyond “does it work?”
      • Long-term projects release early, release often
      • Form collaborative communities among faculty,
      • students, classes, and projects
    • 10. Progression through Computer Science Curriculum
      • Sample code bazaar
      • --the creation and maintenance of live, organized, searchable,
      • student accessible sample code libraries
      • --students check out, modify, and check in code
      • Test infection (Erich Gamma)
      • --test suite vs. implementation matrix
      • Life cycle of code
      • --as juniors and seniors, students revisit code written the first
      • year
      • --direct experience with need for documentation
      • Release early, release often
      • --applies to both students and faculty
    • 11. “ CourseForge” A Hardware + Software Infrastructure for Supporting the Teaching Framework
      • Certain teaching elements are impractical without
      • some degree of automation
      • Support platform for management of student work
      • --revision control
      • --test frameworks
      • --communication
      • --issue tracking
      • Derived from open source software, delivered as
      • open source software — the system will interoperate
      • with existing open source tools
    • 12. CMSI 698: Special Studies in Bioinformatics
      • Team-taught by a biologist and a computer scientist
    • 13. CMSI 698: Special Studies in Bioinformatics
      • Team-taught by a biologist and a computer scientist
      • Enrollment in Spring 2006:
      • -- eight students from Master’s degree
      • program in Computer Science
      • -- several coming from aerospace industry
      • -- none with more than college-level introductory biology
    • 14. CMSI 698: Special Studies in Bioinformatics
      • Team-taught by a biologist and a computer scientist
      • Enrollment in Spring 2006:
      • -- eight students from Master’s degree
      • program in Computer Science
      • -- several coming from aerospace industry
      • -- none with more than college-level introductory biology
      • Project-based class began development of XMLPipeDB
    • 15. CMSI 698: Special Studies in Bioinformatics
      • Team-taught by a biologist and a computer scientist
      • Enrollment in Spring 2006:
      • -- eight students from Master’s degree
      • program in Computer Science
      • -- several coming from aerospace industry
      • -- none with more than college-level introductory biology
      • Project-based class began development of XMLPipeDB
      • XMLPipeDB development continued by four students
      • in summer session course entitled Open Source Software
      • Development Workshop
    • 16. CMSI 698: Special Studies in Bioinformatics
      • Team-taught by a biologist and a computer scientist
      • Enrollment in Spring 2006:
      • -- eight students from Master’s degree
      • program in Computer Science
      • -- several coming from aerospace industry
      • -- none with more than college-level introductory biology
      • Project-based class began development of XMLPipeDB
          • --authentic bioinformatics problem to solve
      • XMLPipeDB development continued by four students
      • in summer session course entitled Open Source Software
      • Development Workshop
      • One student continued development for Master’s thesis
    • 17. XMLPipeDB Project Management: Lessons Learned
      • Students on the project had varying levels of maturity,
      • knowledge, and skill coming into the project
      • -- some naturally took on a leadership role
      • -- some hung back or did the minimum required to get by
    • 18. XMLPipeDB Project Management: Lessons Learned
      • Students on the project had varying levels of maturity,
      • knowledge, and skill coming into the project
      • -- some naturally took on a leadership role
      • -- some hung back or did the minimum required to get by
      • Needed to increase communication and sense of team
      • -- students preferred to interact with faculty for questions,
      • rather than each other
      • -- bug trackers and developer’s forum used only sporadically
      • -- implemented weekly reports on Wiki to increase accountability
    • 19. XMLPipeDB Project Management: Lessons Learned
      • Students on the project had varying levels of maturity,
      • knowledge, and skill coming into the project
      • -- some naturally took on a leadership role
      • -- some hung back or did the minimum required to get by
      • Needed to increase communication and sense of team
      • -- students preferred to interact with faculty for questions,
      • rather than each other
      • -- bug trackers and developer’s forum used only sporadically
      • -- implemented weekly reports on Wiki to increase accountability
      • [SourceForge servers were frequently down during class]
    • 20. XMLPipeDB Project Management: Lessons Learned
      • Students on the project had varying levels of maturity,
      • knowledge, and skill coming into the project
      • -- some naturally took on a leadership role
      • -- some hung back or did the minimum required to get by
      • Needed to increase communication and sense of team
      • -- students preferred to interact with faculty for questions,
      • rather than each other
      • -- bug trackers and developer’s forum used only sporadically
      • -- implemented weekly reports on Wiki to increase accountability
      • [SourceForge servers were frequently down during class]
      • 6 months from conception to product
    • 21. XMLPipeDB Project Management: Lessons Learned
      • Students on the project had varying levels of maturity,
      • knowledge, and skill coming into the project
      • -- some naturally took on a leadership role
      • -- some hung back or did the minimum required to get by
      • Needed to increase communication and sense of team
      • -- students preferred to interact with faculty for questions,
      • rather than each other
      • -- bug trackers and developer’s forum used only sporadically
      • -- implemented weekly reports on Wiki to increase accountability
      • [SourceForge servers were frequently down during class]
      • 6 months from conception to product
      • Even the weakest student contributed useable code
    • 22. Take-home Messages
      • Catch them early
      • Making open source values and software development best practices explicit will produce better software and better computer scientists
    • 23. XSD-to-DB Adam Carasso Jeffrey Nicholas Scott Spicer XMLPipeDBUtils David Hoffman Babak Naffas Jeffrey Nicholas Ryan Nakamoto UniProtDB Joe Boyle Joey Barrett GODB Scott Spicer Roberto Ruiz GenMAPP Builder Joey Barrett Jeffrey Nicholas Scott Spicer Special Thanks GenMAPP.org Development Group Caskey L. Dickson, Wesley T. Citti NSF CCLI Program (http://recourse.cs.lmu.edu) http://xmlpipedb.cs.lmu.edu LMU Bioinformatics Group Kam D. Dahlquist http://myweb.lmu.edu/kdahqui [email_address] John David N. Dionisio http://myweb.lmu.edu/dondi [email_address]

    ×