This document summarizes the Open HeliSphere project, which aims to make the source code and bioinformatics pipeline for Helicos BioSciences' single molecule sequencing platform openly available. The project provides access to pre-release source code, documentation, and data through an open source website and infrastructure, with the code dual-licensed under GPL and commercial terms. The document also outlines Helicos' hybrid open/commercial development model and describes the single molecule sequencing technique and the bioinformatics pipeline for digital gene expression analysis.
Kitzmiller Openhelisphereproject Bosc2008
1. The Open HeliSphere™ project: True open source from the inventors of True Single Molecule Sequencing (tSMS™). Aaron Kitzmiller, BOSC 2008
2.
3. Single Molecule Sequencing by Synthesis: Hybridize. Primer; ~1/µm². [Figure: template strands and primers]
4. Single Molecule Sequencing by Synthesis: Extend 'G'. [Figure: strand diagram]
5. SM Sequence by Synthesis: Wash. [Figure: strand diagram]
6. SM Sequence by Synthesis: Image. [Figure: strand diagram]
7. SM Sequence by Synthesis: Cleave. [Figure: strand diagram]
8.
9. Raw data collection. [Figure: per-strand incorporation traces, one column per nucleotide addition cycle in the repeating order C, T, A, G; "-" marks a cycle with no incorporation on that strand]
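The raw-data slide can be read as one symbol per addition cycle for each strand. The following is a minimal sketch of that idea, not Helicos code: it assumes error-free chemistry, at most one incorporation per strand per cycle, and the C, T, A, G addition order that appears in the slide's header row. The function names `simulate_trace` and `decode_trace` are hypothetical.

```python
from itertools import cycle, islice

# Addition order as it appears in the slide's header row (assumed to repeat).
CYCLE_ORDER = "CTAG"


def simulate_trace(synthesized, n_cycles, order=CYCLE_ORDER):
    """Return one symbol per cycle: the base incorporated, or '-' if none.

    `synthesized` is the strand being built (the complement of the template).
    Assumption: at most one base is incorporated per cycle, with no errors.
    """
    trace = []
    pos = 0
    for base in islice(cycle(order), n_cycles):
        if pos < len(synthesized) and synthesized[pos] == base:
            trace.append(base)  # extend, image, cleave: one base recorded
            pos += 1
        else:
            trace.append("-")   # nothing incorporated in this cycle
    return "".join(trace)


def decode_trace(trace):
    """Collapse a per-cycle trace into a read by dropping empty cycles."""
    return "".join(symbol for symbol in trace if symbol != "-")


if __name__ == "__main__":
    trace = simulate_trace("CAGCT", 8)
    print(trace, decode_trace(trace))
```

Under these assumptions a read is simply the trace with the empty cycles removed, which is why read length depends on both the number of cycles and the template sequence.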
22. Length distributions (yeast DGE experiment)
Raw: Unfiltered reads, 6mer and above
Filtered: Quality score filter, AT < 0.9, BAO dinuc < 0.7, trim leading Ts, length >= 20, alignment against BAO, P102
Aligned: Normalized score >= 4
Company confidential
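Three of the filter criteria listed on this slide (trim leading Ts, AT fraction < 0.9, length >= 20) are simple enough to sketch. This is an illustration only, not the pipeline's implementation: the order in which the steps are applied is an assumption, the names `at_fraction` and `filter_read` are hypothetical, and the quality score, BAO dinucleotide, and BAO/P102 alignment filters are omitted because the slide does not define them.

```python
def at_fraction(read):
    """Fraction of bases in `read` that are A or T (0.0 for an empty read)."""
    if not read:
        return 0.0
    return sum(base in "AT" for base in read) / len(read)


def filter_read(read, max_at=0.9, min_length=20):
    """Return the trimmed read if it passes the filters, otherwise None.

    Assumed order: trim leading Ts first, then apply the length and
    AT-fraction thresholds to the trimmed read.
    """
    trimmed = read.upper().lstrip("T")
    if len(trimmed) < min_length:
        return None
    if at_fraction(trimmed) >= max_at:
        return None
    return trimmed


if __name__ == "__main__":
    reads = ["TTT" + "ACGT" * 6, "A" * 30, "ACGTACGT"]
    print([filter_read(r) for r in reads])
```

The thresholds are exposed as parameters so the sketch can be rerun with different cutoffs.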
23. Error rates and alignments (yeast DGE experiment)
Error rates were assessed using samples of alignments with normalized alignment score ≥ 4 to a high-expresser (YLR110C/CCW12).
Total: 6.55%  Sub: 0.44%  Del: 4.72%  Ins: 1.39%
Reference: GACGT-TATG G GTGATGGTAGTAACGATGATGACGAAGA-TAATGTAGACCCGCTGC-A C CGTGCTAAACAATCC
Consensus: GACGT-TATG A GTGATGGTAGTAACGATGATGACGAAGA-TAATGTAGACCCGCTGC-A T CGTGCTAAACAATCC
[Figure: individual reads aligned beneath the consensus; column positions lost in extraction]
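The substitution, deletion, and insertion rates on this slide sum to the total (0.44% + 4.72% + 1.39% = 6.55%). As a rough sketch of how such per-class rates can be counted from a gapped pairwise alignment, the code below walks the alignment columns. It is not the method used for the slide: the choice of aligned reference bases as the denominator is an assumption, and the name `error_rates` is hypothetical.

```python
def error_rates(ref_aln, read_aln):
    """Count substitutions, deletions, and insertions in a gapped alignment.

    Both strings must have equal length, with '-' marking a gap.
    Assumption: rates are reported per aligned reference base.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned strings must have the same length")

    sub = deletion = ins = ref_bases = 0
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base == "-" and read_base == "-":
            continue                 # empty column, ignore
        if ref_base == "-":
            ins += 1                 # base present only in the read
            continue
        ref_bases += 1
        if read_base == "-":
            deletion += 1            # reference base missing from the read
        elif read_base != ref_base:
            sub += 1                 # mismatched base

    if ref_bases == 0:
        raise ValueError("alignment contains no reference bases")

    return {
        "sub": sub / ref_bases,
        "del": deletion / ref_bases,
        "ins": ins / ref_bases,
        "total": (sub + deletion + ins) / ref_bases,
    }


if __name__ == "__main__":
    print(error_rates("ACG-TACGT", "AC-ATTCGT"))
```

In practice the counts would be accumulated over many sampled alignments before dividing, rather than computed per read.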
24.
25. Hybrid development model. [Diagram labels: Source code repository; Read-only source code subset; User-owned packages; Secure sync; Company firewall]