Sequence Services Phase 2--Eagle Genomics and Cycle Computing

William Spooner (Eagle) and Carl Chesal (Cycle) introduce the proof of concept provided by this consortium for Phase 2 of the Pistoia Alliance Sequence Services project. The presentation was delivered at the Pistoia Alliance Conference in Boston, MA, on April 24, 2012.


  1. Sequence Services Phase 2. Pistoia Alliance AGM, Boston MA, April 24th 2012
  2. Open Innovation
     • Collaborate: work together to find a common purpose
     • Explore
     • Nurture: build trust, shared language
     • Exploit: turn ideas into tangible benefits
     • Enterprise, Academia, Government, Foundations
     ElasticAP, Pistoia Alliance Conference, Boston MA, 24th April 2012
  3. The Requirements
     • Functional: Login and workspace; Manage users; Manage data; Upload private data; Access public data; Export; Delete/archive; Manage applications; Share; Upload scripts/pipelines; Analyse data; Monitor use/performance
     • Non-functional: Charging Model; Service Support; Operational Requirements; Security Requirements
  4. The Partnership
     • Established: Cycle Computing 2005; Eagle Genomics 2008
     • Domain: Cycle Computing, high performance computing; Eagle Genomics, operational bioinformatics
     • Employees: Cycle Computing, 18 (16 engineers); Eagle Genomics, 12 (9 engineers) plus a pool of external consultants
     • Location: Cycle Computing, across USA/Canada; Eagle Genomics, Cambridge, UK
     • Sectors: Cycle Computing, pharmaceutical, biotechnology, financial, computer gaming, engineering, academia; Eagle Genomics, pharmaceutical, biotechnology, agri-biotechnology, consumer goods, food, other life sciences
     • Customers: Cycle Computing, North America, Europe; Eagle Genomics, North America, Europe, Asia
     • Partnerships: Schrodinger, VMWare, Amazon Web Services, Canonical, Cognizant, European Bioinformatics Institute,
  5. The Platform: the platform for storage, analysis and sharing of life sciences data in the cloud
  6. The Proposal [Workflow diagram. Roles: CIO, Bioinformatician, Depositor, Biologist, Collaborator. Flow labels: Start, Upload, Stored data, Analyses (manual process, pipeline process), Share, Stop.]
  7. The Architecture [Architecture diagram. Users: Depositor, Bioinformatician, Collaborator. Authentication labels: Customer Single Sign On, Authenticate, SAML Token Exchange, IdP, OpenAM, Shibboleth, HTTPS. Amazon EC2 cloud labels: Gateway, Web Servers, SEEK, CycleCloud, Condor, MySQL DB, Assets, Data Files, Encrypt/Decrypt, Ensembl, BioLinux, Customer Sandboxes (EC2/AMIs), S3 Storage.]
  8. The Present: Collaborator, Depositor, Bioinformatician
  9. The 1000 Genomes
     • A Deep Catalogue of Human Variation
       – Freely available on AWS
       – 1,700 individuals
       – 200 TB data
       – 10,000s of data files
       – Almost no metadata!
     • ElasticAP evaluating 1000 Genomes Project Pilot 2
       – 20X resequencing
       – 2 trios (6 individuals)
  10. TRUP: Tumor RNA-seq Unified Pipeline
     • Collaboration between
       – Max Planck Institute for Molecular Genetics
       – Bayer Pharma AG
     • Identifies gene fusion events in tumor samples
     • Involves both alignment and de novo sequencing steps
     • Pipeline is being implemented on ElasticAP
       – Using public GEO datasets for
  11. The PoC
     • Functional: Login and workspace; Load data; Manage public data; Load scripts and pipelines; Analyse data; Export data; Archive data; Manage applications; Manage users; Monitor use/performance
     • Non-functional: Charging Model; Service Support; Operational Requirements; Security Requirements
     • Key: Fully implemented; Partially implemented; To-do list
  12. The Prior Art
     • Eagle have been building analysis pipelines and hosting secure cloud apps for years.
     • Cycle have been developing HPC solutions and deploying them on the cloud for years.
     • We built this as a platform we could use ourselves in order to carry on delivering what we already do.
     • But now the results are interactive, and everyone can share and participate.
     • The most common tasks won’t need to involve us at all.
  13. The Price
     • AWS-style pay-as-you-go business model
       – Free sign-up and account creation.
       – Tiered applications by the hour.
       – Discounts for up-front reservation fee.
       – Offline data import/export also available.
       – Flat-rate data by the gigabyte-month.
       – Backup data by the gigabyte-month.
       – Monthly billing.
       – Support contracts available.
     • Customisation and new pipelines at Eagle/Cycle standard consulting rates.
  14. The Plan
     • Early access to preferred partner customers in July. Talk to us now if you’d like to be part of that.
     • Full production in September with all partial/todo items implemented.
     • Increased number of public datasets.
     • Increased range of applications and pipelines.
     • User interface improvements based on feedback from early access period.
  15. The Potential
     • Available as customisation projects:
       – Conversions to other clouds.
       – Conversions to run on in-house infrastructure.
     • Truly secure and scalable R&D collaboration environment.
       – Applicable to all sciences, not just genomics.
     • Change the way you do science
  16. Will Spooner, will.spooner@eaglegenomics.com, www.eaglegenomics.com