Cloud BioLinux: open source, fully-customizable
 bioinformatics computing on the cloud for the
       genomics community and beyond

          BOSC 2011 - Vienna, Austria



                  Ntino Krampis, PhD
                     Asst. Professor
            J. Craig Venter Institute (JCVI)
                 agbiotec@gmail.com
The community is what makes an open source project


Brad Chapman, Tim Booth, Mesude Bicak, Dawn Field, Dan Pass –
core development and planning

Enis Afgan, Pjotr Prins, Stephen Möller -
and all other members of the Cloud BioLinux community who move it forward

J. Craig Venter Inst. -
time allowed to work on an open-source project
Expensive sequencing and large organizations

●   large sequencing center, multi-million, broad-impact sequencing projects
●   dedicated bioinformatics department, compute clusters

Commodity sequencing and small labs

●   small-form-factor, bench-top sequencer available: GS Junior by 454
●   sequencing as a standard technique in basic biology and genetics research
●   RNA-seq and ChIP-seq, and each biologist will be tackling a metagenome
Will small labs become the long tail of sequencing?

[Figure: long-tail curve; y-axis: amount of sequencing; x-axis: number of labs. Credit: Wikimedia Commons]
“Bioinformatics nation is a land of city-states” Lincoln Stein

●   small labs building small-scale bioinformatics infrastructures
●   duplication of effort in compiling and installing software tools
●   some groups have no hardware, expertise, or time to install and run software

●   NEBC BioLinux ( tinyurl.com/BioLinux-NEBC ) 100+ pre-configured tools
●   example: glimmer, hmmer, phylip, rasmol, genespring, clustalw, EMBOSS



    how about large-scale sequence datasets ?
Cloud BioLinux
    pre-configured and on-demand bioinformatics computing on the cloud

    JCVI cloud computing research
+   NEBC BioLinux software repository
+   community effort – Hackathon / BOSC 2010 - 11
+   Virtual Machine (VM) on Amazon cloud
=
●   large-scale computing independently of institutional or geographic boundaries
●   only need a desktop computer with internet access

cloudbiolinux.org
simple for end-users: sign up at aws.amazon.com

http://tinyurl.com/cloud-biolinux-tutorial
Amazon EC2 → Linux desktop via remote desktop client
What if I want to share my alignments with a collaborator?

save your data as a new VM

$0.10 / GB / month

at 15 GB, it costs $1.50 / month
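The storage arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `monthly_storage_cost` is hypothetical, and the default rate is the $0.10 / GB / month figure quoted on the slide, which may not reflect current pricing.

```python
def monthly_storage_cost(size_gb, price_per_gb_month=0.10):
    """Return the monthly cost in dollars of storing size_gb gigabytes.

    The default rate is the figure quoted in the talk (an assumption,
    not a current price).
    """
    if size_gb < 0:
        raise ValueError("size_gb must be non-negative")
    return size_gb * price_per_gb_month


# The 15 GB example from the slide.
cost = monthly_storage_cost(15)
print(f"${cost:.2f} / month")
```

Passing a different `price_per_gb_month` gives the same estimate for another provider or a later price list.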
“whole system snapshot exchange” (Dudley and Butte 2010)
capture the state of the computing system and data
software execution parameters and “massaged” input datasets
Cloud BioLinux developer's framework
        create cloud VM / images with standardized software configurations


●   customize Cloud BioLinux based on community requirements

●   mix and match software from NEBC or other (DebianMed, Scientific Linux etc.)

●   share customized VMs with collaborators, avoiding effort duplication

●   deploy Cloud BioLinux on private and local clouds
software domains in bioinformatics: next-gen sequencing, de novo assembly,
annotation, phylogeny, molecular structures, gene expression analysis


        github.com/chapmanb/cloudbiolinux
Cloud BioLinux developer's framework


    ●   based on python-fabric auto-deployment tool

    ●   software components listed in plain text files

    ●   collaborators use files to share descriptions of cloud VM / images

    ●   start with a bare-bones VM / image

    ●   fabric downloads and installs specified software




tinyurl.com/python-fabric
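The idea of describing a VM through plain text files can be sketched as follows. This is a minimal illustration, not the Cloud BioLinux code: the file format (one package per line, `#` comments), the function names, and the `apt-get` installer string are all assumptions, and the package names are borrowed from the tool list earlier in the talk, so they may not match real package names. The sketch only builds the commands; in the actual framework, Fabric runs the installation on the remote machine.

```python
def parse_package_list(text):
    """Return package names from a plain-text list, one name per line.

    Blank lines and '#' comments are ignored (an assumed convention).
    Duplicates are dropped while keeping first-seen order.
    """
    packages = []
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name and name not in packages:
            packages.append(name)
    return packages


def install_commands(packages, installer="apt-get install -y"):
    """Build one shell command per package; nothing is executed here."""
    return [f"{installer} {name}" for name in packages]


# Hypothetical package list a collaborator might share.
EXAMPLE_LIST = """
# sequence analysis
hmmer
clustalw
emboss   # EMBOSS suite
hmmer
"""

if __name__ == "__main__":
    for command in install_commands(parse_package_list(EXAMPLE_LIST)):
        print(command)
```

Because the description is just text, two groups can compare or merge their VM configurations with ordinary tools such as diff or version control.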
Cloud BioLinux: The future


●   groups.google.com/cloudbiolinux and cloudbiolinux.org

●   expand community, receive feedback, add more software to the VM

●   scalable computing: SGE (Galaxy Cloudman), Hadoop (cloudgene.uibk.ac.at)

●   add next-gen sequencing pipelines; NIH funding adds development effort

●   We just had a 2-day codefest at the MetaLab, http://metalab.at/
and before I finish this talk...
Thank you !
