



Computing Infrastructures for Clinical NGS
Barriers to centralizing data analysis at the
University of Pittsburgh

M. Michael Barmada
Department of Human Genetics
Graduate School of Public Health, University of Pittsburgh
University of Pittsburgh

• Geographically dispersed campus (132 acres in Pittsburgh, plus
 regional campuses)
• Large affiliated hospital system (UPMC - 23 hospitals spread
 out over tri-state area)
• Has been ranked in the top cluster of research institutions in
 the US
  • 7th in nation in terms of funding from NIH



NGS at Pitt




Common Analysis hurdles for NGS

•Hardware
  •Computing capacity
  •Network
  •Storage
•Software


Common problems - (1) Hardware

• When NGS machines started appearing on campus, there
 were 14 “high-performance” computing centers
  • Most were small group- or department-specific clusters
   (<100 cores) with limited storage and standard (GigE)
   networking
  • Larger computing resources were available at the Pittsburgh
   Supercomputing Center, but with limited availability




Center for Simulation and Modeling (SAM)


• One large (>3000 cores) cluster existed on campus -
 established by computational chemistry and engineering
 groups
• Large-capacity machines (>12 cores/48 GB RAM per node;
 many with 48 cores/128-256 GB RAM)
  • This resolved the RAM and capacity problems



Common Problems - (2) Storage

• Despite early successes with the SAM cluster, problems started
 appearing as the number of users went up
  • Storage array - the SAM cluster uses a shared NFS array for
   /home; reading and writing large files (read/quality files)
   became a serious bottleneck
    • Upgraded the array to a high-performance system (Panasas),
      which allows parallel (DirectFS/pNFS) access and greater
      throughput than standard RAID




Common Problems - (3) Networking


• Networking within the SAM cluster is a combination of
 InfiniBand and Gigabit Ethernet - not a problem
• Networking on campus was a problem
  • Old network segments (100 Mbit/s) and firewalls (multiple hops)
   limited maximum transfer speeds to only 10-15 Mbit/s
  • Upgrades “in progress”




Common Problems - (3) Networking

• Solutions
  • “Sneaker-net” - works, but leads to a proliferation of drives
   and potential for data loss/corruption
  • Globus/GridFTP - faster than the campus network (transfer
   speeds of 1-2 Gbit/s)
  • Cloud-based services (Seven Bridges) - surprisingly
   economical and efficient; sequencing centers upload data
   for individual groups, who then use the data for analysis
   (online or at a local cluster) and for backup (desktops)
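
To put the transfer speeds quoted on the networking slides in perspective, here is a back-of-the-envelope calculation. The 100 GB dataset size is an assumption chosen for illustration, not a figure from the talk, and the rates are treated as sustained throughput with no protocol overhead.

```python
def transfer_hours(size_gb: float, rate_mbit_per_s: float) -> float:
    """Hours needed to move size_gb gigabytes at a sustained rate in Mbit/s."""
    size_mbit = size_gb * 8 * 1000  # 1 GB = 8000 Mbit (decimal units)
    return size_mbit / rate_mbit_per_s / 3600


DATASET_GB = 100  # assumed dataset size, for illustration only

for label, rate in [
    ("campus network, 10 Mbit/s", 10),
    ("campus network, 15 Mbit/s", 15),
    ("Globus/GridFTP, 1 Gbit/s", 1000),
    ("Globus/GridFTP, 2 Gbit/s", 2000),
]:
    print(f"{label}: {transfer_hours(DATASET_GB, rate):.1f} h")
```

Under these assumptions the same dataset takes roughly 22 hours at 10 Mbit/s and about 13 minutes at 1 Gbit/s, which helps explain why carrying drives by hand looked attractive on the old campus segments.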

Common Problems - (4) Software

• Pipelines were created to link together common tools (BWA/NovoAlign/
 GATK/Annovar), but these require familiarity with
 the command line/Unix environment
• With increasing use of NGS by medical/clinical research
 groups, we had more and more users who were not
 comfortable in a Unix environment
• Solutions: train users or develop non-Unix-based interfaces
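
As a rough illustration of what "linking together common tools" means in practice, the sketch below builds (but does not run) the commands for a minimal align-sort-call chain. It is not the pipeline used at Pitt. The file names, the `Step` and `build_pipeline` helpers, the use of samtools, and the GATK4-style `gatk HaplotypeCaller` invocation are all assumptions; a real pipeline would also need reference indexing, read groups, and QC steps.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    name: str
    argv: List[str]
    stdout: Optional[str] = None  # file that captures the tool's standard output


def build_pipeline(sample: str, ref: str, fq1: str, fq2: str) -> List[Step]:
    """Return the ordered commands for one sample (hypothetical file layout)."""
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    return [
        # bwa mem writes SAM to standard output, so it is redirected to a file
        Step("align", ["bwa", "mem", ref, fq1, fq2], stdout=sam),
        Step("sort", ["samtools", "sort", "-o", bam, sam]),
        Step("index", ["samtools", "index", bam]),
        Step("call", ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam, "-O", vcf]),
    ]


def render(step: Step) -> str:
    """Render a step as the shell command a user would otherwise type."""
    cmd = shlex.join(step.argv)
    if step.stdout:
        cmd += " > " + shlex.quote(step.stdout)
    return cmd


steps = build_pipeline("sample1", "ref.fa", "sample1_R1.fastq.gz", "sample1_R2.fastq.gz")
for step in steps:
    print(f"[{step.name}] {render(step)}")
```

Even this dry run shows the barrier named on the slide: each step has its own flags and file conventions, and the user has to know that the output of one becomes the input of the next.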




Research Gateways

• Several Bioinformatics/NGS gateways are in the process of
 being implemented
• Each allows access to the computational resources of the SAM
 cluster using web-based or client-based interfaces




Galaxy




CLCbio Genomic Server




CLCbio Genomic Server

[Diagram: CLCbio Genomics Workbench and CLCbio Genomics Server]




Genboree




Issues with research gateways

• Common data storage and data dedup
  • A current focus is configuring all NGS gateways to share
   the same storage space and files, so that we do not need
   to duplicate data across multiple storage spaces
  • CLC bio Genomics Server “plays well with others”, as does
   Yabi, but Galaxy and Genboree do a lot of file permission
   modifications
    • Solution: create a meta-data store that ensures files are
      owned by the appropriate users and have appropriate
      permissions, coupled with cron tasks to monitor user/
      permission changes
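
The meta-data store plus cron monitoring described above could look something like the following minimal sketch. It is an assumption about the general shape of such a check, not the actual implementation: the `Expected` record, the `audit` function, and the in-memory dictionary standing in for the meta-data store are all hypothetical names, and the code assumes a POSIX system.

```python
import os
import stat
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Expected:
    uid: int   # numeric owner the file should have
    mode: int  # permission bits only, e.g. 0o640


def audit(store: Dict[str, Expected]) -> List[str]:
    """Compare each file on disk with its recorded owner and mode."""
    problems = []
    for path, want in store.items():
        try:
            st = os.stat(path)
        except FileNotFoundError:
            problems.append(f"{path}: missing")
            continue
        mode = stat.S_IMODE(st.st_mode)  # strip file-type bits, keep permissions
        if st.st_uid != want.uid:
            problems.append(f"{path}: owner uid {st.st_uid}, expected {want.uid}")
        if mode != want.mode:
            problems.append(f"{path}: mode {oct(mode)}, expected {oct(want.mode)}")
    return problems


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample1.bam")
        open(path, "w").close()
        os.chmod(path, 0o640)
        store = {path: Expected(uid=os.getuid(), mode=0o640)}
        print(audit(store))    # file matches its record
        os.chmod(path, 0o666)  # simulate a gateway loosening permissions
        print(audit(store))    # the drift is reported
```

A cron task could run a check like this periodically and either report the drift or reset ownership and mode to the recorded values; which of the two is appropriate depends on how each gateway reacts to having its changes undone.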
Cloud Computing

• Another alternative: Cloud computing/hybrid solutions
 (“Cloud-bursting”)
  • Currently setting up cloud-based storage/staging of NGS
   data to circumvent networking issues on campus
  • Natural extension to allow users to analyze data “in the
   cloud”
  • Similar offerings from several companies - we’re working
   with Seven Bridges Genomics, which has a nice billing interface
   for each individual use (storage/staging/analysis)


Seven Bridges Genomics




NGS clinical process

[Flow diagram; legend: UPMC, Pitt]
• Patient/Family consented
• Samples drawn
• DNA/library prep
• Sequencing
• Analysis (QC/filtering, Alignment, Variant Calling, Annotation)
• Validation
• Interpretation
• Medical Report
• EMR
• Data Storage (BAM, VCF files)
• Data Storage (Raw, BAM, VCF)


The last analysis challenge

• Even after fixing all of these issues, two major hurdles remain
  • Community
    • Organizing and coordinating all NGS efforts on campus
      would greatly speed up the pace of research
  • Education!
    • We need to educate clinicians and clinical-support staff
      (genetic counselors) to understand the limitations and
      the advantages of sequence data from the perspective of
      clinical utility


Thanks!





Bangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% SafeBangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% Safenarwatsonia7
 
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service JaipurHigh Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipurparulsinha
 
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...Miss joya
 
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
Sonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Sonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call NowSonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Sonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call NowRiya Pathan
 

Recently uploaded (20)

Escort Service Call Girls In Sarita Vihar,, 99530°56974 Delhi NCR
Escort Service Call Girls In Sarita Vihar,, 99530°56974 Delhi NCREscort Service Call Girls In Sarita Vihar,, 99530°56974 Delhi NCR
Escort Service Call Girls In Sarita Vihar,, 99530°56974 Delhi NCR
 
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service MumbaiVIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
 
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
 
Call Girls Service Chennai Jiya 7001305949 Independent Escort Service Chennai
Call Girls Service Chennai Jiya 7001305949 Independent Escort Service ChennaiCall Girls Service Chennai Jiya 7001305949 Independent Escort Service Chennai
Call Girls Service Chennai Jiya 7001305949 Independent Escort Service Chennai
 
Call Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
Call Girl Bangalore Nandini 7001305949 Independent Escort Service BangaloreCall Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
Call Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
 
Call Girls In Andheri East Call 9920874524 Book Hot And Sexy Girls
Call Girls In Andheri East Call 9920874524 Book Hot And Sexy GirlsCall Girls In Andheri East Call 9920874524 Book Hot And Sexy Girls
Call Girls In Andheri East Call 9920874524 Book Hot And Sexy Girls
 
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
 
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original PhotosCall Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
 
Russian Call Girls in Pune Riya 9907093804 Short 1500 Night 6000 Best call gi...
Russian Call Girls in Pune Riya 9907093804 Short 1500 Night 6000 Best call gi...Russian Call Girls in Pune Riya 9907093804 Short 1500 Night 6000 Best call gi...
Russian Call Girls in Pune Riya 9907093804 Short 1500 Night 6000 Best call gi...
 
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
 
Call Girl Surat Madhuri 7001305949 Independent Escort Service Surat
Call Girl Surat Madhuri 7001305949 Independent Escort Service SuratCall Girl Surat Madhuri 7001305949 Independent Escort Service Surat
Call Girl Surat Madhuri 7001305949 Independent Escort Service Surat
 
Vip Call Girls Anna Salai Chennai 👉 8250192130 ❣️💯 Top Class Girls Available
Vip Call Girls Anna Salai Chennai 👉 8250192130 ❣️💯 Top Class Girls AvailableVip Call Girls Anna Salai Chennai 👉 8250192130 ❣️💯 Top Class Girls Available
Vip Call Girls Anna Salai Chennai 👉 8250192130 ❣️💯 Top Class Girls Available
 
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
 
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
 
Bangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% SafeBangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% Safe
 
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
 
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service JaipurHigh Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
 
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
 
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
 
Sonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Sonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call NowSonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Sonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
 

Pitt's Barriers to Centralizing Clinical NGS Data Analysis

  • 1. Computing Infrastructures for Clinical NGS: Barriers to centralizing data analysis at the University of Pittsburgh. M. Michael Barmada, Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh
  • 2. University of Pittsburgh
      • Geographically dispersed campus (132 acres in Pittsburgh, plus regional campuses)
      • Large affiliated hospital system (UPMC - 23 hospitals spread out over tri-state area)
      • Has been ranked in the top cluster of research institutions in the US
          • 7th in nation in terms of funding from NIH
  • 3. NGS at Pitt
  • 4. Common Analysis hurdles for NGS
      • Hardware
          • Computing capacity
          • Network
          • Storage
      • Software
  • 5. Common problems - (1) Hardware
      • When NGS machines started appearing on campus, there were 14 “high-performance” computing centers
          • Most were small group- or department-specific clusters (<100 cores) with limited storage and standard (GigE) networking
          • Larger computing resources were available at the Pittsburgh Supercomputing Center, but with limited availability
  • 6. Center for Simulation and Modeling (SAM)
      • One large (>3000 cores) cluster existed on campus - established by computational chemistry and engineering groups
      • Large capacity machines (>12 cores/48 GB RAM per node - many with 48 cores/128-256 GB RAM)
      • This cleared up the RAM and capacity problems
  • 7. Common Problems - (2) Storage
      • Despite early successes with the SAM cluster, problems started appearing as the number of users went up
      • Storage array - the SAM cluster uses a shared NFS array for /home - reading and writing large files (read/quality files) became a serious bottleneck
      • Upgraded array to a high-performance system (Panasas) - allowed for parallel (DirectFS/pNFS) access, greater throughput than standard RAID
  • 8. Common Problems - (3) Networking
      • Networking within the SAM cluster is a combination of InfiniBand and Gigabit Ethernet - not a problem
      • Networking on campus was a problem
      • Old network segments (100Mb), firewalls (multiple hops) - maximum transfer speeds only 10-15Mbit
      • Upgrades “in progress”
  • 9. Common Problems - (3) Networking
      • Solutions
          • “Sneaker-net” - works, but leads to proliferation of drives, potential for data loss/corruption
          • Globus/GridFTP - faster than campus network (transfer speeds of 1-2Gbit)
          • Cloud-based services (Seven Bridges) - surprisingly economical and efficient - sequencing centers upload data for individual groups, who then use the data for analysis (online or at local cluster) and for backup (desktops)
  • 10. Common Problems - (4) Software
      • Pipelines created for linking together common tools (BWA/NovoAlign/GATK/ANNOVAR) - but these require familiarity with the command line/Unix environment
      • With increasing use of NGS by medical/clinical research groups, we had more and more users who were not comfortable in a Unix environment
      • Solutions: train users or develop non-Unix-based interfaces
  • 11. Research Gateways
      • Several Bioinformatics/NGS gateways are in the process of being implemented
      • Each allows access to the computational resources of the SAM cluster using web-based or client-based interfaces
  • 12. Galaxy
  • 13. CLCbio Genomics Server
  • 14. CLCbio Genomics Server (diagram labels: CLCbio Genomics Workbench, CLCbio Genomics Server)
  • 15. Genboree
  • 16. (No slide text.)
  • 17. Issues with research gateways
      • Common data storage and data dedup
          • A current focus is configuring all NGS gateways to share the same storage space and files, so we do not need to duplicate data across multiple storage spaces
          • CLC bio Genomics Server “plays well with others”, as does Yabi, but Galaxy and Genboree do a lot of file permission modifications
          • Solution: create a meta-data store that ensures files are owned by the appropriate users and have appropriate permissions - coupled with cron tasks to monitor user/permission changes
  • 18. Cloud Computing
      • Another alternative: Cloud computing/hybrid solutions (“Cloud-bursting”)
      • Currently setting up cloud-based storage/staging of NGS data to circumvent networking issues on campus
      • Natural extension to allow users to analyze data “in the cloud”
      • Similar offerings from several companies - we’re working with Seven Bridges Genomics - nice billing interface for each individual use (storage/staging/analysis)
  • 19. Seven Bridges Genomics
  • 20. NGS clinical process (flow diagram; recoverable labels: UPMC; Pitt; Patient/Family consented; Samples drawn; DNA/library prep; Sequencing; QC/filtering; Alignment; Variant Calling; Annotation; Analysis; Interpretation; Validation; Medical Report; EMR; Data Storage (BAM, VCF files); Data Storage (Raw, BAM, VCF))
  • 21. The last analysis challenge
      • Even after fixing all of these issues, two major hurdles remain
      • Community
          • Organizing and coordinating all NGS efforts on campus would greatly speed up the pace of research
      • Education!
          • We need to educate clinicians and clinical-support staff (genetic counselors) to understand the limitations and the advantages of sequence data from the perspective of clinical utility
  • 22. Thanks!
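
To put the transfer speeds quoted on slides 8 and 9 in perspective, the following is a minimal sketch that converts a link rate into hours per dataset. The 100 GB dataset size is an assumption chosen for illustration and does not come from the slides; the function name is likewise hypothetical.

```python
def transfer_hours(size_gb: float, rate_mbit_per_s: float) -> float:
    """Hours needed to move size_gb gigabytes (10**9 bytes each)
    over a link sustaining rate_mbit_per_s megabits per second."""
    bits = size_gb * 1e9 * 8                  # payload in bits
    seconds = bits / (rate_mbit_per_s * 1e6)  # link rate in bits per second
    return seconds / 3600


if __name__ == "__main__":
    size_gb = 100  # assumed dataset size, for illustration only
    # Rates taken from the slides: 10-15 Mbit on campus, 1-2 Gbit via Globus/GridFTP.
    for label, rate in [("campus, 10 Mbit", 10), ("campus, 15 Mbit", 15),
                        ("GridFTP, 1 Gbit", 1000), ("GridFTP, 2 Gbit", 2000)]:
        print(f"{label}: {transfer_hours(size_gb, rate):.2f} h")
```

Under that assumed size, the campus network needs most of a day per dataset while the Globus/GridFTP rates bring it down to minutes, which is consistent with the slides treating the campus network as the bottleneck.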

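The meta-data store plus cron monitoring described on slide 17 could look roughly like the sketch below. It is not the implementation used at Pitt: the `Expected` record, the `audit` function, and the idea of holding the store as an in-memory dictionary are all assumptions made for illustration. The check compares each file's current owner and permission bits with what the store says they should be.

```python
import os
import stat
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Expected:
    """Hypothetical metadata-store record for one file."""
    uid: int   # numeric user id that should own the file
    mode: int  # permission bits the file should have, e.g. 0o640


def audit(store: Dict[str, Expected]) -> List[str]:
    """Return one message per file whose owner or permissions have drifted."""
    problems: List[str] = []
    for path, want in store.items():
        try:
            st = os.stat(path)
        except FileNotFoundError:
            problems.append(f"{path}: missing")
            continue
        have_mode = stat.S_IMODE(st.st_mode)  # keep only the permission bits
        if st.st_uid != want.uid:
            problems.append(f"{path}: owner uid {st.st_uid}, expected {want.uid}")
        if have_mode != want.mode:
            problems.append(f"{path}: mode {oct(have_mode)}, expected {oct(want.mode)}")
    return problems
```

A cron task could call `audit` on a schedule and report, or repair, any drift introduced by a gateway that rewrites file permissions.
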
Editor's Notes

  2. Some quick facts about the University of Pittsburgh - we are very spread out, as is common in large academic/medical centers. UPMC in particular is one of the largest hospital systems in the US, serving a population of approximately 15 million people in what we call the tri-state area, with major emphasis in GI, transplant, cancer, and aging centers. Given the large amount of research funding brought in by university and UPMC researchers, you can imagine we are a very active research center.
  3. This map shows the location of NGS machines at Pitt that have shown up in the last three years - a total of 11 machines of all types (454, Illumina, SOLiD, Ion Torrent). The lack of centralization of these resources has hurt efforts to get good (reproducible) data from these centers.