The past, present, and future of HPC in life sciences
Erich Birngruber, Ümit Seren
Gregor Mendel Institute of Molecular Plant Biology (GMI)
AHPC17
Who we are
- Basic research institute in plant sciences
- 9 independent research groups
- Employees: 100 scientific + 20 admin
- HPC Operations Team: 2 engineers + 1 lead
Past: Beginnings as traditional HPC
Scientific computing at GMI
- Started in 2010
- SGI ICE-X since 2013 (MENDEL; 72 nodes at first, 144 today)
- SGI UV2000
- Rich software environment (EasyBuild, Lmod)
- Keeping up with current developments
Machine specs
3 generations of nodes:
- 72x 16c E5-2609, 192 GB mem
- 18x 20c E5-2680, 256 GB mem
- 54x 24c E5-2650, 256 GB mem, 230 GB SSD
UV2000: 96c E5-4617, 2 TB mem
IB FDR interconnect (1 fabric)
Storage: Lustre 300 TB, NetApp > 1 PB
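As a rough cross-check of the node list above, the aggregate size of the three node generations can be tallied. This is only a sketch: it assumes that "16c" means cores per node and that the memory figures are per node, and the names NODE_GENERATIONS and cluster_totals are illustrative, not taken from the talk.

```python
# Per-generation node specs as listed above:
# (node count, cores per node, memory per node in GB).
# Assumption: "16c" means cores per node; memory figures are per node.
NODE_GENERATIONS = [
    (72, 16, 192),
    (18, 20, 256),
    (54, 24, 256),
]


def cluster_totals(generations):
    """Return (nodes, cores, memory_gb) summed over all node generations."""
    nodes = sum(n for n, _, _ in generations)
    cores = sum(n * c for n, c, _ in generations)
    memory_gb = sum(n * m for n, _, m in generations)
    return nodes, cores, memory_gb


if __name__ == "__main__":
    nodes, cores, memory_gb = cluster_totals(NODE_GENERATIONS)
    print(f"{nodes} nodes, {cores} cores, {memory_gb} GB memory")
```

Under these assumptions the node count comes to 144, which matches the "144 today" figure given for MENDEL.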
Past: System architecture
Present: GMI site specifics
- Services: customers are biologists
- On-campus initial training
- Consulting and support (with ticket system, intranet wiki)
- Software installations
  - Provided as modules: different versions, repeatability
  - This is getting harder with the demand for more complex software
- Monitoring software usage
Present: Monitoring software usage
- Software in env modules
- 460 software packages in 1297 versions
- Monitoring module usage (load, unload)
- Reporting by user, job, project
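The reporting idea above (module load events grouped by user, job, or project) could be sketched as follows. The slides do not show the actual log format or tooling, so the record fields, the sample data, and the function name count_loads are all hypothetical.

```python
from collections import Counter

# Hypothetical event records; the real log format is not given in the slides.
SAMPLE_EVENTS = [
    {"user": "alice", "job": "1001", "project": "gwas", "action": "load", "module": "toolA/1.0"},
    {"user": "alice", "job": "1001", "project": "gwas", "action": "unload", "module": "toolA/1.0"},
    {"user": "alice", "job": "1002", "project": "gwas", "action": "load", "module": "toolA/1.0"},
    {"user": "bob", "job": "1003", "project": "imaging", "action": "load", "module": "toolA/2.0"},
    {"user": "bob", "job": "1003", "project": "imaging", "action": "load", "module": "toolB/0.9"},
]


def count_loads(events, key):
    """Count 'load' events per (module, value of `key`).

    `key` is one of "user", "job", or "project".
    """
    counts = Counter()
    for event in events:
        if event["action"] == "load":
            counts[(event["module"], event[key])] += 1
    return counts


if __name__ == "__main__":
    for key in ("user", "job", "project"):
        print(f"Loads by {key}:")
        for (module, value), n in sorted(count_loads(SAMPLE_EVENTS, key).items()):
            print(f"  {module:12s} {value:10s} {n}")
```

Counting only load events keeps the report focused on which versions are actually requested; unload events would matter only if usage duration were of interest.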
Present: Monitoring system activity
Monitoring and metrics
The foundation for all future decisions
- Resource consumption
- Capacity planning
- Software, technology usage
- Auditing
Alerting
Node status
Job resources
Present: Applications & Appliances
Own developments:
Phenobox (in development)
- Web-interface, API
- MySQL (DB)
- DSLR, Raspberry Pi
- HPC (computer vision, storage)
GWA-Portal (https://gwas.gmi.oeaw.ac.at)
- Web-interface, API
- Elasticsearch (full-text search)
- PostgreSQL (DB)
- Docker (Python microservices)
- HPC (analysis, storage)
3rd party software:
Galaxy (https://galaxyproject.org/)
- Web-interface, API
- MySQL (DB)
- Visualization
- HPC (analysis, storage)
PacBio SMRT Link (https://github.com/PacificBiosciences/SMRT-Link)
- Web-interface, API
- MySQL (DB)
- HPC (analysis, storage)
Present: new developments
Deployment of OpenStack (IaaS):
- Cross-vendor open source project
- On-premises cloud
- Provision VMs and containers
- Deploy classic application services
- Enables self-service for customers
Consequences:
- More heterogeneous use-cases
- Customer base is increasing
- Non-human “customers” of HPC
- Services are more complex and distributed over subsystems
Past: System architecture
Present: MENDEL, OpenStack
Present: Problem 1: maintenance
- VMs are difficult to maintain
- Wrong abstraction for the use-case
- What is the next step?
- Containers?
- Container Orchestration Engines?
- Provide Software as a Service (SaaS)?
Fact is: the field is evolving
Present: MENDEL, OpenStack
Future: Problem 2: integration
Applications sit on different islands:
HPC vs. Cloud
Drawbacks:
- Hard to maintain (infra)
- Hard to debug (app)
Vision: converged compute platform.
Unified infrastructure to schedule all types of tasks
New challenges:
- Networking
- Storage
- IDM
- Accounting
What do others do?
Container Orchestration Engine (Google Kubernetes, Docker Swarm, Apache Mesos)
First steps:
- Containers for HPC
- Biocontainers http://biocontainers.pro
- Singularity http://singularity.lbl.gov
- Current status: test deployment
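To illustrate the "containers for HPC" first step, the sketch below assembles a `singularity exec` command line that a batch job could run. It only builds the argument list and does not invoke Singularity. The image name, the tool command, and the bind path are hypothetical placeholders, and the helper singularity_exec_command is not part of any library.

```python
import shlex


def singularity_exec_command(image, command, binds=()):
    """Build the argument list for `singularity exec`.

    image   -- path to a container image (hypothetical placeholder here)
    command -- the command and its arguments to run inside the container
    binds   -- host paths to make visible inside the container via --bind
    """
    args = ["singularity", "exec"]
    for bind in binds:
        args += ["--bind", bind]
    args.append(image)
    args += list(command)
    return args


if __name__ == "__main__":
    # Placeholder image, tool, and path; replace with real ones on a cluster.
    cmd = singularity_exec_command("tool.img", ["tool", "--version"], binds=["/scratch"])
    # Print a shell-safe command line, e.g. for pasting into a job script.
    print(shlex.join(cmd))
```

Returning a list rather than a string keeps quoting correct if the result is later passed to `subprocess.run`.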
Contact / References:
Erich Birngruber <erich.birngruber@gmi.oeaw.ac.at>, @ebirn
Ümit Seren <uemit.seren@gmi.oeaw.ac.at>, @timeu_s
GMI on GitHub:
https://github.com/Gregor-Mendel-Institute
Total recall: holistic metrics for broad systems performance and user experience visibility in a data-intensive computing environment
https://dl.acm.org/citation.cfm?id=2835001
Acknowledgements
Gregor Mendel Institute of Molecular Plant Biology
Dr. Bohr-Gasse 3
1030 Vienna, Austria
EOF
