Future Science
on Future OpenStack
Developing next generation infrastructure at CERN and SKA
Belmiro Moreira
Cloud Architect, CERN
Stig Telfer
CTO, StackHPC Ltd
SKA Performance Prototype Platform
Co-chair, OpenStack Scientific SIG
CERN - Large Hadron Collider (LHC)
CERN: Compact Muon Solenoid (CMS)
CERN: Cloud Infrastructure by Numbers
CERN Cloud Architecture
[Diagram labels: API Servers, Nova API Cell Controllers, Cell A, Cell B, Cell C, OpenStack Services, Ceph, DBoD, 22ms]
What is SKA?
Image courtesy of CSIRO
Science Data Processor
ALaSKA - à la SKA: SKA Performance Prototyping
SKA - Performance Prototype Platform
Bare Metal Hardware Lifecycle
CERN - Hardware Lifecycle
● ~ 2000 new servers per year
○ Two rounds of procurement - bulk purchases
■ Continuous delivery model cannot be used at CERN
○ Hardware location is defined per procurement round according to rack space, cooling and electrical availability
○ Annual capacity planning
● When new servers are added, others need to be retired
○ The process to empty machines depends on the workloads running on the servers
■ Batch workloads usually require a couple of weeks to free up the servers
■ Services/Personal workloads are migrated to the new hardware
CERN - Hardware Lifecycle
● Hardware is highly heterogeneous
○ 2-3 vendors per annual procurement cycle, each one with their own optimisations
● Advantages
○ A problem with a vendor doesn’t affect the entire capacity for a procurement cycle
■ When there are issues in one delivery (disk firmware, BMC controllers, …), the others are usually not affected
● Disadvantages
○ 15-20 different hardware configurations in the data centres
○ CERN tooling (bootstrap, monitoring, ...) needs to support different configurations
○ Challenge in defining the VM flavors exposed to the users
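The flavor challenge can be illustrated with a little arithmetic: a flavor that packs cleanly onto one hardware configuration can strand cores or memory on another. The Python sketch below is illustrative only. The host sizes and the flavor are made-up numbers, and overcommit and hypervisor overhead are ignored.

```python
def flavor_fit(host_cores, host_ram_gb, flavor_vcpus, flavor_ram_gb):
    """Return (instances that fit, leftover cores, leftover RAM in GB).

    Simplification (an assumption): no overcommit, no hypervisor overhead.
    """
    count = min(host_cores // flavor_vcpus, host_ram_gb // flavor_ram_gb)
    return (
        count,
        host_cores - count * flavor_vcpus,
        host_ram_gb - count * flavor_ram_gb,
    )


# Hypothetical hardware configurations: name -> (cores, RAM in GB).
configs = {
    "config-a": (32, 64),
    "config-b": (40, 128),
}

# Hypothetical flavor: 4 vCPUs, 8 GB RAM.
for name, (cores, ram_gb) in configs.items():
    count, spare_cores, spare_ram = flavor_fit(cores, ram_gb, 4, 8)
    print(f"{name}: {count} instances, {spare_cores} cores and {spare_ram} GB unused")
```

With 15-20 configurations, the same check would have to hold for every flavor on every configuration, which is why the choice of flavors is hard.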
CERN - Hardware Lifecycle
● OpenStack Ironic
○ Provision physical servers using OpenStack Nova API
■ VMs don’t fit all use cases
● Disk servers; DB servers; ...
○ All resources are managed using OpenStack (VMs; Containers; Bare Metal)
■ Same accounting and traceability for all resources
● Can Ironic be used to manage the entire Hardware Lifecycle?
■ Replace all the specific tooling built over the years to manage the Hardware Lifecycle workflow
CERN - Hardware Lifecycle
● Requirements to manage the Hardware Lifecycle
○ A database to store all hardware attributes
■ Manufacturer, product revision, firmware version, …
○ Flexible and complete Hardware introspection
○ Flexible API to add/query server attributes
○ Burn in and acceptance process
○ Define when resources are available to users
■ State workflow
○ Policy needs to allow segregated access for the different teams
○ Clear retirement procedure
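The "state workflow" requirement can be pictured as a small state machine that decides when a server becomes visible to users. The Python sketch below is illustrative only: the state names, the allowed transitions and the `Server` class are assumptions made for this example. They are not CERN's actual workflow, nor Ironic's provisioning states.

```python
from enum import Enum


class HwState(Enum):
    # Hypothetical lifecycle states, named for illustration only.
    DELIVERED = "delivered"
    BURN_IN = "burn_in"
    ACCEPTED = "accepted"
    AVAILABLE = "available"
    RETIRING = "retiring"
    RETIRED = "retired"


# Allowed transitions (an assumption for this sketch).
TRANSITIONS = {
    HwState.DELIVERED: {HwState.BURN_IN},
    # A failed burn-in sends the server back to DELIVERED.
    HwState.BURN_IN: {HwState.ACCEPTED, HwState.DELIVERED},
    HwState.ACCEPTED: {HwState.AVAILABLE},
    HwState.AVAILABLE: {HwState.RETIRING},
    HwState.RETIRING: {HwState.RETIRED},
    HwState.RETIRED: set(),
}


class Server:
    """One physical server with its attributes and lifecycle history."""

    def __init__(self, serial, attributes=None):
        self.serial = serial
        # e.g. manufacturer, product revision, firmware version
        self.attributes = dict(attributes or {})
        self.state = HwState.DELIVERED
        self.history = [HwState.DELIVERED]

    def move_to(self, new_state):
        """Apply a transition, rejecting any that the workflow does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"{self.serial}: {self.state.value} -> {new_state.value} not allowed"
            )
        self.state = new_state
        self.history.append(new_state)

    def usable_by_users(self):
        """Only servers in the AVAILABLE state are exposed to users."""
        return self.state is HwState.AVAILABLE
```

Keeping the transition table as data rather than code makes it easy to add a state, such as a repair loop, without touching the `Server` class.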
CERN - Hardware Lifecycle
● Current CERN model
○ Automated but complex
○ Set of tools/DBs developed in house
○ Difficult to track and account resource utilization
○ CERN specific
● What we envision with Ironic
○ Capable of managing the entire Hardware Lifecycle
■ Automated and Generic
■ Pluggable
■ Track resources
ALaSKA - Hardware Life Cycle
Auto-discovery and inspection
Hardware anomaly detection with Cardiff
Enrollment with Inspector rules
Ansible-driven BIOS and RAID configuration
Ansible-driven network switch configuration
Kayobe: Kolla-on-Bifrost
http://kayobe.readthedocs.io/
Application Cluster Storage
Requirements
● SKA Science Data Processor consumes a data feed of 1.5TB/s
● This data must be stored for 6 hours
● Processed datasets must be stored for 6 months
Solutions
● High performance filesystems
● High performance object stores
● High performance message queues
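The first two requirements fix a lower bound on the size of the ingest buffer. The short Python calculation below works it out, assuming the feed is sustained at the full 1.5TB/s, decimal units (1 PB = 1000 TB), and no replication or filesystem overhead. The function name is chosen for this example.

```python
FEED_TB_PER_S = 1.5      # data feed rate from the requirements
RETENTION_HOURS = 6      # how long the raw feed must be kept


def buffer_capacity_tb(feed_tb_per_s, retention_hours):
    """Capacity needed to hold the feed for the retention period, in TB."""
    seconds = retention_hours * 3600
    return feed_tb_per_s * seconds


capacity_tb = buffer_capacity_tb(FEED_TB_PER_S, RETENTION_HOURS)
print(f"{capacity_tb:,.0f} TB = {capacity_tb / 1000:.1f} PB")
# prints "32,400 TB = 32.4 PB"
```

The six-month requirement for processed datasets cannot be sized the same way, because the slides do not state the output data rate.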
Supporting Scientific Applications
Preemptible Instances
● Scientific Clouds use project quotas
○ Projects have different funding models
■ They expect a predefined number of resources available
■ But these resources are not always used full time
○ Other projects can use these free resources
■ Opportunistic workloads
● Public clouds use a spot market for free resources
○ Based on different pricing/SLA considering resource availability
○ Private clouds usually don’t charge users directly
● How can the available resources be used more efficiently?
Preemptible Instances
● Building a prototype
○ Minimise changes to OpenStack nova
● Approach
○ Preemptible instances are identified using metadata
○ Project quotas are not considered for preemptible instances
○ “NoValidHost” for a non-preemptible instance triggers the “Reaper” service
○ The “Reaper” service is responsible for deleting preemptible instances
■ Needs some intelligence to free up the resources necessary for the new instance
○ The original request is then retried
● Follow/Participate in the discussion
○ https://etherpad.openstack.org/p/nova-preemptible-servers-discussion
○ https://review.openstack.org/#/c/438640/2
○ https://gitlab.cern.ch/ttsiouts/reaper/
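The "intelligence" the Reaper needs can be sketched as a selection problem: find a host where deleting a few preemptible instances frees enough room for the new request. The Python sketch below is not the CERN prototype. The data classes, the two-resource model (vCPUs and RAM) and the greedy "largest first, fewest deletions" policy are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    vcpus: int
    ram_mb: int
    preemptible: bool = False  # in the prototype this comes from metadata


@dataclass
class Host:
    name: str
    free_vcpus: int
    free_ram_mb: int
    instances: list = field(default_factory=list)


def victims_on_host(host, vcpus, ram_mb):
    """Return the preemptible instances to delete so the request fits, or None."""
    free_vcpus, free_ram = host.free_vcpus, host.free_ram_mb
    victims = []
    # Greedy policy (an assumption): consider the largest preemptible instances first.
    candidates = sorted(
        (i for i in host.instances if i.preemptible),
        key=lambda i: (i.vcpus, i.ram_mb),
        reverse=True,
    )
    for inst in candidates:
        if free_vcpus >= vcpus and free_ram >= ram_mb:
            break
        victims.append(inst)
        free_vcpus += inst.vcpus
        free_ram += inst.ram_mb
    if free_vcpus >= vcpus and free_ram >= ram_mb:
        return victims
    return None  # even deleting every preemptible instance is not enough


def plan_reaping(hosts, vcpus, ram_mb):
    """Pick the host needing the fewest deletions; return (host, victims) or None."""
    best = None
    for host in hosts:
        victims = victims_on_host(host, vcpus, ram_mb)
        if victims is None:
            continue
        if best is None or len(victims) < len(best[1]):
            best = (host, victims)
    return best
```

A caller would delete the returned victims and then let the original request retry. Non-preemptible instances are never selected, so quota-backed workloads are unaffected.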
Magnum on Bare Metal
Better Ironic support in Magnum templates
File and Block Storage within Magnum environments
OpenStack Magnum - Manages clusters defined by cluster templates
Supports Docker swarm mode & Kubernetes
Remote access to clusters using native tooling (Docker client, kubectl, etc.)
Automated scaling up/down
Bare metal support not always current
Sahara on Bare Metal
HiBD - Hadoop and Spark with Infiniband and RDMA acceleration
OpenStack Sahara - Manages clusters defined by cluster templates
Supports Hadoop and Spark
Automated scaling up/down
Extended with HiBD from OSU
RDMA-enabled analytics
OpenHPC on OpenStack
Cluster infrastructure deployed using Heat templates
Configuration and “personalisation” in Ansible
Slurm-as-a-Service deployment and configuration in Ansible
Infiniband and MPI
Home directories in CephFS
Keys managed in Barbican
OpenStack Scientific SIG
● Written with help from the OpenStack Scientific SIG
● Current best practice for OpenStack and HPC
● Six subject overviews with case studies contributed by WG members
https://www.openstack.org/science/
What will OpenStack ‘Z’ look like?
● Due for release 2H 2022
● …?
● …?
● …?
● Remaining details TBD
Future Science on Future OpenStack

  • 1. Future Science on Future OpenStack: Developing next generation infrastructure at CERN and SKA
  • 2. Speakers
    ● Belmiro Moreira: Cloud Architect, CERN
    ● Stig Telfer: CTO, StackHPC Ltd; SKA Performance Prototype Platform; Co-chair, OpenStack Scientific SIG
  • 5. CERN - Large Hadron Collider (LHC)
  • 6. CERN - Large Hadron Collider (LHC)
  • 7. CERN: Compact Muon Solenoid (CMS)
  • 9. CERN Cloud Architecture [diagram; labels: Nova API Cell, API Servers, Controllers, Cell A, Cell B, Cell C, Ceph, DBoD, OpenStack Services, 22ms]
  • 12. What is SKA? Image courtesy of CSIRO
  • 15. ALaSKA - à la SKA: SKA Performance Prototyping
  • 16. SKA - Performance Prototype Platform
  • 19. CERN - Hardware Lifecycle
    ● ~2000 new servers per year
      ○ Two rounds of procurement - bulk purchases
        ■ Continuous delivery model cannot be used at CERN
      ○ Hardware location is defined per procurement round according to rack space, cooling and electrical availability
      ○ Annual capacity planning
    ● When new servers are added, others need to be retired
      ○ The process to empty machines depends on the workloads running on the servers
        ■ Batch workloads usually require a couple of weeks to free up the servers
        ■ Services/Personal workloads are migrated to the new hardware
  • 20. CERN - Hardware Lifecycle
    ● Hardware is highly heterogeneous
      ○ 2-3 vendors per annual procurement cycle, each with its own optimisations
    ● Advantages
      ○ A problem with one vendor doesn’t affect the entire capacity for a procurement cycle
        ■ When there are issues in one delivery, such as disk firmware, BMC controllers, …, the others are usually not affected
    ● Disadvantages
      ○ 15-20 different hardware configurations in the data centres
      ○ CERN tooling (bootstrap, monitoring, ...) needs to support different configurations
      ○ Challenge in defining the VM flavors exposed to the users
  • 21. CERN - Hardware Lifecycle
    ● OpenStack Ironic
      ○ Provision physical servers using the OpenStack Nova API
        ■ VMs don’t fit all use cases
          ● Disk servers; DB servers; ...
      ○ All resources are managed using OpenStack (VMs; Containers; Bare Metal)
        ■ Same accounting and traceability for all resources
    ● Can Ironic be used to manage the entire Hardware Lifecycle?
        ■ Replace all the specific tooling built over the years to manage the Hardware Lifecycle workflow
  • 22. CERN - Hardware Lifecycle
    ● Requirements to manage the Hardware Lifecycle
      ○ A database to store all hardware attributes
        ■ Manufacturer, product revision, firmware version, …
      ○ Flexible and complete hardware introspection
      ○ Flexible API to add/query server attributes
      ○ Burn-in and acceptance process
      ○ Define when resources are available to users
        ■ State workflow
      ○ Policy needs to allow segregated access for the different teams
      ○ Clear retirement procedure
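
The "state workflow" requirement above can be illustrated with a small sketch. The state names, the allowed transitions and the `Server` class are all hypothetical, chosen only to mirror the requirements listed on the slide (attributes store, burn-in and acceptance, availability to users, retirement); they are not Ironic's actual provisioning states or CERN's tooling.

```python
from enum import Enum


class State(Enum):
    # Hypothetical lifecycle states, not Ironic's real state machine.
    REGISTERED = "registered"
    BURN_IN = "burn_in"
    AVAILABLE = "available"
    IN_USE = "in_use"
    RETIRED = "retired"


# Allowed transitions (assumption): a server must pass burn-in before users
# can get it, and must be emptied (back to AVAILABLE) before it is retired.
ALLOWED = {
    State.REGISTERED: {State.BURN_IN},
    State.BURN_IN: {State.AVAILABLE, State.RETIRED},
    State.AVAILABLE: {State.IN_USE, State.RETIRED},
    State.IN_USE: {State.AVAILABLE},
    State.RETIRED: set(),
}


class Server:
    def __init__(self, serial, attributes):
        self.serial = serial
        # Free-form hardware attributes: manufacturer, firmware version, ...
        self.attributes = dict(attributes)
        self.state = State.REGISTERED
        self.history = [State.REGISTERED]

    def transition(self, new_state):
        """Move to new_state, rejecting anything the workflow does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.serial}: {self.state.value} -> {new_state.value} not allowed"
            )
        self.state = new_state
        self.history.append(new_state)


server = Server("SN-0001", {"manufacturer": "example-vendor", "firmware": "1.0"})
server.transition(State.BURN_IN)
server.transition(State.AVAILABLE)
server.transition(State.IN_USE)
```

Keeping the transition table as data rather than code is one way to make the workflow pluggable: a site could swap in its own table without touching the `Server` class.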
  • 23. CERN - Hardware Lifecycle
  • 24. CERN - Hardware Lifecycle
    ● Current CERN model
      ○ Automated but complex
      ○ Set of tools/DBs developed in house
      ○ Difficult to track and account for resource utilization
      ○ CERN specific
    ● What we envision with Ironic
      ○ Capable of managing the entire Hardware Lifecycle
        ■ Automated and generic
        ■ Pluggable
        ■ Track resources
  • 25. ALaSKA - Hardware Life Cycle
    ● Auto-discovery and inspection
    ● Hardware anomaly detection with Cardiff
    ● Enrollment with Inspector rules
    ● Ansible-driven BIOS and RAID configuration
    ● Ansible-driven network switch configuration
    ● Kayobe: Kolla-on-Bifrost http://kayobe.readthedocs.io/
  • 27. Requirements
    ● SKA Science Data Processor consumes a data feed of 1.5 TB/s
    ● This data must be stored for 6 hours
    ● Processed datasets must be stored for 6 months
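
To give a sense of scale, the short-term buffer implied by these requirements can be worked out directly. This is only a back-of-envelope sketch: the ingest rate and retention window come from the slide, while the assumption of a constant feed and decimal units (1 PB = 1000 TB) is ours.

```python
# Figures from the slide.
INGEST_TB_PER_S = 1.5
RETENTION_HOURS = 6


def buffer_size_tb(rate_tb_per_s, hours):
    """Volume in TB accumulated at a constant rate over the retention window."""
    return rate_tb_per_s * hours * 3600


buffer_tb = buffer_size_tb(INGEST_TB_PER_S, RETENTION_HOURS)
buffer_pb = buffer_tb / 1000  # assumption: decimal units, 1 PB = 1000 TB

print(f"{buffer_tb:.0f} TB = {buffer_pb:.1f} PB")  # prints "32400 TB = 32.4 PB"
```

The processed datasets kept for 6 months are not sized here, because the slide does not state their data rate.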
  • 28. Solutions
    ● High performance filesystems
    ● High performance object stores
    ● High performance message queues
  • 30. Preemptible Instances
    ● Scientific clouds use project quotas
      ○ Projects have different funding models
        ■ They expect a predefined number of resources to be available
        ■ But these resources are not always used full time
      ○ Other projects can use these free resources
        ■ Opportunistic workloads
    ● Public clouds use a spot market for free resources
      ○ Based on different pricing/SLA considering resource availability
      ○ Private clouds usually don’t charge users directly
    ● How can the available resources be used more efficiently?
  • 31. Preemptible Instances
    ● Building a prototype
      ○ Minimise changes to OpenStack Nova
    ● Approach
      ○ Preemptible instances are identified using metadata
      ○ Project quotas are not considered for preemptible instances
      ○ “NoValidHost” for a non-preemptible instance triggers the “Reaper” service
      ○ The “Reaper” service is responsible for deleting preemptible instances
        ■ Needs some intelligence to free up the resources necessary for the new instance
      ○ The original request is retried
    ● Follow/Participate in the discussion
      ○ https://etherpad.openstack.org/p/nova-preemptible-servers-discussion
      ○ https://review.openstack.org/#/c/438640/2
      ○ https://gitlab.cern.ch/ttsiouts/reaper/
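
The selection step that the "Reaper" needs ("some intelligence to free up the resources necessary for the new instance") might be sketched as follows. This is not the CERN prototype's code: the `Instance` class, the `select_victims` function, the vCPU-only resource model and the "fewest deletions on one host" policy are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    host: str
    vcpus: int
    preemptible: bool  # the prototype identifies these via instance metadata


def select_victims(instances, requested_vcpus, free_vcpus_by_host):
    """Choose preemptible instances to delete so one host can fit the request.

    Returns (host, victims) for the host needing the fewest deletions,
    or None when no host can be freed up. Only vCPUs are modelled here.
    """
    best = None
    for host, free in sorted(free_vcpus_by_host.items()):
        # Largest instances first, so fewer deletions are needed per host.
        candidates = sorted(
            (i for i in instances if i.host == host and i.preemptible),
            key=lambda i: i.vcpus,
            reverse=True,
        )
        victims, reclaimed = [], free
        for inst in candidates:
            if reclaimed >= requested_vcpus:
                break
            victims.append(inst)
            reclaimed += inst.vcpus
        if reclaimed >= requested_vcpus:
            if best is None or len(victims) < len(best[1]):
                best = (host, victims)
    return best


instances = [
    Instance("batch-1", "host-a", 4, True),
    Instance("batch-2", "host-a", 2, True),
    Instance("db-1", "host-a", 8, False),
    Instance("batch-3", "host-b", 2, True),
]
free = {"host-a": 1, "host-b": 1}

# A 4-vCPU non-preemptible request hit "NoValidHost"; pick what to reap.
result = select_victims(instances, 4, free)
host, victims = result
print(host, [v.name for v in victims])  # prints "host-a ['batch-1']"
```

After the selected instances are deleted, the original request would be retried, as described on the slide. A real service would also need to account for memory, disk and scheduler constraints, which this sketch leaves out.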
  • 32. Magnum on Bare Metal
    ● Better Ironic support in Magnum templates
    ● File and Block Storage within Magnum environments
    ● OpenStack Magnum - Manages clusters defined by cluster templates
    ● Supports Docker swarm mode & Kubernetes
    ● Remote access to clusters using native tooling (Docker client, kubectl, etc.)
    ● Automated scaling up/down
    ● Bare metal support not always current
  • 33. Sahara on Bare Metal
    ● HiBD - Hadoop and Spark with Infiniband and RDMA acceleration
    ● OpenStack Sahara - Manages clusters defined by cluster templates
    ● Supports Hadoop and Spark
    ● Automated scaling up/down
    ● Extended with HiBD from OSU
    ● RDMA-enabled analytics
  • 34. OpenHPC on OpenStack
    ● Cluster infrastructure deployed using Heat templates
    ● Configuration and “personalisation” in Ansible
    ● Slurm-as-a-Service deployment and configuration in Ansible
    ● Infiniband and MPI
    ● Home directories in CephFS
    ● Keys managed in Barbican
  • 35. OpenStack Scientific SIG
    ● Written with help from the OpenStack Scientific SIG
    ● Current best practice for OpenStack and HPC
    ● Six subject overviews with case studies contributed by WG members
    ● https://www.openstack.org/science/
  • 36. What will OpenStack ‘Z’ look like?
    ● Due for release 2H 2022
    ● …?
    ● …?
    ● …?
    ● Remaining details TBD