SlideShare a Scribd company logo
1 of 18
Multi-Cloud Federated
Kubernetes at CERN
Clenimar Filemon
clenimar@lsd.ufcg.edu.br
Ricardo Rocha
ricardo.rocha@cern.ch
Founded in 1954
What is 96% of the universe made of?
Fundamental Science
Why isn’t there anti-matter in the universe?
What was the state of matter just after the Big Bang?
~40 MHz
~ 1PB/sec
L1
Trigger
~ 100 kHz
HL
Trigger
Collisions
Hardware Filter
Software Filter
~ 1 kHz
Raw Data
~ 1-10 GB/s
Huge Data
Still Big
Still Big
320 000 Cores
3 300 Users
4 300 Projects
10 000 Hypervisors 210 Kubernetes Clusters
250 Petabytes
200+ Sites
700 000 Cores
~400 000 Jobs
Distributed Computing
~30 GiB/s
CERN
T1
T2
...
...
...
...
...
...
...
...
Reconstruction
Calibration
Simulation
Analysis
CMS Higgs Event, May 2012
ATLAS Higgs Analysis, May 2012
Motivation for Federation
Periodic load spikes
International Conferences, Reconstruction Campaigns
Simplification
Monitoring, Lifecycle, Alarms
Deployment
Uniform API, Replication, Load Balancing
Use Cases: CERN Batch System, RECAST Analysis
Sched Collector
Negotiator
StartD
AcctGroup = "ATLAS"
JobPrio = 0
RequestCpus = 2
RequestMemory = 4260
...
CERNEnvironment = “production”
Datacenter = “meyrin”
HasMPI = true
TotalCpus = 8
TotalMemory = 22500
...
Matchmaking with ClassAds
Fair Share
Preemption
Running Virtualized
Extensive Experience in HEP
External Storage and Networking
Sched Collector
Negotiator
StartD
AcctGroup = "ATLAS"
JobPrio = 0
RequestCpus = 2
RequestMemory = 4260
...
CERNEnvironment = “production”
Datacenter = “meyrin”
HasMPI = true
TotalCpus = 8
TotalMemory = 22500
...
Matchmaking with ClassAds
Fair Share
Preemption
Running Virtualized
Extensive Experience in HEP
External Storage and Networking
Sched
Negotiator
Collector
Host
kubefed init fed --host-cluster-context=condor-host ...
kind: DaemonSet
...
hostNetwork: true
containers:
- name: condor-startd
image: .../cloud/condor-startd
command: ["/usr/sbin/condor_startd", "-f"]
securityContext:
privileged: true
livenessProbe:
exec:
command:
- condor_who
Sched
Negotiator
Collector
Host
StartD
...
StartD
...
StartD
...
kubefed init fed --host-cluster-context=condor-host ...
kubefed join --context fed tsystems 
--host-cluster-context condor-host --cluster-context tsystems
REANA / RECAST
Reusable Analysis Platform
Workflow Engine (Yadage)
Each step a Kubernetes Job
Integrated Monitoring & Logging
Centralized Log Collection
https://github.com/reanahubhttps://github.com/recast-hep https://github.com/diana-hep/yadage
https://www.youtube.com/watch?v=jNyd97LiTXk
Thank You
Great Community, Amazing Tools
Credits
CERN OpenStack Cloud and Batch teams (Spyros Trigazis and all)
Lukas Heinrich, REANA / RECAST
Kelsey Hightower

More Related Content

What's hot

inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...
inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...
inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...Andrew Howard
 
IBM Cloud Community Summit 2018:「Kubernetes in Muiticloudで戦うCloud Native時代」 b...
IBM Cloud Community Summit 2018:「Kubernetes in Muiticloudで戦うCloud Native時代」 b...IBM Cloud Community Summit 2018:「Kubernetes in Muiticloudで戦うCloud Native時代」 b...
IBM Cloud Community Summit 2018:「Kubernetes in Muiticloudで戦うCloud Native時代」 b...capsmalt
 
SkyhookDM - Towards an Arrow-Native Storage System
SkyhookDM - Towards an Arrow-Native Storage SystemSkyhookDM - Towards an Arrow-Native Storage System
SkyhookDM - Towards an Arrow-Native Storage SystemJayjeetChakraborty
 
Running a GPU burst for Multi-Messenger Astrophysics with IceCube across all ...
Running a GPU burst for Multi-Messenger Astrophysics with IceCube across all ...Running a GPU burst for Multi-Messenger Astrophysics with IceCube across all ...
Running a GPU burst for Multi-Messenger Astrophysics with IceCube across all ...Igor Sfiligoi
 
The OpenStack Cloud at CERN - OpenStack Nordic
The OpenStack Cloud at CERN - OpenStack NordicThe OpenStack Cloud at CERN - OpenStack Nordic
The OpenStack Cloud at CERN - OpenStack NordicTim Bell
 
Data-intensive IceCube Cloud Burst
Data-intensive IceCube Cloud BurstData-intensive IceCube Cloud Burst
Data-intensive IceCube Cloud BurstIgor Sfiligoi
 
Ajal vjcet
Ajal vjcetAjal vjcet
Ajal vjcetAJAL A J
 
Physics Data Processing - The online connection
Physics Data Processing - The online connectionPhysics Data Processing - The online connection
Physics Data Processing - The online connectionSander Klous
 
NRP Engagement webinar - Running a 51k GPU multi-cloud burst for MMA with Ic...
 NRP Engagement webinar - Running a 51k GPU multi-cloud burst for MMA with Ic... NRP Engagement webinar - Running a 51k GPU multi-cloud burst for MMA with Ic...
NRP Engagement webinar - Running a 51k GPU multi-cloud burst for MMA with Ic...Igor Sfiligoi
 
Burst data retrieval after 50k GPU Cloud run
Burst data retrieval after 50k GPU Cloud runBurst data retrieval after 50k GPU Cloud run
Burst data retrieval after 50k GPU Cloud runIgor Sfiligoi
 
PIT Overload Analysis in Content Centric Networks - Slides ICN '13
PIT Overload Analysis in Content Centric Networks - Slides ICN '13PIT Overload Analysis in Content Centric Networks - Slides ICN '13
PIT Overload Analysis in Content Centric Networks - Slides ICN '13Matteo Virgilio
 
scTGIFの鬼QC機能の追加
scTGIFの鬼QC機能の追加scTGIFの鬼QC機能の追加
scTGIFの鬼QC機能の追加弘毅 露崎
 
Quick Coarse-grained kinetic Monte Carlo overview
Quick Coarse-grained kinetic Monte Carlo overviewQuick Coarse-grained kinetic Monte Carlo overview
Quick Coarse-grained kinetic Monte Carlo overviewStuart Collins
 
"Building and running the cloud GPU vacuum cleaner"
"Building and running the cloud GPU vacuum cleaner""Building and running the cloud GPU vacuum cleaner"
"Building and running the cloud GPU vacuum cleaner"Frank Wuerthwein
 
1細胞オミックスのための新GSEA手法
1細胞オミックスのための新GSEA手法1細胞オミックスのための新GSEA手法
1細胞オミックスのための新GSEA手法弘毅 露崎
 
Pig TPC-H Benchmark and Performance Tuning
Pig TPC-H Benchmark and Performance TuningPig TPC-H Benchmark and Performance Tuning
Pig TPC-H Benchmark and Performance TuningJie Li
 
Aurora Dublin
Aurora DublinAurora Dublin
Aurora Dublindpshelio
 
Updates on the Fake Object Pipeline for HSC Survey
Updates on the Fake Object Pipeline for HSC Survey Updates on the Fake Object Pipeline for HSC Survey
Updates on the Fake Object Pipeline for HSC Survey Song Huang
 
Nika it consulting weekly update
Nika it consulting weekly update  Nika it consulting weekly update
Nika it consulting weekly update Rod Delwar
 

What's hot (20)

inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...
inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...
inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...
 
IBM Cloud Community Summit 2018:「Kubernetes in Muiticloudで戦うCloud Native時代」 b...
IBM Cloud Community Summit 2018:「Kubernetes in Muiticloudで戦うCloud Native時代」 b...IBM Cloud Community Summit 2018:「Kubernetes in Muiticloudで戦うCloud Native時代」 b...
IBM Cloud Community Summit 2018:「Kubernetes in Muiticloudで戦うCloud Native時代」 b...
 
SkyhookDM - Towards an Arrow-Native Storage System
SkyhookDM - Towards an Arrow-Native Storage SystemSkyhookDM - Towards an Arrow-Native Storage System
SkyhookDM - Towards an Arrow-Native Storage System
 
Running a GPU burst for Multi-Messenger Astrophysics with IceCube across all ...
Running a GPU burst for Multi-Messenger Astrophysics with IceCube across all ...Running a GPU burst for Multi-Messenger Astrophysics with IceCube across all ...
Running a GPU burst for Multi-Messenger Astrophysics with IceCube across all ...
 
The OpenStack Cloud at CERN - OpenStack Nordic
The OpenStack Cloud at CERN - OpenStack NordicThe OpenStack Cloud at CERN - OpenStack Nordic
The OpenStack Cloud at CERN - OpenStack Nordic
 
Data-intensive IceCube Cloud Burst
Data-intensive IceCube Cloud BurstData-intensive IceCube Cloud Burst
Data-intensive IceCube Cloud Burst
 
Ajal vjcet
Ajal vjcetAjal vjcet
Ajal vjcet
 
Physics Data Processing - The online connection
Physics Data Processing - The online connectionPhysics Data Processing - The online connection
Physics Data Processing - The online connection
 
NRP Engagement webinar - Running a 51k GPU multi-cloud burst for MMA with Ic...
 NRP Engagement webinar - Running a 51k GPU multi-cloud burst for MMA with Ic... NRP Engagement webinar - Running a 51k GPU multi-cloud burst for MMA with Ic...
NRP Engagement webinar - Running a 51k GPU multi-cloud burst for MMA with Ic...
 
Burst data retrieval after 50k GPU Cloud run
Burst data retrieval after 50k GPU Cloud runBurst data retrieval after 50k GPU Cloud run
Burst data retrieval after 50k GPU Cloud run
 
TiReX: Tiled Regular eXpression matching architecture
TiReX: Tiled Regular eXpression matching architectureTiReX: Tiled Regular eXpression matching architecture
TiReX: Tiled Regular eXpression matching architecture
 
PIT Overload Analysis in Content Centric Networks - Slides ICN '13
PIT Overload Analysis in Content Centric Networks - Slides ICN '13PIT Overload Analysis in Content Centric Networks - Slides ICN '13
PIT Overload Analysis in Content Centric Networks - Slides ICN '13
 
scTGIFの鬼QC機能の追加
scTGIFの鬼QC機能の追加scTGIFの鬼QC機能の追加
scTGIFの鬼QC機能の追加
 
Quick Coarse-grained kinetic Monte Carlo overview
Quick Coarse-grained kinetic Monte Carlo overviewQuick Coarse-grained kinetic Monte Carlo overview
Quick Coarse-grained kinetic Monte Carlo overview
 
"Building and running the cloud GPU vacuum cleaner"
"Building and running the cloud GPU vacuum cleaner""Building and running the cloud GPU vacuum cleaner"
"Building and running the cloud GPU vacuum cleaner"
 
1細胞オミックスのための新GSEA手法
1細胞オミックスのための新GSEA手法1細胞オミックスのための新GSEA手法
1細胞オミックスのための新GSEA手法
 
Pig TPC-H Benchmark and Performance Tuning
Pig TPC-H Benchmark and Performance TuningPig TPC-H Benchmark and Performance Tuning
Pig TPC-H Benchmark and Performance Tuning
 
Aurora Dublin
Aurora DublinAurora Dublin
Aurora Dublin
 
Updates on the Fake Object Pipeline for HSC Survey
Updates on the Fake Object Pipeline for HSC Survey Updates on the Fake Object Pipeline for HSC Survey
Updates on the Fake Object Pipeline for HSC Survey
 
Nika it consulting weekly update
Nika it consulting weekly update  Nika it consulting weekly update
Nika it consulting weekly update
 

Similar to Multi-Cloud Federated Kubernetes at CERN

Terabit Applications: What Are They, What is Needed to Enable Them?
Terabit Applications: What Are They, What is Needed to Enable Them?Terabit Applications: What Are They, What is Needed to Enable Them?
Terabit Applications: What Are They, What is Needed to Enable Them?Larry Smarr
 
The Optiputer - Toward a Terabit LAN
The Optiputer - Toward a Terabit LANThe Optiputer - Toward a Terabit LAN
The Optiputer - Toward a Terabit LANLarry Smarr
 
Computing Challenges at the Large Hadron Collider
Computing Challenges at the Large Hadron ColliderComputing Challenges at the Large Hadron Collider
Computing Challenges at the Large Hadron Colliderinside-BigData.com
 
OSMC 2012 | Monitoring at CERN by Christophe Haen
OSMC 2012 | Monitoring at CERN by Christophe HaenOSMC 2012 | Monitoring at CERN by Christophe Haen
OSMC 2012 | Monitoring at CERN by Christophe HaenNETWAYS
 
Big Data for Big Discoveries
Big Data for Big DiscoveriesBig Data for Big Discoveries
Big Data for Big DiscoveriesGovnet Events
 
Big Fast Data in High-Energy Particle Physics
Big Fast Data in High-Energy Particle PhysicsBig Fast Data in High-Energy Particle Physics
Big Fast Data in High-Energy Particle PhysicsAndrew Lowe
 
Hpc, grid and cloud computing - the past, present, and future challenge
Hpc, grid and cloud computing - the past, present, and future challengeHpc, grid and cloud computing - the past, present, and future challenge
Hpc, grid and cloud computing - the past, present, and future challengeJason Shih
 
Kafka Summit SF 2017 - Accelerating Particles to Explore the Mysteries of the...
Kafka Summit SF 2017 - Accelerating Particles to Explore the Mysteries of the...Kafka Summit SF 2017 - Accelerating Particles to Explore the Mysteries of the...
Kafka Summit SF 2017 - Accelerating Particles to Explore the Mysteries of the...confluent
 
Valladolid final-septiembre-2010
Valladolid final-septiembre-2010Valladolid final-septiembre-2010
Valladolid final-septiembre-2010TELECOM I+D
 
CERN User Story
CERN User StoryCERN User Story
CERN User StoryTim Bell
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the ContinuumIan Foster
 
Pic archiver stansted
Pic archiver stanstedPic archiver stansted
Pic archiver stanstedArchiver
 
Better Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumBetter Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumIan Foster
 
Why Researchers are Using Advanced Networks
Why Researchers are Using Advanced NetworksWhy Researchers are Using Advanced Networks
Why Researchers are Using Advanced NetworksLarry Smarr
 
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...BigDataEverywhere
 
Science and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated EraScience and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated EraLarry Smarr
 
Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analys...
Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analys...Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analys...
Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analys...Larry Smarr
 

Similar to Multi-Cloud Federated Kubernetes at CERN (20)

Terabit Applications: What Are They, What is Needed to Enable Them?
Terabit Applications: What Are They, What is Needed to Enable Them?Terabit Applications: What Are They, What is Needed to Enable Them?
Terabit Applications: What Are They, What is Needed to Enable Them?
 
Jarp big data_sydney_v7
Jarp big data_sydney_v7Jarp big data_sydney_v7
Jarp big data_sydney_v7
 
The Optiputer - Toward a Terabit LAN
The Optiputer - Toward a Terabit LANThe Optiputer - Toward a Terabit LAN
The Optiputer - Toward a Terabit LAN
 
Computing Challenges at the Large Hadron Collider
Computing Challenges at the Large Hadron ColliderComputing Challenges at the Large Hadron Collider
Computing Challenges at the Large Hadron Collider
 
OSMC 2012 | Monitoring at CERN by Christophe Haen
OSMC 2012 | Monitoring at CERN by Christophe HaenOSMC 2012 | Monitoring at CERN by Christophe Haen
OSMC 2012 | Monitoring at CERN by Christophe Haen
 
Big Data for Big Discoveries
Big Data for Big DiscoveriesBig Data for Big Discoveries
Big Data for Big Discoveries
 
Big Fast Data in High-Energy Particle Physics
Big Fast Data in High-Energy Particle PhysicsBig Fast Data in High-Energy Particle Physics
Big Fast Data in High-Energy Particle Physics
 
Hpc, grid and cloud computing - the past, present, and future challenge
Hpc, grid and cloud computing - the past, present, and future challengeHpc, grid and cloud computing - the past, present, and future challenge
Hpc, grid and cloud computing - the past, present, and future challenge
 
Kafka Summit SF 2017 - Accelerating Particles to Explore the Mysteries of the...
Kafka Summit SF 2017 - Accelerating Particles to Explore the Mysteries of the...Kafka Summit SF 2017 - Accelerating Particles to Explore the Mysteries of the...
Kafka Summit SF 2017 - Accelerating Particles to Explore the Mysteries of the...
 
Valladolid final-septiembre-2010
Valladolid final-septiembre-2010Valladolid final-septiembre-2010
Valladolid final-septiembre-2010
 
CERN User Story
CERN User StoryCERN User Story
CERN User Story
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the Continuum
 
Pic archiver stansted
Pic archiver stanstedPic archiver stansted
Pic archiver stansted
 
Better Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumBetter Information Faster: Programming the Continuum
Better Information Faster: Programming the Continuum
 
Why Researchers are Using Advanced Networks
Why Researchers are Using Advanced NetworksWhy Researchers are Using Advanced Networks
Why Researchers are Using Advanced Networks
 
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
 
Supercomputers
SupercomputersSupercomputers
Supercomputers
 
Supercomputers
SupercomputersSupercomputers
Supercomputers
 
Science and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated EraScience and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated Era
 
Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analys...
Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analys...Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analys...
Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analys...
 

Recently uploaded

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Recently uploaded (20)

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

Multi-Cloud Federated Kubernetes at CERN

  • 1. Multi-Cloud Federated Kubernetes at CERN Clenimar Filemon clenimar@lsd.ufcg.edu.br Ricardo Rocha ricardo.rocha@cern.ch
  • 2. Founded in 1954 What is 96% of the universe made of? Fundamental Science Why isn’t there anti-matter in the universe? What was the state of matter just after the Big Bang?
  • 3.
  • 4.
  • 5.
  • 6.
  • 7. ~40 MHz ~ 1PB/sec L1 Trigger ~ 100 kHz HL Trigger Collisions Hardware Filter Software Filter ~ 1 kHz Raw Data ~ 1-10 GB/s Huge Data Still Big Still Big
  • 8. 320 000 Cores 3 300 Users 4 300 Projects 10 000 Hypervisors 210 Kubernetes Clusters 250 Petabytes
  • 9. 200+ Sites 700 000 Cores ~400 000 Jobs Distributed Computing ~30 GiB/s CERN T1 T2 ... ... ... ... ... ... ... ... Reconstruction Calibration Simulation Analysis
  • 10. CMS Higgs Event, May 2012 ATLAS Higgs Analysis, May 2012
  • 11. Motivation for Federation Periodic load spikes International Conferences, Reconstruction Campaigns Simplification Monitoring, Lifecycle, Alarms Deployment Uniform API, Replication, Load Balancing Use Cases: CERN Batch System, RECAST Analysis
  • 12. Sched Collector Negotiator StartD AcctGroup = "ATLAS" JobPrio = 0 RequestCpus = 2 RequestMemory = 4260 ... CERNEnvironment = “production” Datacenter = “meyrin” HasMPI = true TotalCpus = 8 TotalMemory = 22500 ... Matchmaking with ClassAds Fair Share Preemption Running Virtualized Extensive Experience in HEP External Storage and Networking
  • 13. Sched Collector Negotiator StartD AcctGroup = "ATLAS" JobPrio = 0 RequestCpus = 2 RequestMemory = 4260 ... CERNEnvironment = “production” Datacenter = “meyrin” HasMPI = true TotalCpus = 8 TotalMemory = 22500 ... Matchmaking with ClassAds Fair Share Preemption Running Virtualized Extensive Experience in HEP External Storage and Networking
  • 14. Sched Negotiator Collector Host kubefed init fed --host-cluster-context=condor-host ...
  • 15. kind: DaemonSet ... hostNetwork: true containers: - name: condor-startd image: .../cloud/condor-startd command: ["/usr/sbin/condor_startd", "-f"] securityContext: privileged: true livenessProbe: exec: command: - condor_who Sched Negotiator Collector Host StartD ... StartD ... StartD ... kubefed init fed --host-cluster-context=condor-host ... kubefed join --context fed tsystems --host-cluster-context condor-host --cluster-context tsystems
  • 16. REANA / RECAST Reusable Analysis Platform Workflow Engine (Yadage) Each step a Kubernetes Job Integrated Monitoring & Logging Centralized Log Collection https://github.com/reanahubhttps://github.com/recast-hep https://github.com/diana-hep/yadage
  • 18. Thank You Great Community, Amazing Tools Credits CERN OpenStack Cloud and Batch teams (Spyros Trigazis and all) Lukas Heinrich, REANA / RECAST Kelsey Hightower

Editor's Notes

  1. Dark energy + dark matter Quark gluon plasma, moments after the big bang
  2. Location, lake, alps mont blanc, swiss-french border Complex of accelerators, higher and higher energy. 27km circumference Two beams of protons travelling on different directions, close to speed of light
  3. Almost 10.000 magnets Kept in the ring thanks to these superconducting magnets Temperature kept at 1.9K (-271 celsius) to keep superconducting properties
  4. Sibling of ATLAS, with similar goals but different design 14.000 tons 20 meters long, 15x15
  5. Anti-matter Decelarator, creates anti-atoms to better understand its properties AMS experiment, launched on mission STS-134 (penultime shuttle mission), measures antimatter in cosmic rays
  6. 2 floors Historical building, 50 years old, from mainframes to racks Internet backbone, biggest in late 80s and early 90s
  7. Hierachical system, few big T1s and many smaller T2s
  8. Not a physicist, but learned to look for patterns in plots
  9. Based on htcondor, which HEP has decades of experience operating Component description, requests and resources published as classads Advanced functionality (fair share, pre-emption) Currently running mostly on virtualized resources Important: htcondor relies on an established storage and net infrastructure, handling compute only (which is what we try to federate)
  10. StartD is our first containerization goal, deployed at scale
  11. Host cluster, with the condor control plane One command only to establish federation
  12. StartD deployment as a daemonset, meaning we get one instance on every host Clusters are added again with one single command CVMFS caching for software distribution speed-up Data access and networking outside the scope of this exercise