Rob Gardner • Computation Institute • University of Chicago
CI Connect
A service for building multi-institutional cluster environments
Acknowledgements
● Work presented here comes from many individuals, groups, and sites
● Special thanks to Globus & HTCondor teams!
● ATLAS Connect and OSG teams
○ Dave Lesny (Illinois), Lincoln Bryant, David Champion,
Suchandra Thapa (Chicago), Derek Weitzel (Nebraska)
● Duke, Syracuse, Clemson, UChicago, UWisc
A little bit of context
● Efforts began in 2013 to solve two problems:
○ How to provide a virtual cluster experience for small
research labs requiring distributed high throughput
computing resources (OSG Connect)
○ Extend the batch capacity of ATLAS (high energy
physics at CERN) Tier3 clusters (ATLAS Connect)
● Elements: Unix Acct. ‣ Software ‣ CPU ‣ Data
● HTCondor is the key to linking these … but how?
Unix Acct ‣ Software ‣ CPU ‣ Data
● We need solutions for each
● No “development” effort, only integration
● Leverage proven technologies and advanced CI activities
⇒ Globus Identity, CILogon, InCommon
⇒ HTCondor, Glideins, CernVM-FS, XRootD
OSG Connect
Has an identity bridge: local campus identity (CILogon) ‣ OSG Connect identity (Globus) ‣ virtual organization (OSG)
+ HTCondor Glidein Overlay
⇒ Virtual HTC cluster experience
ATLAS Connect
● Many ATLAS Tier3s use HTCondor
● Simple to add flocking targets for one site
○ But managing a mesh (30 sites x N flocking targets) is not
● Centralize the flocking services
● Provide a production backend from the CERN-based system to shared university clusters
● Leverage work from OSG Connect for end-user physicists
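The mesh problem on this slide can be made concrete with a little arithmetic. The sketch below is only an illustration under a stated assumption: in the centralized model each site flocks to one hosted service, and that service alone connects to every target. The talk gives 30 sites but no value for N, so the values of N are samples and the function names are hypothetical.

```python
def mesh_entries(sites: int, targets_per_site: int) -> int:
    """Flocking relationships when every site configures its own targets."""
    return sites * targets_per_site


def centralized_entries(sites: int, targets: int) -> int:
    """Relationships when each site flocks to one hosted service and only
    that service connects to the targets (assumed model, not from the talk)."""
    return sites + targets


if __name__ == "__main__":
    sites = 30  # figure quoted on the slide
    for n in (1, 5, 10):  # N is not given in the talk; sample values only
        print(n, mesh_entries(sites, n), centralized_entries(sites, n))
```

Under this assumption the mesh grows as a product of sites and targets, while the centralized layout grows as their sum, which is the motivation for hosting the flocking service in one place.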
ATLAS Connect: Tier3 Users
[Diagram. Labels: Physicist user login (ssh) from a University Tier3 cluster; US ATLAS (tier3 users); condor pool; bosco factories (ssh login as user); condor glideins; clusters: Harvard Odyssey, Illinois T3 + ICC, Indiana Karst, Chicago Midway, UTexas T3 (Rodeo virt), Stampede, CSU Fresno Tier3 cluster; faxbox t3; MWT2 SE; pilot in/out]
● Leverage standard HTCondor flocking and ClassAd matching
● Various job schedulers, queue policies, CVMFS access methods, local scratch & squid setups.
ATLAS Connect: Production backend
[Diagram. Labels: Jobs from central database at CERN; local pilot factory; pilots; condor pool; bosco factories (ssh login as user); condor glideins; clusters: Harvard Odyssey, Illinois T3 + ICC, Indiana Karst, Chicago Midway, UTexas T3 (Rodeo virt), Stampede, CSU Fresno Tier3 cluster; faxbox t3; MWT2 SE; pilot in/out; US ATLAS (tier3 users)]
● No ATLAS or OSG services or operational effort is required
Easy to plug in additional resources or grow with new allocations:
● university clusters
● XSEDE clusters
3M CPU hours
As important is the approach
● Focus is on integration and focused expertise rather than developing something new
As important is the approach
● Deliver as hosted service
● Minimize services & equipment at resource endpoints
Bringing HTCondor pools to campus
● Campus Connect Client (install locally)
● Virtual extension of /home comforts:
○ Local software, campus storage, tools
○ Marshal resources from one location & tool set:
■ local campus cluster allocation or general queue
■ XSEDE project allocation
■ shared resources via the OSG
■ multi-campus and community partnerships
■ public cloud resources
Organized into a ‘local’ queue
$ module load connect-client
$ connect setup
$ connect test
$ connect submit myjob.sub
$ connect q rwg
-- Submitter: login.ci-connect.uchicago.edu : <192.170.227.204:53212> : login.ci-connect.uchicago.edu
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
252624.0 rwg 9/2 14:21 0+00:00:09 R 0 0.0 run_sim.sh 252624.
252624.1 rwg 9/2 14:21 0+00:00:09 R 0 0.0 run_sim.sh 252624.
...
$ connect status
$ connect pull (results)
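The slide does not show the contents of myjob.sub. As a rough sketch, assuming it is an ordinary HTCondor submit description file, something like the following could produce a two-job submission similar to the queue listing above. The output, error, and log file names are hypothetical, and the job arguments are left out because the slide truncates them.

```python
from pathlib import Path


def render_submit(executable: str, jobs: int) -> str:
    """Build a minimal HTCondor submit description (hypothetical contents)."""
    lines = [
        f"executable = {executable}",
        # $(Cluster) and $(Process) are standard HTCondor submit macros.
        "output = job.$(Cluster).$(Process).out",
        "error = job.$(Cluster).$(Process).err",
        "log = job.$(Cluster).log",
        f"queue {jobs}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Writes myjob.sub in the current directory for use with `connect submit`.
    Path("myjob.sub").write_text(render_submit("run_sim.sh", 2))
```

The "queue 2" line is what yields two processes (IDs ending in .0 and .1) under a single cluster ID.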
Submitted from UChicago Research Computing Center cluster “Midway”
UChicago CI Connect Service: Midway cluster + OSG; to add SDSC Comet allocation & partner clusters at XENON institutions
Early adopter community:
Campus Users (@ Clemson) + OSG
[Plot annotations: Submission from Palmetto cluster (local); add OSG nodes; burst]
Sharing local resources with communities
[Plot annotations: GLOW (UWisconsin-based campus researchers); OSG Connect]
Palmetto cluster shared 300k CPU-hrs in past 30 days
2015/09/08 (Last week): 19 active users
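To put the 300k CPU-hour figure in perspective, it can be converted into an average number of continuously busy cores. This is a back-of-the-envelope sketch, assuming the usage was spread evenly over the 30 days, which the slide does not state; the function name is hypothetical.

```python
def average_busy_cores(cpu_hours: float, days: float) -> float:
    """Average number of cores kept busy, assuming evenly spread usage."""
    return cpu_hours / (days * 24)


if __name__ == "__main__":
    # 300,000 CPU-hours over 30 days (figures from the slide)
    print(round(average_busy_cores(300_000, 30)))  # prints 417
```

In other words, the shared capacity is roughly equivalent to about 417 cores running around the clock for a month.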
Going forward: automation
Make it simple to provision & dynamically configure multi-campus, community-based virtual cluster instances
Summary
● Use existing, well-proven technologies to
enable sharing among campuses
● Minimize operational effort and equipment at
resource endpoints ⇒ provide as a service
● Future areas of work:
○ automation & scale
○ adaptive ‘policy-based’ provisioning services
○ campus connect interfaces (user, compute, data)
