Multi-Cell OpenStack: How to Evolve Your Cloud to Scale
● Belmiro Moreira - CERN
● Matt Van Winkle - Rackspace
● Sam Morrison - NeCTAR, University of Melbourne
Cells: How we use them at NeCTAR
Sam Morrison
sam.morrison@unimelb.edu.au
NeCTAR Research Cloud
● Started in 2011
● Funded by the Australian Government
● 8 institutions around the country
● Production early 2012 - OpenStack Diablo
● All federated to appear as 1 cloud from the user's point of view
● Put the compute near the data and tools
● 5000+ users
NeCTAR Sites
● University of Melbourne
● National Computational Infrastructure
● Monash University
● Queensland CyberInfrastructure Foundation
● eResearch SA
● University of Tasmania
● Intersect, NSW
● iVEC, WA
Cells to build a Federation
● Use cells to federate geographically separated sites
● Different hardware/networks/people
● Parent cell run centrally at unimelb along with keystone/cinder/glance etc. (no neutron)
● Each site has 1 or more compute cells
● These roughly match up to availability zones from a user's perspective (cells are behind the scenes)
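The layout above can be pictured as a two-level tree: one parent cell with compute cells beneath it, each tagged with the availability zone users see. The following is only an illustrative sketch; the `Cell` class, the cell names, and the zone names are hypothetical and are not taken from NeCTAR's configuration or from Nova.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cell:
    """One node in a two-level cell tree (hypothetical model, not Nova's)."""
    name: str
    availability_zone: Optional[str] = None  # None for the parent (API) cell
    children: List["Cell"] = field(default_factory=list)


def cells_by_zone(parent: Cell) -> Dict[str, List[str]]:
    """Group the parent's compute cells under the zone users actually see."""
    zones: Dict[str, List[str]] = {}
    for child in parent.children:
        zones.setdefault(child.availability_zone, []).append(child.name)
    return zones


# Hypothetical names: a site with two datacenters gets two cells in one zone.
parent = Cell("api")
parent.children = [
    Cell("melbourne-dc1", "melbourne"),
    Cell("melbourne-dc2", "melbourne"),
    Cell("monash-01", "monash"),
]
zones = cells_by_zone(parent)
```

Users pick a zone; which cell inside that zone serves the request stays behind the scenes.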
How big?
● Each site ~4000 cores, ~150 hypervisors
● 6 sites in production, 4600+ instances
● Last 2 sites in prod by end of year
● ~1000 hypervisors, 40k cores
● ~10 compute cells
● Some sites have multiple datacenters so have multiple cells
Pain points
● Cell scheduling isn’t smart
● Broadcast calls rely on all cells being alive
● Not many people to share experiences with
● Upgrades, although Havana → Icehouse could happen in stages. Much easier!
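To make the broadcast pain point concrete, here is a minimal sketch of the alternative behaviour: a fan-out that collects partial results and reports which cells did not answer instead of failing outright. This is not how nova-cells is implemented; the `broadcast` function, the cell names, and the use of `ConnectionError` are assumptions made purely for illustration.

```python
from typing import Callable, Dict, List, Tuple


def broadcast(cells: Dict[str, Callable[[], List[str]]]) -> Tuple[List[str], List[str]]:
    """Call every cell; return the merged results plus the cells that failed."""
    results: List[str] = []
    unreachable: List[str] = []
    for name, list_instances in cells.items():
        try:
            results.extend(list_instances())
        except ConnectionError:
            # A dead cell is recorded rather than aborting the whole call.
            unreachable.append(name)
    return results, unreachable


def _down() -> List[str]:
    raise ConnectionError("cell unreachable")


# Hypothetical cells: cell-b is down, the other two answer normally.
cells = {
    "cell-a": lambda: ["vm-1", "vm-2"],
    "cell-b": _down,
    "cell-c": lambda: ["vm-3"],
}
instances, unreachable = broadcast(cells)
```

The trade-off is that callers must cope with incomplete listings, which is why the list of unreachable cells is returned alongside the results.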
Things we’ve added, not in trunk (yet)
● Security group syncing
● EC2 ID mappings (needed for metadata)
● Availability zone / aggregate support
● Flavour management
*We assume a cell only has 1 parent
Cells: How we use them at CERN
Belmiro Moreira
email: belmiro.moreira @ cern.ch
@belmiromoreira
CERN
● Conseil Européen pour la Recherche Nucléaire – aka European Organization for Nuclear Research
● Founded in 1954 with an international treaty
○ 21 member states; other countries contribute to experiments
○ Situated between Geneva and the Jura Mountains, straddling the Swiss-French border
● CERN's mission is to do fundamental research
● CERN provides particle accelerators and other infrastructure for high-energy physics research
CERN - Cloud Infrastructure
● In production since July 2013
● Performed two upgrades: Grizzly -> Havana -> Icehouse
○ Currently running: nova; glance; keystone; horizon; cinder w/ Ceph; ceilometer
● RDO distribution on SLC6; pip with Windows Server 2012 R2
● 2 geographically separated data centres
○ Geneva (Switzerland) and Budapest (Hungary)
● Numbers
○ ~3000 compute nodes (75k cores; 140TB RAM)
■ ~2900 KVM; ~100 Hyper-V
○ ~8000 virtual machines
CERN - Cloud Infrastructure - Cells
● Why do we use cells?
○ Scale transparently between different Data Centres
○ Availability and Resilience
○ Isolate different use-cases
● Today: 1 api Cell and 8 compute Cells
○ 2 level tree
○ Sizes range from 100 to ~1600 compute nodes
○ 6 Compute Cells in Switzerland; 2 Compute Cells in Hungary
● “Shared” and “Private” Cells
○ 3 availability zones available in “Shared” Cells
CERN - Cells Limitations
● Missing functionality
○ Security Groups
○ Flavor propagation (api -> compute)
○ Manage aggregates on api Cell
○ Server groups
● Cell scheduler
● Ceilometer integration
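Since flavor propagation from the api cell to the compute cells is listed as missing, an operator-side sync would need to work out what differs between the two. The sketch below shows one way to plan such a sync by comparing two flavor tables. It is an assumption-laden illustration only: the `plan_flavor_sync` helper, the tuple layout, and the flavor values are hypothetical and do not use any Nova API.

```python
from typing import Dict, Set, Tuple

# Hypothetical flavor record: (vcpus, ram_mb, disk_gb)
Flavor = Tuple[int, int, int]


def plan_flavor_sync(
    api: Dict[str, Flavor], compute: Dict[str, Flavor]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return the flavor names to create, update, and delete in a compute cell."""
    to_create = set(api) - set(compute)
    to_delete = set(compute) - set(api)
    to_update = {name for name in set(api) & set(compute) if api[name] != compute[name]}
    return to_create, to_update, to_delete


# Hypothetical data: the api cell is treated as the source of truth.
api_flavors = {"m1.small": (1, 2048, 20), "m1.medium": (2, 4096, 40)}
cell_flavors = {"m1.small": (1, 1024, 20), "m1.old": (1, 512, 10)}
create, update, delete = plan_flavor_sync(api_flavors, cell_flavors)
```

Treating the api cell as authoritative keeps the plan one-directional, matching the "api -> compute" direction named on the slide.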
CERN - Cells Challenges
● ~74000 more cores by the beginning of 2015
○ How to organize and distribute nodes between different cells?
● Split current large cells into smaller cells of ~200 compute nodes each
○ Expected to have 30+ cells by the end of 2015
○ How to manage a large number of Cells?
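A rough back-of-the-envelope check of the cell count, using only figures from these slides (~3000 compute nodes, 75k cores, ~74000 additional cores, ~200 nodes per cell). The one assumption not in the slides is that new hardware has the same number of cores per node as the current fleet; denser nodes would lower the result.

```python
import math

# Figures taken from the slides.
current_nodes = 3000
current_cores = 75_000
added_cores = 74_000
target_cell_size = 200  # compute nodes per cell

# Assumption: new nodes match the current average cores per node.
cores_per_node = current_cores / current_nodes
added_nodes = math.ceil(added_cores / cores_per_node)
total_nodes = current_nodes + added_nodes
cells_needed = math.ceil(total_nodes / target_cell_size)

print(cores_per_node, added_nodes, total_nodes, cells_needed)
```

Under that assumption the estimate lands at 30 cells, consistent with the expectation stated above.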
Cells at Rackspace
Cells: How to Evolve Your Cloud to Scale
Created by: Matt Van Winkle @mvanwink
Modified Date: 10/29/2014
Rackspace
• Managed Cloud company offering a suite of dedicated and cloud hosting products
• Founded in 1998 in San Antonio, TX
• Home of Fanatical Support
• More than 200,000 customers in 120 countries
www.rackspace.com
Rackspace – Cloud Infrastructure
• In production since August 2012
– Currently running: Nova; Glance; Neutron; Ironic; Swift; Cinder
• Regular upgrades from trunk
– Package built on trunk pull from 10/21 in testing now
• Compute nodes are Debian based
– Run as VMs on hypervisors and manage via XAPI
• 6 Geographic regions around the globe
– DFW; ORD; IAD; LON; SYD; HKG
• Numbers
– Tens of thousands of hypervisors (over 330K Cores, 1+ Petabyte of RAM)
• All XenServer
– Over 150,000 virtual machines
Rackspace – Cloud Infrastructure - Cells
• Why do we use cells?
– Manage Multiple Flavor Classes
– Network resources (Public IPs, Private IPs, aggregation routers, etc.)
– Network Constraints
– Continual Supply Chain
• 1 Global API cell per region with multiple Compute cells (3 – 35+)
– 2 level tree
– Size between ~100 and ~600 hosts per cell
• Control infrastructure exists as instances in a small OpenStack deployment
• All cells available to all tenants
– Tested “dedicated” cells for potential large customers
Rackspace – Cells Limitations
• Missing Functionality
– Security Groups
– Host aggregates
• Scheduler
– No “disable”
– Incomplete host statuses
• Other services are not cell aware
– Neutron is a prime example
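As an illustration of the scheduler gaps listed for Rackspace (no “disable”, cells organised by flavor class), the following sketch shows what a cell-selection step with those two controls might look like. It is not the Nova cells scheduler: `CellState`, `pick_cell`, the `disabled` flag, the flavor-class names, and the cell names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CellState:
    """Hypothetical per-cell view a scheduler might keep."""
    name: str
    flavor_class: str
    free_hosts: int
    disabled: bool = False  # the operator switch the slide says is missing


def pick_cell(cells: List[CellState], flavor_class: str) -> Optional[CellState]:
    """Pick the enabled cell of the right class with the most free hosts."""
    candidates = [
        c for c in cells
        if not c.disabled and c.flavor_class == flavor_class and c.free_hosts > 0
    ]
    return max(candidates, key=lambda c: c.free_hosts, default=None)


# Hypothetical data: c0002 has the most room but is disabled for maintenance.
cells = [
    CellState("c0001", "general", free_hosts=40),
    CellState("c0002", "general", free_hosts=90, disabled=True),
    CellState("c0003", "io", free_hosts=15),
]
chosen = pick_cell(cells, "general")
```

Filtering before ranking means a disabled cell is never chosen, however much free capacity it reports.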
Rackspace – Cells Challenges
• Increasing number of flavor classes
– Different Hardware specs per class
– Sizing varies by average VM density
• Multiple vendor sources
– Subtle hardware differences in same specs across different vendors
• Scaling global services with cell growth
– Still don’t have the perfect ratios
Cells Feature Completion
• Nova Dev team met this morning to discuss cells in a few sessions:
– Cells – Wednesday, November 5, 09:00
– Cells continued – Wednesday, November 5, 09:50
• Areas of discussion
– Feature completion
– No-op/single cell as default
– Cell awareness in APIs
• Recap from sessions
Thank You!
● Belmiro Moreira - CERN - belmiro.moreira@cern.ch
● Matt Van Winkle - Rackspace - @mvanwink
● Sam Morrison - NeCTAR, University of Melbourne - sam.morrison@unimelb.edu.au
Questions?

Multi-Cell OpenStack: How to Evolve Your Cloud to Scale - November, 2014
