CERN is expanding its computing infrastructure to support growing data and computing needs. It is adopting open source tools like Puppet for configuration management and OpenStack for cloud computing. CERN plans to deploy OpenStack into production in 2013 to manage over 15,000 hypervisors and 100,000 VMs across its data centers by 2015, supporting both traditional and cloud-based workflows. This will enable CERN to more efficiently manage resources and better support dynamic workloads and temporary spikes in demand.
Architectures for open and scalable clouds - Randy Bias
My presentation for 2012's Cloud Connect that goes over architectural and design patterns for open and scalable clouds. Technical deck targeted at business audiences with a technical bent.
We looked at several strategies for building a hybrid cloud around OpenStack. Using Hyper-V as an example, we also examined how OpenStack supports commercial hypervisors and walked through the development process involved.
CERN is the home of the Large Hadron Collider (LHC), a 27km circular proton accelerator generating tens of petabytes of new data every year. Data is stored and processed using a large pool of resources totaling over 250,000 cores and thousands of storage servers, managed by OpenStack.
Networking is a critical part of our infrastructure and arguably the hardest to evolve. Given the size of CERN’s infrastructure, its flat network is partitioned in segments each representing a separate broadcast domain and potentially offering different levels of service. This fragmentation improves scalability and reduces the impact of misbehaving systems in the datacentre to individual segments. On the other hand, having multiple broadcast domains means features like floating and virtual IPs are much harder to offer.
We will tell the story of OpenStack Networking at CERN: first the integration with Nova Network, then the migration to Neutron, and how we're adding SDN to our infrastructure.
[Open Source Consulting] OpenStack Ceph, Neutron, HA, Multi-Region - Ji-Woong Choi
This deck covers OpenStack Ceph & Neutron.
1. OpenStack
2. How to create instance
3. Ceph
- Ceph
- OpenStack with Ceph
4. Neutron
- Neutron
- How neutron works
5. OpenStack HA - controller - L3 agent
6. OpenStack multi-region
VMware Monitoring - Discover And Monitor Your Virtual Environment - Site24x7
Gain a holistic view of your VMware infrastructure. Monitor VMware vSphere hosts and virtual machines (VMs). Get graphical views, alarms and thresholds, out-of-the-box reports, comprehensive fault management and maximum ESX server uptime. Site24x7 vCenter servers allow you to take control of your virtual resources and VMware infrastructure.
Docker Networking with New Ipvlan and Macvlan Drivers - Brent Salisbury
Docker Networking presentation at ONS2016.
Docker Macvlan and Ipvlan Networking Drivers Experimental Readme:
github.com/docker/docker/blob/master/experimental/vlan-networks.md
The kernel requirement for Ipvlan mode is v4.2+; for Macvlan mode it is v3.19.
If using VirtualBox to test with, use NAT-mode interfaces unless you have multiple MAC addresses working in your setup. Use the 172.x.x.x subnet and gateway used by the VirtualBox NAT network. VMware Fusion works out of the box.
Here is a screenshot of a VirtualBox NAT interface:
https://www.dropbox.com/s/w1rf61n18y7q4f1/Screenshot%202016-03-20%2001.55.13.png?dl=0
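The kernel-version requirements above (Ipvlan needs v4.2+, Macvlan v3.19) can be checked before attempting to create a network. A minimal sketch in Python; the release strings and helper names are illustrative, not part of Docker itself:

```python
# Check whether a kernel release string satisfies the minimum versions
# needed by the experimental Macvlan (3.19+) and Ipvlan (4.2+) drivers.

def kernel_at_least(release, minimum):
    """Compare a release like '4.4.0-31-generic' against a (major, minor) tuple."""
    major, minor = release.split("-")[0].split(".")[:2]
    return (int(major), int(minor)) >= minimum

def supported_vlan_drivers(release):
    drivers = []
    if kernel_at_least(release, (3, 19)):
        drivers.append("macvlan")
    if kernel_at_least(release, (4, 2)):
        drivers.append("ipvlan")
    return drivers

print(supported_vlan_drivers("3.19.0-25-generic"))  # ['macvlan']
print(supported_vlan_drivers("4.4.0-31-generic"))   # ['macvlan', 'ipvlan']
```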
Continuous Delivery to Kubernetes with Jenkins and Helm - David Currie
Presentation given at Oracle Code One 2018 covering deploying Jenkins to Kubernetes with Helm, deploying to Kubernetes from Jenkins with Helm, and Jenkins X.
Interconnecting Neutron and Network Operators' BGP VPNs - Thomas Morin
joint presentation given at OpenStack summit Barcelona (Oct. 2016) with Paul Carver and Tim Irnich
talk video: https://www.youtube.com/watch?v=LCDeR7MwTzE
demo: https://www.youtube.com/watch?v=5iRoZcmQyuU
OpenNebulaConf2015 1.07 Cloud for Scientific Computing @ STFC - Alexander Dibbo - OpenNebula Project
The Science and Technology Facilities Council is a UK Research Council which funds research and provides large facilities to the UK Scientific Community. This includes running a Tier 1 site for the LHC computing project, the JASMIN Super Data Cluster and a number of other HPC and HTC facilities. The Scientific Computing Department at the Rutherford Appleton Laboratory has been developing a cloud for use across both sites of the Department and in the wider scientific community. This is an OpenNebula backed by Ceph block storage. I will give a brief background of the project, describe our set up, some use cases and the work we have done around OpenNebula (including a simplified web front-end and a number of hooks to provide us with traceability). I will also discuss how we are creating an elastic boundary between our HTC batch farm and cloud.
Author Biography
I am a Systems Administrator in the Scientific Computing Department of the UK’s Science and Technology Facilities Council. I work as part of the cloud team and I also work on a number of Grid services including our HTC batch farm for the LHC computing project.
Prior to my position here I worked in IT at a SMB focusing on Storage and Virtualisation, in particular Hyper-V and VMWare.
A “meta-cloud” for building clouds
Build your own cloud on our hardware resources
Agnostic to specific cloud software
Run existing cloud software stacks (like OpenStack, Hadoop, etc.)
... or new ones built from the ground up
Control and visibility all the way to the bare metal
“Sliceable” for multiple, isolated experiments at once
CLIMB System Introduction Talk - CLIMB Launch - Tom Connor
Talk outlining the CLoud Infrastructure for Microbial Bioinformatics (CLIMB) system given at the CLIMB Launch in July 2016. CLIMB is a UK national e-infrastructure providing Microbial Bioinformatics as a Service.
CERN is the European Centre for Particle Physics, based in Geneva. The home of the Large Hadron Collider and the birthplace of the World Wide Web is expanding its computing resources with a second data centre to process over 35PB/year from one of the largest scientific experiments ever constructed.
Within the constraints of fixed budget and manpower, agile computing techniques and common open source tools are being adopted to support over 11,000 physicists in their search for how the universe works and what it is made of.
By challenging special requirements and understanding how other large computing infrastructures are built, we have deployed a 50,000 core cloud based infrastructure building on tools such as Puppet, OpenStack and Kibana.
Moving to a cloud model has also required close examination of IT processes and culture. Finding the right approach between Enterprise and DevOps techniques has been one of the greatest challenges of this transformation.
This talk will cover the requirements, tools selected, results achieved so far and the outlook for the future.
Who Needs Network Management in a Cloud Native Environment? - Eshed Gal-Or
(This talk was presented in OSS NA 2017 Los Angeles )
Network management (and virtual network in particular) is hard.
Cloud app developers find themselves dealing with too many options and too many settings, which make no sense.
This is because Cloud APIs evolved from legacy IT management.
Cloud-Native apps are revolutionizing how software is developed and deployed.
Why do app developers need to deal with those legacy network knobs and gauges?
Why do we even need to care about IP addresses, routers, or load balancers, in a cloud-native world?
In this presentation, we will explore an alternative approach and how we could go about implementing it *today* with K8S and Dragonflow (an open source virtual network management project), to provide a more stable, better-performing and truly scalable cloud-native infrastructure.
Dev / Test / Ops – Gain More Horsepower and Reduce Costs by Sharing Kubernete... - Ian Lumb
Containerization coupled with DevOps is revolutionizing application development and deployment, but organizations are creating silos of clusters that limit the operational efficiencies that can be gained by sharing hardware, software and systems administrators. This talk will cover how improved cluster management and consolidation at the orchestration, network and storage layers can yield great returns for developers, DevOps and IT management. (Slides from a presentation and demo at the Toronto Kubernetes Meetup on April 26, 2017.)
There is a growing trend today of enterprises leveraging both Amazon Web Services (AWS) and on-premise OpenStack-based private clouds. However, the default networking option in OpenStack remains broken and the plethora of confusing plug-ins makes networking in OpenStack mysterious and difficult to manage.
Enter MidoNet, the open source network virtualization solution from Midokura favored by DevOps cultures in web scale enterprises and service providers around the world. This session will present case studies from several end user deployments, showing how they use MidoNet to build, run and manage large-scale virtual networks in OpenStack clouds. The session will also discuss how transitioning from a public to private cloud enables organizations to accomplish much more with the same resources, without over-simplifying the inherent complexity of running an OpenStack cloud.
Dell OpenStack Cloud with Inktank Ceph – Large Scale Customer Deployment - Kamesh Pemmaraju
This was my presentation at the OpenStack Summit in Hong Kong, November 2013. Learn detail around a unique deployment of the Dell OpenStack-Powered Cloud Solution with Inktank Ceph installed at a large nationally recognized American University that specializes in cancer and genomic research. The University had a need to provide a scalable, secure, centralized data repository to support approximately 900 researchers and an ever-expanding number of research projects and rapidly expanding universe of data. The Dell and Inktank cloud storage solution addresses these storage challenges with an open source solution that leverages the Dell Crowbar Framework and Reference Architecture. After assessing a number of traditional storage scenarios, the University partnered with Dell and Inktank to architect a centralized cloud storage platform that is capable of scaling seamlessly and rapidly, is cost-effective, and that can leverage a single hardware infrastructure, with Dell Power Edge R-720XD servers and the Dell Reference Architecture for their OpenStack compute and storage environment.
The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data - DataCentred
Presentation given by our CEO Mike Kelly at this year's Excellence in Policing conference talking about the benefits of cloud computing and the Effectiveness, Efficiency and Legitimacy of outsourcing data. The presentation looks at the long term trends supporting the adoption of cloud technologies and dispels some of the myths and reasons why not to adopt cloud.
The presentation concludes with an examination of the benefits of utilising cloud technology and examines how best to adopt a cloud approach.
Supporting Research through "Desktop as a Service" models of e-infrastructure... - David Wallom
Keynote presentation given 13/9/16 @ ESA Earth Observation Open Science workshop 2016.
"The rise in cloud computing as an e-infrastructure model is one that has the power to democratise access to computational and data resources throughout the research communities. We have seen the difference that Infrastructure as a Service (IaaS) has made for different communities and are now only beginning to understand what different models further up the stack can make. It is also becoming clear that with the increase in research data volumes, the number of sources and the possibility of utilising data from different regulatory regimes that a different model of how analysis is performed on the data is possible. Utilising a "Desktop as a Service" model, with community focused applications installed on a common and well understood virtual system image that is directly connected to community relevant data allows the researcher to no longer have to consider moving data but only the final analysed results. This massively simplifies both the user model and the data and resource owner model. We will consider the specific example of the Environmental Ecomics Synthesis Cloud and how it could easily be generalised to other areas."
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... - Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
JMeter webinar - integration with InfluxDB and Grafana - RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
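The JMeter-to-InfluxDB integration described above ultimately writes samples as points in InfluxDB's line protocol (`measurement,tags fields timestamp`). A minimal sketch of building one such point; the measurement, tag, and field names here are illustrative, not JMeter's exact Backend Listener schema:

```python
def influx_point(measurement, tags, fields, ts_ns):
    """Render one InfluxDB line-protocol point: measurement,tag=v field=v timestamp."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

# A hypothetical response-time sample from a load test:
point = influx_point(
    "jmeter",
    {"application": "demo", "transaction": "login"},
    {"avg": 42.0, "count": 10},
    1609459200000000000,
)
print(point)
# jmeter,application=demo,transaction=login avg=42.0,count=10 1609459200000000000
```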
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality - Inflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for technology and making things work, along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview - Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... - UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
CERN Data Centre Evolution
1. CERN Data Centre Evolution
Gavin McCance
gavin.mccance@cern.ch
@gmccance
SDCD12: Supporting Science with Cloud Computing
Bern
19th November 2012
2. What is CERN?
Gavin McCance, CERN
• Conseil Européen pour la Recherche Nucléaire – aka European Laboratory for Particle Physics
• Between Geneva and the Jura mountains, straddling the Swiss-French border
• Founded in 1954 with an international treaty
• Our business is fundamental physics: what is the universe made of and how does it work?
3. Answering fundamental questions…
• How to explain particles have mass? We have theories and accumulating experimental evidence… Getting close…
• What is 96% of the universe made of? We can only see 4% of its estimated mass!
• Why isn’t there anti-matter in the universe? Nature should be symmetric…
• What was the state of matter just after the « Big Bang »? Travelling back to the earliest instants of the universe would help…
6. Data Centre by Numbers
• Hardware installation & retirement: ~7,000 hardware movements/year; ~1,800 disk failures/year
• CPU mix (pie chart): Xeon 5150 2%, Xeon 5160 10%, Xeon E5335 7%, Xeon E5345 14%, Xeon E5405 6%, Xeon E5410 16%, Xeon L5420 8%, Xeon L5520 33%, Xeon 3GHz 4%
• Disk vendor mix (pie chart): Fujitsu 3%, Hitachi 23%, HP 0%, Maxtor 0%, Seagate 15%, Western Digital 59%, Other 0%
• High Speed Routers (640 Mbps → 2.4 Tbps): 24
• Ethernet Switches: 350
• 10 Gbps ports: 2,000
• Switching Capacity: 4.8 Tbps
• 1 Gbps ports: 16,939
• 10 Gbps ports: 558
• Racks: 828
• Servers: 11,728
• Processors: 15,694
• Cores: 64,238
• HEPSpec06: 482,507
• Disks: 64,109
• Raw disk capacity (TiB): 63,289
• Memory modules: 56,014
• Memory capacity (TiB): 158
• RAID controllers: 3,749
• Tape Drives: 160
• Tape Cartridges: 45,000
• Tape slots: 56,000
• Tape Capacity (TiB): 73,000
• IT Power Consumption: 2,456 kW
• Total Power Consumption: 3,890 kW
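The inventory figures above imply an average drive size of roughly 1 TiB and about 5.5 cores per server; a quick sanity check of the arithmetic:

```python
# Derive rough per-unit averages from the data-centre inventory figures above.
raw_disk_tib = 63_289
disks = 64_109
cores = 64_238
servers = 11_728

print(round(raw_disk_tib / disks, 2))   # ~0.99 TiB per drive
print(round(cores / servers, 1))        # ~5.5 cores per server
```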
7. Current infrastructure
• Around 12k servers
– Dedicated compute, dedicated disk server, dedicated service nodes
– Majority Scientific Linux (RHEL5/6 clone)
– Mostly running on real hardware
– Last couple of years, we’ve consolidated some of the service nodes onto Microsoft Hyper-V
– Various other virtualisation projects around
• In 2002 we developed our own management toolset
– Quattor / CDB configuration tool
– Lemon computer monitoring
– Open source, but a small community
8. • Many diverse applications (”clusters”)
• Managed by different teams (CERN IT + experiment groups)
9. New data centre to expand capacity
• Data centre in Geneva at the limit of electrical capacity at 3.5MW
• New centre chosen in Budapest, Hungary
• Additional 2.7MW of usable power
• Hands-off facility
• Deploying from 2013 with 200Gbit/s network to CERN
10. Time to change strategy
• Rationale
– Need to manage twice the servers as today
– No increase in staff numbers
– Tools becoming increasingly brittle and will not scale as-is
• Approach
– CERN is no longer a special case for compute
– Adopt an open source tool chain model
– Our engineers rapidly iterate
• Evaluate solutions in the problem domain
• Identify functional gaps and challenge old assumptions
• Select first choice but be prepared to change in future
– Contribute new function back to the community
12. Choose Puppet for Configuration
• The tool space has exploded in the last few years
– In configuration management and operations
• Puppet and Chef are the clear leaders for ‘core tools’
• Many large enterprises now use Puppet
– Its declarative approach fits what we’re used to at CERN
– Large installations: friendly, wide-based community
– You can buy books on it
– You can employ people who know it better than we do
13. Puppet Experience
• Excellent: basic Puppet is easy to set up and can be scaled up well
• Well documented; configuring services with it is easy
• Handles our cluster diversity and dynamic clouds well
• Lots of resources (“modules”) online, though of varying quality
• Large, responsive community to help
• Lots of nice tooling for free
– Configuration version control and branching: integrates well with git
– Dashboard: we use the Foreman dashboard
• We’re moving all our production services over in 2013
15. Preparing the move to cloud
• Improve operational efficiency and flexibility
– Dynamic multiple operating system demand
– Dynamic temporary load spikes for special activities
– Hardware interventions with long-running programs (live migration)
• Improve resource efficiency
– Exploit idle resources, especially waiting for disk and tape I/O
– Highly variable load such as interactive or build machines
• Enable cloud architectures
– Gradual migration from traditional batch + disk to cloud interfaces and workflows
• Improve responsiveness
– Self-service with coffee-break response time
16. What is OpenStack ?
• OpenStack is a cloud operating system that controls large
pools of compute, storage, and networking resources
throughout a datacenter, all managed through a dashboard
that gives administrators control while empowering their users
to provision resources through a web interface
17. Service Model
• Pets are given names like
pussinboots.cern.ch
• They are unique, lovingly hand raised
and cared for
• When they get ill, you nurse them back
to health
• Cattle are given numbers like
vm0042.cern.ch
• They are almost identical to other cattle
• When they get ill, you get another one
• Future application architectures should use Cattle, but Pets with
strong configuration management are viable and still needed
Borrowed from @randybias at Cloudscaling
http://www.slideshare.net/randybias/the-cloud-revolution-cyber-press-forum-philippines
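The cattle half of the model is deliberately mechanical; a toy sketch of the numbered naming scheme (format and domain taken from the slide's `vm0042.cern.ch` example):

```python
# Cattle get sequential, machine-generated names rather than
# individual "pet" names like pussinboots.cern.ch.
def cattle_name(n: int, domain: str = "cern.ch") -> str:
    """Return a cattle-style hostname, e.g. vm0042.cern.ch."""
    return f"vm{n:04d}.{domain}"

print(cattle_name(42))  # vm0042.cern.ch
```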
18. Basic OpenStack Components
[Architecture diagram: NOVA — Compute, Scheduler, Network, Volume; GLANCE — Registry, Image; alongside KEYSTONE and HORIZON]
• Each component has an API and is pluggable
• Other non-core projects interact with these components
19. Supporting the Pets with OpenStack
• Network
– Interfacing with legacy site DNS and IP management
– Ensuring Kerberos identity before VM start
• Puppet
– Ease the use of configuration management tools for our users
– Exploit MCollective for orchestration/delegation
• External Block Storage
– Currently using nova-volume with a Gluster backing store
• Live migration to maximise availability
– KVM live migration using Gluster
– KVM and Hyper-V block migration
20. Current Status of OpenStack at CERN
• Working on an Essex code base from the EPEL repository
– Excellent experience with the Fedora cloud-sig team
– Cloud-init for contextualisation, Oz for images with RHEL/Fedora
• Components
– Current focus is on Nova with KVM and Hyper-V
– Keystone running with Active Directory and Glance for Linux and
Windows images
• Pre-production facility with around 200 hypervisors and
2,000 VMs, integrated with CERN infrastructure
– used for simulation of magnet placement using LHC@Home and batch
physics programs
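Contextualisation with cloud-init (mentioned above) is driven by a user-data file supplied at boot; a generic sketch, with the hostname, package list, and Puppet master as placeholder values rather than CERN's actual setup:

```yaml
#cloud-config
# Illustrative contextualisation: name the VM, install the Puppet
# agent, and trigger a first run against a (placeholder) master.
hostname: vm0042
packages:
  - puppet
runcmd:
  - [puppet, agent, --server, puppet.example.org, --onetime, --no-daemonize]
```

Handing freshly booted cattle straight to the configuration management system like this is what keeps provisioning self-service while configuration stays centrally declared.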
22. Next Steps
• Deploy into production at the start of 2013 with Folsom running
production services and compute on top of OpenStack IaaS
• Support multi-site operations with a 2nd data centre in Hungary
• Exploit new functionality
– Ceilometer for metering
– Bare metal for non-virtualised use cases such as high-I/O servers
– X.509 user certificate authentication
– Load balancing as a service
Ramping to 15K hypervisors with 100K
VMs by 2015
23. Conclusions
• CERN computer centre is expanding
• We’re in the process of refurbishing the tools we use
to manage the centre, based on OpenStack for IaaS
and Puppet for configuration management
• Production at CERN in the next few months on Folsom
– Gradual migration of all our services
• Community is key to shared success
– CERN contributes and benefits
25. Training and Support
• Buy the book rather than rely on guru mentoring
• Follow the mailing lists to learn
• Newcomers are rapidly productive (and often know more than us)
• Community and Enterprise support means we’re not on our own
26. Staff Motivation
• Skills remain valuable outside of CERN when an engineer’s
contract ends
27. When communities combine…
• OpenStack’s many components and options make
configuration complex out of the box
• Puppet Forge modules from Puppet Labs handle our configuration
• The Foreman adds OpenStack provisioning: from user kiosk to a
configured machine in 15 minutes
29. Active Directory Integration
• CERN’s Active Directory
– Unified identity management across the site
– 44,000 users
– 29,000 groups
– 200 arrivals/departures per month
• Full integration with Active Directory via LDAP
– Uses the OpenLDAP backend with some particular configuration
settings
– Aim for minimal changes to Active Directory
– 7 patches submitted around hard-coded values and additional filtering
• Now in use in our pre-‐production instance
– Map project roles (admins, members) to groups
– Documentation in the OpenStack wiki
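The LDAP integration above is driven from keystone.conf; the sketch below shows the sort of `[ldap]` section involved. The server, DNs, and credentials are placeholders, and the exact option set varies by release — this is not CERN's actual configuration:

```ini
[identity]
driver = keystone.identity.backends.ldap.Identity

[ldap]
; Placeholder values; a real deployment points at the site's
; Active Directory and matches its object classes and attributes.
url = ldap://ad.example.org
user = CN=svc-keystone,OU=Services,DC=example,DC=org
password = secret
suffix = DC=example,DC=org
user_tree_dn = OU=Users,DC=example,DC=org
user_objectclass = person
user_id_attribute = cn
group_tree_dn = OU=Groups,DC=example,DC=org
group_objectclass = group
```

Mapping project roles to existing AD groups, as the slide describes, then needs no changes on the directory side beyond read access for the service account.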
30. What are we missing (or haven’t found yet) ?
• Best practice for
– Monitoring and KPIs as part of core functionality
– Guest disaster recovery
– Migration between versions of OpenStack
• Roles within multi-user projects
– VM owners allowed to manage their own resources (start/stop/delete)
– Project admins allowed to manage all resources
– Other members should not have high rights over other members’ VMs
• Global quota management for a non-elastic private cloud
– Manage resource prioritisation and allocation centrally
– Capacity management / utilisation for planning
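For context, Nova of that era does provide per-project quota defaults in nova.conf (option names as in Essex/Folsom; the values below are purely illustrative) — the gap noted above is managing such limits globally and centrally rather than project by project:

```ini
# Per-project default quotas in nova.conf (illustrative values).
quota_instances = 10
quota_cores = 20
quota_ram = 51200
quota_volumes = 10
quota_gigabytes = 1000
```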
31. Opportunistic Clouds in online experiment farms
• The CERN experiments have farms of 1000s of Linux servers
close to the detectors to filter the 1 PB/s of detector data down to
the 6 GB/s to be recorded to tape
• When the accelerator is not running, these machines are
currently idle
– Accelerator has regular maintenance slots of several days
– Long Shutdown due from March 2013-‐November 2014
• One of the experiments is deploying OpenStack on their farm
– Simulation (low I/O, high CPU)
– Analysis (high I/O, high CPU, high network)
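The filtering ratio quoted above is worth making concrete — a back-of-envelope sketch, assuming decimal units:

```python
# Scale of the online filtering: the farms reduce ~1 PB/s of raw
# detector data to ~6 GB/s recorded to tape (figures from the slide).
raw_gb_per_s = 1_000_000   # 1 PB/s expressed in GB/s
kept_gb_per_s = 6

reduction = raw_gb_per_s / kept_gb_per_s
print(f"roughly 1 byte in {reduction:,.0f} is kept")
```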
Established by an international treaty at the end of the Second World War as a place where scientists could work together for fundamental research. ‘Nuclear’ is part of the name, but our world is particle physics.
Our current understanding of the universe is incomplete. A theory, called the Standard Model, proposes particles and forces, many of which have been experimentally observed. However, there are open questions. Why do some particles have mass and others not? The Higgs boson is a theory, but we need experimental evidence. Our theory of forces does not explain how gravity works. Cosmologists can only find 4% of the matter in the universe; we have lost the other 96%. We should have 50% matter and 50% anti-matter, so why is there an asymmetry (although it is a good thing that there is, since the two annihilate each other)? When we go back through time 13 billion years towards the Big Bang, we move back through planets, stars, atoms, and protons/electrons towards a soup-like quark-gluon plasma. What were the properties of this?
The ring consists of two beam pipes, with a vacuum pressure 10 times lower than on the Moon, which contain the beams of protons accelerated to just below the speed of light. These go round 11,000 times per second, bent by superconducting magnets cooled to 2 K by liquid helium (-450 F), colder than outer space. The beams themselves have a total energy similar to a high-speed train, so care needs to be taken to make sure they turn the corners correctly and don’t bump into the walls of the pipe.
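The figures in this note can be sanity-checked with a little arithmetic: 11,000 laps per second of the 27 km ring is indeed just below the speed of light.

```python
# Back-of-envelope check of the note's figures
# (the exact LHC revolution frequency is ~11,245 Hz, rounded here).
ring_km = 27          # circumference of the LHC ring
laps_per_s = 11_000   # revolutions per second quoted in the note
c_km_s = 299_792      # speed of light in km/s

beam_speed_km_s = ring_km * laps_per_s
print(beam_speed_km_s, f"{beam_speed_km_s / c_km_s:.1%} of c")
```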
To improve the statistics, we send round beams of multiple bunches; as they cross, there are multiple collisions as the 100 billion protons per bunch pass through each other. Software close by the detector, and later offline in the computer centre, then has to examine the tracks to understand the particles involved.
So, to the Tier-0 computer centre at CERN. We are unusual in that we are public with our environment, as there is no competitive advantage for us. We have thousands of visitors a year coming for tours and education, and the computer centre is a popular visit. The data centre has around 2.9 MW of usable power looking after 12,000 servers. In comparison, the accelerator uses 120 MW, like a small town. With 64,000 disks, we have around 1,800 failing each year; this is much higher than the manufacturers’ MTBFs, which is consistent with results from Google. Servers are mainly Intel processors, some AMD, with dual-core Xeon being the most common configuration.
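The disk figures above imply an annual replacement rate of a few percent — a quick worked check:

```python
# Annual disk failure rate implied by the note's figures:
# 64,000 disks with ~1,800 failures per year.
disks = 64_000
failures_per_year = 1_800

annual_failure_rate = failures_per_year / disks
print(f"{annual_failure_rate:.1%} of disks fail per year")  # prints "2.8% of disks fail per year"
```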
Asked member states for offers. 200 Gbit/s links connecting the centres. Expect to double computing capacity compared to today by 2015.
Double the capacity, same manpower. We need to rethink how to solve the problem and look at how others approach it. We had our own tools in 2002, and as they became more sophisticated, it was not possible to take advantage of developments elsewhere without a major break. Staff are doing this while doing their ‘day’ jobs, so it reinforces the approach of taking what we can from the community.
Model based on the Google toolchain; Puppet is key for many operations. We’ve only had to write one significant new custom CERN software component, which is in the certificate authority. Other parts, such as Lemon for monitoring, are from our previous implementation, as we did not want to change everything at once and they scale.
Standardise hardware: buy in bulk, pile it up, then work out what to use it for. Interventions cover memory, motherboards, cables or disks. Users waiting for I/O means wasted cycles. Build machines run at night and sit unused during the day; interactive machines are used mainly during the day. Move to cloud APIs: we need to support them but also maintain our existing applications. Details later on reception and testing.
Puppet applies well to the cattle model, but we’re also using it to handle the pet cases that can’t yet move over due to software limitations. So they get cloud provisioning together with flexible configuration management.
Complex to configure… take advantage of the experience of others
We’ve been very pleased with our choices. Along with the obvious benefits of the functionality, there are soft benefits from the community model.
Many staff at CERN are on short-term contracts; it is a good benefit for those staff to leave with skills that are in demand.
Communities integrating: when a new option is being used at CERN in OpenStack, we contribute the changes back to the Puppet Forge, such as certificate handling. We are even looking at Hyper-V/Windows OpenStack configuration.