CERN operates the largest particle physics laboratory in the world. It manages over 8,000 servers to support its research. In 2012, CERN recognized limits with its existing infrastructure management tools and formed a team to define a new "Agile Infrastructure Project." The project goals were to improve resource provisioning time, enable cloud interfaces, improve monitoring and accounting, and boost efficiency. The team adopted open source tools like OpenStack, Puppet, and Ceph to create a new cloud service spanning two data centers. This allowed on-demand provisioning in minutes versus months and helped CERN better support its expanding computing needs for research.
CERN is the European Centre for Particle Physics based in Geneva. The home of the Large Hadron Collider and the birthplace of the World Wide Web is expanding its computing resources with a second data centre to process over 35PB/year from one of the largest scientific experiments ever constructed.
Within the constraints of fixed budget and manpower, agile computing techniques and common open source tools are being adopted to support over 11,000 physicists in their search for how the universe works and what it is made of.
By challenging special requirements and understanding how other large computing infrastructures are built, we have deployed a 50,000 core cloud-based infrastructure building on tools such as Puppet, OpenStack and Kibana.
Moving to a cloud model has also required close examination of the IT processes and culture. Finding the right approach between Enterprise and DevOps techniques has been one of the greatest challenges of this transformation.
This talk will cover the requirements, tools selected, results achieved so far and the outlook for the future.
The CMS openstack, opportunistic, overlay, online-cluster Cloud (CMSooooCloud)
Jose Antonio Coarasa Perez
The CMS online cluster consists of more than 3000 computers. It has been exclusively used for the Data Acquisition of the CMS experiment at CERN, archiving around 20Tbytes of data per day.
An openstack cloud layer has been deployed on part of the cluster (totalling more than 13000 cores) as a minimal overlay so as to leave the primary role of the computers untouched while allowing an opportunistic usage of the cluster. This allows running offline computing jobs on the online infrastructure while it is not (fully) used.
We will present the architectural choices made to deploy an unusual, as opposed to dedicated, "overlaid cloud infrastructure". These architectural choices ensured a minimal impact on the running cluster configuration while giving a maximal segregation of the overlaid virtual computer infrastructure. Openvswitch was chosen during the proof of concept phase in order to avoid changes on the network infrastructure. Its use will be illustrated, as well as the final networking configuration used. The design and performance of the openstack cloud controlling layer will also be presented, together with new developments and experience from the first year of usage.
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Rafael Ferreira da Silva
Presentation held at NSFCloud Workshop - Arlington, USA
DICE Team at Department of Computer Science and Academic Computer Center CYFRONET of AGH collaborates with researchers at the University of Southern California and the Center for Research Computing at the University of Notre Dame. In the scope of this collaboration, we develop methods and tools supporting programming and execution of complex scientific applications on heterogeneous computing infrastructures.
More information: www.rafaelsilva.com
C2MON - A highly scalable monitoring platform for Big Data scenarios @CERN by...
J On The Beach
Developing reliable data acquisition, processing and control modules for mission-critical systems - as they run at CERN - has always been challenging. As both data volumes and rates increase, non-functional requirements such as performance, availability, and maintainability are becoming more important than ever. C2MON is a modular Open Source Java framework for realising highly available, large industrial monitoring and control solutions. It was initially developed for CERN’s demanding infrastructure monitoring needs and is based on more than 10 years of experience with the Technical Infrastructure Monitoring (TIM) systems at CERN. Combining maintainability and high availability within a portable architecture is the focus of this work. Because it makes use of standard Java libraries for in-memory data management, clustering and data persistence, the platform becomes interesting for many Big Data scenarios.
Overview of what has happened in HNSciCloud over the last five months, delivered by Helge Meinhard of CERN at the HEPiX Workshop on October 21st, 2016, in Berkeley, California, USA.
Slides from our Q3 meetup held in Montreal on September 27th 2017 at the Cloud.ca Center.
Video recording can be seen at: https://www.youtube.com/watch?v=_1btwHW39ms&list=PLSsQodeQD6LPyqrvvczcC5mkOOnPt469o
OpenStack and Red Hat: How we learned to adapt with our customers in a maturi...
OpenStack
Audience Level
All levels
Synopsis
Peter has been involved in the OpenStack community since its B-release, and he has been helping customers across various industries adopt OpenStack in strategic ways. In this session, you will learn from his experience what Red Hat’s perspective is on the current state of affairs in the OpenStack community and the path ahead in which Red Hat is investing its efforts. OpenStack is not a product that tries to solve any one business problem in particular, but a technology that aims to be usable for many. What steps are required to make sure that your organisation is ready for OpenStack-based cloudification and transformation?
Speaker Bio:
Peter Jung is a Senior Business Development Manager at Red Hat where he leads the practice in the areas of Cloud, SDN/NFV and IoT across Australia and New Zealand. He is passionate about open innovation and the open source software development model as the foundation for next-generation society and ICT systems. Prior to Red Hat, he held various roles at Cisco and Dell for 15 years. He holds a BSEE and an MBA.
OpenStack Australia Day Melbourne 2017
https://events.aptira.com/openstack-australia-day-melbourne-2017/
Overlay Opportunistic Clouds in CMS/ATLAS at CERN: The CMSooooooCloud in Detail
Jose Antonio Coarasa Perez
The CMS and ATLAS online clusters consist of more than 3000 computers each. They have been exclusively used for the data acquisition that led to the Higgs particle discovery, handling 100Gbytes/s data flows and archiving 20Tbytes of data per day.
An openstack cloud layer has been deployed on the newest part of the clusters (totalling 1300 hypervisors and more than 13000 cores in CMS alone) as a minimal overlay so as to leave the primary role of the computers untouched while allowing an opportunistic usage of the cluster.
This presentation will show how to share resources with a minimal impact on the existing infrastructure. We will present the architectural choices made to deploy an unusual, as opposed to dedicated, "overlaid cloud infrastructure". These architectural choices ensured a minimal impact on the running cluster configuration while giving a maximal segregation of the overlaid virtual computer infrastructure. The use of openvswitch to avoid changes on the network infrastructure and encapsulate the virtual machines' traffic will be illustrated, as well as the networking configuration adopted due to the nature of our private network. The design and performance of the openstack cloud controlling layer will be presented. We will also show the integration carried out to allow the cluster to be used in an opportunistic way while giving full control to the CMS online run control.
Blue Waters and Resource Management - Now and in the Future
inside-BigData.com
In this presentation from Moabcon 2013, Bill Kramer from NCSA presents: Blue Waters and Resource Management - Now and in the Future.
Watch the video of this presentation: http://insidehpc.com/?p=36343
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
Igor Sfiligoi
Presented at PEARC20.
This talk presents the expansion of IceCube’s production HTCondor pool using cost-effective GPU instances in preemptible mode gathered from the three major Cloud providers, namely Amazon Web Services, Microsoft Azure and the Google Cloud Platform. Using this setup, we sustained about 15k GPUs for a whole workday, corresponding to around 170 PFLOP32s, integrating over one EFLOP32 hour worth of science output for a price tag of about $60k. In this paper, we provide the reasoning behind Cloud instance selection, a description of the setup and an analysis of the provisioned resources, as well as a short description of the actual science output of the exercise.
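The headline figures in this abstract can be cross-checked with a little arithmetic. The sketch below is only illustrative: the GPU count, sustained throughput and price are the rounded values quoted above, while the 8-hour length of a "whole workday" and all function names are assumptions introduced here, not taken from the talk.

```python
# Rounded figures quoted in the abstract.
GPUS = 15_000               # "about 15k GPUs"
SUSTAINED_PFLOP32S = 170.0  # "around 170 PFLOP32s"
COST_USD = 60_000.0         # "about $60k"

# Assumption (not in the source): a "whole workday" is taken as 8 hours.
WORKDAY_HOURS = 8.0


def per_gpu_tflop32s(total_pflop32s: float, gpus: int) -> float:
    """Average fp32 throughput per GPU, in TFLOP32s (1 PFLOP = 1000 TFLOP)."""
    return total_pflop32s * 1_000.0 / gpus


def integrated_eflop32_hours(total_pflop32s: float, hours: float) -> float:
    """Compute integrated over the run, in EFLOP32 hours (1 EFLOP = 1000 PFLOP)."""
    return total_pflop32s * hours / 1_000.0


def cost_per_gpu_hour(cost_usd: float, gpus: int, hours: float) -> float:
    """Average price paid per GPU per hour, in USD."""
    return cost_usd / (gpus * hours)


if __name__ == "__main__":
    print(f"per GPU:    {per_gpu_tflop32s(SUSTAINED_PFLOP32S, GPUS):.1f} TFLOP32s")
    print(f"integrated: {integrated_eflop32_hours(SUSTAINED_PFLOP32S, WORKDAY_HOURS):.2f} EFLOP32 hours")
    print(f"price:      ${cost_per_gpu_hour(COST_USD, GPUS, WORKDAY_HOURS):.2f} per GPU-hour")
```

Under the 8-hour assumption the integrated compute comes out above one EFLOP32 hour, which is consistent with the "over one EFLOP32 hour" stated in the abstract; a different workday length would scale the integrated and per-GPU-hour figures accordingly.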
Building a GPU-enabled OpenStack Cloud for HPC - Blair Bethwaite, Monash Univ...
OpenStack
Audience Level
Intermediate
Synopsis
M3 is the latest generation system of the MASSIVE project, an HPC facility specializing in characterization science (imaging and visualization). Using OpenStack as the compute provisioning layer, M3 is a hybrid HPC/cloud system, custom-integrated by Monash’s R@CMon Research Cloud team. Built to support Monash University’s next-gen high-throughput instrument processing requirements, M3 is half-half GPU-accelerated and CPU-only.
We’ll discuss the design and tech used to build this innovative platform as well as detailing approaches and challenges to building GPU-enabled and HPC clouds. We’ll also discuss some of the software and processing pipelines that this system supports and highlight the importance of tuning for these workloads.
Speaker Bio
Blair Bethwaite: Blair has worked in distributed computing at Monash University for 10 years, with OpenStack for half of that. Having served as team lead, architect, administrator, user, researcher, and occasional hacker, Blair’s unique perspective as a science power-user, developer, and system architect has helped guide the evolution of the research computing engine central to Monash’s 21st Century Microscope.
Lance Wilson: Lance is a mechanical engineer, who has been making tools to break things for the last 20 years. His career has moved through a number of engineering subdisciplines from manufacturing to bioengineering. Now he supports the national characterisation research community in Melbourne, Australia using OpenStack to create HPC systems solving problems too large for your laptop.
With the HPC Cloud facility, SURFsara offers self-service, dynamically scalable and fully configurable HPC systems to the Dutch academic community. Users have, for example, a free choice of operating system and software.
The HPC Cloud offers full control over an HPC cluster with fast CPUs and high-memory nodes, and it is possible to attach terabytes of local storage to a compute node. Because of this flexibility, users can fully tailor the system for a particular application. Long-running and small compute jobs are equally welcome. Additionally, the system facilitates collaboration: users can share control over their virtual private HPC cluster with other users and share processing time, data and results. A portal with wiki, fora, repositories, issue system, etc. is offered for collaboration projects as well.
This is a RECAP project overview slide deck prepared by Thang Le Duc (UMU), P-O Östberg (UMU) and Tomas Brännström (Tieto). It starts with an introduction and continues with a section on challenges for a self-orchestrated, self-remediated cloud system. It then presents the RECAP vision and use cases and finishes with a conclusion.
Grid Computing - Collection of computer resources from multiple locations
Dibyadip Das
Grid computing is the collection of computer resources from multiple locations to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files.
This is a presentation by Prof. Anne Elster at the International Workshop on Open Source Supercomputing held in conjunction with the 2017 ISC High Performance Computing Conference.
How HPC and large-scale data analytics are transforming experimental science
inside-BigData.com
In this deck from DataTech19, Debbie Bard from NERSC presents: Supercomputing and the scientist: How HPC and large-scale data analytics are transforming experimental science.
"Debbie Bard leads the Data Science Engagement Group at NERSC. NERSC is the mission supercomputing center for the USA Department of Energy, and supports over 7000 scientists and 700 projects with supercomputing needs. A native of the UK, her career spans research in particle physics, cosmology and computing on both sides of the Atlantic. She obtained her PhD at Edinburgh University, and has worked at Imperial College London as well as the Stanford Linear Accelerator Center (SLAC) in the USA, before joining the Data Department at NERSC, where she focuses on data-intensive computing and research, including supercomputing for experimental science and machine learning at scale."
Watch the video: https://wp.me/p3RLHQ-kLV
Grid optical network service architecture for data intensive applications
Tal Lavian Ph.D.
Integrated SW System Provide the “Glue”
Dynamic optical network as a fundamental Grid service in data-intensive Grid application, to be scheduled, to be managed and coordinated to support collaborative operations
From Super-computer to Super-network
In the past, computer processors were the fastest part
peripheral bottlenecks
In the future optical networks will be the fastest part
Computer, processor, storage, visualization, and instrumentation - slower “peripherals”
eScience Cyber-infrastructure focuses on computation, storage, data, analysis, Work Flow.
The network is vital for better eScience
Accelerating TensorFlow with RDMA for high-performance deep learning
DataWorks Summit
Google’s TensorFlow is one of the most popular deep learning (DL) frameworks. In distributed TensorFlow, gradient updates are a critical step governing the total model training time. These updates incur a massive volume of data transfer over the network.
In this talk, we first present a thorough analysis of the communication patterns in distributed TensorFlow. Then we propose a unified way of achieving high performance through enhancing the gRPC runtime with Remote Direct Memory Access (RDMA) technology on InfiniBand and RoCE. Through our proposed RDMA-gRPC design, TensorFlow only needs to run over the gRPC channel and gets the optimal performance. Our design includes advanced features such as message pipelining, message coalescing, zero-copy transmission, etc. The performance evaluations show that our proposed design can significantly speed up gRPC throughput by up to 1.5x compared to the default gRPC design. By integrating our RDMA-gRPC with TensorFlow, we are able to achieve up to 35% performance improvement for TensorFlow training with CNN models.
Speakers
Dhabaleswar K (DK) Panda, Professor and University Distinguished Scholar, The Ohio State University
Xiaoyi Lu, Research Scientist, The Ohio State University
Palladio Optimization Suite: QoS optimization for component-based Cloud appli...
Michele Ciavotta, PH. D.
Presentation slides for the 9th EAI International Conference on Performance Evaluation Methodologies and Tools (VALUETOOLS 2015), December 14–16, 2015, Berlin, Germany.
The crux of the talk is the presentation of Palladio Optimization Suite.
Review of CERN's objectives and how the computing infrastructure is evolving to address the challenges at scale using community supported software such as Puppet and OpenStack.
CERN, the European Organization for Nuclear Research, is one of the world’s largest centres for scientific research. Its business is fundamental physics, finding out what the universe is made of and how it works. At CERN, accelerators such as the 27km Large Hadron Collider are used to study the basic constituents of matter. This talk reviews the challenges to record and analyse the 25 Petabytes/year produced by the experiments and the investigations into how OpenStack could help to deliver a more agile computing infrastructure.
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
20181219 ucc open stack 5 years v3
2. Clouds at CERN : A 5 year perspective
Utility and Cloud Computing Conference, December 19, 2018
Tim Bell
@noggin143
UCC 2018
3. About Tim
• Responsible for Compute and Monitoring in CERN IT department
• Elected member of the OpenStack Foundation management board
• Member of the OpenStack user committee from 2013-2015
9. ATLAS, CMS, ALICE and LHCb
HEAVIER than the EIFFEL TOWER
Image credit: CERN
10. 40 million pictures per second
1PB/s
Image credit: CERN
11. About the CERN IT Department
Enable the laboratory to fulfill its mission
- Main data centre on Meyrin site
- Wigner data centre in Budapest (since 2013)
- Connected via three dedicated 100 Gb/s links
- Where possible, resources at both sites (plus disaster recovery)
Drone footage of the CERN CC
13. Outline
• Fabric Management before 2012
• The AI Project
• The three AI areas
- Configuration Management
- Monitoring
- Resource provisioning
• Review
14. CERN IT Tools up to 2011 (1)
• Developed in series of EU funded projects
- 2001-2004: European DataGrid
- 2004-2010: EGEE
• Work package 4 – Fabric management:
“Deliver a computing fabric comprised of all the necessary tools to
manage a centre providing grid services on clusters of thousands of
nodes.”
15. CERN IT Tools up to 2011 (2)
• The WP4 software was developed from scratch
- Scale and experience needed for LHC Computing was special
- Config’ mgmt, monitoring, secret store, service status, state mgmt, service databases, …
LEMON – LHC Era Monitoring
- client/server based monitoring
- local agent with sensors
- samples stored in a cache & sent to server
- UDP or TCP, w/ or w/o encryption
- support for remote entities
- system administration toolkit
- automated installation, configuration & management of clusters
- clients interact with a configuration database (CMDB) and an installation infrastructure (AII)
Around 8’000 servers managed!
16. 2012: A Turning Point for CERN IT
• EU projects finished in 2010: decreasing development and support
• LHC compute and data requirements increasing
- Moore’s law would help, but not enough
• Staff would not grow with managed resources
- Standardization & automation, current tools not apt
• Other deployments have surpassed the CERN one
- Mostly commercial companies like Google, Facebook, Rackspace, Amazon, Yahoo!, …
- We were no longer special! Can we profit?
[Chart: Run 1 to Run 4, series GRID, ATLAS, CMS, LHCb and ALICE, y-axis 0 to 160; annotations “we are here”, “what we can afford” and “2012”]
LS1 (2013) ahead, next window for change would only open in 2019 …
17. How we began …
• Formed a small team of service managers from …
- Large services (e.g. batch, plus)
- Existing fabric services (e.g. monitoring)
- Existing virtualization service
• ... to define project goals
- What issues do we need to address?
- What forward looking features do we need?
http://iopscience.iop.org/article/10.1088/1742-6596/396/4/042002/pdf
18. Agile Infrastructure Project Goals (1)
New data centre support
- Overcome limits of CC in Meyrin
- Disaster recovery and business continuity
- ‘Smart hands’ approach
19. Agile Infrastructure Project Goals (2)
Sustainable tool support
- Tools to be used at our scale need maintenance
- Tools with a limited community require more time for newcomers to become productive and are less valuable for the time after (transferable skills)
20. Agile Infrastructure Project Goals (3)
Improve user response time
- Reduce the resource provisioning time span (current virtualization service reached scaling limits)
- Self-service kiosk
21. Agile Infrastructure Project Goals (4)
Enable cloud interfaces
- Experiments already started to use EC2
- Enable libraries such as Apache’s libcloud
22. Agile Infrastructure Project Goals (5)
Precise monitoring and accounting
- Enable timely monitoring for debugging
- Showback usage to the cloud users
- Consolidate accounting data for usage of CPU, network, storage … across batch, physical nodes and grid resources
23. Agile Infrastructure Project Goals (6)
Improve resource efficiency
- Adapt provisioned resources to services’ needs
- Streamline the provisioning workflows (e.g. burn-in, repair or retirement)
24. Our Approach: Tool Chain and DevOps
• CERN’s requirements are no longer special!
• A set of tools emerged when looking at other places
• Small dedicated tools allowed for rapid validation & prototyping
• Adapted our processes, policies and work flows to the tools!
• Join (and contribute to) existing communities!
25. IT Policy Changes for Services
• Services shall be virtual …
- Within reason
- Exceptions are costly!
• Puppet managed, and …
• … monitored!
- (Semi-)automatic with Puppet
Decrease provisioning time
Increase resource efficiency
Simplify infrastructure mgmt
Profit from others’ work
Speed up deployment
‘Automatic’ documentation
Centralized monitoring
Integrated alarm handling
26. Tools + Policies: Sounds simple!
From tools to services is complex!
- Integration w/ sec services?
- Incident handling?
- Request work flows?
- Change management?
- Accounting and charging?
- Life cycle management?
- … Image: Subbu Allamaraju
28. Resource Provisioning: IaaS
• Based on OpenStack
- Collection of open source projects for cloud orchestration
- Started by NASA and Rackspace in 2010
- Grown into a global software community
30. The CERN Cloud Service
• Production since July 2013
- Several rolling upgrades since, now on Rocky
- Many sub services deployed
• Spans two data centers
- One region, one API entry point
• Deployed using RDO + Puppet
- Mostly upstream, patched where needed
• Many sub services run on VMs!
- Boot strapping
32. Agility in the Cloud
• Use case spectrum
- Batch service (physics analysis)
- IT services (built on each other)
- Experiment services (build)
- Engineering (chip design)
- Infrastructure (hotel, bikes)
- Personal (development)
• Hardware spectrum
- Processor archs (features, NUMA, …)
- Core-to-RAM ratio (1:2, 1:3, 1:5, …)
- Core-to-disk ratio (2x or 4x SSDs)
- Disk layout (2, 3, 4, mixed)
- Network (1/10GbE, FC, domain)
- Location (DC, power)
- SLC6, CC7, RHEL, Windows
- …
33. What about our initial goals?
• The remote DC is seamlessly integrated
- No difference from provisioning PoV
- Easily accessible by users
- Local DC limits overcome (business continuity?)
• Sustainable tools
- Number of managed machines has multiplied
- Good collaboration with upstream communities
- Newcomers know tools, can use knowledge afterwards
• Provisioning time span is ~minutes
- Was several months before
- Self-service kiosk with automated workflows
• Cloud interfaces
- Good OpenStack adoption, EC2 support
• Flexible monitoring infra
- Automatic for simple cases
- Powerful tool set for more complex ones
- Accounting for local and grid resources
• Increased resource efficiency
- ‘Packing’ of services
- Overcommit
- Adapted to services’ needs
- Quick draining & back filling
So … 100% success?
34. Cloud Architecture Overview
• Top and child cells for scaling
- API, DB, MQ, Compute nodes
- Remote DC is set of cells
• Nova HA only on top cell
- Simplicity vs impact
• Other projects global
- Load balanced controllers
- RabbitMQ clusters
• Three Ceph instances
- Volumes (Cinder), images (Glance), shares (Manila)
36. Tech. Challenge: Scaling
• OpenStack Cells provides composable units
• Cells V1 – Special custom developments
• Cells V2 – Now the standard deployment model
• Broadcast vs Targeted queries
• Handling down cells
• Quota
• Academic and scientific instances push the limits
• Now many enterprise clouds above 1000 hypervisors
• CERN running 73 Cells in production
https://www.openstack.org/analytics
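The difference between broadcast and targeted queries, and why down cells need explicit handling, can be illustrated with a toy model. This is a sketch only and not Nova's implementation: the `Cell` class, the `instance_mappings` dictionary and both lookup functions are hypothetical names invented for this illustration. Only the figure of 73 cells comes from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Toy stand-in for one cell with its own database (hypothetical)."""
    name: str
    instances: dict = field(default_factory=dict)  # instance id -> record
    up: bool = True

    def get(self, instance_id):
        if not self.up:
            raise ConnectionError(f"cell {self.name} is down")
        return self.instances.get(instance_id)


def broadcast_lookup(cells, instance_id):
    """Ask every cell; skip cells that are down and report them."""
    found, down = None, []
    for cell in cells:
        try:
            record = cell.get(instance_id)
        except ConnectionError:
            down.append(cell.name)
            continue
        if record is not None:
            found = record
    return found, down


def targeted_lookup(cells_by_name, instance_mappings, instance_id):
    """Use a global mapping to query only the cell that owns the instance."""
    cell = cells_by_name[instance_mappings[instance_id]]
    return cell.get(instance_id)


cells = [Cell(f"cell{i:02d}") for i in range(73)]  # 73 cells, as on the slide
cells[5].instances["vm-a"] = {"id": "vm-a", "state": "ACTIVE"}
cells[9].up = False  # simulate one unreachable cell
instance_mappings = {"vm-a": "cell05"}
cells_by_name = {c.name: c for c in cells}

record, down = broadcast_lookup(cells, "vm-a")  # touches all 73 cells
same = targeted_lookup(cells_by_name, instance_mappings, "vm-a")  # touches 1
```

In this model the broadcast call has to decide what to do about `cell09` being unreachable, while the targeted call never contacts it. With many cells that is the practical argument for keeping a global mapping from instance to cell.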
37. Tech. Challenge: CPU Performance
• Benchmark results on full-node VMs were about 20% lower than those of the underlying host
- Smaller VMs much better
• Investigated various tuning options
- KSM*, EPT**, PAE, Pinning, … +hardware type dependencies
- Discrepancy down to ~10% between virtual and physical
• Comparison with Hyper-V: no general issue
- Loss w/o tuning ~3% (full-node), <1% for small VMs
- … NUMA-awareness!
*KSM on/off: beware of memory reclaim! **EPT on/off: beware of expensive page table walks!
38. CPU Performance: NUMA
• NUMA-awareness identified as most efficient setting
• “EPT-off” side-effect
- Small number of hosts, but very visible there
• Use 2MB Huge Pages
- Keep the “EPT off” performance gain with “EPT on”
39. NUMA roll-out
• Rolled out on ~2’000 batch hypervisors (~6’000 VMs)
- Huge page (HP) allocation as boot parameter → reboot
- VM NUMA awareness as flavor metadata → delete/recreate
• Cell-by-cell (~200 hosts):
- Queue-reshuffle to minimize resource impact
- Draining & deletion of batch VMs
- Hypervisor reconfiguration (Puppet) & reboot
- Recreation of batch VMs
• Whole update took about 8 weeks
- Organized between batch and cloud teams
- No performance issue observed since
VM       Before   After
4x 8     8%
2x 16    16%
1x 24    20%      5%
1x 32    20%      3%
41. VM Expiry
• Each personal instance will have an expiration date
• Set shortly after creation and evaluated daily
• Configured to 180 days, renewable
• Reminder mails starting 30 days before expiration
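The expiry policy above can be sketched as a small daily check. The 180-day lifetime and the 30-day reminder window come from the slide. The function names, the returned action strings, and the assumption that the expiry date is counted from the creation or renewal date are illustrative only and do not describe CERN's actual workflow.

```python
from datetime import date, timedelta

LIFETIME = timedelta(days=180)        # from the slide: 180 days, renewable
REMINDER_WINDOW = timedelta(days=30)  # from the slide: reminders start 30 days before


def expiry_date(created_or_renewed: date) -> date:
    """Assumption: the clock starts at creation or at the last renewal."""
    return created_or_renewed + LIFETIME


def daily_action(today: date, expires: date) -> str:
    """Return what one daily evaluation would do for one instance."""
    if today >= expires:
        return "expire"   # hypothetical label; the slide does not say what happens
    if expires - today <= REMINDER_WINDOW:
        return "remind"
    return "none"


created = date(2018, 1, 1)
expires = expiry_date(created)  # 2018-06-30
```

With these values, `daily_action(date(2018, 5, 30), expires)` returns `"none"`, the first reminder falls on 31 May, and the instance is due on 30 June unless it is renewed.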
43. Tech. Challenge: Bare Metal
• VMs not suitable for all of our use cases
- Storage and database nodes, HPC clusters, boot strapping,
critical network equipment or specialised network setups,
precise/repeatable benchmarking for s/w frameworks, …
• Complete our service offerings
- Physical nodes (in addition to VMs and containers)
- OpenStack UI as the single pane of glass
• Simplify hardware provisioning workflows
- For users: openstack server create/delete
- For procurement & h/w provisioning team: initial on-boarding, server re-assignments
• Consolidate accounting & bookkeeping
- Resource accounting input will come from fewer sources
- Machine re-assignments will be easier to track
44. Adapt the Burn In process
• “Burn-in” before acceptance
- Compliance with technical spec (e.g. performance)
- Find failed components (e.g. broken RAM)
- Find systematic errors (e.g. bad firmware)
- Provoke early failing due to stress
- Tests include
- CPU: burnK7, burnP6, burnMMX (cooling)
- RAM: memtest, Disk: badblocks
- Network: iperf(3) between pairs of nodes
- automatic node pairing
- Benchmarking: HEPSpec06 (& fio)
- derivative of SPEC06
- we buy total compute capacity (not newest processors)
46. Tech. Challenge: Containers
An OpenStack API Service that allows creation of container clusters
● Use your OpenStack credentials, quota and roles
● You choose your cluster type
● Multi-Tenancy
● Quickly create new clusters with advanced features such as multi-master
● Integrated monitoring and CERN storage access
● Making it easy to do the right thing
47. Scale Testing using Rally
• An OpenStack benchmark test tool
• Easily extended by plugins
• Test results in HTML reports
• Used by many projects
• Context: set up environment
• Scenario: run benchmark
• Recommended for a production service to verify that the service behaves as expected at all times
[Diagram: Rally, Kubernetes cluster (pods, containers), report]
48. First Attempt – 1M requests/sec
• 200 Nodes
• Found multiple limits
• Heat Orchestration scaling
• Authentication caches
• Volume deletion
• Site services
50. Tech. Challenge: Meltdown
• In January 2018, a security vulnerability was disclosed → a new kernel everywhere
• Staged campaign
• 7 reboot days, 7 tidy up days
• By availability zone
• Benefits
• Automation now to reboot the cloud if needed - 33,000 VMs on 9,000 hypervisors
• Latest QEMU and RBD user code on all VMs
• Then L1TF came along
• And we had to do it all again......
51. [Timeline, 2009 to 2030?: first run, LS1, second run, LS2, third run, LS3, HL-LHC Run 4; marker at 2026]
Significant part of cost comes from global operations
Even with technology increase of ~15%/year, we still have a big gap if we keep trying to do things with our current compute models
Raw data volume increases significantly for High Luminosity LHC
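To get a feel for what a technology increase of ~15%/year means, the compounding can be worked out directly. This is back-of-the-envelope arithmetic under stated assumptions: the 15% figure is from the slide, while the choice of 2018 (the year of the talk) as the start and 2026 (a year that appears on the slide's timeline) as the end is an assumption made here for illustration.

```python
ANNUAL_GAIN = 0.15  # ~15%/year technology increase, from the slide


def capacity_factor(years: int, gain: float = ANNUAL_GAIN) -> float:
    """Relative capacity at constant cost after `years` of compounding."""
    return (1.0 + gain) ** years


# Assumed window: 2018 to 2026, i.e. 8 years of compounding.
factor_2026 = capacity_factor(2026 - 2018)  # roughly 3.06
```

Under these assumptions a flat budget buys about three times the capacity after eight years, so any requirement that grows faster than that over the same period leaves a gap unless the compute model itself changes.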
53. Non-Technical Challenges (1)
• Agile Infrastructure Paradigm Adoption
- ‘VMs are slower than physical machines.’
- ‘I need to keep control on the full stack.’
- ‘This would not have happened with physical machines.’
- ‘It’s the cloud, so it should be able to do X!’
- ‘Using a config’ management tool is too dangerous!’
- ‘They are my machines’
54. Non-Technical Challenges (2)
• Agility can bring great benefits …
• … but mind (adapted) Hooke’s Law!
- Avoid irreversible deformations
• Ensure the tail is moving as well as the head
- Application support
- Cultural changes
- Workflow adoption
- Open source community culture can help
55. Non-Technical Challenges (3)
• Contributor License Agreements
• Patches needed, but merges and reviews take time
• Regular staff changes limit Karma
• Need to be a polyglot
• Python, Ruby, Go, … and legacy Perl etc.
• Keep riding the release wave
• Avoid the end-of-life scenarios
56. Ongoing Work Areas
• Spot Market / Pre-emptible instances
• Software Defined Networking
• Regions
• GPUs
• Containers on Bare Metal
• …
57. Summary
Positive results after 5 years into the project!
- LHC needs met without additional staff
- Tools and workflows widely adopted and accepted
- Many technical challenges were mastered and returned upstream
- Integration with open source communities successful
- Use of common tools made CERN more attractive to talent
Further enhancements in function & scale needed for HL-LHC
58. Further Information
• CERN information outside the auditorium
• Jobs at CERN – wide range of options
• http://jobs.cern
• CERN blogs
• http://openstack-in-production.blogspot.ch
• https://techblog.web.cern.ch/techblog/
• Recent Talks at OpenStack summits
• https://www.openstack.org/videos/search?search=cern
• Source code
• https://github.com/cernops and https://github.com/openstack
61. Agile Infrastructure Core Areas
• Resource provisioning (IaaS)
- Based on OpenStack
• Centralized Monitoring
- Based on Collectd (sensor) + ‘ELK’ stack
• Configuration Management
- Based on Puppet
62. Configuration Management
• Client/server architecture
- ‘agents’ running on hosts plus horizontally scalable ‘masters’
• Desired state of hosts described in ‘manifests’
- Simple, declarative language
- ‘resource’ basic unit for system modeling, e.g. package or service
• ‘agent’ discovers system state using ‘facter’
- Sends current system state to masters
• Master compiles data and manifests into ‘catalog’
- Agent applies catalog on the host
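The facts, manifest, catalog and apply cycle described above can be mimicked in a few lines. This is a toy model, not Puppet's API or language: the function names, the dictionary layout of a resource, and the chrony/ntp example are all invented for illustration.

```python
def compile_catalog(facts: dict, manifest) -> list:
    """'Master' side: turn facts plus a manifest into a list of resources."""
    return manifest(facts)


def apply_catalog(catalog: list, state: dict) -> list:
    """'Agent' side: change only what differs from the desired state."""
    changes = []
    for resource in catalog:
        key = (resource["type"], resource["title"])
        if state.get(key) != resource["ensure"]:
            state[key] = resource["ensure"]
            changes.append(key)
    return changes


def base_manifest(facts: dict) -> list:
    """Hypothetical manifest: the desired state depends on a reported fact."""
    ntp_pkg = "chrony" if facts["os_major"] >= 7 else "ntp"
    return [
        {"type": "package", "title": ntp_pkg, "ensure": "installed"},
        {"type": "service", "title": ntp_pkg + "d", "ensure": "running"},
    ]


facts = {"hostname": "node001", "os_major": 7}     # what the agent would report
host_state = {("package", "chrony"): "installed"}  # current state of the host

catalog = compile_catalog(facts, base_manifest)
first_run = apply_catalog(catalog, host_state)   # [("service", "chronyd")]
second_run = apply_catalog(catalog, host_state)  # []
```

The second run changes nothing because the host already matches the catalog. That idempotence is what makes a declarative description of the desired state safe to apply repeatedly across thousands of hosts.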
63. Status: Config’ Management (1)
(virtual and physical, private and public cloud)
(‘base’ is what every Puppet node gets)
(compilations are spread out)
(this number includes dev changes)
(number Puppet code committers)
65. Status: Config’ Management (3)
• Changes to QA are announced publicly
• QA duration: 1 week
• All Service Managers can stop a change!
66. Monitoring: Scope
Data Centre Monitoring
• Two DCs at CERN and Wigner
• Hardware, O/S, and services
• PDUs, temp sensors, …
• Metrics and logs
Experiment Dashboards
- WLCG Monitoring
- Sites availability, data transfers, job information, reports
- Used by WLCG, experiments, sites and users
67. Status: (Unified) Monitoring (1)
• Offering: monitor, collect, aggregate, process, visualize, alarm … for metrics and logs!
• ~400 (virtual) servers, 500GB/day, 1B docs/day
- Mon data management from CERN IT and WLCG
- Infrastructure and tools for CERN IT and WLCG
• Migrations ongoing (double maintenance)
- CERN IT: From Lemon sensor to collectd
- WLCG: From former infra, tools, and dashboards
68. Status: (Unified) Monitoring (2)
[Architecture diagram]
Data sources: FTS, Rucio, XRootD, Jobs, …, Lemon, syslog, app log
Transport: Flume agents for AMQ, DB, HTTP feed, Log GW (logs) and Metric GW (Lemon metrics), with a Flume Kafka sink
Kafka cluster (buffering) *
Processing: data enrichment, data aggregation, batch processing
Flume sinks
Storage & Search: HDFS, Elasticsearch, …, others (influxdb)
Data access: CLI, API
Users: User Views, User Jobs, User Data
Today: > 500 GB/day, 72h buffering
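The headline figures translate into some simple sizing numbers. The arithmetic below uses only values quoted on the monitoring slides (500 GB/day, 1B docs/day, 72h buffering). Treating 1 GB as 10^9 bytes, and ignoring replication and compression in the buffer, are assumptions made for this sketch.

```python
GB = 10**9  # assumption: decimal gigabytes

daily_bytes = 500 * GB  # from the slides: 500 GB/day
daily_docs = 10**9      # from the slides: 1B docs/day
buffer_hours = 72       # from the slides: 72h buffering

avg_doc_bytes = daily_bytes / daily_docs          # 500 bytes per document
avg_rate_mb_s = daily_bytes / 86_400 / 10**6      # about 5.8 MB/s sustained
buffer_bytes = daily_bytes * buffer_hours / 24    # 1.5e12 bytes, i.e. 1.5 TB
```

So a 72-hour buffer has to hold on the order of 1.5 TB of raw data before any replication, and the average ingest rate is modest even though the document count is large.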
Editor's Notes
Reference: Fabiola’s talk @ Univ of Geneva
https://www.unige.ch/public/actualites/2017/le-boson-de-higgs-et-notre-vie/
European Centre for Nuclear Research
Founded in 1954, today 22 member states
World's largest particle physics laboratory
~2.300 staff, 13k users on site
Budget 1k MCHF
Mission
Answer fundamental questions on the universe
Advance the technology frontiers
Train the scientists of tomorrow
Bring nations together
https://communications.web.cern.ch/fr/node/84
For all this fundamental research, CERN provides different facilities to scientists, for example the LHC
It’s a ring 27 km in circumference, crossing 2 countries, 100 m underground. It accelerates 2 particle beams to near the speed of light and makes them collide at 4 different points, where detectors observe the fireworks.
2.500 people employed by CERN, > 10k users on the site
Talk about LHC here, describe experiment, Lake Geneva, Mont Blanc, and then jump in
Big ring is the LHC, the small one is the SPS, computer centre is not so far.
Pushing the boundary of technology,
It facilitates research; we just run the accelerators. Experiments are done by institutes, member states, universities
Franco-Swiss border, very close to Geneva
Our flagship program is the LHC
Trillions of protons race around the 27km ring in opposite directions over 11,000 times a second, travelling at 99.9999991 per cent the speed of light.
Largest machine on earth
With an operating temperature of about -271 degrees Celsius, just 1.9 degrees above absolute zero, the LHC is one of the coldest places in the universe
120 t of helium; only at that temperature is there no resistance
https://home.cern/about/engineering/vacuum-empty-interstellar-space
Inside, the beams operate in a very high vacuum, comparable to the vacuum on the Moon. There are actually 2 beams, proton beams going in 2 directions; the vacuum keeps protons from interacting with other particles
Technologically very advanced beasts, 4 of them. ATLAS and CMS are the best known ones: general purpose, testing Standard Model properties. In those detectors the Higgs particle was discovered in 2012
In the picture you can see physicists. ALICE and LHCb
To sample and record the debris from up to 600 million proton collisions per second, scientists are building gargantuan devices that measure particles with micron precision.
100 Mpixel camera, 40 million pictures per second
https://www.ethz.ch/en/news-and-events/eth-news/news/2017/03/new-heart-for-cerns-cms.html