Prometheus has become the de facto monitoring system for cloud-native applications, with systems like Kubernetes and etcd natively exposing Prometheus metrics. In this talk Tom will explore all the moving parts of a working Prometheus-on-Kubernetes monitoring system, including kube-state-metrics, node-exporter, cAdvisor and Grafana. You will learn about the various methods for getting to a working setup: the manual approach, using CoreOS's Prometheus Operator, or using the Prometheus Ksonnet Mixin. Tom will also share tips and tricks for getting the most out of your Prometheus monitoring, including common pitfalls and what you should be alerting on.
Prometheus was recently accepted into the Cloud Native Computing Foundation, making it the second project after Kubernetes to be given the foundation's blessing and acknowledging that Prometheus and Kubernetes make an awesome combination. In this talk we'll cover common patterns for running Prometheus on Kubernetes, how to monitor services on Kubernetes, and some tips and hacks to ensure you get the most out of your Prometheus + Kubernetes deployment.
Monitoring Uptime on the NeCTAR Research Cloud - Andy Botting, University of Melbourne (OpenStack)
Audience Level: Intermediate
Synopsis:
We will discuss how we monitor the Nectar Research Cloud, using tools like OpenStack Tempest and Nagios, and how we translate the results into a user-facing dashboard.
Speaker Bio:
Andy is a DevOps engineer working at the University of Melbourne in the Core Services team for the Nectar Research Cloud.
Flink Forward Berlin 2017: Aljoscha Krettek - Talk Python to me: Stream Proce... (Flink Forward)
Flink is a great stream processor, Python is a great programming language, Apache Beam is a great programming model and portability layer. Using all three together is a great idea! We will demo and discuss writing Beam Python pipelines and running them on Flink. We will cover Beam's portability vision that led here, what you need to know about how Beam Python pipelines are executed on Flink, and where Beam's portability framework is headed next (hint: Python pipelines reading from non-Python connectors)
Here is the slide deck from our recent workshop. You can also watch it on our YouTube channel: https://www.youtube.com/channel/UCeLma6SpNYH7jjYKSBNSexw
OSMC 2019 | Monitoring Cockpit for Kubernetes Clusters by Ulrike Klusik (NETWAYS)
Monitoring Kubernetes clusters with Prometheus is state of the art; the difficulty is finding the significant metrics among the vast number available. This talk presents a monitoring cockpit designed to give a quick overview of cluster health and usage. It uses the standard metrics available for Kubernetes/OpenShift clusters and their standard services. The monitoring solution is based on Prometheus, using InfluxDB for central long-term storage and Grafana for visualization.
Securing & Enforcing Network Policy and Encryption with Weave Net (Luke Marsden)
This talk starts with a primer on container networking, then covers two distinct areas of container network security: encryption, enabled by IPsec in Weave Net, and container firewalls, enabled by Kubernetes Network Policy and enforced by the Weave Net network policy controller. A discussion of threat models is included.
Prometheus - Intro, CNCF, TSDB, PromQL, Grafana (Sridhar Kumar N)
https://www.youtube.com/playlist?list=PLAiEy9H6ItrKC5PbH7KiELiSEIKv3tuov
- What is Prometheus?
- Differences between Nagios and Prometheus
- Architecture
- Alertmanager
- Time-series DB
- PromQL (Prometheus Query Language)
- Live demo
- Grafana
OpenNebulaconf2017US: Rapid scaling of research computing to over 70,000 cores (OpenNebula Project)
Since 2008, Harvard Research Computing has undertaken a significant scaling challenge, increasing its available HPC and storage from 200 cores and 20TB to over 70,000 cores and 35PB of storage. James will discuss the journey and the highlights of extending the computing to support world-class research and education. During the evolution of the computing platforms at Harvard, they also helped to support and build the Massachusetts Green High Performance Computing Center (MGHPCC), a dedicated high-performance research computing facility in Holyoke, MA. This facility continues to support large-scale research computing with sustainable energy and advanced networking. Recently the NESE project (New England Storage Exchange) was funded by the National Science Foundation; this is a multi-petabyte object store supported by the existing MGHPCC facility serving the region. The Data Science Initiative at Harvard has also been announced recently and will require even more advanced computation to support its research faculty. Now, as the world gets to grips with "cloud", and more importantly with remotely provisioned infrastructure, hybrid models for compute and storage are required, along with the flexibility to further accelerate science. James will discuss their strategy moving forward and the current infrastructure in place to allow seamless provisioning of research computing. Justin Riley, Team Lead at Harvard, will follow this talk with a deep technical discussion of the specific implementation of the systems that Harvard is designing in concert with the development teams and leadership at OpenNebula, to support research computing and make their platforms more resilient and able to continue to scale.
Kubermatic: How to Migrate 100 Clusters from On-Prem to Google Cloud Without D... (Tobias Schneck)
Have you ever thought about migrating your Kubernetes clusters to Google Cloud to get your services closer to your customers? Yes? We too! Join us on an interactive journey through the main challenges of live-migrating etcds, traffic routing and application workloads at scale from an on-premise platform to GCP. The talk will discuss the current state of the technical concept, known problems, and insights from the already proven migration steps for stateless workloads.
As part of the journey, we'll see the differences between migrating one or one hundred clusters with production workloads: What parts can be automated? What steps may need to be manual? Let's see what an automated solution could look like in the future and what steps are missing.
Outdated training deck for the Prometheus monitoring tool, shared as a basis for newer content for potential meetup and conference talks. I'm sharing it since some intrinsic value remains.
OpenStack Collaboration made in heaven with Heat, Mistral, Neutron and more.. (Trinath Somanchi)
Cross-project collaboration is something the OpenStack community has embraced for a long time. Common libraries like Oslo reduce the time and effort needed to build a new service. Another way this manifests is in new OpenStack services being built on existing services to solve a higher-level use case.
In this talk we present how the band of projects comprising Mistral, Tacker, Neutron, Heat, TOSCA-Parser and Barbican came together to build an industry-leading ETSI NFV orchestrator that leveraged the best of these projects. Each of these projects brought critical functionality to the final product. You will learn how, when strung together, this solution follows the classic microservices design pattern that the industry is rapidly adopting.
OpenNebulaconf2017US: Paying down technical debt with "one" dollar bills by ... (OpenNebula Project)
In addition to providing bare-metal access to large amounts of compute, FAS Research Computing (FASRC) at Harvard also builds and fully maintains custom virtual machines tailored to faculty and researcher needs, including lab websites, portals, databases, project development environments and more, both locally and on public clouds. Recently FASRC converted its internal VM infrastructure from a completely home-made KVM cluster to a more robust and reliable system powered by OpenNebula and Ceph, configured with public cloud integration. Over the years, as the number of VMs grew, our home-made solution started to show signs of wear and tear with respect to scheduling, provisioning, management, inventory and performance. Our new deployment improves on all of these areas and provides APIs and features that both help us serve clients more efficiently and improve our internal processes for testing new system configurations and dynamically spinning up resources for continuous integration and deployment. Our new VM infrastructure deployment is fully automated via Puppet and has been used to provision a multi-datacenter, fault-tolerant VM infrastructure with a multi-tiered backup system and robust VM and virtual disk monitoring. We will describe our internal system architecture and deployment, the challenges we faced, and the innovations we made along the way while deploying OpenNebula and Ceph. We will also discuss a new client-facing OpenNebula cloud deployment we're currently beta testing with select users, in which users have full control over the creation and configuration of their VMs on FASRC compute resources via the OpenNebula dashboard and APIs.
OpenStack at NTT Resonant: Lessons Learned in Web Infrastructure (Tomoya Hashimoto)
This deck was presented at the OpenStack Summit Tokyo.
NTT Resonant Inc., an NTT Group company, operates the "goo" Japanese web portal and is a leading provider of Internet services. NTT Resonant has deployed and operated OpenStack as its production service infrastructure since October 2014. The infrastructure started with 400 hypervisors and now accommodates more than 80 services and over 1,700 virtual servers, serving roughly 170 million unique users per month and 1 billion page views per month.
We will share knowledge based on our experience across several specific areas; the recording is available here:
https://www.openstack.org/summit/tokyo-2015/videos/presentation/openstack-at-ntt-resonant-lessons-learned-in-web-infrastructure
The advantages of Arista/OVH configurations, and the technologies behind building... (OVHcloud)
Arista will put an emphasis on the technologies behind building and operating datacentres, and the reasons they deliver the expected results (managing varied traffic spikes, increasing bandwidth, endpoints and security), including in very large-scale production environments.
Monitoring in Big Data Platform - Albert Lewandowski (GetInData)
Did you like it? Check out our blog to stay up to date: https://getindata.com/blog
The webinar was organized by GetInData in 2020. During the webinar we explained the concepts of monitoring and observability, with a focus on data analytics platforms.
Watch more here: https://www.youtube.com/watch?v=qSOlEN5XBQc
Whitepaper - Monitoring and Observability for Data Platform: https://getindata.com/blog/white-paper-big-data-monitoring-observability-data-platform/
Speaker: Albert Lewandowski
Linkedin: https://www.linkedin.com/in/albert-lewandowski/
___
GetInData is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together some of the best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience implementing Big Data projects for Polish and foreign companies, including Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
An overview of OpenStack's evolution from nova-network towards Neutron; an architecture overview of the OVS plugin, ML2 and the MidoNet overlay product; and an overview and example of Heat templates, along with automation of physical switches using Cumulus.
Presentation written by Sean Cohen and Steve Gordon at Red Hat covering the highlights of the OpenStack Liberty release.
Presented to the Atlanta OpenStack Meetup on October 15th.
Addressing data plane performance measurement on OpenStack clouds using VMTP
The OpenStack Summit in Vancouver is fast approaching, and I for one can't wait to see how much OpenStack has progressed from Juno to Kilo. I expect to see even more momentum, with new companies joining the OpenStack community and continuing to drive industry acceptance of OpenStack. While a lot of good work is happening on functionality (features) and new projects, being able to systematically measure performance across the major components still remains largely a work in progress. While Nova (Compute) and Swift (Object Storage) continue to mature rapidly, with good work being done by many in the OpenStack community around performance measurement, Neutron (Networking) continues to lag.
Prior to the Juno summit in Paris last November, there was intent in the OpenStack community to move away from Nova networking onto Neutron, where there is significantly more functionality and scale to be had. But there are many challenges preventing this migration from happening. Primary among them is the inability to easily measure performance on Neutron, to ensure that there isn't significant performance degradation as a result of the move from Nova networking to Neutron. Cisco is proactively addressing this performance-measurement gap by releasing on Stackforge an open-source data plane performance measurement tool called VMTP (VM Throughput Performance). It is heartening to see that others in the community are also providing tools for data plane measurement, such as the recently released "shaker".
VMTP addresses the need for a quick, simple and automatable way to get VM-level or host-level single-flow throughput and latency numbers from any OpenStack cloud, while also taking various Neutron topologies into account. VMTP can also easily check whether certain OpenStack configuration options or Neutron plug-ins perform to expectation, or whether there is any data-path impact from upgrading to a different OpenStack release.
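To make the measurement concrete: VMTP itself drives established tools (nuttcp/iperf) between the test endpoints, but the principle behind a single-flow throughput number is simple enough to sketch standalone. The following minimal Python example (the port, chunk size and duration in it are arbitrary choices for illustration, not anything VMTP uses) times a bulk TCP transfer between two hosts:

    # Minimal sketch of a single-flow TCP throughput probe. This is NOT how
    # VMTP is implemented (VMTP drives nuttcp/iperf); it only illustrates the
    # principle of timing a bulk transfer. Port/chunk/duration are arbitrary.
    # Run "python probe.py server" on one host, then
    # "python probe.py client <server-ip>" on the other.
    import socket
    import sys
    import time

    PORT = 5001            # arbitrary test port
    CHUNK = 64 * 1024      # 64 KiB send/receive buffer
    DURATION = 10          # seconds the client transmits

    def server():
        with socket.socket() as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            s.bind(("", PORT))
            s.listen(1)
            conn, addr = s.accept()
            with conn:
                total, start = 0, time.time()
                while True:
                    data = conn.recv(CHUNK)
                    if not data:          # client closed: flow is over
                        break
                    total += len(data)
                elapsed = time.time() - start
                print(f"{addr[0]}: {total * 8 / elapsed / 1e6:.1f} Mbit/s")

    def client(host):
        payload = b"\0" * CHUNK
        with socket.create_connection((host, PORT)) as s:
            deadline = time.time() + DURATION
            while time.time() < deadline:
                s.sendall(payload)

    if __name__ == "__main__":
        server() if sys.argv[1] == "server" else client(sys.argv[2])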
VMTP is a small Python application that automatically performs ping connectivity checks, round-trip time (latency) measurement and TCP/UDP throughput measurement for the following East/West flows on any OpenStack deployment:
- VM to VM, same network (private fixed IP; flow #1)
- VM to VM, different network, using fixed IP (same as intra-tenant L3 fixed IP; flow #2)
- VM to VM, different network, using floating IP and NAT (same as inter-tenant L3 floating IP; flow #3)
In addition, VMTP can also test the following traffic scenarios (a rough sketch of the SSH-driven approach follows the list):
- When an external Linux host is available for testing North/South flows: external host/VM download and upload throughput/latency (L3/floating IP; flows #4 and #5)
- Optionally, when SSH login to any Linux host (native or virtual) is available: host-to-host process-level throughput/latency (intra-node and inter-node)
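As a rough illustration of that SSH-driven mode (this is not VMTP's actual code; the host names, user and command below are placeholders, and key-based authentication is assumed), a Python sketch using paramiko could look like this:

    # Rough illustration (not VMTP's code) of driving a perf tool on a remote
    # Linux host over SSH with paramiko. Hosts, user and command are placeholders.
    import paramiko

    def run_remote(host, user, command):
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())  # demo only
        client.connect(host, username=user)   # assumes key-based auth
        try:
            _, stdout, stderr = client.exec_command(command)
            out = stdout.read().decode()
            if stdout.channel.recv_exit_status() != 0:
                raise RuntimeError(f"{command!r} failed: {stderr.read().decode()}")
            return out
        finally:
            client.close()

    # e.g. with "iperf3 -s" already running on host-a, drive a client from host-b
    # (-c: client mode, -J: JSON output; both are standard iperf3 flags)
    print(run_remote("host-b.example.com", "ubuntu", "iperf3 -c host-a.example.com -J"))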
VMTP can also automatically extract CPU usage from all native hosts in the cloud during the throughput tests, provided the Ganglia monitoring daemon (gmond) is installed and enabled on those hosts.
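The Ganglia hook works because gmond publishes its full metric state as XML to any client that connects to its TCP port (8649 by default). Independent of VMTP, reading CPU metrics that way can be sketched as follows; the host and port here are assumptions for the example:

    # Minimal sketch: pull gmond's XML metric dump (served to any client that
    # connects to its TCP port, 8649 by default) and print per-host CPU metrics.
    import socket
    import xml.etree.ElementTree as ET

    def gmond_cpu_metrics(host="127.0.0.1", port=8649):
        with socket.create_connection((host, port)) as s:
            chunks = []
            while True:
                data = s.recv(4096)
                if not data:
                    break
                chunks.append(data)
        root = ET.fromstring(b"".join(chunks))
        for h in root.iter("HOST"):
            for m in h.iter("METRIC"):
                if m.get("NAME", "").startswith("cpu_"):   # cpu_user, cpu_idle, ...
                    print(h.get("NAME"), m.get("NAME"), m.get("VAL"))

    gmond_cpu_metrics()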
For VM-related flows, VMTP automatically creates the necessary OpenStack resources (router, networks, subnets, key pairs, security groups, test VMs) using the public OpenStack API, installs the test tools, orchestrates them to gather the throughput measurements, and then cleans up all related resources before exiting. VMTP has been architected to run independently of Heat, as we've seen that most deployments in the field don't have Heat installed. VMTP measures true North/South traffic via a client app that runs outside the cloud.
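In outline, that create/measure/cleanup lifecycle looks like the sketch below. Note that this is expressed with the modern openstacksdk for brevity, not VMTP's actual client code, and the cloud name, network names, image and flavor are all placeholders:

    # Outline of the create -> measure -> clean-up lifecycle described above,
    # sketched with openstacksdk rather than VMTP's actual client code.
    import openstack

    conn = openstack.connect(cloud="mycloud")   # credentials from clouds.yaml

    net = subnet = server = None
    try:
        net = conn.network.create_network(name="vmtp-demo-net")
        subnet = conn.network.create_subnet(
            network_id=net.id, ip_version=4, cidr="10.99.0.0/24",
            name="vmtp-demo-subnet")
        server = conn.compute.create_server(
            name="vmtp-demo-vm",
            image_id=conn.compute.find_image("cirros").id,
            flavor_id=conn.compute.find_flavor("m1.tiny").id,
            networks=[{"uuid": net.id}])
        server = conn.compute.wait_for_server(server)
        # ... install and orchestrate the perf tools against this VM here ...
    finally:
        # mirror VMTP's behaviour: always clean up what was created
        if server:
            conn.compute.delete_server(server, ignore_missing=True)
            conn.compute.wait_for_delete(server)
        if subnet:
            conn.network.delete_subnet(subnet, ignore_missing=True)
        if net:
            conn.network.delete_network(net, ignore_missing=True)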
Additional benefits of using VMTP are:
- A hook to Ganglia monitoring to capture host-level system metrics during tests
- Flow-measurement chaining, which results in faster runs (e.g. VMTP chains all client VM positions for different flows with the same server VM, which you can't do with Heat; with Heat you need to tear down and rebuild the setup/server for each flow)
- Results stored directly in MongoDB (a small illustration follows the list)
- Automatic extraction of OpenStack versions, and of encapsulation and L2 agent types and versions
- Selection of special network interfaces (useful for VMs attached to multiple networks)
- SR-IOV and IPv6 support
- Detailed online documentation at http://vmtp.readthedocs.org/en/latest/
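On the MongoDB point above, storing a run's JSON output amounts to something like the following sketch; the database and collection names are made up for the example, and VMTP's own MongoDB integration may use different defaults:

    # Illustrative only: pushing one VMTP JSON result into MongoDB via pymongo.
    # The database/collection names here are invented for the example.
    import json
    from pymongo import MongoClient

    with open("run1.json") as f:
        result = json.load(f)

    client = MongoClient("mongodb://localhost:27017")
    res = client["perf"]["vmtp_results"].insert_one(result)
    print("stored as", res.inserted_id)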
Finally, VMTP also supports a quick way of creating charts from the JSON results of multiple runs, so you can easily view and consume them together. This new feature (genchart.py) takes one or more VMTP JSON files as input and generates a static HTML file. The --browser option makes the tool open the generated HTML file directly in the user's default browser. There is no need for any additional HTTP server, since the HTML file is self-contained and the browser will automatically load the required JavaScript libraries.
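For example, combining two runs into one chart could be invoked as below. Passing the JSON files as arguments and the --browser flag come from the description above; any other options should be checked against genchart.py's help output:

    # Example invocation of the chart generator described above. Only the
    # JSON-files-as-input and --browser behaviours are taken from the text;
    # consult "python genchart.py --help" for the authoritative interface.
    import subprocess

    subprocess.run(
        ["python", "genchart.py", "--browser", "run1.json", "run2.json"],
        check=True)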
I hope you enjoyed learning about what the VMTP tool can offer, and I hope you will join in making it even better by contributing to the code at:
http://vmtp.readthedocs.org/en/latest/contributing.html#contribute-to-vmtp