Stacking up with OpenStack: Building for High Availability discusses designing applications for high availability (HA) when using OpenStack. It recommends eliminating single points of failure in the OpenStack infrastructure, designing applications to withstand server, zone, and cloud failures through techniques like replication, auto-scaling, and keeping management layers separate from infrastructure. The document also discusses different disaster recovery strategies and the tradeoffs between availability and cost.
Deployment topologies for high availability (HA), by Deepak Mane
The document discusses different deployment topologies for OpenStack high availability configurations. It describes the types of nodes in an OpenStack deployment including endpoint, controller, compute, and cinder volume nodes. It then examines several specific topology examples: one using a hardware load balancer with API services on compute nodes, another with a dedicated endpoint node and API services on controller nodes, and a third with simple controller redundancy and API services on controller nodes. Across all the examples, the key is distributing OpenStack services across nodes in a redundant and highly available manner.
The primary requirement for OpenStack-based clouds (public, private, or hybrid) is that they must be massively scalable and highly available. A number of interrelated concepts make understanding and implementing HA complex, and the consequences of not implementing HA correctly would be disastrous.
This session was presented at the OpenStack Meetup in Boston in February 2014. We discussed the interrelated concepts underpinning HA, along with examples of HA for MySQL, RabbitMQ, and the OpenStack APIs, primarily using Keepalived, VRRP, and HAProxy, to reinforce the concepts and show how to connect the dots.
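The HAProxy side of that setup boils down to round-robin balancing with health checks: requests rotate across the API backends, and any backend failing its check is skipped. A minimal sketch of that behavior in Python (backend names and the health-check function are hypothetical, for illustration only):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Toy model of an HAProxy-style frontend: rotate across backends,
    skipping any that fail a health check."""

    def __init__(self, backends, health_check):
        self.backends = list(backends)
        self.health_check = health_check
        self._ring = cycle(self.backends)

    def next_backend(self):
        """Return the next healthy backend, or None if all are down."""
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if self.health_check(candidate):
                return candidate
        return None

# Simulate three API nodes, one of which is down.
up = {"api-1": True, "api-2": False, "api-3": True}
lb = RoundRobinBalancer(up, health_check=lambda b: up[b])
print([lb.next_backend() for _ in range(4)])  # api-2 is always skipped
```

In the real deployment, Keepalived and VRRP add the missing piece: a virtual IP that floats between redundant HAProxy instances so the balancer itself is not a single point of failure.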
This document discusses various approaches to implementing high availability (HA) in OpenStack including active/active and active/passive configurations. It provides an overview of HA techniques used at Deutsche Telekom and eBay/PayPal including load balancing APIs and databases, replicating RabbitMQ and MySQL, and configuring Pacemaker/Corosync for OpenStack services. It also discusses lessons learned around testing failures, placing services across availability zones, and having backups for HA infrastructures.
High availability and fault tolerance of OpenStack, by Deepak Mane
This document discusses building a fault tolerant and highly available architecture for OpenStack. It proposes:
1. A master-master cluster architecture for MySQL and session-level replication for RabbitMQ to provide high availability for the database and message broker components.
2. Disk-level replication using DRBD for Glance, Swift, and Cinder to provide redundancy at the storage level.
3. Ensuring high availability for networking and the Horizon dashboard.
4. Developing predictive and reactive models to detect failures in Nova, Swift, and compute instances and enable recovery of all components.
The document recommends using Pacemaker for cluster-level management and Corosync for reliable messaging between cluster nodes.
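A central rule in Pacemaker/Corosync clusters is quorum: a node may only run resources while it can see a strict majority of the cluster's votes, which prevents split-brain after a partition. A minimal sketch of that majority rule (the real vote accounting lives in Corosync's votequorum):

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """True when a strict majority of the cluster's votes is present."""
    return votes_present > total_votes // 2

# In a 3-node cluster, 2 nodes are enough; 1 is not.
print(has_quorum(2, 3), has_quorum(1, 3))  # True False

# A 2-node cluster cannot survive losing either node without extra
# tie-breaking (e.g. a quorum device), which is why clusters of
# three or more nodes are preferred.
print(has_quorum(1, 2))  # False
```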
The document discusses high availability (HA) techniques in OpenStack. It covers HA concepts for both stateless and stateful services. For compute HA, it discusses server evacuation and instance migration without and with shared storage. It then covers different HA options for OpenStack controllers, including Pacemaker/Corosync/DRBD for active-passive HA and Galera for active-active MySQL HA. It also discusses using Keepalived, HAProxy and VRRP for load balancing and failover of API services. Finally, it presents a sample highly available OpenStack architecture and lists additional resources.
Technical overview of how SUSE OpenStack Cloud uses Chef to implement highly available OpenStack infrastructure services.
Target audience: curious developers in the upstream openstack-chef community
These slides were extracted from internal HA training for SUSE OpenStack Cloud developers, and slightly modified for the benefit of the openstack-chef community.
A study and practice of an OpenStack Kilo release HA deployment. The Kilo documentation has some errors, and it is hard to find a detailed document describing how to deploy an HA cloud based on the Kilo release. Hopefully these slides can provide some clues.
Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale (..., by confluent
To provide exceptional customer experiences at scale, the data pipelines that move data reliably across systems and applications in real time must be seamlessly scalable. For the past several years, we relied on message-queue-based data pipelines to transfer data across applications. However, as the number of use cases requiring real-time data transfer grew rapidly, the messaging platform became difficult to scale. Moving to Kafka helped us resolve the pipeline scaling issues and reduce publisher/subscriber onboarding time from several weeks to a few days. To support on-demand scaling of Kafka clusters, we run them on Red Hat OpenShift, an enterprise Kubernetes distribution. While managing Kafka clusters that handle critical financial events, we have learned some lessons and developed efficient strategies for running production-grade Kafka on OpenShift. In this talk, we will:
1. Describe some of the challenges we faced with Kafka on OpenShift and how we evolved our infrastructure to overcome them.
2. Share our experiences operating Kafka clusters at scale in production.
3. Present our strategy for automated Kafka deployment and rollback in OpenShift.
4. Explain our failover strategy using Confluent's Replicator to ensure service availability during cluster failures.
You’ve heard all of the hype, but how can SMACK work for you? In this all-star lineup, you will learn how to create a reactive, scalable, resilient, and performant data processing powerhouse. Bringing Akka, Kafka, and Mesos together provides a foundation to develop and operate an elastically scalable actor system. We will go through the basics of Akka, Kafka, and Mesos and then dive deep into putting them together in an end-to-end (and back again) distributed transaction. Distributed transactions mean producers waiting for one or more consumers to respond. We'll also go through automated ways to induce failures in these systems (using LinkedIn's Simoorg) and trace them from start to stop through each component (using Twitter's Zipkin). Finally, you will see how Apache Cassandra and Spark can be combined to add the massively scalable storage and data analysis needed in fast data pipelines. With these technologies as a foundation, you have the assurance that scale is never a problem and uptime is the default.
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy), Kafka Summit SF 2019, by confluent
Cloud migration: it's practically a rite of passage for anyone who's built infrastructure on bare metal. When we migrated our 5-year-old Kafka deployment from the datacenter to GCP, we were faced with the task of making our highly mutable server infrastructure more cloud-friendly. This led to a surprising decision: we chose to run our Kafka cluster on Kubernetes. I'll share war stories from our Kafka migration journey, explain why we chose Kubernetes over arguably simpler options like GCP VMs, and present the lessons we learned while making our way toward a stable and self-healing Kubernetes deployment. I'll also go through some improvements in the more recent Kafka releases that make upgrades crucial for any Kafka deployment on immutable and ephemeral infrastructure. You'll learn what happens when you try to run one complex distributed system on top of another, and come away with some handy tricks for automating cloud cluster management, plus some migration pitfalls to avoid. And if you're not sure whether running Kafka on Kubernetes is right for you, our experiences should provide some extra data points that you can use as you make that decision.
Scalable Persistent Storage for Erlang: Theory and Practice, by Amir Ghaffari
The RELEASE project at Glasgow University aims to improve the scalability of Erlang onto commodity architectures with 100,000 cores.
Such architectures require scalable and available persistent storage on up to 100 hosts. The talk describes the provision of scalable persistent storage options for Erlang.
We outline the theory and apply it to popular Erlang distributed database management systems (DBMSs): Mnesia, CouchDB, Riak, and Cassandra. We identify Dynamo-style NoSQL DBMSs as suitable scalable persistent storage technologies. To evidence the scalability, we benchmark Riak in practice, measuring the scalability and elasticity of Riak on a 100-node cluster with 800 cores.
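The defining trait of Dynamo-style stores like Riak is consistent hashing: each key maps to a position on a hash ring and is replicated to the next N distinct nodes clockwise, so adding or removing a node moves only a small fraction of keys. A toy sketch of that placement (node names are illustrative; real systems add virtual nodes and vector clocks on top):

```python
import hashlib
from bisect import bisect

def ring_position(name: str) -> int:
    """Map a node name or key to a position on the hash ring."""
    return int(hashlib.md5(name.encode()).hexdigest(), 16)

def preference_list(key: str, nodes: list, n: int = 3) -> list:
    """The n nodes responsible for `key`, walking clockwise on the ring."""
    ring = sorted((ring_position(node), node) for node in nodes)
    start = bisect(ring, (ring_position(key), ""))
    return [ring[(start + i) % len(ring)][1] for i in range(n)]

nodes = [f"riak-{i}" for i in range(5)]
print(preference_list("user:42", nodes, n=3))  # three distinct replica owners
```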
Capacity planning is a difficult challenge faced by most companies. If you have too few machines, you will not have enough compute resources available to deal with heavy loads. On the other hand, if you have too many machines, you are wasting money. This is why companies have started investing in automatically scaling services and infrastructure to minimize the amount of wasted money and resources.
In this talk, Nathan will describe how Yelp is using PaaSTA, a PaaS built on top of open source tools including Docker, Mesos, Marathon, and Chronos, to automatically and gracefully scale services and the underlying cluster. He will go into detail about how this functionality was implemented and the design decisions that were made while architecting the system. He will also provide a brief comparison of how this approach differs from existing solutions.
Running a distributed system across Kubernetes clusters - KubeCon North Ameri..., by Alex Robinson
Kubernetes makes it easy to run distributed applications, even those that manage persistent state, within the confines of a single cluster. Running the same applications in a multi-region or multi-cloud fashion across multiple Kubernetes clusters, however, is considerably more difficult due to the networking and service discovery problems involved.
In this talk, Alex will walk through his team’s experience over the last six months of running a distributed database across Kubernetes clusters in different regions and their attempts to make the process repeatable on different cloud providers and on-prem environments. He’ll cover common problems they encountered, solutions they’ve tried, how they’re running things today, and the future improvements he’s most excited about from community projects like Istio.
Running Galera Cluster in Microsoft Azure involves setting up virtual machines and installing Galera Cluster software. This provides more control than Azure Database for MySQL, which uses asynchronous replication. While Azure Database for MySQL is fully managed, Galera Cluster in VMs supports the virtually synchronous replication that is its core feature. Cost estimates show running three Galera Cluster nodes in VMs costs less monthly than three hosted MySQL instances in Azure Database for MySQL.
This document discusses how Pulsar operators can be used to automate lifecycle management of Pulsar clusters on Kubernetes. It describes how operators use custom resource definitions and controllers to reconcile the actual cluster state with the desired state. Specific examples are provided for how operators can perform controlled cluster upgrades, scale bookies, and clean up after cluster deletion. The integration of operators with Helm charts is also covered.
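The heart of any such operator is the reconcile loop: compare the desired state declared in the custom resource with the actual cluster state and compute the actions that close the gap. A schematic sketch with hypothetical field names (real operators act on the Kubernetes API rather than returning strings):

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Return the actions needed to move `actual` toward `desired`."""
    actions = []
    # Scale the bookie (storage) tier up or down to the declared count.
    if actual.get("bookies", 0) < desired["bookies"]:
        actions.append(f"scale bookies up to {desired['bookies']}")
    elif actual.get("bookies", 0) > desired["bookies"]:
        actions.append(f"scale bookies down to {desired['bookies']}")
    # A version mismatch triggers a controlled rolling upgrade.
    if actual.get("version") != desired["version"]:
        actions.append(f"rolling upgrade to {desired['version']}")
    return actions

print(reconcile({"bookies": 5, "version": "2.8"},
                {"bookies": 3, "version": "2.7"}))
```

The controller runs this comparison continuously, so drift (a crashed bookie, a manual change) is corrected the same way as an intentional spec update.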
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison, by Severalnines
Galera Cluster for MySQL, Percona XtraDB Cluster, and MariaDB Cluster (the three “flavours” of Galera Cluster) make use of the Galera WSREP libraries to handle synchronous replication. MySQL Cluster is the official clustering solution from Oracle, while Galera Cluster for MySQL is slowly but surely establishing itself as the de facto clustering solution in the wider MySQL ecosystem.
In this webinar, we will look at all these alternatives and present an unbiased view on their strengths/weaknesses and the use cases that fit each alternative.
This webinar will cover the following:
MySQL Cluster architecture: strengths and limitations
Galera Architecture: strengths and limitations
Deployment scenarios
Data migration
Read and write workloads (Optimistic/pessimistic locking)
WAN/Geographical replication
Schema changes
Management and monitoring
Jakub Pavlik discusses high availability versus disaster recovery in OpenStack clouds. He describes four types of high availability in OpenStack: physical infrastructure, OpenStack control services, virtual machines, and applications. For each type, he outlines concepts like active/passive and active/active configurations, specific technologies used like Pacemaker, Corosync, HAProxy, and MySQL Galera, and considerations for shared and non-shared storage. Finally, he provides examples of high availability architectures and methods used by different OpenStack vendors.
Troubleshooting Kafka's socket server: from incident to resolution, by Joel Koshy
LinkedIn’s Kafka deployment is nearing 1,300 brokers that move close to 1.3 trillion messages a day. While operating Kafka smoothly even at this scale is a testament to both Kafka’s scalability and the operational expertise of LinkedIn SREs, we occasionally run into some very interesting bugs at this scale. In this talk I will dive into a production issue we recently encountered as an example of how even a subtle bug can suddenly manifest at scale and cause a near meltdown of the cluster. We will go over how we detected and responded to the situation, how we investigated it after the fact, and summarize some lessons learned and best practices from this incident.
Stream your Operational Data with Apache Spark & Kafka into Hadoop using Couc..., by Data Con LA
Abstract:-
Tracking user events as they happen can challenge anyone providing real-time user interaction. It can demand both huge scale and a lot of processing to support dynamic adjustment of targeting for products and services. As an operational data store, Couchbase data services are capable of processing tens of millions of updates a day. Streamed through systems such as Apache Spark and Kafka into Hadoop, information about these key events can be turned into deeper knowledge. We will review Lambda architectures deployed at sites like PayPal, LivePerson, and LinkedIn that leverage a Couchbase data pipeline.
Bio:-
Justin Michaels. With over 20 years of experience deploying mission-critical systems, Justin's industry experience covers capacity planning, architecture, and industry verticals. He brings his passion for architecting, implementing, and improving Couchbase to the community as a Solution Architect. His expertise spans both conventional application platforms and distributed data management systems. He regularly engages with existing and new Couchbase customers on performance reviews, architecture planning, and best-practice guidance.
The document discusses using OpenStack and VMware vSphere together to build a hybrid cloud solution. It describes how traditional and cloud native workloads have different characteristics that influence infrastructure design. Traditionally, infrastructure was designed for resilience while clouds are designed for rapid scale and assume failures will occur. The document advocates designing applications to handle resiliency through loose coupling, horizontal scaling, and treating instances as cattle not pets. OpenStack can provide automation and orchestration of resources like vSphere hypervisors to deliver self-service IT capabilities at scale. An integrated OpenStack and vSphere hybrid solution provides customers the best of both worlds.
How Pulsar Stores Your Data - Pulsar Summit NA 2021, by StreamNative
In order to leverage the best performance characteristics of your stream backend, it is important to understand the nitty-gritty details of how Pulsar stores your data. Understanding this empowers you to design your solution to make the best use of the resources at hand and to get the optimal balance of consistency, availability, latency, and throughput for a given amount of resources.
With this underlying philosophy, in this talk we will get to the bottom of Pulsar's storage tier (Apache BookKeeper): the barebones of the BookKeeper storage semantics, how it is used in different use cases (even beyond Pulsar), the object models of storage in Pulsar, the different kinds of data structures and algorithms Pulsar uses, and how they map to the semantics of the storage class shipped with Pulsar by default. Oh yes, you can change the storage backend too with some additional code!
This session will empower you with the right background to map your data right with pulsar.
Friends don't let friends do dual writes: Outbox pattern with OpenShift Strea..., by Red Hat Developers
Dual writes are a common source of issues in distributed event-driven applications. A dual write occurs when an application has to change data in two different systems - for instance, when an application needs to persist data in the database and send a Kafka message to notify other systems. If one of these two operations fails, you might end up with inconsistent data, which can be hard to detect and fix.
OpenShift Streams for Apache Kafka is Red Hat's fully hosted and managed Apache Kafka service targeting development teams that want to incorporate streaming data and scalable messaging in their applications, without the burden of setting up and maintaining a Kafka cluster infrastructure. Debezium is an open source distributed platform for change data capture. Built on top of Apache Kafka, it allows applications to react to inserts, updates, and deletes in your databases.
In this session you will learn how you can leverage OpenShift Streams for Apache Kafka and Debezium to avoid the dual write issue in an event-driven application using the outbox pattern. More specifically, we will show you how to:
Provision a Kafka cluster on OpenShift Streams for Apache Kafka.
Deploy and configure Debezium to use OpenShift Streams for Apache Kafka.
Refactor an application to leverage Debezium and OpenShift Streams for Apache Kafka to avoid the dual write problem.
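The essence of the outbox pattern is that the business row and the event row are written in the same database transaction, so neither can exist without the other; a separate process (Debezium, in this session) then streams the outbox table into Kafka. A minimal sketch using SQLite, with illustrative table and column names:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         aggregate TEXT, type TEXT, payload TEXT);
""")

def place_order(order_id: int, item: str) -> None:
    with conn:  # one atomic transaction covers BOTH writes
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, item))
        conn.execute(
            "INSERT INTO outbox (aggregate, type, payload) VALUES (?, ?, ?)",
            ("order", "OrderCreated",
             json.dumps({"id": order_id, "item": item})),
        )

place_order(1, "widget")
# Both rows exist; had the transaction failed, neither would.
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0],
      conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0])
```

Because the event reaches Kafka only via change data capture on the outbox table, the application never performs the risky second write itself.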
It’s no news that containers represent a portable unit of deployment, and OpenStack has proven an ideal environment for running container workloads. However, things usually become more complex because an application is often built out of multiple containers, and spans hybrid environments: diverse clouds, bare metal, and even non-virtualized infrastructure. What’s more, setting up a cluster of container images can be fairly cumbersome, because you need to make one container aware of another and expose the intimate details required for them to communicate, which is not trivial, especially if they’re not on the same host.
These scenarios have instigated the demand for some kind of orchestrator. The list of container orchestrators is growing fairly fast. This session will compare the different orchestration projects out there - from Heat to Kubernetes to Mesos & Cloudify - and help you choose the right tool for the job.
Deep dive into highly available OpenStack architecture - OpenStack Summit Va..., by Arthur Berezin
This document summarizes a presentation on highly available OpenStack architecture. It discusses using Pacemaker and HAProxy to make the enabling services highly available. Shared databases such as MariaDB Galera and message queues such as RabbitMQ are made highly available. Individual OpenStack services like Keystone, Glance, Cinder, Nova, Neutron, and Horizon are made highly available through active-active clustering, load balancing, and fencing. The presentation covers topologies for controller, compute, network, and storage nodes. It provides examples of making individual services highly available and discusses ongoing work and future plans to improve high availability in OpenStack.
Rackspace aims to deploy code from the OpenStack trunk on demand to its multi-cell regions with minimal customer impact. It discusses strategies for merging and branching code, packaging and distributing releases, and deploying and testing in development, QA, and production environments. Challenges include managing code conflicts, disruptive database migrations, testing at scale, and aligning continuous integration/delivery processes with OpenStack release methodology. Rackspace is working to address these challenges to keep OpenStack trunk continuously deployable.
This document discusses using Hadoop for OpenStack log analysis to address challenges of operating OpenStack at scale. It proposes collecting logs continuously into Hadoop, parsing and indexing them intelligently, and defining a storage schema. The current development status includes batch loading of logs converted to AVRO format and indexed in SOLR. Next steps discussed include documenting patterns, collaborating on schema design, and getting sample logs to Hadoop experts.
A study and practice of OpenStack release Kilo HA deployment. The Kilo document has some errors, and it's hardly find a detailed document to describe how to deploy a HA cloud based on Kilo release. Hope this slides can provide some clues.
Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale (...confluent
"To provide exceptional customer experiences at scale, the data pipelines that can move data reliably across the systems and applications in real-time should be seamlessly scalable. For the past several years, we relied on Message Queue based data pipelines to facilitate the transfer of data across the applications. However, as the number of use cases that require real-time data transfer increased rapidly, it became difficult to scale the messaging platform. Moving to Kafka helped us to resolve the data pipeline scaling issues and reduce the Publisher/Subscriber on-boarding time from several weeks to a few days. To support the on-demand scaling of Kafka clusters, we run them on RedHat OpenShift, an Enterprise Kubernetes. While managing Kafka that handles critical financial events, we have learned some lessons and developed efficient strategies to manage production-grade Kafka clusters on OpenShift. In this talk, we will present:
1. Some of the challenges that we faced with Kafka on OpenShift and how we evolved our infrastructure to overcome them.
2. Share our experiences from operating Kafka clusters at Scale in Production.
3. Our strategy for performing automated Kafka deployment and rollback in OpenShift.
4. Explain our fail-over strategy using Confluent’s Replicator to ensure service availability during cluster failures."
You’ve heard all of the hype, but how can SMACK work for you? In this all-star lineup, you will learn how to create a reactive, scaling, resilient and performant data processing powerhouse. Bringing Akka, Kafka and Mesos together provides a foundation to develop and operate an elastically scalable actor system. We will go through the basics of Akka, Kafka and Mesos and then deep dive into putting them together in an end2end (and back again) distrubuted transaction. Distributed transactions mean producers waiting for one or more of consumers to respond. We'll also go through automated ways to failure induce these systems (using LinkedIn Simoorg) and trace them from start to stop through each component (using Twitters Zipkin). Finally, you will see how Apache Cassandra and Spark can be combined to add the incredibly scaling storage and data analysis needed in fast data pipelines. With these technologies as a foundation, you have the assurance that scale is never a problem and uptime is default.
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019confluent
Cloud migration: it's practically a rite of passage for anyone who's built infrastructure on bare metal. When we migrated our 5-year-old Kafka deployment from the datacenter to GCP, we were faced with the task of making our highly mutable server infrastructure more cloud-friendly. This led to a surprising decision: we chose to run our Kafka cluster on Kubernetes. I'll share war stories from our Kafka migration journey, explain why we chose Kubernetes over arguably simpler options like GCP VMs, and present the lessons we learned while making our way toward a stable and self-healing Kubernetes deployment. I'll also go through some improvements in the more recent Kafka releases that make upgrades crucial for any Kafka deployment on immutable and ephemeral infrastructure. You'll learn what happens when you try to run one complex distributed system on top of another, and come away with some handy tricks for automating cloud cluster management, plus some migration pitfalls to avoid. And if you're not sure whether running Kafka on Kubernetes is right for you, our experiences should provide some extra data points that you can use as you make that decision.
Scalable Persistent Storage for Erlang: Theory and PracticeAmir Ghaffari
The RELEASE project at Glasgow University aims to improve the scalability of Erlang onto commodity architectures with 100,000 cores.
Such architectures require scalable and available persistent storage on up to 100 hosts. The talk describes the provision of scalable persistent storage options for Erlang.
We outline the theory and apply it to popular Erlang distributed database management systems (DBMS): Mnesia, CouchDB, Riak and Cassandra. We identify Dynamo-style NoSQL DBMS as suitable scalable persistent storage technologies. To evidence the scalability we benchmark Riak in practice, measuring the scalability and elasticity of Riak on 100-node cluster with 800 cores.
Capacity planning is a difficult challenge faced by most companies. If you have too few machines, you will not have enough compute resources available to deal with heavy loads. On the other hand, if you have too many machines, you are wasting money. This is why companies have started investing in automatically scaling services and infrastructure to minimize the amount of wasted money and resources.
In this talk, Nathan will describe how Yelp is using PaaSTA, a PaaS built on top of open source tools including Docker, Mesos, Marathon, and Chronos, to automatically and gracefully scale services and the underlying cluster. He will go into detail about how this functionality was implemented and the design designs that were made while architecting the system. He will also provide a brief comparison of how this approach differs from existing solutions.
Running a distributed system across kubernetes clusters - Kubecon North Ameri...Alex Robinson
Kubernetes makes it easy to run distributed applications, even those that manage persistent state, within the confines of a single cluster. Running the same applications in a multi-region or multi-cloud fashion across multiple Kubernetes clusters, however, is considerably more difficult due to the networking and service discovery problems involved.
In this talk, Alex will walk through his team’s experience over the last six months of running a distributed database across Kubernetes clusters in different regions and their attempts to make the process repeatable on different cloud providers and on-prem environments. He’ll cover common problems they encountered, solutions they’ve tried, how they’re running things today, and the future improvements he’s most excited about from community projects like Istio.
Running Galera Cluster in Microsoft Azure involves setting up virtual machines and installing Galera Cluster software. This provides more control than Azure Database for MySQL, which uses asynchronous replication. While Azure Database for MySQL is fully managed, Galera Cluster in VMs supports the virtually synchronous replication that is its core feature. Cost estimates show running three Galera Cluster nodes in VMs costs less monthly than three hosted MySQL instances in Azure Database for MySQL.
This document discusses how Pulsar operators can be used to automate lifecycle management of Pulsar clusters on Kubernetes. It describes how operators use custom resource definitions and controllers to reconcile the actual cluster state with the desired state. Specific examples are provided for how operators can perform controlled cluster upgrades, scale bookies, and clean up after cluster deletion. The integration of operators with Helm charts is also covered.
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison Severalnines
Galera Cluster for MySQL, Percona XtraDB Cluster and MariaDB Cluster (the three “flavours” of Galera Cluster) make use of the Galera WSREP libraries to handle synchronous replication. MySQL Cluster is the official clustering solution from Oracle, while Galera Cluster for MySQL is slowly but surely establishing itself as the de-facto clustering solution in the wider MySQL eco-system.
In this webinar, we will look at all these alternatives and present an unbiased view on their strengths/weaknesses and the use cases that fit each alternative.
This webinar will cover the following:
MySQL Cluster architecture: strengths and limitations
Galera Architecture: strengths and limitations
Deployment scenarios
Data migration
Read and write workloads (Optimistic/pessimistic locking)
WAN/Geographical replication
Schema changes
Management and monitoring
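To make the optimistic/pessimistic distinction from the topic list concrete, here is a toy Python sketch (not Galera code) of optimistic locking with a version counter. It mirrors how Galera's certification-based replication detects write conflicts at commit time rather than blocking writers up front:

```python
# Toy illustration of optimistic locking: each row carries a version, and a
# write commits only if the version it read is still current; otherwise the
# transaction fails certification and must be retried.

class ConflictError(Exception):
    pass

class Row:
    def __init__(self, value):
        self.value = value
        self.version = 0

def optimistic_update(row: Row, read_version: int, new_value) -> None:
    """Commit only if nobody else updated the row since we read it."""
    if row.version != read_version:
        raise ConflictError("certification failed; retry the transaction")
    row.value = new_value
    row.version += 1

row = Row("a")
v = row.version                      # transaction 1 reads version 0
optimistic_update(row, v, "b")       # commits, version becomes 1
try:
    optimistic_update(row, v, "c")   # stale read: conflicts with the commit above
except ConflictError:
    optimistic_update(row, row.version, "c")  # re-read and retry
```

Pessimistic locking would instead take a lock before reading, trading retry logic for blocking; that trade-off is exactly what differentiates Galera's write behaviour from NDB's.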
Jakub Pavlik discusses high availability versus disaster recovery in OpenStack clouds. He describes four types of high availability in OpenStack: physical infrastructure, OpenStack control services, virtual machines, and applications. For each type, he outlines concepts like active/passive and active/active configurations, specific technologies used like Pacemaker, Corosync, HAProxy, and MySQL Galera, and considerations for shared and non-shared storage. Finally, he provides examples of high availability architectures and methods used by different OpenStack vendors.
Troubleshooting Kafka's socket server: from incident to resolutionJoel Koshy
LinkedIn’s Kafka deployment is nearing 1300 brokers that move close to 1.3 trillion messages a day. While operating Kafka smoothly even at this scale is a testament to both Kafka’s scalability and the operational expertise of LinkedIn SREs, we occasionally run into some very interesting bugs at this scale. In this talk I will dive into a production issue that we recently encountered as an example of how even a subtle bug can suddenly manifest at scale and cause a near meltdown of the cluster. We will go over how we detected and responded to the situation, investigated it after the fact, and summarize some lessons learned and best practices from this incident.
Stream your Operational Data with Apache Spark & Kafka into Hadoop using Couc...Data Con LA
Abstract:-
Tracking user events as they happen can challenge anyone providing real time user interaction. It can demand both huge scale and a lot of processing to support dynamic adjustment to targeting products and services. As the operational data store Couchbase data services are capable of processing tens of millions of updates a day. Streaming through systems such as Apache Spark and Kafka into Hadoop, information about these key events can be turned into deeper knowledge. We will review Lambda architectures deployed at sites like PayPal, Live Person and LinkedIn that leverage a Couchbase Data Pipeline.
Bio:-
Justin Michaels. With over 20 years of experience deploying mission-critical systems, Justin Michaels has industry experience covering capacity planning, architecture, and multiple industry verticals. Justin brings his passion for architecting, implementing, and improving Couchbase to the community as a Solution Architect. His expertise spans both conventional application platforms and distributed data management systems. He regularly engages with existing and new Couchbase customers in performance reviews, architecture planning, and best-practice guidance.
The document discusses using OpenStack and VMware vSphere together to build a hybrid cloud solution. It describes how traditional and cloud native workloads have different characteristics that influence infrastructure design. Traditionally, infrastructure was designed for resilience while clouds are designed for rapid scale and assume failures will occur. The document advocates designing applications to handle resiliency through loose coupling, horizontal scaling, and treating instances as cattle not pets. OpenStack can provide automation and orchestration of resources like vSphere hypervisors to deliver self-service IT capabilities at scale. An integrated OpenStack and vSphere hybrid solution provides customers the best of both worlds.
How Pulsar Stores Your Data - Pulsar Summit NA 2021StreamNative
In order to get the best performance characteristics out of your stream backend, it is important to understand the nitty-gritty details of how Pulsar stores your data. Understanding this empowers you to design your use-case solution to make the best use of the resources at hand, and to get the optimum amount of consistency, availability, latency, and throughput for a given amount of resources.
With this underlying philosophy, in this talk we will get to the bottom of Pulsar's storage tier (Apache BookKeeper): the barebones of the BookKeeper storage semantics, how it is used in different use cases (even beyond Pulsar), the object models of storage in Pulsar, the different kinds of data structures and algorithms Pulsar uses, and how those map to the semantics of the storage class shipped with Pulsar by default. Oh yes, you can change the storage backend too with some additional code!
This session will empower you with the right background to map your data right with pulsar.
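As a rough mental model of that storage tier (toy Python, not the BookKeeper API): a topic's log is a chain of ledgers, and each message is addressed by a (ledger_id, entry_id) pair, so reading a position means locating the right ledger and then the entry inside it.

```python
# Simplified object model of Pulsar's storage layout: a topic's log is an
# ordered chain of ledgers; every stored message gets a (ledger_id, entry_id)
# address. This is a conceptual sketch, not BookKeeper code.

class Ledger:
    def __init__(self, ledger_id: int):
        self.ledger_id = ledger_id
        self.entries: list[bytes] = []

    def append(self, payload: bytes) -> tuple[int, int]:
        self.entries.append(payload)
        return (self.ledger_id, len(self.entries) - 1)

class TopicLog:
    """A topic's log: an ordered chain of ledgers."""
    def __init__(self):
        self.ledgers: dict[int, Ledger] = {}
        self.current = None   # the one writable ledger
        self.next_id = 0

    def roll_ledger(self) -> None:
        # Pulsar rolls to a fresh ledger periodically; closed ledgers
        # become immutable, which is what enables tiered offload.
        self.current = Ledger(self.next_id)
        self.ledgers[self.next_id] = self.current
        self.next_id += 1

    def append(self, payload: bytes) -> tuple[int, int]:
        if self.current is None:
            self.roll_ledger()
        return self.current.append(payload)

    def read(self, ledger_id: int, entry_id: int) -> bytes:
        return self.ledgers[ledger_id].entries[entry_id]

log = TopicLog()
pos1 = log.append(b"m1")
log.roll_ledger()
pos2 = log.append(b"m2")   # lands in a different ledger than m1
```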
Friends don't let friends do dual writes: Outbox pattern with OpenShift Strea...Red Hat Developers
Dual writes are a common source of issues in distributed event-driven applications. A dual write occurs when an application has to change data in two different systems - for instance, when an application needs to persist data in the database and send a Kafka message to notify other systems. If one of these two operations fail, you might end up with inconsistent data which can be hard to detect and fix.
OpenShift Streams for Apache Kafka is Red Hat's fully hosted and managed Apache Kafka service targeting development teams that want to incorporate streaming data and scalable messaging in their applications, without the burden of setting up and maintaining a Kafka cluster infrastructure. Debezium is an open source distributed platform for change data capture. Built on top of Apache Kafka, it allows applications to react to inserts, updates, and deletes in your databases.
In this session you will learn how you can leverage OpenShift Streams for Apache Kafka and Debezium to avoid the dual write issue in an event-driven application using the outbox pattern. More specifically, we will show you how to:
Provision a Kafka cluster on OpenShift Streams for Apache Kafka.
Deploy and configure Debezium to use OpenShift Streams for Apache Kafka.
Refactor an application to leverage Debezium and OpenShift Streams for Apache Kafka to avoid the dual write problem.
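The outbox pattern itself can be sketched independently of the Red Hat tooling. In this minimal Python example, sqlite3 stands in for the application database and a polling function stands in for Debezium's change data capture: the business row and the event are written in a single local transaction, so the dual write never happens.

```python
# Minimal outbox-pattern sketch: one atomic local transaction writes both the
# business row and the event row; a CDC relay (Debezium in the real setup)
# later forwards outbox rows to Kafka.
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY, topic TEXT, payload TEXT);
""")

def place_order(item: str) -> None:
    # One transaction: either both rows commit or neither does.
    with db:
        cur = db.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        event = {"order_id": cur.lastrowid, "item": item}
        db.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("orders.created", json.dumps(event)),
        )

def relay_events() -> list:
    """Stand-in for the CDC relay that publishes outbox rows to Kafka."""
    rows = db.execute("SELECT topic, payload FROM outbox ORDER BY id").fetchall()
    return [(topic, json.loads(payload)) for topic, payload in rows]

place_order("keyboard")
```

Because the event is derived from the committed outbox row rather than sent directly to Kafka by the application, a crash between the two writes can no longer leave the systems inconsistent.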
It’s no news that containers represent a portable unit of deployment, and OpenStack has proven an ideal environment for running container workloads. Complexity grows, however, when an application is built out of multiple containers, or spans hybrid environments: diverse clouds, bare metal, and even non-virtualized infrastructure. What’s more, setting up a cluster of container images can be fairly cumbersome, because you need to make one container aware of another and expose the intimate details they require to communicate, which is not trivial, especially if they’re not on the same host.
These scenarios have instigated the demand for some kind of orchestrator. The list of container orchestrators is growing fairly fast. This session will compare the different orchestration projects out there - from Heat to Kubernetes to Mesos & Cloudify - and help you choose the right tool for the job.
Deep dive into highly available open stack architecture openstack summit va...Arthur Berezin
This document summarizes a presentation on highly available OpenStack architecture. It discusses using Pacemaker and HAProxy for high availability enabling services. Shared databases like MariaDB Galera and message queues like RabbitMQ are made highly available. Individual OpenStack services like Keystone, Glance, Cinder, Nova, Neutron, and Horizon are made highly available through active-active clustering, load balancing, and fencing. The presentation covers topologies for controller, compute, network, and storage nodes. It provides examples of making individual services highly available and discusses ongoing work and future plans to improve high availability in OpenStack.
Rackspace aims to deploy code from the OpenStack trunk on demand to its multi-cell regions with minimal customer impact. It discusses strategies for merging and branching code, packaging and distributing releases, and deploying and testing in development, QA, and production environments. Challenges include managing code conflicts, disruptive database migrations, testing at scale, and aligning continuous integration/delivery processes with OpenStack release methodology. Rackspace is working to address these challenges to keep OpenStack trunk continuously deployable.
This document discusses using Hadoop for OpenStack log analysis to address challenges of operating OpenStack at scale. It proposes collecting logs continuously into Hadoop, parsing and indexing them intelligently, and defining a storage schema. The current development status includes batch loading of logs converted to AVRO format and indexed in SOLR. Next steps discussed include documenting patterns, collaborating on schema design, and getting sample logs to Hadoop experts.
Jesus Gonzalez-Barahona presented on using Grimoire tools to analyze OpenStack. Grimoire includes tools to extract data from source code repositories, issue trackers, and mailing lists. This data is stored in SQL databases and can then be queried, analyzed, and visualized. Specifically, Bitergia has deployed an OpenStack activity dashboard that visualizes metrics on contributions over time. Future work includes tracking additional parameters like time-to-close issues and developer demographics. The goal is to provide transparency into OpenStack development and support data-driven community decisions.
The document discusses 10 things learned from implementing OpenStack. It covers topics like cloud geography, industry and technology diversity in OpenStack implementations, hybrid cloud models focusing on storage, continuous vs staged integration, the duality of OpenStack storage, diversity of development and operations models, distributed vs centralized control, using erasure coding vs RAID for resiliency, and the idea of a shared service proposal.
Blue host using openstack in a traditional hosting environmentOpenStack Foundation
Using OpenStack in a traditional hosting environment posed scaling challenges that required automating provisioning across multiple data centers. OpenStack was chosen for its open source support, typical cloud features, and ability to transition to a future cloud offering. Bluehost implemented OpenStack across over 10,000 servers, addressing issues like unstable messaging, overloaded MySQL, and premature networking plugins. Solutions involved read-only databases, optimized configurations, and custom scheduler, quantum, and nova components.
The document discusses Bluehost's experience transitioning to OpenStack for hosting over 10,000 physical servers. Key points:
- Bluehost needed an automated system to manage scaling rapidly from 1,000 to over 10,000 servers across data centers. OpenStack was chosen for its scalability, standard APIs, and momentum.
- Major challenges included scaling messaging, database, and APIs. Customizations were made to OpenStack components like Nova, Quantum, and MySQL to address these issues.
- Operational issues around reboots, monitoring, and network abstraction were solved through workarounds and enhanced plugins. Overall the experience highlighted areas like scalable databases and networking that need improvement for true OpenStack success at large scales.
Mark Collier, COO of the OpenStack Foundation, gave the opening keynote at the OpenStack Day London event in June 2014.
Much of the content was presented at the recent Summit in Atlanta as well: https://www.youtube.com/watch?v=H4j-Mnxenc4
Canonical transitioned its internal IT infrastructure to use OpenStack in their private cloud (CanoniStack) to practice what they preach to customers. This was challenging due to heterogeneous hardware, deciding on OpenStack software versions, and managing the cloud platform. They overcame these challenges and now run two OpenStack regions for internal systems. Looking forward, Canonical aims to run more internal services on CanoniStack and improve areas like high availability and live upgrades.
The document discusses considerations for building a private cloud using OpenStack Folsom. It covers topics such as the definition of a private cloud, sizing instances and flavors, network architecture including multiple networks, image storage and performance, and architecture examples for different sizes of private clouds. The document provides guidance on capacity planning, performance bottlenecks, and best practices for building a private cloud with OpenStack.
This document discusses the use of cloud computing in high energy physics (particle physics). It describes how clouds are being used to preserve long-term software and data from particle physics experiments. Clouds are also being used to provide distributed computing for exceptional computing demands. Private clouds have been enabled using the existing high-level trigger farms of the ATLAS and CMS experiments during periods when the accelerators are idle. Both private and commercial clouds are playing an important role in high energy physics.
The document outlines enhancements to the Trove database service from the Icehouse to Juno releases of OpenStack. Key additions in Juno include support for asynchronous MySQL replication, integration with Neutron networking, expanded configuration groups, additional datastore support like PostgreSQL and Vertica, cross-region backups, and improved testing. The goal is to provide a scalable, reliable database as a service with a fully-featured open source framework.
Canonical transitioned its internal IT infrastructure to use OpenStack in their private cloud (CanoniStack) to practice what they preach about cloud technologies. This transition was challenging due to organizational expectations for increased efficiency, heterogeneous hardware, and decisions around OpenStack software configuration and service management. They eventually implemented a production-ready private cloud (ProdStack) using specific Ubuntu OpenStack releases, hardware resource management with MAAS and Juju, and further improvements are planned around high availability, live upgrades, and resilience testing.
This document discusses how clouds are used in high energy physics research. It describes how clouds provide computing resources for experiments like the Large Hadron Collider. Clouds help process massive amounts of data from particle collisions and allow global collaboration between researchers. They also help preserve data and software from past experiments for long-term analysis. Clouds enable high energy physics to further understanding of fundamental questions about the universe.
1. The document discusses 10 things learned from implementing OpenStack including cloud geography, industry diversity, and technology diversity.
2. It explores the variety of consumption models for OpenStack including rack appliances, controllers, and software instances.
3. Integration approaches are discussed ranging from continuous integration to staged integration for different environments like surgery, air traffic control, or military systems.
Chef for OpenStack provides a framework for automatically deploying and managing OpenStack using Chef. It includes cookbooks for common OpenStack components like Keystone, Glance, Nova, etc. that allow OpenStack to be deployed in an automated, repeatable way. The project has many corporate contributors and aims to reduce fragmentation and encourage collaboration in deploying OpenStack. It provides a Chef repository, documentation, and community support through various channels. The goal is to make deploying and managing OpenStack infrastructure as code a reality.
Best Practices for Integrating a Third party Portal with OpenStackOpenStack Foundation
This document discusses best practices for integrating a third party portal with OpenStack. It notes that while Horizon allows resource consumption, it does not provide full enterprise management integration. The document outlines example OpenStack service provider and enterprise architectures that incorporate a third party portal. It also discusses user experience considerations like simplifying common operations and providing transparent cost management.
This document provides a brief conceptual overview of the OpenStack architecture:
OpenStack is an open source cloud operating system that consists of a set of interrelated services that are written in Python and provide APIs to interact with components like compute, networking, storage and identity. The core components include compute (Nova), object storage (Swift), block storage (Cinder), image service (Glance), identity (Keystone), networking (Quantum), and dashboard (Horizon), which provides a web-based user interface. Each component communicates with others via APIs to provide infrastructure as a service capabilities.
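As a small concrete example of that API-driven interaction: every call to a component like Nova or Cinder starts with a token from the identity service. The standard Keystone v3 password-authentication request is plain JSON; the user, password, and project names below are placeholders.

```python
# Build the standard Keystone v3 password-auth request body. POSTing this
# JSON to the identity endpoint (e.g. http://controller:5000/v3/auth/tokens)
# returns a token in the X-Subject-Token response header, which then
# authenticates calls to the other service APIs.

def keystone_auth_body(user: str, password: str, project: str,
                       domain: str = "Default") -> dict:
    return {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": user,
                        "domain": {"name": domain},
                        "password": password,
                    }
                },
            },
            # Scope the token to a project so it is usable against
            # project-owned resources (servers, volumes, networks).
            "scope": {
                "project": {"name": project, "domain": {"name": domain}}
            },
        }
    }

body = keystone_auth_body("demo", "secret", "demo-project")
```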
Cloud Immortality - Architecting for High Availability & Disaster RecoveryRightScale
RightScale Conference Santa Clara 2011: RightScale is involved in building diverse cloud environments for our customers. Want to deploy applications in highly available, fault-tolerant environments? Across multiple zones, regions, and clouds providers? We’ve walked the walk. And we have five years of best practices to share. This session will cover best practices and tips and tricks for architecting highly available, fault-tolerant multi-zone and multi-cloud deployments.
The presentation discussed moving applications to the cloud for scalability, flexibility, and pay-as-you-go pricing, noting key differences between RSAWEBCloud and AWS. Challenges for developers include optimizing applications for production environments and handling scaling, which requires separating concerns such as data types and using caching, load balancing, and autoscaling tools.
CloudSave is an object-relational mapping tool for the cloud that provides distributed transactions and optimal data placement across cloud resources while reducing unintended consequences, aiming to be as safe as a traditional database but as fast as the cloud. It is being designed and built using the GigaSpaces application platform to enable big and fast data applications to run reliably at cloud speeds. A demonstration of CloudSave is planned for April.
Tour de Clouds: Understanding Multi-Cloud IntegrationRightScale
Whether you are new to the cloud or a power user, join our discussion about the range of public and private clouds that RightScale supports. We will provide an overview of how and why we integrate with certain clouds, the capabilities of each cloud within RightScale, and how you can leverage these clouds for a variety of use cases.
RightScale overview and why I find it elegantGiri Fox
- RightScale is a cloud management platform that provides abstraction, automation, and governance across public and private clouds.
- It has around 250 staff across offices in several countries, including a new office opened in Australia, and has raised $47M in venture capital.
- RightScale provides a single pane of glass for provisioning, monitoring, and managing infrastructure across multiple cloud providers.
These are the slides of my presentation at the NYC MySQL Meetup on Sep 21 2012. There are tips and tricks about MySQL in the cloud and the SkySQL cloud data suite
RightScale is a cloud management platform company founded in 2006 and headquartered in Santa Barbara, CA with operating subsidiaries around the world. It helps customers easily consume and automate cloud resources through its cloud management platform and professional services. RightScale pioneered cloud management and now manages global cloud deployments for many large customers across multiple cloud providers.
RightScale is a startup cloud management platform provider with 250 employees. It provides tools to manage deployments on multiple public clouds through a single control plane that abstracts differences between clouds. This allows for remote control of configuration, automation, and governance. RightScale has various clients such as Zynga and PBS that use its tools to scale workloads, achieve high availability, and gain visibility into their cloud infrastructure.
WeLab Reaps Advantages of Multi-Cloud Capabilities. You Can Too.NuoDB
Traditional financial institutions are beginning to move critical core banking applications to the cloud, while new challengers in the form of digital-only banks are gaining millions of new accounts. These digital banks must meet their customers’ real-time demands while complying with new and changing regulatory requirements to ensure data privacy, security, and availability. Join us for this webinar to explore a case study featuring WeLab, a new Hong Kong digital bank. Learn how they combined Temenos Transact, a cloud-native, cloud-agnostic core banking solution, with NuoDB’s revolutionary distributed SQL database to:
- Deploy a fault tolerant multi-cluster environment across multiple clouds
- Use microservices, containers, and Kubernetes to increase speed to market
- Reduce TCO with on-demand scalability and built-in continuous availability
RightScale Conference Santa Clara 2011: When getting started with a new technology, it’s helpful to hear the war stories and successes of those who have gone before us. We’re excited that several RightScale customers will share their experiences of how they have achieved agility in the cloud.
Cloud Networking: Network aspects of the cloudSAIL
This document discusses networking aspects of cloud computing. It notes that connectivity between compute and storage resources in clouds is often overlooked. It proposes two approaches: 1) distributing cloud resources across the network and 2) assessing which network domain or data center to place resources in. The document introduces SAIL concepts like OCNI and DCP that aim to provide interfaces for requesting resources and establishing connectivity between distributed clouds in a way that ensures interoperability.
Building cross-region and cross-cloud high availability into your app, a real-life use case by Gigaspaces. Nati Shalom, Founder & CTO, Gigaspaces
Achieving high levels of availability and disaster recovery in a cloud environment requires the implementation of patterns and practices that introduce redundancy through multi-zone, multi-region, and multi-cloud deployments. As we move towards implementing higher availability, we cannot escape the direct increase in the accidental complexity of the deployment architecture resulting from lack of cloud portability and deployment lifecycle automation. We present how high availability and disaster recovery were achieved in reality by using the Cloudify open source framework on top of AWS. This approach applies to not just AWS but also other public clouds and private cloud environments such as Eucalyptus. The resulting reference architecture provides portable PostgreSQL replication and disaster recovery as well as application tier scalability across zones, regions, and public/private clouds through a unified deployment workflow.
This document discusses building private and hybrid clouds using RightScale cloud management tools. It describes RightScale as the #1 cloud management system, managing deployments for over 4 years globally. It defines what a cloud is, where RightScale fits in managing applications and infrastructure across public and private clouds. It highlights benefits like control, flexibility, and performance that customers gain from deploying private and hybrid cloud solutions with RightScale. Finally, it provides an overview of how to easily get started with a RightScale account and deploy server templates across diverse resource pools.
Challenge: The recent success of Docker containers signals the arrival of a new era: the number of CPUs is exploding 10-100 fold, and cloud networking is entering a new round of scalability upgrades.
Question: To scale UP or OUT? I.e., UPgrade or OUTgrade?
Answer from DaoliCloud’s practice: Better to scale OUT, and OpenFlow can help.
The document provides an overview of a Cloud Foundry bootcamp presented by Alvaro Videla. It includes information about the presenter such as his role as a Developer Advocate for Cloud Foundry, his blog and Twitter account. It also outlines the topics that will be covered in the bootcamp, including the basics of how Cloud Foundry works, Micro Cloud Foundry, the capabilities and services offered by Cloud Foundry, and a demo of deploying an application from the command line.
Nimble Storage - The Predicitive Multicloud Flash FabricVITO - Securitas
This document provides information about Nimble Storage and its Predictive MultiCloud Flash Fabric storage solutions. It highlights key features such as predictive analytics, flash storage arrays, data protection, multi-cloud capabilities, and an all-inclusive business model. The document is intended to promote Nimble Storage's products and solutions to potential customers.
Similar to Stacking up with OpenStack: Building for High Availability
In this webinar, we will review all important information for sponsors packages, add-ons, venue details, and how to become a sponsor.
Webinar recording: https://youtu.be/kUjMTNoX6yM
A few quick points for those who may be attending an OpenStack Summit for the first time. We are excited to see you in Barcelona, Spain October 25-28, 2016.
An overview of the 1H2016 OpenStack Marketing Plan shared with the marketing community during our regular calls. Learn more at https://wiki.openstack.org/wiki/Governance/Foundation/Marketing#Open_Marketing_Meetings_2016
The document lists the birthdays of various cities around the world, with dates ranging from June 30 to July 18. Cities celebrating on June 30 include Paris, France and Bucharest, Romania. Cities celebrating on July 1 include Sevilla, Spain, Athens, Greece, and Manila, Philippines. The list continues with over 30 cities across Europe, Asia, Africa, North America, South America, and Oceania celebrating their birthdays on subsequent dates throughout the month of July.
The Foundation marketing team put together a high level overview of 2H 2015 plans in order to get input from the marketing community and provide more information on how marketers can take advantage of the work, as well as get involved and contribute.
This is a content overview of the important information and details for sponsors of the upcoming OpenStack Summit in Tokyo, Japan taking place October 27 - 30.
You can watch a recording of the webinar here: https://openstack.webex.com/openstack/ldr.php?RCID=d48605b7ca9fdccd990ab20eb9334be8
This document provides an update on the OpenStack Cinder Liberty release. It outlines that 19 new volume drivers were added with CI testing, 29 blueprints and 134 bug fixes were completed. New features discussed include nested quotas to manage descendant project quotas, force detach to safely detach stuck volumes, a generic image cache to speed up volume creation from images, and improved migrations. It encourages reviewing the full specifications and provides contacts for more information.
The document summarizes updates to OpenStack Glance from the Kilo to Liberty releases. In Kilo, Glance added features like artifact repository, catalog indexing, image conversion/introspection, and support for multiple datastores in storage drivers. Liberty priorities included additional image listing filters, store refactor/cleaner API, encrypted/authenticated image support, and tag metadata CLI support. It also focused on Glance v3 API evolution and increasing adoption of the v2 API within OpenStack. The presenter invites questions by IRC, email, the OpenStack mailing list tagged [Glance], or the weekly Glance meeting.
The OpenStack Heat project update from July 2015 summarizes the Kilo release and previews plans for the upcoming Liberty release. Key accomplishments of the Kilo release included 74 implemented blueprints, 389 fixed bugs, and over 1100 code commits. New Kilo features improved nested stacks and added template functions, while new resources included alarms, volumes, and identity services. Upcoming changes in Liberty will add resources for encryption, monitoring, and containers, move tests and documentation into the Heat project, and focus on convergence and role-based availability of resources.
Neutron will focus on plugin decomposition in Liberty, improving the API, and enabling quality of service bandwidth limiting. Additional priorities include making the Linuxbridge driver ready for the gate, implementing role-based access control for networks, and integrating NFV and load balancing as a service features.
The summary discusses OpenStack Nova project updates post Liberty-1 release. Key points include:
- Kilo release focused on major architecture evolution, release of API v2.1, and reduced API downtime during upgrades.
- Plans for Liberty include continuing architectural evolution, improvements to API v2.1, reducing upgrade downtime, and making Cells v2 the default configuration.
- Scaling the Nova community focuses on better communication, renewed mentoring and onboarding, continued innovation within scope, and goals of supporting the Nova API ecosystem and improving stability, scalability and upgradability.
4. My relationship with HA (2001)
"How many 9s can your product do?"
Cloud Management #rightscale
5. So what did they mean by 5-9s?
Availability   Allowed Downtime per Year
99%            3.65 days
99.9%          8.76 hours
99.99%         52.56 minutes
99.999%        5.26 minutes
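The downtime figures above follow directly from the availability percentage. A minimal sketch, using the same 365-day year the table assumes:

```python
# Convert an availability percentage into allowed downtime per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year

def allowed_downtime_minutes(availability_pct: float) -> float:
    """Minutes of downtime permitted per year at the given availability."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR

for pct in (99.0, 99.9, 99.99, 99.999):
    print(f"{pct}% -> {allowed_downtime_minutes(pct):.2f} minutes/year")
```

Running this reproduces the table: 99% allows 5256 minutes (3.65 days), while 99.999% allows only about 5.26 minutes.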
11. Golden Age of Cloud Computing
• No up-front capital expense
• Low-cost, self-service infrastructure
• Pay only for what you use
• Easily scale up and down
• Improve agility & time-to-market
12. Golden Age for Fault-Tolerance
• No up-front HA capital expense
• Low-cost, self-service DR infrastructure
• Pay for DR only when you use it
• Easily deliver fault-tolerant applications
• Improve agility & time-to-recovery
13. Yeah, but … what about my private cloud?
Applications deployed in private clouds have to worry about:
• The private cloud infrastructure being HA
• Application architecture HA / DR
• With public clouds, well, you get what your provider gives you
14. Private Cloud Infrastructure HA
There are several single points of failure in an OpenStack deployment:
• OpenStack API services
• MySQL
• RabbitMQ
These are solved in various ways:
• Pacemaker cluster management
• Keepalived (e.g. RAX Private Cloud)
• MySQL (Galera), RabbitMQ (active-active mirrored queues)
Eliminate SPoFs as best as you can.
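To make the API-service case concrete, here is a minimal sketch of fronting one OpenStack API (Keystone here) with HAProxy behind a Keepalived-managed virtual IP. The VIP (10.0.0.100) and controller addresses are illustrative assumptions, not values from the deck:

```
# Minimal sketch: load-balance the Keystone API across two controllers.
# 10.0.0.100 is an assumed VIP that Keepalived/VRRP floats between the
# HAProxy nodes; 10.0.0.11/.12 are assumed controller addresses.
listen keystone_api
    bind 10.0.0.100:5000
    balance roundrobin
    option httpchk GET /
    server controller1 10.0.0.11:5000 check inter 2000 rise 2 fall 3
    server controller2 10.0.0.12:5000 check inter 2000 rise 2 fall 3
```

The same pattern repeats per API endpoint; the health checks take a failed controller out of rotation so the VIP itself stops being a single point of failure.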
15. What about my app?
Design for failure:
• If your application relies on the cloud infrastructure's SLA for its HA needs, you are STUCK with that vendor / infrastructure.
• You need to balance cost and complexity against your risk tolerance.
• Design your application so that it is:
  - Built for server failure
  - Built for zone failure
  - Built for cloud failure
• Keep the management layer separate from the infrastructure.
16. Build for Server Failure
• Set up auto-scaling.
• Set up database mirroring in a master/slave configuration.
• Use static public IPs.
• Use dynamic DNS for private IPs.
17. Build for Zone Failure
[Diagram: DNS with static public IPs (e.g. 172.168.7.31, 172.168.8.62) routes traffic to load balancers in Zone 1 and Zone 2; auto-scaled app servers sit behind them; a master DB in one zone replicates to a slave DB in the other; block-storage snapshots are saved to the object store.]
• Where possible, use a NoSQL DB like Cassandra or MongoDB.
• Place slave databases in one or more zones for failover.
• Snapshot the data volume for backups so the database can be readily recovered within the region.
A creative deployment model would be to make your private cloud an "AZ" by placing it in close physical proximity to a public cloud provider.
18. Build for Cloud Failure (Cold DR) ($)
Staged server configuration and generally no staged data.
• Not recommended if rapid recovery is required.
• Slow to replicate data to the other cloud and bring the database online.
[Diagram: DNS points at the primary (private) cloud, which runs load balancers, app servers, and a replicating master/slave DB; a second cloud (Dallas) holds an idle, staged copy of the stack, with block snapshots shipped to cloud files.]
19. Build for Cloud Failure (Warm DR) ($$)
Staged server configuration, pre-staged data, and a running slave database server.
• Generally the recommended DR solution.
• Minimal additional cost, and allows fairly rapid recovery.
[Diagram: as in the Cold DR setup, but the master DB also replicates to a running slave DB in the secondary cloud, with snapshots taken in both clouds.]
20. Build for Cloud Failure (Hot DR) ($$$)
Parallel deployment with all servers running, but all traffic going to the primary.
• Not recommended.
• Very high additional cost to allow rapid recovery.
[Diagram: the full stack of load balancers, app servers, and replicating databases runs in both clouds, with snapshots in each; only DNS directs traffic to the primary.]
23. Automate and test everything
• Automate backups of your data.
• Set up monitoring and alerts.
• Run fire drills! Plan and practice your recovery procedures!
24. Separate the Management Layer from the Infrastructure
• Keep the keys to the car outside the car.
25. Automating HA and DR
• Use dynamic DNS for your database servers:
  - Allows app servers to use a single FQDN.
  - Use a low TTL to allow rapid failover when the master database changes.
• Automatic connection of app servers to load-balancing servers:
  - App servers connect to all load balancers automatically at launch.
  - No manual intervention, no DNS modifications.
• Automated promotion of slave to master:
  - The process is automated.
  - The decision to run the process is manual.
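The "automated process, manual decision" split for slave promotion can be sketched as follows. The `promote_slave` and `update_dynamic_dns` helpers are hypothetical stand-ins for your own tooling, not RightScale APIs; the FQDN is the example used in the speaker notes:

```python
# Sketch of "automated process, manual decision" slave promotion.
# promote_slave() and update_dynamic_dns() are hypothetical placeholders
# for site-specific tooling.

def promote_slave(slave_host: str) -> None:
    # e.g. stop replication on the slave and make it writable
    print(f"promoting {slave_host} to master")

def update_dynamic_dns(fqdn: str, new_host: str, ttl: int = 60) -> None:
    # a low TTL lets app servers re-resolve the FQDN quickly after failover
    print(f"pointing {fqdn} -> {new_host} (TTL {ttl}s)")

def failover(fqdn: str, slave_host: str, operator_confirmed: bool) -> bool:
    """Run the automated failover steps, but only on explicit operator approval."""
    if not operator_confirmed:
        # The decision stays manual: once you fail over, there is no going back.
        return False
    promote_slave(slave_host)
    update_dynamic_dns(fqdn, slave_host)
    return True

failover("mymaster.mydomain.com", "db-slave-1", operator_confirmed=True)
```

The key design point is the `operator_confirmed` gate: every step after it is automated, but a human still pulls the trigger, which guards against promoting a slave when the master was not actually down.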
28. How RightScale makes it possible: RightScale ServerTemplates™
• Reproducible: predictable deployment.
• Dynamic: configuration from scripts at boot time.
• Multi-cloud: cloud-agnostic and portable.
• Modular: role and behavior abstracted from the cloud infrastructure.
29. How RightScale makes it possible: MultiCloud Images
MultiCloud Images (MCIs) can be launched across regions and clouds without modification.
1. The ServerTemplate contains a list of MCIs.
2. When the server is created, a specific MCI is chosen.
3. The appropriate RightImage is used at launch.
[Diagram: one ServerTemplate maps, via its MCIs, to the matching RightImage in each cloud, giving stability across clouds.]
30. Outage-Proofing Best Practices
• Place load balancers, app servers, and databases in more than one zone.
• Replicate data across zones.
• Back up across regions & clouds.
• Maintain capacity to absorb zone or region failures.
• Monitor, alert, and automate operations to speed up failover.
• Design stateless apps for resilience to reboot / relaunch.
31. Thank you!
Sign up for a free account at: www.rightscale.com
Check out job postings at: www.rightscale.com/jobs
We are hiring!
Editor's Notes
Good afternoon, folks. I hope you are here for the high-availability discussion. In case of an emergency, we have specially arranged a highly available pair of exits to your left and behind you. So, let me tell you a bit about myself and what HA means to me: I am a product manager at RightScale.
My relationship with HA goes all the way back to my kindergarten years, growing up in India. Going into my first big kindergarten exam, I recall worrying about having more than one sharpened pencil in my pencil box, ready to go. (And yes, kindergarteners have exams in India, but that's an entirely different discussion.) Fast forward to my college days, taking my first big 747 flight to California: yes, you guessed it, I worried about the plane having enough engines so that if one of them failed, I wouldn't become fish food in the Pacific Ocean. Fast forward a few more years to my telecommunications days, visiting KDDI and NTT DoCoMo in Japan to discuss our messaging product. They got to the topic of "how many 9s does your product do?" almost immediately, and anything less than 5-9s would not have been an acceptable answer in the heavily regulated Japanese telecommunications market.
A quick definition of how the number of "9s" of availability translates into allowed downtime each year.
Leap forward to 2012: the cloud era is in full swing. Behemoth cloud providers are stamping out VMs like Oreo cookies while preaching the mantra "everything fails all the time". And rightfully so: in 2012 we saw 27 sizable outages across public, private, hosting, and SaaS providers. This is not restricted to cloud computing; the average company has 1 major and 3 minor data-center outages per year, at an average cost of $5K per minute of downtime. Outages are also becoming more public as more people move to the cloud: the first big one was in May 2010, and April 2011 brought one that got a lot of press.
Among the top five causes of outages were power loss, natural disasters, software bugs that cascaded, and operator errors. Even though large-scale outages are rare, they do happen and will continue to happen in the future.
In the aftermath of outages, you see these. Outages are expensive: there is nothing more frustrating to a modern-day consumer than going to a website and seeing that it's down. Every minute of downtime affects your revenue and your brand reputation. Computer Associates did a study last year estimating the cost of outages at about $26 billion a year.
We are in the golden age of cloud computing.
At the end of the day, you are responsible for the HA of your application; the cloud infrastructure only provides tools. Relying on the cloud infrastructure for HA is a recipe for trouble, as it locks you into that cloud. You need portability, so that when you move your application to another cloud it stands on its own merit. Weigh the complexity of HA against the risk, much like auto and home insurance: the cost of HA goes up exponentially as you reduce your tolerance for downtime (recovery time objective) and your tolerance for data loss (recovery point objective).
This is what we generally recommend when someone comes to us and says "I want HA": a three-tiered application with round-robin DNS, load balancers, an array of application servers, and master/slave databases, with at least one of each component in each AZ. Place the slave database in a different zone, so that if one of the zones goes down you will not have an outage; granted, there will be some performance degradation.
During emergencies, time is precious – make sure it works
If both go down, you have nowhere to go. If the disaster hits the management layer, you still have the app; if the disaster hits the app, you can execute your DR scenarios.
Which parts should you automate, and which parts shouldn't you? We always recommend using dynamic DNS for your DB servers. This allows app servers to use a single FQDN (e.g. mymaster.mydomain.com) resolved by the dynamic DNS: in case of a failover, the dynamic DNS is updated automatically and the servers discover the new DB once the (low) TTL expires. We also recommend automating the process of connecting app servers to load balancers, so that when a new app server fires up it registers itself with the load balancer without manual intervention. For slave promotion, the process is automated but the decision to run it is manual: once you push that button there is no going back, so make sure you are certain before you fail over. We have seen a promotion run in a case where the master wasn't really down.
I am representing RightScale today, so a little bit on how RightScale can help. ServerTemplates allow you to pre-configure servers by starting from a base image and adding scripts that run during the boot, operational, and shutdown phases of a server instance. The key benefit of a ServerTemplate is that it gives you an easily reproducible server setup, and this can be done across multiple clouds. Through the configuration mechanism built into ServerTemplates, servers can automatically join load-balancer pools, autoscale across zones, and so on.
A ServerTemplate contains a list of multi-cloud images; when a server is created, the appropriate image is used: quickly, efficiently, and repeatably.