Technical overview of how SUSE OpenStack Cloud uses Chef to implement highly available OpenStack infrastructure services.
Target audience: curious developers in the upstream openstack-chef community
These slides were extracted from internal HA training for SUSE OpenStack Cloud developers, and slightly modified for the benefit of the openstack‐chef community.
The primary requirement for OpenStack-based clouds (public, private or hybrid) is that they must be massively scalable and highly available. A number of interrelated concepts make HA complex to understand and implement, and the consequences of implementing HA incorrectly could be disastrous.
This session was presented at the OpenStack Meetup in Boston in February 2014. We discussed the interrelated concepts that form the basis for implementing HA, along with examples of HA for MySQL, RabbitMQ and the OpenStack APIs, primarily using Keepalived, VRRP and HAProxy, to reinforce the concepts and show how to connect the dots.
2. RACKSPACE® HOSTING | WWW.RACKSPACE.COM
Agenda
• What is HA
• Compute HA
• Controller HA
• Corosync, Pacemaker and DRBD
• Galera
• HAProxy, keepalived, VRRP
• Resources and Summary
High Availability
Minimize data loss
Minimize system downtime
High Availability Concepts
• Stateless services
– There is no dependency between requests
– For example: Nova API, Nova Scheduler, etc.
• Stateful services
– An action typically comprises multiple requests
– For example: MySQL, RabbitMQ, etc.
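The distinction above can be illustrated with a small sketch. It is only a toy model: `stateless_add` and `StatefulCounter` are hypothetical names invented for this example and are not part of any OpenStack service. The point is that a stateless call gives the same answer on any replica, while a stateful sequence breaks when consecutive requests land on replicas that do not share state.

```python
def stateless_add(a, b):
    """Stateless: the result depends only on the request itself."""
    return a + b


class StatefulCounter:
    """Stateful: each request builds on what earlier requests left behind."""

    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n
        return self.total


# Any replica can serve a stateless request and give the same answer.
stateless_results = [stateless_add(5, 3) for _replica in range(2)]

# Two unsynchronized replicas of a stateful service (hypothetical setup).
replicas = [StatefulCounter(), StatefulCounter()]
first = replicas[0].add(5)    # request 1 lands on replica 0
second = replicas[1].add(3)   # request 2 lands on replica 1, which never saw request 1

print(stateless_results, first, second)
```

Because the second replica never saw the first request, the stateful action returns 3 where a client would expect 8. This is why stateful services need replication or failover rather than plain load balancing.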
High Availability Concepts
• Active/Passive
– Redundant instances of stateless services are load balanced
– For Stateful services a replacement resource can be brought online.
• Active/Active
– Redundant instances of stateless services are load balanced
– Stateful services are managed in such a way that services are redundant, and that all instances have an identical state.
– Updates to one instance of a database would also update all other instances.
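As a rough illustration of the two modes, the sketch below models load balancing across redundant instances and failover to a standby. The `Instance` class and both helper functions are assumptions made for this example. They do not reflect how HAProxy or Pacemaker are implemented.

```python
from itertools import cycle


class Instance:
    """Hypothetical service instance with a health flag."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy


def active_active_targets(instances, n_requests):
    """Spread requests round-robin across every healthy instance."""
    healthy = [i for i in instances if i.healthy]
    if not healthy:
        raise RuntimeError("no healthy instance available")
    rr = cycle(healthy)
    return [next(rr).name for _ in range(n_requests)]


def active_passive_target(primary, standby):
    """Use the primary while it is healthy, otherwise fall back to the standby."""
    if primary.healthy:
        return primary.name
    if standby.healthy:
        return standby.name
    raise RuntimeError("no healthy instance available")


# Load-balanced API instances; one has failed.
api = [Instance("api-1"), Instance("api-2"), Instance("api-3")]
api[1].healthy = False
active_active_result = active_active_targets(api, 4)

# Stateful service with a standby that replaces the failed primary.
db_primary, db_standby = Instance("db-1"), Instance("db-2")
before = active_passive_target(db_primary, db_standby)
db_primary.healthy = False
after = active_passive_target(db_primary, db_standby)

print(active_active_result, before, after)
```

In the active/active case the failed instance simply drops out of the rotation. In the active/passive case the standby takes over only once the primary is marked unhealthy.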
Server Evacuation
• Without Shared Storage
– The instance will be booted from a new disk, but will preserve the configuration, e.g. id, name, uid, ip, etc.
• With Shared Storage
– The instance will be booted from the same disk and data will be preserved
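The two evacuation cases can be sketched as a small model. This is not the Nova API: `InstanceRecord`, `evacuate`, and the disk naming are hypothetical and exist only to show which attributes survive in each case.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class InstanceRecord:
    """Hypothetical view of an instance: identity, placement and disk."""
    id: str
    name: str
    ip: str
    host: str
    disk: str


def evacuate(instance, target_host, shared_storage):
    """Rebuild the instance on target_host.

    Identity and configuration (id, name, ip) are always preserved.
    The disk, and therefore the data, survives only with shared storage.
    """
    disk = instance.disk if shared_storage else f"new-disk-for-{instance.id}"
    return replace(instance, host=target_host, disk=disk)


vm = InstanceRecord(id="42", name="web-1", ip="10.0.0.5",
                    host="compute-1", disk="vol-web-1")

with_shared = evacuate(vm, "compute-2", shared_storage=True)
without_shared = evacuate(vm, "compute-2", shared_storage=False)

print(with_shared)
print(without_shared)
```

In both cases the instance keeps its id, name and IP on the new host. Only the shared-storage case keeps the original disk.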
Virtualization vs. Cloud
• Virtualization needs care and feeding
– Name the VM
– Tune and groom regularly
– Feed it with good food and supplements
– Take to the vet when sick
• Cloud servers are disposable
– VMs are not unique
– Tune and groom apps not the cows
– Keep the cow upright
– Shoot the cow when it is sick
Scale Up vs. Scale Out
Figure panels: Traditional, Cloud
Pacemaker, Corosync and DRBD
• Pacemaker
– High availability and load balancing stack for the Linux platform.
– Interacts with applications through Resource Agents (RA)
• Corosync
– Totem single-ring ordering and membership protocol
– Provides UDP and InfiniBand based messaging, quorum, and cluster membership to Pacemaker.
• DRBD (Distributed Replicated Block Device)
– Synchronizes data at the block device level
– Uses a journaling file system (such as ext3 or ext4)
Galera
• Synchronous multi-master cluster technology for MySQL/InnoDB
– MySQL patched for wsrep (Write Set REPlication)
– Active/active multi-master topology
– Read and write to any cluster node
– True parallel replication, at row level
– No slave lag or integrity issues
Keepalived, HAProxy and VRRP
• HAProxy
– Load Balancing and Proxying for HTTP and TCP Applications
– Works over multiple connections
– Used to load balance API services
• VRRP (Virtual Router Redundancy Protocol)
– Eliminates SPOF in a static default routed environment
• Keepalived
– Based on Linux Virtual Server (IPVS) kernel module to provide layer 4 Load Balancing
– Implements a set of checkers to check service status and to maintain health
– Leverages the VRRP protocol to remap VIPs in the event of failure
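The VIP failover idea can be sketched as a simple election: among the routers that are still alive, the one with the highest priority holds the virtual IP. This is a simplified model, not Keepalived's implementation. The `elect_master` function and the router names are hypothetical, and real VRRP details such as advertisement timers, preemption settings and tie-breaking are left out.

```python
def elect_master(routers):
    """Return the name of the router that should hold the VIP, or None.

    routers: list of dicts with 'name', 'priority' and 'alive' keys.
    The highest-priority router among those still alive wins.
    """
    alive = [r for r in routers if r["alive"]]
    if not alive:
        return None
    return max(alive, key=lambda r: r["priority"])["name"]


# Two load balancer nodes sharing one virtual IP (hypothetical values).
routers = [
    {"name": "lb-1", "priority": 150, "alive": True},
    {"name": "lb-2", "priority": 100, "alive": True},
]

normal = elect_master(routers)        # lb-1 holds the VIP

routers[0]["alive"] = False           # lb-1 fails
failed_over = elect_master(routers)   # the VIP is remapped to lb-2

routers[1]["alive"] = False           # both nodes down
nobody = elect_master(routers)

print(normal, failed_over, nobody)
```

Clients keep using the same virtual IP throughout. Only the node answering for it changes.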
Sample OpenStack HA Architecture
Diagram labels: Availability Zone 1 and Availability Zone 2; Dedicated Firewalls; Load Balancers; Redundant Network Switches; Inside LB VLAN; Storage Network (private); Fixed Network (private).
Controller (shown in each zone): API Services (API & Horizon, Cinder API, Nova Scheduler, Keystone, Glance), RabbitMQ, MySQL.
Chef Server (shown in each zone): Recipes.
Storage: EMC, NetApp, or SolidFire volumes.
Compute 1 to Compute N (KVM) in each zone, hosting guests G1 to G7 and G11 to G17.
Nodes use bonded interfaces BOND 0, BOND 1 and BOND 2.
Comparison
Database | Replication method | Strengths | Weakness/Limitations
Keepalived/HAProxy/VRRP | Works on MySQL master-master replication | Simple to implement and understand. Works for any storage system. | Master-master replication does not work beyond 2 nodes.
Pacemaker/Corosync/DRBD | Mirroring on block devices | Well tested | More complex to set up. Split-brain possibility.
Galera | Based on write-set replication (wsrep) | No slave lag | Needs at least 3 nodes. Relatively new.
Others | MySQL Cluster, RHCS with DAS/SAN storage | Well tested | More complex setup.
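The "split brain" and "at least 3 nodes" entries both come down to majority quorum: a partition may keep serving only if it can see more than half of the cluster. The sketch below is a minimal illustration of that arithmetic, assuming one vote per node. `has_quorum` is a hypothetical helper, not a Pacemaker or Galera function.

```python
def has_quorum(reachable, cluster_size):
    """True if this partition sees a strict majority of the cluster."""
    return reachable > cluster_size // 2


# Two-node cluster split 1/1: neither side has a majority, so either both
# stop or, if quorum is ignored, both keep writing (split brain).
two_node_split = (has_quorum(1, 2), has_quorum(1, 2))

# Three-node cluster split 2/1: exactly one side keeps a majority.
three_node_split = (has_quorum(2, 3), has_quorum(1, 3))

print(two_node_split, three_node_split)
```

With three nodes, any single failure or network split still leaves one side with a majority. A two-node cluster cannot guarantee this.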
Resources
• OpenStack
– openstack.org
– launchpad.net/openstack
– #openstack
– #openstack on webchat.freenode.net
• OpenStack HA
– http://docs.openstack.org/trunk/openstack-ha/openstack-ha-guide-trunk.pdf
– https://github.com/rcbops-cookbooks/
• MySQL HA
– http://www.mysql.com/why-mysql/white-papers/mysql-high-availability-drbd-configuration-deployment-guide/
– http://dev.mysql.com/doc/refman/5.7/en/ha-overview.html
– https://www.hastexo.com/
– http://www.drbd.org/
For More Information
You can reach me at:
Kenneth Hui
Open Cloud Architect
Rackspace
E-mail: ken.hui@rackspace.com
Twitter: @hui_kenneth
Blog: http://cloudarchitectmusings.com