The document discusses distributed database systems and properties of the Riak database. It defines distributed systems and discusses key aspects like availability, fault tolerance, and latency. It explains Riak's masterless architecture and how it provides high availability and scalability through horizontal scaling on commodity servers. The document also covers consistency models and how Riak allows tuning availability and consistency based on use cases.
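The tunable consistency mentioned above can be sketched with Riak-style quorum arithmetic. This is an illustrative sketch, not Riak's API: the function name and values are assumptions, but the underlying rule (R + W > N guarantees read/write quorum overlap) is the standard one Riak's tunable parameters are built on.

```python
# Hypothetical sketch of Riak-style tunable quorums (names are illustrative).
# For N replicas, a read quorum R and write quorum W with R + W > N
# guarantee that every read overlaps the most recent successful write.

def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Return True if read and write quorums are guaranteed to intersect."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("quorum sizes must be between 1 and N")
    return r + w > n

# Tuning trades consistency against availability and latency.
print(quorums_overlap(n=3, r=2, w=2))  # True  -> reads see the latest write
print(quorums_overlap(n=3, r=1, w=1))  # False -> fast, but may read stale data
```

Lowering R and W keeps the cluster responsive when nodes fail, at the cost of possibly stale reads; raising them does the opposite, which is the per-use-case tuning the document describes.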
Building A Diverse Geo-Architecture For Cloud Native Applications In One Day (VMware Tanzu)
Presenter: Ben Laplanche, Product Manager, Pivotal Cloud Foundry
Companies turn to PaaS and Cloud Native Applications to gain agility and speed. To provide customer value, a fault tolerant infrastructure is essential. But what happens if an entire data center, region, or even country should go offline? Cassandra holds the key to keeping application state in sync through replication, whilst Pivotal Cloud Foundry provides easy deployment to multiple IaaS providers. It also comes complete with a managed service offering for DataStax Enterprise. This talk will discuss how this setup can be deployed in one day, including demonstrations and a walkthrough of the key concepts, approaches, and considerations.
Join the product and cloud computing leaders of Netflix to discuss why and how the company moved to Amazon Web Services. From early experiments for media transcoding, to building the operational skills to optimize costs and the creation of the Simian Army, this session guides business leaders through real world examples of evaluating and adopting cloud computing.
NetflixOSS Meetup S6E1 - Titus & Containers (aspyker)
Come hear about our container management platform, Titus. Titus launches over 2 million containers per week for service and batch workloads. Come learn which applications are powered by Titus and what value developers get from containers. We will also cover some of Titus's unique approaches to reliability, control plane, scheduling, and container runtime technologies, as well as our integrations with Netflix systems such as Spinnaker and with Amazon concepts such as VPC and IAM.
https://www.meetup.com/Netflix-Open-Source-Platform/events/247776324/
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons (aspyker)
Disenchantment is a Netflix show following the medieval misadventures of a hard-drinking princess, her feisty elf, and her personal demon. In this talk, we will follow the story of Netflix's container management platform, Titus, which powers critical aspects of the Netflix business (video encoding & streaming, big data, recommendations & machine learning, and other workloads). We'll cover the challenges of growing Titus from 10s to 1000s of workloads, our feisty team's work across container runtimes, scheduling & the control plane, and cloud infrastructure integration, and the demons we've found along the way in operability, security, reliability, and performance.
Engineering Leader opportunity @ Netflix - Playback Data Systems (Philip Fisher-Ogden)
Across the globe, 75M Netflix members love watching 125M hours per day of TV shows and movies. They love the ease of starting on one device and resuming on another, and the Playback Data Systems team makes that happen. We’re looking for a senior engineering manager to lead this high-impact team at Netflix.
Attributions for images:
https://www.flickr.com/photos/theholyllama/5738164504/ and https://www.flickr.com/photos/brewbooks/7780990192/, no changes made, https://creativecommons.org/licenses/by-sa/2.0/
https://www.flickr.com/photos/crschmidt/2956721498/, no changes made, https://creativecommons.org/licenses/by/2.0/
The Last Pickle: Distributed Tracing from Application to Database (DataStax Academy)
Monitoring provides information on overall system performance; however, tracing is necessary to understand the performance of individual requests. Detailed query tracing has been provided by Cassandra since version 1.2 and is invaluable when diagnosing problems, although knowing which queries to trace, and why the application makes them, still requires deep technical knowledge. By merging application tracing via Zipkin with Cassandra query tracing, we automate the process and make it easier to identify and resolve problems. In this talk Mick Semb Wever, Team Member at The Last Pickle, will introduce Cassandra query tracing and Zipkin. He will then propose an extension that allows clients to pass a trace identifier through to Cassandra, and a way to integrate Zipkin tracing into Cassandra. Driving all this is the desire to create one tracing view across the entire system.
In League of Legends, just as in any competitive team game, communication is essential to success. Therefore, when building Chat for the game we had to make sure that the new service would be absolutely rock solid in every respect. This includes not only guaranteed message delivery and consistent presence propagation across the system, but also maintenance of the created social network graph.
In this talk I would like to present how we achieved linear scalability for Chat, improved its overall fault tolerance, and got ready for the new features we wanted to ship. I will also discuss in detail why we migrated our data from MySQL to Riak and how we used CRDTs to deal with conflicting object updates.
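The CRDT approach mentioned above can be illustrated with a minimal grow-only counter. This is a generic sketch of the merge idea, not Riot's or Riak's actual implementation: concurrent updates are reconciled with a per-node element-wise maximum, so replicas converge without losing increments.

```python
# Minimal, illustrative G-Counter (grow-only counter) CRDT sketch.
# Each replica tracks increments per node; merging takes the per-node max,
# which is commutative, associative, and idempotent, so conflicting
# concurrent updates always converge to the same value.

class GCounter:
    def __init__(self):
        self.counts = {}  # node_id -> increments observed at that node

    def increment(self, node_id, by=1):
        self.counts[node_id] = self.counts.get(node_id, 0) + by

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        """Element-wise max over both replicas' per-node counts."""
        merged = GCounter()
        for node in set(self.counts) | set(other.counts):
            merged.counts[node] = max(self.counts.get(node, 0),
                                      other.counts.get(node, 0))
        return merged

# Two replicas take conflicting updates, then converge on merge.
a, b = GCounter(), GCounter()
a.increment("node-a"); a.increment("node-a")
b.increment("node-b")
assert a.merge(b).value() == b.merge(a).value() == 3
```

Riak ships richer CRDTs (sets, maps, flags) built on the same convergence property, which is what makes them a good fit for conflicting object updates in a masterless store.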
Beaming Flink to the cloud @ Netflix FF 2016 (Monal Daxini)
Netflix is a data-driven company and we process over 700 billion streaming events per day with at-least-once processing semantics in the cloud. To make it easy to extract intelligence from this unbounded stream, we are building Stream Processing as a Service (SPaaS) infrastructure so that users can focus on extracting value rather than on boilerplate infrastructure and scale.
We will share our experience in building a scalable SPaaS using Flink, Apache Beam and Kafka as the foundation layer to process over 1.3 PB of event data without service disruption.
Netflix viewing data architecture evolution - EBJUG Nov 2014 (Philip Fisher-Ogden)
Netflix's architecture for viewing data has evolved as streaming usage has grown. Each generation was designed for the next order of magnitude, and was informed by learnings from the previous. From SQL to NoSQL, from data center to cloud, from proprietary to open source, look inside to learn how this system has evolved. (slides from a talk given at the East Bay Java Users Group MeetUp in Nov 2014)
In this episode, we will focus on continuous delivery and how Netflix uses Spinnaker and Kayenta to safely deliver changes to the cloud and beyond. Kayenta is a platform for Automated Canary Analysis (ACA). It is used by Spinnaker to enable automated canary deployments. We will also discuss how Spinnaker is used at Netflix to deploy targets beyond cloud VMs and containers --- batch jobs, CDNs, fast properties and Open Connect appliances.
Healthcare data comes in many shapes and sizes, making ingestion difficult for a variety of batch and near-real-time use cases. By evolving its architecture to adopt Apache Kafka, Cerner was able to build a modular architecture for current and future use cases. By reviewing the evolution of Cerner's usage, developers can avoid mistakes and set themselves up for success.
Keystone processes over 1 trillion events per day with at-least-once processing semantics in the cloud. We will explore in detail how we have modified and leveraged Kafka, Samza, Docker, and Linux at scale to implement a multi-tenant pipeline in the Amazon AWS cloud within a year.
Most Cassandra usages take advantage of its exceptional performance and ability to handle massive data sets. At PagerDuty, we use Cassandra for entirely different reasons: to reliably manage mutable application state and to maintain durability requirements even in the face of full data center outages. We achieve this by deploying Cassandra clusters with hosts in multiple WAN-separated data centers, configured with per-data-center replica placement requirements, and with significant application-level support to use Cassandra as a consistent datastore. Over several years of experience with this approach, we've learned how to accommodate the impact of WAN network latency on Cassandra queries, how to horizontally scale while maintaining our placement invariants, why asymmetric load is experienced by nodes in different data centers, and more. This talk will go over our workload and design goals, detail the resultant Cassandra system design, and explain a number of our unintuitive operational learnings about this novel Cassandra usage paradigm.
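The per-data-center replica placement described above can be sketched with simple quorum arithmetic. The function and replica counts below are illustrative assumptions, not PagerDuty's actual configuration: they only show why the placement of replicas across data centers determines whether a global quorum survives the loss of a whole data center.

```python
# Illustrative sketch: with Cassandra's NetworkTopologyStrategy placing a
# fixed number of replicas in each data center, a QUORUM of all replicas
# may or may not remain reachable after losing an entire data center.

def quorum_size(total_replicas: int) -> int:
    """Cassandra QUORUM: a strict majority of all replicas."""
    return total_replicas // 2 + 1

def survives_dc_loss(replicas_per_dc: dict) -> bool:
    """True if QUORUM is still reachable after losing the largest DC."""
    total = sum(replicas_per_dc.values())
    worst_loss = max(replicas_per_dc.values())
    return total - worst_loss >= quorum_size(total)

# Three DCs with 2 replicas each: total=6, quorum=4, losing a DC leaves 4. OK.
assert survives_dc_loss({"dc1": 2, "dc2": 2, "dc3": 2}) is True
# Two DCs with 3 replicas each: total=6, quorum=4, losing a DC leaves 3. Not OK.
assert survives_dc_loss({"dc1": 3, "dc2": 3}) is False
```

This is why spreading replicas across three or more data centers, rather than two, is the usual pattern when quorum reads and writes must continue through a full data-center outage.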
Streaming in Practice - Putting Apache Kafka in Production (confluent)
This presentation focuses on how to integrate Apache Kafka and its surrounding components into an enterprise environment, and what you need to consider as you move into production.
We will touch on the following topics:
- Patterns for integrating with existing data systems and applications
- Metadata management at enterprise scale
- Tradeoffs in performance, cost, availability and fault tolerance
- Choosing which cross-datacenter replication patterns fit with your application
- Considerations for operating Kafka-based data pipelines in production
Introduction To Streaming Data and Stream Processing with Apache Kafka (confluent)
Modern businesses have data at their core, and this data is changing continuously. How can we harness this torrent of continuously changing data in real time? The answer is stream processing, and one system that has become a core hub for streaming data is Apache Kafka.
This presentation will give a brief introduction to Apache Kafka and describe its usage as a platform for streaming data. It will explain how Kafka serves as a foundation for both streaming data pipelines and applications that consume and process real-time data streams. It will introduce some of the newer components of Kafka that help make this possible, including Kafka Connect, a framework for capturing continuous data streams, and Kafka Streams, a lightweight stream processing library.
This is talk 1 out of 6 from the Kafka Talk Series.
http://www.confluent.io/apache-kafka-talk-series/introduction-to-stream-processing-with-apache-kafka
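The stream-into-table idea behind Kafka Streams can be sketched in a few lines of plain Python. Kafka Streams itself is a Java library; `word_count` here is an illustrative stand-in showing how a stateful operator consumes an unbounded stream of records and maintains a continuously updated table of counts.

```python
# Pure-Python sketch of the canonical stream-processing word count:
# each incoming record updates per-word state, and every update is
# emitted downstream, like a changelog of the evolving counts table.

from collections import Counter

def word_count(stream):
    """Consume (key, line) records; yield the updated count per word."""
    counts = Counter()
    for _key, line in stream:
        for word in line.lower().split():
            counts[word] += 1
            yield word, counts[word]  # emit one changelog record per update

records = [(None, "hello world"), (None, "hello streams")]
updates = list(word_count(records))
assert updates == [("hello", 1), ("world", 1), ("hello", 2), ("streams", 1)]
```

In Kafka Streams the same shape appears as a KStream aggregated into a KTable, with the state and its changelog durably backed by Kafka topics rather than held in a local dict.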
A talk given on 2018-06-16 in HK Open Source Conference 2018.
The rise of Apache Kafka has started a new generation of data pipeline: the stream-processing pipeline.
In this talk, Dr. Mole Wong will walk you through the concept of the stream-processing data pipeline, and how this data pipeline can be set up. He will also discuss the use cases of such a data pipeline.
Using Kubernetes to deliver a “serverless” service (DoKC)
Serverless promises to change the way we consume software. It allows us to pay only for what we use, and it can help drive operational costs down to the minimum resources necessary.
Architecting for serverless requires a fresh look at application logic and the way it is deployed, combining the logical and physical worlds. An architectural pattern has emerged where ephemeral compute can scale separately from the services that need to persist.
We use Kubernetes to deliver exactly this. A “serverless” experience that is driven and enabled by compute pods and storage pods. We also have used our experience running thousands of database clusters on Kubernetes to automate the operational expertise of managing a distributed database.
In this talk, we will take a deep dive into the architecture of our application and share:
* A definition and outline of the challenges of serverless
* How we reworked our logic for a serverless approach
* How we use Kubernetes to gain serverless autoscaling
This talk was given by Jim Walker for DoK Day Europe @ KubeCon 2022.
Neutron Done the SDN Way
Dragonflow is an open source distributed control plane implementation of Neutron, which is an integral part of OpenStack. Dragonflow introduces innovative solutions and features to implement networking and distributed network services in a manner that is both lightweight and simple to extend, yet targeted at performance-intensive and latency-sensitive applications. Dragonflow aims to solve Neutron's performance and scalability challenges.
Patterns and Pains of Migrating Legacy Applications to Kubernetes (QAware GmbH)
Open Source Summit 2018, Vancouver (Canada): Talk by Josef Adersberger (@adersberger, CTO at QAware), Michael Frank (Software Architect at QAware) and Robert Bichler (IT Project Manager at Allianz Germany)
Abstract:
Running applications on Kubernetes can provide a lot of benefits: more dev speed, lower ops costs, and higher elasticity & resiliency in production. Kubernetes is the place to be for cloud-native apps. But what if you have no shiny new cloud-native apps, just a whole bunch of JEE legacy systems? No chance to leverage the advantages of Kubernetes? Yes you can!
We’re facing the challenge of migrating hundreds of JEE legacy applications of a German blue chip company onto a Kubernetes cluster within one year.
The talk will be about the lessons we've learned - the best practices and pitfalls we've discovered along our way.
Big Data LDN 2018: CHARTING ZERO LATENCY FUTURE WITH REDIS (Matt Stubbs)
Date: 13th November 2018
Location: Fast Data Theatre
Time: 10:30 - 11:00
Speaker: Manish Gupta
Organisation: Redis Labs
About: We live in a world of instant expectations, and the technologies underpinning our applications must meet this new demand. This session reviews the megatrends influencing modern application architecture. The talk will discuss the use of elements within Redis, the fastest in-memory multi-model database, to support use cases like recommendation engines and personalization that rely on a combination of high-speed analytics and transactions occurring at the same time. Data structures for simultaneous transactions and analytics, probabilistic counting mechanisms, and adaptable machine learning models will be explored. Real-world examples and customer implementations will also be shared. Cost-effectiveness and efficiency of database operations will also be covered in the context of practical enterprise requirements to compete in today's big data world.
Supporting Hadoop in containers takes much more than the very primitive support Docker provides via its storage plugin. A production-scale Hadoop deployment inside containers needs to honor affinity/anti-affinity, fault-domain, and data-locality policies. Kubernetes alone, with primitives such as StatefulSets and PersistentVolumeClaims, is not sufficient to support a complex, data-heavy application such as Hadoop. One needs to think about this problem more holistically across the container, networking, and storage stacks. Also, constructs around deployment, scaling, upgrades, etc. in traditional orchestration platforms are designed for applications that have adopted a microservices philosophy, which doesn't fit most big data applications across the ingest, store, process, serve, and visualization stages of the pipeline. Come to this technical session to learn how to run and manage the lifecycle of containerized Hadoop and other applications in the data analytics pipeline efficiently and effectively, far beyond simple container orchestration. #BigData, #NoSQL, #Hortonworks, #Cloudera, #Kafka, #Tensorflow, #Cassandra, #MongoDB, #Kudu, #Hive, #HBase. PARTHA SEETALA, CTO, Robin Systems.
AWS re:Invent 2016: Moving Mission Critical Apps from One Region to Multi-Reg... (Amazon Web Services)
In gaming, low latency and connectivity are the bare minimum expectations users have while playing online on PlayStation Network. Alex and Dustin share key architectural patterns for providing low-latency, multi-region services to global users. They discuss their testing methodologies and how to programmatically map out the dependencies of a large multi-region deployment with data-driven techniques. The patterns shared show how to adapt to changing bottlenecks and sudden spikes of several million requests. You'll walk away with several key architectural patterns that can serve users at global scale while being mindful of costs.
The Crown Jewels: Is Enterprise Data Ready for the Cloud? (Inside Analysis)
The Briefing Room with Dr. Robin Bloor and NuoDB
Live Webcast on March 25, 2014
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=ac6cb15c0aaaa6d044784969e4187696
Enterprise organizations are already deeply embedded in the cloud, whether it’s via Salesforce.com for customer relationship management or Marketo for marketing and lead generation. But frequently the most significant impediment to moving the crown jewels of corporate data to the cloud is the database. A cloud database must be secure, flexible enough to solve a variety of problems, easy to automate and administer, and able to run in multiple cloud data centers simultaneously. Plus, it should be consistently resilient in the face of failure, not to mention cost-effective, just like the cloud itself.
Register for this episode of The Briefing Room to hear veteran Analyst Dr. Robin Bloor as he explains how cloud deployments are the inevitable next step for information management. He will be joined by Jim Starkey, co-founder of NuoDB, who will discuss the common reasons enterprises shy away from leveraging a database in the cloud, as well as how next generation DBMS, purpose-built for the cloud, can create strategic organizational advantage.
Visit InsideAnalysis.com for more information.
In this slide deck, we explore today's database landscape and the common Lego blocks used to build these different flavours of databases. We dive into the internals of a database, explore some design choices, and toward the end examine some real-world database architectures in light of the concepts (Legos) explored earlier.
DataStax C*ollege Credit: What and Why NoSQL? (DataStax)
In the first of our bi-weekly C*ollege Credit series, Aaron Morton, DataStax MVP for Apache Cassandra and Apache Cassandra committer, and Robin Schumacher, VP of product management at DataStax, will take a look back at the history of NoSQL databases and provide a foundation of knowledge for people looking to get started with NoSQL, or just wanting to learn more about this growing trend. You will learn how to know whether NoSQL is right for your application, and how to pick a NoSQL database. This webinar is C* 101 level.
Webinar Slides: Geo-Distributed MySQL Clustering Done Right! (Continuent)
With Multiple Active Primary MySQL Databases
Watch this on-demand webinar to learn the right way to deploy geo-distributed databases. We look at the pitfalls of deploying a single site and passive sites, and from there we show how to provide the best user experience by leveraging geo-distributed MySQL.
When considering geo-distributed MySQL database environments it is important to understand the nuances of having multiple active clusters deployed across sites and clouds. This webinar walks through the proper planning of geo-distributed MySQL for success.
Finally, you’ll learn about our best practices for multiple primary clusters, as well as failover and disaster recovery for MySQL.
AGENDA
- Why Geo-Distributed Databases
- Geo-Distributed MySQL Starts With High Performance Local Clusters
- Extend The Cluster To Multiple Datacenters/Clouds
- Best Practices For Multiple Primary Clusters
- Failover & Disaster Recovery
- Key Benefits
PRESENTER
Matthew Lang, Customer Success Director – Americas, Continuent, has over 25 years of experience in database administration, database programming, and system architecture, including the creation of a database replication product that is still in use today. He has designed highly available, scalable systems that have allowed startups to quickly become enterprise organizations, utilizing a variety of technologies including open source projects, virtualization, and cloud.
Using Software-Defined WAN implementation to turn on advanced connectivity se... (RedHatTelco)
A presentation given by Red Hat and Juniper at OpenStack Summit Boston 2017 on May 7, 2017
How OpenStack enable SDN and NFV to easily work together
https://www.openstack.org/videos/boston-2017/using-software-defined-wan-implementation-to-turn-on-advanced-connectivity-services-in-openstack
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera ClusterContinuent
Galera Cluster vs. Continuent Tungsten Clusters
Building a Geo-Scale, Multi-Region and Highly Available MySQL Cloud Back-End
This second installment of our High Noon series of on-demand webinars is focused on Galera Cluster (including MariaDB Cluster & Percona XtraDB Cluster). It looks at some of the key characteristics of Galera Cluster and how it fares as a MySQL HA / DR / Geo-Scale solution, especially when compared to Continuent Tungsten Clustering.
Watch this webinar to learn how to do better MySQL HA / DR / Geo-Scale.
AGENDA
- Goals for the High Noon Webinar Series
- High Noon Series: Tungsten Clustering vs Others
- Galera Cluster (aka MariaDB Cluster & Percona XtraDB Cluster)
- Key Characteristics
- Certification-based Replication
- Galera Multi-Site Requirements
- Limitations Using Galera Cluster
- How to do better MySQL HA / DR / Geo-Scale?
- Galera Cluster vs Tungsten Clustering
- About Continuent & Its Solutions
PRESENTER
Matthew Lang - Customer Success Director – Americas, Continuent - has over 25 years of experience in database administration, database programming, and system architecture, including the creation of a database replication product that is still in use today. He has designed highly available, scalable systems that have allowed startups to quickly become enterprise organizations, utilizing a variety of technologies including open source projects, virtualization, and cloud.
This presentation, given by Dave Rosenthal at NoSQL Now! 2013, presents the case for why he believes NoSQL databases will need to support ACID transactions in order for developers to more easily build, deploy, and scale applications in the future.
Using Kubernetes to deliver a “serverless” serviceDoKC
Link: https://youtu.be/C4rlepOPk5o
https://go.dok.community/slack
https://dok.community/
From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE)
Serverless promises to change the way we consume software. It allows us to potentially pay for only that which we use and can help drive down operational costs to the minimal amount of resources necessary.
Architecting for serverless requires a unique look at app logic and the way it is deployed. It takes a combination of the logical and physical worlds. An architectural pattern has emerged where we can scale ephemeral compute separate from services that need to persist.
We use Kubernetes to deliver exactly this. A “serverless” experience that is driven and enabled by compute pods and storage pods. We also have used our experience running thousands of database clusters on Kubernetes to automate the operational expertise of managing a distributed database.
In this talk, we will take a dive deep into the architecture of our application and share:
* A definition and outline of the challenges of serverless
* How we reworked our logic for a serverless approach
* How we use Kubernetes to gain serverless autoscaling
-----
Jim is a recovering developer turned evangelist who loves useful, cool, cutting-edge tech. He loves to translate and distill complex concepts into compelling, more simple explanations that broader communities can consume. He is an advocate of the developer and an active participant in several open source communities.
ScyllaDB Open Source 5.0 is the latest evolution of our monstrously fast and scalable NoSQL database – powering instantaneous experiences with massive distributed datasets.
Join us to learn about ScyllaDB Open Source 5.0, which represents the first milestone in ScyllaDB V. ScyllaDB 5.0 introduces a host of functional, performance and stability improvements that resolve longstanding challenges of legacy NoSQL databases.
We’ll cover:
- New capabilities including a new IO model and scheduler, Raft-based schema updates, automated tombstone garbage collection, optimized reverse queries, and support for the latest AWS EC2 instances
- How ScyllaDB 5.0 fits into the evolution of ScyllaDB – and what to expect next
- The first look at benchmarks that quantify the impact of ScyllaDB 5.0's numerous optimizations
This will be an interactive session with ample time for Q & A – bring us your questions and feedback!
Time series data is proliferating with literally every step we take – just think of things like Fitbit bracelets that track your every move, and financial trading data, all of which is timestamped.
Time series data requires high performance reads and writes even with a huge number of data sources. Both speed and scale are integral to success, which makes for a unique challenge for your database.
A time series NoSQL data model requires flexibility to support unstructured, and semi-structured data as well as the ability to write range queries to analyze your time series data. So how can you tackle speed, scale and flexibility all at once?
Join Professional Services Architect Drew Kerrigan and Developer Advocate Matt Brender for a discussion of:
Examples of time series data sets, from IoT to Finance to jet engines
What makes time series queries different from other database queries
How to model your dataset to answer the right questions about your data
How to store, query and analyze a set of time series data points
Learn how a NoSQL database model and Riak TS can help you address the unique challenges of time series data.
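The time-window modeling idea above can be sketched in plain Python. This is a hypothetical key scheme for a generic key/value store, not Riak TS's actual API: samples are grouped into fixed windows so that a range query maps to a predictable list of keys.

```python
def ts_key(source_id, timestamp, bucket_seconds=3600):
    """Compose a key for a time series sample.

    Grouping samples into fixed time windows (here: hourly buckets)
    spreads writes across keys while still letting a range query be
    answered by fetching a small, predictable list of keys.
    """
    window = int(timestamp) - (int(timestamp) % bucket_seconds)
    return f"{source_id}:{window}"

def keys_for_range(source_id, start, end, bucket_seconds=3600):
    """List every window key that overlaps [start, end]."""
    first = int(start) - (int(start) % bucket_seconds)
    return [f"{source_id}:{w}" for w in range(first, int(end) + 1, bucket_seconds)]
```

With this layout, answering "what did sensor-1 report between 01:00 and 03:00?" becomes three key fetches rather than a scan.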
The Boston Riak meetup had Sean Kelly from Tapjoy digging into the message queue infrastructure at the company. They process billions of requests a day, and queuing is an important element of that scale.
To kick us off, we discussed the basics of message queues, distributed systems and why dual writes are evil. Here is that talk with a few links to get you started.
This is a presentation by Peter Coppola, VP of Product and Marketing at Basho Technologies and Matthew Aslett, Research Director at 451 Research. Join them as they discuss whether multi-model databases and polyglot persistence have increased operational complexity. They'll discuss the benefits and importance of NoSQL databases and how the Basho Data Platform helps enterprises leverage Big Data applications.
Here's a walkthrough of the set CRDT within Riak and a bucket strategy that makes Riak the best choice. You'll see that conflict is inevitable. The set bucket type lets developers rely on eventual consistency adding up to the data set they expect.
For more on sets and CRDTs see:
http://basho.com/distributed-data-types-riak-2-0/
http://basho.com/data-modeling-with-riak/
http://docs.basho.com/riak/latest/dev/using/data-types/
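To make the "conflict is inevitable, merges converge" idea concrete, here is a minimal sketch of the simplest set CRDT, a grow-only set (G-Set). Riak's actual set data type is the richer OR-set, which also supports removes; this sketch only shows why concurrent adds never conflict.

```python
class GSet:
    """Grow-only set CRDT: replicas accept adds independently and
    converge on merge, because merge is just set union."""

    def __init__(self):
        self.items = set()

    def add(self, item):
        self.items.add(item)

    def merge(self, other):
        # Union is commutative, associative, and idempotent, so
        # replicas converge regardless of merge order or repetition.
        merged = GSet()
        merged.items = self.items | other.items
        return merged
```

Two replicas that accepted different writes while partitioned produce the same set whichever side merges first.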
Here's an example of how to code against Riak using cURL and Ruby to do a basic PUT, GET, and more. We then index the data using the Apache Solr integration.
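The same basic PUT/GET can be sketched in Python against Riak's HTTP interface. The host and port are assumptions (8098 is Riak's default HTTP port); adjust for your cluster, and note this requires a running node to actually execute.

```python
from urllib import request

def riak_url(bucket, key, host="localhost", port=8098):
    """Riak exposes objects at /buckets/<bucket>/keys/<key>."""
    return f"http://{host}:{port}/buckets/{bucket}/keys/{key}"

def put(bucket, key, value, content_type="text/plain"):
    """Store a value under bucket/key via HTTP PUT."""
    req = request.Request(riak_url(bucket, key), data=value.encode(),
                          method="PUT",
                          headers={"Content-Type": content_type})
    return request.urlopen(req)

def get(bucket, key):
    """Fetch the stored value back via HTTP GET."""
    return request.urlopen(riak_url(bucket, key)).read()
```

This mirrors the cURL form of the same calls, e.g. `curl -X PUT .../buckets/test/keys/one -d "value"`.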
No matter what platform we’re discussing, we’re beyond the view of rows and columns. Data is more diverse than ever. More difficult to parse. Here is some of that story.
This is a presentation given by Matt Brender (@mjbrender) at Big Data TechCon 2015.
In this class, we will discuss why companies choose Riak over a relational database with a specific focus on availability, scalability, and the key/value data model. We then analyze the decision points that should be considered when choosing a non-relational solution and review data modeling, querying, and consistency guarantees. Finally, we end with simple patterns for building common applications in Riak using its key/value design, dealing with data conflicts that emerge in an eventually consistent system, and discuss multi-datacenter replication.
Here is Matt Brender's presentation at Big Data TechCon centered on understanding how distributed systems play a role in Big Data.
Full description:
Whether you’re an experienced user of Hadoop or a recent convert to Spark, you recognize that data is powerful when stored and analyzed. Analysis, as a workload, can be contrasted with the initial creation and storage of that data. These “active” workloads are what generate the data we covet.
Understanding this persistence of data as workload requires an appreciation of distributed systems. We will explore what factors affect your choice in database technology and particularly how to prioritize the choice in core architectural underpinnings present in NoSQL designs. We will also explore what these technologies solve and suggestions for how to align them with your business objectives.
You’ll leave this session with an understanding of the basic principles of NoSQL architectural design and a deeper understanding of the considerations when identifying a persistence solution for your active workloads.
Basho and Riak at GOTO Stockholm: "Don't Use My Database."Basho Technologies
What are common use cases for NoSQL? When should I avoid NoSQL? When is RDBMS just fine?
This presentation, delivered at the GOTO NoSQL Roadshow events in London and Stockholm in November of 2011 by Basho co-founder and COO Antony Falco, takes a no-BS look at the tradeoffs one must make to gain the advantages offered by distributed databases like Riak.
6. CHANGE IN ARCHITECTURAL DESIGN
[Diagram: virtualization packs small apps onto big servers in one location; aggregation spreads big apps across commodity servers in many locations]
7. In 2014, 20% of enterprise data projects add distributed processes into production
8. THE BENEFITS OF RIAK
Riak is an operationally friendly database that is:
• Fault-tolerant
• Highly-available
• Scalable
• Self-healing
9. THE PROPERTIES OF A DISTRIBUTED DB
Riak is a multi-model database that is:
• Open Source & Commercial
• Distributed
• Masterless
• Eventually Consistent
11. This is NOT about Riak. This is about design decisions in distributed systems.
12. This IS about Riak. And learning from Basho’s architectural decisions.
13. DISTRIBUTED SYSTEMS – A DEFINITION
“A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages. The components interact with each other in order to achieve a common goal.”
--Wikipedia
14. DISTRIBUTED SYSTEMS – A DEFINITION
“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.”
--Leslie Lamport
15. DISTRIBUTED SYSTEMS – A DEFINITION
“Everything works at small scale. Understand failure modalities to understand your realities.”
--Tyler Hannan
20. HARVEST AND YIELD
Harvest
• a fraction
• data available / complete data
Yield
• a probability
• queries completed / queries requested
Failure will cause a known linear reduction to one of these.
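The two fractions from the slide can be written out directly. This is a minimal sketch of the harvest/yield definitions, not anything Riak-specific:

```python
def harvest(data_available, complete_data):
    """Harvest: what fraction of the complete data a response reflects."""
    return data_available / complete_data

def yield_(queries_completed, queries_requested):
    """Yield: the probability that a query completes at all."""
    return queries_completed / queries_requested

# Losing 1 of 4 replicas degrades one metric or the other, not both:
# answer every query with 3/4 of the data (harvest drops), or answer
# 3/4 of queries with complete data (yield drops).
```

This is the trade-off the "known linear reduction" line refers to: a failure shaves a predictable fraction off harvest or off yield, and the system designer chooses which.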
33. FAULT TOLERANCE
How many hosts/replicas do you need to survive “F” failures?
• F + 1 – fundamental minimum
• 2F + 1 – a majority are alive
• 3F + 1 – Byzantine Fault Tolerance
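The three replica bounds above can be captured in a few lines. A sketch of the standard fault-tolerance arithmetic, with the model names as illustrative labels:

```python
def replicas_needed(f, model):
    """Minimum hosts/replicas to survive f failures under each model."""
    return {
        "crash": f + 1,         # fundamental minimum: someone still holds the data
        "majority": 2 * f + 1,  # a quorum of live nodes can still be formed
        "byzantine": 3 * f + 1, # tolerate f arbitrarily misbehaving nodes
    }[model]
```

For example, surviving one failure takes 2 replicas if nodes only crash, 3 if you need a live majority, and 4 under Byzantine fault tolerance.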
55. DISTRIBUTED SYSTEMS – A DEFINITION
“Everything works at small scale. Understand failure modalities to understand your realities.”
--Tyler Hannan
56. RELATIONAL & NOSQL
What’s the difference?
NoSQL Database:
• Unstructured data
• No pre-defined schema
• Small and large data sets on commodity HW
• Many models: K/V, document store, graph
• Variety of query methods
Relational Database:
• Structured data
• Defined schema
• Tables with rows/columns
• Indexed w/ table joins
• SQL
57. THE EVOLUTION OF NOSQL
• Unstructured Data Platforms
• Multi-Model Solutions
• Point Solutions
58. “42% of database decision makers admit they struggle to manage the NoSQL solutions deployed in their environments”
COMPLEX TECHNOLOGY STACK: Riak, Spark
64. MILLIONS OF RECORDS
• Information requested and amended more than 2.6 BILLION times a year
• 42 MILLION Summary Care Records
• 1.3 BILLION prescription messages
65. BILLIONS OF MOBILE DEVICES
• 10 BILLION data transactions a day – 150,000 a second
• Forecasting 2.8 BILLION locations around the world
• Generates 4GB OF DATA every second
“IF THE SYSTEM IS ‘DOWN’ AND NO ONE MAKES A REQUEST, IS IT REALLY DOWN?” ~ ME
CAP theorem – Dr. Eric Brewer, Professor at UC Berkeley, VP of Infrastructure at Google, Basho board member
Consistency
Always return the last written value. “Eventual consistency” seems scary. It is not – DNS, master/slave RDBMS log shipping, and caching layers are all eventually consistent.
Availability
Get a response even if portions of the system are down.
Partition Tolerance
Not a trade-off – Google “coda hale partition tolerance”.
Harvest: a fraction – data available / complete data
Yield: a probability – queries completed / queries requested
Master/Replica architecture: transaction log shipping, hot/hot backup. Datacenter failure vs. node failure.
Assumption of transactional consistency.
What happens under failure conditions?
Requests are routed to Nodes using standard load balancing techniques
Under the covers, keys are actually stored as a combination of the bucket name and the user-assigned key value.
Potentially add proxy/client visual
Riak’s uniform distribution and equal allocation of vnodes to machines allows you to think of each machine as being responsible for 1/Nth of the data and 1/Nth of the performance.
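The 1/Nth intuition can be sketched with a toy ring. This is illustrative only: the partition count and the round-robin claim are assumptions (Riak actually hashes bucket/key with SHA-1 onto a 2^160 ring and uses a claim algorithm, not a simple modulo):

```python
import hashlib

RING_SIZE = 64  # number of partitions (vnodes) on the toy ring

def partition_for(bucket, key):
    """Hash the bucket/key pair onto one of the ring's partitions."""
    h = int.from_bytes(hashlib.sha1(f"{bucket}/{key}".encode()).digest(), "big")
    return h % RING_SIZE

def vnodes_for_node(node_index, node_count):
    """With vnodes claimed evenly, each of N machines holds roughly
    RING_SIZE / N partitions -- hence 1/Nth of data and load."""
    return [p for p in range(RING_SIZE) if p % node_count == node_index]
```

With 4 machines, each claims 16 of the 64 partitions; adding a fifth machine shrinks every machine's share rather than resharding by hand.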
Hot spots: unevenly spread data and request patterns.
Resharding is operationally intensive and often manual.
Designed for vertical scale; cost considerations are a key element of vertical scaling.
Sharding
An approach is to use hashing algorithms to distribute data across the F instances. This will be a “ring free” presentation…
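A quick sketch of why naive hash-mod-N sharding makes the resharding pain mentioned above so acute: when the instance count changes, most keys map to a different instance and must be migrated. (MD5 is used here only as a stable hash; Python's built-in `hash()` is salted per process.)

```python
import hashlib

def hash_of(key):
    """Stable, deterministic hash of a string key."""
    return int.from_bytes(hashlib.md5(key.encode()).digest(), "big")

def shard(key, n):
    """Naive sharding: key goes to instance hash(key) mod n."""
    return hash_of(key) % n

keys = [f"user:{i}" for i in range(1000)]
# Count how many keys land on a different instance after growing 5 -> 6.
moved = sum(shard(k, 5) != shard(k, 6) for k in keys)
print(f"{moved} of {len(keys)} keys move when growing from 5 to 6 shards")
```

Roughly 5/6 of all keys relocate for a single added instance, which is why consistent-hashing rings (the thing this presentation is pointedly avoiding drawing) were invented: they limit movement to about 1/N of the keys.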
Latency is largely a function of the speed of light, which is 299,792,458 meters/second in vacuum. This would equate to a latency of 3.33 microseconds for every kilometer of path length. The index of refraction of most fibre optic cables is about 1.5, meaning that light travels about 1.5 times as fast in a vacuum as it does in the cable. This works out to about 4.9 microseconds of latency for every kilometer. In shorter metro networks, the latency performance rises a bit more due to building risers and cross-connects and can bring the latency as high as 5 microseconds per kilometer.
Our understanding of availability translates to the computational systems we build. Some systems use physical temporal clocks and timestamps (even Riak). Time is continuous, but we can only represent discrete moments in a computer. The granularity of the discretization depends on the probable frequency of events. We tend to choose a millisecond, which is like plucking a drop of water from a gushing river.
To make matters worse, we know for certain that information does not transmit instantaneously, the speed of light is finite. The distance across this nebula is probably many light-years. To put it on something closer to human scale: the distance between San Francisco and New York at the speed of light is 14 milliseconds, which is a long time in a computer!
Now add in a global footprint. The latency, round trip, from SF to Amsterdam can be as much as 200ms.
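The back-of-envelope arithmetic from the notes above can be checked in a few lines (the SF–NY distance is an approximate great-circle figure, an assumption for illustration):

```python
C = 299_792_458   # speed of light in vacuum, m/s
INDEX = 1.5       # approximate refractive index of fibre optic cable

# Latency per kilometer of path length.
us_per_km_vacuum = 1_000 / C * 1e6           # ~3.33 microseconds/km
us_per_km_fibre = us_per_km_vacuum * INDEX   # ~5.0 microseconds/km

# One-way SF -> NY at the speed of light in vacuum.
sf_to_ny_km = 4_130                          # approx. great-circle distance
one_way_ms = sf_to_ny_km * us_per_km_vacuum / 1_000  # ~14 ms

print(f"{us_per_km_fibre:.1f} us/km in fibre, {one_way_ms:.1f} ms SF->NY")
```

The point stands: even before queuing, routing, or software overhead, physics alone puts a double-digit-millisecond floor under cross-continent round trips.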
In-memory BigTable lookups
–data replicated in two in-memory tables
–issue requests for 1000 keys spread across 100 tablets
–measure elapsed time until data for last key arrives
And…more importantly…when?
They built a tool called basho_bench that’s designed for testing distributed key-value stores.
It allows users to specify options such as their key distribution, values, etc.
All of our tests used binary representations of integers for keys, and pseudorandom, incompressible data for values.
In our testing, load generation was spread across three nodes, with one node also acting as a test coordinator.
There were 256 virtual clients spread across those 3 nodes. All benchmarks were run with {mode, max}, a worst-case scenario for load generation. On connection termination, the client was configured to immediately reconnect and retry the op. The benchmark used the protocol buffers client, which maintains a long-lived connection to the Riak cluster. In order to enable us to reason about an elastic cluster, we added a new driver operation, reconnect. In every one of our tests, 1 in 10,000 operations per client driver resulted in a reconnect to a new node.
Each of our instances were of the type n1-standard-16 deployed in the us-central1-f zone. The instances themselves were running the image ‘backports-debian-7-wheezy-v20140904’ All Riak nodes were deployed in the same network. Network load balancing was used to distribute the traffic amongst the Riak nodes, with health checking doing a /ping on the HTTP interface with default healthcheck intervals.
One of the benefits of distributed, Dynamo-style databases is that cluster expansion is relatively easy. Oftentimes, unexpected load on traditional databases leaves customers and engineers in a crux where the entire system is unavailable. Upgrading and expanding hardware for those databases is typically a tenuous, multi-hour offline operation. In this test, we show the effects of taking a fully loaded cluster of 6 nodes and growing it by 3 nodes, 180 seconds after we started the test.
As you can see in this test, 5 minutes after cluster expansion was initiated, an increased throughput of 27% was realized. On the other hand, update latency took slightly longer to converge, as that was based on rebalancing the riak_api and protocol buffers coordination load across the rest of the cluster. During the entire rebalancing period, median latency slightly increased, but handoffs can be throttled, and fewer nodes can be added per handoff quantum, to make this process have fewer side effects.
Node failure is a normal part of life for operators today. This is really where Riak shines. We started the test, and 90 seconds in, we killed 1 of 9 nodes by forcing an immediate shutdown of Riak via pkill. After 420 seconds, an operator-initiated force-remove was executed, and 2100 seconds later the cluster had converged.
There was only one operation that actually returned an error; unfortunately, it didn’t show up on the graph. The rest of the operations immediately retried and succeeded. Realistically, convergence could have taken up to 2 seconds, given that’s how long Google’s default health-check window takes. Node failure is simply a non-event in a properly built and planned Riak cluster.
Reduce complexity with integrated NoSQL databases, caching, in-memory analytics, and search components
Enhance high availability and fault tolerance across components
Integrate real-time analytics with Apache Spark and Riak KV
Increase application performance with integrated Redis caching and Riak KV
Optimize search with Apache Solr and Riak KV integration
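The Redis-plus-Riak integration listed above is essentially a cache-aside read path. Here is a minimal sketch of that pattern; the `cache` and `store` objects are hypothetical stand-ins for real Redis and Riak client libraries:

```python
def cached_get(cache, store, bucket, key, ttl=300):
    """Cache-aside read: try the cache first, fall back to the
    authoritative store on a miss, then populate the cache."""
    cache_key = f"{bucket}:{key}"
    value = cache.get(cache_key)
    if value is None:
        value = store.get(bucket, key)      # authoritative read from Riak
        if value is not None:
            cache.set(cache_key, value, ttl)  # warm the cache with a TTL
    return value
```

The TTL bounds staleness: a cached value can lag the store by at most `ttl` seconds, which is the trade this architecture makes for the read-latency win.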
Xfinity
Run app on iOS or Android and program television remotely
User Profile - Stored as JSON document
Metadata – do you have rights to record?
Non-trivial (~80%) drop in calls to the support center