The document discusses high availability solutions for MariaDB databases. It begins by defining high availability and concepts like Recovery Time Objective (RTO) and Recovery Point Objective (RPO). It then presents different MariaDB and MaxScale architectures that provide high availability, including single node, primary-replica, Galera cluster, and SkySQL solutions. Key aspects covered are automatic failover, load balancing, data filtering, and service level agreements.
Using all of the high availability options in MariaDB (MariaDB plc)
MariaDB provides a number of high availability options, including replication with automatic failover and multi-master clustering. In this session, Wagner Bianchi, Principal Remote DBA, provides a comprehensive overview of the high availability features in MariaDB, highlights their impact on consistency and performance, discusses advanced failover strategies, and introduces new features such as causal reads and transparent connection failover.
PoC: Using a Group Communication System to improve MySQL Replication HA (Ulf Wendel)
High Availability solutions for MySQL Replication are either simple to use but introduce a single point of failure, or free of pitfalls but complex and hard to use. This Proof-of-Concept sketches a middle way. For monitoring, a group communication system is embedded into MySQL using a plugin, which eliminates the monitoring SPOF and is easy to use. Much emphasis is put on the often neglected client side. The PoC shows an architecture in which clients reconfigure themselves dynamically; no client deployment is required.
Introducing Galera Cluster & the Codership Team
Galera Cluster in a nutshell:
True multi-master:
* Read & write to any node
* Synchronous replication
* No slave lag
* No integrity issues
* No master-slave failovers or VIP needed
* Multi-threaded slave, no performance penalty
* Automatic node provisioning
Elastic:
Easy scale-out & scale-in, all nodes read-write
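The properties in the list above map onto a handful of wsrep settings; as a rough sketch (the node names, provider path and cluster name here are placeholder assumptions, not values from the deck), a Galera node's my.cnf might contain:

```ini
[mysqld]
# Enable the Galera replication provider (library path varies by distribution)
wsrep_on                 = ON
wsrep_provider           = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name       = example_cluster
wsrep_cluster_address    = gcomm://node1,node2,node3
# Galera requires row-based binary logging and InnoDB
binlog_format            = ROW
default_storage_engine   = InnoDB
# Interleaved auto-increment locking avoids conflicts between write nodes
innodb_autoinc_lock_mode = 2
```

With this in place every node accepts reads and writes, and a node joining via `gcomm://` is provisioned automatically (via SST/IST).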
Presented at SF Big Analytics Meetup
Online event processing applications often require the ability to ingest, store, dispatch and process events. Until now, supporting all of these needs has required a different system for each task: stream processing engines, message queuing middleware, and pub/sub messaging systems. This has led to unnecessary complexity in developing and operating such applications, raising the barrier to adoption in enterprises. In this talk, Karthik will outline the need to unify these capabilities in a single system and make it easy to develop and operate at scale. Karthik will delve into how Apache Pulsar was designed to address this need with an elegant architecture. Apache Pulsar is a next-generation distributed pub/sub system that was originally developed and deployed at Yahoo and is running in production at more than 100 companies. Karthik will explain how the architecture and design of Pulsar provide the flexibility to support developers and applications needing any combination of queuing, messaging, streaming and lightweight compute for events. Furthermore, he will present real-life use cases showing how Apache Pulsar is used for event processing, ranging from data processing tasks to web processing applications.
Training Slides: Basics 102: Introduction to Tungsten Clustering (Continuent)
This 30-minute training session provides an introduction to how Tungsten Clustering for MySQL / MariaDB / Percona Server works: its basic principles, Tungsten Clustering topologies, failover, rolling maintenance and related tools.
AGENDA
- Review the key benefits offered by Tungsten Clustering
- Examine the Tungsten Clustering architecture
- Tungsten Cluster Topologies for MySQL High Availability and Disaster Recovery
- Composite vs Multi-Site/Multi-Master
- Review automatic and manual failover
- Explore the concepts of a rolling maintenance procedure
- Study key resources to monitor and manage the cluster
Using Apache Spark to analyze large datasets in the cloud presents a range of challenges. Different stages of your pipeline may be constrained by CPU, memory, disk and/or network IO. But what if all those stages have to run on the same cluster? In the cloud, you have limited control over the hardware your cluster runs on.
You may have even less control over the size and format of your raw input files. Performance tuning is an iterative and experimental process. It’s frustrating with very large datasets: what worked great with 30 billion rows may not work at all with 400 billion rows. But with strategic optimizations and compromises, 50+ TiB datasets can be no big deal.
By using Spark UI and simple metrics, explore how to diagnose and remedy issues on jobs:
Sizing the cluster based on your dataset (shuffle partitions)
Ingestion challenges – well begun is half done (globbing S3, small files)
Managing memory (sorting GC – when to go parallel, when to go G1, when offheap can help you)
Shuffle (give a little to get a lot – configs for better out of box shuffle) – Spill (partitioning for the win)
Scheduling (FAIR vs FIFO, is there a difference for your pipeline?)
Caching and persistence (it’s the cost of doing business, so what are your options?)
Fault tolerance (blacklisting, speculation, task reaping)
Making the best of a bad deal (skew joins, windowing, UDFs, very large query plans)
Writing to S3 (dealing with write partitions, HDFS and s3DistCp vs writing directly to S3)
Presented at Spark+AI Summit Europe 2019
https://databricks.com/session_eu19/apache-spark-at-scale-in-the-cloud
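The first bullet, sizing the cluster to the dataset via shuffle partitions, often reduces to simple arithmetic: divide the shuffle data volume by a target partition size. A minimal sketch (the ~200 MiB-per-partition target is an illustrative rule of thumb, not a figure from the talk):

```python
def shuffle_partitions(dataset_bytes: int,
                       target_partition_bytes: int = 200 * 1024**2) -> int:
    """Estimate spark.sql.shuffle.partitions from shuffle data volume.

    Rounds up so that no partition exceeds the target size.
    """
    return max(1, -(-dataset_bytes // target_partition_bytes))  # ceil division

# A 50 TiB shuffle at ~200 MiB per partition needs ~262144 partitions,
# far above Spark's default of 200.
print(shuffle_partitions(50 * 1024**4))  # → 262144
```

The estimate would then be applied with `spark.conf.set("spark.sql.shuffle.partitions", n)` before the shuffle-heavy stage runs.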
Slow things down to make them go faster [FOSDEM 2022] (Jimmy Angelakos)
Talk from FOSDEM 2022
It's easy to get misled into overconfidence based on the performance of powerful servers, given today's monster core counts and RAM sizes. However, the reality of high concurrency usage is often disappointing, with less throughput than one would expect. Because of its internals and its multi-process architecture, PostgreSQL is very particular about how it likes to deal with high concurrency and in some cases it can slow down to the point where it looks like it's not performing as it should. In this talk we'll take a look at potential pitfalls when you throw a lot of work at your database. Specifically, very high concurrency and resource contention can cause problems with lock waits in Postgres. Very high transaction rates can also cause problems of a different nature. Finally, we will be looking at ways to mitigate these by examining our queries and connection parameters, leveraging connection pooling and replication, or adapting the workload.
Topics:
1. Understand what we mean by high concurrency.
2. Understand ACID & MVCC in Postgres.
3. Understand how high concurrency affects Postgres performance.
4. Understand how locks/latches affect Postgres performance.
5. Understand how high transaction rates can affect Postgres.
6. Mitigation strategies for high concurrency scenarios.
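One of the mitigation strategies the talk covers, connection pooling, works by capping the number of concurrent backends so that excess clients queue instead of piling contention onto Postgres. A toy sketch of the idea using only the standard library (the pool size and the stand-in "connection" objects are illustrative assumptions, not PgBouncer internals):

```python
import queue

class ConnectionPool:
    """Toy pool: at most `size` connections exist; extra callers block and wait."""

    def __init__(self, connect, size: int):
        self._q = queue.Queue()
        for _ in range(size):
            self._q.put(connect())  # connections are created once, up front

    def acquire(self, timeout=None):
        return self._q.get(timeout=timeout)  # blocks when the pool is exhausted

    def release(self, conn):
        self._q.put(conn)

# With a real driver, `connect` would open a PostgreSQL session;
# here a list append stands in so we can count how many were made.
made = []
pool = ConnectionPool(lambda: made.append(1) or object(), size=5)
conns = [pool.acquire() for _ in range(5)]
print(len(made))  # only 5 "connections" ever created, however many clients arrive
for c in conns:
    pool.release(c)
```

The point of the sketch is the queueing behaviour: a sixth caller waits in `acquire` rather than adding a sixth backend, which is how poolers keep lock and snapshot contention bounded.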
1. If it’s not SQL, it’s not a database.
2. It takes 5+ years to build a database.
3. Listen to your users.
4. Too much magic is a bad thing.
5. It’s the cloud, stupid.
A presentation about how to make MySQL highly available, presented at the San Francisco MySQL Meetup (http://www.sfmysql.org/events/15760472/) on January 26th, 2011.
A video recording of this presentation is available from Ustream: http://ustre.am/fyLk
Why new hardware may not make Oracle databases faster (SolarWinds)
How can you know if hardware is the right answer to your Oracle database performance issues? How can you know for sure which hardware components will have the biggest impact? As a DBA or database developer, you should know that you can gain significant performance improvements without the time, money and risk associated with providing the latest server or flash storage array.
Learn why new hardware may not make your Oracle database faster and what you can do instead.
Running Dataproc At Scale in production - Searce Talk at GDG Delhi (Searce Inc)
From the collective experience of helping multiple customers run hundreds of Dataproc clusters in production, scaling them and troubleshooting issues, our engineers Manan and Rohit talk about running Dataproc at scale in production.
Multi-Tenancy Kafka cluster for LINE services with 250 billion daily messages (LINE Corporation)
Yuto Kawamura
LINE / Z Part Team
At LINE we've been operating Apache Kafka to provide a company-wide shared data pipeline for services, which use it for storing and distributing data.
Kafka underlies many of our services in some way, not only the messaging service but also AD, Blockchain, Pay, Timeline, Cryptocurrency trading and more.
Many services feed data into our cluster, leading to over 250 billion daily messages and 3.5 GB of incoming bytes per second, which makes it one of the largest deployments in the world.
At the same time, it is required to be stable and performant at all times, because many important services use it as a backend.
In this talk I will give an overview of Kafka usage at LINE and how we're operating it.
I'm also going to talk about some of the engineering we did to maximize its performance and to solve problems caused by hosting huge volumes of data from many services, leveraging advanced techniques like kernel-level dynamic tracing.
Since 5.7.2, MySQL implements parallel replication within the same schema, also known as LOGICAL_CLOCK (DATABASE-based parallel replication is also implemented in 5.6, but is not covered in this talk). In early 5.7 versions, parallel replication was based on group commit (like MariaDB); 5.7.6 changed that to intervals.
Intervals are more complicated but they are also more powerful. In this talk, I will explain in detail how they work and why intervals are better than group commit. I will also cover how to optimize parallel replication in MySQL 5.7 and what improvements are coming in MySQL 8.0.
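The interval rule can be stated compactly: each transaction in the binary log carries a (last_committed, sequence_number) pair, and two transactions may be applied in parallel on the replica when their intervals overlap. A small sketch of that check (the sample timestamps are invented for illustration):

```python
from collections import namedtuple

Trx = namedtuple("Trx", "last_committed sequence_number")

def can_run_in_parallel(a: Trx, b: Trx) -> bool:
    """Two transactions may be applied concurrently if each entered its
    commit window before the other committed, i.e. the logical-clock
    intervals (last_committed, sequence_number] overlap."""
    return (a.sequence_number > b.last_committed and
            b.sequence_number > a.last_committed)

t1 = Trx(last_committed=10, sequence_number=12)
t2 = Trx(last_committed=10, sequence_number=13)  # interval overlaps t1's
t3 = Trx(last_committed=12, sequence_number=14)  # saw t1's commit
print(can_run_in_parallel(t1, t2))  # True
print(can_run_in_parallel(t1, t3))  # False: t3 depends on t1
```

This is why intervals beat pure group commit: t2 and t3 were not in the same commit group as each other's predecessors, yet their overlapping intervals still let them run in parallel.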
SkySQL is the first and only database-as-a-service (DBaaS) to perform workload analysis with advanced deep learning models, identifying and classifying discrete workload patterns so DBAs can better understand database workloads, identify anomalies and predict changes.
In this session, we’ll explain the concepts behind workload analysis and show how it can be used in the real world (and with sample real-world data) to improve database performance and efficiency by identifying key metrics and changes to cyclical patterns.
SkySQL uses best-of-breed software, and when it comes to metrics and monitoring that means Prometheus and Grafana. SkySQL Monitor is built on both, and provides customers with interactive dashboards for both real-time and historic metrics monitoring. In addition, it meets the same high availability and security requirements as other SkySQL components, ensuring metrics are always available and always secure.
In this session, we’ll explain how SkySQL Monitor works, walk through its dashboards and show how to monitor key metrics for performance and replication.
Introducing the R2DBC async Java connector (MariaDB plc)
Not too long ago, a reactive variant of the JDBC driver was released, known as Reactive Relational Database Connectivity (R2DBC for short). While R2DBC started as an experiment to enable integration of SQL databases into systems that use reactive programming models, it now specifies a full-fledged service-provider interface that can be used to retrieve data from a target data source.
In this session, we’ll take a look at the new MariaDB R2DBC connector and examine the advantages of fully reactive, non-blocking development with MariaDB. And, of course, we’ll dive in and get a first-hand look at what it’s like to use the new connector with some live coding!
The capabilities and features of MariaDB Platform continue to expand, resulting in larger and more sophisticated production deployments and the need for better tools. To provide DBAs with comprehensive, consolidated tooling, we created MariaDB Enterprise Tools: an easy-to-use, modular command-line interface for interacting with any part of MariaDB Platform.
In this session, we will provide a preview of the MariaDB Enterprise Client, walk through current and planned modules and discuss future plans for MariaDB Enterprise Tools – including SkySQL modules and the ability to create custom modules.
Faster, better, stronger: The new InnoDB (MariaDB plc)
For MariaDB Enterprise Server 10.5, the default transactional storage engine, InnoDB, has been significantly rewritten to improve the performance of writes and backups. Next, we removed a number of parameters to reduce unnecessary complexity, not only in terms of configuration but of the code itself. And finally, we improved crash recovery thanks to better consistency checks and we reduced memory consumption and file I/O thanks to an all new log record format.
In this session, we’ll walk through all of the improvements to InnoDB, and dive deep into the implementation to explain how these improvements help everything from configuration and performance to reliability and recovery.
SkySQL implements a groundbreaking, state-of-the-art architecture based on Kubernetes and ServiceNow, and with a strong emphasis on cloud security – using compartmentalization and indirect access to secure and protect customer databases.
In this session, we’ll walk through the architecture of SkySQL and discuss how MariaDB leverages an advanced Kubernetes operator and powerful ServiceNow configuration/workflow management to deploy and manage databases on cloud infrastructure.
What to expect from MariaDB Platform X5, part 1 (MariaDB plc)
MariaDB Platform X5 will be based on MariaDB Enterprise Server 10.5. This release includes Xpand, a fully distributed storage engine for scaling out, as well as many new features and improvements for DBAs and developers alike, including enhancements to temporal tables, additional JSON functions, a new performance schema, non-blocking schema changes with clustering and a Hashicorp Vault plugin for key management.
In this session, we’ll walk through all of the new features and enhancements available in MariaDB Enterprise Server 10.5. In addition, we will highlight those being backported to maintenance releases of MariaDB Enterprise Server 10.2, 10.3 and 10.4.
GraphRAG is All You Need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...) (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... (Ramesh Iyer)
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. However, fostering a culture of innovation takes work: it takes vision, leadership and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Key Trends Shaping the Future of Infrastructure (Cheryl Hung)
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open source: exploring how these areas are likely to mature and develop over the short and long term, and considering how organisations can position themselves to adapt and thrive.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Neuro-symbolic is not enough, we need neuro-*semantic* (Frank van Harmelen)
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio, using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. Also, many times I have seen how developers implement features on the front-end just following the standard rules for a framework and think that this is enough to successfully launch the project, and then the project fails. How to prevent this and what approach to choose? I have launched dozens of complex projects and during the talk we will analyze which approaches have worked for me and which have not.
2. High Availability - HA
High availability (HA) is a characteristic of a system which aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period.
https://en.wikipedia.org/wiki/High_availability
3. RPO / RTO
[Diagram: RPO/RTO timeline, https://upload.wikimedia.org/wikipedia/commons/6/69/RPO_RTO_example_converted.png]
Recovery Time Objective
The Recovery Time Objective (RTO) is the targeted duration of time and a service level within which a business process must be restored after a disruption in order to avoid a break in business continuity. According to business continuity planning methodology, the RTO is established during the Business Impact Analysis (BIA) by the owner(s) of the process, including identifying time frames for alternate or manual workarounds.
Recovery Point Objective
A Recovery Point Objective (RPO) is the maximum acceptable interval during which transactional data is lost from an IT service. For example, if the RPO is measured in minutes, then in practice off-site mirrored backups must be continuously maintained; a daily off-site backup will not suffice.
https://en.wikipedia.org/wiki/Disaster_recovery
5. Architecture - Single Node
[Diagram: Your Applications → MariaDB Primary (r/w)]
Single Node Setup
● No failover option
● Backup / Restore is key
● RPO / RTO define the SLA
6. Architecture - Primary / Replica Setup
[Diagram: Your Applications → MariaDB Primary (r/w), replicating to MariaDB Replica]
Primary / Replica Node Setup
● "Manual" failover to Slave
● Asynchronous Replication
● Semi-synchronous Replication
● "Passive" Hardware
● Manual failover process defines the SLA
● Backup process can run on Slave
7. Architecture - Primary / Replica Setup
[Diagram: Your Applications → MariaDB Primary (r/w), replicating to two MariaDB Replicas]
Primary / Replica Node Setup
● "Manual" failover to Slave
● Asynchronous Replication
● Semi-synchronous Replication
● Galera Cluster
● "Passive" Hardware
● Manual failover process defines the SLA
● Backup process can run on Slave
8. Architecture for High Availability with MaxScale
[Diagram: Your Applications → MaxScale, routing r/w to MariaDB Primary and r to two MariaDB Replicas]
MariaDB MaxScale is an advanced SQL firewall, proxy, router, and load balancer:
• MaxScale performs automated failover for MariaDB replication.
• MaxScale's ReadWriteSplit router performs query-based load balancing.
• MaxScale's Cache filter can improve SELECT performance by caching and reusing results.
• MaxScale can filter data via Data Masking, with defined patterns.
• MaxScale also helps to avoid downtime or hiccups with:
  - Upgrades and patches
  - Adding nodes
  - DoS attacks
  - SQL injection
  - Security violations
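As a concrete illustration of the ReadWriteSplit router and Cache filter called out above, here is a minimal maxscale.cnf sketch. The section names, credentials, addresses, and ports are placeholders invented for this example, not values from the deck:

```ini
# Hypothetical maxscale.cnf fragment: one read/write-split service
# with a result cache in front of three MariaDB servers.
[server1]
type=server
address=192.168.0.11
port=3306

[Read-Write-Service]
type=service
router=readwritesplit
servers=server1,server2,server3
user=maxuser
password=maxpwd
filters=CacheFilter

[CacheFilter]
type=filter
module=cache
hard_ttl=30s

[Read-Write-Listener]
type=listener
service=Read-Write-Service
port=4006
```

Applications then connect to port 4006; MaxScale sends writes to the primary and balances reads across the replicas.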
9. Architecture for High Availability in SkySQL
[Diagram: Your Applications → MaxScale → MariaDB Primary (r/w, Availability Zone 1) and MariaDB Replica (r, Availability Zone 2)]
MariaDB SkySQL SLA
SkySQL Foundation Tier (Performance Standard)
● Multi-node configurations will deliver a 99.95% service availability on a per-billing-month basis.
● For example, with this availability target in a 30-day calendar month, the maximum service downtime is 21 minutes and 54 seconds.
SkySQL Power Tier
● Multi-node configurations will deliver a 99.995% service availability on a per-billing-month basis.
● For example, with this availability target in a 30-day calendar month, the maximum service downtime is 2 minutes and 11 seconds.
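The downtime figures follow directly from the availability percentages. A quick sketch, assuming an average month of 730 hours (43,800 minutes), which reproduces the numbers quoted above:

```python
# Convert an availability SLA into the maximum allowed downtime per month.
# Assumes an average month of 730 hours (365 * 24 / 12) = 43,800 minutes.
AVG_MONTH_MINUTES = 730 * 60

def max_downtime(availability):
    """Return (minutes, seconds) of allowed downtime per average month."""
    downtime_min = (1 - availability) * AVG_MONTH_MINUTES
    minutes = int(downtime_min)
    seconds = round((downtime_min - minutes) * 60)
    return minutes, seconds

print(max_downtime(0.9995))   # Foundation tier -> (21, 54)
print(max_downtime(0.99995))  # Power tier      -> (2, 11)
```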
12. Traditional Setup
● Prior to MaxScale 2.5, MaxScale HA required manual intervention.
● While all MaxScale nodes can route queries, perform read/write splitting, and handle other operations, only the "active" MaxScale node (PASSIVE = false) could perform automatic failovers.
● If the "active" MaxScale goes down, one of the remaining MaxScale nodes needed to be set to "PASSIVE = false" so that that particular node could handle automatic failover.
● This was usually done with the help of third-party tools such as:
  ○ keepalived
  ○ corosync/pacemaker
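For context, a typical keepalived setup floats a virtual IP between the two MaxScale hosts. A minimal, illustrative sketch; the interface name, VIP, and priorities are placeholders, not values from the deck:

```conf
# Hypothetical /etc/keepalived/keepalived.conf on the "active" MaxScale host.
vrrp_instance MAXSCALE_VIP {
    state MASTER          # BACKUP on the standby MaxScale host
    interface eth0
    virtual_router_id 51
    priority 150          # lower (e.g. 100) on the standby
    advert_int 1
    virtual_ipaddress {
        192.168.0.100     # applications connect to this VIP
    }
}
```

A notify script would additionally have to flip MaxScale's passive parameter on failover, which is exactly the operational complexity cooperative locking removes.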
13. Typical Recommended Architecture (Traditionally)
13
MaxScale
MaxScale
1
Active
Primary Replica-1
MaxScale
MaxScale
2
Passive
Replica-2
Replication
● Can’t have both MaxScale doing database
Failover
● Must use 3rd Party tools such as KeepaliveD to
control which is the “Active” MaxScale
● Issues for support in case of KeepaliveD failure
● Complex Configuration
● Only One MaxScale can be used for Query
routing
KeepaliveD
Virtual IP
14. Why "Cooperative Locking"?
● Starting with MaxScale 2.5, cooperative locking was introduced.
● Multiple MaxScale nodes can work together without the need for any third-party component(s).
● The MaxScale nodes seamlessly decide which one is the primary MaxScale and which are not.
  ○ This is done by a special locking mechanism.
● The primary MaxScale handles the MariaDB failover.
● Two modes to choose from:
  ○ majority_of_running
  ○ majority_of_all
15. cooperative_monitoring_locks (maxscale.cnf)
majority_of_running
● Default in SkySQL if the customer goes for a dual-MaxScale setup.
● The MaxScale node that holds the maximum number of locks becomes the primary.
● In this mode, only the "Running" MariaDB nodes are counted; nodes that are down are excluded.
● The number of locks required is a strict majority of the alive servers:
  ○ locks required = floor(n_servers / 2) + 1
  ○ "n_servers" is the total number of alive servers in the cluster
  ○ Consider a 3-node cluster:
    ■ All 3 nodes alive: floor(3/2) + 1 = 2
    ■ 1 node down: floor(2/2) + 1 = 2
    ■ 2 nodes down: floor(1/2) + 1 = 1
● This tolerates more node failures while still being able to perform automatic MariaDB failover.
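The lock arithmetic above can be sketched in a few lines; the "rounding" in the slides amounts to integer division plus one, which reproduces the slide's numbers:

```python
# Lock threshold used by cooperative_monitoring_locks:
# a strict majority, floor(n_servers / 2) + 1.
def locks_required(n_servers):
    """Strict majority of the servers being counted."""
    return n_servers // 2 + 1

# majority_of_running counts only the servers that are still up,
# so the threshold shrinks as nodes fail:
for alive in (3, 2, 1):
    print(f"{alive} alive -> {locks_required(alive)} locks required")
```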
16. majority_of_running
[Diagram: Primary DC (MaxScale 1, Primary) and DR DC (MaxScale 2, Replica-1), async replication; one DB node is down]
● One node goes down; the minimum number of DB locks required drops to "2", which can still be achieved.
● MaxScale 1 is "primary".
● Automatic DB failover remains activated.
17. majority_of_running
[Diagram: Primary DC and DR DC, async replication; two DB nodes are down]
● 2 nodes go down; the minimum number of DB locks required drops to "1", which can still be achieved.
● MaxScale 1 is still the "primary" MaxScale.
● Automatic DB failover remains activated.
18. majority_of_running
[Diagram: the entire Primary DC is down; only the DR DC is still running]
● The entire data center goes down.
● The minimum number of DB locks required drops to "1", which can still be achieved.
● MaxScale 3 becomes "primary".
● Automatic DB failover remains activated.
19. cooperative_monitoring_locks (maxscale.cnf)
majority_of_running
● Can cause split-brain (multiple MaxScale nodes becoming primary!)
  ○ Consider a Primary / DR setup.
  ○ In case of a network partition between the two data centers, the MaxScale nodes in each data center will become "primary", as they can't see the DB nodes on the other side.
  ○ This leads to two "Primary" MariaDB servers, one running in each data center!
  ○ An unlikely scenario, but keep it in mind.
20. majority_of_running
[Diagram: Primary DC (MaxScale 1, Primary, Replica-1) and DR DC (MaxScale 2, replica), async replication; the network link between the DCs is lost]
● The network between the two data centers is LOST.
● The MaxScale nodes can only see the DB nodes within their own data center.
● The "majority_of_running" rule applies, so the minimum number of locks required drops to 2 in the DC and to 1 in the DR.
● Split-brain! We now have two "primary" MaxScale nodes!
● The new "primary" MaxScale node in the DR promotes one of the replicas to "Primary DB".
● Two primary DB nodes running, one in each DC, creating data inconsistency!
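The partition scenario above can be sketched numerically. Assuming, per the slide, 2 DB nodes visible in the primary DC and 1 in the DR after the link drops:

```python
# Split-brain under majority_of_running: after a network partition,
# each side computes its threshold over only the DB nodes it can see.
def locks_required(n_visible):
    return n_visible // 2 + 1  # strict majority of visible servers

dc_visible = 2   # Primary + Replica-1 in the primary DC
dr_visible = 1   # the lone replica in the DR DC

# Each side can lock every node it still sees, so BOTH sides reach
# their own (reduced) threshold and declare a primary MaxScale:
print(dc_visible >= locks_required(dc_visible))  # True
print(dr_visible >= locks_required(dr_visible))  # True
# The DR MaxScale then promotes its replica: two Primary DB nodes.
```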
21. cooperative_monitoring_locks (maxscale.cnf)
majority_of_all
● In this mode, all the nodes are counted, whether running or not.
● The MaxScale node that holds the maximum number of locks becomes the primary.
● The number of locks required is a strict majority of all servers:
  ○ locks required = floor(n_servers / 2) + 1
  ○ "n_servers" is the total number of MariaDB servers in the cluster
  ○ For example:
    ■ 3-node setup: locks required = floor(3/2) + 1 = 2
    ■ 7-node setup: locks required = floor(7/2) + 1 = 4
● If too many MariaDB nodes go down at the same time, none of the MaxScale nodes will be able to acquire the minimum number of locks required.
  ○ With a total of 3 backend servers, if 2 nodes go down, the minimum of 2 required locks can't be acquired.
  ○ No automatic failover.
  ○ A minimum of floor(n_servers/2) + 1 nodes must be alive for automatic failover to work.
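The trade-off versus majority_of_running is that the threshold is fixed. A short sketch for the 3-node example above:

```python
# Under majority_of_all the threshold is computed over ALL MariaDB
# servers in the cluster, not just the ones still running.
def locks_required(n_servers):
    return n_servers // 2 + 1  # strict majority

TOTAL_SERVERS = 3
threshold = locks_required(TOTAL_SERVERS)  # always 2 for a 3-node cluster

for alive in (3, 2, 1):
    ok = alive >= threshold
    print(f"{alive} alive -> automatic failover possible: {ok}")
```

With 2 of 3 nodes down, no MaxScale can reach the fixed threshold, so failover stops; in exchange, a partitioned minority side can never declare itself primary.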
22. majority_of_all
[Diagram: Primary DC (MaxScale 1, Primary, Replica-1) and DR DC (MaxScale 2, Replica-2), async replication]
● Locks required: floor(3/2) + 1 = 2.
● MaxScale 1 holds the most locks, so it becomes "primary".
● The other three MaxScale nodes are "secondary".
23. majority_of_all
[Diagram: Primary DC and DR DC (Replica-2), async replication; one DB node is down]
● One node goes down; the minimum of "2" DB locks required can still be achieved.
● MaxScale 1 is still "primary".
● It is possible for another MaxScale node to become primary, but only one.
24. majority_of_all
[Diagram: Primary DC and DR DC (Replica-2); two DB nodes are down]
● 2 nodes go down; the minimum of "2" DB locks required can no longer be achieved!
● All the MaxScale nodes become "secondary"; automatic failover is disabled.
26. majority_of_all
[Diagram: Primary DC (Primary, Replica-1) and DR DC (Replica-2, read-only); the network between the DCs is broken]
● The network between the two data centers is broken; the MaxScale nodes in the DC can each acquire 2 locks, which meets the minimum requirement of "2".
● The DC MaxScale can still perform automatic failover.
● But the DR MaxScale can only get a lock on "1" node, so its automatic failover is disabled.
27. Architecture - Higher Availability Options
[Diagram: Your Applications → MaxScale in Datacenter 1 (MariaDB Primary r/w, two MariaDB Replicas r) and MaxScale in Datacenter 2 (two MariaDB Replicas r)]
28. MaxScale config_sync_cluster
When configuring MaxScale configuration synchronization for the first time, the same static configuration files should be used on all MaxScale instances: the value of "config_sync_cluster" must be the same on all MaxScale instances, and the cluster (i.e. the monitor) it points to, as well as that monitor's servers, must be the same in every configuration.
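Putting the pieces together, a minimal maxscale.cnf sketch combining cooperative locking and configuration sync might look like the following. Credentials and server names are placeholders, and the same values would have to appear on every MaxScale instance:

```ini
# Hypothetical maxscale.cnf fragment, identical on all MaxScale nodes.
[maxscale]
config_sync_cluster=MariaDB-Monitor
config_sync_user=maxuser
config_sync_password=maxpwd

[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3
user=maxuser
password=maxpwd
auto_failover=true
auto_rejoin=true
cooperative_monitoring_locks=majority_of_all
```

Here "config_sync_cluster" names the monitor whose cluster the runtime configuration changes are synchronized through, and "cooperative_monitoring_locks" selects which majority rule the MaxScale nodes use to elect their primary.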
31. Xpand - the Distributed OLTP Database
● Transactional
● Distributed SQL
● Full Elasticity (↑↓)
● Read/Write Scale
32. Xpand - the Distributed OLTP Database
When you run a distributed database, you always think about:
● Data distribution
● Data replication
● Skewing
● Shared-nothing
● Distributed SQL
● Data locality
● GEO-distribution
● Read/write performance
● etc.
49. Serves Multiple Problem Domains
● High-volume, fast, parallel asynchronous replication
● Active-active topology
● Passive standby for disaster recovery
● Daisy-chain replication to multiple regions for global access