SolrCloud uses ZooKeeper to elect a leader node for each shard. The leader coordinates write requests to ensure consistency. When the leader dies, ZooKeeper detects this and a new leader is elected based on the sequence numbers of the znodes that the nodes registered with ZooKeeper. The new leader syncs updates with the replicas, and a replica that is too far behind can replay its log or ask for full replication. This allows write requests to continue being served with high availability despite leader failures.
2. Questions we want to answer
• What is the purpose of a leader in SolrCloud?
• How is a leader selected?
• What happens when a leader dies?
3. Purpose of Leader
• Shards (for scaling): a collection of documents can be divided into multiple shards.
• Shard replicas (for failover/high availability and load balancing): each shard can be replicated to multiple shard replicas.
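A minimal sketch of how documents could be spread over shards, assuming a simple hash-modulo scheme. The function name `shard_for` and the use of `zlib.crc32` are illustrative assumptions, not SolrCloud's actual document router.

```python
import zlib

def shard_for(doc_id, num_shards):
    """Map a document id to a shard index in [0, num_shards)."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards

# Hypothetical document ids, spread over a collection with 3 shards.
assignments = {doc_id: shard_for(doc_id, 3)
               for doc_id in ["doc-1", "doc-2", "doc-3", "doc-4"]}
```

The same id always lands on the same shard, which is what lets a later update or delete find the document again.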
4. Purpose of Leader
• Collection – multiple shards – multiple replicas
• How is a request served?
– Types of request:
• Read – search query; no consistency issue between replicas
• Write – index a document; raises a consistency issue between replicas, so writes should have a single source. Hence the leader
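The read/write split above can be sketched as a small routing function. This is an in-memory illustration under stated assumptions: the `Shard` class, the node names, and the round-robin read policy are hypothetical, and the sketch models only where a request ends up, not how it gets there.

```python
import itertools

class Shard:
    """In-memory stand-in for one shard: a leader plus its replicas."""

    def __init__(self, leader, replicas):
        self.leader = leader
        self.replicas = list(replicas)
        # Reads rotate over every copy of the shard, leader included.
        self._readers = itertools.cycle([leader] + self.replicas)

    def route(self, request_type):
        """Return the name of the node that should handle the request."""
        if request_type == "write":
            return self.leader          # single source for writes
        if request_type == "read":
            return next(self._readers)  # any copy can answer a search
        raise ValueError(f"unknown request type: {request_type}")

shard = Shard("node1", ["node2", "node3"])
write_targets = {shard.route("write") for _ in range(5)}
read_targets = [shard.route("read") for _ in range(3)]
```

Every write resolves to `node1`, while the three reads are spread over all three nodes.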
6. Leader selection
• ZooKeeper: SolrCloud uses ZooKeeper to track which nodes are active and which are not, to manage config files, etc.
• ZooKeeper helps in leader selection
• ZooKeeper comes embedded in SolrCloud, but can also be run externally
7. Leader selection
• SolrCloud += new node
– The new node registers itself with ZooKeeper
– And creates znodes:
• session – with a timeout, refreshed regularly by the client node
• ephemeral node
• sequence node: when created, it gets a unique sequence no. assigned and suffixed to its name
– the clusterstate.json file gets updated (by the overseer)
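The registration step can be sketched with an in-memory stand-in for ZooKeeper. The class `FakeZooKeeper`, the method name, and the `n_` prefix are assumptions made for illustration; the zero-padded 10-digit suffix mirrors how ZooKeeper names sequential znodes.

```python
class FakeZooKeeper:
    """Tiny in-memory model of sequential znode creation (not a real client)."""

    def __init__(self):
        self._next_seq = 0
        self.znodes = {}  # znode name -> name of the node that owns it

    def create_ephemeral_sequential(self, prefix, owner):
        # The counter only ever grows, so each znode gets a unique suffix.
        name = f"{prefix}{self._next_seq:010d}"
        self._next_seq += 1
        self.znodes[name] = owner
        return name

zk = FakeZooKeeper()
first = zk.create_ephemeral_sequential("n_", "node1")
second = zk.create_ephemeral_sequential("n_", "node2")
```

Nodes that join earlier get lower sequence numbers, which is what the election relies on.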
11. Leader dies
• The leader is the node whose znode has the lowest sequence no.; when the leader dies, its ephemeral znode goes away
• All znodes are being watched through ZooKeeper
• The node whose znode has the next sequence no. is elected as the leader
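A minimal sketch of the election rule, assuming znode names that end in `_<sequence no.>`. The function `elect_leader` and the node names are hypothetical; a real deployment reacts to ZooKeeper watch events rather than calling a function.

```python
def elect_leader(znodes):
    """Return the owner of the znode with the lowest sequence suffix."""
    if not znodes:
        return None
    lowest = min(znodes, key=lambda name: int(name.rsplit("_", 1)[1]))
    return znodes[lowest]

live = {
    "n_0000000000": "node1",
    "n_0000000001": "node2",
    "n_0000000002": "node3",
}
leader_before = elect_leader(live)

# The leader's session ends, so its ephemeral znode is removed.
del live["n_0000000000"]
leader_after = elect_leader(live)
```

After `node1` disappears, `node2` holds the lowest remaining sequence number and takes over.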
12. Leader dies
• The new leader candidate starts a sync process with each replica; if everyone has the same version, it registers as the active leader
• The old leader might have sent docs to some replicas but not all
• If a replica is too far behind, it tries to replay the log or asks for full replication
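The catch-up decision can be sketched as a function of how many updates a replica is missing. The function name, the return labels, and the `max_replay_gap` threshold of 100 are assumptions for illustration, not values taken from the slides.

```python
def recovery_action(missing_updates, max_replay_gap=100):
    """Decide how a replica catches up with the new leader."""
    if missing_updates < 0:
        raise ValueError("missing_updates cannot be negative")
    if missing_updates == 0:
        return "in_sync"            # same version as the leader
    if missing_updates <= max_replay_gap:
        return "replay_log"         # small gap: replay the missed updates
    return "full_replication"       # too far behind: copy the whole index

actions = [recovery_action(n) for n in (0, 7, 100, 5000)]
```

The point of the threshold is cost: replaying a short log is cheap, while a large gap makes copying the index the simpler option.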
14. Code Flow of write requests
Rough sketch ->
org.apache.solr.handler.UpdateRequestHandler -> multiple org.apache.solr.handler.loader.ContentStreamLoader: csv, xml, json
For each write request, the loader is identified and its load method is called.
Within the loader, for each type of write request, an org.apache.solr.update.UpdateCommand is created and passed to org.apache.solr.update.processor.UpdateRequestProcessor.process<Add/Commit/...>
For SolrCloud: DistributedUpdateProcessor is used
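The flow above can be mirrored in a small Python model: pick a loader by content type, let it build commands, and hand each command to a processor. All names here (`handle_update`, `LOADERS`, `RecordingProcessor`, and so on) are hypothetical stand-ins, not the Solr classes themselves, and only JSON and CSV are modelled.

```python
import csv
import io
import json

class UpdateCommand:
    """Stand-in for a write command: a kind plus an optional document."""
    def __init__(self, kind, payload=None):
        self.kind = kind
        self.payload = payload

class RecordingProcessor:
    """Stand-in processor that records the calls it receives."""
    def __init__(self):
        self.calls = []

    def process_add(self, cmd):
        self.calls.append(("add", cmd.payload))

    def process_commit(self, cmd):
        self.calls.append(("commit", None))

def json_loader(body, processor):
    # One add command per document in the JSON array.
    for doc in json.loads(body):
        processor.process_add(UpdateCommand("add", doc))

def csv_loader(body, processor):
    # One add command per CSV row, keyed by the header line.
    for row in csv.DictReader(io.StringIO(body)):
        processor.process_add(UpdateCommand("add", dict(row)))

LOADERS = {"application/json": json_loader, "text/csv": csv_loader}

def handle_update(content_type, body, processor, commit=False):
    """Identify the loader for the request and call it."""
    loader = LOADERS.get(content_type)
    if loader is None:
        raise ValueError(f"no loader for {content_type}")
    loader(body, processor)
    if commit:
        processor.process_commit(UpdateCommand("commit"))

processor = RecordingProcessor()
handle_update("application/json", '[{"id": "1"}]', processor)
handle_update("text/csv", "id,title\n2,hello\n", processor, commit=True)
```

In a SolrCloud setup the processor at the end of this chain would be the distributed one, which is where forwarding to the leader and replicas happens.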