This document discusses the process of rebalancing in Voldemort. It begins by outlining the high-level steps taken, including getting the current and target cluster states, planning partition movements in batches, changing cluster metadata and rebalancing states, migrating data with redundancy checks, and rolling back changes if failures occur. Key aspects like maintaining consistency through proxying requests and handling failure scenarios are also summarized.
Apache Cassandra operations have the reputation to be simple on single datacenter deployments and / or low volume clusters but they become way more complex on high latency multi-datacenter clusters with high volume and / or high throughout: basic Apache Cassandra operations such as repairs, compactions or hints delivery can have dramatic consequences even on a healthy high latency multi-datacenter cluster.
In this presentation, Julien will go through Apache Cassandra mutli-datacenter concepts first then show multi-datacenter operations essentials in details: bootstrapping new nodes and / or datacenter, repairs strategy, Java GC tuning, OS tuning, Apache Cassandra configuration and monitoring.
Based on his 3 years experience managing a multi-datacenter cluster against Apache Cassandra 2.0, 2.1, 2.2 and 3.0, Julien will give you tips on how to anticipate and prevent / mitigate issues related to basic Apache Cassandra operations with a multi-datacenter cluster.
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...DataStax
We built an application based on the principles of CQRS and Event Sourcing using Cassandra and Spark. During the project we encountered a number of challenges and problems with Cassandra and the Spark Connector.
In this talk we want to outline a few of those problems and our actions to solve them. While some problems are specific to CQRS and Event Sourcing applications most of them are use case independent.
About the Speakers
Matthias Niehoff IT-Consultant, codecentric AG
works as an IT-Consultant at codecentric AG in Germany. His focus is on big data & streaming applications with Apache Cassandra & Apache Spark. Yet he does not lose track of other tools in the area of big data. Matthias shares his experiences on conferences, meetups and usergroups.
Stephan Kepser Senior IT Consultant and Data Architect, codecentric AG
Dr. Stephan Kepser is an expert on cloud computing and big data. He wrote a couple of journal articles and blog posts on subjects of both fields. His interests reach from legal questions to questions of architecture and design of cloud computing and big data systems to technical details of NoSQL databases.
Apache Cassandra operations have the reputation to be simple on single datacenter deployments and / or low volume clusters but they become way more complex on high latency multi-datacenter clusters with high volume and / or high throughout: basic Apache Cassandra operations such as repairs, compactions or hints delivery can have dramatic consequences even on a healthy high latency multi-datacenter cluster.
In this presentation, Julien will go through Apache Cassandra mutli-datacenter concepts first then show multi-datacenter operations essentials in details: bootstrapping new nodes and / or datacenter, repairs strategy, Java GC tuning, OS tuning, Apache Cassandra configuration and monitoring.
Based on his 3 years experience managing a multi-datacenter cluster against Apache Cassandra 2.0, 2.1, 2.2 and 3.0, Julien will give you tips on how to anticipate and prevent / mitigate issues related to basic Apache Cassandra operations with a multi-datacenter cluster.
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...DataStax
We built an application based on the principles of CQRS and Event Sourcing using Cassandra and Spark. During the project we encountered a number of challenges and problems with Cassandra and the Spark Connector.
In this talk we want to outline a few of those problems and our actions to solve them. While some problems are specific to CQRS and Event Sourcing applications most of them are use case independent.
About the Speakers
Matthias Niehoff IT-Consultant, codecentric AG
works as an IT-Consultant at codecentric AG in Germany. His focus is on big data & streaming applications with Apache Cassandra & Apache Spark. Yet he does not lose track of other tools in the area of big data. Matthias shares his experiences on conferences, meetups and usergroups.
Stephan Kepser Senior IT Consultant and Data Architect, codecentric AG
Dr. Stephan Kepser is an expert on cloud computing and big data. He wrote a couple of journal articles and blog posts on subjects of both fields. His interests reach from legal questions to questions of architecture and design of cloud computing and big data systems to technical details of NoSQL databases.
A brief history of Instagram's adoption cycle of the open source distributed database Apache Cassandra, in addition to details about it's use case and implementation. This was presented at the San Francisco Cassandra Meetup at the Disqus HQ in August 2013.
Apache Cassandra operations have the reputation to be quite simple against single datacenter clusters and / or low volume clusters but they become way more complex against high latency multi-datacenter clusters: basic operations such as repair, compaction or hints delivery can have dramatic consequences even on a healthy cluster.
In this presentation, Julien will go through Cassandra operations in details: bootstrapping new nodes and / or datacenter, repair strategies, compaction strategies, GC tuning, OS tuning, large batch of data removal and Apache Cassandra upgrade strategy.
Julien will give you tips and techniques on how to anticipate issues inherent to multi-datacenter cluster: how and what to monitor, hardware and network considerations as well as data model and application level bad design / anti-patterns that can affect your multi-datacenter cluster performances.
C* Summit 2013: Cassandra at Instagram by Rick BransonDataStax Academy
Speaker: Rick Branson, Infrastructure Engineer at Instagram
Cassandra is a critical part of Instagram's large scale site infrastructure that supports more than 100 million active users. This talk is a practical deep dive into data models, systems architecture, and challenges encountered during the implementation process.
- Understanding Time Series
- What's the Fundamental Problem
- Prometheus Solution (v1.x)
- New Design of Prometheus (v2.x)
- Data Compression Algorithm
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...DataStax
The 3.0 storage engine re-write is the biggest and most exciting change to ever happen in Apache Cassandra. The new storage engine can efficiently store and read data from disk using the same concepts present in the CQL 3 language. This has delivered large space savings, and creates new performance characteristics.
In this talk Aaron Morton, Co Founder at The Last Pickle and Apache Cassandra Committer, will discuss the 3.0 storage engine, it's layout and performance characteristics.
About the Speaker
Aaron Morton CEO, The Last Pickle
Aaron Morton is the Co Founder & CEO at The Last Pickle (thelastpickle.com). A professional services company that works with clients to deliver and improve Apache Cassandra based solutions. He's based in New Zealand, is an Apache Cassandra Committer and a DataStax MVP for Apache Cassandra.
Integrate Solr with real-time stream processing applicationslucenerevolution
Presented by Timothy Potter, Founder, Text Centrix
Storm is a real-time distributed computation system used to process massive streams of data. Many organizations are turning to technologies like Storm to complement batch-oriented big data technologies, such as Hadoop, to deliver time-sensitive analytics at scale. This talk introduces on an emerging architectural pattern of integrating Solr and Storm to process big data in real time. There are a number of natural integration points between Solr and Storm, such as populating a Solr index or supplying data to Storm using Solr’s real-time get support. In this session, Timothy will cover the basic concepts of Storm, such as spouts and bolts. He’ll then provide examples of how to integrate Solr into Storm to perform large-scale indexing in near real-time. In addition, we'll see how to embed Solr in a Storm bolt to match incoming tuples against pre-configured queries, commonly known as percolator. Attendees will come away from this presentation with a good introduction to stream processing technologies and several real-world use cases of how to integrate Solr with Storm.
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...DataStax
Making sure your Data Model will work on the production cluster after 6 months as well as it does on your laptop is an important skill. It's one that we use every day with our clients at The Last Pickle, and one that relies on tools like the cassandra-stress. Knowing how the data model will perform under stress once it has been loaded with data can prevent expensive re-writes late in the project.
In this talk Christopher Batey, Consultant at The Last Pickle, will shed some light on how to use the cassandra-stress tool to test your own schema, graph the results and even how to extend the tool for your own use cases. While this may be called premature optimisation for a RDBS, a successful Cassandra project depends on it's data model.
About the Speaker
Christopher Batey Consultant / Software Engineer, The Last Pickle
Christopher (@chbatey) is a part time consultant at The Last Pickle where he works with clients to help them succeed with Apache Cassandra as well as a freelance software engineer working in London. Likes: Scala, Haskell, Java, the JVM, Akka, distributed databases, XP, TDD, Pairing. Hates: Untested software, code ownership. You can checkout his blog at: http://www.batey.info
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...DataStax
Customizing JVM settings for the needs of an application can be a tricky business, especially when running externally developed software such as Cassandra. In this talk I will share our experiences and the procedure that we have used to test and validate changes with Java tuning. We'll explore with two recent experiences: changes and monitoring of G1 garbage collection, and moving buffer objects off the heap.
For the talk, I'll discuss our tuning process at Knewton. I will share some of the challenges that we faced while identifying what we expected to learn. I'll discuss how we isolated and minimized variables across tests, the importance of the duration of these tests, and how we try to separate correlation from causation. I will demonstrate how to use and interpret the results of the custom scripts that we were driven to develop to gain visibility into our G1GC processes; these scripts will be open sourced.
About the Speaker
Carlos Monroy Senior Software Engineer, Knewton
Carlos Monroy is a senior engineer on the database team at Knewton, an education company that created an adaptive learning platform. Carlos has been developing software professionally since 1998. His experience holding multiple roles on the software lifecycle provides him a wholistic approach. Having used over a half dozen relational database engines, he has recently come over to the NoSQL side, first working with HBase and for the last three years Cassandra.
Whether running load tests or migrating historic data, loading data directly into Cassandra can be very useful to bypass the system’s write path.
In this webinar, we will look at how data is stored on disk in sstables, how to generate these structures directly, and how to load this data rapidly into your cluster using sstableloader. We'll also review different use cases for when you should and shouldn't use this method.
Cassandra is the dominant data store used at Netflix and it's health is critical to many of its services. In this talk we will share details of the recent redesign of our health monitoring system and how we leveraged a reactive stream processing system to give us a real-time view our entire fleet while dramatically improving accuracy and reducing false alarms in our alerting.
About the Speaker
Jason Cacciatore Senior Software Engineer, Netflix
Jason Cacciatore is a Senior Software Engineer at Netflix, where he's been working for the past several years. He's interested in stateful distributed systems and has a diverse background in technology. In his spare time he enjoys spending time with his wife and two sons, reading non-fiction, and watching Netflix documentaries.
Slides from my talk at Cassandra Summit 2016 on troubleshooting Cassandra. This is a reprise of my popular talk from last summit, reorganized, expanded, and updated for Cassandra 3.0. In it I share the secrets I've learned in four years of supporting hundreds of customers using Apache Cassandra and DataStax Enterprise. Be sure to check out presenter notes for additional tips and links to further resources.
Basho and Riak at GOTO Stockholm: "Don't Use My Database."Basho Technologies
What are common use cases for NoSQL? When should I avoid NoSQL? When is RDBMS just fine?
This presentation, delivered at the GOTO NoSQL Roadshow events in London and Stockholm in November of 2011 by Basho co-founder and COO, Antony Falco, take a no-BS look at the tradeoffs one must make to gain the advantages offered by distributed databases like Riak.
A brief history of Instagram's adoption cycle of the open source distributed database Apache Cassandra, in addition to details about it's use case and implementation. This was presented at the San Francisco Cassandra Meetup at the Disqus HQ in August 2013.
Apache Cassandra operations have the reputation to be quite simple against single datacenter clusters and / or low volume clusters but they become way more complex against high latency multi-datacenter clusters: basic operations such as repair, compaction or hints delivery can have dramatic consequences even on a healthy cluster.
In this presentation, Julien will go through Cassandra operations in details: bootstrapping new nodes and / or datacenter, repair strategies, compaction strategies, GC tuning, OS tuning, large batch of data removal and Apache Cassandra upgrade strategy.
Julien will give you tips and techniques on how to anticipate issues inherent to multi-datacenter cluster: how and what to monitor, hardware and network considerations as well as data model and application level bad design / anti-patterns that can affect your multi-datacenter cluster performances.
C* Summit 2013: Cassandra at Instagram by Rick BransonDataStax Academy
Speaker: Rick Branson, Infrastructure Engineer at Instagram
Cassandra is a critical part of Instagram's large scale site infrastructure that supports more than 100 million active users. This talk is a practical deep dive into data models, systems architecture, and challenges encountered during the implementation process.
- Understanding Time Series
- What's the Fundamental Problem
- Prometheus Solution (v1.x)
- New Design of Prometheus (v2.x)
- Data Compression Algorithm
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...DataStax
The 3.0 storage engine re-write is the biggest and most exciting change to ever happen in Apache Cassandra. The new storage engine can efficiently store and read data from disk using the same concepts present in the CQL 3 language. This has delivered large space savings, and creates new performance characteristics.
In this talk Aaron Morton, Co Founder at The Last Pickle and Apache Cassandra Committer, will discuss the 3.0 storage engine, it's layout and performance characteristics.
About the Speaker
Aaron Morton CEO, The Last Pickle
Aaron Morton is the Co Founder & CEO at The Last Pickle (thelastpickle.com). A professional services company that works with clients to deliver and improve Apache Cassandra based solutions. He's based in New Zealand, is an Apache Cassandra Committer and a DataStax MVP for Apache Cassandra.
Integrate Solr with real-time stream processing applicationslucenerevolution
Presented by Timothy Potter, Founder, Text Centrix
Storm is a real-time distributed computation system used to process massive streams of data. Many organizations are turning to technologies like Storm to complement batch-oriented big data technologies, such as Hadoop, to deliver time-sensitive analytics at scale. This talk introduces on an emerging architectural pattern of integrating Solr and Storm to process big data in real time. There are a number of natural integration points between Solr and Storm, such as populating a Solr index or supplying data to Storm using Solr’s real-time get support. In this session, Timothy will cover the basic concepts of Storm, such as spouts and bolts. He’ll then provide examples of how to integrate Solr into Storm to perform large-scale indexing in near real-time. In addition, we'll see how to embed Solr in a Storm bolt to match incoming tuples against pre-configured queries, commonly known as percolator. Attendees will come away from this presentation with a good introduction to stream processing technologies and several real-world use cases of how to integrate Solr with Storm.
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...DataStax
Making sure your Data Model will work on the production cluster after 6 months as well as it does on your laptop is an important skill. It's one that we use every day with our clients at The Last Pickle, and one that relies on tools like the cassandra-stress. Knowing how the data model will perform under stress once it has been loaded with data can prevent expensive re-writes late in the project.
In this talk Christopher Batey, Consultant at The Last Pickle, will shed some light on how to use the cassandra-stress tool to test your own schema, graph the results and even how to extend the tool for your own use cases. While this may be called premature optimisation for a RDBS, a successful Cassandra project depends on it's data model.
About the Speaker
Christopher Batey Consultant / Software Engineer, The Last Pickle
Christopher (@chbatey) is a part time consultant at The Last Pickle where he works with clients to help them succeed with Apache Cassandra as well as a freelance software engineer working in London. Likes: Scala, Haskell, Java, the JVM, Akka, distributed databases, XP, TDD, Pairing. Hates: Untested software, code ownership. You can checkout his blog at: http://www.batey.info
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...DataStax
Customizing JVM settings for the needs of an application can be a tricky business, especially when running externally developed software such as Cassandra. In this talk I will share our experiences and the procedure that we have used to test and validate changes with Java tuning. We'll explore with two recent experiences: changes and monitoring of G1 garbage collection, and moving buffer objects off the heap.
For the talk, I'll discuss our tuning process at Knewton. I will share some of the challenges that we faced while identifying what we expected to learn. I'll discuss how we isolated and minimized variables across tests, the importance of the duration of these tests, and how we try to separate correlation from causation. I will demonstrate how to use and interpret the results of the custom scripts that we were driven to develop to gain visibility into our G1GC processes; these scripts will be open sourced.
About the Speaker
Carlos Monroy Senior Software Engineer, Knewton
Carlos Monroy is a senior engineer on the database team at Knewton, an education company that created an adaptive learning platform. Carlos has been developing software professionally since 1998. His experience holding multiple roles on the software lifecycle provides him a wholistic approach. Having used over a half dozen relational database engines, he has recently come over to the NoSQL side, first working with HBase and for the last three years Cassandra.
Whether running load tests or migrating historic data, loading data directly into Cassandra can be very useful to bypass the system’s write path.
In this webinar, we will look at how data is stored on disk in sstables, how to generate these structures directly, and how to load this data rapidly into your cluster using sstableloader. We'll also review different use cases for when you should and shouldn't use this method.
Cassandra is the dominant data store used at Netflix and it's health is critical to many of its services. In this talk we will share details of the recent redesign of our health monitoring system and how we leveraged a reactive stream processing system to give us a real-time view our entire fleet while dramatically improving accuracy and reducing false alarms in our alerting.
About the Speaker
Jason Cacciatore Senior Software Engineer, Netflix
Jason Cacciatore is a Senior Software Engineer at Netflix, where he's been working for the past several years. He's interested in stateful distributed systems and has a diverse background in technology. In his spare time he enjoys spending time with his wife and two sons, reading non-fiction, and watching Netflix documentaries.
Slides from my talk at Cassandra Summit 2016 on troubleshooting Cassandra. This is a reprise of my popular talk from last summit, reorganized, expanded, and updated for Cassandra 3.0. In it I share the secrets I've learned in four years of supporting hundreds of customers using Apache Cassandra and DataStax Enterprise. Be sure to check out presenter notes for additional tips and links to further resources.
Basho and Riak at GOTO Stockholm: "Don't Use My Database."Basho Technologies
What are common use cases for NoSQL? When should I avoid NoSQL? When is RDBMS just fine?
This presentation, delivered at the GOTO NoSQL Roadshow events in London and Stockholm in November of 2011 by Basho co-founder and COO, Antony Falco, take a no-BS look at the tradeoffs one must make to gain the advantages offered by distributed databases like Riak.
Introduction to Redis Data Structures: Sorted SetsScaleGrid.io
We provide an overview on what Redis is, what are sorted sets, common use cases for sorted sets, sorted set operations in Redis, internal implementation, and a comparison of Redis hashes and Redis sorted sets.
A review of BestBuy.com's cloud architecture for its browse cloud. A description of what BestBuy.com has built to support a high scale, high availability ecommerce system.
Redis Use Patterns (DevconTLV June 2014)Itamar Haber
An introduction to Redis for the SQL practitioner, covering data types and common use cases.
The video of this session can be found at: https://www.youtube.com/watch?v=8Unaug_vmFI
As the popularity of PostgreSQL continues to soar, many companies are exploring ways of migrating their application database over. At Redgate Software, we recently added PostgreSQL as an optional data store for SQL Monitor, our flagship monitoring application, after nearly 18 years of being backed exclusively by SQL Server. Knowing that others will be taking this journey in the near future, we'd like to discuss what we learned. In this training, we'll discuss the planning that needs to take place before a migration begins, including datatype changes, PostgreSQL configuration modifications, and query differences. This will be a mix of slides and demo from our own learnings, as well as those of some clients we've helped along the way.
There are many common workloads in R that are "embarrassingly parallel": group-by analyses, simulations, and cross-validation of models are just a few examples. In this talk I'll describe several techniques available in R to speed up workloads like these, by running multiple iterations simultaneously, in parallel.
Many of these techniques require the use of a cluster of machines running R, and I'll provide examples of using cloud-based services to provision clusters for parallel computations. In particular, I will describe how you can use the SparklyR package to distribute data manipulations using the dplyr syntax, on a cluster of servers provisioned in the Azure cloud.
Presented by David Smith at Data Day Texas in Austin, January 27 2018.
Building a High-Performance Database with Scala, Akka, and SparkEvan Chan
Here is my talk at Scala by the Bay 2016, Building a High-Performance Database with Scala, Akka, and Spark. Covers integration of Akka and Spark, when to use actors and futures, back pressure, reactive monitoring with Kamon, and more.
Introduciton to Apache Cassandra for Java Developers (JavaOne)zznate
The database industry has been abuzz over the past year about NoSQL databases. Apache Cassandra, which has quickly emerged as a best-of-breed solution in this space, is used at many companies to achieve unprecedented scale while maintaining streamlined operations.
This presentation goes beyond the hype, buzzwords, and rehashed slides and actually presents the attendees with a hands-on, step-by-step tutorial on how to write a Java application on top of Apache Cassandra. It focuses on concepts such as idempotence, tunable consistency, and shared-nothing clusters to help attendees get started with Apache Cassandra quickly while avoiding common pitfalls.
Dockerizing the Hard Services: Neutron and Novaclayton_oneill
Talk about the benefits and pitfalls involved in successfully running complex services like Neutron and Nova inside of Docker containers.
Topics include:
* What magic incantations are needed to run these services at all?
* How to prevent HA router failover on service restarts.
* How to prevent network namespaces from breaking everything.
* Bonus: How network namespace fixes also helped fix Cinder NFS backend
Pollfish is a survey platform which provides access to millions of targeted users. Pollfish allows easy distribution and targeting of surveys through existing mobile apps. (https://www.pollfish.com/). At pollfish we use Cassandra for difference use cases, eg. for application data store to maximize write throughput when appropriate and for our analytics project to find insights in application generated data. As a medium to accomplish our success so far, we use the Datastax's DSE 4.6 environment which integrates Appache Cassadra, Spark and a hadoop compatible file system (CFS). We will discuss how we started, how the journey was and the impressions gained so far along with some tips learned the hard way. This is a result of joint work of an excellent team here at Pollfish.
Building Apache Cassandra clusters for massive scaleAlex Thompson
Covering theory and operational aspects of bring up Apache Cassandra clusters - this presentation can be used as a field reference. Presented by Alex Thompson at the Sydney Cassandra Meetup.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Leading Change strategies and insights for effective change management pdf 1.pdf
Riak add presentation
1. Riak — простая, предсказуемая и
масштабируемая БД.
Преимущества и недостатки.
Илья Богунов
Профессионалы.ру
twitter.com/tech_mind
2. О чем это ?
• Исторический экскурс БД от mainframe`ов 80х к вебу
2000-ных и текущим временам.
• Что такое Riak. Как он выглядит для
разработчикаадмина.
• Кто его и зачем использует.
• Дополнительные плюшки, кроме чистого kv.
• Не silver bullet - wtf`ы, проблемы и
моменты, которые надо учитывать.
• Код и коммьюнити.
• «Кто, где и когда» ? Зачем и где его использовать.
http://www.flickr.com/photos/lofink/4501610335/
3. DB: mainframe`ы
As an OLTP system benchmark, TPC-C simulates a complete
environment where a population of terminal operators
executes transactions against a database.
The benchmark is centered around the principal activities
(transactions) of an order-entry environment.
These transactions include entering and delivering
orders (insert, update),
recording payments (insert, update),
checking the status of orders (select), and
monitoring the level of stock at the warehouses (group
by, sum, etc).
http://www.tpc.org/tpcc/detail.asp
9. DB: MAINFRAME -> WEB
Middle layer
http://www.dbshards.com/dbshards/
SHARDING
EAV
http://backchannel.org/blog/friendfeed-schemaless-mysql Sharding framework
https://github.com/twitter/gizzard
mysql patching
https://www.facebook.com/note.php?note_id=332335025932
10. DB: MAINFRAME -> WEB
Dynamo: Amazon’s Highly Available Key-value Store (2007),
http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
11. DB: MAINFRAME -> WEB
Amazon Dynamo Clones:
• Cassandra (2008)
• Voldemort (2009) DHT
• Riak (2009)
12. RIAK:FOR DEVELOPER
RIAK(KV) SQL
• INSERT/CREATE/UPDATE/DELETE/, CREATE/ALT
• PUT/GET/DELETE ER/DROP, GRANT/REVOKE/DENY, MERGE/JOI
N/GROUP BY/HAVING/UNION …
Client:get(<<“bucket”>>, <<“key”>>)
• A (req/seconds) • X (req/seconds)
× B (% file cache miss) × [index, locks, read-ahead, random-
≈ C iops reads, optimizator index
selection, cache, buffer sizes]
= WTF ???
14. RIAK: FOR ADMIN/OPS
• Read repair, failover, hinted handoff.
• горизонтальное масштабирование
• легкость установки и добавления
ноды:
apt-getpkgindpkg install riak
riak start
riak-admin join node@name
15. CASSANDRA
http://wiki.apache.org/cassandra/Operations#Bootstrap
• Bootstrap
• Adding new nodes is called "bootstrapping."
• To bootstrap a node, turn AutoBootstrap on in the configuration file, and start it.
• If you explicitly specify an InitialToken in the configuration, the new node will bootstrap to that position on the ring. Otherwise, it will pick a Token that will give it half the keys from the node with the most disk space used, that does not already have
another node bootstrapping into its Range.
• Important things to note:
• You should wait long enough for all the nodes in your cluster to become aware of the bootstrapping node via gossip before starting another bootstrap. The new node will log "Bootstrapping" when this is safe, 2 minutes after starting. (90s to make sure
it has accurate load information, and 30s waiting for other nodes to start sending it inserts happening in its to-be-assumed part of the token ring.)
• Relating to point 1, one can only bootstrap N nodes at a time with automatic token picking, where N is the size of the existing cluster. If you need to more than double the size of your cluster, you have to wait for the first N nodes to finish until your
cluster is size 2N before bootstrapping more nodes. So if your current cluster is 5 nodes and you want add 7 nodes, bootstrap 5 and let those finish before bootstrapping the last two.
• As a safety measure, Cassandra does not automatically remove data from nodes that "lose" part of their Token Range to a newly added node. Run nodetool cleanup on the source node(s) (neighboring nodes that shared the same subrange) when you
are satisfied the new node is up and working. If you do not do this the old data will still be counted against the load on that node and future bootstrap attempts at choosing a location will be thrown off.
• When bootstrapping a new node, existing nodes have to divide the key space before beginning replication. This can take awhile, so be patient.
• During bootstrap, a node will drop the Thrift port and will not be accessible from nodetool.
• Bootstrap can take many hours when a lot of data is involved. See Streaming for how to monitor progress.
• Cassandra is smart enough to transfer data from the nearest source node(s), if your EndpointSnitch is configured correctly. So, the new node doesn't need to be in the same datacenter as the primary replica for the Range it is bootstrapping into, as
long as another replica is in the datacenter with the new one.
• Bootstrap progress can be monitored using nodetool with the netstats argument (0.7 and later) or streams (Cassandra 0.6).
• During bootstrap nodetool may report that the new node is not receiving nor sending any streams, this is because the sending node will copy out locally the data they will send to the receiving one, which can be seen in the sending node through the
the "AntiCompacting... AntiCompacted" log messages.
• Moving or Removing nodes
• Removing nodes entirely
• You can take a node out of the cluster with nodetool decommission to a live node, or nodetool removetoken (to any other machine) to remove a dead one. This will assign the ranges the old node was responsible for to other nodes, and replicate the
appropriate data there. If decommission is used, the data will stream from the decommissioned node. Ifremovetoken is used, the data will stream from the remaining replicas.
• No data is removed automatically from the node being decommissioned, so if you want to put the node back into service at a different token on the ring, it should be removed manually.
• Moving nodes
• nodetool move: move the target node to a given Token. Moving is both a convenience over and more efficient than decommission + bootstrap. After moving a node, nodetool cleanup should be run to remove any unnecessary data.
• As with bootstrap, see Streaming for how to monitor progress.
• Load balancing
• If you add nodes to your cluster your ring will be unbalanced and only way to get perfect balance is to compute new tokens for every node and assign them to each node manually by using nodetool move command.
• Here's a python program which can be used to calculate new tokens for the nodes. There's more info on the subject at Ben Black's presentation at Cassandra Summit 2010. http://www.datastax.com/blog/slides-and-videos-cassandra-summit-2010
• def tokens(nodes):
– for x in xrange(nodes):
• print 2 ** 127 / nodes * x
• In versions of Cassandra 0.7.* and lower, there's also nodetool loadbalance: essentially a convenience over decommission + bootstrap, only instead of telling the target node where to move on the ring it will choose its location based on the same
heuristic as Token selection on bootstrap. You should not use this as it doesn't rebalance the entire ring.
• The status of move and balancing operations can be monitored using nodetool with the netstat argument. (Cassandra 0.6.* and lower use the streams argument).
• Replacing a Dead Node
• Since Cassandra 1.0 we can replace a dead node with a new one using the property "cassandra.replace_token=<Token>", This property can be set using -D option while starting cassandra demon process.
• (Note:This property will be taken into effect only when the node doesn't have any data in it, You might want to empty the data dir if you want to force the node replace.)
• You must use this property when replacing a dead node (If tried to replace an existing live node, the bootstrapping node will throw a Exception). The token used via this property must be part of the ring and the node have died due to various reasons.
• Once this Property is enabled the node starts in a hibernate state, during which all the other nodes will see this node to be down. The new node will now start to bootstrap the data from the rest of the nodes in the cluster (Main difference between
normal bootstrapping of a new node is that this new node will not accept any writes during this phase). Once the bootstrapping is complete the node will be marked "UP", we rely on the hinted handoff's for making this node consistent (Since we don't
accept writes since the start of the bootstrap).
• Note: We Strongly suggest to repair the node once the bootstrap is completed, because Hinted handoff is a "best effort and not a guarantee".
16. VOLDEMORT
https://github.com/voldemort/voldemort/wiki/Voldemort-Rebalancing
• What are the actual steps performed during rebalancing?
• Following are the steps that we go through to rebalance successfully. The controller initiates the rebalancing and then waits for the completion.
• Input
– Either (a) current cluster xml, current stores xml, target cluster xml (b) url, target cluster xml
– batch size – Number of primary partitions to move together. There is a trade-off, more primary partitions movements = more redirections.
– max parallel rebalancing – Number of units ( stealer + donor node tuples ) to move together
• Get the latest state from the cluster
Compare the targetCluster.xml provided and add all new nodes to currentCluster.xml
• Verify that
– We are not in rebalancing state already ( “./bin/voldemort-admin-tool.sh —get-metadata server.state —url [url+” all returns NORMAL_SERVER and “./bin/voldemort-admin-tool.sh —get-metadata rebalancing.steal.info.key —url [url+” all returns “*+” i.e. no rebalancing state )
– RO stores ( if they exist ) are all using the latest storage format ( “./bin/voldemort-admin-tool.sh —ro-metadata storage-format —url [url+” returns all stores with “ro2” format )
• Get a list of every primary partition to be moved
• For every “batch” of primary partitions to move
– Create a transition cluster metadata which contains movement of “batch size” number of primary partitions
– Create a rebalancing plan based on this transition cluster.xml and the current state of the cluster. The plan generated is a map of stealer node to list of donor node + partitions to be moved.
– State change step
• has_read-only_stores AND has_read-write_stores => Change the rebalancing state [ with RO stores information ] change on all stealer nodes
• has_read-only_stores => Change the rebalancing state change on all stealer nodes
– Start multiple units of migration [ unit => stealer + donor node movement ] of all RO Stores data [ No changes to the routing layer and clients ]. At the end of each migration delete the rebalancing state => Done with parallelism ( max parallel rebalancing )
– State change step [ Changes in rebalancing state will kick in redirecting stores which will start redirecting requests to the old nodes ]
• hasROStore AND hasRWStore => Change the cluster metadata + Swap on all nodes AND Change rebalancing state [ with RW stores information ] on all stealer nodes
• hasROStore AND !hasRWStore => Change the cluster metadata + Swap on all nodes
• !hasROStore AND hasRWStore => Change the cluster metadata on all nodes AND Change rebalancing state [ with RW stores information ] on all stealer nodes
– Start multiple units of migration of all RW Stores data [ With redirecting happening ]. At the end of migration delete the rebalancing state => Done with parallelism ( max parallel rebalancing )
• What about the failure scenarios?
• Extra precaution has been taken and every step ( 5 [ c,d,e,f ] ) has a rollback strategy
• Rollback strategy for
5 ( c , e ) => If any failure takes place during the ‘State change step’, the following rollback strategy is run on every node that was completed successfully
• Swap ROChange cluster metadataChange rebalance stateOrderFTTremove the rebalance state change → change back clusterFFTremove from rebalance state changeTTFchange back cluster metadata → swapTTTremove from rebalance state change → change back cluster → swap5 ( d, f ) => Similarly during migration of partitions
• Has RO storesHas RW storesFinished RO storesRollback Action [ in the correct order ]TTTrollback cluster change + swapTTFnothing to do since “rebalance state change” should have removed everythingTFTwon’t be triggered since hasRW is falseTFFnothing to do since “rebalance state change” should have removed everythingFTTrollback cluster changeFTFwon’t be triggeredFFTwon’t be triggeredFFFwon’t be triggeredWhat happens on the stealer node side during 5 ( d, f )?
• The stealer node on receiving a “unit of migration” * unit => single stealer + single donor node migration + and does the following
• Check if the rebalancing state change was already done [ i.e. 5 ( c, e ) was successfully completed ]
• Acquire a lock for the donor node [ Fail if donor node was already rebalancing ]
• Start migration of the store partitions from the donor node => PARALLEL [ setMaxParallelStoresRebalancing ]. At the end of every store migration remove it from the list rebalance state change [ so as to stop redirecting stores ]
• What about the donor side?
• The donor node has no knowledge for rebalancing at all and keeps behaving normally.
• What about my data consistency during rebalancing?
• Rebalancing process has to maintain data consistency guarantees during rebalancing, We are doing it through a proxy based design. Once rebalancing starts the stealer node is the new master for all rebalancing partitions, All the clients talk directly to stealer node for all the requests for these partitions. Stealer node internally make proxy calls to the original donor node to return correct data back to the client.The process steps are
• Client request stealer node for key ‘k1’ belonging to partition ‘p1’ which is currently being migrated/rebalanced.
• Stealer node looks at the key and understands that this key is part of a rebalancing partition
• Stealer node makes a proxy call to donor node and gets the list of values as returned by the donor node.
• Stealer node does local put for all (key,value) pairs ignoring all ObsoleteVersionException
• Stealer node now should have all the versions from the original node and now does normal local get/put/ getAll operations.
• And how do my clients know the changes?
• Voldemort client currently bootstrap from the bootstrap URL at the start time and use the returned cluster/stores metadata for all subsequent operation. Rebalancing results in the cluster metadata change and so we need a mechanism to tell clients that they should rebootstrap if they have old metadata.
• Client Side routing bootstrapping: Since the client will be using the old metadata during rebalancing the server now throws anInvalidMetadataException if it sees a request for a partition which does not belong to it. On seeing this special exception the client is re-bootstraps from the bootstrap url and will hence pick up the correct cluster metadata.
• Server side routing bootstrapping: The other method of routing i.e. make calls to any server with enable_routing flag set to true and with re-routing to the correct location taking place on the server side ( i.e. 2 hops ). In this approach we’ve added a RebootstrappingStore which picks up the new metadata in case of change.
• How to start rebalancing?
• Step 1: Make sure cluster is not doing rebalancing.
– The rebalancing steal info should be null
./bin/voldemort-admin-tool.sh —get-metadata rebalancing.steal.info.key —url [ url ]
– The servers should be NORMAL_STATE
./bin/voldemort-admin-tool.sh —get-metadata server.state —url [ url ]
• To check whether the keys were moved correctly we need to save some keys and later check if they have been migrated to their new locations
./bin/voldemort-rebalance.sh —current-cluster [ current_cluster ] —current-stores [ current_stores_path ] —entropy false —output-dir * directory where we’ll store the keys for later use. Keys are stored on a per store basis +
• Generate the new cluster xml. This can be done either by hand or if you want to do it automatically here is a tool to do to
./bin/voldemort-rebalance.sh —current-cluster [ current_cluster_path ] —target-cluster [ should be the same as current-cluster but with new nodes put in with empty partitions ] —current-stores [ current_stores_path ] —generate
• Run the new metadata through key-distribution generator to get an idea of skew if any. Make sure your standard deviation is close to 0.
./bin/run-class.sh voldemort.utils.KeyDistributionGenerator —cluster-xml [ new_cluster_xml_generated_above ] —stores-xml [ stores_metadata ]
• Use the new_cluster_xml and run the real rebalancing BUT first with —show-plan ( to check what we’re going to move )
./bin/voldemort-rebalance.sh —url [ url ] —target-cluster [ new_cluster_metadata_generated ] —show-plan
• Run the real deal
./bin/voldemort-rebalance.sh —url [ url ] —target-cluster [ new_cluster_metadata_generated ]
• Monitor by checking the async jobs
./bin/voldemort-admin-tool.sh —async get —url [ url ]
• The following takes place when we are running the real rebalancing (i) Pick batch of partitions to move (ii) Generate transition plan (iii) Execute the plan as a series of ‘stealer-donor’ node copying steps. By default we do only one ‘stealer-donor’ tuple movement at once. You can increase this by setting —parallelism option.
• If anything fails we can rollback easily ( as long as —delete was not used while running voldemort-rebalance.sh )
• To stop an async process
./bin/voldemort-admin-tool.sh —async stop —async-id [comma separated list of async jobs ] —url [ url ] —node [ node on which the async job is running ]
• To clear the rebalancing information on a particular node
./bin/voldemort-admin-tool.sh —clear-rebalancing-metadata —url [ url ] —node [ node-id ]
• Following are some of the configurations that will help you on the server side
• Parameter on serverDefaultWhat it doesenable.rebalancingtrueShould be true so as to run rebalancingmax.rebalancing.attempts3Once a stealer node receives the plan to copy from a donor node, it will attempt this many times to copy the data ( in case of failure )rebalancing.timeout.seconds10 * 24 * 60 * 60Time we give for the server side rebalancing to finish copying data from a donor nodemax.parallel.stores.rebalancing3Stores to rebalance in parallelrebalancing.optimizationtrueSome times
we have data stored without being partition aware ( Example : BDB ). In this scenario we can run an optimization phase which ignores copying data over if a replica already existsWhat is left to make this process better?
• a) Execute tasks should be smarter and choose tasks to execute so as to avoid two disk sweeps happening on the same node.
• b) Fix deletes! – Make it run at the end instead of in the middle * Even though I’ll never run this in production +
• c) Logging – Currently we propagate the message at the lowest level all the way to the top. Instead we should try to make a better progress bar ( “Number of stores completed – 5 / 10” ) and push that upwards.
• d) Currently the stealer node goes into REBALANCING_MASTER state and doesn’t allow any disk sweeps ( like slop pusher job, etc ) from not taking place. But what about the poor donor node
17. HAND-MADE SHARDING
apt-get install x-sql
./configure
Мигрируем
• легкий способ:
1. стопнуть сервис
2. запустить скрипты миграции
3. запустить сервис
• сложный способ:
1. дописать код работающий с 2мя партициями одновременно
2. научить приложение понимать, когда идет миграция и что надо
читатьписать в ОБА места
3. следить чтоб при переезде не разбалансировался кластер
4. ощутить после переезда всю прелесть неучтенных
многопоточных ситуацийнеконсистентных данных
19. RIAK: КогдаГде ?
1. мне критична высоконадежность и
высокодоступность
2. у меня миллионы
юзеровприложенийстраниц
топиковфайловобьектовлент новостей
3. они между собой не связаны (нет real-time
GROUP BY/JOIN)
4. можно для аналитики типа time-series data
5. можно слепить ежа с ужом и прикрутить Riak
core например к Lucene
23. Riak: дополнительные плюшки
• Postpre-commit хуки:
• Поиск – это тот же пре-коммит хук
• Интеграция с RabbitMq
https://github.com/jbrisbin/riak-rabbitmq-commit-hooks
• Map-reduce.
• На эрланге =)
• На javascript`e.
• По заданному списку ключей!
29. RIAK: Неудобства и WTF`S
Неудобства:
• Adhoc запросы невозможны, нужно ЗАРАНЕЕ
подумать о своих паттернах доступа
• COUNTGROUP BY, только пред-агрегацией
(map-reduce`ом довольно не удобно), либо
хранить счетчики в Redis`е
• Eventual consistency, планируем свои данные
чтоб уметь мерджить
• Нет паджинации и сортировки - либо
редис, либо что-то вроде linked-list`a
30. RIAK: Неудобства и WTF`S
WTF’S:
• BUCKET = PREFIX
• «урезанный» map-reduce
• Восстановление данных только по read-
repair на чтении, данные сами не починятся
(без EDS)
• RiakSearch – нет анти-энтропии
• Специфичные проблемы, каждого из
бекэндов
39. RIAK: Стоит или не стоит ?
• У вас есть проблема с количеством IOPS.
• Вам важна высокодоступность и хочется минимум
проблем при расширение кластера
• Вы готовы заранее полностью продумать то, как
ваши данные положить на kv
• Вы знаете, как будете решать конфликты
• Вам не нужны транзакции (pessimisticoptimistic
locking)
• Вы не боитесь erlang`a =) И не боитесь обратиться в
community, чтобы получить ответ на свой вопрос