This document summarizes a talk given by Noa Resare about Spotify's use of Cassandra for persistent storage. It introduces Spotify and its service needs, discusses why Cassandra was chosen, how it has performed, and some lessons learned. Key points covered include how Cassandra solved Spotify's storage problems through sharding and replication, its fast write speeds, flexibility for playlists, and potential for upgrades without downtime. Challenges with backups and solutions using incremental backups and a separate backup datacenter are also summarized.
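As a rough illustration of the sharding and replication idea mentioned above, the sketch below places keys on a hash ring and returns the nodes that would hold copies of each key. It is a simplified model under stated assumptions, not Cassandra's actual partitioner: the `Ring` class, the MD5-based `token` function, one token per node, and the node names are all hypothetical choices made for this example.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a position on the ring (illustrative hash choice)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class Ring:
    """Toy hash ring: one token per node, replicas on consecutive nodes."""

    def __init__(self, nodes, replication_factor=3):
        # Never ask for more replicas than there are nodes.
        self.rf = min(replication_factor, len(nodes))
        self.ring = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, key):
        """Return the rf nodes responsible for key, walking clockwise."""
        start = bisect.bisect_right(self.tokens, token(key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(self.rf)]


# Hypothetical node names, for illustration only.
ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"])
owners = ring.replicas("playlist:42")
```

Each key maps to a fixed set of nodes, so data is spread across the cluster (sharding) while each key still lives on several machines (replication).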
Less Is More: Novel Approaches to MySQL Compression for Modern Data Sets - Pe... | Ernie Souhrada
As data volume grows, finding ways to slow the growth velocity becomes more and more important. We want to do everything possible to maximize the efficiency of our hardware before we spend the money on more storage, so one way to do that is with compression. These slides discuss compression theory and compression options in MySQL, ending with some benchmark data that compares column-level compression in InnoDB with other available compression technologies. Presented at Percona Live 2016.
Operational Buddhism: Building Reliable Services From Unreliable Components -... | Ernie Souhrada
Operational Buddhism is a philosophy for building cloud-based services by embracing the inherent ephemerality of the servers themselves and designing failure-resilient services. Attachment to servers leads to suffering. Presented at Percona Live 2016.
C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz | DataStax Academy
At Spotify, we see failure as an opportunity to learn. During the two years we've used Cassandra in our production environment, we have learned a lot. This session touches on some of the exciting design anti-patterns, performance killers and other opportunities to lose a finger that are at your disposal with Cassandra.
Scaling Cassandra in all directions - Jimmy Mardell Spotify | Evention
At Spotify we run over 100 Cassandra clusters, from small 3-node clusters to clusters with up to 100 nodes. Many of them are multi-datacenter clusters. I will talk about the challenges of having so many clusters and what tools we are using and have built for managing them. There will also be some war stories of when we have failed.
SOA stands for Service-Oriented Architecture, which is an architectural style that defines how components should be loosely coupled, modular, and independent. SOA breaks down applications into black-box components that communicate through well-defined interfaces and share business logic, processes, and data across a network. This allows the components to be reused and replaced without disrupting the entire system. In contrast, traditional applications bundle components together, so changing one part requires changing the whole application. SOA takes a modular approach where components like speakers, amplifiers, and players can be mixed and matched as needed.
Slides from a talk at a meetup organized by SF Scala at Spotify's San Francisco office. The slides present details of playlist recommendations at Spotify and how Spotify uses Scalding to develop robust and reliable pipelines to generate these recommendations.
Meetup details: http://www.meetup.com/SF-Scala/events/224430674/
The document summarizes lessons learned by Spotify about scaling infrastructure and operations. Some key points include: starting with letting experts handle data centers when small, streamlining procurement processes, treating capacity in standardized "pods", focusing infrastructure teams on platforms rather than individual services, implementing automated processes for configuration, provisioning and monitoring, and having individual product teams take on operational responsibilities for their own services with guidance from infrastructure teams. The presentation also covers specific scaling challenges faced with storage, networking, and resilience strategies like retry policies and load shedding.
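The resilience strategies named above, retry policies and load shedding, can be sketched roughly as follows. This is a minimal illustration under assumptions, not Spotify's implementation: the `retry` helper, the `LoadShedder` class, and all parameter names and default values are hypothetical.

```python
import random
import time


def retry(call, attempts=4, base_delay=0.1, max_delay=2.0, sleep=time.sleep):
    """Call `call()` up to `attempts` times with capped exponential backoff.

    A random delay (jitter) keeps many clients from retrying in lockstep.
    """
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))


class LoadShedder:
    """Reject new work once too many requests are already in flight."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self.in_flight = 0

    def try_acquire(self):
        if self.in_flight >= self.max_in_flight:
            return False  # shed: caller should fail fast
        self.in_flight += 1
        return True

    def release(self):
        self.in_flight -= 1
```

The two ideas complement each other: shedding keeps an overloaded service responsive for the requests it does accept, and backoff with jitter keeps retrying clients from making the overload worse.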
The Evolution of Spotify Home Architecture - Qcon 2019 | Karthik Murugesan
This talk will take the audience through the evolution of Spotify's architecture that serves recommendations (playlist, albums, etc) on the home tab. We'll discuss the tradeoffs of the different architectural decisions we made and how we went from batch pipelines to services to a combination of services and streaming pipelines.
SF Big Analytics: Introduction to Succinct by UC Berkeley AmpLab | Chester Chen
Topic: Introduction to Succinct by UC Berkeley AmpLab.
"Cloud services today need to perform fast, interactive queries on large data volumes. Several recent studies have shown that data is growing faster than memory capacity, making in-memory query execution increasingly challenging. At UC Berkeley, we have built Succinct, a distributed data store that overcomes this problem by enabling a wide range of interactive queries (e.g., search, random access, range queries, and even regular expressions) directly on compressed data. Besides its ability to execute queries on compressed data, Succinct differs from existing data stores along several dimensions. First, Succinct unifies several powerful data models (key-value stores, document stores, tables, etc.) using a single interface. Second, Succinct enables applications to choose a desired compression factor, allowing applications to use larger memory for improved performance. Finally, Succinct allows applications to change the compression factor on the fly, enabling new approaches to handling skewed query distributions, time-varying loads, and failure tolerance. In this talk, I will describe Succinct's design, implementation and semantics. Succinct is completely open-sourced, and we have also released Succinct as a library that simplifies integration of Succinct data structures and techniques with existing data stores.”
Speaker bio:
"Anurag is a graduate student at AMPLab, UC Berkeley, where he is advised by Prof. Ion Stoica. He co-created Succinct with Rachit Agarwal and Ion Stoica."
You can find more information about the project here: http://succinct.cs.berkeley.edu/wp/wordpress/?p=143
Free The Enterprise With Ruby & Master Your Own Domain | Ken Collins
On the heels of Luis Lavena's RailsConf talk "Infiltrating Ruby Onto The Enterprise Death Star Using Guerilla Tactics" comes a local and frank talk about the current state of Open Source Software (OSS) participation from Windows developers. Learn what OSS is, what motivates its contributors, and how OSS can make you a stronger developer. Be prepared to fall in love with writing software again!
We will start off with a 101 introduction to both the Ruby programming language and the Ruby on Rails web application framework. You will learn about ActiveRecord, a powerful ORM that maps rich objects to your databases, and the latest components to use it with SQL Server. As a Rails core contributor and author of the SQL Server stack, I will give you a modern insight into both that will allow you to leverage your legacy data with Ruby.
Lastly, I will review the bleeding edge tools being actively created for Windows developers to ease the transition to Ruby, Rails and OSS from a POSIX driven world. Many things have changed. It is time to learn and perform some occupational maintenance.
Matt Franklin - Apache Software (Geekfest) | W2O Group
The document discusses the potential benefits of container technologies like Docker. It notes that containers offer significantly higher density than virtual machines by avoiding hypervisor overhead. This density improvement can lead to major cost reductions by reducing infrastructure needs. Containers also improve developer efficiency by making development environments portable and disposable. This allows more rapid experimentation and innovation, potentially translating to increased revenue. Technologies like Amazon Lambda take the on-demand aspects of containers even further by abstracting compute resources. The document promotes StackEngine as a solution for managing containers at scale in production environments.
The document discusses the requirements and basics of interacting with databases using Perl. Doing so requires the DBI module, which provides the database interface, and a DBD driver specific to the database. It provides examples of simple queries to retrieve letter counts of last names and barcodes of patrons, demonstrating prepared statements, nested queries, and the benefits of binding variables. Chunking queries in large loops is more efficient than retrieving all records at once when working with BLOB fields.
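The ideas in that summary (bound variables and fetching results in chunks) can be sketched in a few lines. The original deck uses Perl's DBI; the example below swaps in Python's standard `sqlite3` module purely for illustration. The `patrons` table, its columns, the sample rows, and the helper names are all assumptions, and "letter counts of last names" is read here as counts per first letter, which is only one possible interpretation.

```python
import sqlite3

# In-memory database with a hypothetical patrons table and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patrons (barcode TEXT PRIMARY KEY, last_name TEXT)")
conn.executemany(
    "INSERT INTO patrons VALUES (?, ?)",
    [("1001", "Adams"), ("1002", "Allen"), ("1003", "Baker")],
)


def barcodes_for(last_initial):
    """Bound parameter (?) instead of string interpolation."""
    cur = conn.execute(
        "SELECT barcode FROM patrons WHERE last_name LIKE ? ORDER BY barcode",
        (last_initial + "%",),
    )
    return [row[0] for row in cur]


def initial_counts():
    """Number of patrons per first letter of the last name."""
    cur = conn.execute(
        "SELECT substr(last_name, 1, 1) AS initial, COUNT(*) "
        "FROM patrons GROUP BY initial ORDER BY initial"
    )
    return cur.fetchall()


def in_chunks(cursor, size=2):
    """Yield rows in batches rather than loading the full result set."""
    while True:
        rows = cursor.fetchmany(size)
        if not rows:
            break
        yield rows
```

Binding values keeps user input out of the SQL text and lets the statement be reused, while chunked fetching bounds memory use when individual rows are large.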
This document provides a recap of the AWS re:Invent 2016 conference. Some key details include:
- The conference was spread over 4 hotels in Las Vegas over 4 days with over 30,000 attendees and over 400 partner sponsors.
- There were over 680 sessions covering topics like AWS Lambda, Well Architected frameworks, and database migration.
- The agenda includes sessions on re:Invent recaps, new AWS services like Lambda and technical discussions on architecture best practices.
- Networking opportunities are provided through breakfast, breaks and lunch to interact with AWS technical experts and partners.
Maximum Uptime Cluster Orchestration with Ansible | ScyllaDB
Ansible is a flexible orchestration tool for ScyllaDB clusters. Learn tried and tested patterns with Ansible to maximize the uptime of your ScyllaDB clusters when making changes.
This talk will go over tangible code snippets to help operators and developers safely make changes to their ScyllaDB clusters. These are tips learned the hard way in production so you don't have to.
Come along and learn how to orchestrate your ScyllaDB clusters with confidence and not have to take a maintenance window in the middle of the night.
This document discusses NoSQL databases and when they should be used. It describes what NoSQL databases are, when to consider using one over a relational database, and introduces DynamoDB as an AWS NoSQL solution. Specific topics covered include the differences between relational and NoSQL data models, common use cases for NoSQL databases, and how to access and query DynamoDB tables.
Modern Release Engineering in a Nutshell - Why Researchers should Care! | Bram Adams
Invited talk at the Leaders of Tomorrow Symposium of the 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2016).
The presentation (and its accompanying paper, see http://mcis.polymtl.ca/publications/2016/fose.pdf) explains the basics of release engineering pipelines, common challenges industry is facing, and pitfalls software engineering researchers are falling into.
Speakers are Bram Adams (MCIS, http://mcis.polymtl.ca) and Shane McIntosh (McGill University, http://shanemcintosh.org).
A video-taped version of the talk will be available soon at https://www.youtube.com/channel/UCL8yG6qpHk7V66l1Jt3aZrA/featured.
Test driven development is a popular concept in software development, leading to higher quality code that’s easier to maintain. Automated testing is normally a foreign concept in the operations world, but as you ssh into your servers to make that quick fix or run your updated script (fingers crossed), you might be wondering if there’s a better way: a way that gives you confidence in your scripts and lets you test them in isolation. Well, Arthur has good news for you: there is a better way! Test driven infrastructure (TDI) is now possible. He knows, it sounds crazy.
At this session you’ll learn the how, and more importantly the why, of TDI. You’ll see how Chef (or any other Config Management framework) can be tested with Test Kitchen and ServerSpec. You’ll also learn how to improve your feedback cycle with Docker, and using the Docker approach on a CI server. There may even be some live demos!
Finally, the Ops world collides with the Dev world in true DevOps testing bliss.
This document summarizes Mitch Pirtle's presentation on Joomla! extreme performance. The presentation covered assessing performance from all aspects including server output, network throughput, and browser footprint. It provided tips on establishing a performance baseline, optimizing server output through caching and tuning, improving network throughput with gzipping and CDNs, and reducing browser footprint by consolidating assets and removing inline styling. It also addressed scaling issues and tools for debugging performance problems.
The document discusses using the TPC-C benchmark to study Firebird database performance under load. It describes running tests with different Firebird configurations, hardware, and database sizes to determine optimal settings. Analysis found page size, buffer size, and hash slots impact performance, but settings optimized for HDDs did not always help SSD performance which responded differently. The tests provided valuable insights into Firebird performance tuning but also showed more analysis is needed to optimize configurations for different hardware.
The computer science behind a modern distributed data store | J On The Beach
What we see in the modern data store world is a race between different approaches to achieve distributed and resilient storage of data. Every application needs a stateful layer which holds the data. There are at least three necessary components which are anything but trivial to combine and, of course, even more challenging when aiming for acceptable performance.
Over the past years there has been significant progress in both the science and practical implementations of such data stores. In his talk, Max Neunhoeffer will introduce the audience to some of the needed ingredients, address the difficulties of their interplay and show four modern approaches of distributed open-source data stores (ArangoDB, Cassandra, Cockroach and RethinkDB).
The document discusses challenges and approaches for implementing internationalization (i18n) and right-to-left (RTL) support in web applications. It covers creating localized test data, implementing RTL by overriding or separating CSS styles, challenges in testing RTL functionality, and automating CSS for left-to-right and right-to-left variants. Examples of automated CSS testing and limitations of dynamic attribute testing are also provided.
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB | ScyllaDB
In this talk AWS’ Ken Krupa, Head of Specialized Solutions Architecture, will describe the architecture and capabilities of two new AWS EC2 instance types perfect for data-intensive storage and IO-heavy workloads like ScyllaDB: the Intel-based I4i and the Graviton2-based I4g series.
The Intel Xeon Ice Lake-based I4i series provides unparalleled raw horsepower for your most demanding workloads. Meanwhile, the Graviton2-powered I4g instances provide lower cost per storage on a power-efficient platform to deploy your cloud-native applications.
Ken will also describe the AWS Nitro SSD, a new form of high-speed NVMe storage with a Flash Translation Layer built with Nitro controllers, which powers both of these instance families.
ScyllaDB VP of Product Tzach Livyatan will then share benchmarking results showing how ScyllaDB behaves under load on these two instance types, providing maximum system utility and efficiency.
To watch all of the recordings hosted during Scylla Summit 2022 visit our website here: https://www.scylladb.com/summit.
Redis is a key-value store that can be used as a database, cache, and message broker. It supports basic data structures like strings, hashes, lists, sets, sorted sets with operations that are fast thanks to storing the entire dataset in memory. Redis also provides features like replication, transactions, pub/sub messaging and can be used for caching, queueing, statistics and inter-process communication.
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr... | DataStax
- Repair is a maintenance operation that restores consistency in Cassandra by comparing and syncing data across nodes. It is needed due to eventual consistency and to ensure safe deletes.
- Traditional full repair reads and compares all data partitions, while incremental repair only repairs data that has changed since the last repair.
- Automated repair tools like Spotify's Cassandra Reaper help orchestrate repairs across large clusters to limit their impact on performance and availability. Future improvements may further reduce the need to manually manage repairs.
Communications Mining Series - Zero to Hero - Session 1 | DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Best 20 SEO Techniques To Improve Website Visibility In SERPPixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
GraphRAG for Life Science to increase LLM accuracyTomaz Bratanic
GraphRAG for life science domain, where you retriever information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
8. About this talk
An introduction to Spotify, to our service and our persistent storage needs
What Cassandra brings
What we have learned
What I would have liked to have known a year ago
Not a comparison between different NoSQL solutions
The real reason: yes, we are hiring.
22. Spotify — all music, all the time
A better user experience than file sharing.
Native desktop and mobile clients.
Custom backend, built for performance and scalability.
13 markets. More than ten million users.
3 datacenters.
Tens of gigabits of data pushed per datacenter.
Backend systems that support a large set of innovative features.
31. Innovative features in practice
Playlist
A named list of tracks
Keep multiple devices in sync
Support nested playlists
Offline editing, pubsub
Scale. More than half a billion lists currently in the system
About 10 kHz (roughly 10,000 requests per second) at peak traffic.
Result: accidentally implemented a version control system (VCS)
42. Suggested solutions
Flat files
We don’t need ACID
Linux page cache kicks ass.
(Not really)
SQL
Tried and true. Facebook does this
Simple Key-Value store
Tokyo Cabinet, some experience
Clustered Key-Value store
Evaluated a lot; the end-game contestants were HBase and Cassandra
51. Enter Cassandra
Solves a large subset of storage-related problems
Sharding, replication
No single point of failure
Free software
Active community, commercial backing
66 + 18 + 9 + 28 production nodes
About twenty nodes for various testing clusters
Datasets ranging from a few gigabytes to 8 TB.
60. Cassandra, winning!
Major upgrades without service interruptions (in theory)
Crazy fast writes
Not just because you have a hardware RAID card that is good at lying to you
Uses the knowledge that sequential I/O is faster than random I/O
In case of inconsistencies, knows what to do
Cross datacenter replication support
Tinker friendly
Readable code
68. Cassandra flexibility for Playlist
The main use cases for playlist:
Get me all changes since version N of playlist P
Apply the following changes on top of version M of playlist Q
This translates to two column families (CFs): head and change
Asymmetric sizes
Neat trick: read change with consistency level ONE, fall back to LOCAL_QUORUM
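The "neat trick" above can be sketched as follows. This is a minimal illustration, not the code Spotify ran: the `Level` enum and the `read` callback are hypothetical stand-ins for whatever the real driver exposes, and the sketch assumes a missing result at ONE simply means the contacted replica has not seen the row yet.

```java
import java.util.Optional;
import java.util.function.BiFunction;

final class FallbackRead {
    // Hypothetical stand-in for a driver's consistency level type.
    enum Level { ONE, LOCAL_QUORUM }

    // Try the cheap read first; only pay for a quorum read when the
    // single replica we asked returned nothing.
    static <K, V> Optional<V> readWithFallback(
            K key, BiFunction<K, Level, Optional<V>> read) {
        Optional<V> cheap = read.apply(key, Level.ONE);
        if (cheap.isPresent()) {
            return cheap;
        }
        return read.apply(key, Level.LOCAL_QUORUM);
    }
}
```

A pattern like this is presumably only safe when a row, once written, is never modified, so that any replica holding it holds the correct value; the slides do not state this explicitly.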
77. Let me tell you a story
Latest stable kernel from Debian Squeeze: 2.6.32-5
What happens after 209 days of uptime?
Load average around 120.
No CPU activity reported by top
Mattias de Zalenski:
log((209 days) / (1 nanoseconds)) / log(2) = 54.0034557
(2^54) nanoseconds = 208.499983 days
Somewhere nanosecond values are shifted ten bits?
Downtime for payment
Downtime for account creation
No downtime for Cassandra-backed systems
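The arithmetic quoted from Mattias de Zalenski can be checked with a few lines of Java. The class and method names below are made up for illustration; only the numbers come from the slide.

```java
import java.time.Duration;

final class UptimeOverflow {
    // Number of days covered by 2^bits nanoseconds.
    static double daysForNanosecondBits(int bits) {
        double nanos = Math.pow(2, bits);
        double nanosPerDay = (double) Duration.ofDays(1).toNanos();
        return nanos / nanosPerDay;
    }

    // Base-2 logarithm of the number of nanoseconds in the given days.
    static double bitsForDays(double days) {
        double nanos = days * Duration.ofDays(1).toNanos();
        return Math.log(nanos) / Math.log(2);
    }

    public static void main(String[] args) {
        System.out.printf("2^54 ns in days: %.6f%n", daysForNanosecondBits(54));
        System.out.printf("log2(209 days in ns): %.7f%n", bitsForDays(209));
    }
}
```

A 64-bit counter that loses ten bits to a shift has 54 bits left, which is how 2^54 nanoseconds lines up with the roughly 209 days of uptime observed.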
83. Backups
A few terabytes of live data, many nodes.
Painful.
Inefficient. Copy of the on-disk structure, at least 3 times the data
Non-compacted. Possibly a few tens of old versions.
Pulling data off nodes evicts hot data from the page cache.
Initially, only full backups (pre-0.8)
91. Our solution to backups
NetworkTopologyStrategy is cool
Separate datacenter for backups with RF=1
Beware: tricky
Once removed from production performance considerations
Application-level incremental backups
This week, Cassandra-level incremental backups
Still some issues: lots of SSTables
100. Solid state is a game changer
Asymmetrically sized datasets
I Can Haz superlarge SSD?
No.
With small disks, on-disk data structure size matters a lot
Our plan:
Leveled compaction strategy, new in 1.0
Hack Cassandra to have configurable data directories per keyspace.
Our patch is integrated in Cassandra 1.1
106. Some unpleasant surprises
Immaturity.
Has anyone written nodetool -h ring?
Broken on-disk bloom filters in 0.8. Very painful upgrade to 1.0
Small disk, high load: very possible to get into an out-of-disk condition
Logging is lacking
116. Lessons learned from backup datacenter
Asymmetric cluster sizes are painful.
60 production nodes, 6 backup nodes
Repairs that replicate all data 10 times
The workaround: manual repairs
Remove SSTables from the broken node (to free up space)
Start it to have it take writes while repopulating
Snapshot and move SSTables from 4 evenly spaced nodes
Do a full compaction
Do a repair and hope for the best
123. Spot the bug
Hector Java Cassandra driver:
private AtomicInteger counter = new AtomicInteger();
private Server getNextServer() {
    counter.compareAndSet(16384, 0);
    return servers[counter.getAndIncrement() % servers.length];
}
Race condition
java.lang.ArrayIndexOutOfBoundsException
After close to 2**31 requests
Took a few days
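Reading the snippet above: `compareAndSet(16384, 0)` only resets the counter if it is exactly 16384 at the moment of the check. If two threads both pass the check at 16383 and then both increment, the counter jumps to 16385 and is never reset again. It then keeps growing until it wraps to a negative value, and Java's `%` on a negative operand yields a negative index. One possible fix, sketched here with a hypothetical `RoundRobin` class (not the patch that went into Hector), is to drop the reset and use `Math.floorMod`:

```java
import java.util.concurrent.atomic.AtomicInteger;

final class RoundRobin<T> {
    private final T[] servers;
    private final AtomicInteger counter;

    RoundRobin(T[] servers) {
        this(servers, 0);
    }

    // The start value is exposed only so the wrap-around can be exercised.
    RoundRobin(T[] servers, int start) {
        if (servers.length == 0) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = servers.clone();
        this.counter = new AtomicInteger(start);
    }

    T next() {
        // floorMod returns a value in [0, length) for a positive length,
        // even after the counter overflows to a negative number.
        int index = Math.floorMod(counter.getAndIncrement(), servers.length);
        return servers[index];
    }
}
```

The rotation is not perfectly even at the moment the counter wraps, but the index always stays in bounds and no separate reset step is needed.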
124. Thrift payload size limits
Communication with Cassandra is based on Thrift
Large mutations, larger than 15 MiB
Thrift drops the underlying TCP connection
Hector considers the connection drop a node-specific problem
Retries on all Cassandra nodes
Effectively shutting down all Cassandra traffic
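One way to avoid the failure chain above is to keep each request under the limit on the client side. The sketch below is an assumption-laden illustration: `MutationBatcher` is a hypothetical helper, the 15 MiB figure is taken from the slide, and payload length is used as a rough proxy for serialized size (framing overhead is ignored).

```java
import java.util.ArrayList;
import java.util.List;

final class MutationBatcher {
    // Limit quoted on the slide; a real deployment may use a different value.
    static final long LIMIT_BYTES = 15L * 1024 * 1024;

    // Greedily group payloads so that no batch exceeds limitBytes.
    static List<List<byte[]>> split(List<byte[]> payloads, long limitBytes) {
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long currentBytes = 0;
        for (byte[] payload : payloads) {
            if (payload.length > limitBytes) {
                // Fail fast instead of letting the transport drop the connection.
                throw new IllegalArgumentException(
                        "single payload of " + payload.length + " bytes exceeds limit");
            }
            if (!current.isEmpty() && currentBytes + payload.length > limitBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(payload);
            currentBytes += payload.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Rejecting an oversized payload locally turns a cluster-wide retry storm into an ordinary application error for the one offending request.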
131. Conclusions
In the 0.6-1.0 timeframe, both developers and operations engineers are needed
You need to keep an eye on bugs created, be part of the community
Exotic stuff (such as asymmetrically sized datacenters) is tricky
Lots of things get fixed. You need to keep up with upstream
You need to integrate with monitoring and graphing
Consider it a toolkit for constructing solutions.