The database world is undergoing a major upheaval. NoSQL databases such as MongoDB and Cassandra are emerging as a compelling choice for many applications. They can simplify the persistence of complex data models and offer significantly better scalability and performance. But these databases have very different and unfamiliar data models and APIs, as well as a limited transaction model. Moreover, the relational world is fighting back with so-called NewSQL databases such as VoltDB, which use a radically different architecture to offer high scalability and performance as well as the familiar relational model and ACID transactions. Sounds great, but unlike with a traditional relational database, you can't use JDBC and you must partition your data.
In this presentation you will learn about popular NoSQL databases - MongoDB and Cassandra - as well as VoltDB. We will compare and contrast each database's data model and Java API using NoSQL and NewSQL versions of a use case from the book POJOs in Action. We will learn about the benefits and drawbacks of using NoSQL and NewSQL databases.
Historically, enterprises used traditional RDBMSs for on-line transaction processing (OLTP) applications. We affectionately call these systems OldSQL, because they are largely legacy code bases, originally written decades ago. New OLTP applications have more extreme performance requirements than the old OLTP applications of yesteryear. These requirements are caused by web users submitting transactions directly rather than through a professional terminal operator, and by mobile devices (and sensors) enabling transaction submission from many more locations. In a considerable number of modern applications (multiplayer games, risk analysis in electronic trading, gambling, social networks, etc.) OldSQL is cracking under the volume of interactions. This talk contrasts two alternatives to OldSQL in this area:
NoSQL, where both SQL and ACID transactions are jettisoned for better performance
NewSQL, where SQL and ACID are retained, and better performance is delivered through innovative architectures
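To make concrete what "ACID is retained" buys an application, here is a minimal sketch of atomicity. It uses Python's built-in `sqlite3` module purely as a stand-in for any transactional SQL engine; the `accounts` table and the `transfer`, `make_db` and `balances` helpers are hypothetical names invented for this illustration and do not come from the talk.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction.

    Returns True if the transfer committed, False if it was rolled back
    because it would have overdrawn the source account.
    """
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return all balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

When the overdraft check fails, both `UPDATE` statements are undone together, so no partial transfer is ever visible. In a store without multi-record transactions, the application would have to detect and repair that half-applied state itself.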
NewSQL overview:
- History of RDBMSs
- Why the NoSQL concept appeared
- Why NoSQL was not enough, the necessity of NewSQL
- Characteristics of NewSQL
- 7 DBs that belong to NewSQL
- Overview Table with main properties
Scylla Summit 2016: Using ScyllaDB for a Microservice-based Pipeline in Go | ScyllaDB
How do you handle the continuous transformation and refinement of billions of entities with some sort of reliability and performance? In this talk, Henrik will describe how Scylla enabled him and his team to create a pipelined solution using a series of microservices written in Go that communicate with each other using NATS. You’ll hear about the mistakes they made and the lessons they learned along the way as they built the services that led to the great performance and stability they are experiencing today.
NoSQL and NewSQL: Tradeoffs between Scalable Performance & Consistency | ScyllaDB
This webinar compares NoSQL and NewSQL databases. We will look at the significant architectural differences between the two, tradeoffs between availability, scalable performance and consistency, data models, and share benchmark results to display the performance implications of NoSQL versus NewSQL.
Mesosphere and Contentteam: A New Way to Run Cassandra | DataStax Academy
We, Ben Whitehead and Robert Stupp, will show you how to run Cassandra on Mesos. We will go through all the technical steps needed to plan, set up and operate even large-scale Cassandra clusters on Mesos. Further, we illustrate how the Cassandra-on-Mesos framework helps you set up Cassandra on Mesos, schedule regular maintenance tasks and manage hardware failures in the heart of your data center.
Comparing Apache Cassandra 4.0, 3.0, and ScyllaDB | ScyllaDB
How does Cassandra 4.0’s performance compare to Cassandra 3.x’s? What’s been fixed in 4.0 and what remains unchanged? Should you upgrade or consider other options?
Join us for a webinar where we’ll answer these questions and more, based on our extensive benchmarks comparing Cassandra 4.0 against Cassandra 3.11. We’ll also share how the new release of Cassandra stacks up against Scylla Open Source. You’ll learn the rationale and results for our head-to-head comparisons, including:
- Throughput under various loads
- Comparison of long-tail (p95, p99) latencies
- Improvements to operations such as compactions
If you are considering upgrading your existing infrastructure from Cassandra 3.11, or if you are considering a new wide column database for a greenfield deployment, this is a session you won’t want to miss!
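Long-tail latencies such as p95 and p99 are worth reporting separately because an average can hide them. The following is a minimal sketch of the nearest-rank percentile method; the `percentile` helper and the sample latencies are made up for illustration and are not taken from the benchmarks mentioned above.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 98 fast requests, 2 slow ones.
latencies_ms = [1] * 98 + [50, 200]
mean_ms = sum(latencies_ms) / len(latencies_ms)
p50, p95, p99 = (percentile(latencies_ms, p) for p in (50, 95, 99))
```

With this sample the mean is 3.48 ms and both p50 and p95 are 1 ms, while p99 is 50 ms. The slow requests only show up once you look far enough into the tail.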
Scylla Summit 2016: Why Kenshoo is about to displace Cassandra with Scylla | ScyllaDB
Kenshoo is a leader in digital marketing with very heavy data usage. Learn about their big data challenges, the tools that they use, and their experience evaluating Scylla.
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO... | ScyllaDB
Outbrain is the world's largest content discovery program. Learn about their use case with Scylla, where they lowered latency while doing 20X the IOPS of Cassandra.
Cisco: Cassandra adoption on Cisco UCS & OpenStack | DataStax Academy
In this talk we will address how we developed our Cassandra environments using the Cisco UCS OpenStack platform with the DataStax Enterprise Edition software. In addition, we use open-source Ceph storage in our infrastructure to optimize performance and reduce costs.
Why you need benchmarks
Finding the right database solution for your use case can be an arduous journey. The database deployment touches aspects of throughput performance, latency control, high availability and data resilience.
You will need to decide on the infrastructure to use: Cloud, on-premise or a hybrid solution.
Data models also have an impact on finding the right fit for the use case. Once you establish a requirements set, the next step is to test your use case against the databases of choice.
In this workshop, we will discuss the different data points you need to collect in order to get the most realistic testing environment.
We will cover:
Data model impact on performance and latency
Client behavior related to database capabilities
Failover and high availability testing
Hardware selection and cluster configuration impact
We will show 2 benchmarking tools you can use to test and benchmark your clusters to identify the optimal deployment scenario for your use case.
Attend this virtual workshop if you are:
Looking to minimize the cost of your database deployment
Making a database decision based on performance and scale data
Planning to emulate your workload on a pre-production system where you can test, fail fast and learn.
Postgres-XC as a Key Value Store Compared To MongoDB | Mason Sharp
This presentation discusses how Postgres-XC can be used as a PostgreSQL-based key-value store using features like hstore and JSON. It also compares performance to MongoDB for a read workload.
Under the Hood of a Shard-per-Core Database Architecture | ScyllaDB
Most databases are based on architectures that pre-date advances to modern hardware. This results in performance issues, the need to overprovision, and a high total cost of ownership. In this webinar we will discuss the advances to modern server technology and take a deep dive into Scylla’s shard-per-core architecture and our asynchronous engine, the Seastar framework.
Join us to learn how Seastar (and Scylla):
Avoid locks and contention on the CPU level
Bypass kernel bottlenecks
Implement its per-core shared-nothing autosharding mechanism
Utilize modern storage hardware
Leverage NUMA to get the best RAM performance
Balance your data across CPUs and nodes for best and smoothest performance
Plus we’ll cover the advantages of unlocking vertical scalability.
ScyllaDB recently announced Project Alternator, a new open source project that will enable Amazon DynamoDB users to easily migrate to an open-source database that runs anywhere — on most cloud platforms, on-premises, on bare-metal, virtual machines or via Kubernetes — all while preserving their investments in their existing application code.
Project Alternator will help DynamoDB users achieve much better and more reliable performance, reduce database costs by 80% - 90%, support large items (10s of MBs) and large partitions (multiple GBs), control the number of replicas, balance cost vs. redundancy, and much more.
Join ScyllaDB founders Avi Kivity and Dor Laor and lead engineer Nadav Har’El for a live webinar on September 25th, where they will share an overview of Project Alternator, including:
Alternator’s design implementation and goals
How to configure Alternator (ok, add alternator_port: 8000 to your scylla.yaml)
Demo how to easily run it from docker/rpm
Run several examples:
Tic-tac-toe based DynamoDB example with Alternator
How to benchmark Scylla Alternator with YCSB and considerations around it
How to run a serverless application along with Alternator
How to migrate DynamoDB data to Alternator using the Spark migrator
Discuss the current limitations of Alternator
Plus we will describe the different consistency options and the active-active vs. leader model, share the project roadmap, and answer your questions at the end.
Ramp-Tutorial for MYSQL Cluster - Scaling with Continuous Availability | Pythian
Rene Cannao's Ramp-Tutorial for MYSQL Cluster - Scaling with Continuous Availability. Rene, a Senior Operational DBA at PalominoDB.com, will guide attendees through a hands-on experience in the installation, configuration management and tuning of MySQL Cluster.
Agenda:
- MySQL Cluster Concepts and Architecture: we will review the principle of a fault-tolerant shared nothing architecture, and how this is implemented into NDB;
- MySQL Cluster processes : attendees will understand the various roles and interactions between Data Nodes, API Nodes and Management Nodes;
- Installation : we will install a minimal HA solution with MySQL Cluster on 3 virtual machines;
- Configuration of a basic system : upon describing the most important configuration parameters, Data/API/Management nodes will be configured and the Cluster launched;
- Loading data: the "world" schema will be imported into NDB using "in memory" and "disk based" storages; the attendees will experience how data changes are visible across API Nodes;
- Understand the NDB Storage Engine : internal implementation details will be explained, like synchronous replication, transaction coordinator, heartbeat, communication, failure detection and handling, checkpoint, etc;
- Query and schema design : attendees will understand the execution plan of queries with NDB, how SQL and Data Nodes communicate, how indexes and partitions are implemented, condition pushdown, join pushdown, query cache;
- Management and Administration: the attendees will test the High Availability of NDB when a node becomes unavailable, and will learn how to read log files, how to stop/start any component of the Cluster to perform a rolling restart with no downtime, and how to handle a degraded setup;
- Backup and Recovery: attendees will be driven through the procedure of using NDB-native online backup and restore, and how this differs from mysqldump;
- Monitor and improve performance: attendees will learn how to boost performance by tweaking variables according to hardware configuration and application workload
Run Cloud Native MySQL NDB Cluster in Kubernetes | Bernd Ocklin
The more your database aligns with Cloud Native principles such as resilience, scaling, auto-healing and data consistency across all nodes, the better it also runs as DBaaS in Kubernetes. I walk through running databases in Kubernetes and demo manual deployment and deployment with an NDB operator.
This talk was given at the MySQL Dev Room FOSDEM 2021.
Scylla on Kubernetes: Introducing the Scylla Operator | ScyllaDB
How can Kubernetes be best used to automate the deployment, scaling, and various operations of a Scylla database?
Enter Kubernetes Operators, the way to combine domain-specific knowledge about Scylla with the automation framework of Kubernetes.
In this presentation, we will quickly explore what Kubernetes is and why it works so well, highlight the pain points of running Scylla with just Kubernetes primitives, and show how we extended Kubernetes so that it can correctly operate a Scylla database.
Finally, we will show the Scylla Operator in action and show how easily you can spin up a Scylla cluster with just one command.
SQL? NoSQL? NewSQL?!? What’s a Java developer to do? - JDC2012 Cairo, Egypt | Chris Richardson
The database world is undergoing a major upheaval. NoSQL databases such as MongoDB and Cassandra are emerging as a compelling choice for many applications. They can simplify the persistence of complex data models and offer significantly better scalability and performance. But these databases have very different and unfamiliar data models and APIs, as well as a limited transaction model. Moreover, the relational world is fighting back with so-called NewSQL databases such as VoltDB, which use a radically different architecture to offer high scalability and performance as well as the familiar relational model and ACID transactions. Sounds great, but unlike with a traditional relational database, you can’t use JDBC and you must partition your data.
In this presentation you will learn about popular NoSQL databases - MongoDB and Cassandra - as well as VoltDB. We will compare and contrast each database’s data model and Java API using NoSQL and NewSQL versions of a use case from the book POJOs in Action. We will learn about the benefits and drawbacks of using NoSQL and NewSQL databases.
If NoSQL is your answer, you are probably asking the wrong question. | Lukas Smith
This session is not about bad-mouthing MongoDB, CouchDB, big data, map reduce or any of the other more recent additions to the database buzzword bingo. Instead, it looks at how NoSQL is a confusing term and offers a more realistic assessment of how old and new approaches in databases impact today's architectures...
Polyglot persistence for Java developers: time to move out of the relational ... | Chris Richardson
Relational databases have long been considered the one true way to persist enterprise data. Even today, they are an excellent choice for many applications. But for some applications NoSQL databases are a viable alternative. They can simplify the persistence of complex data models and offer significantly better scalability and performance. But using NoSQL databases is very different from the ACID/SQL/JDBC/JPA world that we have become accustomed to. They have different and unfamiliar APIs and a very different and usually limited transaction model. So what’s a Java developer to do?
NativeX (formerly W3i) recently transitioned a large portion of their backend infrastructure from MS SQL Server to Apache Cassandra. Today, its Cassandra cluster backs its mobile advertising network supporting over 10 million daily active users producing over 10,000 transactions per second with an average database request latency of under 2 milliseconds. Going from relational to NoSQL required NativeX's engineers to re-train, re-tool and re-think the way they architect applications and infrastructure. Learn why Cassandra was selected as a replacement, what challenges were encountered along the way, and what architecture and infrastructure were involved in the implementation.
NoSQL databases such as Redis, MongoDB and Cassandra are emerging as a compelling choice for many applications. They can simplify the persistence of complex data models and offer significantly better scalability and performance. However, using a NoSQL database means giving up the benefits of the relational model such as SQL, constraints and ACID transactions. For some applications, the solution is polyglot persistence: using SQL and NoSQL databases together.
In this talk, you will learn about the benefits and drawbacks of polyglot persistence and how to design applications that use this approach. We will explore the architecture and implementation of an example application that uses MySQL as the system of record and Redis as a very high-performance database that handles queries from the front-end. You will learn about mechanisms for maintaining consistency across the various databases.
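One common way to keep a fast query store consistent with a system of record is to invalidate the cached copy whenever the record changes. The sketch below illustrates that idea only; it is not the mechanism from the talk, which the abstract does not describe. To stay self-contained it uses `sqlite3` as a stand-in for MySQL and a plain dict as a stand-in for Redis, and the `CustomerStore` class and its methods are hypothetical.

```python
import sqlite3


class CustomerStore:
    """System of record in SQL, with a key-value cache in front of reads."""

    def __init__(self):
        # Stand-in for MySQL (assumption for this sketch).
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )
        # Stand-in for Redis (assumption for this sketch).
        self.cache = {}
        # Counts reads that had to go to the SQL database.
        self.db_reads = 0

    def save(self, customer_id, name):
        """Write to the system of record, then invalidate the cached copy."""
        with self.db:  # commit on success, roll back on error
            self.db.execute(
                "INSERT OR REPLACE INTO customers (id, name) VALUES (?, ?)",
                (customer_id, name),
            )
        # Invalidate only after the commit, so a failed write leaves the
        # cache untouched.
        self.cache.pop(customer_id, None)

    def find(self, customer_id):
        """Serve from the cache if possible, otherwise load and cache."""
        if customer_id in self.cache:
            return self.cache[customer_id]
        self.db_reads += 1
        row = self.db.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        if row is None:
            return None
        self.cache[customer_id] = row[0]
        return row[0]
```

With two separate servers there is still a window between the commit and the invalidation in which a reader can see stale data. Closing that window needs extra machinery, such as publishing change events from the system of record.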
DAT322_The Nanoservices Architecture That Powers BBC Online | Amazon Web Services
The BBC’s website and apps are used around the world by an audience of millions who read, watch, and interact with a range of content. The BBC handles this scale with an innovative website platform, built on Amazon ElastiCache and Amazon EC2 and based on nanoservices. The BBC has over a thousand nanoservices, powering many of its biggest webpages. Explore its nanoservices platform and use of ElastiCache. Learn how Redis’s ultra-fast queues and pub/sub allow thousands of nanoservices to interact efficiently with low latency. Discover intelligent caching strategies to optimize rendering costs and ensure lightning fast performance. Together, ElastiCache and nanoservices can make real-time systems that can handle thousands of requests per second.
CodeFutures - Scaling Your Database in the Cloud | RightScale
RightScale Conference Santa Clara 2011: Scaling an application in the cloud often hits the most common bottleneck: the database tier. Not only is database performance the number one cause of poor application performance, but the issue is also magnified in cloud environments, where I/O and bandwidth are generally slower and less predictable than in dedicated data centers. Database sharding is a highly effective method of removing the database scalability barrier, operating on top of proven RDBMS products such as MySQL and Postgres as well as the new NoSQL database platforms. One critical aspect often given too little consideration is monitoring and continuous operation of your databases, including the full lifecycle, to ensure that they stay up.
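The core idea of sharding can be sketched in a few lines: derive a shard number from each row's key and send all reads and writes for that key to the same shard. The example below is a simplified illustration using hash-modulo routing over in-memory dicts. The `shard_for` function and `ShardedStore` class are hypothetical names and do not represent any particular product's implementation.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a string key to a shard index in [0, shard_count).

    A cryptographic hash is used so the mapping is stable across
    processes (Python's built-in hash() is randomized per process).
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Routes each key to one of several independent stores."""

    def __init__(self, shard_count):
        # Each dict stands in for a separate database server.
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

Modulo routing is easy to reason about, but changing the shard count remaps most keys. Schemes such as consistent hashing or a lookup directory are often used where shards need to be added without moving most of the data.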
NoSQL is not a buzzword anymore. The array of non-relational technologies has found wide-scale adoption even in non-Internet-scale focus areas. With the advent of the cloud, the churn has increased even more, yet there is no crystal-clear guidance on adoption techniques and architectural choices surrounding the plethora of options available. This session initiates you into the whys and wherefores, architectural patterns, caveats and techniques that will augment your decision-making process and boost your perception of architecting scalable, fault-tolerant and distributed solutions.
A common microservice architecture anti-pattern is "more the merrier". It occurs when an organization builds an excessively fine-grained architecture, e.g. one service per developer. In this talk, you will learn about the criteria that you should consider when deciding service granularity. I'll discuss the downsides of a fine-grained microservice architecture. You will learn how sometimes the solution to a design problem is simply a JAR file.
YOW London - Considering Migrating a Monolith to Microservices? A Dark Energy... | Chris Richardson
This is a talk I gave at YOW! London 2022.
Let's imagine that you are responsible for an aging monolithic application that's critical to your business. Sadly, getting changes into production is a painful ordeal that regularly causes outages. And to make matters worse, the application's technology stack is growing increasingly obsolete. Neither the business nor the developers are happy. You need to modernize your application and have read about the benefits of microservices. But is the microservice architecture a good choice for your application?
In this presentation, I describe the dark energy and dark matter forces (a.k.a. concerns) that you must consider when deciding between the monolithic and microservice architectural styles. You will learn about how well each architectural style resolves each of these forces. I describe how to evaluate the relative importance of each of these forces to your application. You will learn how to use the results of this evaluation to decide whether to migrate to the microservice architecture.
Dark Energy, Dark Matter and the Microservices Patterns?! Chris Richardson
Dark matter and dark energy are mysterious concepts from astrophysics that are used to explain observations of distant stars and galaxies. The Microservices pattern language - a collection of patterns that solve architecture, design, development, and operational problems — enables software developers to use the microservice architecture effectively. But how could there possibly be a connection between microservices and these esoteric concepts from astrophysics?
In this presentation, I describe how dark energy and dark matter are excellent metaphors for the competing forces (a.k.a. concerns) that must be resolved by the microservices pattern language. You will learn that dark energy, which is an anti-gravity, is a metaphor for the repulsive forces that encourage decomposition into services. I describe how dark matter, which is an invisible matter that has a gravitational effect, is a metaphor for the attractive forces that resist decomposition and encourage the use of a monolithic architecture. You will learn how to use the dark energy and dark matter forces as a guide when designing services and operations.
Dark energy, dark matter and microservice architecture collaboration patterns Chris Richardson
Dark energy and dark matter are useful metaphors for the repulsive forces, which encourage decomposition into services, and the attractive forces, which resist decomposition. You must balance these conflicting forces when defining a microservice architecture including when designing system operations (a.k.a. requests) that span services.
In this talk, I describe the dark energy and dark matter forces. You will learn how to design system operations that span services using microservice architecture collaboration patterns: Saga, Command-side replica, API composition, and CQRS patterns. I describe how each of these patterns resolves the dark energy and dark matter forces differently.
It sounds dull but good architecture documentation is essential. Especially when you are actively trying to improve your architecture.
For example, I spend a lot of time helping clients modernize their software architecture. More often than I like, I’m presented with a vague and lifeless collection of boxes and lines. As a result, it’s sometimes difficult to discuss the architecture in a meaningful and productive way. In this presentation, I’ll describe techniques for creating minimal yet effective documentation for your application’s microservice architecture. In particular, you will learn how documenting scenarios can bring your architecture to life.
Using patterns and pattern languages to make better architectural decisions Chris Richardson
This is a presentation that I gave at the O'Reilly Software Architecture Superstream: Software Architecture Patterns.
The talk's focus is the microservices pattern language.
However, it also shows how thinking with the pattern mindset - context/problem/forces/solution/consequences - leads to better technical decisions.
The microservices architecture offers tremendous benefits, but it’s not a silver bullet. It also has some significant drawbacks. The microservices pattern language, a collection of patterns that solve architecture, design, development, and operational problems, enables software developers to apply the microservices architecture effectively. I provide an overview of the microservices architecture and examine the motivations for the pattern language, then take you through the key patterns in the pattern language.
Rapid, reliable, frequent and sustainable software development requires an architecture that is loosely coupled and modular.
Teams need to be able to complete their work with minimal coordination and communication with other teams.
They also need to be able to keep the software’s technology stack up to date.
However, the microservice architecture isn’t always the only way to satisfy these requirements.
Yet, neither is the monolithic architecture.
In this talk, I describe loose coupling and modularity and why they are essential.
You will learn about three architectural patterns: traditional monolith, modular monolith and microservices.
I describe the benefits, drawbacks and issues of each pattern and how well it supports rapid, reliable, frequent and sustainable development.
You will learn some heuristics for selecting the appropriate pattern for your application.
Events to the rescue: solving distributed data problems in a microservice arc... Chris Richardson
To deliver a large complex application rapidly, frequently and reliably, you often must use the microservice architecture.
The microservice architecture is an architectural style that structures the application as a collection of loosely coupled services.
One challenge with using microservices is that in order to be loosely coupled each service has its own private database.
As a result, implementing transactions and queries that span services is no longer straightforward.
In this presentation, you will learn how event-driven microservices address this challenge.
I describe how to use sagas, an asynchronous messaging-based pattern, to implement transactions that span services.
You will learn how to implement queries that span services using the CQRS pattern, which maintains easily queryable replicas using events.
A pattern language for microservices - June 2021 Chris Richardson
The microservice architecture is growing in popularity. It is an architectural style that structures an application as a set of loosely coupled services that are organized around business capabilities. Its goal is to enable the continuous delivery of large, complex applications. However, the microservice architecture is not a silver bullet and it has some significant drawbacks.
The goal of the microservices pattern language is to enable software developers to apply the microservice architecture effectively. It is a collection of patterns that solve architecture, design, development and operational problems. In this talk, I’ll provide an overview of the microservice architecture and describe the motivations for the pattern language. You will learn about the key patterns in the pattern language.
QConPlus 2021: Minimizing Design Time Coupling in a Microservice Architecture Chris Richardson
Delivering large, complex software rapidly, frequently and reliably requires a loosely coupled organization. DevOps teams should rarely need to communicate and coordinate in order to get work done. Conway's law states that an organization and the architecture that it develops mirror one another. Hence, a loosely coupled organization requires a loosely coupled architecture.
In this presentation, you will learn about design-time coupling in a microservice architecture and why it's essential to minimize it. I describe how to design service APIs to reduce coupling. You will learn how to minimize design-time coupling by applying a version of the DRY principle. I describe how key microservices patterns potentially result in tight design time coupling and how to avoid it.
Mucon 2021 - Dark energy, dark matter: imperfect metaphors for designing micr... Chris Richardson
In order to explain certain astronomical observations, physicists created the mysterious concepts of dark energy and dark matter.
Dark energy is a repulsive force.
It’s an anti-gravity that is forcing matter apart and accelerating the expansion of the universe.
Dark matter has the opposite attraction effect.
Although it’s invisible, dark matter has a gravitational effect on stars and galaxies.
In this presentation, you will learn how these metaphors apply to the microservice architecture.
I describe how there are multiple repulsive forces that drive the decomposition of your application into services.
You will learn, however, that there are also multiple attractive forces that resist decomposition and bind software elements together.
I describe how as an architect you must find a way to balance these opposing forces.
Skillsmatter CloudNative eXchange 2020
The microservice architecture is a key part of cloud native.
An essential principle of the microservice architecture is loose coupling.
If you ignore this principle and develop tightly coupled services, the result will most likely be yet another "microservices failure story".
Your application will be brittle and have all of the disadvantages of both the monolithic and microservice architectures.
In this talk you will learn about the different kinds of coupling and how to design loosely coupled microservices.
I describe how to minimize design-time coupling and increase the productivity of your DevOps teams.
You will learn how to reduce runtime coupling and improve availability.
I describe how to improve availability by minimizing the coupling caused by your infrastructure.
DDD SoCal: Decompose your monolith: Ten principles for refactoring a monolith... Chris Richardson
This is a talk I gave at DDD SoCal.
1. Make the most of your monolith
2. Adopt microservices for the right reasons
3. It’s not just architecture
4. Get the support of the business
5. Migrate incrementally
6. Know your starting point
7. Begin with the end in mind
8. Migrate high-value modules first
9. Success is improved velocity and reliability
10. If it hurts, don’t do it
Decompose your monolith: Six principles for refactoring a monolith to microse... Chris Richardson
This was a talk I gave at the CTO virtual summit on July 28th. It describes 6 principles for refactoring to a microservice architecture.
1. Make the most of your monolith
2. Adopt microservices for the right reasons
3. Migrate incrementally
4. Begin with the end in mind
5. Migrate high-value modules first
6. Success is improved velocity and reliability
The microservice architecture is becoming increasingly important. But what is it exactly? Why should you care about microservices? And, what do you need to do to ensure that your organization uses the microservice architecture successfully? In this talk, I’ll answer these and other questions. You will learn about the motivations for the microservice architecture and why simply adopting microservices is insufficient. I describe essential characteristics of microservices. You will learn how a successful microservice architecture consists of loosely coupled services with stable APIs that communicate asynchronously.
SQL, NoSQL, NewSQL? What's a developer to do?
1. SQL, NoSQL, NewSQL?
What's a developer to do?
Chris Richardson
Author of POJOs in Action
Founder of CloudFoundry.com
chris.richardson@springsource.com
@crichardson
7. About Chris
http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/
8. About Chris
Developer Advocate for
CloudFoundry.com
Signup at CloudFoundry.com
using promo code JFokus
9. Agenda
o Why NoSQL? NewSQL?
o Persisting entities
o Implementing queries
2/14/12 Copyright (c) 2011 Chris Richardson. All rights reserved.
Slide 9
10. Food to Go
o Take-out food delivery service
o “Launched” in 2006
o Used a relational database (naturally)
11. Success → Growth challenges
o Increasing traffic
o Increasing data volume
o Distribute across a few data centers
o Increasing domain model complexity
12. Limitations of relational databases
o Scaling
o Distribution
o Updating schema
o O/R impedance mismatch
o Handling semi-structured data
13. Solution: Spend Money
o Buy SSD and RAM
o Buy Oracle
o Buy high-end servers
o …
http://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPG
OR
o Hire more DevOps
o Use application-level sharding
o Build your own middleware
o …
http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_5_series/madone_5_2/#
14. Solution: Use NoSQL
Benefits
o Higher performance
o Higher scalability
o Richer data-model
o Schema-less
Drawbacks
o Limited transactions
o Relaxed consistency
o Unconstrained data
15. MongoDB
o Document-oriented database
   - JSON-style documents: Lists, Maps, primitives
   - Schema-less
o Transaction = update of a single document
o Rich query language for dynamic queries
o Tunable writes: speed ↔ reliability
o Highly scalable and available
o Use cases
   - High volume writes
   - Complex data
   - Semi-structured data
16. Apache Cassandra
o Column-oriented database/Extensible row store
   - Think Row ~= java.util.SortedMap
o Transaction = update of a row
o Fast writes = append to a log
o Tunable reads/writes: consistency ↔ latency/availability
o Extremely scalable
   - Transparent and dynamic clustering
   - Rack and datacenter aware data replication
o CQL = “SQL”-like DDL and DML
o Use cases
   - Big data
   - Multiple Data Center distributed database
   - (Write intensive) Logging
   - High-availability (writes)
18. Solution: Use NewSQL
o Relational databases with SQL and ACID transactions
AND
o New and improved architecture
o Radically better scalability and performance
o NewSQL vendors: ScaleDB, NimbusDB, …, VoltDB
19. Stonebraker’s motivations
“…Current databases are designed for 1970s hardware …”
Stonebraker: http://www.slideshare.net/VoltDB/sql-myths-webinar
Significant overhead in “…logging, latching, locking, B-tree, and buffer management operations…”
SIGMOD 08: Through the looking glass: http://dl.acm.org/citation.cfm?id=1376713
20. About VoltDB
o Open-source
o In-memory relational database
o Durability through replication; snapshots and logging
o Transparent partitioning
o Fast and scalable
…VoltDB is very scalable; it should scale to 120 partitions, 39 servers, and 1.6 million complex transactions per second at over 300 CPU cores…
http://www.mysqlperformanceblog.com/2011/02/28/is-voltdb-really-as-scalable-as-they-claim/
21. The future is polyglot persistence
e.g. Netflix
• RDBMS
• SimpleDB
• Cassandra
• Hadoop/HBase
IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg
22. Spring Data is here to help
For NoSQL databases
http://www.springsource.org/spring-data
23. Agenda
o Why NoSQL? NewSQL?
o Persisting entities
o Implementing queries
24. Food to Go – Place Order use case
1. Customer enters delivery address and delivery time
2. System displays available restaurants
3. Customer picks restaurant
4. System displays menu
5. Customer selects menu items
6. Customer places order
25. Food to Go – Domain model (partial)
class Restaurant {
    long id;
    String name;
    Set<String> serviceArea;
    Set<TimeRange> openingHours;
    List<MenuItem> menuItems;
}
class TimeRange {
    long id;
    int dayOfWeek;
    int openingTime;
    int closingTime;
}
class MenuItem {
    String name;
    double price;
}
33. Cassandra – retrieving data
Column Family
K1 → (N1, V1, TS1) (N2, V2, TS2) (N3, V3, TS3) (N4, V4, TS4)
…
CF.slice(key=K1, startColumn=N2, endColumn=N4)
K1 → (N2, V2, TS2) (N3, V3, TS3) (N4, V4, TS4)
Cassandra has secondary indexes but they aren’t helpful for these use cases
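The "Row ~= java.util.SortedMap" analogy and the slice call above can be illustrated with a plain in-memory sketch. This is not Cassandra or Hector code: the class name RowSliceSketch and the slice helper are made up for illustration, timestamps are left out, and both slice bounds are assumed to be inclusive, as the K1 example suggests.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Models one row as a sorted map of column name -> value (timestamps omitted).
class RowSliceSketch {

    // Returns the columns whose names fall in [startColumn, endColumn], both inclusive.
    static SortedMap<String, String> slice(TreeMap<String, String> row,
                                           String startColumn, String endColumn) {
        return row.subMap(startColumn, true, endColumn, true);
    }

    public static void main(String[] args) {
        TreeMap<String, String> k1 = new TreeMap<>();
        k1.put("N1", "V1");
        k1.put("N2", "V2");
        k1.put("N3", "V3");
        k1.put("N4", "V4");
        // Same shape as CF.slice(key=K1, startColumn=N2, endColumn=N4)
        System.out.println(slice(k1, "N2", "N4")); // {N2=V2, N3=V3, N4=V4}
    }
}
```

Because columns are kept sorted by name, a slice is a contiguous range read, which is why the column names become the thing to design around later in the talk.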
34. Option #1: Use a column per attribute
Column Name = path/expression to access property value
Column Family: RestaurantDetails
Row key 1:
   name = Ajanta
   type = indian
   serviceArea[0] = 94619
   serviceArea[1] = 94707
   openingHours[0].dayOfWeek = Monday
   openingHours[0].open = 1130
   openingHours[0].close = 1430
Row key 2:
   name = Egg shop
   type = Break Fast
   serviceArea[0] = 94611
   serviceArea[1] = 94619
   openingHours[0].dayOfWeek = Monday
   openingHours[0].open = 0830
   openingHours[0].close = 1430
35. Option #2: Use a single column
Column value = serialized object graph, e.g. JSON
Column Family: RestaurantDetails
2 attributes: { name: “Montclair Eggshop”, … }
1 attributes { name: “Ajanta”, …}
2 attributes { name: “Eggshop”, …}
✔
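The two storage options can be compared side by side in a small sketch. It only builds the column maps in memory and uses no Cassandra client; the class ColumnPerAttributeSketch, its method names, and the hand-rolled JSON (which does no escaping) are assumptions made for illustration, with "attributes" taken from the slide as the single column's name.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ColumnPerAttributeSketch {

    // Option #1: one column per attribute; the column name is the property path.
    static Map<String, String> toColumns(String name, String type, List<String> serviceArea) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("name", name);
        columns.put("type", type);
        for (int i = 0; i < serviceArea.size(); i++) {
            columns.put("serviceArea[" + i + "]", serviceArea.get(i));
        }
        return columns;
    }

    // Option #2: a single "attributes" column holding the serialized object.
    // The JSON is built by hand and does not escape special characters.
    static Map<String, String> toSingleColumn(String name, String type, List<String> serviceArea) {
        StringBuilder json = new StringBuilder();
        json.append("{\"name\":\"").append(name)
            .append("\",\"type\":\"").append(type)
            .append("\",\"serviceArea\":[");
        for (int i = 0; i < serviceArea.size(); i++) {
            if (i > 0) json.append(',');
            json.append('"').append(serviceArea.get(i)).append('"');
        }
        json.append("]}");
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("attributes", json.toString());
        return columns;
    }

    public static void main(String[] args) {
        List<String> zips = Arrays.asList("94619", "94707");
        System.out.println(toColumns("Ajanta", "indian", zips));
        System.out.println(toSingleColumn("Ajanta", "indian", zips));
    }
}
```

Option #1 lets individual attributes be read or updated as separate columns, while option #2 reads and writes the whole object in one column, which is simpler when the object is always loaded as a unit.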
36. Cassandra code
public class AvailableRestaurantRepositoryCassandraKeyImpl
        implements AvailableRestaurantRepository {

    // CassandraTemplate is a home-grown wrapper class.
    // The field is not final because it is injected after construction.
    @Autowired
    private CassandraTemplate cassandraTemplate;

    public void add(Restaurant restaurant) {
        cassandraTemplate.insertEntity(keyspace,
                RESTAURANT_DETAILS_CF,
                restaurant);
    }

    public Restaurant findDetailsById(int id) {
        String key = Integer.toString(id);
        return cassandraTemplate.findEntity(Restaurant.class,
                keyspace, key, RESTAURANT_DETAILS_CF);
        …
    }
    …
http://en.wikipedia.org/wiki/Hector
37. Using VoltDB
o Use the original schema
o Standard SQL statements
BUT YOU MUST
o Write stored procedures and invoke them using proprietary interface
o Partition your data
38. About VoltDB stored procedures
o Key part of VoltDB
o Replication = executing stored procedure on replica
o Logging = log stored procedure invocation
o Stored procedure invocation = transaction
39. About partitioning
Partition column
RESTAURANT table
ID Name …
1 Ajanta
2 Eggshop
…
40. Example cluster
VoltDB Server 1
   Partition 1a: (ID 1, Ajanta), …
   Partition 3b: …
VoltDB Server 2
   Partition 2a: (ID 2, Eggshop), …
   Partition 1b: (ID 1, Ajanta), …
VoltDB Server 3
   Partition 3a: …
   Partition 2b: (ID 2, Eggshop), …
41. Single partition procedure: FAST
SELECT * FROM RESTAURANT WHERE ID = 1
High-performance lock free code
ID Name … ID Name … ID Name …
1 Ajanta 1 Eggshop … ..
… … …
… … …
VoltDB Server 1 VoltDB Server 2 VoltDB Server 3
42. Multi-partition procedure: SLOWER
SELECT * FROM RESTAURANT WHERE NAME = 'Ajanta'
Communication/Coordination overhead
ID Name … ID Name … ID Name …
1 Ajanta 1 Eggshop … ..
… … …
… … …
VoltDB Server 1 VoltDB Server 2 VoltDB Server 3
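The difference between the two kinds of procedure comes down to how many partitions a statement has to touch. The sketch below illustrates that with a hypothetical routing rule (partition = id modulo the partition count); VoltDB's real hashing and replication logic are not shown here, and PartitionRoutingSketch and its methods are invented names.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionRoutingSketch {
    static final int PARTITIONS = 3;

    // Hypothetical routing rule: the partition is derived from the partition column value.
    static int partitionFor(long restaurantId) {
        return (int) Math.floorMod(restaurantId, (long) PARTITIONS);
    }

    // WHERE ID = ? : the partition column is known, so a single partition does the work.
    static List<Integer> partitionsForIdLookup(long restaurantId) {
        List<Integer> targets = new ArrayList<>();
        targets.add(partitionFor(restaurantId));
        return targets;
    }

    // WHERE NAME = ? : the name says nothing about placement, so every partition is asked
    // and the results have to be combined, which is the coordination overhead.
    static List<Integer> partitionsForNameLookup(String name) {
        List<Integer> targets = new ArrayList<>();
        for (int p = 0; p < PARTITIONS; p++) {
            targets.add(p);
        }
        return targets;
    }

    public static void main(String[] args) {
        System.out.println(partitionsForIdLookup(1));         // [1]
        System.out.println(partitionsForNameLookup("Ajanta")); // [0, 1, 2]
    }
}
```

This is why the choice of partition column matters: queries that filter on it stay on one partition, and everything else fans out.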
43. Chosen partitioning scheme
<partitions>
<partition table="restaurant" column="id"/>
<partition table="service_area" column="restaurant_id"/>
<partition table="menu_item" column="restaurant_id"/>
<partition table="time_range" column="restaurant_id"/>
<partition table="available_time_range" column="restaurant_id"/>
</partitions>
Performance is excellent: much faster than MySQL
44. Stored procedure – AddRestaurant
import org.voltdb.ProcInfo;
import org.voltdb.SQLStmt;
import org.voltdb.VoltProcedure;

// Single-partition procedure: routed by the first parameter (the restaurant id).
@ProcInfo(singlePartition = true, partitionInfo = "Restaurant.id: 0")
public class AddRestaurant extends VoltProcedure {

    public final SQLStmt insertRestaurant =
        new SQLStmt("INSERT INTO Restaurant VALUES (?,?);");
    public final SQLStmt insertServiceArea =
        new SQLStmt("INSERT INTO service_area VALUES (?,?);");
    public final SQLStmt insertOpeningTimes =
        new SQLStmt("INSERT INTO time_range VALUES (?,?,?,?);");
    public final SQLStmt insertMenuItem =
        new SQLStmt("INSERT INTO menu_item VALUES (?,?,?);");

    public long run(int id, String name, String[] serviceArea, long[] daysOfWeek,
                    long[] openingTimes, long[] closingTimes,
                    String[] names, double[] prices) {
        // Queue one insert per row, then execute them all as a single batch.
        voltQueueSQL(insertRestaurant, id, name);
        for (String zipCode : serviceArea)
            voltQueueSQL(insertServiceArea, id, zipCode);
        for (int i = 0; i < daysOfWeek.length; i++)
            voltQueueSQL(insertOpeningTimes, id, daysOfWeek[i], openingTimes[i], closingTimes[i]);
        for (int i = 0; i < names.length; i++)
            voltQueueSQL(insertMenuItem, id, names[i], prices[i]);
        voltExecuteSQL(true);
        return 0;
    }
}
48. Agenda
o Why NoSQL? NewSQL?
o Persisting entities
o Implementing queries
49. Finding available restaurants
Available restaurants =
Serve the zip code of the delivery address
AND
Are open at the delivery time
public interface AvailableRestaurantRepository {
List<AvailableRestaurant>
findAvailableRestaurants(Address deliveryAddress,
Date deliveryTime); …
}
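As a reference point for the database-specific queries that follow, the availability rule can be written as a plain in-memory predicate. This is only a sketch: AvailabilitySketch, the simplified Restaurant and TimeRange classes, the MONDAY constant, and the sample data are assumptions for illustration, and the method takes a zip code and times directly rather than the Address and Date of the real interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AvailabilitySketch {
    static final int MONDAY = 1; // hypothetical encoding of dayOfWeek

    static class TimeRange {
        final int dayOfWeek, openingTime, closingTime;
        TimeRange(int dayOfWeek, int openingTime, int closingTime) {
            this.dayOfWeek = dayOfWeek;
            this.openingTime = openingTime;
            this.closingTime = closingTime;
        }
    }

    static class Restaurant {
        final String name;
        final List<String> serviceArea;
        final List<TimeRange> openingHours;
        Restaurant(String name, List<String> serviceArea, List<TimeRange> openingHours) {
            this.name = name;
            this.serviceArea = serviceArea;
            this.openingHours = openingHours;
        }
    }

    // Available = serves the zip code AND has a time range covering the delivery time.
    static boolean isAvailable(Restaurant r, String zip, int dayOfWeek, int timeOfDay) {
        if (!r.serviceArea.contains(zip)) return false;
        for (TimeRange tr : r.openingHours) {
            if (tr.dayOfWeek == dayOfWeek
                    && tr.openingTime <= timeOfDay && timeOfDay <= tr.closingTime) {
                return true;
            }
        }
        return false;
    }

    static List<String> findAvailable(List<Restaurant> all, String zip, int dayOfWeek, int timeOfDay) {
        List<String> names = new ArrayList<>();
        for (Restaurant r : all) {
            if (isAvailable(r, zip, dayOfWeek, timeOfDay)) names.add(r.name);
        }
        return names;
    }

    static List<Restaurant> sampleData() {
        return Arrays.asList(
            new Restaurant("Ajanta", Arrays.asList("94619", "94707"),
                Arrays.asList(new TimeRange(MONDAY, 1130, 1430), new TimeRange(MONDAY, 1730, 2130))),
            new Restaurant("Eggshop", Arrays.asList("94619"),
                Arrays.asList(new TimeRange(MONDAY, 700, 1430))));
    }

    public static void main(String[] args) {
        System.out.println(findAvailable(sampleData(), "94619", MONDAY, 1815)); // [Ajanta]
    }
}
```

Scanning every restaurant like this does not scale, which is the reason each database needs its own indexed form of the same predicate.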
50. Finding available restaurants on Monday, 6.15pm for 94619 zip
Straightforward three-way join:
select r.*
from restaurant r
  inner join restaurant_time_range tr
    on r.id = tr.restaurant_id
  inner join restaurant_zipcode sa
    on r.id = sa.restaurant_id
where '94619' = sa.zip_code
  and tr.day_of_week = 'monday'
  and tr.openingtime <= 1815
  and 1815 <= tr.closingtime
51. MongoDB = easy to query
Find a restaurant that serves the 94619 zip code and is open at 6.15pm on a Monday:
{
    serviceArea: "94619",
    openingHours: {
        $elemMatch: {
            "dayOfWeek": "Monday",
            "open": {$lte: 1815},
            "close": {$gte: 1815}
        }
    }
}
DBCursor cursor = collection.find(qbeObject);
while (cursor.hasNext()) {
    DBObject o = cursor.next();
    …
}
db.availableRestaurants.ensureIndex({serviceArea: 1})
52. MongoTemplate-based code
@Repository
public class AvailableRestaurantRepositoryMongoDbImpl
        implements AvailableRestaurantRepository {

    // Not final: the field is injected after construction.
    @Autowired
    private MongoTemplate mongoTemplate;

    @Override
    public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress,
                                                              Date deliveryTime) {
        int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
        int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);
        // where(...) is a static import of Criteria.where
        Query query = new Query(where("serviceArea").is(deliveryAddress.getZip())
            .and("openingHours").elemMatch(where("dayOfWeek").is(dayOfWeek)
                .and("openingTime").lte(timeOfDay)
                .and("closingTime").gte(timeOfDay)));
        return mongoTemplate.find(AVAILABLE_RESTAURANTS_COLLECTION, query,
            AvailableRestaurant.class);
    }
}
mongoTemplate.ensureIndex("availableRestaurants",
    new Index().on("serviceArea", Order.ASCENDING));
53. BUT how to do this with Cassandra??!
o How can Cassandra support a query that has
   - A 3-way join
   - Multiple =
   - > and <
?
→ We need to implement an index
Queries, rather than the data model, drive NoSQL database design
54. Simplification #1: Denormalization
Restaurant_id  Day_of_week  Open_time  Close_time  Zip_code
1              Monday       1130       1430        94707
1              Monday       1130       1430        94619
1              Monday       1730       2130        94707
1              Monday       1730       2130        94619
2              Monday       0700       1430        94619
…
SELECT restaurant_id
FROM time_range_zip_code
WHERE day_of_week = 'Monday'
  AND zip_code = 94619
  AND 1815 < close_time
  AND open_time < 1815
Simpler query:
§ No joins
§ Two = and two <
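The denormalized table is the cross product of a restaurant's opening time ranges and its service-area zip codes. A minimal sketch of producing those rows, assuming a hypothetical DenormalizationSketch class and rows rendered as space-separated strings rather than written to a database:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DenormalizationSketch {

    // Emits one row per (time range, zip code) pair:
    // restaurant_id, day_of_week, open_time, close_time, zip_code.
    static List<String> denormalize(int restaurantId, String dayOfWeek,
                                    int[][] timeRanges, List<String> zipCodes) {
        List<String> rows = new ArrayList<>();
        for (int[] range : timeRanges) {
            for (String zip : zipCodes) {
                rows.add(String.format("%d %s %04d %04d %s",
                        restaurantId, dayOfWeek, range[0], range[1], zip));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        int[][] ajantaMonday = { {1130, 1430}, {1730, 2130} };
        for (String row : denormalize(1, "Monday", ajantaMonday, Arrays.asList("94707", "94619"))) {
            System.out.println(row);
        }
    }
}
```

Two time ranges and two zip codes give the four rows shown for restaurant 1, so the row count grows multiplicatively; that is the storage and write cost paid for a join-free query.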
55. Simplification #2: Application filtering
SELECT restaurant_id, open_time
FROM time_range_zip_code
WHERE day_of_week = 'Monday'
  AND zip_code = 94619
  AND 1815 < close_time
Filtered by the application instead of the query: open_time < 1815
Even simpler query:
• No joins
• Two = and one <
56. Simplification #3: Eliminate multiple =’s with concatenation
Restaurant_id  Zip_dow       Open_time  Close_time
1              94707:Monday  1130       1430
1              94619:Monday  1130       1430
1              94707:Monday  1730       2130
1              94619:Monday  1730       2130
2              94619:Monday  0700       1430
…
SELECT restaurant_id, open_time
FROM time_range_zip_code
WHERE zip_code_day_of_week = '94619:Monday'   -- key
  AND 1815 < close_time                       -- range
57. Column family with composite column names as an index
Restaurant_id  Zip_dow       Open_time  Close_time
1              94707:Monday  1130       1430
1              94619:Monday  1130       1430
1              94707:Monday  1730       2130
1              94619:Monday  1730       2130
2              94619:Monday  0700       1430
…
Column Family: AvailableRestaurants
Row key 94619:Monday:
   (1430,0700,2) → JSON FOR EGG
   (1430,1130,1) → JSON FOR AJANTA
   (2130,1730,1) → JSON FOR AJANTA
58. Querying with a slice
Column Family: AvailableRestaurants
Row key 94619:Monday:
  (1430,0700,2) → JSON FOR EGG
  (1430,1130,1) → JSON FOR AJANTA
  (2130,1730,1) → JSON FOR AJANTA

slice(key = 94619:Monday, sliceStart = (1815, *, *), sliceEnd = (2359, *, *))

Result for 94619:Monday:
  (2130,1730,1) → JSON FOR AJANTA

18:15 is after 17:30 → {Ajanta}
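The slice can be modeled in memory with a sorted map per row. This is a sketch of the idea only, not Cassandra or Hector code: `AvailableRestaurantsColumnFamily` and `ColumnName` are hypothetical names, the `*` wildcards are modeled with `Integer.MIN_VALUE` / `Integer.MAX_VALUE`, and Java 16+ records are assumed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class AvailableRestaurantsColumnFamily {
    // Composite column name (close_time, open_time, restaurant_id), compared component by component.
    record ColumnName(int closeTime, int openTime, int restaurantId) implements Comparable<ColumnName> {
        private static final Comparator<ColumnName> ORDER = Comparator
                .comparingInt(ColumnName::closeTime)
                .thenComparingInt(ColumnName::openTime)
                .thenComparingInt(ColumnName::restaurantId);

        @Override
        public int compareTo(ColumnName other) {
            return ORDER.compare(this, other);
        }
    }

    // Row key ("zip:day") -> columns kept sorted by composite name.
    private final Map<String, NavigableMap<ColumnName, String>> rows = new HashMap<>();

    void insert(String rowKey, ColumnName name, String json) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(name, json);
    }

    // Slice: columns of one row with close_time in [fromCloseTime, 2359], bounds inclusive.
    NavigableMap<ColumnName, String> slice(String rowKey, int fromCloseTime) {
        NavigableMap<ColumnName, String> columns = rows.getOrDefault(rowKey, new TreeMap<>());
        return columns.subMap(
                new ColumnName(fromCloseTime, Integer.MIN_VALUE, Integer.MIN_VALUE), true,
                new ColumnName(2359, Integer.MAX_VALUE, Integer.MAX_VALUE), true);
    }

    // The open_time check is applied by the application to the sliced columns.
    List<String> findAvailable(String rowKey, int timeOfDay) {
        List<String> result = new ArrayList<>();
        slice(rowKey, timeOfDay).forEach((name, json) -> {
            if (name.openTime() < timeOfDay) {
                result.add(json);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        AvailableRestaurantsColumnFamily cf = new AvailableRestaurantsColumnFamily();
        cf.insert("94619:Monday", new ColumnName(1430, 700, 2), "JSON FOR EGG");
        cf.insert("94619:Monday", new ColumnName(1430, 1130, 1), "JSON FOR AJANTA");
        cf.insert("94619:Monday", new ColumnName(2130, 1730, 1), "JSON FOR AJANTA");
        System.out.println(cf.findAvailable("94619:Monday", 1815));
    }
}
```

Because the columns are sorted with close_time first, the range condition becomes a contiguous scan of one row, which is what makes the slice cheap.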
59. Needs a few pages of code
private void insertAvailability(Restaurant restaurant) {
    for (String zipCode : (Set<String>) restaurant.getServiceArea()) {
        for (TimeRange tr : (Set<TimeRange>) restaurant.getOpeningHours()) {
            String dayOfWeek = format2(tr.getDayOfWeek());
            String openingTime = format4(tr.getOpeningTime());
            String closingTime = format4(tr.getClosingTime());
            String restaurantId = format8(restaurant.getId());
            String key = formatKey(zipCode, dayOfWeek);
            String columnValue = toJson(restaurant);
            Composite columnName = new Composite();
            columnName.add(0, closingTime);
            columnName.add(1, openingTime);
            columnName.add(2, restaurantId);
            ColumnFamilyUpdater<String, Composite> updater
                = compositeCloseTemplate.createUpdater(key);
            updater.setString(columnName, columnValue);
            compositeCloseTemplate.update(updater);
        }
    }
}

@Override
public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
    int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);
    int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
    String zipCode = deliveryAddress.getZip();
    String key = formatKey(zipCode, format2(dayOfWeek));
    HSlicePredicate<Composite> predicate = new HSlicePredicate<Composite>(new CompositeSerializer());
    Composite start = new Composite();
    Composite finish = new Composite();
    start.addComponent(0, format4(timeOfDay), ComponentEquality.GREATER_THAN_EQUAL);
    finish.addComponent(0, format4(2359), ComponentEquality.GREATER_THAN_EQUAL);
    predicate.setRange(start, finish, false, 100);
    final List<AvailableRestaurantIndexEntry> closingAfter = new ArrayList<AvailableRestaurantIndexEntry>();
    ColumnFamilyRowMapper<String, Composite, Object> mapper = new ColumnFamilyRowMapper<String, Composite, Object>() {
        @Override
        public Object mapRow(ColumnFamilyResult<String, Composite> results) {
            for (Composite columnName : results.getColumnNames()) {
                String openTime = columnName.get(1, new StringSerializer());
                String restaurantId = columnName.get(2, new StringSerializer());
                closingAfter.add(new AvailableRestaurantIndexEntry(openTime, restaurantId, results.getString(columnName)));
            }
            return null;
        }
    };
    compositeCloseTemplate.queryColumns(key, predicate, mapper);
    List<AvailableRestaurant> result = new LinkedList<AvailableRestaurant>();
    for (AvailableRestaurantIndexEntry trIdAndAvailableRestaurant : closingAfter) {
        if (trIdAndAvailableRestaurant.isOpenBefore(timeOfDay))
            result.add(trIdAndAvailableRestaurant.getAvailableRestaurant());
    }
    return result;
}
61. Mongo vs. Cassandra
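[Diagram: MongoDB across two data centers. The Shard A master is in DC1 and the Shard B master is in DC2; each data center has its own client, and access to a master in the other data center is labeled "Remote".]
[Diagram: Cassandra across two data centers. Cassandra nodes in DC1 and DC2 replicate to each other, labeled "Async or Sync"; each data center has its own client.]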
2/14/12 Copyright (c) 2011 Chris Richardson. All rights reserved.
62. VoltDB - attempt #1
@ProcInfo( singlePartition = false)
public class FindAvailableRestaurants extends VoltProcedure { ... }
ERROR 10:12:03,251 [main] COMPILER: Failed to plan for statement
type(findAvailableRestaurants_with_join) select r.* from restaurant
r,time_range tr, service_area sa Where ? = sa.zip_code and r.id
=tr.restaurant_id and r.id = sa.restaurant_id and tr.day_of_week=?
and tr.open_time <= ? and ? <= tr.close_time Error: "Unable to plan
for statement. Likely statement is joining two partitioned tables in a
multi-partition statement. This is not supported at this time."
ERROR 10:12:03,251 [main] COMPILER: Catalog compilation failed.
Bummer!
63. VoltDB - attempt #2
@ProcInfo( singlePartition = true, partitionInfo = "Restaurant.id: 0")
public class AddRestaurant extends VoltProcedure {
    public final SQLStmt insertAvailable=
        new SQLStmt("INSERT INTO available_time_range VALUES (?,?,?, ?, ?, ?);");
    public long run(....) {
        ...
        for (int i = 0; i < daysOfWeek.length ; i++) {
            voltQueueSQL(insertOpeningTimes, id, daysOfWeek[i], openingTimes[i], closingTimes[i]);
            for (String zipCode : serviceArea) {
                voltQueueSQL(insertAvailable, id, daysOfWeek[i], openingTimes[i],
                    closingTimes[i], zipCode, name);
            }
        }
        ...
        voltExecuteSQL(true);
        return 0;
    }
}

public final SQLStmt findAvailableRestaurants_denorm = new SQLStmt(
    "select restaurant_id, name from available_time_range tr " +
    "where ? = tr.zip_code " +
    "and tr.day_of_week=? " +
    "and tr.open_time <= ? " +
    " and ? <= tr.close_time ");

Works, but queries are only slightly faster than MySQL!
64. VoltDB - attempt #3
<partitions>
...
<partition table="available_time_range" column="zip_code"/>
</partitions>
@ProcInfo( singlePartition = false, ...)
public class AddRestaurant extends VoltProcedure { ... }
@ProcInfo( singlePartition = true,
partitionInfo = "available_time_range.zip_code: 0")
public class FindAvailableRestaurants extends VoltProcedure { ... }
Queries are really fast but inserts are not :-(
Partitioning scheme: optimal for some use cases but not others
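Why the two operations behave differently can be sketched with a toy partition function. This is an assumption-laden illustration: `ZipCodePartitioning` and `partitionFor` are hypothetical, and the hash used here is not VoltDB's actual partitioning scheme.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ZipCodePartitioning {
    // Hypothetical partition function; stands in for whatever the database really uses.
    static int partitionFor(String zipCode, int partitionCount) {
        return Math.floorMod(zipCode.hashCode(), partitionCount);
    }

    // A query names one zip code, so it runs on exactly one partition.
    static Set<Integer> partitionsForQuery(String zipCode, int partitionCount) {
        return Set.of(partitionFor(zipCode, partitionCount));
    }

    // An insert writes one row per zip code in the service area, so it may span partitions.
    static Set<Integer> partitionsForInsert(List<String> serviceArea, int partitionCount) {
        Set<Integer> partitions = new TreeSet<>();
        for (String zip : serviceArea) {
            partitions.add(partitionFor(zip, partitionCount));
        }
        return partitions;
    }

    public static void main(String[] args) {
        int partitionCount = 3;
        System.out.println("query touches:  " + partitionsForQuery("94619", partitionCount));
        System.out.println("insert touches: " + partitionsForInsert(List.of("94619", "94707"), partitionCount));
    }
}
```

Partitioning by restaurant id instead would reverse the picture: inserts would stay on one partition and the zip-code query would have to visit all of them.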
65. Summary…
• Relational databases are great BUT there are limitations
• Each NoSQL database solves some problems BUT
  - Limited transactions: NoSQL = NoACID
  - One day needing ACID → major rewrite
  - Query-driven, denormalized database design
  - …
• NewSQL databases such as VoltDB provide SQL, ACID transactions and incredible performance BUT
  - Not all operations are fast
  - Non-JDBC API
66. … Summary
• Very carefully pick the NewSQL/NoSQL DB for your application
• Consider a polyglot persistence architecture
• Encapsulate your data access code so you can switch
• Startups = avoid NewSQL/NoSQL for shorter time to market?
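One way to encapsulate data access is to put the query behind an interface, so the database can be swapped without touching callers. A minimal sketch, assuming Java 16+ records; `AvailableRestaurantFinder`, `InMemoryRestaurantFinder`, and `Availability` are hypothetical names, and the method signature is simplified from the one shown on slide 59:

```java
import java.util.ArrayList;
import java.util.List;

class FinderDemo {
    public static void main(String[] args) {
        InMemoryRestaurantFinder store = new InMemoryRestaurantFinder();
        store.add(new InMemoryRestaurantFinder.Availability("Ajanta", "94619", "Monday", 1730, 2130));
        store.add(new InMemoryRestaurantFinder.Availability("Egg", "94619", "Monday", 700, 1430));
        AvailableRestaurantFinder finder = store; // the caller sees only the interface
        System.out.println(finder.findAvailableRestaurants("94619", "Monday", 1815));
    }
}

// Hypothetical interface: callers depend on this, never on a database driver.
interface AvailableRestaurantFinder {
    List<String> findAvailableRestaurants(String zipCode, String dayOfWeek, int timeOfDay);
}

// In-memory stand-in so the sketch runs without a database.
// A MongoDB, Cassandra, or VoltDB class would implement the same interface.
class InMemoryRestaurantFinder implements AvailableRestaurantFinder {
    record Availability(String name, String zipCode, String dayOfWeek, int openTime, int closeTime) {}

    private final List<Availability> rows = new ArrayList<>();

    void add(Availability row) {
        rows.add(row);
    }

    @Override
    public List<String> findAvailableRestaurants(String zipCode, String dayOfWeek, int timeOfDay) {
        List<String> names = new ArrayList<>();
        for (Availability a : rows) {
            if (a.zipCode().equals(zipCode) && a.dayOfWeek().equals(dayOfWeek)
                    && a.openTime() <= timeOfDay && timeOfDay <= a.closeTime()) {
                names.add(a.name());
            }
        }
        return names;
    }
}
```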
67. Thank you!
Sign up at CloudFoundry.com using promo code JFokus
My contact info:
chris.richardson@springsource.com
@crichardson