R2DBC started as an experiment to enable integration of SQL databases into systems that use reactive programming models. It is now a robust specification that can be implemented to manage data in a fully reactive and completely non-blocking fashion.
11. The Anatomy of a Data Stream
Variables: changes in data
Starting event: initiates stream of data
Error: disruption in data
Completion: all data is processed
Time
@probablyrealrob
15. Reactive Streams API
public interface Publisher<T> {
public void subscribe(Subscriber<? super T> s);
}
public interface Subscriber<T> {
public void onSubscribe(Subscription s);
public void onNext(T t);
public void onError(Throwable t);
public void onComplete();
}
public interface Subscription {
public void request(long n);
public void cancel();
}
Project Reactor
22. Goals and design principles
✓ Provide a completely open specification (Apache v2 license)
✓ Be completely non-blocking, all the way to the database
✓ Utilize Reactive Streams Types and Patterns
✓ Provide a minimal set of operations that are implementation specific
✓ Enable “humane” APIs to be built on top of the driver
33. Establishing a connection
// Create a new ConnectionFactory object
MariadbConnectionFactory connFactory
= new MariadbConnectionFactory(config);
// Obtain a Connection object from a ConnectionFactory
MariadbConnection conn = connFactory.create().block();
34. SELECT
MariadbStatement selectStatement = conn.createStatement("select * from todo.tasks");
Publisher<MariadbResult> selectPublisher = selectStatement.execute();
Flux.from(selectPublisher)
.flatMap(
res -> res.map(
(row, metadata) -> {
int id = row.get(0, Integer.class);
String description = row.get(1, String.class);
Boolean completed = row.get(2, Boolean.class);
return new Task(id,description,completed);
})
).subscribe(task -> {
// Do something with task
});
Project Reactor
Publisher<T> == Flux<Task>
Just like object-oriented programming, functional programming, or procedural programming, reactive programming is just another programming paradigm.
Reactive Streams, on the other hand, is a specification. For Java programmers, Reactive Streams is an API. Reactive Streams gives us a common API for Reactive Programming in Java.
--Up to this point 8 min--
Publisher: producer of data items
Subscriber: consumer of data items
Subscription: the link between a publisher and a subscriber (the subscription token)
Processor: a data-processing stage
Publisher publishes a stream of data to its registered Subscribers. It typically publishes items asynchronously, using an Executor. A Publisher must ensure that the Subscriber methods for each subscription are called strictly in sequence.
Subscriber subscribes to a Publisher's stream and receives callbacks. A Subscriber that does not request data receives none. For a given Subscription, the Subscriber's methods are called strictly in sequence.
onSubscribe: called by the publisher to hand the Subscription to the subscriber; it runs after publisher.subscribe is called
onNext: called by the publisher to pass a data item to the subscriber
onError: called when the Publisher or Subscription encounters an unrecoverable error; no other method is called afterwards
onComplete: called when all data has been sent and no error has terminated the subscription; no other method is called afterwards
Subscription connects a Publisher to a Subscriber. The Subscriber receives items only when it requests them, and it can unsubscribe through the Subscription. Subscription has two main methods:
request: called by the subscriber to request data
cancel: called by the subscriber to unsubscribe and detach from the publisher
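The request/cancel handshake described above can be sketched with the JDK's java.util.concurrent.Flow types, which mirror the Reactive Streams interfaces. ListPublisher and TakeSubscriber are hypothetical names made up for this illustration, and the publisher is deliberately single-threaded; a real implementation has to handle concurrency and the full rule set of the specification.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Flow;

public class BackpressureSketch {

    /** Hypothetical single-threaded publisher: hands out list items only on demand. */
    static final class ListPublisher<T> implements Flow.Publisher<T> {
        private final List<T> items;

        ListPublisher(List<T> items) {
            this.items = items;
        }

        @Override
        public void subscribe(Flow.Subscriber<? super T> subscriber) {
            Iterator<T> it = items.iterator();
            subscriber.onSubscribe(new Flow.Subscription() {
                private long demand;
                private boolean draining;
                private boolean done;

                @Override
                public void request(long n) {
                    if (done) {
                        return;
                    }
                    if (n <= 0) {
                        done = true;
                        subscriber.onError(new IllegalArgumentException("request must be positive"));
                        return;
                    }
                    demand += n;
                    if (draining) {
                        return; // re-entrant call from onNext: the loop below picks up the new demand
                    }
                    draining = true;
                    while (demand > 0 && !done && it.hasNext()) {
                        demand--;
                        subscriber.onNext(it.next());
                    }
                    if (!done && !it.hasNext()) {
                        done = true;
                        subscriber.onComplete();
                    }
                    draining = false;
                }

                @Override
                public void cancel() {
                    done = true; // no further signals are sent after cancellation
                }
            });
        }
    }

    /** Hypothetical subscriber: requests one item at a time and cancels after `limit` items. */
    static final class TakeSubscriber<T> implements Flow.Subscriber<T> {
        final List<T> received = new ArrayList<>();
        boolean completed;
        Throwable error;
        private final int limit;
        private Flow.Subscription subscription;

        TakeSubscriber(int limit) {
            this.limit = limit;
        }

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1); // nothing is delivered until this call
        }

        @Override
        public void onNext(T item) {
            received.add(item);
            if (received.size() == limit) {
                subscription.cancel();
            } else {
                subscription.request(1);
            }
        }

        @Override
        public void onError(Throwable t) {
            error = t;
        }

        @Override
        public void onComplete() {
            completed = true;
        }
    }

    public static void main(String[] args) {
        TakeSubscriber<Integer> subscriber = new TakeSubscriber<>(3);
        new ListPublisher<>(List.of(1, 2, 3, 4, 5)).subscribe(subscriber);
        System.out.println(subscriber.received + " completed=" + subscriber.completed);
    }
}
```

Because the subscriber cancels after three items, the publisher never sends the remaining two and never signals onComplete.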
Processor sits between a Publisher and a Subscriber and transforms data. Multiple Processors can be chained, with the results of the last one sent to the Subscriber. The JDK does not provide any concrete processors. A Processor is both a subscriber and a publisher: its interface extends both, so it receives data as a subscriber, processes it, and publishes the result as a publisher.
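Since the JDK ships no concrete processor, here is a minimal sketch of one, assuming the JDK's java.util.concurrent.Flow and SubmissionPublisher as stand-ins for the Reactive Streams types. MapProcessor and the "task-" labels are hypothetical names for illustration only.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.function.Function;

public class ProcessorSketch {

    /** Hypothetical processor: a subscriber to T items that republishes mapped R items. */
    static final class MapProcessor<T, R> extends SubmissionPublisher<R>
            implements Flow.Processor<T, R> {
        private final Function<? super T, ? extends R> mapper;
        private Flow.Subscription upstream;

        MapProcessor(Function<? super T, ? extends R> mapper) {
            this.mapper = mapper;
        }

        @Override
        public void onSubscribe(Flow.Subscription subscription) {
            upstream = subscription;
            subscription.request(1);
        }

        @Override
        public void onNext(T item) {
            submit(mapper.apply(item)); // publish the transformed item downstream
            upstream.request(1);        // then ask upstream for the next one
        }

        @Override
        public void onError(Throwable t) {
            closeExceptionally(t);
        }

        @Override
        public void onComplete() {
            close();
        }
    }

    static List<String> run(List<Integer> input) {
        List<String> out = new CopyOnWriteArrayList<>();
        SubmissionPublisher<Integer> source = new SubmissionPublisher<>();
        MapProcessor<Integer, String> toLabel = new MapProcessor<>(i -> "task-" + i);

        source.subscribe(toLabel);                                 // processor acts as subscriber
        CompletableFuture<Void> done = toLabel.consume(out::add);  // processor acts as publisher

        input.forEach(source::submit);
        source.close(); // completion travels source -> processor -> consumer
        done.join();    // wait until the consumer has seen onComplete
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(1, 2, 3)));
    }
}
```

The final consumer is attached to the processor before any item is submitted, because a SubmissionPublisher drops items that arrive while it has no subscribers.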
A stream is a sequence of ongoing events ordered in time. It can emit three different things: a value (of some type), an error, or a "completed" signal. For a stream of button clicks, for instance, "completed" takes place when the window or view containing that button is closed.
We capture these emitted events only asynchronously, by defining a function that will execute when a value is emitted, another function when an error is emitted, and another function when 'completed' is emitted. Sometimes these last two can be omitted and you can just focus on defining the function for values. The "listening" to the stream is called subscribing. The functions we are defining are observers. The stream is the subject (or "observable") being observed. This is precisely the Observer Design Pattern.
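As a small illustration of subscribing with only a value handler, the sketch below uses the JDK's SubmissionPublisher.consume, which takes the function for values and reports completion or an error through the returned CompletableFuture. The "click" strings are made-up sample events.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.SubmissionPublisher;

public class ObserverSketch {

    static List<String> collectClicks() {
        List<String> seen = new CopyOnWriteArrayList<>();
        SubmissionPublisher<String> clicks = new SubmissionPublisher<>();

        // Only the value function is defined; completion and errors surface via the future.
        CompletableFuture<Void> finished = clicks.consume(seen::add);

        clicks.submit("click-1");
        clicks.submit("click-2");
        clicks.close();  // emits the "completed" signal

        finished.join(); // returns normally on completion, throws if an error was emitted
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(collectClicks());
    }
}
```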
Reactive Streams is an initiative started in 2013 by companies such as Netflix, Pivotal, and Lightbend. The core goal is to provide a standard for asynchronous stream processing with non-blocking back pressure.
Available in JDK9 java.util.concurrent.Flow
Also, github -> Reactive Streams SPI
An initiative started in 2013 (Netflix, Pivotal, Lightbend and many others)
A specification
API types
Technology Compatibility Kit (TCK)
Provide a standard for asynchronous stream processing with non-blocking back pressure.
Akka Streams is a Reactive Streams and JDK 9+ java.util.concurrent.Flow-compliant implementation and therefore fully interoperable with other implementations.
HIGHLY EFFICIENT, more so than highly performant
Much reactive code is largely synchronous, but where databases are involved it meshes well with asynchronous (multi-threaded) processing
Reactive Programming raises the level of abstraction of your code so you can focus on the interdependence of events that define the business logic, rather than having to constantly fiddle with a large amount of implementation details. Code in RP will likely be more concise.
The benefit is more evident in modern webapps and mobile apps that are highly interactive with a multitude of UI events related to data events. 10 years ago, interaction with web pages was basically about submitting a long form to the backend and performing simple rendering to the frontend. Apps have evolved to be more real-time: modifying a single form field can automatically trigger a save to the backend, "likes" to some content can be reflected in real time to other connected users, and so forth.
Apps nowadays have an abundance of real-time events of every kind that enable a highly interactive experience for the user. We need tools for properly dealing with that, and Reactive Programming is an answer.
--Up to this point 12 min--
R2DBC stands for Reactive Relational Database Connectivity. R2DBC started as an experiment and proof of concept to enable integration of SQL databases into systems that use reactive programming models: reactive in the sense of an event-driven, non-blocking, and functional programming model that does not make assumptions over concurrency or asynchronicity. Instead, it assumes that scheduling and parallelization happen as part of runtime scheduling.
TL;DR: Brings reactive programming APIs to relational databases
A key difference between R2DBC and imperative data access SPIs is the deferred nature of execution. R2DBC is, therefore, based on Reactive Streams and uses the concepts of Publisher and Subscriber to allow non-blocking back-pressure-aware data access.
TODO:
When?? 2018? Started?
First official release??
Pivotal
Version 0.8.1
Last release Feb. 4th 2020
There are a variety of ways that reactive development can be done, but the community has come together and standardized on Reactive Streams, so R2DBC wanted to follow this community-driven convention
R2DBC = Just a spec… brand new wire-protocol, networking protocols between clients and databases
Reactive Streams - R2DBC is founded on Reactive Streams providing a fully reactive non-blocking API.
Relational Databases - R2DBC engages SQL databases with a reactive API, something not possible with the blocking nature of JDBC.
Scalable Solutions - Reactive Streams makes it possible to move from the classic one thread per connection approach to a more powerful, more scalable approach.
Open Specification - R2DBC is an open specification establishing a SPI that driver vendors can implement and clients can consume.
0.8.4 release available in Maven Central Repository
Going to go through some of the API
The R2DBC SPI provides reactive programmatic access to relational databases from Java and other JVM-based programming languages.
R2DBC provides an API for Java programs to access one or more sources of data. In the majority of cases, the data source is a relational DBMS and its data is accessed using SQL. R2DBC drivers are not limited to RDBMS but can be implemented on top of other data sources, including stream-oriented systems and object-oriented systems. A primary motivation for R2DBC SPI is to provide a standard API for reactive applications to integrate with a wide variety of data sources.
The following guidelines apply to R2DBC compliance:
An R2DBC SPI should implement SQL support as its primary interface. R2DBC does not rely upon (nor does it presume) a specific SQL version. SQL and aspects of statements can be entirely handled in the data source or as part of the driver.
The specification consists of this specification document and the specifications documented in each interface’s Javadoc.
Drivers supporting parameterized statements must support bind parameter markers.
Drivers supporting parameterized statements must support at least one parameter binding method (indexed or named).
Drivers must support transactions.
Index references to columns and parameters are zero-based. That is, the first index begins with 0.
ConnectionFactory allows you to create a connection and get the metadata back for a connection.
However, when you use “create” you don’t actually get a connection back, you get a Publisher, which of course is a Reactive Streams type that says “at some point in the future, after you’ve subscribed to me, I will give you a connection”. This is in line with the type of message-driven behavior that drives reactive development.
ConnectionFactory: A ConnectionFactory is implemented by a driver and provides access to Connection creation. An application that wants to configure vendor-specific aspects of a driver can use the vendor-specific ConnectionFactory creation mechanism to configure a ConnectionFactory.
The following rules apply:
A ConnectionFactory represents a resource factory for deferred connection creation. It may create connections by itself, wrap a ConnectionFactory, or apply connection pooling on top of a ConnectionFactory.
A ConnectionFactory provides metadata about the driver itself through ConnectionFactoryMetadata.
A ConnectionFactory uses deferred initialization and should initiate connection resource allocation after requesting the item (Subscription.request(1)).
Connection creation must emit exactly one Connection or an error signal.
Connection creation must be cancellable (Subscription.cancel()). Canceling connection creation must release (“close”) the connection and all associated resources.
A ConnectionFactory should expect that it can be wrapped. Wrappers must implement the Wrapped<ConnectionFactory> interface and return the underlying ConnectionFactory when Wrapped.unwrap() gets called.
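The deferred-creation rules above can be sketched without a driver. This is an illustration under stated assumptions, not how any real driver is written: FakeConnection, FakeConnectionFactory, and Capture are hypothetical names, and the JDK's Flow types stand in for the org.reactivestreams types that R2DBC actually uses.

```java
import java.util.concurrent.Flow;
import java.util.concurrent.atomic.AtomicInteger;

public class DeferredCreateSketch {

    /** Hypothetical stand-in for a driver connection. */
    static final class FakeConnection {
    }

    /** Hypothetical factory: create() only describes the work, request(1) triggers it. */
    static final class FakeConnectionFactory {
        final AtomicInteger opened = new AtomicInteger();

        Flow.Publisher<FakeConnection> create() {
            return subscriber -> subscriber.onSubscribe(new Flow.Subscription() {
                private boolean finished;

                @Override
                public void request(long n) {
                    if (finished) {
                        return;
                    }
                    finished = true;
                    if (n <= 0) {
                        subscriber.onError(new IllegalArgumentException("request must be positive"));
                        return;
                    }
                    opened.incrementAndGet();               // resource allocation happens here
                    subscriber.onNext(new FakeConnection()); // exactly one connection
                    subscriber.onComplete();
                }

                @Override
                public void cancel() {
                    // Nothing has been allocated yet in this synchronous sketch,
                    // so there is nothing to release; a real driver would close here.
                    finished = true;
                }
            });
        }
    }

    /** Hypothetical subscriber that exposes its subscription so the caller decides when to request. */
    static final class Capture implements Flow.Subscriber<FakeConnection> {
        Flow.Subscription subscription;
        FakeConnection connection;
        boolean completed;
        Throwable error;

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
        }

        @Override
        public void onNext(FakeConnection c) {
            connection = c;
        }

        @Override
        public void onError(Throwable t) {
            error = t;
        }

        @Override
        public void onComplete() {
            completed = true;
        }
    }

    public static void main(String[] args) {
        FakeConnectionFactory factory = new FakeConnectionFactory();
        Capture capture = new Capture();
        factory.create().subscribe(capture);
        System.out.println("opened after subscribe: " + factory.opened.get());
        capture.subscription.request(1);
        System.out.println("opened after request(1): " + factory.opened.get());
    }
}
```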
Once you have that connection, that’s when you get to start doing more interesting stuff.
We can manage transactions (beginning, committing, and rolling them back) and, of course, close the connections.
You can create batches, and Statements. We’re going to focus on this coming up.
Taking a closer look, you can see that most of them return the same type: Publisher<Void>. That’s because, if you think back to how Reactive Programming works, you’re always saying that “something will happen in the future”, so you can’t simply return void.
In the reactive world, that is what Publisher<Void> represents: something that will signal completion but will never actually send any data.
Because I don’t actually know when you’re going to send the packet to begin a transaction, but as soon as you’ve done that, let me know, and then I’ll know to queue up anything that happens after that.
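A completion-only publisher can be sketched as follows. The helper completionOf and the CompletionWatcher class are hypothetical, the log messages are made up, and the JDK's Flow types again stand in for the Reactive Streams types; the point is only that onNext is never called while onComplete still tells the caller when it is safe to continue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

public class VoidPublisherSketch {

    /** Hypothetical helper: runs the action on the first request, then only signals completion. */
    static Flow.Publisher<Void> completionOf(Runnable action) {
        return subscriber -> subscriber.onSubscribe(new Flow.Subscription() {
            private boolean finished;

            @Override
            public void request(long n) {
                if (finished) {
                    return;
                }
                finished = true;
                try {
                    action.run();
                } catch (RuntimeException e) {
                    subscriber.onError(e);
                    return;
                }
                subscriber.onComplete(); // no onNext: there is no data to send
            }

            @Override
            public void cancel() {
                finished = true;
            }
        });
    }

    /** Hypothetical subscriber that counts items and records the terminal signal. */
    static final class CompletionWatcher implements Flow.Subscriber<Void> {
        int items;
        boolean completed;
        Throwable error;

        @Override
        public void onSubscribe(Flow.Subscription s) {
            s.request(1);
        }

        @Override
        public void onNext(Void item) {
            items++;
        }

        @Override
        public void onError(Throwable t) {
            error = t;
        }

        @Override
        public void onComplete() {
            completed = true;
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        CompletionWatcher begin = new CompletionWatcher();
        completionOf(() -> log.add("BEGIN sent")).subscribe(begin);
        if (begin.completed) {
            log.add("safe to queue the next statement");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```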
R2DBC uses the Connection interface to define a logical connection API to the underlying data source. The structure of a connection depends on the actual requirements of the data source and how the driver implements these.
The data source can be an RDBMS, a stream-oriented data system, or some other source of data with a corresponding R2DBC driver. A single application that uses R2DBC SPI can maintain multiple connections to either a single data source or across multiple data sources. From an R2DBC driver perspective, a Connection object represents a single client session. It has associated state information, such as user ID and what transaction semantics are in effect. A Connection object is not safe for concurrent state-changing by multiple subscribers. A connection object can be shared across multiple threads that serially run operations by using appropriate synchronization mechanisms.
To obtain a connection, the application can:
Interact with the ConnectionFactories class by working with one or more ConnectionFactoryProvider implementations.
Directly interact with a ConnectionFactory implementation.
IsolationLevel getTransactionIsolationLevel();
boolean isAutoCommit();
Publisher<Void> createSavepoint(String name);
Publisher<Void> releaseSavepoint(String name);
Publisher<Void> rollbackTransaction();
Publisher<Void> rollbackTransactionToSavepoint(String name);
Publisher<Void> setAutoCommit(boolean autoCommit);
Publisher<Void> setTransactionIsolationLevel(IsolationLevel isolationLevel);
Publisher<Boolean> validate(ValidationDepth depth);
Statements. This is where things become much more interesting for developers.
You can of course execute statements, but what’s so interesting here is that you can bind data (bind to some kind of identifier) to the statements you wanted to execute.
R2DBC has been designed in such a way that, while the idea of “binding” is specified, the actual implementation has been left up to the database driver (MariaDB in this case), because very little of SQL is actually portable between databases. There’s this idea of different dialects that allow for writing SQL that is optimized for the different kinds of vendors. That has been left up to the vendors to engineer, not R2DBC.
You can bind to named parameters, or you can bind to positional ones. We’ll take a closer look at this in a couple of slides and in the demo.
R2DBC decided to break convention with JDBC and go with 0 indexing. You know, like the rest of the world. :)
After that is all said and done, you “execute”, at which point you subscribe to a publisher that sends you back either a result or a stream of results. (See how it’s all coming together now?)
The Statement interface defines methods for running SQL statements. SQL statements may contain parameter bind markers for input parameters.
The Statement interface defines bind(…) and bindNull(…) methods to provide parameter values for bind marker substitution. The parameter type is defined by the actual value that is bound to a parameter. Each bind method accepts two arguments. The first is either an ordinal position parameter starting at 0 (zero) or the parameter placeholder representation. The method of parameter binding (positional or by identifier) is vendor-specific, and a driver should document its preferred binding mechanism. The second and any remaining parameters specify the value to be assigned to the parameter. The following example shows how to bind parameters to a statement object by using placeholders:
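A minimal, driver-free sketch of both binding styles follows. The Statement interface here is a hypothetical, cut-down stand-in so the code runs without a database; only the two bind signatures mirror io.r2dbc.spi.Statement. RecordingStatement, the SQL text, and the ":id" marker with the name "id" are assumptions for illustration, since marker syntax is driver-specific.

```java
import java.util.Map;
import java.util.TreeMap;

public class BindSketch {

    /** Hypothetical stand-in; only these two signatures mirror io.r2dbc.spi.Statement. */
    interface Statement {
        Statement bind(int index, Object value);

        Statement bind(String name, Object value);
    }

    /** Hypothetical implementation that just records what was bound. */
    static final class RecordingStatement implements Statement {
        final String sql;
        final Map<Integer, Object> byIndex = new TreeMap<>();
        final Map<String, Object> byName = new TreeMap<>();

        RecordingStatement(String sql) {
            this.sql = sql;
        }

        @Override
        public RecordingStatement bind(int index, Object value) {
            if (index < 0) {
                throw new IndexOutOfBoundsException("bind index is zero-based: " + index);
            }
            byIndex.put(index, value);
            return this;
        }

        @Override
        public RecordingStatement bind(String name, Object value) {
            byName.put(name, value);
            return this;
        }
    }

    public static void main(String[] args) {
        // Positional binding: the first marker has index 0, not 1 as in JDBC.
        RecordingStatement insert = new RecordingStatement(
                "INSERT INTO todo.tasks (description, completed) VALUES (?, ?)")
                .bind(0, "Write slides")
                .bind(1, false);

        // Binding by placeholder name (assumed marker syntax).
        RecordingStatement select = new RecordingStatement(
                "SELECT * FROM todo.tasks WHERE id = :id")
                .bind("id", 42);

        System.out.println(insert.byIndex);
        System.out.println(select.byName);
    }
}
```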
Results then give you the ability to find out the number of rows that were updated. Very common for insert or update.
And then, given a row (and its metadata), transform that into a Publisher of T. Basically, take the row and the row’s information and map it into an object (string, integer, whatever) and then publish that back to me. Then you’ll start getting a stream of those for each individual row returned.
Result objects are forward-only and read-only objects that allow consumption of two result types:
Tabular results
Update count
Results move forward from the first Row to the last one. After emitting the last row, a Result object gets invalidated and rows from the same Result object can no longer be consumed. Rows contained in the result depend on how the underlying database materializes the results. That is, it contains the rows that satisfy the query at either the time the query is run or as the rows are retrieved. An R2DBC driver can obtain a Result either directly or by using cursors.
Result reports the number of rows affected for SQL statements, such as updates for SQL Data Manipulation Language (DML) statements. The update count can be empty for statements that do not modify rows. After emitting the update count, a Result object gets invalidated and rows from the same Result object can no longer be consumed. The following example shows how to get a count of updated rows:
// result is a Result object
Publisher<Integer> rowsUpdated = result.getRowsUpdated();
The streaming nature of a result allows consumption of either tabular results or an update count. Depending on how the underlying database materializes results, an R2DBC driver can lift this limitation.
A Result object is emitted for each statement result in a forward-only direction.
The row itself - very simple - allows you to get a column by its identifier and tell the driver which type you think it is.
The Result interface provides a map(…) method for retrieving values from Row objects. The map method accepts a BiFunction (also referred to as mapping function) object that accepts Row and RowMetadata. The mapping function is called upon row emission with Row and RowMetadata objects. A Row is only valid during the mapping function callback and is invalid outside of the mapping function callback. Thus, Row objects must be entirely consumed by the mapping function.
R2DBC SPI -> 0.8.3 release available in Maven Central Repository
* 0.9 GA: May 2021
Finally, the 0.9 development line brings us on the path towards R2DBC 1.0.
MariaDB Driver -> 0.8.4-rc… GA end of November.
R2DBC encourages libraries to provide a “humane” API in the form of a client library. R2DBC avoids implementing user-space features in each driver, and leaves these for specific clients to implement.
If you’re eager to start using R2DBC to build an application, check out the existing clients listed below. If you’re interested in crafting your own client, check out the section on new clients further down.
Statement objects are created by Connection objects, as on the SELECT slide: conn.createStatement("select * from todo.tasks").
FLUX = A Reactive Streams Publisher with rx operators that emits 0 to N elements, and then completes (successfully or with an error).
Put the simplest way possible, databases are like buckets that hold data
Brief MariaDB history