The future of scalable data processing is event-driven microservices! They provide a powerful paradigm that solves issues typically associated with distributed applications such as availability, data consistency, or communication complexity, and allows the creation of sophisticated and extensible data processing pipelines.
Building on the ease of development and deployment provided by Spring Boot and the cloud-native capabilities of Spring Cloud, the Spring Cloud Stream project provides a simple and powerful framework for creating event-driven microservices. It makes it easy to develop data-processing Spring Boot applications that build upon the capabilities of Spring Integration. At a higher level of abstraction, Spring Cloud Data Flow is an integrated orchestration layer that provides a highly productive experience for deploying and managing sophisticated data pipelines consisting of standalone microservices. Streams are defined using a DSL abstraction and can be managed via a shell and a web UI. Furthermore, a pluggable runtime SPI allows Spring Cloud Data Flow to coordinate these applications across a variety of distributed runtime platforms such as Apache YARN, Cloud Foundry, Kubernetes, or Apache Mesos.
2.
Stream Processing in the Cloud with Data Microservices
• Use Cases
• Predictive maintenance
• Fraud detection
• QoS measurement
• Log analysis
• High throughput/low latency
• Growing quantities of data
• Immediate response is required
• Grouping and ordering of data
• Partitioning
• Windowing
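Windowing, listed above, can be illustrated with a minimal plain-Java sketch of a tumbling (fixed-size, non-overlapping) window that counts events per window. The class name `TumblingWindowCounter` and the one-second window are assumptions made for illustration; the slides do not prescribe a particular windowing implementation.

```java
import java.util.Map;
import java.util.TreeMap;

/** Counts events per fixed-size, non-overlapping (tumbling) time window. */
final class TumblingWindowCounter {

    private final long windowSizeMillis;
    // Window start timestamp -> number of events recorded in that window.
    private final Map<Long, Integer> counts = new TreeMap<>();

    TumblingWindowCounter(long windowSizeMillis) {
        if (windowSizeMillis <= 0) {
            throw new IllegalArgumentException("window size must be positive");
        }
        this.windowSizeMillis = windowSizeMillis;
    }

    /** Returns the start of the window that contains the given timestamp. */
    long windowStart(long timestampMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, windowSizeMillis);
    }

    /** Assigns the event to its window and increments that window's count. */
    void record(long timestampMillis) {
        counts.merge(windowStart(timestampMillis), 1, Integer::sum);
    }

    Map<Long, Integer> counts() {
        return counts;
    }

    public static void main(String[] args) {
        TumblingWindowCounter counter = new TumblingWindowCounter(1000);
        for (long ts : new long[] {100, 900, 1000, 1500, 2999}) {
            counter.record(ts);
        }
        System.out.println(counter.counts()); // prints {0=2, 1000=2, 2000=1}
    }
}
```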
3.
• Huge quantities of data to be analyzed efficiently
• Scaling requirements
• Massive storage
• Massive computing power (memory/CPU)
• Massive scalability, from a few machines to data center level
• Reliance on platform’s resource management abilities
• public and private cloud: AWS
• cluster managers: Apache YARN, Apache Mesos, Kubernetes
• full application platforms: Cloud Foundry
4.
• Microservice pattern applied to data processing applications
• Typical benefits of microservices:
• scalability, isolation, agility, continuous deployment, operational control
• Tuning process-specific resources
• Instance count
• Memory
• CPU
• Event-driven interaction
• communication decoupling, distributed data consistency
• There’s life after REST
10.
Consumer groups
▪ Borrowed from Kafka, applied to all binders
▪ Groups of competing consumers within the pub-sub destination
▪ Durable subscriptions
▪ Used in scaling and partitioning
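The delivery semantics above can be sketched with a toy in-memory destination: every group receives each published message, and within a group exactly one member handles it. This is only an illustration in plain Java, not binder code. The class `GroupedDestination` is hypothetical, round-robin selection within a group is an assumption (real brokers balance load in their own ways), and durable subscriptions are not modeled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy pub-sub destination with competing consumers inside each group. */
final class GroupedDestination {

    // Group name -> one inbox per consumer instance in that group.
    private final Map<String, List<List<String>>> groups = new LinkedHashMap<>();
    // Group name -> number of messages delivered so far (drives round-robin).
    private final Map<String, Integer> delivered = new LinkedHashMap<>();

    /** Registers a new consumer instance in the group and returns its inbox. */
    List<String> subscribe(String group) {
        List<String> inbox = new ArrayList<>();
        groups.computeIfAbsent(group, g -> new ArrayList<>()).add(inbox);
        return inbox;
    }

    /** Delivers the message once to every group, to a single member per group. */
    void publish(String message) {
        for (Map.Entry<String, List<List<String>>> entry : groups.entrySet()) {
            List<List<String>> members = entry.getValue();
            int sequence = delivered.merge(entry.getKey(), 1, Integer::sum) - 1;
            members.get(sequence % members.size()).add(message);
        }
    }

    public static void main(String[] args) {
        GroupedDestination destination = new GroupedDestination();
        List<String> audit = destination.subscribe("audit");
        List<String> billing1 = destination.subscribe("billing");
        List<String> billing2 = destination.subscribe("billing");
        for (String m : new String[] {"m1", "m2", "m3", "m4"}) {
            destination.publish(m);
        }
        System.out.println("audit=" + audit);       // prints audit=[m1, m2, m3, m4]
        System.out.println("billing1=" + billing1); // prints billing1=[m1, m3]
        System.out.println("billing2=" + billing2); // prints billing2=[m2, m4]
    }
}
```

Adding a second instance to the "billing" group halves the load per instance without changing what the "audit" group sees, which is why groups are the unit of scaling.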
11.
Partitioning
▪ Required for stateful processing scenarios
▪ Outputs specify a data partitioning strategy
▪ Inputs can be bound to a specific partition
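A partitioning strategy on the output side can be sketched as a function from a message key to a partition index. The example below uses hash-modulo selection, a common choice; whether a given binder uses exactly this rule is not stated in the slides, and the class `KeyPartitioner` and the sensor IDs are hypothetical. The property that matters for stateful processing is that the same key always maps to the same partition.

```java
/** Maps message keys to partition indexes so that equal keys share a partition. */
final class KeyPartitioner {

    private final int partitionCount;

    KeyPartitioner(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partition count must be positive");
        }
        this.partitionCount = partitionCount;
    }

    /** Returns an index in [0, partitionCount); floorMod keeps negative hashes in range. */
    int partitionFor(Object key) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        KeyPartitioner partitioner = new KeyPartitioner(3);
        // Both "sensor-a" readings land on the same partition, so one consumer
        // instance sees all state for that sensor.
        for (String sensorId : new String[] {"sensor-a", "sensor-b", "sensor-a"}) {
            System.out.println(sensorId + " -> partition " + partitioner.partitionFor(sensorId));
        }
    }
}
```

On the input side, each consumer instance would then be bound to one partition index and would process only the keys that map to it.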
12.
Programming model: Spring Integration
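// Source.java (ships with Spring Cloud Stream)
package org.springframework.cloud.stream.messaging;
import org.springframework.cloud.stream.annotation.Output;
import org.springframework.messaging.MessageChannel;
public interface Source {
    String OUTPUT = "output";
    @Output(Source.OUTPUT)
    MessageChannel output();
}

// Application.java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.messaging.Source;
import org.springframework.integration.annotation.InboundChannelAdapter;
@EnableBinding(Source.class)
@SpringBootApplication
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
    // Polled by Spring Integration; each returned value is sent to the "output" channel.
    @InboundChannelAdapter(Source.OUTPUT)
    public String sayHello() {
        return "hello" + System.currentTimeMillis();
    }
}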
14.
@Enable All the Things
▪ @EnableBinding(Source.class)
▪ one output
▪ @EnableBinding(Sink.class)
▪ one input
▪ @EnableBinding(Processor.class)
▪ one input and one output
▪ @EnableBinding(MyOrderHandler.class)
▪ custom interfaces with as many inputs and outputs as needed
▪ @EnableRxJavaProcessor
▪ OOTB support for RxJava with one input and one output
17.
Spring Cloud Data Flow
▪ Portable Orchestration Layer for Streams and Tasks
▪ Stream and Task DSL
http | transform | hdfs
▪ REST API
▪ Shell
▪ UI
▪ OOTB apps for common integration use-cases
18.
Spring Cloud Deployer
▪ SPI for deploying applications to modern runtimes
▪ Local (for testing)
▪ YARN
▪ Cloud Foundry
▪ Kubernetes
▪ Mesos + Marathon
19.
dataflow:> stream create demo --definition "http | file"
▪ Stream definition
▪ Launching Boot applications
▪ Can pass configuration parameters via Spring Boot
▪ Control instance count, resource allocation
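To make the step from a stream definition to a set of launched applications more concrete, here is a rough plain-Java sketch that splits a pipe-delimited definition into application names and derives one connecting destination per adjacent pair. It is not Spring Cloud Data Flow's parser: the class `StreamDefinitionSketch` is hypothetical, the `<stream>.<app>` destination naming is an assumption used for illustration, and application options (for example `--port=9000`) are not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified view of how a definition like "http | file" maps to apps and destinations. */
final class StreamDefinitionSketch {

    /** Splits a pipe-delimited definition into trimmed application names. */
    static List<String> parseApps(String definition) {
        List<String> apps = new ArrayList<>();
        for (String part : definition.split("\\|")) {
            String name = part.trim();
            if (name.isEmpty()) {
                throw new IllegalArgumentException("empty application name in: " + definition);
            }
            apps.add(name);
        }
        return apps;
    }

    /** One destination per adjacent pair: the left app writes to it, the right app reads from it. */
    static List<String> destinations(String streamName, List<String> apps) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < apps.size() - 1; i++) {
            // Assumed naming scheme, for illustration only.
            result.add(streamName + "." + apps.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> apps = parseApps("http | transform | hdfs");
        System.out.println(apps);                       // prints [http, transform, hdfs]
        System.out.println(destinations("demo", apps)); // prints [demo.http, demo.transform]
    }
}
```

Each name in the parsed list corresponds to one Boot application to launch, and each destination is what the neighbouring applications' output and input would be bound to.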
24.
Roadmap
▪ Spring Cloud Stream 1.0.0.RC1 - March 22, 2016
▪ Spring Cloud Stream 1.0.0.GA - March 2016
▪ Spring Cloud Data Flow - 1.0.0.GA Q2 2016