Understanding Data Streaming
•To understand Fast Data we must first understand traditional data streaming:
•The processing of large or unbounded sequences of data
•Datasets that are too large to fit in memory or on disk
•Can be used to provide insights for data that never ends
•Typical use cases are to provide aggregations and predictions
•Usually synchronous and single-threaded
•Very low latency and close to real time
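As a minimal sketch of aggregating over data that never ends, the Python generator below keeps a running count and mean in constant memory, so the stream never has to fit in memory or on disk. The name `running_mean` and the sample readings are illustrative assumptions, not part of the original slides.

```python
from typing import Iterable, Iterator, Tuple


def running_mean(values: Iterable[float]) -> Iterator[Tuple[int, float]]:
    """Yield (count, mean) after each element, using O(1) memory."""
    count = 0
    mean = 0.0
    for value in values:
        count += 1
        # Incremental update: no earlier values are stored or re-summed.
        mean += (value - mean) / count
        yield count, mean


# Hypothetical sensor readings; a real source could be unbounded.
readings = [10.0, 20.0, 30.0, 40.0]
results = list(running_mean(readings))
```

Because the function is a generator, it is synchronous and single-threaded: each element is processed as it arrives, one at a time.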
What is Fast Data?
•The processing of high volumes of a continuous stream of data
•Fast Data combines properties from traditional stream processing, big data infrastructure,
and reactive applications
•Low to medium latency data processing in near real time
•Scales horizontally to handle a high volume of data
•Stream processing is parallelized across many CPU cores and machines by partitioning the
data stream
•Fault Tolerance provides resilience and the ability to recover from failures
•High Availability provides responsiveness and ensures uptime guarantees
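Partitioning the data stream can be sketched as follows, assuming each event carries a key and that a stable hash of the key selects the partition. The function names and the choice of CRC32 are assumptions made for illustration; real stream processors and message brokers use their own partitioners.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (CRC32)."""
    # Python's built-in hash() is randomized per process for strings,
    # so a deterministic hash is used instead.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_stream(
    events: Iterable[Tuple[str, int]], num_partitions: int
) -> Dict[int, List[Tuple[str, int]]]:
    """Group (key, value) events by partition, preserving per-key order."""
    partitions: Dict[int, List[Tuple[str, int]]] = defaultdict(list)
    for key, value in events:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return dict(partitions)


# Hypothetical keyed events.
events = [("sensor-a", 1), ("sensor-b", 2), ("sensor-a", 3), ("sensor-c", 4)]
partitions = partition_stream(events, num_partitions=3)
```

All events with the same key land in the same partition and keep their relative order, so each partition can be handed to a separate core or machine without coordinating per-key state.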
Fast Data Sources
•Sensor data: Processing discrete data from many Internet of Things (IoT) devices
•Network traffic: Telecommunication network optimization using software-defined networks (SDNs)
•Web/mobile user activity: Up-to-date trends of user behaviour from web or mobile apps
•Database event logs: Data pumps and streaming ETL to create new views of source data
Applications for Fast Data
Monitoring
Better Products & Services
Application: Monitoring
•Monitoring data for anomalies has many finance applications: Fraud Detection, Risk Management, Compliance
•Credit card companies have multiple levels of Fraud Detection
•Transaction-time fraud detection occurs at time of purchase
•Secondary fraud detection occurs after transaction time
•Requirements
•Reliable data capture is important for monitoring compliance
•Model training & scoring for fraud detection
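To make the "model training & scoring" requirement concrete, here is a deliberately simple sketch of transaction-time scoring. It assumes a per-card running mean and standard deviation stand in for a trained model, and it flags amounts more than a threshold number of standard deviations from the card's history. The class, function, and threshold are hypothetical; a production fraud model would be far richer.

```python
import math
from collections import defaultdict
from typing import Dict


class CardStats:
    """Running mean and variance of transaction amounts (Welford's method)."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the mean

    def update(self, amount: float) -> None:
        self.count += 1
        delta = amount - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (amount - self.mean)

    def stddev(self) -> float:
        return math.sqrt(self.m2 / (self.count - 1)) if self.count > 1 else 0.0


def is_suspicious(stats: CardStats, amount: float, threshold: float = 3.0) -> bool:
    """Flag amounts far from the card's running history."""
    if stats.count < 3 or stats.stddev() == 0.0:
        return False  # not enough history to judge
    return abs(amount - stats.mean) / stats.stddev() > threshold


history: Dict[str, CardStats] = defaultdict(CardStats)
flags = []
# Hypothetical (card, amount) transactions in arrival order.
for card, amount in [("c1", 20.0), ("c1", 25.0), ("c1", 22.0), ("c1", 900.0)]:
    flags.append(is_suspicious(history[card], amount))  # score at transaction time
    history[card].update(amount)  # then fold the transaction into the "model"
```

Scoring happens before the update, which mirrors the split between transaction-time detection and later, secondary analysis on the accumulated data.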
Application: Better Products & Services
•Recommendation Engines
•Media companies suggest new songs & TV shows to users (Netflix, Spotify)
•eCommerce companies recommend new products (Amazon)
•Requirements
•Joining historical data with real-time data
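Joining historical data with real-time data can be sketched as a lookup join: each live event is enriched with what is already known about the user. The dictionaries below are hypothetical in-memory stand-ins for a historical store and an item catalogue.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple

# Historical data: items each user has already consumed (hypothetical).
history: Dict[str, Set[str]] = {
    "alice": {"song-1", "song-2"},
    "bob": {"song-3"},
}

# Static catalogue of related items (hypothetical).
related: Dict[str, List[str]] = {
    "song-1": ["song-2", "song-4"],
    "song-3": ["song-1", "song-5"],
}


def recommend(plays: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, List[str]]]:
    """Join each live play event with the user's history to suggest unseen items."""
    for user, item in plays:
        seen = history.setdefault(user, set())
        seen.add(item)  # the live event also becomes part of the history
        suggestions = [r for r in related.get(item, []) if r not in seen]
        yield user, suggestions


live_plays = [("alice", "song-1"), ("bob", "song-3"), ("carol", "song-9")]
output = list(recommend(live_plays))
```

In a real system the historical side would typically live in a database and the live side would arrive from a message log, but the join logic has the same shape.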
So how should we design Fast Data systems?
•To implement Fast Data systems we need to review the evolution of two subsets of software
development
•Building Application Services
•Building Data Systems
Why?
•The worlds of Data Systems (aka Streaming Applications) and Applications (Microservices) are
converging.
Building Application Services
The Software Spectrum
•Monoliths and Microservices exist on a spectrum
•Monolith on one end, Microservices on the other
•Most applications live somewhere in the middle
Characteristics of a Monolith
•Deployed as a single unit
•Single shared database
•Communicate with synchronous method calls
•Deep coupling between libraries and components (often through the DB)
•“Big Bang” style releases
•Long cycle times (weeks to months)
•Teams carefully synchronize features and releases
The Monolithic Ball of Mud
•The ball of mud represents the worst-case scenario for a Monolith
•No clear isolation in the application
•Complex dependencies
•Hard to understand and harder to modify
The Microservice Architecture
•Microservices are a subset of SOA
•Logical components are separated into isolated microservices
•Microservices can be physically separated and independently deployed
•Each component/microservice has its own independent data store
Scaling a Microservice Application
•Each microservice is scaled independently
•Could be one or more copies of a service per machine
•Each machine hosts a subset of the entire system
Characteristics of Microservices
•Each service is deployed independently
•Multiple independent databases
•Communication is synchronous or asynchronous (Reactive Microservices)
•Loose coupling between components
•Rapid deployments (possibly continuous)
•Teams release features when they are ready
Building Data Services
Fast Data Pipeline
Hadoop
•Hadoop is a system for collecting and processing massive amounts of data
•Focus on batch processing and analytics
•Divided into three projects: MapReduce, HDFS, YARN
•Linear scalability with inexpensive commodity servers
•Open Source
Disadvantages of Hadoop
•Batch semantics delay gaining insight from your data
•Discovering insights faster is a competitive advantage
•Customers today expect up-to-date and accurate information
•It’s difficult to implement business processes in the MapReduce programming model
•A poor choice for iterative operations such as Machine Learning
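For readers unfamiliar with the MapReduce programming model, the following is a single-process sketch of its three stages (map, shuffle, reduce) applied to word counting. It is plain Python written for illustration and does not use any Hadoop API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, as the framework does between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single result."""
    return key, sum(values)


lines = ["fast data fast", "data streams"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Every computation has to be phrased as such a pass over the whole input. An iterative algorithm repeats the full cycle once per iteration, which is one reason the model fits machine learning workloads poorly.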
Distributed Stream Processors
•There are many distributed stream processors to choose from: Spark Streaming, Storm, Samza, Flink, Apex, Gearpump
•They fill in the gap of streaming requirements that exists in Hadoop
•Target YARN, Mesos, and standalone cluster resource managers
Complexity of Distributed Stream Processors
•Distributed stream processors address complexities not found in
batch semantics
•Handling out-of-order messages
•Message delivery & processing semantics
•Event-time vs processing-time
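The difference between event time and processing time, and one way to handle out-of-order messages, can be sketched with tumbling event-time windows and a simple watermark. This is an assumption-laden illustration rather than the behaviour of any particular engine: timestamps are non-negative integers, the watermark trails the largest event time seen by a fixed `allowed_lateness`, and events for already-closed windows are set aside.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def window_counts(
    events: Iterable[Tuple[int, str]],
    window_size: int,
    allowed_lateness: int,
) -> Tuple[Dict[int, int], List[Tuple[int, str]]]:
    """Count events per event-time window; return (counts, dropped_late_events).

    Events are (event_time, payload) pairs in arrival (processing-time) order.
    """
    counts: Dict[int, int] = defaultdict(int)
    dropped: List[Tuple[int, str]] = []
    max_event_time = 0
    for event_time, payload in events:
        max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - allowed_lateness
        window_start = (event_time // window_size) * window_size
        if window_start + window_size <= watermark:
            dropped.append((event_time, payload))  # window already closed
        else:
            counts[window_start] += 1
    return dict(counts), dropped


# Arrival order differs from event-time order: t=3 arrives after t=12.
arrivals = [(1, "a"), (12, "b"), (3, "c"), (25, "d"), (4, "e")]
counts, dropped = window_counts(arrivals, window_size=10, allowed_lateness=5)
```

The event at t=3 arrives late but is still counted in its window, while the event at t=4 arrives after the watermark has passed the end of window [0, 10) and is set aside. Choosing the allowed lateness trades completeness against latency.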
Reactive Principles
•Responsive: A Reactive System consistently responds in a timely fashion
•Resilient: A Reactive System remains responsive, even when failures occur
•Elastic: A Reactive System remains responsive, despite changes in system load
•Message Driven: A Reactive System is built on a foundation of async, non-blocking messages
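The message-driven principle can be illustrated with a small sketch in which two components exchange messages through a bounded asynchronous queue instead of calling each other directly. Python's standard `asyncio` module is used here only for illustration; the slides themselves do not prescribe a library, and the component names are hypothetical.

```python
import asyncio
from typing import List


async def producer(queue: asyncio.Queue, n: int) -> None:
    for i in range(n):
        # `put` suspends (without blocking the thread) while the queue is
        # full, which gives a simple form of back-pressure.
        await queue.put(i)
    await queue.put(None)  # sentinel: no more messages


async def consumer(queue: asyncio.Queue, received: List[int]) -> None:
    while True:
        message = await queue.get()
        if message is None:
            break
        received.append(message * 2)


async def main() -> List[int]:
    queue: asyncio.Queue = asyncio.Queue(maxsize=2)  # bounded mailbox
    received: List[int] = []
    await asyncio.gather(producer(queue, 5), consumer(queue, received))
    return received


result = asyncio.run(main())
```

Neither side holds a reference to the other; they share only the queue, so either can be replaced, restarted, or scaled without changing its peer.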
Introducing
Lightbend Fast Data Platform
What is Lightbend Fast Data Platform?
Lightbend Fast Data Platform is a
● curated,
● integrated and
● fully supported platform
that provides you with an easy on-ramp for designing, building and
running streaming Fast Data applications.
Why Lightbend Fast Data Platform?
● For architects: Design capabilities and guided tool choices so you can demystify
complexity and reduce risk.
● For developers: An easy on-ramp that accelerates developer velocity so you can
build & launch performant apps on time.
● For ops teams: Run-time capabilities so you can serve users reliably at scale, along
with one-stop support for all components to ensure peace of mind.
Introducing Lightbend Fast Data Platform
Stream Processing
1. Streaming Engines
Machine Learning
2. Pluggable ML Libraries
Microservices
3. Reactive Platform
Operational Tooling
4. Intelligent Management and Monitoring
5. Cluster Analysis (FUTURE)
Infrastructure
6. Durable Messaging Backplane
7. Persistence
8. Infrastructure (On-Prem, Cloud, Hybrid)
When Choosing Streaming Engines…
•Low latency? How low?
•High volume? How high?
•Kinds of data processing and analytics? Which ones?
•Process data:
•Individually (e.g., complex event processing)?
•In bulk (e.g., like SQL joins)?
•Required integrations with other tools? Which ones?
Design Streaming Fast Data Applications with Spark, Akka, Kafka and Cassandra on Mesos & DC/OS
  • 1.
  • 2. Understanding Data Streaming •To understand Fast Data we must first understand traditional data streaming: •The processing of large or unbounded sequences of data •Dataset’s that are too large to fit in memory or disk •Can be used to provide insights for data that never ends •Typical use cases are to provide aggregations and predictions •Usually synchronous and single threaded •Very low latency and close to real time
  • 3. What is Fast Data? •The processing of high volumes of a continuous stream of data •Fast Data combines properties from traditional stream processing, big data infrastructure, and reactive applications •Low to medium latency data processing in near real time •Scales horizontally to handle a high volume of data •Stream processing is parallelized across many CPU cores and machines by partitioning the data stream •Fault Tolerance provides resilience and the ability to recover from failures •High Availability provides responsiveness and ensures uptime guarantees
  • 4. Fast Data Sources •Sensor data: Processing discrete data from many Internet of Things (IoT) devices •Network traffic: Telecommunication network optimization using SDN’s •Web/mobile user activity: Up-to-date trends of user behaviour from web or mobile apps •Database event logs: Data pumps and streaming ETL to create new views of source data
  • 5. Applications for Fast Data Monitoring Better Products & Services
  • 6. Application: Monitoring •Monitoring data for anomalies has many finance applications: Fraud Detection, Risk Management, Compliance •Credit card companies have multiple levels of Fraud Detection •Transaction-time fraud detection occurs at time of purchase •Secondary fraud detection occurs after transaction time •Requirements •Reliable data capture is important for monitoring compliance •Model training & scoring for fraud detection
  • 7. Application: Better Products & Services •Recommendation Engines •Media companies suggest new songs & tv shows to users (Netflix, Spotify) •eCommerce companies recommend new products (Amazon) •Requirements •Joining historical data with real-time data
  • 8. So how should we design Fast Data systems? •To implement Fast Data systems we need to review the evolution of two subsets of software development •Building Application Services •Building Data Systems
  • 9. Why? •The worlds of Data Systems (aka Streaming Applications) and Applications (Microservices) are converging.
  • 11. The Software Spectrum •Monoliths and Microservices exist on a spectrum •Monolith on one end, Microservices on the other •Most applications live somewhere in the middle
  • 12. Characteristics of a Monolith •Deployed as a single unit •Single shared database •Communicate with synchronous method calls •Deep coupling between libraries and components(often through the DB) •“Big Bang” style releases •Long cycle times (weeks to months) •Teams carefully synchronize features and releases
  • 13. The Monolithic Ball of Mud •The ball of mud represents the worst case scenario for a Monolith •No clear isolation in the application •Complex dependencies •Hard to understand and harder to modify
  • 14. The Microservice Architecture •Microservices are a subset of SOA •Logical components are separated into isolated microservices •Microservices can be physically separated and independently deployed •Each component/microservice has it’s own independent data store
  • 15. Scaling a Microservice Application •Each microservice is scaled independently •Could be one or more copies of a service per machine •Each machine hosts a subset of the entire system
  • 16. Characteristics of Microservices •Each service is deployed independently •Multiple independent databases •Communication is synchronous or asynchronous (Reactive Microservices) •Loose coupling between components •Rapid deployments (possibly continuous) •Teams release features when they are ready
  • 19. Hadoop •Hadoop is a system for collecting and processing massive amounts of data •Focus on batch processing and analytics •Divided into three projects: MapReduce, HDFS, YARN •Linear scalability with inexpensive commodity servers •Open Source
  • 20. Disadvantages of Hadoop •Batch semantics delay gaining insight from your data •Discovering insights faster is a competitive advantage •Customers today expect up-to-date and accurate information •It’s difficult to implement business processes in the MapReduce programming model •A poor choice for iterative operations such as Machine Learning
  • 21. Distributed Stream Processors •There are lots of distributed stream processors to choose from: Spark Streaming, Storm, Samza, Flink, Apex, Gearpump •They fill the gap in streaming requirements that exists in Hadoop •Target YARN, Mesos, and standalone cluster resource managers
  • 22. Complexity of Distributed Stream Processors •Distributed stream processors address complexities not found in batch semantics •Handling out-of-order messages •Message delivery & processing semantics •Event-time vs processing-time
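To make the difference between event-time and processing-time concrete, here is a minimal sketch of event-time tumbling windows that tolerate out-of-order messages. It is a simplified illustration, not the implementation of any engine listed in this deck. The `Event` type, the `tumbling_window_sums` function, and the watermark rule (largest event time seen minus an allowed lateness) are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    key: str
    event_time: int  # seconds since some epoch, assigned at the source
    value: int


def tumbling_window_sums(events, window_size, allowed_lateness):
    """Sum values per (window, key) by event time.

    Events are consumed in arrival (processing-time) order. A window is
    closed once the watermark passes its end; events for a window that is
    already closed are collected in `dropped` instead of being counted.
    """
    open_windows = defaultdict(int)  # (window_start, key) -> running sum
    closed, dropped = [], []
    watermark = float("-inf")
    for event in events:
        window_start = event.event_time - event.event_time % window_size
        if window_start + window_size <= watermark:
            dropped.append(event)  # arrived after its window was closed
            continue
        open_windows[(window_start, event.key)] += event.value
        # Assumed watermark rule: largest event time seen minus lateness.
        watermark = max(watermark, event.event_time - allowed_lateness)
        ready = sorted(k for k in open_windows if k[0] + window_size <= watermark)
        for start, key in ready:
            closed.append((start, key, open_windows.pop((start, key))))
    # End of input: flush whatever is still open.
    for start, key in sorted(open_windows):
        closed.append((start, key, open_windows.pop((start, key))))
    return closed, dropped


events = [
    Event("a", 1, 1),
    Event("a", 12, 1),
    Event("a", 8, 1),   # out of order, but its window is still open
    Event("a", 17, 1),
    Event("a", 3, 1),   # too late: window [0, 10) has been closed
    Event("a", 21, 1),
]
closed, dropped = tumbling_window_sums(events, window_size=10, allowed_lateness=5)
print(closed)
print(dropped)
```

The event with time 8 arrives after the event with time 12 yet is still counted in window [0, 10), while the event with time 3 arrives after that window has closed and is set aside. A larger `allowed_lateness` accepts more late data at the cost of later results.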
  • 23. Reactive Principles •Responsive: A Reactive System consistently responds in a timely fashion •Resilient: A Reactive System remains responsive, even when failures occur •Elastic: A Reactive System remains responsive, despite changes in system load •Message Driven: A Reactive System is built on a foundation of async, non-blocking messages
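The Message Driven principle can be illustrated with a small sketch in which two components exchange messages through a bounded queue instead of calling each other directly. This uses Python's standard `asyncio` library purely for illustration; it is not the Lightbend Reactive Platform, and `producer`, `consumer`, and `main` are hypothetical names.

```python
import asyncio


async def producer(queue, readings):
    """Send each reading as a message, then a sentinel to signal the end."""
    for reading in readings:
        # If the bounded queue is full, this suspends the coroutine without
        # blocking the thread, which gives a simple form of back-pressure.
        await queue.put(reading)
    await queue.put(None)


async def consumer(queue, results):
    """Handle messages one at a time as they arrive."""
    while True:
        message = await queue.get()
        if message is None:
            break
        results.append(message * 2)


async def main(readings):
    queue = asyncio.Queue(maxsize=2)  # small bound to force hand-offs
    results = []
    await asyncio.gather(producer(queue, readings), consumer(queue, results))
    return results


results = asyncio.run(main([1, 2, 3, 4]))
print(results)
```

Neither side knows about the other beyond the queue, so either one could be replaced or scaled without changing its counterpart.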
  • 25. What is Lightbend Fast Data Platform? •Lightbend Fast Data Platform is a curated, integrated, and fully supported platform that provides you with an easy on-ramp for designing, building and running streaming Fast Data applications.
  • 26. Why Lightbend Fast Data Platform? •For architects: Design capabilities and guided tool choices so you can demystify complexity and reduce risk. •For developers: An easy on-ramp that accelerates developer velocity so you can build & launch performant apps on time. •For ops teams: Run-time capabilities so you can serve users reliably at scale, along with one-stop support for all components to ensure peace of mind.
  • 27. Introducing Lightbend Fast Data Platform •Stream Processing: 1. Streaming Engines •Machine Learning: 2. Pluggable ML Libraries •Microservices: 3. Reactive Platform •Operational Tooling: 4. Intelligent Management and Monitoring; 5. Cluster Analysis (FUTURE) •Infrastructure: 6. Durable Messaging Backplane; 7. Persistence; 8. Infrastructure (On-Prem, Cloud, Hybrid)
  • 28. Lightbend Fast Data Platform Components •Stream Processing: 1. Streaming Engines •Machine Learning: 2. Pluggable ML Libraries •Microservices: 3. Reactive Platform •Operational Tooling: 4. Intelligent Management and Monitoring; 5. Cluster Analysis (FUTURE) •Infrastructure: 6. Durable Messaging Backplane; 7. Persistence; 8. Infrastructure (On-Prem, Cloud, Hybrid)
  • 29. When Choosing Streaming Engines… •Low latency? How low? •High volume? How high? •Kinds of data processing and analytics? Which ones? •Process data: •Individually (e.g., complex event processing)? •In bulk (e.g., like SQL joins)? •Required integrations with other tools? Which ones?
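The distinction between processing data individually and in bulk can be sketched as follows. Both functions and the sample records are hypothetical and only illustrate the two styles; they are not tied to any particular streaming engine.

```python
def detect_large_transactions(transactions, threshold):
    """Individual processing: each event is evaluated on its own as it arrives."""
    for tx in transactions:
        if tx["amount"] > threshold:
            yield tx["id"]


def join_with_accounts(batch, accounts):
    """Bulk processing: a collected batch is joined with reference data,
    similar to an SQL inner join on the account column."""
    return [
        {**tx, "owner": accounts[tx["account"]]}
        for tx in batch
        if tx["account"] in accounts
    ]


transactions = [
    {"id": 1, "account": "A", "amount": 20},
    {"id": 2, "account": "B", "amount": 900},
    {"id": 3, "account": "C", "amount": 50},
]
accounts = {"A": "alice", "B": "bob"}

alerts = list(detect_large_transactions(transactions, threshold=500))
joined = join_with_accounts(transactions, accounts)
print(alerts)
print([row["owner"] for row in joined])
```

The per-event style can react as soon as a single record arrives, while the bulk style needs a batch or window of records before it can produce a result. That difference feeds directly into the latency question at the top of the slide.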