Live UI in Desktop / Web Browser / Mobile App
Dynamic aggregation
Live visualiza...
Live UI - Products
Characteristics to Check
• Alternative clients (rich client, ...
Spoilt for Choice
Does it make sense to
combine frameworks
and products?
Customer Example: Apache Storm + TIBCO Live Datamart
External
Data
Snapshot
Resu...
Agenda
• Real World Use Cases
• Introduction to Streaming Analytics
• Market Ove...
Closed Loop: Understand – Anticipate – Act
Closed Loop: Understand – Anticipate – Act
Insights Actions
MONITOR
PREDICT
ACT
...
Data Discovery via Visual Analytics, Big Data and Machine Learning
Augmented	Intelligence
Operations
SENSOR DATA
TRANSACTI...
Find Insights and Patterns in Historical Data
Visual Analytics + Machine Learning
Apply Insights and Analytic Models to Proactive Actions
Streaming
Analytics
H2O.ai
Open Source
R
TERR
Spark ML
MATLAB
SAS
P...
80% of betting happens
AFTER the game begins
TODAY
Case Study: Streaming Analytics for Betting
• Situation: Today, 80% of Betting is Done After the
Game Starts
• It’s not yo...
Big Data Architecture for Streaming Betting Analytics
Event Processing
MONITOR
R...
Real-Time Social Media Analytics
Twitter
(#TomBradyBrokenLeg)
Twitter (#Boston)
Brady’s
Stats
Actionable
Insights
Twitter ...
Real-Time Social Media Analytics
Agenda
• Real World Use Cases
• Introduction to Streaming Analytics
• Market Ove...
Scenario: Predictive Scrapping of Parts in an Assembly Line
Goal: Scrap parts as early as possible automatically to reduce...
Big Data Architecture for Predictive Maintenance
Operational	Analytics
Operations
Live	UI
CSV Batch
JSON Real Time
XML Rea...
Find Patterns → TIBCO Spotfire with H2O Integration
Example: Predictive Analytic...
Apply Patterns → TIBCO StreamBase Connector for H2O.ai
Monitor Patterns → TIBCO Live Datamart
Augmented Intelligence (“Monitor the manufacturing process and change rules in real...
TIBCO Spotfire + StreamBase + Live Datamart + H2O.ai
Live Demo
Key Take-Aways
• Streaming Analytics processes Data while it is in Motion!
• Aut...
Questions? Please contact me!
Kai Wähner
Technology Evangelist
kontakt@kai-waehner.de
@KaiWaehner
www.kai-waehner.de
Linke...

Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Services

Streaming Analytics Comparison of Open Source Frameworks, Products and Cloud Services. Includes Apache Storm, Flink, Spark, TIBCO, IBM, AWS Kinesis, Striim, Zoomdata, ...

This session discusses the technical concepts of stream processing / streaming analytics and how it is related to big data, mobile, cloud and internet of things. Different use cases such as predictive fault management or fraud detection are used to show and compare alternative frameworks and products for stream processing and streaming analytics.

The focus of the session lies on comparing

- different open source frameworks such as Apache Apex, Apache Flink or Apache Spark Streaming
- engines from software vendors such as IBM InfoSphere Streams or TIBCO StreamBase
- cloud offerings such as AWS Kinesis
- real-time streaming UIs such as Striim, Zoomdata or TIBCO Live Datamart

Live demos will give the audience a good feel for how to use these frameworks and tools.

The session will also discuss how stream processing is related to Apache Hadoop frameworks (such as MapReduce, Hive, Pig or Impala) and machine learning (such as R, Spark ML or H2O.ai).

Published in: Technology

  1. Kai Wähner, Technology Evangelist. kontakt@kai-waehner.de, LinkedIn, @KaiWaehner, www.kai-waehner.de. Big Data Spain @ Madrid (November 2016). Comparison of Streaming Analytics Frameworks
  2. © Copyright 2000-2016 TIBCO Software Inc. Key Take-Aways
     • Streaming Analytics processes Data while it is in Motion!
     • Automation and Proactive Human Interaction are BOTH needed!
     • Streaming Analytics is Complementary to Hadoop and Machine Learning!
  3. Agenda
     • Real World Use Cases
     • Introduction to Streaming Analytics
     • Market Overview
     • Relation to other Big Data Components
     • Live Demo
  4. Agenda: Real World Use Cases, Introduction to Streaming Analytics, Market Overview, Relation to other Big Data Components, Live Demo
  5. Analyze and Act on Critical Business Moments
  6. Success Story: Predictive Fault Management
  7. “An outage on one well can cost $10M per hour. We have 20-100 outages per year.” - Drilling operations VP, major oil company
  8. Predictive Analytics - Fault Management: Electric Submersible Pumps (ESP)
     Data Monitoring:
     • Motor temperature
     • Motor vibration
     • Current
     • Intake pressure
     • Intake temperature
     • Flow
     Diagram labels: Electrical power cable, Pump, Intake, Protector, ESP motor, Pump monitoring unit
  9. Predictive Analytics - Fault Management
     Inputs: Voltage, Temperature, Vibration, Device history
     Temporal analytic: “If vibration spike is followed by temp spike then voltage spike [within 4 hours] then flag high severity alert.”
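The temporal rule on slide 9 can be sketched in a few lines of Python. This is only an illustration, not how any product implements it: the `Spike` type, the `detect_alerts` function and the reading that the whole sequence must complete within 4 hours of the first vibration spike are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class Spike:                 # hypothetical event type for this sketch
    device: str
    kind: str                # "vibration", "temperature" or "voltage"
    at: datetime             # event time of the spike

SEQUENCE = ("vibration", "temperature", "voltage")
WINDOW = timedelta(hours=4)  # assumed to span the whole sequence

def detect_alerts(spikes):
    """Return (device, time) pairs where SEQUENCE completed within WINDOW."""
    progress = {}            # device -> timestamps of the steps matched so far
    alerts = []
    for s in sorted(spikes, key=lambda s: s.at):
        matched = progress.setdefault(s.device, [])
        # Forget a partial match whose first spike is older than the window.
        if matched and s.at - matched[0] > WINDOW:
            matched.clear()
        if s.kind == SEQUENCE[len(matched)]:
            matched.append(s.at)
            if len(matched) == len(SEQUENCE):
                alerts.append((s.device, s.at))   # high severity alert
                matched.clear()
        elif s.kind == SEQUENCE[0]:
            matched[:] = [s.at]                   # restart from a new vibration spike
    return alerts
```

Feeding it a vibration, temperature and voltage spike for one pump within three hours yields one alert; stretching the same sequence over five hours yields none.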
  10. Live Surveillance of Equipment
      • Continuous, live geospatial display of pump health and predictive signal breaches
      • Alerts based on predictive signals
      • Compare live readings and signals to historical average and means
      • Continuous, live visualization of stats for hundreds of wells
  11. Success Story: Crowd Management
  12. “Turn the customer into a fan and increase revenue significantly.”
  13. World’s Smartest Building
  14. All Customers are different… Treat them that way… Capture – Engage – Expand – Monetize. Patterns – Real time. MORE PERSONAL, MORE CONTEXT. Channels: social, CRM, POS, mobile, web, e-mails
  15. Success Story: Smart Manufacturing
  16. “For every 1% increase in shipped product, we make $11MM in profit. The demand is there, we just need to fulfill it.” - Head of Quality, Solar Panel Manufacturer
  17. Scenario: Predictive Scrapping of Parts in an Assembly Line
      Goal: Scrap parts as early as possible automatically to reduce costs in a manufacturing process.
      Question: When to scrap a part in Station 1 instead of doing re-work or sending it to Station 2?
      Diagram labels: Station 1, Station 2, Cost Before; costs 9€, 7€, 13€; Total Cost 29€ (or more); “Scrap?” decision points
  18. Machine Learning Applied to Sensor Events in Real Time. Example: Predictive Analytics for Manufacturing (“scrap parts as early as possible”)
  19. Great success stories, but … how to realize these use cases?
  20. Agenda: Real World Use Cases, Introduction to Streaming Analytics, Market Overview, Relation to other Big Data Components, Live Demo
  21. Traditional Data Processing: “Request – Response” (Store, Analyze, Act)
  22. Traditional Data Processing: “Request – Response” (Store, Analyze, Act)
      • Data is collected from a variety of sources and placed in a persistent store.
        – Relational database.
        – NoSQL store.
        – Hadoop environment.
      • Analytical processes are executed against the stored data to detect opportunities or threats.
      • Actions are identified, delivered, and executed across various business channels.
  23. Traditional Data Processing: Challenges (Store, Analyze, Act)
      • Introduces too much “decision latency” into the business.
      • Responses are delivered “after the fact”.
      • Maximum value of the identified situation is lost.
        – Cross-sell / up-sell opportunities are lost, impending equipment failure is missed, business processes are slow to respond and lack timely context.
      • Decisions are made on old and stale data.
  24. Event Value Decreases Over Time (chart: Value over Time)
  25. Event Value Decreases Over Time (chart: Value over Time)
      • Events are often most valuable “close to” the point of collection.
      • As time passes, events tend to lose their value.
      • The ability to proactively identify “threats” or “opportunities” will typically decrease.
      • Real-time capability is needed to maximize event value.
  26. The New Era: Streaming Analytics (Act & Monitor, Analyze, Store)
  27. The New Era: Streaming Analytics (Act & Monitor, Analyze, Store)
      • Events are analyzed and processed in real-time as they arrive.
      • Decisions are timely, contextual, and based on fresh data.
      • Decision latency is eliminated, resulting in:
        – Superior Customer Experience
        – Operational Excellence
        – Instant Awareness and Timely Decisions
  28. Streaming Analytics: What Is A “Stream”?
      Examples: Clickstream, Sensors, Social Data, Logs
      • Consists of pieces of data typically generated due to a change of state.
      • One or more identifiers
      • Timestamp & payload
      • Immutable
      • Typically unbounded; there is no end to the data.
      • Batch dataset: “bounded”.
      • Can be raw or derived.
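The properties listed on slide 28 (identifier, timestamp, payload, immutable, unbounded) map naturally onto a frozen record type and a generator that never ends. The sketch below assumes hypothetical names (`Event`, `sensor_stream`) and a made-up temperature payload; it is meant to illustrate the vocabulary, not a particular framework.

```python
import itertools
import random
from dataclasses import dataclass
from types import MappingProxyType
from typing import Iterator, Mapping

@dataclass(frozen=True)          # frozen: events are immutable once created
class Event:
    source_id: str               # identifier
    timestamp: float             # event time in seconds
    payload: Mapping[str, float]

def sensor_stream(source_id: str, seed: int = 0) -> Iterator[Event]:
    """Unbounded stream: the generator never terminates on its own."""
    rng = random.Random(seed)
    clock = 0.0
    while True:
        clock += 1.0
        reading = MappingProxyType({"temperature": 60.0 + rng.random()})
        yield Event(source_id, clock, reading)

# A consumer must decide how much of the stream to look at;
# a batch dataset, by contrast, is bounded and can simply be read to the end.
first_three = list(itertools.islice(sensor_stream("pump-1"), 3))
```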
  29. Streaming Analytics Processing Pipeline
      • Stream Ingest: APIs, Adapters / Channels, Integration, Messaging
      • Stream Preprocessing: Transformation, Aggregation, Enrichment, Filtering, Normalization
      • Stream Analytics & Processing: Contextual Rules, Windowing, Patterns, Deep ML, Analytics, …
      • Stream Outcomes: Process Management, Analytics (Real Time), Applications & APIs, Analytics / DW, Reporting, Index / Search
  30. Streaming Analytics Processing Pipeline: Separation of concerns to easily adjust one part in response to changing business requirements without the need for rewriting other parts!
  31. Streaming Analytics: Ingest (APIs, Adapters / Channels, Integration, Messaging)
      • Stream data may come from a number of sources, either at the edge, in the data center, or via the cloud.
      • Need to handle a variety of data formats and protocols, all at global scale.
      • Pay attention to “event time” vs. “processing time”!
      • Event Time: Time the event was created.
      • Processing Time: Time the event was received or processed.
      • Event time is typically more relevant, and will lead to more predictable results.
      • Eliminate time skew associated with clock synchronization, system outages, processing latency, network issues, etc.
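A small worked example shows why the distinction on slide 31 matters. The readings and the one-minute bucketing below are invented for illustration: the third reading was created at second 58 but only arrived at second 65, so the two clocks place it in different minutes.

```python
from collections import Counter

# (event_time, processing_time) in seconds; the third reading arrived late.
readings = [(5, 6), (20, 21), (58, 65), (61, 62)]

def count_per_minute(timestamps):
    """Count how many timestamps fall into each one-minute bucket."""
    return dict(Counter(int(t // 60) for t in timestamps))

by_event_time = count_per_minute(e for e, _ in readings)
by_processing_time = count_per_minute(p for _, p in readings)

print(by_event_time)        # {0: 3, 1: 1}
print(by_processing_time)   # {0: 2, 1: 2}
```

Grouping by event time gives the same answer no matter how long delivery took, which is one way to read the slide's point about more predictable results.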
  32. Streaming Analytics: Preprocessing (Transformation, Aggregation, Enrichment, Filtering, Normalization)
      • Stream data often needs to be manipulated before it is processed by downstream components.
      • Normalization
      • Transformation
      • May filter unwanted events close to the source to eliminate “noise”.
      • Events may also be enriched with additional context to provide additional data for further processing.
      • Customer details, equipment details, location information, etc.
      • Data may be stored in a high-speed cache or other store for rapid access.
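The preprocessing steps on slide 32 can be chained as small generator stages, which also keeps each concern separate. The field names, the plausibility bounds and the `EQUIPMENT` lookup table below are hypothetical and stand in for whatever reference data or cache a real deployment would use.

```python
def normalize(events):
    """Bring raw records into one canonical shape."""
    for e in events:
        yield {"device": e["device"].strip().lower(), "temp_c": float(e["temp_c"])}

def drop_noise(events, low=-50.0, high=200.0):
    """Filter out physically implausible readings close to the source."""
    for e in events:
        if low <= e["temp_c"] <= high:
            yield e

def enrich(events, reference):
    """Add context (here: equipment details) from an in-memory lookup."""
    for e in events:
        yield {**e, **reference.get(e["device"], {})}

EQUIPMENT = {"pump-1": {"site": "well-a"}}          # assumed reference data
raw = [
    {"device": " Pump-1 ", "temp_c": "71.5"},
    {"device": "pump-1", "temp_c": "9999"},          # sensor glitch
]
cleaned = list(enrich(drop_noise(normalize(raw)), EQUIPMENT))
```

Because each stage only consumes and yields events, one stage can be swapped or reordered without touching the others.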
  33. Streaming Analytics: Processing (Stream Analytics & Processing)
      Batch: Transform, Deep ML, Analytics, Data Lake, …
      Real-Time: RT Analytics, Contextual Rules, Windowing, Patterns, …
      • Streams may be immediately pushed to a data lake.
      • May be raw or preprocessed.
      • Used for subsequent analysis as part of an immutable data layer.
      • Typically processed in batch in this part of the architecture.
      • In parallel, streams may be processed in real-time against a number of constructs.
      • Real-time analytics.
      • Graph analysis / Geo Analysis
      • Rules.
      • Results from the real-time processing may be fed into the batch component.
      • The results of batch processing may also be pushed into the real-time layer.
  34. Streaming Analytics Processing Pipeline (diagram: Stream Ingest, Stream Preprocessing, Stream Analytics & Processing, Stream Outcomes)
  35. Dataflow Streaming Pipeline – Extract, Transform, Load in Real Time https://www.linkedin.com/pulse/data-pipeline-hadoop-part-1-2-birender-saini
  36. Streaming Analytics Processing Pipeline (diagram: Stream Ingest, Stream Preprocessing, Stream Analytics & Processing, Stream Outcomes)
  37. Streaming Analytics: “Windows” https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
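Slide 37 only links to an article on windowing, so here is a minimal sketch of two common window types, tumbling (fixed, non-overlapping) and sliding (overlapping). The function names and the convention that each window is identified by its start time are assumptions of this example; integer timestamps keep the arithmetic exact.

```python
from collections import defaultdict

def tumbling_sum(events, size):
    """Sum values per fixed window [start, start + size)."""
    sums = defaultdict(float)
    for ts, value in events:
        sums[(ts // size) * size] += value
    return dict(sums)

def sliding_sum(events, size, slide):
    """Sum values per overlapping window; windows start at multiples of `slide`."""
    sums = defaultdict(float)
    for ts, value in events:
        start = (ts // slide) * slide        # latest window containing ts
        while start > ts - size and start >= 0:
            sums[start] += value             # ts falls into [start, start + size)
            start -= slide
    return dict(sums)

events = [(1, 10.0), (4, 20.0), (7, 30.0)]   # (timestamp, value)
print(tumbling_sum(events, size=5))           # {0: 30.0, 5: 30.0}
print(sliding_sum(events, size=4, slide=2))
```

With a window of 4 and a slide of 2, every event lands in two windows (except near time zero), which is what makes sliding windows suitable for moving averages.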
  38. Automation and Augmented Intelligence for Humans
      • Actions by Operations: Human decisions in real time informed by up to date information
      • Machine-to-Machine Automation: Automated action based on models of history combined with live context and business rules
  39. Big Data Reference Architecture (diagram). Labels: Augmented Intelligence, Operations, Sensor Data, Transactions, Message Bus, Machine Data, Social Data, Streaming Analytics, Action, Aggregate, Rules, Stream Processing, Analytics, Correlate, Continuous query processing, Alerts, Manual action / escalation, Data Discovery, Python, R, Data Scientists, Cleansed Data, History, Visual Analytics, Spark, Integration, ERP, MDM, DB, WMS, SOA / Microservices, Big Data, Data Warehouse, Hadoop, Internal Data, Integration Bus, API, Event Server, H2O.ai, Live UI
  40. Agenda: Real World Use Cases, Introduction to Streaming Analytics, Market Overview, Relation to other Big Data Components, Live Demo
  41. Streaming Analytics Market Growing Significantly. “Everything Flows: The value of stream processing and streaming integration” (September 2016) http://hortonworks.com/info/value-streaming-integration/
  42. Alternatives for Stream Processing (diagram: Time to Market from Slow to Fast across Streaming Concepts, Streaming Frameworks and Streaming Products, linked by “Includes”)
  43. Alternatives for Stream Processing: Streaming Concepts
      • Concepts (Continuous Queries, Sliding Windows)
      • Patterns (Counting, Sequencing, Tracking, Trends)
      • Build everything by yourself!
  44. Usually not an option … as there are a lot of Frameworks and Products available!
  45. Alternatives for Stream Processing: Streaming Frameworks
      • Library (Java, .NET, Python)
      • Query Language (often similar to SQL)
      • Scalability (horizontal and vertical, fail over)
      • Connectivity (technologies, markets, products)
      • Operators (Filter, Sort, Aggregate)
      • Different frameworks (ingest, preprocess, analytics) combined!
  46. Streaming Analytics Processing Pipeline (diagram: Stream Ingest, Stream Preprocessing, Stream Analytics & Processing, Stream Outcomes)
  47. Example for an Open Source Streaming Pipeline: “Realtime Event Processing in Hadoop with Apache NiFi, Kafka and Storm” http://hortonworks.com/hadoop-tutorial/realtime-event-processing-nifi-kafka-storm
  48. Dataflow Streaming Pipeline (Ingest, Preprocess): highlighted in the Big Data Reference Architecture diagram of slide 39
  49. Open Source Dataflow Streaming Pipelines
  50. Streaming Analytics: highlighted in the Big Data Reference Architecture diagram of slide 39
  51. Frameworks and Products (not a complete list!). Matrix: open source vs. closed source, product vs. framework. Legible entries: Microsoft Azure Stream Analytics, Google Cloud Dataflow
  52. Frameworks and Products (not a complete list!). Matrix: open source vs. closed source, product vs. framework. Legible entries: Microsoft Azure Stream Analytics, Google Cloud Dataflow
  53. Apache Storm (diagram: Spout, Bolt)
  54. Apache Storm – Hello World http://wpcertification.blogspot.ch/2014/02/helloworld-apache-storm-word-counter.html
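The Storm slides name spouts and bolts and link to a word-count hello world. The following plain-Python sketch mimics that topology shape with generators; it deliberately does not use the Storm API, and the function names and sample sentences are invented for the example.

```python
from collections import Counter

def sentence_spout():
    """Spout: the source of the stream (a fixed list stands in for a live feed)."""
    yield from ["the cow jumped", "the moon"]

def split_bolt(sentences):
    """Bolt: consumes sentences and emits one tuple per word."""
    for sentence in sentences:
        yield from sentence.split()

def count_bolt(words):
    """Bolt: keeps running counts and emits (word, count) after every word."""
    counts = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]

# Wire the topology: spout -> split bolt -> count bolt.
emitted = list(count_bolt(split_bolt(sentence_spout())))
final_counts = dict(emitted)     # the last emission per word wins
```

In a real Storm topology each stage would run in parallel across workers; the generator chain only shows the dataflow between them.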
  55. AWS Kinesis – Integration with other AWS Components (AWS S3, RedShift, DynamoDB) https://aws.amazon.com/kinesis/
  56. AWS Kinesis – Hello World
  57. AWS Kinesis – Public Cloud Trade-Off
      … is easy to set up and scale. But you do not have full control!
      • Any data that is older than 24 hours is automatically deleted
      • Every Kinesis application consists of just one procedure, so you can’t use Kinesis to perform complex stream processing unless you connect multiple applications
      • Kinesis can only support a maximum size of 50KB for each data item
      http://diamondstream.com/amazon-kinesis-big-real-time-data-processing-solution/ (blog post from 2014, might be outdated, but shows that you do not have full control over a cloud service)
  58. Apache Spark: General Data-processing Framework → However, focus is especially on Analytics (these days)
  59. Apache Spark – Focus on Analytics
      http://aptuz.com/blog/is-apache-spark-going-to-replace-hadoop/
      http://fortune.com/2015/09/09/cloudera-spark-mapreduce/
      http://www.ebaytechblog.com/2014/05/28/using-spark-to-ignite-data-analytics/
      http://www.forbes.com/sites/paulmiller/2015/06/15/ibm-backs-apache-spark-for-big-data-analytics/
      “[IBM’s initiatives] include:
      • deepening the integration between Apache Spark and existing IBM products like the Watson Health Cloud;
      • open sourcing IBM’s existing SystemML machine learning technology;
  60. Spark Streaming
      • is no real streaming solution
      • uses micro-batches
      • cannot process data in real-time (i.e. no ultra-low latency)
      • allows easy combination with other Spark components (SQL, Machine Learning, etc.)
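To make the micro-batch point on slide 60 concrete, here is a plain-Python sketch of the idea. It is not Spark code, and `micro_batches` and the sample events are hypothetical. Events are collected per fixed interval and only handed on when the interval closes, so the interval length sets a floor on latency.

```python
def micro_batches(events, interval):
    """Group (timestamp, value) pairs into consecutive batches of `interval` seconds."""
    batch, batch_end = [], None
    for ts, value in events:
        if batch_end is None:
            batch_end = (ts // interval + 1) * interval
        while ts >= batch_end:           # interval closed: hand the batch on
            yield batch
            batch, batch_end = [], batch_end + interval
        batch.append(value)
    if batch:
        yield batch                      # flush whatever is left at the end

events = [(0.1, "a"), (0.4, "b"), (1.2, "c"), (3.5, "d")]
batches = list(micro_batches(events, interval=1.0))
print(batches)   # [['a', 'b'], ['c'], [], ['d']]
```

Event "a" occurs at 0.1 s but is only released with its batch at 1.0 s, and an interval without data still produces an (empty) batch. An event-at-a-time engine would emit each event as it arrives instead.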
  61. 61. © Copyright 2000-2016 TIBCO Software Inc. Apache Spark – Hello World Spark Streaming API Spark Core API
  62. 62. © Copyright 2000-2016 TIBCO Software Inc. Apache Spark – as a Cloud Service
  63. 63. © Copyright 2000-2016 TIBCO Software Inc. Apache Flink Spark Streaming • „Newcomer“ • Looks very similar to Spark • But „Streaming First“ concept
  64. 64. © Copyright 2000-2016 TIBCO Software Inc. Apache Beam Generic API with unified programming model for stream processing frameworks http://www.slideshare.net/DataTorrent/apache-beam-incubating-67428372
  65. 65. © Copyright 2000-2016 TIBCO Software Inc. Frameworks and Products (no complete list!) OPEN SOURCE CLOSED SOURCE PRODUCT FRAMEWORK Azure Microsoft Stream Analytics Google Cloud Dataflow
  66. 66. Alternatives for Stream Processing Library (Java, .NET, Python) Query Language (often similar to SQL) Scalability (horizontal and vertical, fail over) Connectivity (technologies, markets, products) Operators (Filter, Sort, Aggregate) Time to Market Streaming Frameworks Streaming Products Slow Fast Streaming Concepts Single Tool (Complete Processing Pipeline) Visual IDE (Dev, Test, Debug) Simulation (Feed Testing, Test Generation) Live UI (monitoring, proactive interaction) Maturity (24/7 support, consulting) Integration (out-of-the-box: ESB, MDM, Analytics, etc.)
  67. 67. © Copyright 2000-2016 TIBCO Software Inc. Streaming Analytics Processing Pipeline APIs Adapters / Channels Integration Messaging Stream Ingest Transformation Aggregation Enrichment Filtering Stream Preprocessing Process Management Analytics (Real Time) Applications & APIs Analytics / DW Reporting Stream Outcomes • Contextual Rules • Windowing • Patterns • Deep ML • Analytics • … Stream Analytics & Processing Index / SearchNormalization
  68. 68. Dataflow Streaming Pipeline + Streaming Analytics Augmented Intelligence Operations SENSOR DATA TRANSACTIONS MESSAGE BUS MACHINE DATA SOCIAL DATA Streaming AnalyticsAction Aggregate Rules Stream Processing Analytics Correlate Continuous query processing Alerts Manual action, escalation Data Discovery Python R Data Scientists Cleansed Data History Visual Analytics Spark Integration ERP MDM DB WMS SOA / Microservices BIG DATA Data Warehouse, Hadoop Internal Data Integration Bus API Event Server H2O.ai Live UI
  69. 69. IBM Streams
  70. 70. TIBCO StreamBase • Performance: Latency, Throughput, Scalability • Multi-threaded and clustered server from version 1 • High throughput: Millions of messages, 100,000s of quotes, 10,000s of orders • Low latency: microsecond latency for algo trading, pre-trade risk, market data • Take Advantage of High-Performance Hardware • Multicore (12, 24, 32 cores), large memory (10s of gigabytes) • 64-bit Linux, Windows, Solaris deployment • Hardware acceleration (GPU, Solace, Tervela) • Enterprise Deployment • High availability and fault tolerance • Distributed state management for large data sets • Management and monitoring tools • Security and entitlements integration • Continuous deployment and QA process support • StreamBase Server Innovations: StreamSQL compiler and static optimizer; in-process, in-thread adapter architecture; visual parallelism and scaling; In-Memory Data Grid integration for distributed shared state; data parallelism and dispatch
  71. 71. TIBCO StreamBase - Visual Programming. Flow diagram labels: Capture card activations per location; Aggregate; Sales too high? (Sales too high → Fraud, otherwise No Fraud); Log to any database
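The flow on this slide (aggregate card activations per location, then branch on "sales too high?") is drawn visually in StreamBase. As a rough textual equivalent, here is a minimal Python sketch. The tumbling 60-second window, the limit of three activations, and the (timestamp, location) event shape are assumptions made for illustration, not values from the talk.

```python
from collections import defaultdict


def detect_fraud(activations, window_s=60, max_per_window=3):
    """Flag (location, window) pairs with more than `max_per_window` activations.

    `activations` is an iterable of (timestamp_seconds, location) tuples.
    Windows are tumbling: timestamps are bucketed by integer division.
    """
    counts = defaultdict(int)  # (location, window index) -> activations seen so far
    flagged = []
    for ts, location in activations:
        key = (location, ts // window_s)
        counts[key] += 1
        # Flag exactly once, at the moment the limit is first exceeded.
        if counts[key] == max_per_window + 1:
            flagged.append(key)
    return flagged


# Hypothetical activation events.
events = [
    (0, "store-A"), (10, "store-A"), (20, "store-B"),
    (30, "store-A"), (40, "store-A"), (70, "store-A"),
]
suspicious = detect_fraud(events)
print(suspicious)
```

In this data, store-A reaches four activations inside the first window and is flagged once; the activation at second 70 falls into the next window and starts a new count. Writing each flagged key to a database would correspond to the "Log to any database" step.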
  72. 72. TIBCO StreamBase - Visual Programming: StreamBase Development Studio • Visual Debugger • Feed Simulation • Unit Testing
  73. 73. Live UI for Augmented Intelligence (same architecture diagram as slide 68)
  74. 74. Live User Interface. Diagram labels: Live UI; Continuous Query Processor; Alerts; CEP; MQTT; JMS; In-Memory Data Grid Integration; Social Media Data; Market Data; Sensor Data; Historical Data; In-Memory Data Grid; Enterprise Data; IoT; Mobile; Social; Browser / App; Command & Control; ACTION; Continuous Query
  75. 75. Live UI in Desktop / Web Browser / Mobile App • Dynamic aggregation • Live visualization • Ad-hoc continuous query • Alerts • Action
  76. 76. Live UI: Product Characteristics to Check • Alternative clients (rich client, browser, mobile app) • Maturity for enterprise use cases • Performance and scalability • “Big data native” deployment (YARN, Mesos) • Monitoring and proactive actions • Streaming engine under the hood (not just a visualization layer) • New ad-hoc queries by the business user (without the help of the IT department) • Various visual components • Extensibility (graphical designer vs. coding) … or build your own solution using WebSockets, AngularJS, etc.
  77. 77. Spoilt for Choice: Does it make sense to combine frameworks and products?
  78. 78. Customer Example: Apache Storm + TIBCO Live Datamart. Diagram labels: External Data; Snapshot Results; Continuous Query Processor; Query; TIBCO Live Datamart; Continuous Alerting; Active Tables; Continuous Updates; Clients; Message Bus; Public Data; Customer Data; StreamBase Bolt; StreamBase Spout; Operational Data. StreamBase Bolt and Spout connect Apache Storm to StreamBase to provide real-time analytics on operational data.
  79. 79. Agenda • Real World Use Cases • Introduction to Streaming Analytics • Market Overview • Relation to other Big Data Components • Live Demo
  80. 80. Closed Loop: Understand – Anticipate – Act
  81. 81. Closed Loop: Understand – Anticipate – Act. Diagram labels: Insights; Actions; MONITOR; PREDICT; ACT; DECIDE; MODEL; ORGANIZE; ANALYZE; WRANGLE
  82. 82. Data Discovery via Visual Analytics, Big Data and Machine Learning (same architecture diagram as slide 68)
  83. 83. Find Insights and Patterns in Historical Data: Visual Analytics + Machine Learning
  84. 84. Apply Insights and Analytic Models to Proactive Actions. Diagram labels: Streaming Analytics; H2O.ai; Open Source R; TERR; Spark ML; MATLAB; SAS; PMML
  85. 85. 80% of betting happens AFTER the game begins TODAY
  86. 86. Case Study: Streaming Analytics for Betting • Situation: Today, 80% of Betting is Done After the Game Starts • It’s not your father’s bookie anymore! • Problem: How to Analyze Big Betting Data? • Thousands of concurrent games, constantly adjusting odds, dozens of betting networks – firms must correlate millions of events a day to find the best betting opportunities in real time • Solution: TIBCO for Fast Data Architecture • TXOdds uses TIBCO to correlate, aggregate, and analyze large volumes of streaming betting data in real time and publish innovative predictive betting analytics to their customers • Result: TXOdds First to Market with Innovative Zero Latency Betting Analytics • Innovative real-time analytics give players who can process electronic data in real time the edge “With StreamBase, in two months we had our first betting analytics feed live, and we continually deploy new ideas and evolve our old ones.” - Alex Kozlenkov, VP of technology, TXOdds
  87. 87. Big Data Architecture for Streaming Betting Analytics. Diagram labels: inputs (BETTING LINES, SCORES, NEWS, SOCIAL); BUS; Event Processing (MONITOR, AGGREGATE, CORRELATE, REAL-TIME ANALYTICS, HISTORICAL COMPARISON); CACHE; GLOBAL, DISTRIBUTED INFRASTRUCTURE; HADOOP (Context: Historical Betting Data, Odds, Outcomes); Live Datamart; Predictive odds analytics; Historical odds deviations; Zero Latency Betting Analytics
  88. 88. Real-Time Social Media Analytics • Inputs: Twitter (#TomBradyBrokenLeg), Twitter (#Boston), Twitter (#NFL), Brady’s Stats • Something relevant happening? Every second counts! • Actionable Insights: Change Odds (automated or manually triggered): Stop live-betting for the currently running game? • Who will win the game? • How many interceptions will the Quarterback throw? • Will the Patriots win the Super Bowl? • …
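One simple way to answer "something relevant happening?" on a hashtag stream is to compare the latest mention count with a recent baseline. The sketch below is a generic illustration, not how the system in the talk works: the per-interval counts, the factor of 5, and the minimum count of 20 are hypothetical values.

```python
from statistics import mean


def is_spike(history, current, factor=5.0, min_count=20):
    """Return True if `current` mentions clearly exceed the recent baseline.

    `history` holds mention counts from earlier intervals (for example, per minute).
    `min_count` avoids flagging noise when overall volume is tiny.
    """
    baseline = mean(history) if history else 0.0
    return current >= min_count and current > factor * baseline


# Hypothetical per-minute counts for one hashtag.
recent_counts = [3, 4, 2, 5]

suspend_live_betting = is_spike(recent_counts, current=40)
keep_betting_open = not is_spike(recent_counts, current=10)
print(suspend_live_betting, keep_betting_open)
```

A positive result could either change the odds automatically or raise an alert for a human operator, matching the "automated or manually triggered" options on the slide.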
  89. 89. Real-Time Social Media Analytics
  90. 90. Agenda • Real World Use Cases • Introduction to Streaming Analytics • Market Overview • Relation to other Big Data Components • Live Demo
  91. 91. Scenario: Predictive Scrapping of Parts in an Assembly Line Goal: Scrap parts as early as possible automatically to reduce costs in a manufacturing process. Question: When to scrap a part in Station 1 instead of doing re-work or sending it to Station 2? Station 1 Station 2 Cost Before 9€ 7€ 13€ Total Cost 29€ (or more) Scrap? Scrap?
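The scrap question can be framed as an expected-loss comparison. The sketch below is one possible framing, not the model used in the demo. It assumes that the slide's three figures map, in the order they appear, to cost before (9€), Station 1 (7€), and Station 2 (13€), that a defective part is worthless at the end, and that some predictive model supplies the defect probability; re-work is not modeled.

```python
def should_scrap(p_defect, cost_so_far, cost_remaining):
    """Return True if scrapping now has the lower expected loss.

    Scrapping always loses what has been spent so far.
    Continuing loses everything (spent + remaining) only if the part is defective.
    """
    if not 0.0 <= p_defect <= 1.0:
        raise ValueError("p_defect must be a probability in [0, 1]")
    loss_if_scrap = cost_so_far
    loss_if_continue = p_defect * (cost_so_far + cost_remaining)
    return loss_if_continue > loss_if_scrap


# Figures from the slide; their assignment to stages is an assumption.
COST_BEFORE, COST_STATION_1, COST_STATION_2 = 9.0, 7.0, 13.0
cost_so_far = COST_BEFORE + COST_STATION_1            # 16€ spent after Station 1
break_even = cost_so_far / (cost_so_far + COST_STATION_2)  # 16/29

scrap_at_60 = should_scrap(0.6, cost_so_far, COST_STATION_2)
scrap_at_50 = should_scrap(0.5, cost_so_far, COST_STATION_2)
print(round(break_even, 3), scrap_at_60, scrap_at_50)
```

Under these assumptions the break-even defect probability is 16/29, roughly 0.55: above it, scrapping after Station 1 is cheaper in expectation than sending the part on to Station 2.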
  92. 92. Big Data Architecture for Predictive Maintenance. Diagram labels: Operational Analytics; Operations; Live UI; inputs (CSV Batch, JSON Real Time, XML Real Time); Streaming Analytics (Action, Aggregate, Rules, Analytics, Correlate); Live Datamart (continuous query processing, alerts, manual action, escalation); HISTORICAL ANALYSIS by Data Scientists (Spotfire, R / TERR, H2O); Flume; HDFS; Hadoop (Cloudera); StreamBase; TIBCO Fast Data Platform; Oracle RDBMS; Avro; Parquet; …; PMML; Internal Data
  93. 93. Find Patterns → TIBCO Spotfire with H2O Integration. Example: Predictive Analytics for Manufacturing (“scrap parts as early as possible”)
  94. 94. Apply Patterns → TIBCO StreamBase Connector for H2O.ai
  95. 95. Monitor Patterns → TIBCO Live Datamart Augmented Intelligence (“Monitor the manufacturing process and change rules in real time!”) Live Datamart Desktop Client
  96. 96. Monitor Patterns → TIBCO Live Datamart Augmented Intelligence (“Monitor the manufacturing process and change rules in real time!”) Live Datamart Web API
  97. 97. Live Demo: TIBCO Spotfire + StreamBase + Live Datamart + H2O.ai
  98. 98. Key Take-Aways • Streaming Analytics processes Data while it is in Motion! • Automation and Proactive Human Interaction are BOTH needed! • Streaming Analytics is Complementary to Hadoop and Machine Learning!
  99. 99. Questions? Please contact me! Kai Wähner Technology Evangelist kontakt@kai-waehner.de @KaiWaehner www.kai-waehner.de LinkedIn
