Flink for Everyone: Self-Service Data Analytics with StreamPipes

Published on

Flink Forward 2019

StreamPipes is an open-source, self-service IoT toolbox that enables non-technical users to connect, analyze, and explore IoT data streams.

https://www.streampipes.org

Published in: Software

Flink for Everyone: Self-Service Data Analytics with StreamPipes

  1. 1. Flink for Everyone: Self-Service Data Analytics with StreamPipes. Patrick Wiener, Philipp Zehnder. Flink Forward Europe 2019, Berlin, 2019-10-08
  2. 2. What's StreamPipes? "A self-service IoT toolbox to enable non-technical users to connect, analyze and explore IoT data streams". www.streampipes.org | @streampipes | github.com/streampipes
  3. 3. What's StreamPipes? Install (reusable algorithm toolbox), Model (pipelines), Execute (Big Data / Edge infrastructure)
  4. 4. About us: Dominik Riemer (Senior Research Scientist), Philipp Zehnder (Research Scientist), Patrick Wiener (Research Scientist). FZI Research Center for Information Technology, Karlsruhe, Germany. Non-profit research center for applied ICT research (250 employees). Stream Processing, Data Management, Machine Learning. Started StreamPipes in 2014, first OSS release 2018
  5. 5. Agenda: (1) The need for self-service IoT data analytics; (2) StreamPipes: Technical Overview, Demo; (3) Lessons Learned w/ Flink & Getting Started
  6. 6. Part 1: The need for self-service IoT data analytics
  7. 7. Industrial Internet of Things: Data streams everywhere. Conveyor belts, pressure, oil temperature, dust particles, production plans, environmental data, gear box drive, energy consumption, telematics
  8. 8. Industrial Internet of Things: Typical application scenarios. Continuous Monitoring: flexible data integration from heterogeneous sources and monitoring of current system states. Situational Awareness: detect time-critical situations, e.g., by means of rules or ML approaches. Continuous Data Harmonization: continuous pre-processing and transformation of input streams for third-party systems
  9. 9. StreamPipes: Open-source framework to easily manage IoT data. Data Access, Data analytics & harmonization, Data exploration & exploitation. Generic adapters, specific adapters, metadata, data streams & sets, pre-processing, filter/aggregation, pattern detection, ML, situation detection, harmonized data sets, visualizations, third-party systems
  10. 10. Part 2: Technical Overview
  11. 11. High-level architecture: Analytics Microservices, Data Integration, Data Sources, Adapter Library, Pipeline Editor, Streaming Engine
  12. 12. High-level architecture: Analytics Microservices, Data Integration, Data Sources, Adapter Library, Pipeline Editor, Streaming Engine
  13. 13. Data Access. StreamPipes Connect: Easily connect IoT sources
  14. 14. Data Access: Machine-interpretable metadata. Data stream event: { "tstamp": 1453478160, "machineId": "ID5", "temperature": 73.5, "flowRate": 4.2 }. Semantic metadata. Schema: data type, runtime name, semantic type. Quality: frequency, latency, measurement unit. Grounding: format, protocol
  15. 15. Data Access: Machine-interpretable metadata, example. Data stream event: { "tstamp": 1453478160, "machineId": "ID5", "temperature": 73.5, "flowRate": 4.2 }. Semantic metadata for temperature: schema.org/temperature, schema.org/degreeCelsius, xsd:float, [0,80]
  16. 16. Data Access: StreamPipes Connect architecture. Connect Master; Connect Worker 1, Connect Worker 2, …, Connect Worker n; Edge Worker, Cloud Worker; register capabilities; Messaging. Sources: MySQL, REST, ROS, OPC-UA, PLC, MQTT
  17. 17. Demo: Introduction to StreamPipes. Connecting and visualizing flow rate measurements of a multi-tank system
  18. 18. Demo: Introduction to StreamPipes. Connecting and visualizing flow rate measurements of a multi-tank system. Flow Sensor, Aggregate data, Visualize, MQTT, StreamPipes Connect
  19. 19. High-level architecture: Analytics Microservices, Data Sources, Adapter Library, Pipeline Editor, Data Integration, Streaming Engine
  20. 20. Analytics microservices: Extensible toolbox
  21. 21. Analytics microservices: Extensible toolbox. Features: extensible toolbox for pre-processing & analytics; semantics-based consistency checking; exchangeable run-time wrappers; stateful/stateless; inclusion of ML models possible
  22. 22. Analytics microservices: Anatomy of a processing element. Aggregation: input events, Controller, Runtime, output events
  23. 23. Analytics microservices: How to implement a new processing element. Select Wrapper (Maven Archetype), Implement runtime (StreamPipes SDK), Describe controller (StreamPipes SDK), Build / Install (SDK, Docker, UI)
  24. 24. Analytics microservices: Runtime Wrapper. Standalone/Edge Wrapper, Flink Wrapper, Kafka Streams Wrapper, Python Wrapper
  25. 25. Analytics microservices: SDK: Runtime
  26. 26. Analytics microservices: Processing Element Description. Input Requirements: Schema, Quality, Protocol, Format. User Configuration: Text Input, Selections, Domain Knowledge, … Output Strategy: Keep, Custom, Transform, Append, … Semantic Metadata
  27. 27. Analytics microservices: Development: Maven Archetypes & SDK. Input, User Config, Output
  28. 28. Analytics microservices: Flink Deployment. Pipeline Management (register, start); Aggregation (Controller, Runtime); RemoteEnvironment (upload jar, submit execution graph); Flink Cluster (Aggregation Job); Kafka Source, input events, output events, Kafka Sink
  29. 29. Demo: Condition monitoring + StreamPipes. Rule-based monitoring of flow rate measurements in a multi-tank system
  30. 30. Demo: Condition monitoring + StreamPipes. Rule-based monitoring of flow rate measurements in a multi-tank system. Flow Sensor, Aggregate data, Detect Leakage, Notify, MQTT, IoTDB, StreamPipes Connect, Calculate Statistics
  31. 31. Part 3: Lessons Learned & Getting Started
  32. 32. Flink + StreamPipes: Lessons learned. Potentially huge stream of sensor data needs scalability; Remote Environment eased the implementation of the Flink Wrapper; clean & intuitive Flink API enables fast processor development; simple setup for development (mini cluster) and deployment; easy to configure & monitor; good integration with Apache Kafka
  33. 33. How to start: Setting up StreamPipes. Docker-based installation, streampipes.org/en/download. (1) Download installer from GitHub; (2) ./streampipes start; (3) Finish installation in browser
  34. 34. What's next? New features: current work-in-progress. Data Access, Data analytics & harmonization, Data exploration & exploitation. Metadata recognition, PLC4X, Flink fault tolerance, Python wrapper, AutoML, Historical data explorer, Infrastructure (Edge / Fog)
  35. 35. Let's connect! …and if you like StreamPipes, star us on GitHub. streampipes.org, docs.streampipes.org, github.com/streampipes/streampipes, twitter.com/streampipes, feedback@streampipes.org
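
Slides 14 and 15 attach machine-interpretable metadata (data type, semantic type, unit, value range) to each property of an event. The following is a minimal Python sketch of that idea, not the StreamPipes data model: the class `PropertyMetadata`, its field names, and the `validate` helper are hypothetical, and Python's `float` stands in for `xsd:float`. Only the example event and the temperature metadata values come from the slides.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class PropertyMetadata:
    """Hypothetical description of one event property (not the StreamPipes model)."""
    runtime_name: str                       # key under which the value appears in the event
    datatype: type                          # Python stand-in for an XSD type such as xsd:float
    semantic_type: Optional[str] = None     # e.g. a schema.org term
    unit: Optional[str] = None              # measurement unit
    value_range: Optional[Tuple[float, float]] = None  # inclusive (low, high)


# Metadata for the temperature property, using the values shown on slide 15.
TEMPERATURE = PropertyMetadata(
    runtime_name="temperature",
    datatype=float,
    semantic_type="schema.org/temperature",
    unit="schema.org/degreeCelsius",
    value_range=(0.0, 80.0),
)


def validate(event: Dict[str, Any], schema: List[PropertyMetadata]) -> List[str]:
    """Return human-readable violations; an empty list means the event conforms."""
    problems = []
    for prop in schema:
        if prop.runtime_name not in event:
            problems.append(f"missing property: {prop.runtime_name}")
            continue
        value = event[prop.runtime_name]
        if not isinstance(value, prop.datatype):
            problems.append(f"{prop.runtime_name}: expected {prop.datatype.__name__}")
            continue
        if prop.value_range is not None:
            low, high = prop.value_range
            if not low <= value <= high:
                problems.append(f"{prop.runtime_name}: {value} outside [{low}, {high}]")
    return problems


# The example event from slides 14 and 15.
event = {"tstamp": 1453478160, "machineId": "ID5", "temperature": 73.5, "flowRate": 4.2}
violations = validate(event, [TEMPERATURE])
```

Keeping the metadata separate from the event payload is what would let a tool check whether a stream fits a processing step before any data flows, which is the role the slides assign to semantics-based consistency checking.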

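Slides 22 to 26 split a processing element into a controller (a declarative description with input requirements, user configuration, and output strategy) and a runtime that transforms input events into output events. The sketch below illustrates that split in plain Python under stated assumptions: it does not use the StreamPipes SDK or Flink, and `ControllerDescription`, `AggregationRuntime`, the config keys `field` and `window_size`, and the output field `aggregatedValue` are all hypothetical names. The aggregation itself is a simple count-based sliding mean, chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Deque, Dict, List


@dataclass(frozen=True)
class ControllerDescription:
    """Hypothetical stand-in for the declarative part of a processing element."""
    name: str
    required_properties: List[str]  # input requirements on the incoming schema
    user_config: Dict[str, Any]     # values entered by the pipeline author
    output_strategy: str            # "append": input fields plus newly computed ones


class AggregationRuntime:
    """Runtime part: a count-based sliding mean over one numeric property."""

    def __init__(self, description: ControllerDescription) -> None:
        self.field = description.user_config["field"]
        # deque(maxlen=n) silently drops the oldest value once n values are stored.
        self.window: Deque[float] = deque(maxlen=description.user_config["window_size"])

    def on_event(self, event: Dict[str, Any]) -> Dict[str, Any]:
        self.window.append(event[self.field])
        mean = sum(self.window) / len(self.window)
        # Append strategy: keep the input event and add the aggregate to it.
        return {**event, "aggregatedValue": mean}


description = ControllerDescription(
    name="aggregation",
    required_properties=["flowRate"],
    user_config={"field": "flowRate", "window_size": 3},
    output_strategy="append",
)
runtime = AggregationRuntime(description)
outputs = [runtime.on_event({"flowRate": v}) for v in (4.0, 5.0, 6.0, 9.0)]
```

Because the description carries no processing logic, the same controller could in principle be paired with different runtimes, which mirrors the exchangeable run-time wrappers listed on slide 24.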