Event Streaming Architecture for Industry 4.0 - Abdelkrim Hadjidj & Jan Kunigk, Cloudera



New use cases under the Industry 4.0 umbrella play a key role in improving factory operations, optimizing processes, reducing costs and raising quality. We propose an event streaming architecture to streamline the information flow all the way from the factory to the main data center. Building such a streaming architecture enables a manufacturer to react faster to critical operational events. However, it presents two main challenges:

Data acquisition in real time: data should be collected regardless of where it lives or how hard it is to access. It is commonplace to ingest data from hundreds of heterogeneous data sources (ERP, MES, sensors, maintenance systems, etc.).
Event processing in real time: events collected from different parts of the organization should be combined into actionable insights in real time. This is extremely challenging in a context where events can be lost or delayed.
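A minimal Python sketch of what "combining" events can look like: sensor readings are enriched with the most recent maintenance status seen for the same machine. The tuple layout, the machine names and the `enrich` function are hypothetical, and a plain dict stands in for the per-key state that a stream processor would manage.

```python
def enrich(stream):
    """Yield sensor events tagged with the latest maintenance status per machine.

    `stream` is an iterable of (kind, machine_id, payload) tuples in arrival
    order; this layout is an assumption made for the example.
    """
    status_by_machine = {}  # stand-in for keyed state
    for kind, machine_id, payload in stream:
        if kind == "maintenance":
            # Remember the latest status for this machine.
            status_by_machine[machine_id] = payload
        elif kind == "sensor":
            yield {
                "machine": machine_id,
                "temperature": payload,
                "status": status_by_machine.get(machine_id, "unknown"),
            }


stream = [
    ("sensor", "cnc-1", 71.0),
    ("maintenance", "cnc-1", "serviced"),
    ("sensor", "cnc-1", 72.5),
    ("sensor", "cnc-2", 65.0),
]
enriched = list(enrich(stream))
```

The first `cnc-1` reading is tagged `unknown` because it arrives before any maintenance record, which hints at why arrival order and delays matter for this kind of processing.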
In this talk, we show how Apache NiFi and MiNiFi can be used to collect data from a wide range of sources in real time, connecting the industrial and information worlds. Then, we show how Apache Flink's unique features enable us to make sense of this data. For instance, we will explain how Flink's time management features, such as event-time mode, late-arrival handling and the watermark mechanism, can be used to address the challenge of processing IoT data originating from geographically distributed plants. Finally, we demonstrate an end-to-end streaming architecture for Industry 4.0 based on the Cloudera DataFlow platform.
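The event-time ideas mentioned above can be illustrated with a small stand-alone Python simulation. It is not Flink code and does not use Flink's API: the `Event` class, the `tumbling_windows` function, the window size and the out-of-orderness bound are all assumptions chosen for the example, and the watermark rule (largest event time seen minus a fixed bound) is a simplification.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    plant: str
    event_time: int  # seconds, stamped at the source (not arrival time)
    value: float


def tumbling_windows(events, window_size=10, max_out_of_orderness=5):
    """Average values per (plant, window) using event time.

    The watermark trails the largest event time seen so far by
    `max_out_of_orderness`. A window fires once the watermark reaches its
    end; events arriving for an already fired window are reported as late.
    """
    open_windows = defaultdict(list)  # (plant, window_start) -> values
    fired, late = {}, []
    max_seen = float("-inf")

    for ev in events:  # iteration order is arrival order
        start = ev.event_time - ev.event_time % window_size
        if start + window_size <= max_seen - max_out_of_orderness:
            late.append(ev)  # window already closed by the watermark
            continue
        open_windows[(ev.plant, start)].append(ev.value)
        max_seen = max(max_seen, ev.event_time)
        watermark = max_seen - max_out_of_orderness
        for key in [k for k in open_windows if k[1] + window_size <= watermark]:
            values = open_windows.pop(key)
            fired[key] = sum(values) / len(values)

    # End of input: flush whatever is still open.
    for key, values in open_windows.items():
        fired[key] = sum(values) / len(values)
    return fired, late


arrivals = [
    Event("A", 1, 20.0),
    Event("A", 4, 22.0),
    Event("B", 12, 30.0),
    Event("A", 8, 24.0),   # out of order, but still within the bound
    Event("B", 17, 32.0),  # pushes the watermark past window [0, 10)
    Event("A", 3, 99.0),   # too late: its window has already fired
]
fired, late = tumbling_windows(arrivals)
```

With these inputs the reading at time 8 is still counted in plant A's first window even though it arrives after a later event, while the reading at time 3 is set aside as late. A real job would decide what to do with late data, for example routing it to a separate output.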

Published in: Technology


  1. Event Streaming Architecture for Industry 4.0
     Abdelkrim Hadjidj - Sr. Data Streaming Specialist
     Jan Kunigk - Principal Architect & Field CTO EMEA
  2. The Industry 4.0 economics
     © 2019 Cloudera, Inc. All rights reserved.
     • Cost of Quality: 20 Percent of Sales (Source: American Society of Quality)
     • Plant Downtime Costs: $50 Billion Per Year (Source: Deloitte)
     • Quality & Recall Costs: $22B Recall Costs (US/16) (Source: AlixPartners)
     • Stopped production cost: $22,000 Per Minute (Source: Nielsen Research)
     • Big Data, Streaming IoT-Enabled Analytics:
       ○ 10% - 20% cost of quality reduction (Source: McKinsey)
       ○ 5% - 20% equipment cost reduction (Source: Deloitte)
  3. Key Industry 4.0 Use Cases
     Use cases:
     • Process 360
     • Process Monitoring
     • Predictive Maintenance
     • Quality Event Forensic Analysis
     • Quality & Yield Optimization
     Use case examples:
     • Harmonization of screw tightening in all plants
     • Re-calibrate manufacturing robots
     • Saving in-fab sensor points forever (batch data)
     • Optimize concentration of cutting fluids
     • Cycle time monitoring of CNC machines
     Benefits:
     • Single-point access to critical information and control
     • Reduce downtime, scrap and late shipment costs
     • Reduce equipment downtime and maintenance costs
     • Reduce scope of service campaigns and warranty costs
     • Optimize process variables to improve yields and quality
  4. DATA-IN-MOTION REFERENCE ARCHITECTURE
     [Diagram: MiNiFi -> Apache Kafka -> Apache NiFi -> Apache Kafka -> Apache Flink]
     • Data collection at the edge: C++ agents for the US-West, US-Central and US-East fleets
     • Ingest gateway powered by Kafka: gateway-west-raw-sensors, gateway-central-raw-sensors, gateway-east-raw-sensors
     • Data flow apps powered by NiFi
     • Data syndication service by Kafka, with Kafka topics: syndicate-transmission, syndicate-speed, syndicate-temp, syndicate-geo, syndicate-battery, syndicate-start/stop, syndicate-acceleration, syndicate-idle, syndicate-oil, syndicate-breaks
     • Subscribing stream processing apps: Processing App 1, Processing App 2, Processing App 3 (Apache Flink, Structured Streaming)
  5. Two main challenges
     Traditional solutions are local and limited
     ● Single process focused analysis
       ○ Analytics performed at each plant, refinery..
       ○ Missed opportunity to detect globally connected (e.g. quality, optimizations)
     ● No condition-based analysis
       ○ Existing edge analytics check if a sensor is within control (quasi hardcoded thresholds)
       ○ Sensor data should be correlated to reference data (productivity data, maintenance schedule)
     ● Simplicity and Manageability
       ○ No tooling = embedded coding nightmare
     Many data sources increase complexity
     ● Data from different time universes
       ○ Alignment of timeseries and timestamped data (Sensor to Quality inspection)
       ○ Network conditions make time management even harder (late arrivals)
     ● Real time analytics & CEP at scale
       ○ Trends and aggregates are more meaningful than events
     ● Prediction and RCA require AA/ML
       ○ High data volume/cardinality
       ○ Data calibration is a requirement for ML/AI
  6. Apache NiFi
     • Over 300 Prebuilt Processors
     • Easy to build your own
     • Parse, Enrich & Apply Schema
     • Filter, Split, Merge & Route
     • Throttle & Backpressure
     • Full data provenance from acquisition to delivery
     • Diverse, Non-Traditional Sources
     • Eco-system integration
     Advanced tooling to industrialize flow development (Flow Development Life Cycle)
     [Figure labels: FTP, SFTP, HL7, UDP, XML, HTTP, EMAIL, HTML, IMAGE, SYSLOG, HASH, MERGE, EXTRACT, DUPLICATE, SPLIT, ROUTE TEXT, ROUTE CONTENT, ROUTE CONTEXT, CONTROL RATE, DISTRIBUTE LOAD, GEOENRICH, SCAN, REPLACE, TRANSLATE, CONVERT, ENCRYPT, TAIL, EVALUATE, EXECUTE]
  7. How can NiFi help?
     • Large set of OOTB connectors: MQTT, OPC-UA, AMQP, ..; S3, ADLS, PubSub, ..; FTP, JDBC, NoSQL, Search, ..
     • UI based fast development
     • Scalable distributed system
     • MiNiFi agents: Java / C++ lightweight agents
     • Edge collection (NiFi connectors)
     • Edge processing (filtering, compression, encryption, etc.)
     • C&C: Central command and control
     • Security
     • Hub & Spoke architecture: from Edge to Cloud/DC
     • Site to Site protocol (S2S)
     • Backpressure, latency, throughput, queuing
     • End-to-End lineage and security
  8. How can Flink help?
     • Time Management: IoT network challenges, event time management, late arrival management
     • State Management: enrichment, combining knowledge, state of components
     • Performance at scale: Industrial Internet to grow 2X faster than any other data
     • Data preparation: filtering, enriching, aggregation, joining
  9. NiFi-Flink integration
     • Direct integration via Site to Site
       ○ Simple: NiFi is just a Flink source
       ○ How to handle a data spike or a Flink outage? NiFi is not a data store!
       ○ Point to point: what if the same data is needed by several Flink apps?
     • Native integration via Kafka API
       ○ Requires installing/managing another distributed system
       ○ Kafka retention can save your life during data spikes or a Flink outage
       ○ Easy to build pipelines with several steps and intermediate topics
  10. Event Streaming Edge2AI Architecture for Industry 4.0
      [Diagram, from the intelligent edge to the cloud:]
      • Intelligent edge: Connected Process/Plant 1 .. Connected Process/Plant N (Sensors, PLCs, Historians, SCADAs), Edge Collection/Analytics, Transmission
      • Enrich: Enterprise Transaction Data (MES, ERP, Maintenance, Supply Chain, Warranty, Design, etc.)
      • Learn (Model Inputs): historical sensor data, historical maintenance records, historical usage characteristics, historical failures
      • Analyze: Self-Service Business Intelligence (BI), Enterprise Analytics
      • Act: real-time action, feedback
      • Other labels: CDSW, Standard plant solutions, Edge to Cloud
  11. End to End pipeline
      [Diagram: Plant 1, Plant 2, Plant 3 and Enterprise sources. Labels: IoT, Errors, Aggregates, Alerts, Other data, ETL, Analytics, Cross Plants, Enterprise Analysis, Real Time Analytics, Complexity Reduction]
  12. Why is it important? And what does it have to do with ML?
  13. Demo
  14. Standard solutions and EdgeToAI architecture
      Characteristic | Standard solutions | Cloudera
      Corporate Positioning | Real-Time Analytics in the Factory | Streaming Platform for Enterprise Analytics
      Market Position | IoT Platform | Event and Streaming Management Platform
      Analytics Scope | Edge | Enterprise and cross-factory
      Data Ingestion | Factory Edge: specialized for machine data | From Edge to Cloud/Data Center, enterprise flow management (MiNiFi/NiFi)
      Data Storage | None | Enterprise Data Lake
      Data Processing | Edge | Batch and Real Time, with advanced time and state management capabilities
  15. Conclusion
  16. Thank you
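The trade-off on slide 9 (point-to-point delivery versus a retained log between NiFi and Flink) can be sketched with a toy in-memory model in Python. `RetainedLog` is a hypothetical stand-in written for this example, not the Kafka client API; it ignores partitions and retention expiry and only shows why per-consumer offsets let several applications read the same data and catch up after an outage.

```python
class RetainedLog:
    """Toy append-only log with one read offset per consumer group."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # group name -> next position to read

    def append(self, record):
        # Records stay in the log after being read.
        self._records.append(record)

    def poll(self, group, max_records=100):
        """Return the next unread records for `group` and advance its offset."""
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = RetainedLog()
log.append({"sensor": "temp", "value": 71.0})
log.append({"sensor": "temp", "value": 72.5})

first_batch_a = log.poll("app-a")  # app-a reads both records

# app-b is down while more data arrives.
log.append({"sensor": "temp", "value": 90.1})

catch_up_b = log.poll("app-b")     # app-b starts late and still sees everything
next_batch_a = log.poll("app-a")   # app-a only sees what it has not read yet
```

In a point-to-point setup the producer would have to hold or drop the third record while `app-b` is unavailable. Here both consumers progress independently, at the cost of operating the extra system that holds the log.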