Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Magneti Marelli, ICT Innovation
Road to Enterprise Architecture for Big Data
Applications
Mixing Apache Spark with singlet...
Company Overview
Magneti Marelli is an international company committed to the design and production of hi-tech systems and...
Magneti Marelli Worldwide Footprint
Corporate Presentation 3
PP - AC
PP - AC
PP - AC
USA PP – AC
MEXICO
BRASIL
ARGENTINA
G...
#SAISEnt4
Big Data storyline
4
BUSINESS
EXPLORATION
PHASE
PROOF OF CONCEPT
PHASE
PRODUCTION
PHASE
CUMULATIVE ROWS PROCESSE...
#SAISEnt4
The Surface-Mount Technology (SMT) project
5
Surface-Mount Technology PCB Preparation Assembly Line
Pre Producti...
#SAISEnt4
The Surface-Mount Technology (SMT) project
6
Surface-Mount Technology PCB Preparation Assembly Line
Pre Producti...
#SAISEnt4
The Surface-Mount Technology (SMT) project
7
Surface-Mount Technology PCB Preparation Assembly Line
Pre Producti...
#SAISEnt4
A Dream Project
8
Production process is well known1
1
Data source is clearly defined2
2
Need is raised by plant ...
#SAISEnt4
?
Becoming a Nightmare
9
Success!!!
• So… Where is the data?
• How can I read/access the data?
• How can I be su...
#SAISEnt4
Acquire Store, Transform, Enrich, Organize & Aggregate
Data Integration, Transform, Aggregate
Data Sources
Struc...
#SAISEnt4 11
Keep it simple, stupid
PRESENTATIONEXPLORATION PRODUCTIONINGESTION AND STORAGE
Hammer
Gateway
HG Job 1
HG Job...
#SAISEnt4 12
Enterprise Architecture – Data, where are you?
Hammer
Gateway
HG Job 1
HG Job 2
HG Job n
Pick & Place
DATA
EX...
#SAISEnt4
Enterprise Architecture – The Jupyter case
• How could I work together with other data
scientists?
• How can I d...
#SAISEnt4
One singleton to rule them all
mco
Pattern: Singleton
Use: bring tokens and
technical access to notebook
Benefit...
#SAISEnt4 15
Enterprise Architecture – The model is ready!
• Is the code production ready?
• Who is going to port the note...
#SAISEnt4 16
A For-what? What the hell?
MESSAGE
QUEUE
FORWARDER
Clean
SMT Data
Cycle
Time
Anomaly Det
Super
Secret
Service...
#SAISEnt4 17
A For-what? What the hell?
MESSAGE
QUEUE
FORWARDER
Clean
SMT Data
Cycle
Time
Anomaly Det
Super
Secret
Service...
#SAISEnt4 18
A For-what? What the hell?
MESSAGE
QUEUE
FORWARDER
Clean
SMT Data
Cycle
Time
Anomaly Det
Super
Secret
Service...
#SAISEnt4
ON PREMISE
19
A For-what? What the hell?
MESSAGE
QUEUE
FORWARDER
Clean
SMT Data
Cycle
Time
Anomaly Det
Super
Sec...
#SAISEnt4 20
A For-what? What the hell? – Predictive balancing
Clean
PS Data
RAM
Intensive
JOB
TO-DO
TO-DO
WIPDONE
WIP
X
A...
#SAISEnt4 21
Enterprise Architecture – Don’t mind about nerd stuff
HammerGateway
HG
Job1
HG
Job2
HG
Job
n
DATALAKE
Dat
a
L...
#SAISEnt4
NO-COMPLEXITY
DATA INGESTION AND STORAGE
22
PRESENTATION
DATA
SCIENTIST
TOYBOX
PRODUCTION
Enterprise Architectur...
#SAISEnt4
“I have done the deed. Did you hear a noise?”
23
“The Guide says there is an art to flying", said Ford, "or
rath...
#SAISEnt4
The Surface-Mount Technology (SMT) project
24
Surface-Mount Technology PCB Preparation Assembly Line
Pre Product...
#SAISEnt4
Anomaly Recommender Engine
25
Use Case
Support maintenance team to
prioritize standard and
extraordinary mainten...
#SAISEnt4
Much ado about nothing… ?
26
Surface-Mount Technology PCB Preparation Assembly Line
Pre Production & Assembly Li...
#SAISEnt4
People behind
27
Andrea
CONDORELLI
Giovanni
FAZZI
Manuela
DETOMASO
Florindo
PALLADINO
THE TEAM
Alessandro
SICOLI...
Upcoming SlideShare
Loading in …5
×
Upcoming SlideShare
What to Upload to SlideShare
Next
Download to read offline and view in fullscreen.

0

Share

Download to read offline

Road to Enterprise Architecture for Big Data Applications: Mixing Apache Spark with Singletons, Wrapping, and Facade with Andrea Condorelli

Download to read offline

In the Manufacturing industry, reliability and time to market are key factors to accomplish business goals. Nowadays, Analytics are more and more deployed to get insights’ from data and foster a data driven culture to achieve a greater effectiveness and efficiency within business operations.

In the Analytics domain, real challenges are often represented by data collection, such as the existence of heterogeneous and widespread data sources and the choice of ingestion technologies and strategies, the need to ensure a continuous data inflow and to release production-ready Analytics services to be integrated into in daily operations

In order to address those challenges, Magneti Marelli ICT Innovation team has adopted a structured approach starting from foundations, that is by building a distinctive Big Data Architecture, known as the Magneti Marelli Architecture (MARC). Differently from common Big Data architectures, which are developed on batch or streaming paradigms, MARC is an event and service oriented architecture with the flexibility to manage complex tasks running both in the DMZ plant, in the plant network and in Cloud. It combines traditional patterns for handling data, such as “Service Broker”, “Forwarder”, “Singleton”, “Wrapping”, “Store and Forward”, with best of breed technologies such as Databricks, Microsoft Azure Data Lake Store, Azure SQL, PowerBI and Azure Functions.

In this presentation MARC key components will be introduced, together with main integrated services. Additionally, it will be shown how mentioned routine issues in data management will be addressed and solved with the aid of MARC unique structure and related services: practical examples will be provided for incremental data ingestion, incremental data processing, hybrid Spark deployments and the usage of heterogeneous Application Servers. Finally, it will be clear how the adoption of a structured approach to the development of Big Data Architecture for data management has dramatically fostered the demand for Analytics services and their effective use by the business to accomplish manufacturing cost reduction.



Related Books

Free with a 30 day trial from Scribd

See all
  • Be the first to like this

Road to Enterprise Architecture for Big Data Applications: Mixing Apache Spark with Singletons, Wrapping, and Facade with Andrea Condorelli

  1. 1. Magneti Marelli, ICT Innovation Road to Enterprise Architecture for Big Data Applications Mixing Apache Spark with singletons, wrapping, facade London, United Kingdom #SAISEnt4 #SAISEnt4
  2. 2. Company Overview Magneti Marelli is an international company committed to the design and production of hi-tech systems and components for the automotive sector. AUTOMOTIVE LIGHTING (Headlamp, Rearlamp, Lighting and Body Electronics) ELECTRONICS (Instrument Clusters, Infotainment & Telematics) SUSPENSION SYSTEMS AND SHOCK ABSORBERS (Suspension Systems, Shock Absorbers and Dynamic Systems) PLASTIC COMPONENTS AND MODULES (Bumper, Dashboard, Central Console, Pedals, Hand Brake Levers and Fuel System) AFTERMARKET PARTS & SERVICES (Mechanical, Body Work, Electrics and Electronic and Consumables) EXHAUST SYSTEMS (Manifolds, Catalytic converter, Diesel Particulate Filter and Mufflers) POWERTRAIN (Gasoline and Diesel engine control, Electric Motor, Inverter and Transmission) Corporate Presentation 2October 6, 2018 MOTORSPORT (Injection Systems, Electronic Control Units, Hybrid Systems, Telemetry Systems, Electric Actuators)
  3. 3. Magneti Marelli Worldwide Footprint Corporate Presentation 3 PP - AC PP - AC PP - AC USA PP – AC MEXICO BRASIL ARGENTINA GERMANY POLAND CZECH REP. SLOVAKIA RUSSIA SERBIA TURKEY PP - R&D – ACCHINA JAPAN KOREA MALAYSIA INDIAITALY SPAIN PP - AC PP – R&D - AC PP PP - AC PP – R&D – AC PP – R&D - AC PP - AC PP - R&D – AC PP - R&D – AC PP PP - AC AC PP October 6, 2018 PP: Production Plant R&D: R&D Center AC: Application Center FRANCE ROMANIA R&D - AC
  4. 4. #SAISEnt4 Big Data storyline 4 BUSINESS EXPLORATION PHASE PROOF OF CONCEPT PHASE PRODUCTION PHASE CUMULATIVE ROWS PROCESSED 30bn 0bn 10bn 20bn Apr 2017 Jun 2017 Oct 2017 Jan 2018 Mar 2018 Jul 2018 Jan 2017 Big Data group was created Aug 2017 Welding Machines POC 29th Jan 2018 Databricks was adopted Apr 2018 MARC 1.0 released Jun 2018 The SMT project Aug 2018 The Metalizers project Nov 2017 SMT POC Sep 2017 Telemetry POC Feb 2017 First POC approved, data loaded from USB
  5. 5. #SAISEnt4 The Surface-Mount Technology (SMT) project 5 Surface-Mount Technology PCB Preparation Assembly Line Pre Production & Assembly Line
  6. 6. #SAISEnt4 The Surface-Mount Technology (SMT) project 6 Surface-Mount Technology PCB Preparation Assembly Line Pre Production & Assembly Line Timestamp PCB ID 1 file per Item Timestamp Temperature Humidity 1 file per day Timestamp PCB ID Soldering Paste Sensor Data Temperature … 1 file per Item NIP Pick Up Info Feeder Info DB SQL NIP Images Sensor Data Anomalies MDB NIP Images Sensor Data PCB Final Status DB SQL Lasermarker Serigraphy Automated Optical Inspection Post Printing Pick and Place Oven Automated Optical Inspection Post Reflow
  7. 7. #SAISEnt4 The Surface-Mount Technology (SMT) project 7 Surface-Mount Technology PCB Preparation Assembly Line Pre Production & Assembly Line Timestamp PCB ID 1 file per Item Timestamp Temperature Humidity 1 file per day Timestamp PCB ID Soldering Paste Sensor Data Temperature … 1 file per Item NIP Pick Up Info Feeder Info DB SQL NIP Images Sensor Data Anomalies MDB NIP Images Sensor Data PCB Final Status DB SQL Lasermarker Serigraphy Automated Optical Inspection Post Printing Pick and Place Oven Automated Optical Inspection Post Reflow Machine Learning Problems 1. Machine Status Monitoring 2. Bottleneck harmonic model 3. Anomaly Recommender Engine
  8. 8. #SAISEnt4 A Dream Project 8 Production process is well known1 1 Data source is clearly defined2 2 Need is raised by plant people3 3 Algorithmic challenges are clear4 4 #SAISEnt4
  9. 9. #SAISEnt4 ? Becoming a Nightmare 9 Success!!! • So… Where is the data? • How can I read/access the data? • How can I be supported by my data scientist colleagues? • How can I attach a Spark cluster to my Jupyter notebook? • Who is going to port the notebook to production? • What do you mean with production? Production process is well known1 Data source is clearly defined2 Need is raised by plant people3 Algorithmic challenges are clear4
  10. 10. #SAISEnt4 Acquire Store, Transform, Enrich, Organize & Aggregate Data Integration, Transform, Aggregate Data Sources Structured Data 10 Enterprise Architecture Logical Data Warehouse Data Marts Traditional Enterprise Data Warehouse Distributed Process Other/ HadoopNoSQL Analyze & Deliver Insight Data Services Market- place or Datasets Self-Service Data Preparation / Data Access Layer Analytic Capabilities Analyze Optimize Forecast Report Plan Discover Collaborate Predict Model Advanced Analytics AudioVideo IoT Feeds Streaming Unstructured/Semi- structured Data ImageIT Log SAP Operational Systems Text Doc External IOT MES Stream Ingestion Real Time Batch/Micro- batch/CDC Staging, At Rest, CDC Streaming, In Motion External Doc Text Other Sources Master Data Management Data Quality MDM DQ SQL Data Lake (Curated, Enriched & Transformed Data) DataLake(RawData) SQL Machine Learning layer Data Science Environment The Technology Bazar ARCHITECTURE != LIST OF TECHNOLOGIES AND FANCY ARROWS
  11. 11. #SAISEnt4 11 Keep it simple, stupid PRESENTATIONEXPLORATION PRODUCTIONINGESTION AND STORAGE Hammer Gateway HG Job 1 HG Job 2 HG Job n DATA LAKE Azure Data Lake Store DATA EXPLORATION Notebooks Datamart ARCHITECTURAL OBJECTS MESSAGE QUEUE FORWARDER BUSINESS LOGICS Dbutils+Spark APP MCO mco.read mco.write AZURE FUNCTIONS mco.log
  12. 12. #SAISEnt4 12 Enterprise Architecture – Data, where are you? Hammer Gateway HG Job 1 HG Job 2 HG Job n Pick & Place DATA EXPLORATION Noteb ooks DATAMART APP ENGINE M ES S A G E Q U E U E F OR W A R D E RMCO mco.read mco.write mco.run I ON S BUSINESS LOGICS Dbutils +Spark HammerGateway HG Job1 HG Job2 HG Jobn DATALAKE Dat a Lak e Sto re • Where are the csv? • What do you mean with bcp out? • How can I get a copy in the cloud? • How can I update data on a regular basis? ? Write from scratch a sort of ETL tool Production ready tentative (10% of times) USB loading Quick and Dirty (90% of times) DATA LAKE Azure Data Lake Store
  13. 13. #SAISEnt4 Enterprise Architecture – The Jupyter case • How could I work together with other data scientists? • How can I deal with computation spikes? • How can I attach an Apache Spark cluster to my Jupyter? • Damn, Java Heap Memory Exception: what do you mean? Hammer Gateway HG Job 1 HG Job 2 HG Job n Pick & Place HammerGateway HG Job1 HG Job2 HG Jobn DATALAKE Dat a Lak e Sto re DATAMART APP ENGINE M ES S A G E Q U E U E F OR W A R D E RMCO mco.read mco.write mco.run BUSINESS LOGICS Dbutils +Spark DATA EXPLORATION Noteb ooks DATA EXPLORATION Notebooks MCO mco.read mco.write AZURE FUNCTIONS mco.log DATA LAKE Azure Data Lake Store
  14. 14. #SAISEnt4 One singleton to rule them all mco Pattern: Singleton Use: bring tokens and technical access to notebook Benefit: • enhancing security • enabling access control • reducing vendor lock-in effect mco.read Pattern: Wrapping Use: take data from data lake knowing only data names Benefit: • no one will need to know where data are or how data are stored • incremental read capability out-of-the-box • reducing time to port code in production • reduced reading time • reduce the vendor lock-in effect (to propagate a new HDFS PAAS vendor on all services is a matter of hours) mco.log Pattern: Wrapping Use: bring developer grade logging capability Benefit: • reducing debug time • enabling process audits mco.write Pattern: Wrapping Use: save data everywhere Benefit: • no one will need to know where data must be put or how • avoid dangerous behavious such as writing on a SQL with a transformation action (connection pool, my beloved friend…) MCO mco.read mco.write AZURE FUNCTIONS mco.log
  15. 15. #SAISEnt4 15 Enterprise Architecture – The model is ready! • Is the code production ready? • Who is going to port the notebook to production? • Developer algorithm is wrong: it produces different numbers… • Ok, I got it! I’ll need a crontab… but where? HammerGateway HG Job1 HG Job2 HG Job n DATALAKE Dat a Lak e Sto re DATAMART APP ENGINE M ES S A G E Q U E U E F OR W A R D E RMCO mco.read mco.write mco.run I O N S BUSINESS LOGICS Dbutils +Spark DATA EXPLORATION Noteb ooks Hammer Gateway HG Job 1 HG Job 2 HG Job n Pick & Place DATA EXPLORATION Notebooks BUSINESS LOGICS Dbutils+Spark MESSAGE QUEUE FORWARDER
  16. 16. #SAISEnt4 16 A For-what? What the hell? MESSAGE QUEUE FORWARDER Clean SMT Data Cycle Time Anomaly Det Super Secret Service … DATAMART Clean SMT Data service is the first in the MESSAGE QUEUE and is sent to the FORWARDER. Status is set to WIP This service is forwarded to Databricks Databricks gets data through the MCO Once data is loaded, the Spark code starts the cleaning job Cleaned data is cached in the DATAMART through the MCO. Status is set to DONE TO-DO TO-DO TO-DO … WIPDONE WIPDONE APP BUSINESS LOGICS Dbutils+Spark
  17. 17. #SAISEnt4 17 A For-what? What the hell? MESSAGE QUEUE FORWARDER Clean SMT Data Cycle Time Anomaly Det Super Secret Service … DATAMART TO-DO TO-DO TO-DO … WIPDONE WIPDONE APP WIP BUSINESS LOGICS Dbutils+Spark
  18. 18. #SAISEnt4 18 A For-what? What the hell? MESSAGE QUEUE FORWARDER Clean SMT Data Cycle Time Anomaly Det Super Secret Service … DATAMART TO-DO TO-DO TO-DO … WIPDONE WIPDONE APP WIP X • What if some datasets could not be moved into the cloud? • How to deal with super secret business logic? FORWARDER as the main component for cloud hybridization! Data needed to run Super Secret Service cannot be moved outside Magneti Marelli servers BUSINESS LOGICS Dbutils+Spark
  19. 19. #SAISEnt4 ON PREMISE 19 A For-what? What the hell? MESSAGE QUEUE FORWARDER Clean SMT Data Cycle Time Anomaly Det Super Secret Service … DATAMART TO-DO TO-DO TO-DO … WIPDONE WIPDONE APP WIP EDW Super Secret Service is forwarded to an on premise Apache Spark cluster Apache Spark cluster gets data through the MCO Outcome is persisted on an on premise SQL Server A custom web app allow user to see job output BUSINESS LOGICS Dbutils+Spark
  20. 20. #SAISEnt4 20 A For-what? What the hell? – Predictive balancing Clean PS Data RAM Intensive JOB TO-DO TO-DO WIPDONE WIP X A first service is submitted by the forwarder A second service is submited before the first finished. The Cluster is busy witht the other computation The forwarder can submit the job to any application server, so it creates a new Databricks cluster and submit to it. Cluster creation is anticipated using predictive algorithms. MESSAGE QUEUE Clean SMT Data Cycle Time Anomaly Det Super Urgent Service … TO-DO TO-DO TO-DO … WIPDONE WIPDONE WIP 300 GB RAM (90%) 2 hours long 100 GB RAM 5 minutes long Slow Services Cluster Dbutils+Spark Fast Services Cluster Dbutils+Spark
  21. 21. #SAISEnt4 21 Enterprise Architecture – Don’t mind about nerd stuff HammerGateway HG Job1 HG Job2 HG Job n DATALAKE Dat a Lak e Sto re DATAMART APP ENGINE M ES S A G E Q U E U E F OR W A R D E RMCO mco.read mco.write mco.run I O N S BUSINESS LOGICS Dbutils +Spark DATA EXPLORATION Noteb ooks Hammer Gateway HG Job 1 HG Job 2 HG Job n Pick & Place BUSINESS LOGICS Dbutils+Spark Data Scientists’ presentation concerns: • How do I write a web page? • Do I need to bootstrap? • MV-what? I thought Spring was just a season! • Single sign-on? What do you mean? MESSAGE QUEUE FORWARDER DATAMART DATA EXPLORATION Notebooks APP
  22. 22. #SAISEnt4 NO-COMPLEXITY DATA INGESTION AND STORAGE 22 PRESENTATION DATA SCIENTIST TOYBOX PRODUCTION Enterprise Architecture (“…and in the darkness bind them”) 1 day to add a new source1 Architecture «embeds» the guideline2 Presentation layer is drag and drop3 Service queuing ensures enterprise and managed scalability 4 Data scientists do not waste time in boring activities 5 Pick & Place MESSAGE QUEUE FORWARDER Hammer Gateway HG Job 1 HG Job 2 HG Job n DATA EXPLORATION Notebooks Datamart BUSINESS LOGICS Dbutils+Spark APP MCO mco.read mco.write AZURE FUNCTIONS mco.log DATA LAKE Azure Data Lake Store
  23. 23. #SAISEnt4 “I have done the deed. Did you hear a noise?” 23 “The Guide says there is an art to flying", said Ford, "or rather a knack. The knack lies in learning how to throw yourself at the ground and miss.” Production process is well known1 Data source is clearly defined2 Need is raised by plant people3 Algorithmic challenges are clear4
  24. 24. #SAISEnt4 The Surface-Mount Technology (SMT) project 24 Surface-Mount Technology PCB Preparation Assembly Line Pre Production & Assembly Line Timestamp PCB ID 1 file per Item Timestamp Temperature Humidity 1 file per day Timestamp PCB ID Soldering Paste Sensor Data Temperature … 1 file per Item NIP Pick Up Info Feeder Info DB SQL NIP Images Sensor Data Anomalies MDB NIP Images Sensor Data PCB Final Status DB SQL Lasermarker Serigraphy Automated Optical Inspection Post Printing Pick and Place Oven Automated Optical Inspection Post Reflow Machine Learning Problems 1. Machine Status Monitoring 2. Bottleneck harmonic model 3. Anomaly Recommender Engine
  25. 25. #SAISEnt4 Anomaly Recommender Engine 25 Use Case Support maintenance team to prioritize standard and extraordinary maintenance activities. Benefit Reducing machine stoppage losses per year per line. Down time reduction. Description A summary dashboard shows the health of each part of the line. A drill down with details is available.
  26. 26. #SAISEnt4 Much ado about nothing… ? 26 Surface-Mount Technology PCB Preparation Assembly Line Pre Production & Assembly Line Break-even point reached after 8 months Cost per line reduced by 90% after the first one Return On Investment: 12X in 3 years • Databricks and Microsoft PowerBI allow a very cost effective first project • Hammer Gateway allows cost effective ingestion • MCO enabled Data Scientists to convert notebooks in services with a very very low effort • Microsoft Azure and Databricks ensure endless scalability
  27. 27. #SAISEnt4 People behind 27 Andrea CONDORELLI Giovanni FAZZI Manuela DETOMASO Florindo PALLADINO THE TEAM Alessandro SICOLI Dario CASTELLO Heinrich-Gerhard SCHUERING SUPPORTERS

In the Manufacturing industry, reliability and time to market are key factors to accomplish business goals. Nowadays, Analytics are more and more deployed to get insights’ from data and foster a data driven culture to achieve a greater effectiveness and efficiency within business operations. In the Analytics domain, real challenges are often represented by data collection, such as the existence of heterogeneous and widespread data sources and the choice of ingestion technologies and strategies, the need to ensure a continuous data inflow and to release production-ready Analytics services to be integrated into in daily operations In order to address those challenges, Magneti Marelli ICT Innovation team has adopted a structured approach starting from foundations, that is by building a distinctive Big Data Architecture, known as the Magneti Marelli Architecture (MARC). Differently from common Big Data architectures, which are developed on batch or streaming paradigms, MARC is an event and service oriented architecture with the flexibility to manage complex tasks running both in the DMZ plant, in the plant network and in Cloud. It combines traditional patterns for handling data, such as “Service Broker”, “Forwarder”, “Singleton”, “Wrapping”, “Store and Forward”, with best of breed technologies such as Databricks, Microsoft Azure Data Lake Store, Azure SQL, PowerBI and Azure Functions. In this presentation MARC key components will be introduced, together with main integrated services. Additionally, it will be shown how mentioned routine issues in data management will be addressed and solved with the aid of MARC unique structure and related services: practical examples will be provided for incremental data ingestion, incremental data processing, hybrid Spark deployments and the usage of heterogeneous Application Servers. Finally, it will be clear how the adoption of a structured approach to the development of Big Data Architecture for data management has dramatically fostered the demand for Analytics services and their effective use by the business to accomplish manufacturing cost reduction.

Views

Total views

701

On Slideshare

0

From embeds

0

Number of embeds

15

Actions

Downloads

41

Shares

0

Comments

0

Likes

0

×