The Actor model of concurrent computation discretizes a problem into a series of independent units or actors that interact only through the exchange of messages. Without direct coupling between individual components, an Actor-based system is inherently concurrent and fault-tolerant. These traits lend themselves to so-called “Big Data” applications in which the volume of data to analyze requires a distributed multi-system design. For a practical demonstration of the Actor computational model, a system was developed to assist with the automated analysis of Nondestructive Evaluation (NDE) datasets using the open source Myriad Data Reduction Framework. A machine learning model trained to detect damage in two-dimensional slices of C-Scan data was deployed in a streaming data processing pipeline. To demonstrate the flexibility of the Actor model, the pipeline was deployed on a local system and re-deployed as a distributed system without recompiling, reconfiguring, or restarting the running application.
Agenda
• Distributed Processing Architectures
• Actor Model
• Defect Detection Algorithm
• Sample Results
• Q & A

Introduction
Myriad Desktop UI
Emphysic Actor Model for NDE Analysis
Architecture
Comparing Distributed Processing Models
• Apache Spark – Batch Processing Model (a.k.a. Map-Reduce)
• Apache Storm – Stream Processing Model
• Akka – Actor Processing Model
Benefits of Actor Model
• Lightweight
  • 1 actor ~ 300 bytes in RAM
• Fault-tolerant
  • “Let it crash”
• Configurable
• Understandable
Architecture
Defect Detection Structure – Overview
• Actor-based “pipeline parallelism” structure
• Algorithm is divided into a series of concurrent stages
• Each stage consists of a central routing Actor, one or more worker Actors, and a work queue
• Output of one stage is the input to the subsequent stage
• Pyramid Actor blurs and subsamples the data, sending each step to a Window Actor
• Window Actor sends each window to a Defect Scanner Actor
• Defect Scanner Actor sends its results to the Reporter Actor
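The stage structure above can be sketched with plain threads and queues. This is an illustrative Python sketch, not Myriad's or Akka's actual API: the shared queue plays the role of the routing Actor, handing work items to whichever worker is idle, and every result lands in the next stage's queue.

```python
import queue
import threading

def stage(in_q, out_q, work, n_workers):
    """One pipeline stage: a shared work queue feeding several worker
    threads, with every result pushed to the next stage's queue."""
    def worker():
        while True:
            item = in_q.get()
            if item is None:        # shutdown signal
                in_q.put(None)      # re-queue so sibling workers stop too
                break
            out_q.put(work(item))
    for _ in range(n_workers):
        threading.Thread(target=worker, daemon=True).start()

# Two chained stages: scale each data point, then offset it
q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
stage(q_in, q_mid, lambda x: 2 * x, n_workers=3)
stage(q_mid, q_out, lambda x: x + 1, n_workers=2)
for i in range(5):
    q_in.put(i)
q_in.put(None)
results = sorted(q_out.get() for _ in range(5))
```

Because each stage only touches its own input and output queues, adding workers to a stage or relocating a stage does not change the rest of the pipeline.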
Algorithms & Components
Pyramid Actor – Pyramid algorithm
• Blur
  • Convolve with a blur kernel, usually Box or Gaussian
  • Gaussian is usually approximated with 3 Box filter passes
  • Must account for edges
• Subsample
  • Also known as down-sampling or decimation
  • Take every nth element
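A minimal sketch of the pyramid step, using 1-D data for brevity (illustrative Python, not the Pyramid Actor's actual code): three box-filter passes approximate a Gaussian blur, edges are handled by clamping the filter window to the signal, and subsampling keeps every nth element.

```python
def box_blur(data, radius=1):
    """One pass of a 1-D box filter; the window is clamped at the edges."""
    n = len(data)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        window = data[lo:hi]
        out.append(sum(window) / len(window))
    return out

def blur(data, passes=3):
    """Three box passes approximate a Gaussian blur."""
    for _ in range(passes):
        data = box_blur(data)
    return data

def subsample(data, step=2):
    """Decimation: keep every step-th element."""
    return data[::step]

def pyramid(data, levels=3):
    """Blur then subsample repeatedly, collecting each level."""
    out = [data]
    for _ in range(levels - 1):
        data = subsample(blur(data))
        out.append(data)
    return out
```

Each level halves the data size, which is exactly the 80×80 → 40×40 → 20×20 → 10×10 progression shown on the Gaussian Pyramid slide (there in two dimensions).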
Algorithms & Components
Gaussian Pyramid
[Figure: four pyramid steps – Step 1: 80×80, Step 2: 40×40, Step 3: 20×20, Step 4: 10×10]
Algorithms & Components
Window Actor – Sliding Window Algorithm
• Scan across each dataset
• Each window is scanned independently for defect signals
• Tradeoffs:
  • Speed of scan is affected by the size of the input data, the size of the window, and the amount of overlap (smaller step size)
  • A smaller step size makes it more likely to detect an ROI, but also more likely to find the same ROI multiple times
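The sliding window extraction can be sketched as follows (illustrative Python, not the Window Actor's actual code), assuming the 2-D data is stored row-major in a flat list:

```python
def sliding_windows(data, rows, cols, win=3, step=1):
    """Yield (row, col, patch) for every win x win window of a 2-D
    grid stored row-major in a flat list. A smaller step means more
    overlap between successive windows."""
    for r in range(0, rows - win + 1, step):
        for c in range(0, cols - win + 1, step):
            patch = [data[(r + i) * cols + (c + j)]
                     for i in range(win) for j in range(win)]
            yield r, c, patch
```

The tradeoff from the slide falls straight out of the loop bounds: halving `step` roughly quadruples the number of windows scanned, increasing both the chance of catching an ROI and the chance of reporting it more than once.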
Algorithms & Components
Defect Scan Actor – Defect scanner interface
• Simple interface – get data, return True if defect found
• Bundle an online learning algorithm with an (optional) preprocessor into a single small (~10 kB) binary package, or
• Parallelize existing algorithms
  • No need to port to Java: external code (Python, MATLAB, C++, etc.) can be invoked with a system call
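The "get data, return True if defect found" contract might look like the following sketch. Both class names are hypothetical; `ThresholdScanner` is a toy stand-in for the trained model, flagging any amplitude spike above a threshold:

```python
from abc import ABC, abstractmethod

class DefectScanner(ABC):
    """Hypothetical sketch of the scanner contract from the slide:
    receive a window of data, answer True if a defect is present."""

    @abstractmethod
    def scan(self, window):
        """Return True if a defect signal is found in `window`."""

class ThresholdScanner(DefectScanner):
    """Toy stand-in for the ML model: flags any amplitude spike."""
    def __init__(self, threshold):
        self.threshold = threshold

    def scan(self, window):
        return max(window) > self.threshold
```

Because the contract is just data in, boolean out, the implementation behind `scan` can equally be an online learner, a ported legacy algorithm, or a system call into Python, MATLAB, or C++ code.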
Algorithms & Components
Reporter Actor – Reporting ROI Results
• Compiles the results of defect scanning
• Every stage in the process adds metadata to the message:
  • Data ingestion – data source
  • Pyramid – scaling factor
  • Sliding Window – position within the scaled data
  • Defect detection – ROI found
• Metadata allows the Reporting stage to locate each ROI relative to the original input
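The coordinate bookkeeping the metadata enables can be sketched in a few lines (field names are illustrative, not Myriad's actual metadata schema): the sliding-window position locates the ROI within the scaled data, and the pyramid's scaling factor maps that back to the original input.

```python
def to_original_coords(roi_row, roi_col, window_row, window_col, scale):
    """Map an ROI found inside a scaled window back to the original
    input: offset by the window's position within the scaled data,
    then undo the pyramid's subsampling factor (e.g. scale=2 for the
    40x40 level of an 80x80 input)."""
    return ((window_row + roi_row) * scale,
            (window_col + roi_col) * scale)
```
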
Demonstration
Using A Model
• Training data
  • 2-D slices of ultrasonic C-Scan data
  • 15×15 elements
• Model
  • Passive Aggressive learning algorithm
  • Sobel edge detection preprocessing
• Pipeline – 423 workers
  • 1 Ingestor
  • 2 Scalers
  • 4 Pre-processors
  • 128 Sliding Windows
  • 256 Defect Scanners
  • 32 Reporters
• Sample Input
  • 33 separate data files (CSV, JPEG, TIFF, etc.)
  • 60 million data points
• Single System Single Process (SSSP)
  • Eight cores, 32 GB RAM
  • 1 process
• Single System Multiprocess (SSMP)
  • Eight cores, 32 GB RAM
  • 184 processes
• Multisystem Multiprocess (MSMP)
  • Eight cores, 32 GB RAM local
  • Eight cores, 32 GB RAM remote (Azure VM)
  • 88 local processes, 128 remote (216 total)
Sample Throughputs

Trial Number                  SSSP      SSMP      MSMP
1                             302.66    106.69    107.28
2                             299.16     99.43    106.94
3                             297.00    111.87    106.11
4                             303.22    110.20    106.05
5                             299.39    103.83    106.13
Mean Processing Time [s]      300.28    106.40    106.50
Mean Throughput [Points/s]    2.07E+05  5.87E+05  5.85E+05
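The mean processing times in the table can be reproduced directly from the per-trial numbers:

```python
# Per-trial processing times in seconds, copied from the table above
trials = {
    "SSSP": [302.66, 299.16, 297.00, 303.22, 299.39],
    "SSMP": [106.69, 99.43, 111.87, 110.20, 103.83],
    "MSMP": [107.28, 106.94, 106.11, 106.05, 106.13],
}
means = {k: sum(v) / len(v) for k, v in trials.items()}
```
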
When designing a distributed processing system there are two primary models you’ll encounter. Batch processing is typically used for “slow” data: you already have a large amount of data and/or it’s acceptable for processing to take hours. Stream processing is for “fast” data, where the data arrives continuously and/or you need to analyze it in or near real time. Actor processing is a third model that’s not as well known as the first two, but it is used in Google’s Go programming language, in the Erlang programming language, and, if you dig deep, inside Spark’s own structure. Actors are independent entities that neither know nor care about the rest of the system; they interact only through their mailboxes.
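The mailbox idea above can be reduced to a few lines. This is a minimal Python sketch under stated assumptions (one thread per actor, `None` as a poison-pill shutdown message), not Akka's or Myriad's implementation:

```python
import queue
import threading

class Actor:
    """Minimal actor sketch: a private mailbox drained by one thread.
    Other code interacts with the actor only by sending it messages."""

    def __init__(self, handler):
        self.mailbox = queue.Queue()
        self.handler = handler
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, msg):
        self.mailbox.put(msg)

    def join(self):
        self.mailbox.put(None)   # poison pill ends the message loop
        self._thread.join()

    def _run(self):
        while True:
            msg = self.mailbox.get()
            if msg is None:
                break
            self.handler(msg)

# Usage: an actor that just records what it receives
log = []
logger = Actor(log.append)
logger.send("scan complete")
logger.join()
```

Because all interaction goes through `send`, the caller never needs to know whether the handler runs on this thread, another process, or another machine, which is what makes the model so easy to redistribute.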
Often when considering a distributed processing structure you’ll have a mental model of your application and how you envision processing the data. Batch and stream processing tend to force you to adapt your mental model to their mode of operation, while as a lower-level architecture the Actor model is more easily adapted to fit your approach.
Each stage runs simultaneously and is itself concurrent. Stages can be on the same or different systems – as long as a stage is reachable at a URL it can be anywhere and can be moved dynamically (i.e. without recompiling or even restarting the running application).
The Gaussian Pyramid stage provides scale invariance to our defect detection. If a flaw signal were much larger than the area the defect detection algorithm scans, for example, it might not be detected without considering the input data at multiple scales.
The Sliding Window stage extracts subsets or “windows” of data from the output of the Gaussian Pyramid stage. The size of the window is determined by the size of the input expected by the defect detection algorithm.
Although in the present work we use machine learning algorithms, any algorithm in any language can be built into a distributed data processing pipeline.
Each stage of our detection pipeline not only sends output data to the subsequent stage but metadata as well. The result is by the time we get to the end of the pipeline, we have sufficient information about what’s happened to the data that we can visually indicate detected anomalies on the original input.
In this video, a mid-range 8-core desktop is used to spin up more than 400 Actors to build a local processing pipeline. We can build an ad-hoc distributed pipeline by updating the URL of one or more stages to point at a remote system. We see that ROI are often detected several times, which is normal in machine vision applications – here it’s because we’re seeing the same ROI multiple times as we resize and raster across the input. Later builds of the desktop tool reduce this visual clutter (union of bounding boxes, non-maximum suppression techniques, confidence thresholds, etc.).
Each of the applications uses the algorithm we’ve outlined and only differs in the number of processes and systems in the pipeline. As expected multiple processes are able to process our sample dataset much faster than a single process.
At first glance the chart on the left doesn’t provide much support for a distributed architecture, since the distributed pipeline achieves nearly the same data throughput as the single-system version. In a processing pipeline, however, it’s not just data points per second we’re interested in; we also care about the system’s capacity to gracefully handle bursts or long-term increases in data. One way to gauge capacity is to measure resource usage during processing, and as the chart on the right shows, the single-system multiprocess application is using virtually all available resources while the distributed architecture is not. This suggests that the single system is likely already running at peak capacity and would not be able to handle an increase in data.
When the sample input is doubled, both the distributed and the single-process systems handle the increase gracefully (doubled input leads to roughly doubled processing time). In contrast, the single-system multiprocess configuration, which was already using 100% of the CPU, could not absorb the increase and had not completed processing after more than 6 hours of runtime.